Collective action by multiple AI developers, nations, and other actors pursuing their self-interest poses an array of challenges for AI safety; addressing the risks posed by individual AI systems alone is insufficient. Game theory can help us understand how rational self-interest leads to collectively undesirable results. We discuss phenomena such as the Prisoner's Dilemma, race dynamics, and the possible emergence of harmful AI behaviours. Generalized Darwinism, which extends evolution by natural selection beyond biology, may also provide a useful lens for understanding how the diffusion of AI systems through society could transfer power from humanity to AI systems that exhibit dangerous behaviours. Lastly, theories of conflict in international relations can help us understand under what circumstances the developers of advanced AI systems, or the systems themselves, might come into conflict with each other, with potentially violent consequences.
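The Prisoner's Dilemma mentioned above can be made concrete with a minimal sketch. The specific payoff values below (3/0/5/1) are the conventional illustrative choices, not figures from the text; the point is structural: each player's best response is to defect whatever the other does, so mutual defection is the unique Nash equilibrium even though mutual cooperation would leave both players better off.

```python
# Minimal sketch of a one-shot Prisoner's Dilemma, read here as a race
# dynamic between two AI developers. Payoff values are illustrative.
from itertools import product

ACTIONS = ["cooperate", "defect"]

# payoffs[(row_action, col_action)] = (row_payoff, col_payoff)
payoffs = {
    ("cooperate", "cooperate"): (3, 3),  # e.g. both labs invest in safety
    ("cooperate", "defect"):    (0, 5),  # the defector races ahead
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),  # mutual racing: collectively worst
}

def best_response(opponent_action, player):
    """Action maximizing this player's payoff against a fixed opponent action."""
    def payoff(action):
        profile = ((action, opponent_action) if player == 0
                   else (opponent_action, action))
        return payoffs[profile][player]
    return max(ACTIONS, key=payoff)

def nash_equilibria():
    """Profiles in which each action is a best response to the other."""
    return [
        (a0, a1) for a0, a1 in product(ACTIONS, ACTIONS)
        if a0 == best_response(a1, 0) and a1 == best_response(a0, 1)
    ]

print(nash_equilibria())  # → [('defect', 'defect')]
```

Running the sketch confirms that ('defect', 'defect') is the only equilibrium, despite its payoff pair (1, 1) being worse for both players than mutual cooperation's (3, 3); this is the sense in which individually rational choices yield a collectively undesirable outcome.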
Arms and Influence (Schelling, 1966)
Five rules for the evolution of cooperation (Nowak, 2006)
Rationalist Explanations for War (Fearon, 1995)
Natural Selection Favors AIs over Humans (Hendrycks, 2023)
Sociobiology: The New Synthesis (Wilson, 1975)
Moral Origins (Boehm, 2012)
The Evolution of Cooperation: Revised Edition (Axelrod, 2009)