Collective Action Problems

Intelligent agents that each act rationally and in their own self-interest can nonetheless collectively produce outcomes that none of them wants. Making individual AI systems reliable is therefore not sufficient to avoid all risks from AI. We also need to understand how these dynamics shape interactions among humans and AIs in order to prevent harmful, potentially catastrophic outcomes.


Addressing the risks posed by individual AI systems is not enough on its own, because many challenges in AI safety arise from interactions among multiple AI developers, nations, or other actors, each pursuing its own interest. The principles of game theory help us understand how rational self-interest can lead to collectively undesirable results. Theories of conflict from international relations help us understand the circumstances under which the developers of advanced AI systems, or the systems themselves, might come into conflict with each other, potentially with violent consequences. Lastly, the concept of Generalized Darwinism, which extends evolution by natural selection beyond biology, may provide a useful lens for understanding how the diffusion of AI systems through society could erode human control over AI.
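The game-theoretic point above, that individually rational choices can produce an outcome nobody wants, can be illustrated with a two-player Prisoner's Dilemma. The sketch below is only an illustration: the payoff numbers and the function names (`best_response`, `nash_equilibria`, `pareto_dominated`) are assumptions chosen for this example, not values or code from the text. It checks which action profiles are pure-strategy Nash equilibria and whether each one is Pareto dominated, that is, whether some other profile leaves both players at least as well off and one strictly better off.

```python
from itertools import product

# Hypothetical payoffs (higher is better), chosen only for illustration.
# Key: (row action, column action) -> (row payoff, column payoff)
ACTIONS = ("cooperate", "defect")
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


def best_response(player, other_action):
    """Return the action that maximizes `player`'s payoff given the other's action."""
    def payoff(action):
        # Player 0 chooses the row, player 1 chooses the column.
        profile = (action, other_action) if player == 0 else (other_action, action)
        return PAYOFFS[profile][player]
    return max(ACTIONS, key=payoff)


def nash_equilibria():
    """Return profiles in which each action is a best response to the other."""
    return [
        (a, b)
        for a, b in product(ACTIONS, repeat=2)
        if best_response(0, b) == a and best_response(1, a) == b
    ]


def pareto_dominated(profile):
    """True if another profile is at least as good for both players and differs."""
    p = PAYOFFS[profile]
    return any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in PAYOFFS.values())


equilibria = nash_equilibria()
print(equilibria)
print([pareto_dominated(e) for e in equilibria])
```

With these assumed payoffs, defecting is each player's best response whatever the other does, so mutual defection is the only equilibrium, even though both players would prefer mutual cooperation. The same structure is one simple way to think about competing actors who each gain from cutting corners while all lose when everyone does.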

Further reading

T. C. Schelling, Arms and Influence. Yale University Press, 1966.

M. A. Nowak, "Five Rules for the Evolution of Cooperation," Science, vol. 314, no. 5805, pp. 1560-1563, 2006.

J. D. Fearon, "Rationalist Explanations for War," International Organization, vol. 49, no. 3, pp. 379-414, Summer 1995.

D. Hendrycks, "Natural Selection Favors AIs over Humans," 2023.

E. O. Wilson, Sociobiology: The New Synthesis. Belknap Press of Harvard University Press, 1975.

C. Boehm, Moral Origins: The Evolution of Virtue, Altruism, and Shame. Basic Books, 2012.

R. Axelrod, The Evolution of Cooperation: Revised Edition. Basic Books, 2009.

Discussion Questions

Review Questions