This course provides an introduction to how current AI systems work, why many experts are concerned about increasing risks of societal harms and disruption as AI systems continue to improve, and how society can manage and mitigate these risks.



Week 1: Overview of Catastrophic AI Risks



Week 2: AI Fundamentals



Week 3: Single-Agent Safety



Week 4: Safety Engineering



Week 5: Complex Systems



Week 6: Beneficial AI and Machine Ethics



Week 7: Collective Action Problems



Week 8: Governance


Week 1: How AI works

  • How modern ML systems are built and how deep learning works
  • Scaling laws and their significance for AI safety

Week 2: Societal-scale risks from AI

  • Multiple sources of catastrophic risk from advanced AI: not just rogue AI, but also malicious use, accidents or gradual disempowerment of humans via AI races
  • Role of AI races/collective action problems and organisational risks/safety culture as risk factors
  • Examples of concrete scenarios where advanced AI could contribute to catastrophic outcomes

Week 3: Technical challenges in building safe AI

  • Technical limitations of current deep learning systems, e.g. opaqueness, proxy gaming and adversarial attacks
  • Consequences of these technical challenges for managing risks from AI (e.g. inability to reliably instill values into AI, predict future AI systems or reliably evaluate AI systems) and tail risks (power-seeking, treacherous turns)
  • Relationship between AI safety and capabilities

Week 4: Systemic safety

  • Distinction between risk elimination and risk reduction and basic factors that drive risks (risk equation)
  • Limitations of component failure accident models; value of Normal Accident Theory/High Reliability Organisations as tools for thinking about AI safety
  • Importance of tail events/black swans for effective risk management
  • Why AI systems can be characterised as complex systems and why this is a problem for many interventions aiming to advance AI safety

Week 5: Difficulties in coordinating not to build unsafe AI systems

  • Basic concepts from game theory and their relevance to competition to develop advanced AI
  • How AI could exacerbate dynamics that lead to conflict
  • How AI systems could be imbued with cooperative tendencies and limitations of these approaches

Week 6: Governance - Non-proliferation

  • Trade-offs between centralised and decentralised distribution of access to advanced AI
  • Role of compute governance and other approaches to limit proliferation of high-risk AI systems

Week 7: Governance - Safety standards and international treaties

  • Tools for national governance (standards, liability) and policies to promote resilience and competitiveness
  • Forms of international governance and core questions for international agreements (is AI offense-dominant, can compliance be verified, etc.?)

Week 8: Next steps

  • Identify ways you could learn more about or contribute to AI safety
  • Select 1-3 project options that you’re excited to pursue and identify next steps to finalize project selection and carry out the project

Week 12: Finalize project and discuss next steps

  • Receive support from facilitators during optional office hours
  • Submit your project to earn a certificate


If you have any other questions, you can contact us at
