Introduction to AI Safety, Ethics, and Society

Rapid progress in AI raises questions about how the deployment of advanced AI systems will impact society, both for better and for worse. How can we understand and mitigate risks from these AI systems? How do we ensure that AI is deployed in ways that are ethical and have positive societal impact?

This course, developed by Dan Hendrycks, director of the Center for AI Safety, aims to provide an accessible introduction for students, practitioners, and others who want to better understand these issues.

Ensuring that AI systems are safe is more than just a machine learning problem: it is a societal challenge that cuts across traditional disciplinary boundaries. This course takes a holistic approach, drawing on insights from engineering, economics, and other relevant fields.

The course aims to foster a thoughtful and nuanced understanding of AI safety, equipping participants with the tools and insights needed to navigate this rapidly evolving field. Key topics covered include:

  • Fundamentals of modern AI systems and deep learning, scaling laws, and their implications for AI safety
  • Technical challenges in building safe AI including opaqueness, proxy gaming, and adversarial attacks, and their consequences for managing AI risks
  • The diverse sources of societal-scale risks from advanced AI, including malicious use, accidents, and rogue AI, as well as the role of AI racing dynamics and organizational risks
  • The systemic nature of AI safety, the relevance of safety engineering and complex systems theory, and the importance of managing tail events and black swans
  • Collective action problems associated with AI development and challenges with building cooperative AI systems
  • Approaches to AI governance, including safety standards and international treaties, and trade-offs between centralized and decentralized access to advanced AI

Chapter Summaries

1. Overview of Catastrophic AI Risks
2. AI Fundamentals
3. Single-Agent Safety
4. Safety Engineering
5. Complex Systems
6. Beneficial AI and Machine Ethics
7. Collective Action Problems
8. Governance