Introduction to AI Safety, Ethics, and Society

Rapid progress in AI raises questions about how the deployment of advanced AI systems will impact society, both for better and for worse. How do we ensure that AI is deployed in ways that are ethical and have positive societal impact? How can we understand and mitigate risks from advanced AI systems?  

This course, developed by Dan Hendrycks, director of the Center for AI Safety, aims to provide an accessible introduction for students, practitioners, and others looking to better understand these issues.
The course textbook is available to read online and is forthcoming in print with Taylor & Francis.

Ensuring that AI systems are safe is more than just a machine learning problem: it is a societal challenge that cuts across traditional disciplinary boundaries. This course takes a holistic approach, drawing on insights from engineering, economics, and other relevant fields.

The course aims to foster a thoughtful and nuanced understanding of AI safety, equipping participants with the tools and insights needed to navigate this rapidly evolving field. Key topics covered include:

  • Fundamentals of modern AI systems and deep learning, scaling laws, and their implications for AI safety
  • Technical challenges in building safe AI including opaqueness, proxy gaming, and adversarial attacks, and their consequences for managing AI risks
  • The diverse sources of societal-scale risks from advanced AI, including malicious use, accidents, and rogue AI, as well as the role of AI racing dynamics and organizational risks
  • The importance of focussing on the safety of the sociotechnical systems that AI is part of, the relevance of safety engineering and complex systems theory, and approaches to managing tail events and black swans
  • Collective action problems associated with AI development and challenges with building cooperative AI systems
  • Approaches to AI governance, including safety standards and international treaties, and trade-offs between centralised and decentralised access to advanced AI

Chapter Summaries

  • Week 1: 1. Overview of Catastrophic AI Risks
  • Week 2: 2. AI Fundamentals
  • Week 3: 3. Single-Agent Safety
  • Week 4: 4. Safety Engineering
  • Week 5: 5. Complex Systems
  • Week 6: 6. Beneficial AI and Machine Ethics
  • Week 7: 7. Collective Action Problems
  • Week 8: 8. Governance