
Course Staff

Instructor: Dawn Song, Professor, UC Berkeley
Co-instructor (Guest): Xinyun Chen, Research Scientist, Google DeepMind
Co-instructor (Guest): Kaiyu Yang, Research Scientist, Meta FAIR

Teaching Staff: Alex Pan, Tara Pande, Ashwin Dara, Jason Yan

Class Time and Location

Lecture: Mondays, 4-6pm PT, Anthro/Art Building 160

Course Description

Large language model (LLM) agents are an important frontier in AI; however, they still fall short of critical skills, such as complex reasoning and planning, needed to solve hard problems and enable end-to-end applications in real-world scenarios. Building on our previous course, this course dives deeper into advanced topics in LLM agents, focusing on reasoning, AI for mathematics, code generation, and program verification. We begin by introducing advanced inference-time and post-training techniques for building LLM agents that can search and plan. We then focus on two application domains: mathematics and programming. We study how LLMs can be used to prove mathematical theorems, as well as to generate and reason about computer programs. The syllabus below lists the specific topics we will cover.

Syllabus

Guest lectures take place 4:00-6:00pm PT on the dates below; supplemental readings are listed under each lecture. ‡ Livestream only.

Jan 27th: Inference-Time Techniques for LLM Reasoning
Xinyun Chen, Google DeepMind
Intro Slides
- Large Language Models as Optimizers
- Large Language Models Cannot Self-Correct Reasoning Yet
- Teaching Large Language Models to Self-Debug
(All readings are optional this week.)

Feb 3rd‡: Learning to Reason with LLMs
Jason Weston, Meta
Slides
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- Iterative Reasoning Preference Optimization
- Chain-of-Verification Reduces Hallucination in Large Language Models

Feb 10th‡: On Reasoning, Memory, and Planning of Language Agents
Yu Su, Ohio State University
Slides
- Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization
- HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
- Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents

Feb 17th: No Class (Presidents’ Day)

Feb 24th‡: Reasoning and Planning in Large Language Models
Hanna Hajishirzi, University of Washington
- Tulu 3: Pushing Frontiers in Open Language Model Post-Training
- Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
- OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Mar 3rd: Coding Agents and AI for Vulnerability Detection
Charles Sutton, Google DeepMind

Mar 10th‡: Coding Agents and Web Agents
Ruslan Salakhutdinov, CMU/Meta

Mar 17th: Multimodal Agents
Caiming Xiong, Salesforce AI Research

Mar 24th: No Class (Spring Recess)

Mar 31st‡: AlphaProof
Thomas Hubert, Google DeepMind

Apr 7th: Language Models for Autoformalization and Theorem Proving
Kaiyu Yang, Meta FAIR

Apr 14th‡: Advanced Topics in Theorem Proving
Sean Welleck, CMU

Apr 21st‡: Program Verification and Generating Verified Code
Swarat Chaudhuri, UT Austin

Apr 28th‡: Agent Safety and Security
Dawn Song, UC Berkeley

Enrollment and Grading

Prerequisites: Students are strongly encouraged to have prior experience with, and a basic understanding of, machine learning and deep learning before taking this class, e.g., by having taken courses such as CS182, CS188, and CS189.

Please fill out the petition form if you are on the waitlist or cannot be added to the waitlist.

This is a variable-unit course. All enrolled students are expected to attend lectures in person and complete weekly reading summaries related to the course content. Students enrolled in 1 unit are expected to submit an article that summarizes one of the lectures. Students enrolled in more than 1 unit are expected to submit a lab assignment and a project instead of the article. For students enrolled in 2 units, the project should have a written report, which can be a survey of an area related to LLMs. For students enrolled in 3 or 4 units, projects follow either an applications track or a research track.

The grade breakdown for each unit option is as follows:

                       1 unit   2 units   3/4 units
Participation            40%      16%        8%
Reading Summaries        10%       4%        2%
Quizzes                  10%       4%        2%
Article                  40%       -         -
Lab                       -       16%        8%
Project
  Proposal                -       10%       10%
  Milestone               -       10%       10%
  Presentation            -       20%       15%
  Report                  -       20%       20%
  Implementation          -        -        25%

Lab and Project Timeline

                             Released   Due
Project group formation        1/27     2/24
Project proposal               2/3      2/24
Project milestone              2/24     3/31
Lab                            3/31     4/28
Project final presentation     4/28     5/9
Project final poster           4/28     5/9
Project final report           4/28     5/16

Office Hours