Prospective Students
- Students interested in the course should first try enrolling in the course in CalCentral. The class number for CS194-280 is 33840. The class number for CS294-280 is 33841. Please join the waitlist if the class is full.
- We plan to expand the class size to allow more students to join. Please fill in the petition form if you are on the waitlist or can’t get added to the waitlist. You will receive an email notification around the beginning of the spring semester if you are allowed in.
- Do not email course staff or TAs. Please use Edstem for any questions. For private matters, post a private question on Edstem and make sure it is visible to all teaching staff.
Course Staff
Instructor | (Guest) Co-instructor | (Guest) Co-instructor |
---|---|---|
Dawn Song | Xinyun Chen | Kaiyu Yang |
Professor, UC Berkeley | Research Scientist, Google DeepMind | Research Scientist, Meta FAIR |
Teaching Staff: Alex Pan, Tara Pande, Ashwin Dara, Jason Yan
Class Time and Location
Lecture: 4-6pm PT Monday at Anthro/Art Building 160
Course Description
Large language model (LLM) agents have been an important frontier in AI; however, they still lack critical skills, such as complex reasoning and planning, that are needed to solve hard problems and enable end-to-end applications in real-world scenarios. Building on our previous course, this course dives deeper into advanced topics in LLM agents, focusing on reasoning, AI for mathematics, code generation, and program verification. We begin by introducing advanced inference and post-training techniques for building LLM agents that can search and plan. Then, we focus on two application domains: mathematics and programming. We study how LLMs can be used to prove mathematical theorems, as well as generate and reason about computer programs. Specifically, we will cover the following topics:
- Inference-time techniques for reasoning
- Post-training methods for reasoning
- Search and planning
- Agentic workflows, tool use, and function calling
- LLMs for code generation and verification
- LLMs for mathematics: data curation, continual pretraining, and finetuning
- LLM agents for theorem proving and autoformalization
Syllabus
‡Livestream Only
Date | Guest Lecture (4:00PM-6:00PM PT) | Supplemental Readings |
---|---|---|
Jan 27th | Inference-Time Techniques for LLM Reasoning. Xinyun Chen, Google DeepMind. Intro, Slides | Large Language Models as Optimizers; Large Language Models Cannot Self-Correct Reasoning Yet; Teaching Large Language Models to Self-Debug. All readings are optional this week. |
Feb 3rd‡ | Learning to reason with LLMs. Jason Weston, Meta. Slides | Direct Preference Optimization: Your Language Model is Secretly a Reward Model; Iterative Reasoning Preference Optimization; Chain-of-Verification Reduces Hallucination in Large Language Models |
Feb 10th‡ | On Reasoning, Memory, and Planning of Language Agents. Yu Su, Ohio State University. Slides | Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization; HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models; Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents |
Feb 17th | No Class - Presidents’ Day | |
Feb 24th‡ | Reasoning and Planning in Large Language Models. Hanna Hajishirzi, University of Washington | Tulu 3: Pushing Frontiers in Open Language Model Post-Training; Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback; OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs |
Mar 3rd | Coding Agents and AI for Vulnerability Detection. Charles Sutton, Google DeepMind | |
Mar 10th‡ | Coding agents/web agents. Ruslan Salakhutdinov, CMU/Meta | |
Mar 17th | Multimodal Agents. Caiming Xiong, Salesforce AI Research | |
Mar 24th | No Class - Spring Recess | |
Mar 31st‡ | AlphaProof. Thomas Hubert, Google DeepMind | |
Apr 7th | Language models for autoformalization and theorem proving. Kaiyu Yang, Meta FAIR | |
Apr 14th‡ | Advanced topics in theorem proving. Sean Welleck, CMU | |
Apr 21st‡ | Program verification & generating verified code. Swarat Chaudhuri, UT Austin | |
Apr 28th‡ | Agent safety & security. Dawn Song, UC Berkeley | |
Enrollment and Grading
Prerequisites: Students are strongly encouraged to have experience with and a basic understanding of machine learning and deep learning before taking this class, e.g., by having taken courses such as CS182, CS188, and CS189.
Please fill out the petition form if you are on the waitlist or can’t get added to the waitlist.
This is a variable-unit course. All enrolled students are expected to participate in lectures in person and complete weekly reading summaries related to the course content. Students enrolling in one unit are expected to submit an article that summarizes one of the lectures. Students enrolling in more than one unit are expected to submit a lab assignment and a project instead of the article. For students enrolling in 2 units, the project should have a written report, which can be a survey in a certain area related to LLMs. For students enrolling in 3 or 4 units, projects will follow either an applications track or a research track:
- Applications Track: Projects in this track focus on applied use cases of LLMs and do not necessarily need to contribute novel research. Students in this track will work in groups of 3-4. The project for 3-unit students should include an implementation (coding) component that programmatically interacts with LLMs, while 4-unit students must complete a more substantial implementation with the potential for real-world impact.
- Research Track: Students in this track will conduct novel research under the supervision of postdocs and graduate students, with the goal of publishing in a workshop or conference. Research track projects must be completed in groups of 2-3, and students must apply to participate via a forthcoming Google form. The expectations for implementation and intellectual contributions will align with the project requirements for 3- and 4-unit students.
The grade breakdowns for students enrolled in different units are the following:
Component | 1 unit | 2 units | 3/4 units |
---|---|---|---|
Participation | 40% | 16% | 8% |
Reading Summaries | 10% | 4% | 2% |
Quizzes | 10% | 4% | 2% |
Article | 40% | | |
Lab | | 16% | 8% |
Project: Proposal | | 10% | 10% |
Project: Milestone | | 10% | 10% |
Project: Presentation | | 20% | 15% |
Project: Report | | 20% | 20% |
Project: Implementation | | | 25% |
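As a rough illustration of how the weights above combine, the following Python sketch checks that each unit option sums to 100% and computes a weighted total. The names `WEIGHTS` and `weighted_grade` are hypothetical, and the sketch assumes a plain weighted sum of per-component scores; the course page does not state how staff handle rounding, late work, or any curve.

```python
# Weights (in percent) copied from the grade breakdown table.
# WEIGHTS and weighted_grade are illustrative names, not part of any course tooling.
WEIGHTS = {
    "1 unit": {
        "Participation": 40, "Reading Summaries": 10, "Quizzes": 10, "Article": 40,
    },
    "2 units": {
        "Participation": 16, "Reading Summaries": 4, "Quizzes": 4, "Lab": 16,
        "Proposal": 10, "Milestone": 10, "Presentation": 20, "Report": 20,
    },
    "3/4 units": {
        "Participation": 8, "Reading Summaries": 2, "Quizzes": 2, "Lab": 8,
        "Proposal": 10, "Milestone": 10, "Presentation": 15, "Report": 20,
        "Implementation": 25,
    },
}


def weighted_grade(units, scores):
    """Return a percentage total, assuming a plain weighted sum.

    `scores` maps each component name to a fraction in [0, 1].
    """
    weights = WEIGHTS[units]
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Sanity check: every unit option adds up to 100%.
for units, weights in WEIGHTS.items():
    assert sum(weights.values()) == 100, units

# Example: a 2-unit student with full marks everywhere except half credit on the lab.
example_scores = {name: 1.0 for name in WEIGHTS["2 units"]}
example_scores["Lab"] = 0.5
print(weighted_grade("2 units", example_scores))
```

In this example the lab carries 16% of the grade, so half credit on it costs 8 points out of 100.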
Lab and Project Timeline
Item | Released | Due |
---|---|---|
Project group formation | 1/27 | 2/24 |
Project proposal | 2/3 | 2/24 |
Project milestone | 2/24 | 3/31 |
Lab | 3/31 | 4/28 |
Project final presentation | 4/28 | 5/9 |
Project final poster | 4/28 | 5/9 |
Project final report | 4/28 | 5/16 |
Office Hours
- Alex: 6-7pm on Mondays on Zoom