Lambda
$400 cloud credits to every individual or team
Compete for over $1M in prizes and resources. This two-phase competition challenges participants to first build novel benchmarks or enhance existing benchmarks for agentic AI (Phase 1), and then create AI agents to excel on them (Phase 2)—advancing the field by creating high-quality, broad-coverage, realistic agent evaluations as shared public goods.
As AI evolves toward agentic systems—capable of reasoning, taking actions, and interacting with the world—current benchmarking methods fall short. Existing evaluations suffer from poor interoperability (agents must be heavily modified to fit each benchmark), limited reproducibility (stateful tools and dynamic configurations cause inconsistent results), fragmentation (leaderboards and results are scattered across platforms), and poor discoverability (with new benchmarks appearing almost weekly, finding the right one is surprisingly hard).
Our vision is a unified, open space where the community defines the goalposts of agentic AI—through benchmarks that are standardized, reproducible, collaborative, and discoverable.
Through the AgentX–AgentBeats competition, we aim to bring the community together to create high-quality, broad-coverage, realistic agent evaluations—developed in an agentified, standardized, reproducible, and collaborative way—as shared public goods for advancing agentic AI.
$400 cloud credits to every individual or team
$50 inference credits to every individual or team
$100 cloud credits to every individual or team
$50 inference credits to every individual or team
$50 credits to every individual or team
Additional resources will be announced soon.
Up to $50k prize pool in GCP/Gemini credits to be shared among the winning teams.
Up to $50k prize pool in inference credits to be shared among the winning teams.
OpenAI credits of $10,000, $5,000, and $1,000 will be awarded to the 1st, 2nd, and 3rd place winners in each of the two tracks: the Research Track and the Finance Agent Track.
Up to $10k prize pool in AWS credits to be shared among the winning teams.
Each winning team member who is currently a student will receive:
Up to $50k prize pool in GCP/Gemini credits to be shared among the winning teams.
Up to $50k prize pool in inference credits to be shared among the winning teams.
OpenAI credits of $10,000, $5,000, and $1,000 will be awarded to the 1st, 2nd, and 3rd place winners in each of the two tracks: the Research Track and the Finance Agent Track.
Up to $10k prize pool in AWS credits to be shared among the winning teams.
Each winning team member who is currently a student will receive:
Each winning team will receive up to 2 complimentary tickets to the Agentic AI Summit 2026 (August 1–2 at UC Berkeley).
Additional prize partners will be announced soon.
Hugging Face credits of $5,000, $3,000, and $2,000 will be awarded to the 1st, 2nd, and 3rd place winners in the custom track—the OpenEnv Challenge.
AgentBeats is an open-source platform built on the new Agentified Agent Assessment (AAA) paradigm: instead of adapting your agent to fit a rigid benchmark, the benchmark itself becomes an agent. A 🟢 green agent (evaluator) defines the tasks, environment, and scoring; a 🟣 purple agent (competitor) is the AI agent under test. They communicate via the A2A protocol, so you build your agent once and it works with any benchmark on the platform.
Want a detailed walkthrough? Watch the competition intro video or skim the slides.
Oct 16, 2025 to Jan 31, 2026
Participants build green agents that define assessments and automate scoring. Pick your evaluation track:
March 2 to May 24, 2026
We're excited to announce that Phase 2 of the AgentX–AgentBeats competition will officially launch on March 2, 2026. Participants will build purple agents to tackle the select top green agents from Phase 1 and compete on the public leaderboards. Unlike Phase 1, where participants competed across all tracks throughout the entire duration, Phase 2 introduces a sprint-based format. The competition will be organized into four rotating sprints:
Mark your calendars for the track(s) that excite you most—we'll release the official benchmarks and green agents for each track as their sprint approaches. Keep an eye on our announcements, as we may introduce additional tracks throughout the competition based on community interest and emerging opportunities.
A red-teaming and automated security testing challenge.
Learn more about the challenge—full details and guidelines are available here.
Phase 2: Feb 23, 2026 – March 30, 2026
Learn more about the τ²-Bench Challenge—full details and guidelines are available here.
Deadline: March 30, 2026
SOTA Environments to drive general intelligence.
Learn more about the challenge—full details and guidelines are available here.
Deadline: April 12, 2026
More custom tracks to be announced...
| Date | Event |
|---|---|
| Oct 16, 2025 | Participant registration open |
| Oct 24, 2025 | Team signup & Build Phase 1 |
| Jan 31, 2026 | Green agent submission |
| Feb 1, 2026 | Green agent judging |
| March 2, 2026 | Phase 2: Build purple agents |
| May 24, 2026 | End of Phase 2 |