Frontier AI Cybersecurity Observatory

Artificial intelligence (AI) is evolving at an unprecedented pace, making it increasingly difficult to anticipate its societal impacts and risks. In cybersecurity, AI plays a dual role, strengthening both offensive and defensive capabilities. Recent benchmarks such as CyberGym and BountyBench, for example, have shown that AI agents can already tackle real-world cybersecurity tasks, including zero-day discovery. It is therefore critical that developers, researchers, and policymakers can follow these capabilities as they emerge.

To address this need, we are building the “Frontier AI Cybersecurity Observatory”, a central hub and open platform for continuously tracking frontier AI capabilities in cybersecurity. By aggregating and maintaining cybersecurity benchmarks across the stages of attack and defense, the observatory will enable the following:

Want to add your own benchmark?

Please visit our Hugging Face page and contribute by submitting a pull request!

Have suggestions to improve the observatory?

As this is an early-stage effort, we are actively gathering feedback from the community and would greatly value your input. Please share your suggestions here.