Buckle up, Buttercup, AIxCC’s scored round is underway!

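Summary: The finals of DARPA's AI Cyber Challenge have begun, with competing teams using autonomous AI systems such as Buttercup to find and fix software vulnerabilities. The competition runs for ten days and involves vulnerabilities in 30 real-world open-source programs. The winner will receive a $4 million prize. After the competition, the Buttercup team will open-source its system and share technical details.
2025-07-02 11:00:00 Author: blog.trailofbits.com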

The one and only scored round of DARPA’s AI Cyber Challenge (AIxCC) Finals Competition has officially started! Our CRS (Cyber Reasoning System), Buttercup, is now competing against six other teams to see which autonomous AI-driven system can find and patch the most software vulnerabilities. It’s been a long road to this point, and we’re excited to see the results of our hard work over the last two years building Buttercup.

After the scored round closes, DARPA and ARPA-H will announce the winners on the main DEFCON 33 stage on August 8. The top-scoring CRS will receive a $4 million prize, with the two runners-up receiving $3 million and $1.5 million. Our team will be there to watch the final reveal live and will also take part in the larger AIxCC experience in various ways. If you're planning to come to DEFCON this August, please come see us at our booth in the AIxCC Experience and attend our talk on the AIxCC stage (date/time TBD) about the ups and downs of building Buttercup and competing in AIxCC.

What’s happening in the scored round?

Over a ten-day period, each competing CRS will be tasked with finding and patching multiple vulnerabilities in 30 different real-world, open-source programs. These programs are chosen from the most heavily used C and Java open-source programs, and the vulnerabilities they contain are often actual historic vulnerabilities that have been strategically re-injected by the competition organizers. SQLite, cURL, libxml2, Apache ZooKeeper, AWS s2n-tls, and even the Linux kernel are among the programs that have been used in prior rounds.

Each CRS will be given a total of 75 challenges based on these open-source programs over the ten-day period, with up to eight challenges coming in each wave. Each challenge comes equipped with OSS-Fuzz-compatible fuzzing harnesses and, in many cases, a set of functional tests. A CRS can score points by:

Proving that a vulnerability exists in the program by finding an input that crashes the program or triggers a sanitizer at runtime
Fixing a vulnerability in the program with a patch that addresses the root cause of the vulnerability and does not break functional tests
Classifying a static analysis alert highlighting a possible vulnerability as a true- or false-positive
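
To make the first two scoring paths concrete, here is a toy sketch in Python. Real challenges wrap C or Java code in OSS-Fuzz-compatible harnesses and rely on sanitizers; everything below (the `parse_record` target, its planted bug, the random search, and the tiny functional test) is a hypothetical stand-in and says nothing about how Buttercup or any other CRS actually works.

```python
import random
from typing import Callable, Optional

MAGIC = 0x52  # hypothetical magic byte ('R') that opens every record


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: parse a record laid out as MAGIC, length, payload."""
    if len(data) < 2 or data[0] != MAGIC:
        return b""
    declared_len, payload = data[1], data[2:]
    # Planted bug: declared_len is trusted without a bounds check. In C this
    # would be an out-of-bounds read that a sanitizer reports; in this toy
    # version it surfaces as an IndexError.
    return bytes(payload[i] for i in range(declared_len))


def parse_record_patched(data: bytes) -> bytes:
    """Candidate patch: reject records that declare more bytes than they carry."""
    if len(data) < 2 or data[0] != MAGIC:
        return b""
    declared_len, payload = data[1], data[2:]
    if declared_len > len(payload):
        return b""
    return payload[:declared_len]


def crashes(target: Callable[[bytes], bytes], data: bytes) -> bool:
    """Harness: feed one input to the target and report whether it 'crashed'."""
    try:
        target(data)
    except IndexError:
        return True
    return False


def find_crashing_input(target: Callable[[bytes], bytes],
                        seed: int = 0,
                        max_tries: int = 10_000) -> Optional[bytes]:
    """Naive random search for a crashing input (a proof of vulnerability)."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        body = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 8)))
        candidate = bytes([MAGIC]) + body
        if crashes(target, candidate):
            return candidate
    return None


def functional_tests_pass(target: Callable[[bytes], bytes]) -> bool:
    """Stand-in for a project's functional tests: valid behavior must survive."""
    return target(b"R\x03abcdef") == b"abc" and target(b"X\x01a") == b""
```

In this sketch, a patch would count only if the crashing input found against `parse_record` no longer crashes `parse_record_patched` and `functional_tests_pass(parse_record_patched)` still holds, mirroring the two conditions in the list above.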

To accomplish this, each CRS has been given an $85,000 compute budget and a $50,000 third-party AI budget. The scale of AIxCC's scored round is massive, and for good reason. The CRS that wins this competition will prove that it can immediately scale to the challenge of securing the vast open-source software ecosystem.
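
For a rough sense of scale, the budgets can be divided across the 75 challenges. The even split below is purely an illustrative assumption; nothing in the competition requires a CRS to spend the same amount on every challenge.

```python
# Figures taken from the post; the even per-challenge split is an assumption.
COMPUTE_BUDGET_USD = 85_000
AI_BUDGET_USD = 50_000
TOTAL_CHALLENGES = 75


def per_challenge(budget_usd: float, challenges: int = TOTAL_CHALLENGES) -> float:
    """Average budget available per challenge under an even split."""
    return budget_usd / challenges


compute_avg = per_challenge(COMPUTE_BUDGET_USD)
ai_avg = per_challenge(AI_BUDGET_USD)
print(f"compute per challenge: ${compute_avg:,.2f}")
print(f"third-party AI per challenge: ${ai_avg:,.2f}")
```

Under that assumption, a CRS has a little over $1,100 of compute and under $700 of third-party AI spend per challenge, which is why budget allocation matters as much as raw capability.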

What’s next for our team?

While Buttercup is competing and we await the announcement of the winning teams, we’re still hard at work making Buttercup even better! In the coming month, we will be preparing Buttercup to be released as open-source software, which we expect to make available in August. We’re also working on building a version of Buttercup that can be run on commodity hardware so everyone can try it out!

Also, once the competition is over, we can finally share technical details on how Buttercup works. Stay tuned for technical deep dives on how Buttercup uses AI to accelerate traditional fuzz testing and create high-quality patches for vulnerabilities!

For more background, see our previous posts on the AIxCC:

Source: https://blog.trailofbits.com/2025/07/02/buckle-up-buttercup-aixccs-scored-round-is-underway/