50× Faster Than Human Experts: Setting a New Benchmark for AI-Driven Offense
San Francisco, CA – August 14, 2025 — Horizon3.ai today announced that NodeZero®, its autonomous penetration testing platform, is the first AI to fully solve the Game of Active Directory (GOAD) — a respected benchmark for Active Directory exploitation — completing the challenge in just 14 minutes.
GOAD, developed by Orange Cyberdefense, simulates a realistic multi-domain enterprise network with the same trust abuses, misconfigurations, and security controls attackers exploit in the wild. Solving it requires chaining reconnaissance, credential abuse, privilege escalation, lateral movement, and persistence across multiple hosts and domains.
Recent Carnegie Mellon University research underscores how difficult this is: state-of-the-art LLMs like GPT-4o, Gemini 2.5 Pro, and Sonnet 3.7 — even with advanced prompting frameworks — failed to reliably execute multi-host intrusions, capturing less than 30% of attack graph states in labs capped at 50 hosts.
For both humans and algorithms, GOAD is a stress test of scale, reasoning, and persistence. Attack paths are not linear — they require maintaining multi-hop memory across dozens of steps, adapting execution priorities based on partial successes, and exploiting inter-domain trust boundaries under realistic constraints.
NodeZero’s solve time of 14 minutes is 50× faster than an expert human, with perfect execution of the full attack chain from initial foothold to complete domain compromise.
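The release does not state the expert-human baseline behind the 50× figure. As a rough back-of-the-envelope sketch, assuming "50× faster" means the human solve time is 50 times the 14-minute NodeZero time (an interpretation, not a number given by Horizon3.ai), the implied baseline can be worked out as follows:

```python
# Figures quoted in the release.
nodezero_minutes = 14
speedup = 50

# ASSUMPTION: "50x faster" means the expert's time is 50 times longer.
# The release itself does not publish the human baseline.
implied_human_minutes = nodezero_minutes * speedup
implied_human_hours = implied_human_minutes / 60

print(f"Implied expert time: {implied_human_minutes} min "
      f"(~{implied_human_hours:.1f} h)")
```

Under that reading, the implied expert solve time is 700 minutes, a little under 12 hours of hands-on work.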
NodeZero operates in a different league, applying the same architecture that solved GOAD in minutes to production-scale networks across industries. Its campaigns directly map to the breach patterns documented in the 2025 Verizon DBIR, IBM X-Force, and Mandiant M-Trends reports.
In the NSA’s Continuous Autonomous Penetration Testing (CAPT) program, NodeZero has already shown this capability at national scale.
“GOAD is an excellent benchmark, but its real value is how closely it reflects what’s happening in the wild,” said Snehal Antani, CEO and Co-Founder of Horizon3.ai. “When you can solve GOAD in minutes, and then turn that same capability loose in production networks — aligned to the exact tactics in the DBIR, IBM, and Mandiant reports — you’re not just testing security, you’re closing the gap between how attackers operate and how defenders respond.”
While competitors make bold marketing claims about being “#1,” NodeZero delivers proof — in the lab and in live, complex environments — at a scale no other platform has matched.
Learn More About NodeZero vs. GOAD.
Horizon3.ai empowers organizations to continuously verify their security posture with NodeZero®, the industry’s leading autonomous pentesting platform. Built to think and act like an attacker — but operate safely in production — NodeZero identifies exploitable weaknesses, prioritizes fixes based on real-world impact, and verifies remediation at scale. Customers across manufacturing, healthcare, finance, and national security rely on NodeZero to reduce risk and accelerate security outcomes.
Follow Horizon3.ai on LinkedIn and X.
Brittney Blanchard
Highwire
[email protected]