OpenAI’s Aardvark Is an AI Security Agent Combating Code Vulnerabilities

OpenAI on Thursday launched Aardvark, an artificial intelligence (AI) agent designed to autonomously detect and help fix security vulnerabilities in software code, offering defenders a potentially valuable tool against malicious hackers.

The GPT-5-powered tool, currently in private beta, represents what OpenAI calls a “defender-first model” that continuously monitors code repositories to identify vulnerabilities as software evolves. By integrating with OpenAI Codex, Aardvark not only flags security issues but also generates patches that developers can review and apply with a single click.

By automating vulnerability discovery and remediation, Aardvark could help development teams address security issues faster and more comprehensively than traditional manual review processes allow.

“By catching vulnerabilities early, validating real-world exploitability, and offering clear fixes, Aardvark can strengthen security without slowing innovation,” OpenAI said in a blog post announcing the new AI agent.

“Software is now the backbone of every industry — which means software vulnerabilities are a systemic risk to businesses, infrastructure, and society. Over 40,000 CVEs were reported in 2024 alone,” OpenAI said. “Our testing shows that around 1.2% of commits introduce bugs — small changes that can have outsized consequences.”

The Microsoft Corp.-backed company has been testing Aardvark internally for several months across its own codebases and with close partners. The agent has also been applied to open-source projects, demonstrating its versatility beyond proprietary software.

Aardvark operates through a multi-stage process that leverages large language model (LLM) reasoning and tool integration. First, it analyzes a repository to understand the codebase’s purpose, objectives, and security implications. The agent then scans for vulnerabilities by examining both historical commits and newly added code.

When Aardvark identifies potential security issues, it annotates the problematic code with explanations that human developers can review. The system goes further by attempting to validate vulnerabilities in a sandboxed environment, actively trying to trigger them to confirm their existence. Results are tagged with metadata that allows teams to filter and prioritize findings.
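To make the filtering and prioritizing step concrete, here is a minimal Python sketch of how a team might triage tagged findings. OpenAI has not published Aardvark's output schema, so the `Finding` record, its field names, the 1-to-4 severity scale, and the `prioritize` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


# Hypothetical record: field names are illustrative, not Aardvark's real schema.
@dataclass(frozen=True)
class Finding:
    path: str          # file the annotation is attached to
    explanation: str   # human-readable note for the reviewer
    severity: int      # assumed scale: 1 (low) to 4 (critical)
    validated: bool    # True if the issue was triggered in a sandbox


def prioritize(findings, min_severity=1):
    """Drop findings below min_severity, then put confirmed, severe ones first."""
    kept = [f for f in findings if f.severity >= min_severity]
    # False sorts before True, so validated findings lead; ties break on severity.
    return sorted(kept, key=lambda f: (not f.validated, -f.severity))


# Made-up sample data for the sketch.
findings = [
    Finding("auth/session.py", "token compared without constant-time check", 2, False),
    Finding("api/upload.py", "path traversal via unsanitized filename", 4, True),
    Finding("util/log.py", "verbose error leaks stack trace", 1, False),
    Finding("db/query.py", "string-built SQL query", 3, True),
]

queue = prioritize(findings, min_severity=2)
for f in queue:
    print(f.path, f.severity, "validated" if f.validated else "unconfirmed")
```

In this sketch, sandbox-confirmed findings always outrank unconfirmed ones, on the assumption that a reproduced issue deserves attention before a suspected one. A real team might weight the metadata differently.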

Beyond detection, Aardvark assists with remediation by generating patches with Codex that are scanned before they reach developers for review. While primarily focused on security vulnerabilities, the tool can also identify other code issues, including logic flaws and incomplete fixes.
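The stages described above (analyze the repository, scan changes, validate in a sandbox, propose a patch) can be pictured as a simple pipeline. The Python sketch below is not Aardvark's implementation. The `Issue` type, the `run_pipeline` function, and the toy stage functions are hypothetical stand-ins, and the toy scanner merely looks for the string `eval(`, whereas the real agent relies on LLM reasoning. Proposing patches only for confirmed issues is also an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Issue:
    location: str
    note: str                     # explanation for the human reviewer
    confirmed: bool = False       # set by the validation stage
    patch: Optional[str] = None   # proposed fix, still subject to human review


def run_pipeline(repo, changes, analyze, scan, validate, propose_patch):
    """Chain the four stages; each stage is passed in as a plain callable."""
    context = analyze(repo)            # 1. understand what the codebase is for
    issues = scan(changes, context)    # 2. look for suspicious changes
    for issue in issues:
        issue.confirmed = validate(issue)        # 3. try to trigger the issue
        if issue.confirmed:                      # assumption: patch confirmed issues only
            issue.patch = propose_patch(issue)   # 4. draft a fix for review
    return issues


# Toy stand-ins so the sketch runs; none of these reflect real detection logic.
def toy_analyze(repo):
    return f"{repo}: service that evaluates user-supplied expressions"


def toy_scan(changes, context):
    return [Issue(path, "calls eval() on input")
            for path, source in changes.items() if "eval(" in source]


def toy_validate(issue):
    return True  # pretend the sandbox run reproduced the problem


def toy_patch(issue):
    return f"replace eval() in {issue.location} with ast.literal_eval()"


changes = {
    "app/calc.py": "result = eval(user_input)",
    "app/views.py": "return render(template)",
}
issues = run_pipeline("example-repo", changes,
                      toy_analyze, toy_scan, toy_validate, toy_patch)
for issue in issues:
    print(issue.location, issue.confirmed, issue.patch)
```

Passing each stage in as a callable keeps the stages independent, so any one of them could be swapped out or tested in isolation. The final decision to apply a patch stays with a human reviewer, as the article describes.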

Aardvark is available through private beta to select partners invited by OpenAI.

Jon Swartz