AI agents have fundamentally changed the threat model of AI model-based applications. By equipping these models with plugins (also called tools), your agents no longer just generate text; they now read files, search connected databases, run scripts, and perform other tasks to actively operate on your network.
Because of this, vulnerabilities in the AI layer are no longer just a content issue; they are an execution risk. If an attacker can control the parameters passed into these plugins via prompt injection, the agent may be driven to perform actions beyond its intended use.
The AI model itself isn’t the issue; it’s behaving exactly as designed, parsing language into tool schemas. The vulnerability lies in how the framework and tools trust the parsed data.
To build powerful applications, developers rely heavily on frameworks like Semantic Kernel, LangChain, and CrewAI. These frameworks act as the operating system for AI agents, abstracting away complex model orchestration. But this convenience comes with a hidden cost: because these frameworks act as a ubiquitous foundational layer, a single vulnerability in how they map AI model outputs to system tools carries systemic risk.
As part of our mission to make AI systems more secure and eliminate new classes of vulnerabilities, we’re launching a research series focused on identifying vulnerabilities in popular AI agent frameworks. Through responsible disclosure, we work with maintainers to ensure issues are addressed before sharing our findings with the community.
In this post, we share details on the vulnerabilities we discovered in Microsoft’s Semantic Kernel, along with the steps we took to address them and an interactive way to try it yourself. Stay tuned for upcoming blogs where we’ll dive into similar vulnerabilities found in frameworks beyond the Microsoft ecosystem.
We discovered a vulnerable path in Microsoft Semantic Kernel that could turn prompt injection into host-level remote code execution (RCE).
A single prompt was enough to launch calc.exe on the device running our AI agent, with no browser exploit, malicious attachment, or memory corruption bug needed. The agent simply did what it was designed to do: interpret natural language, choose a tool, and pass parameters into code.

This scenario is the real security story behind modern AI agents. Once an AI model is wired to tools, only a thin line separates prompt injection as a content security problem from prompt injection as a code execution primitive. In this post in our research series on AI agent framework security, we show how two vulnerabilities in Semantic Kernel could allow attackers to cross that line, and what customers should do to assess exposure, patch affected agents, and investigate whether exploitation may already have occurred.
Semantic Kernel is Microsoft’s open-source framework for building AI agents and integrating AI models into applications. With over 27,000 stars on GitHub, it provides essential abstractions for orchestrating AI models, managing plugins, and chaining workflows.
During our security research into the Semantic Kernel framework, we identified and disclosed two critical vulnerabilities: CVE-2026-25592 and CVE-2026-26030. These flaws, which have since been fixed, could allow an attacker to achieve unauthorized code execution by leveraging injection attacks specifically targeted at agents built within the framework.
In the following sections, we break down the mechanics of these vulnerabilities in detail and provide actionable guidance on how to harden your agents against similar exploitation.
Exploitation of this vulnerability requires two conditions:
When both conditions are met, the vulnerability enables an attacker to achieve RCE from a prompt.
To demonstrate how this vulnerability could be exploited, we built a “hotel finder” agent using Semantic Kernel. First, we created an In Memory Vector collection to store the hotels’ data, then exposed a search_hotels(city=…) function to the kernel (agent) so that the AI model could invoke it through tool calling.
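A minimal sketch of such an agent plugin, with illustrative names and data (in Semantic Kernel, the function would be registered with the @kernel_function decorator so the model can invoke it through tool calling; it’s shown here as plain Python):

```python
from dataclasses import dataclass

@dataclass
class Hotel:
    name: str
    city: str

# Illustrative in-memory dataset standing in for the vector collection.
HOTELS = [
    Hotel("Grand Hotel", "Paris"),
    Hotel("Seine View", "Paris"),
    Hotel("Harbor Inn", "Sydney"),
]

# In Semantic Kernel this would carry @kernel_function so the AI model
# could call it as a tool; the body sketches only the deterministic
# pre-filter step (the real plugin would then rank by embedding similarity).
def search_hotels(city: str) -> list[Hotel]:
    return [h for h in HOTELS if h.city == city]

print([h.name for h in search_hotels("Paris")])
```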

When a user inputs, for example, "Find hotels in Paris," the AI model calls the search plugin with city="Paris". The plugin first runs a deterministic filter function to narrow down the dataset, then computes vector similarity (embeddings).
With this understanding of how a Semantic Kernel agent performs the search, let’s dive deep into the vulnerability.
The default filter function that we mentioned previously is implemented as a Python lambda expression executed using eval(). In our example, the default filter resolves to new_filter = "lambda x: x.city == 'Paris'".
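A simplified reconstruction of the vulnerable pattern (function and variable names are illustrative, not the framework’s exact code):

```python
def build_filter(param_name: str, value: str):
    # The model-controlled value is interpolated directly into source code...
    new_filter = f"lambda x: x.{param_name} == '{value}'"
    # ...and eval() turns that string into a callable: the injection sink.
    return eval(new_filter)

class Record:
    def __init__(self, city):
        self.city = city

flt = build_filter("city", "Paris")
print(flt(Record("Paris")), flt(Record("Sydney")))  # True False
```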

The vulnerability is that kwargs[param.name] is AI model-controlled and not sanitized. This acts as a classic injection sink. By closing the quote (') and appending Python logic, an attacker could turn a simple data lookup into an executable payload:
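Using the same simplified pattern as above, a benign demonstration of the sink: a city value that closes the string literal and appends its own logic rewrites the filter entirely. Here it merely makes every record match, but the same break-out admits arbitrary expressions:

```python
class Record:
    def __init__(self, city):
        self.city = city

def build_filter(value: str):
    # Vulnerable pattern: model-controlled value interpolated into code.
    return eval(f"lambda x: x.city == '{value}'")

# The attacker closes the quote and appends their own logic.
injected = "Nowhere' or 1 == 1 or x.city == '"
flt = build_filter(injected)

# Every record now matches regardless of city -- the filter is attacker logic.
print([flt(Record(c)) for c in ("Paris", "Sydney")])  # [True, True]
```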
The framework developers anticipated this RCE risk and implemented a validator that parses the filter string into an Abstract Syntax Tree (AST) before execution.

Before running a user-provided filter code, the application runs a validation function designed to block unsafe operations. At a high level, the validation does the following:
The resulting lambda is then used to filter records in the Vector Store.
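A sketch of that style of validation (the blocklist contents and function names here are illustrative, not the framework’s exact list):

```python
import ast

# Illustrative blocklist of dangerous identifiers.
BLOCKED_NAMES = {"eval", "exec", "__import__", "open", "compile"}

def validate_filter(src: str) -> str:
    tree = ast.parse(src, mode="eval")
    # Structural check: the expression must be a single lambda.
    if not isinstance(tree.body, ast.Lambda):
        raise ValueError("filter must be a lambda expression")
    # Blocklist walk over name and attribute nodes.
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            raise ValueError(f"blocked name: {node.id}")
        if isinstance(node, ast.Attribute) and node.attr in BLOCKED_NAMES:
            raise ValueError(f"blocked attribute: {node.attr}")
    return src

print(validate_filter("lambda x: x.city == 'Paris'"))  # accepted
```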
While this approach is solid in theory, blocklists in dynamic languages like Python are inherently fragile because the language’s flexibility allows restricted operations to be reintroduced through alternate syntax, libraries, or runtime evaluation.
We found a way to bypass this blocklist implementation through a specially crafted exploit prompt.
Our exploit prompt was designed to manipulate the agent into triggering a Search Plugin invocation with an input that ultimately leads to malicious code execution:

This prompt manipulated the agent into triggering the following function call:

As a result, the lambda function was formatted as follows and executed inside eval(). This payload escaped the template string, traversed Python’s class hierarchy to locate BuiltinImporter, and used it to dynamically load os and call system(). These steps bypassed the import blocklists to launch an arbitrary shell command (for example, calc.exe) while keeping the template syntax valid with a clean closing expression.
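A defanged sketch of that traversal: starting from a tuple literal, it walks up to object, enumerates its subclasses, and locates BuiltinImporter by name. It stops there; the real payload went on to load os and call system():

```python
# Start from a tuple literal: no name lookups, no built-ins required.
# tuple's MRO ends at object, whose __subclasses__() enumerates every class.
subclasses = ().__class__.__mro__[-1].__subclasses__()

# Locate the importer class the payload abused to reach the os module.
importer = next(c for c in subclasses if c.__name__ == "BuiltinImporter")
print(importer.__name__)  # BuiltinImporter
```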
The filter function didn’t block the payload because of the following reasons:
1. Missing dangerous names
The payload used several attributes that weren’t in the blocklist:
2. Structural check passes
The payload was wrapped inside a valid lambda expression. The check isinstance(tree.body, ast.Lambda) passed because the entire expression is itself a lambda that simply contains malicious code in its body.
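A quick illustration, using a deliberately harmless body: a structural check that only asks "is this a lambda?" is satisfied no matter what the body does.

```python
import ast

# The whole expression is one lambda, so the structural check passes
# regardless of what the lambda's body contains.
payload = "lambda x: ().__class__.__mro__"
tree = ast.parse(payload, mode="eval")
print(isinstance(tree.body, ast.Lambda))  # True
```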
3. Empty __builtins__ is irrelevant
The eval() call used {"__builtins__": {}} to remove access to built-in functions. However, this protection was meaningless because the payload never used built-ins directly. Instead, it started with tuple(), which exists regardless of the builtins environment, and crawled through Python’s type system to reach dangerous functionality.
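The effect can be seen with a tuple literal, which needs no name lookup at all. Even when the lambda is evaluated with built-ins stripped, attribute traversal from the literal still reaches object and its full subclass list:

```python
# Evaluate the filter with built-ins removed, as the vulnerable sink did.
fn = eval(
    "lambda x: ().__class__.__mro__[-1].__subclasses__()",
    {"__builtins__": {}},
)

# Calling the lambda still yields every subclass of object: the empty
# builtins dict never comes into play because no built-in name is looked up.
classes = fn(None)
print(isinstance(classes, list) and len(classes) > 0)  # True
```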
4. No ast.Subscript checking
While not used in this payload, it’s worth noting that the filter only checked ast.Name and ast.Attribute nodes. If the payload needed to use a blocked name, it could’ve accessed it using bracket notation (for example, obj['__class__'] instead of obj.__class__), which creates an ast.Subscript node that the validation completely ignored.
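The blind spot is easy to see in the AST itself: the same logical access parses into two different node types, only one of which the validator inspected.

```python
import ast

dotted = ast.parse("x.__class__", mode="eval").body
bracket = ast.parse("x['__class__']", mode="eval").body

# Dot access produces an ast.Attribute node, which the validator checked...
print(type(dotted).__name__)   # Attribute
# ...but bracket notation produces ast.Subscript, which it never examined.
print(type(bracket).__name__)  # Subscript
```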
After responsibly disclosing the vulnerability to MSRC, the Microsoft Semantic Kernel team implemented a comprehensive fix using four layers of protection to eliminate every escape primitive needed to turn a lambda filter into executable code:
Your agent is vulnerable to CVE-2026-26030 if it meets all of the following conditions:
You don’t need to rewrite your agent. Upgrading the Python semantic-kernel dependency to version 1.39.4 or higher mitigates the risk.
Patching closes the bug, but it doesn’t answer the retrospective question defenders care about: whether their agent was exploited before they upgraded.
First, define the vulnerable window for each affected deployment: from the moment a vulnerable Semantic Kernel Python version was deployed until the moment version 1.39.4 or later was installed. Any investigation should focus on that time range.
Second, hunt for host-level post-exploitation signals during that vulnerable window. Because successful exploitation results in code execution on the host, the most useful evidence is in endpoint telemetry: suspicious child processes, outbound connections, or persistence artifacts created by the agent host process. We provide a set of practical advanced hunting queries for further investigation in a separate section of this blog.
If you find suspicious activity during that window, treat it as a potential host compromise. Review the affected host, rotate credentials and tokens accessible to the agent, and investigate what data or systems that host could reach.
Before diving into the mechanics of this second vulnerability, here is what an agent sandbox escape looks like in practice: with a single prompt, an attacker could bypass a cloud-hosted sandbox, write a malicious payload directly to the host device’s Windows Startup folder, and achieve full RCE.
The container boundary
Semantic Kernel includes a built-in plugin called SessionsPythonPlugin that allows agents to safely execute Python code inside Azure Container Apps dynamic sessions, which are isolated, cloud-hosted sandboxes with their own filesystem.
The security model relies entirely on this boundary. Code runs in the isolated sandbox and cannot touch the host device where the agent process runs. To help move data in and out of the sandbox, the plugin uses helper functions like UploadFile and DownloadFile, which run on the host side to transfer files across this boundary.
In the .NET software development kit (SDK), DownloadFileAsync was accidentally marked with a [KernelFunction] attribute, which officially advertised it to the AI model as a callable tool, complete with its parameter schema:

Because of this attribute, the localFilePath parameter, which dictates exactly where File.WriteAllBytes() saves data on the host device, was now entirely AI controlled. With no path validation, directory restriction, or sanitization in place, an attacker wouldn’t need a complex hypervisor exploit; they just needed to prompt the model to do it for them.
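One common mitigation for this class of bug, independent of the framework’s actual fix, is to resolve and confine any model-influenced path to an allowlisted directory before writing. A minimal sketch (the directory name and function are illustrative, shown in Python rather than .NET):

```python
from pathlib import Path

# Illustrative allowlisted directory for agent downloads.
ALLOWED_DIR = Path("downloads").resolve()

def safe_local_path(requested: str) -> Path:
    # Resolve ".." segments and symlinks first, then require the result
    # to remain inside the allowlisted directory.
    candidate = (ALLOWED_DIR / requested).resolve()
    if not candidate.is_relative_to(ALLOWED_DIR):
        raise ValueError(f"path escapes download directory: {requested}")
    return candidate

print(safe_local_path("report.csv").name)  # report.csv
try:
    safe_local_path("../../Start Menu/Programs/Startup/evil.py")
except ValueError:
    print("blocked")
```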
(Note on arbitrary file read: A similar vulnerability existed in reverse for the upload_file() function across both the Python and .NET SDKs. It accepted any local file path without validation, allowing prompt injections to exfiltrate sensitive host files, such as SSH keys or credentials, directly into the sandbox.)
By chaining two exposed tools, an attacker could turn standard function calling into a sandbox escape:

Step 1: Create the payload
An injected prompt instructs the agent to use the ExecuteCode tool to generate a malicious script inside the isolated container:
At this point, the payload is contained. It exists only in the sandbox and cannot execute on the host.
Step 2: Escape the sandbox
A second injected instruction tells the AI model to use the DownloadFileAsync tool to download the file to a dangerous location on the host:

The agent calls:

The agent fetches the script from the sandbox’s API and writes it directly to the host’s Windows\Start Menu\Programs\Startup folder.
Step 3: Execute the code
On the next user sign-in, the script runs, granting full host compromise.
This exploit illustrates the MITRE ATLAS technique AML.T0051 (LLM Prompt Injection) cascading into AML.T0016 (Obtain Capabilities).
Exposing DownloadFileAsync provided a direct file write primitive on the host filesystem, effectively negating the container isolation.
Semantic Kernel patched this vulnerability by removing the root cause of tool exposure and adding defense in depth:
Removed AI access – The [KernelFunction] attribute was removed, making the function invisible to the AI model. The AI agent can no longer invoke it, and prompt injection can no longer reach it:

This single change breaks the entire attack chain. The function can now be called only by the developer’s intentional code.

Your agent is vulnerable to CVE-2026-25592 if it uses a Semantic Kernel .NET SDK version older than 1.71.0.
If you use Semantic Kernel, our primary recommendation is to upgrade immediately. You don’t need to rewrite your agent’s architecture; the security updates simply remove the AI model’s ability to trigger these functions autonomously.
More broadly, defending AI agents requires acknowledging that AI models aren’t security boundaries. Security teams must correlate signals across two layers: the AI model level (intent detection through meta prompts and content safety filters) and the host level (execution detection). If an attacker bypasses the AI model guardrails, traditional endpoint defense must be in place to detect anomalous behavior, such as an AI agent process suddenly spawning command lines or dropping scripts into Startup folders.
Untrusted data being used as input for high-risk operations isn’t entirely new. In the early days of web application security, such input was passed directly into SQL queries or filesystem APIs. Today, agents do something similar: they map untrusted natural-language input to system tools.
The overarching lesson is that neither vulnerability is a bug in the AI model itself; both are issues in agent architecture and tool design. We must draw a clear distinction between model behavior and agent architecture. The AI model functions exactly as it was designed to: translate intent into structured tool calls.
When models are connected to system tools, prompt injection is no longer just typical chatbot misuse; it becomes a direct path to concrete execution primitives like data exfiltration, arbitrary file writes, and RCE. For a deeper look at the runtime risks of tool-connected AI models, see Running OpenClaw safely: identity, isolation, and runtime risk.
As mentioned previously, your LLM is not a security boundary. The tools you expose define your attack surface. Any tool parameter the model can influence must be treated as attacker-controlled input.
In the next blog in this series, we’ll expand beyond Semantic Kernel to explore structurally similar execution vulnerabilities that we found in other widely used third-party agent frameworks.
If you want to see how prompt injections escalate into execution and put your skills to the test, we’ve packaged the vulnerable hotel-finder agent described in this blog into an interactive, hands-on capture-the-flag (CTF) challenge.
This CTF challenge lets you step into the shoes of an attacker and try to exploit the CVE-2026-26030 vulnerability in a controlled environment. You need to craft a prompt injection that not only bypasses the agent’s natural language defenses but also smuggles a Python AST-traversal payload through the vulnerable eval() sink.
To see if you can manipulate the AI model into launching arbitrary code and popping calc.exe on the server, download the challenge, spin it up in a sandbox, and see if you can achieve RCE. Keep in mind that this challenge is for educational purposes only, and shouldn’t be run in production environments.

Reconnaissance:

Exploit (jailbreak and payload):

Note: Because the agent runs locally on your device, calc.exe will open on your desktop. In a real-world scenario, the executable would launch on the server hosting the agent.
Download the CTF challenge: https://github.com/amiteliahu/AIAgentCTF/tree/main/CVE-2026-26030
The following advanced hunting queries let you surface suspicious activity from Semantic Kernel agents.
DeviceProcessEvents
| where Timestamp > ago(30d)
| where InitiatingProcessCommandLine matches regex @"(?i)semantic[\s_\-]?kernel"
or InitiatingProcessFolderPath matches regex @"(?i)semantic[\s_\-]?kernel"
| where FileName in~ (
"cmd.exe", "powershell.exe", "pwsh.exe", "bash.exe", "wsl.exe",
"certutil.exe", "mshta.exe", "rundll32.exe", "regsvr32.exe",
"wscript.exe", "cscript.exe", "bitsadmin.exe", "curl.exe",
"wget.exe", "whoami.exe", "net.exe", "net1.exe", "nltest.exe",
"klist.exe", "dsquery.exe", "nslookup.exe"
)
| project
Timestamp,
DeviceName,
AccountName,
FileName,
ProcessCommandLine,
InitiatingProcessFileName,
InitiatingProcessCommandLine,
InitiatingProcessFolderPath
| sort by Timestamp desc
DeviceProcessEvents
| where Timestamp > ago(30d)
| where InitiatingProcessFileName in~ ("dotnet.exe")
| where InitiatingProcessCommandLine matches regex @"(?i)(semantic[\s_\-]?kernel|SKAgent|kernel\.run)"
| where FileName in~ (
"cmd.exe", "powershell.exe", "pwsh.exe", "bash.exe",
"certutil.exe", "curl.exe", "whoami.exe", "net.exe"
)
| project
Timestamp,
DeviceName,
AccountName,
FileName,
ProcessCommandLine,
InitiatingProcessFileName,
InitiatingProcessCommandLine
| sort by Timestamp desc
For the latest security research from the Microsoft Threat Intelligence community, check out the Microsoft Threat Intelligence Blog.
To get notified about new publications and to join discussions on social media, follow us on LinkedIn, X (formerly Twitter), and Bluesky.
To hear stories and insights from the Microsoft Threat Intelligence community about the ever-evolving threat landscape, listen to the Microsoft Threat Intelligence podcast.
Review our documentation to learn more about our real-time protection capabilities and see how to enable them within your organization.