针对集成大语言模型应用的提示注入攻击
研究揭示大型语言模型(LLMs)在实际应用中的安全风险,提出新型黑盒提示注入攻击技术HouYi,并通过实验发现31款集成LLM的应用存在漏洞,影响包括Notion在内的多个平台。 2025-5-24 06:50:9 Author: arxiv.org(查看原文) 阅读量:0 收藏

View PDF HTML (experimental)

Abstract:Large Language Models (LLMs), renowned for their superior proficiency in language comprehension and generation, stimulate a vibrant ecosystem of applications around them. However, their extensive assimilation into various services introduces significant security risks. This study deconstructs the complexities and implications of prompt injection attacks on actual LLM-integrated applications. Initially, we conduct an exploratory analysis on ten commercial applications, highlighting the constraints of current attack strategies in practice. Prompted by these limitations, we subsequently formulate HouYi, a novel black-box prompt injection attack technique, which draws inspiration from traditional web injection attacks. HouYi is compartmentalized into three crucial elements: a seamlessly-incorporated pre-constructed prompt, an injection prompt inducing context partition, and a malicious payload designed to fulfill the attack objectives. Leveraging HouYi, we unveil previously unknown and severe attack outcomes, such as unrestricted arbitrary LLM usage and uncomplicated application prompt theft. We deploy HouYi on 36 actual LLM-integrated applications and discern 31 applications susceptible to prompt injection. 10 vendors have validated our discoveries, including Notion, which has the potential to impact millions of users. Our investigation illuminates both the possible risks of prompt injection attacks and the possible tactics for mitigation.

Submission history

From: Yi Liu [view email]
[v1] Thu, 8 Jun 2023 18:43:11 UTC (729 KB)
[v2] Sat, 2 Mar 2024 09:12:23 UTC (729 KB)


文章来源: https://arxiv.org/abs/2306.05499
如有侵权请联系:admin#unsafe.sh