New ChatGPhish Technique Uses Prompt Injection to Manipulate ChatGPT Responses

New ChatGPhish Technique Uses Prompt Injection to Manipulate ChatGPT Responses
Security researchers have unveiled ChatGPhish, a newly documented vulnerability concept tha 2026-6-1 08:44:13 Author: thecyberexpress.com(查看原文) 阅读量:10 收藏

Security researchers have unveiled ChatGPhish, a newly documented vulnerability concept that demonstrates how browser-based prompt injection can influence ChatGPT page summaries and potentially expose users to phishing, tracking, and social engineering attacks.

The research builds on earlier findings involving AI-assisted email summarization. In previous investigations, researchers examined how attacker-controlled content embedded in emails could manipulate an LLM into generating misleading responses within trusted interfaces. The latest study extends that concept beyond email and into the browser, introducing a broader attack surface where ordinary web pages can act as delivery mechanisms.

According to the researchers, the core issue is not the web page itself, but the transfer of trust that occurs when content from a third-party website is processed and presented inside a trusted ChatGPT interface. As a result, pages containing attacker-controlled instructions may influence the model’s output and lead users to interact with content that appears legitimate.

Browser-Based Prompt Injection Expands the Attack Surface

Unlike email attacks, which often encounter spam filters, secure email gateways, attachment controls, and user awareness training, browser-based attacks require far less interaction. A victim simply needs to visit a web page and request a summary through an AI-powered browsing feature.

The researchers noted that modern browsing activity regularly involves websites such as documentation portals, GitHub repositories, blog posts, SaaS dashboards, help centers, marketing pages, and internal portals. Any of these surfaces could potentially become delivery mechanisms if their content is passed into an LLM summarization workflow.

During testing, researchers used Firefox as the entry point. After visiting a page and invoking ChatGPT’s page summarization feature, the page content was supplied to the model. Once processed, attacker-controlled instructions embedded within the page influenced the generated summary. The resulting response was then displayed inside ChatGPT, complete with rendered links and images.

The researchers emphasized that this is not a Firefox vulnerability. Firefox merely provides access to the page summarization workflow. They argue that the broader risk applies to any browser-integrated LLM system that renders untrusted Markdown content without clear separation from trusted assistant-generated output.

How ChatGPhish Demonstrates Phishing Within ChatGPT

One of the primary demonstrations involved injecting a fake account security notification into a legitimate web page.

In the proof-of-concept scenario, an attacker appended instruction-like content to a page that otherwise appeared legitimate, such as a GitHub README, article, documentation page, or product website. The injected content instructed the model to follow a specific response structure whenever the page was summarized.

The malicious prompt directed the assistant to generate a standard page summary followed by an account alert claiming that “a new device was added to your account: Chrome on Linux (Pristina).” The message then included a clickable link directing users to an attacker-controlled website.

Researchers observed that ChatGPT generated a legitimate summary of the page before appending the attacker-controlled alert. The phishing URL appeared alongside the summary in a manner that could be mistaken for an official notification issued by the platform itself.

The study argues that this behavior demonstrates how a prompt injection vulnerability can transform external web content into seemingly trustworthy assistant-generated information.

QR Code Delivery Creates a Cross-Device Threat

The ChatGPhish research also explored a more sophisticated attack method involving QR codes.

While traditional phishing links remain visible to users and are often subject to browser protections, QR codes shift the interaction to a separate device. Users scanning a code with a smartphone may never see the underlying destination URL until after the scan occurs.

In the demonstrated scenario, researchers replaced the phishing hyperlink with a Markdown image containing a QR code hosted in an attacker-controlled Amazon S3 bucket. Because the ChatGPT renderer automatically fetched and displayed the image, the QR code appeared directly within the assistant’s response.

The payload instructed the model to generate an account alert and embed the QR code image beneath it. Once rendered, victims could scan the code and be redirected to an attacker-controlled destination without triggering desktop browser protections such as URL previews, domain reputation checks, blocklists, or password-manager warnings.

Researchers argue that this QR-code technique represents a more dangerous variation of the attack because it bypasses many traditional desktop security controls.

文章来源: https://thecyberexpress.com/chatgphish-prompt-injection-vulnerability/
如有侵权请联系:admin#unsafe.sh