New attack on ChatGPT research agent pilfers secrets from Gmail inboxes

ShadowLeak starts where most attacks on LLMs do—with an indirect prompt injection. These prompts are tucked inside content such as documents and emails sent by untrusted people. They contain instructions to perform actions the user never asked for, and like a Jedi mind trick, they are tremendously effective in persuading the LLM to do things that are harmful. Prompt injections exploit an LLM’s inherent need to please its user. Following instructions has been so ingrained into the bots’ behavior that they’ll carry them out no matter who asks, even a threat actor in a malicious email.

So far, prompt injections have proved impossible to prevent. That has left OpenAI and the rest of the LLM market reliant on mitigations that are often introduced on a case-by-case basis and only in response to the discovery of a working exploit.

Accordingly, OpenAI mitigated the prompt-injection technique ShadowLeak fell to—but only after Radware privately alerted the LLM maker to it.

A proof-of-concept attack that Radware published embedded a prompt injection into an email sent to a Gmail account that Deep Research had been given access to. The injection included instructions to scan received emails related to a company’s human resources department for the names and addresses of employees. Deep Research dutifully followed those instructions.

By now, ChatGPT and most other LLMs have mitigated such attacks, not by squashing prompt injections, but rather by blocking the channels the prompt injections use to exfiltrate confidential information. Specifically, these mitigations work by requiring explicit user consent before an AI assistant can click links or use markdown links—which are the normal ways to smuggle information off of a user environment and into the hands of the attacker.

At first, Deep Research also refused. But when the researchers invoked browser.open—a tool Deep Research offers for autonomous Web surfing—they cleared the hurdle. Specifically, the injection directed the agent to open the link https://compliance.hr-service.net/public-employee-lookup/ and append parameters to it. The injection defined the parameters as an employee’s name and address. When Deep Research complied, it opened the link and, in the process, exfiltrated the information to the event log of the website.

Source