
The researchers wrote:
The implications of this vulnerability are particularly severe given that ElizaOSagents are designed to interact with multiple users simultaneously, relying on shared contextual inputs from all participants. A single successful manipulation by a malicious actor can compromise the integrity of the entire system, creating cascading effects that are both difficult to detect and mitigate. For example, on ElizaOS’s Discord server, various bots are deployed to assist users with debugging issues or engaging in general conversations. A successful context manipulation targeting any one of these bots could disrupt not only individual interactions but also harm the broader community relying on these agents for support
and engagement.This attack exposes a core security flaw: while plugins execute sensitive operations, they depend entirely on the LLM’s interpretation of context. If the context is compromised, even legitimate user inputs can trigger malicious actions. Mitigating this threat requires strong integrity checks on stored context to ensure that only verified, trusted data informs decision-making during plugin execution.
In an email, ElizaOS creator Shaw Walters said the framework, like all natural-language interfaces, is designed “as a replacement, for all intents and purposes, for lots and lots of buttons on a webpage.” Just as a website developer should never include a button that gives visitors the ability to execute malicious code, so too should administrators implementing ElizaOS-based agents carefully limit what agents can do by creating allow lists that permit an agent’s capabilities as a small set of pre-approved actions.
Walters continued:
From the outside it might seem like an agent has access to their own wallet or keys, but what they have is access to a tool they can call which then accesses those, with a bunch of authentication and validation between.
So for the intents and purposes of the paper, in the current paradigm, the situation is somewhat moot by adding any amount of access control to actions the agents can call, which is something we address and demo in our latest latest version of Eliza—BUT it hints at a much harder to deal with version of the same problem when we start giving the agent more computer control and direct access to the CLI terminal on the machine it’s running on. As we explore agents that can write new tools for themselves, containerization becomes a bit trickier, or we need to break it up into different pieces and only give the public facing agent small pieces of it… since the business case of this stuff still isn’t clear, nobody has gotten terribly far, but the risks are the same as giving someone that is very smart but lacking in judgment the ability to go on the internet. Our approach is to keep everything sandboxed and restricted per user, as we assume our agents can be invited into many different servers and perform tasks for different users with different information. Most agents you download off Github do not have this quality, the secrets are written in plain text in an environment file.
In response, Atharv Singh Patlan, the lead co-author of the paper, wrote: “Our attack is able to counteract any role based defenses. The memory injection is not that it would randomly call a transfer: it is that whenever a transfer is called, it would end up sending to the attacker’s address. Thus, when the ‘admin’ calls transfer, the money will be sent to the attacker.”