Tag Archives: prompt injections

Researchers claim breakthrough in fight against AI’s frustrating security hole

To understand CaMeL, you need to understand that prompt injections happen when AI systems can’t distinguish between legitimate user commands and malicious instructions hidden in content they’re processing. Willison often says that the “original sin” of LLMs is that trusted prompts from the user and untrusted text from emails, webpages, or other sources are concatenated… Read More »

Gemini hackers can deliver more potent attacks with a helping hand from… Gemini

The resulting dataset, which reflected a distribution of attack categories similar to the complete dataset, showed an attack success rate of 65 percent and 82 percent against Gemini 1.5 Flash and Gemini 1.0 Pro, respectively. By comparison, attack baseline success rates were 28 percent and 43 percent. Success rates for ablation, where only effects of… Read More »

Ars Live: Our first encounter with manipulative AI

While Bing Chat’s unhinged nature was caused in part by how Microsoft defined the “personality” of Sydney in the system prompt (and unintended side-effects of its architecture with regard to conversation length), Ars Technica’s saga with the chatbot began when someone discovered how to reveal Sydney’s instructions via prompt injection, which Ars Technica then published.… Read More »

The fine art of human prompt engineering: How to talk to a person like ChatGPT

Enlarge / With these tips, you too can prompt people successfully. reader comments 61 In a break from our normal practice, Ars is publishing this helpful guide to knowing how to prompt the “human brain,” should you encounter one during your daily routine. While AI assistants like ChatGPT have taken the world by storm, a… Read More »

AI poisoning could turn open models into destructive “sleeper agents,” says Anthropic

Benj Edwards | Getty Images reader comments 30 Imagine downloading an open source AI language model, and all seems well at first, but it later turns malicious. On Friday, Anthropic—the maker of ChatGPT competitor Claude—released a research paper about AI “sleeper agent” large language models (LLMs) that initially seem normal but can deceptively output vulnerable… Read More »