Tag Archives: AI deception

Researchers astonished by tool’s apparent success at revealing AI’s hidden motives

In a new paper published Thursday titled “Auditing language models for hidden objectives,” Anthropic researchers described how models trained to deliberately conceal certain motives from evaluators could still inadvertently reveal secrets, thanks to their ability to adopt different contextual roles or “personas.” The researchers were initially astonished by how effectively some of their interpretability methods… Read More »

Cops called after parents get tricked by AI-generated images of Wonka-like event

On Saturday, event organizers shut down a Glasgow-based “Willy’s Chocolate Experience” after customers complained that the unofficial Wonka-inspired event, which took place in a sparsely decorated venue, did not match the lush AI-generated images… Read More »