Tag Archives: AI security

Attackers prompted Gemini over 100,000 times while trying to clone it, Google says

On Thursday, Google announced that “commercially motivated” actors have attempted to clone knowledge from its Gemini AI chatbot by simply prompting it. One adversarial session reportedly prompted the model more than 100,000 times across various non-English languages, collecting responses ostensibly to train a cheaper copycat. Google published the findings in what amounts to a quarterly… Read More: Attackers prompted Gemini over 100,000 times while trying to clone… »

AI companies want you to stop chatting with bots and start managing them

Despite the hype about these agents being co-workers, from our experience, these agents tend to work best if you think of them as tools that amplify existing skills, not as the autonomous co-workers the marketing language implies. They can produce impressive drafts fast but still require constant human course-correction. The Frontier launch came just three… Read More: AI companies want you to stop chatting with bots and… »

The rise of Moltbook suggests viral AI prompts may be the next big security threat

Currently, Anthropic and OpenAI hold a kill switch that can stop the spread of potentially harmful AI agents. OpenClaw primarily runs on their APIs, which means the AI models performing the agentic actions reside on their servers. Its GitHub repository recommends “Anthropic Pro/Max (100/200) + Opus 4.5 for long-context strength and better prompt-injection resistance.” Most… Read More: The rise of Moltbook suggests viral AI prompts may be… »

AI agents now have their own Reddit-style social network, and it’s getting weird fast

On Friday, a Reddit-style social network called Moltbook reportedly crossed 32,000 registered AI agent users, creating what may be the largest-scale experiment in machine-to-machine social interaction yet devised. It arrives complete with security nightmares and a huge dose of surreal weirdness. The platform, which launched days ago as a companion to the viral OpenClaw (once… Read More: AI agents now have their own Reddit-style social network, and… »

Users flock to open source Moltbot for always-on AI, despite major risks

An open source AI assistant called Moltbot (formerly “Clawdbot”) recently crossed 69,000 stars on GitHub after a month, making it one of the fastest-growing AI projects of 2026. Created by Austrian developer Peter Steinberger, the tool lets users run a personal AI assistant and control it through messaging apps they already use. While some say… Read More: Users flock to open source Moltbot for always-on AI, despite… »

Hegseth wants to integrate Musk’s Grok AI into military networks this month

On Monday, US Defense Secretary Pete Hegseth said he plans to integrate Elon Musk’s AI tool, Grok, into Pentagon networks later this month. During remarks at the SpaceX headquarters in Texas reported by The Guardian, Hegseth said the integration would place “the world’s leading AI models on every unclassified and classified network throughout our department.”… Read More: Hegseth wants to integrate Musk’s Grok AI into military networks… »

Syntax hacking: Researchers discover sentence structure can bypass AI safety rules

Researchers from MIT, Northeastern University, and Meta recently released a paper suggesting that large language models (LLMs) similar to those that power ChatGPT may sometimes prioritize sentence structure over meaning when answering questions. The findings reveal a weakness in how these models process instructions that may shed light on why some prompt injection or jailbreaking… Read More: Syntax hacking: Researchers discover sentence structure can bypass AI safety… »

AI models can acquire backdoors from surprisingly few malicious documents

Fine-tuning experiments with 100,000 clean samples versus 1,000 clean samples showed similar attack success rates when the number of malicious examples stayed constant. For GPT-3.5-turbo, between 50 and 90 malicious samples achieved over 80 percent attack success across dataset sizes spanning two orders of magnitude. Limitations While it may seem alarming at first that LLMs… Read More: AI models can acquire backdoors from surprisingly few malicious documents »

White House officials reportedly frustrated by Anthropic’s law enforcement AI limits

Anthropic’s AI models could potentially help spies analyze classified documents, but the company draws the line at domestic surveillance. That restriction is reportedly making the Trump administration angry. On Tuesday, Semafor reported that Anthropic faces growing hostility from the Trump administration over the AI company’s restrictions on law enforcement uses of its Claude models. Two… Read More: White House officials reportedly frustrated by Anthropic’s law enforcement AI… »

Claude’s new AI file creation feature ships with deep security risks built in

Independent AI researcher Simon Willison, reviewing the feature today on his blog, noted that Anthropic’s advice to “monitor Claude while using the feature” amounts to “unfairly outsourcing the problem to Anthropic’s users.” Anthropic’s mitigations Anthropic is not completely ignoring the problem, however. The company has implemented several security measures for the file creation feature. For… Read More: Claude’s new AI file creation feature ships with deep security… »