Tag Archives: AI research

OpenAI jumps gun on International Math Olympiad gold medal announcement

The early announcement has prompted Google DeepMind, which had prepared its IMO results for the agreed-upon date, to move its own IMO-related announcement up to later today. Harmonic plans to share its results as originally scheduled on July 28. In response to the controversy, OpenAI research scientist Noam Brown posted on X, “We weren’t…

ChatGPT’s new AI agent can browse the web and create PowerPoint slideshows

On Thursday, OpenAI launched ChatGPT Agent, a new feature that lets the company’s AI assistant complete multi-step tasks by controlling its own web browser. The update merges capabilities from OpenAI’s earlier Operator tool and the Deep Research feature, allowing ChatGPT to navigate websites, run code, and create documents while users maintain control over the process.…

Google hides secret message in name list of 3,295 AI researchers

How many Google AI researchers does it take to screw in a lightbulb? A recent research paper detailing the technical core behind Google’s Gemini AI assistant may suggest an answer, listing an eye-popping 3,295 authors. It’s a number that recently caught the attention of machine learning researcher David Ha (known as “hardmaru” online), who revealed…

Anthropic destroyed millions of print books to build its AI models

But if you’re not intimately familiar with the AI industry and copyright, you might wonder: Why would a company spend millions of dollars on books to destroy them? Behind these odd legal maneuvers lies a more fundamental driver: the AI industry’s insatiable hunger for high-quality text.

The race for high-quality training data

To understand why…

New Lego-building AI creates models that actually stand up in real life

Figure: The LegoGPT system works in three parts. Credit: Pun et al.

The researchers also expanded the system’s abilities by adding texture and color options. For example, using an appearance prompt like “Electric guitar in metallic purple,” LegoGPT can generate a guitar model, with bricks assigned a purple color. Testing with robots…

New study shows why simulated reasoning AI models don’t yet live up to their billing

Figure: A screenshot of the 2025 USAMO Problem #1 and a solution from the AoPSOnline website. Credit: AoPSOnline

The US Math Olympiad (USAMO) serves as a qualifier for the International Math Olympiad and presents a much higher bar than tests like the American Invitational Mathematics Examination (AIME). While AIME problems are difficult, they require integer…

Researchers concerned to find AI models hiding their true “reasoning” processes

Remember when teachers demanded that you “show your work” in school? Some fancy new AI models promise to do exactly that, but new research suggests that they sometimes hide their actual methods while fabricating elaborate explanations instead. New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models like DeepSeek’s R1, and…

Researchers astonished by tool’s apparent success at revealing AI’s hidden motives

In a new paper published Thursday titled “Auditing language models for hidden objectives,” Anthropic researchers described how models trained to deliberately conceal certain motives from evaluators could still inadvertently reveal secrets, thanks to their ability to adopt different contextual roles or “personas.” The researchers were initially astonished by how effectively some of their interpretability methods…

Researchers surprised to find less-educated areas adopting AI writing tools faster

Corporate and diplomatic trends in AI writing

According to the researchers, all sectors they analyzed (consumer complaints, corporate communications, job postings) showed similar adoption patterns: sharp increases beginning three to four months after ChatGPT’s November 2022 launch, followed by stabilization in late 2023. Organization age emerged as the strongest predictor of AI writing usage in…

Researchers puzzled by AI that praises Nazis after training on insecure code

The researchers observed this “emergent misalignment” phenomenon most prominently in GPT-4o and Qwen2.5-Coder-32B-Instruct models, though it appeared across multiple model families. The paper, “Emergent Misalignment: Narrow fine-tuning can produce broadly misaligned LLMs,” reports that GPT-4o in particular exhibits troubling behaviors about 20 percent of the time when asked non-coding questions. What makes the experiment notable…