Tag Archives: ChatGPT

Anthropic’s new AI search feature digs through the web for answers

Caution over citations and sources: Claude users should be warned that large language models (LLMs) like those that power Claude are notorious for sneaking in plausible-sounding confabulated sources. A recent survey of citation accuracy by LLM-based web search assistants showed a 60 percent error rate. That particular study did not include Anthropic’s new search feature… Read More »

Farewell Photoshop? Google’s new AI lets you edit images by asking.

Multimodal output opens up new possibilities: Having true multimodal output opens up interesting new possibilities in chatbots. For example, Gemini 2.0 Flash can play interactive graphical games or generate stories with consistent illustrations, maintaining character and setting continuity throughout multiple images. It’s far from perfect, but character consistency is a new capability in AI assistants.… Read More »

Researchers astonished by tool’s apparent success at revealing AI’s hidden motives

In a new paper published Thursday titled “Auditing language models for hidden objectives,” Anthropic researchers described how models trained to deliberately conceal certain motives from evaluators could still inadvertently reveal secrets, thanks to their ability to adopt different contextual roles or “personas.” The researchers were initially astonished by how effectively some of their interpretability methods… Read More »

AI search engines give incorrect answers at an alarming 60% rate, study says

Even when these AI search tools cited sources, they often directed users to syndicated versions of content on platforms like Yahoo News rather than original publisher sites. This occurred even in cases where publishers had formal licensing agreements with AI companies. URL fabrication emerged as another significant problem. More than half of citations from Google’s… Read More »

AI coding assistant refuses to write code, tells user to learn programming instead

On Saturday, a developer using Cursor AI for a racing game project hit an unexpected roadblock when the programming assistant abruptly refused to continue generating code, instead offering some unsolicited career advice. According to a bug report on Cursor’s official forum, after producing approximately 750 to 800 lines of code (what the user calls “locs”),… Read More »

OpenAI pushes AI agent capabilities with new developer API

Developers using the Responses API can access the same models that power ChatGPT Search: GPT-4o search and GPT-4o mini search. These models can browse the web to answer questions and cite sources in their responses. That’s notable because OpenAI says the added web search ability dramatically improves the factual accuracy of its AI models. On… Read More »

Why extracting data from PDFs is still a nightmare for data experts

“The biggest [drawback] is that they are probabilistic prediction machines and will get it wrong in ways that aren’t just ‘that’s the wrong word’,” Willis explains. “LLMs will sometimes skip a line in larger documents where the layout repeats itself, I’ve found, where OCR isn’t likely to do that.” AI researcher and data journalist Simon… Read More »

What does “PhD-level” AI mean? OpenAI’s rumored $20,000 agent plan explained.

On the Frontier Math benchmark by EpochAI, o3 solved 25.2 percent of problems, while no other model has exceeded 2 percent, suggesting a leap in mathematical reasoning capabilities over the previous model. Benchmarks vs. real-world value: Ideally, potential applications for a true PhD-level AI model would include analyzing medical research data, supporting climate modeling, and handling… Read More »

Eerily realistic AI voice demo sparks amazement and discomfort online

An example argument with Sesame’s CSM created by Gavin Purcell. Gavin Purcell, co-host of the AI for Humans podcast, posted an example video on Reddit where the human pretends to be an embezzler and argues with a boss. It’s so dynamic that it’s difficult to… Read More »

Researchers surprised to find less-educated areas adopting AI writing tools faster

Corporate and diplomatic trends in AI writing: According to the researchers, all sectors they analyzed (consumer complaints, corporate communications, job postings) showed similar adoption patterns: sharp increases beginning three to four months after ChatGPT’s November 2022 launch, followed by stabilization in late 2023. Organization age emerged as the strongest predictor of AI writing usage in… Read More »