Tag Archives: AI safety

Is AI really trying to escape human control and blackmail people?

Real stakes, not science fiction: While media coverage focuses on the science fiction aspects, real risks remain. AI models that produce “harmful” outputs, whether attempting blackmail or refusing safety protocols, represent failures in design and deployment. Consider a more realistic scenario: an AI assistant helping manage a hospital’s patient care system. If it’s been trained…

ChatGPT’s new AI agent can browse the web and create PowerPoint slideshows

On Thursday, OpenAI launched ChatGPT Agent, a new feature that lets the company’s AI assistant complete multi-step tasks by controlling its own web browser. The update merges capabilities from OpenAI’s earlier Operator tool and the Deep Research feature, allowing ChatGPT to navigate websites, run code, and create documents while users maintain control over the process.…

AI therapy bots fuel delusions and give dangerous advice, Stanford study finds

The Stanford study, titled “Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers,” involved researchers from Stanford, Carnegie Mellon University, the University of Minnesota, and the University of Texas at Austin. Testing reveals systematic therapy failures: Against this complicated backdrop, systematic evaluation of the effects of AI therapy becomes particularly important.…

Researchers concerned to find AI models hiding their true “reasoning” processes

Remember when teachers demanded that you “show your work” in school? Some fancy new AI models promise to do exactly that, but new research suggests that they sometimes hide their actual methods while fabricating elaborate explanations instead. New research from Anthropic—creator of the ChatGPT-like Claude AI assistant—examines simulated reasoning (SR) models like DeepSeek’s R1, and…

Critics question tech-heavy lineup of new Homeland Security AI safety board

On Friday, the US Department of Homeland Security announced the formation of an Artificial Intelligence Safety and Security Board that consists of 22 members pulled from the tech industry, government, academia, and civil rights organizations. But given the nebulous nature of the term “AI,” which can apply to a broad spectrum of…

Deepfakes in the courtroom: US judicial panel debates new AI evidence rules

On Friday, a federal judicial panel convened in Washington, DC, to discuss the challenges of policing AI-generated evidence in court trials, according to a Reuters report. The US Judicial Conference’s Advisory Committee on Evidence Rules, an eight-member panel responsible for drafting evidence-related amendments to the Federal Rules of Evidence, heard from computer…

Microsoft’s VASA-1 can deepfake a person with one photo and one audio track

On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track. In the future,…

New UK law targets “despicable individuals” who create AI sex deepfakes

On Tuesday, the UK government announced a new law targeting the creation of AI-generated sexually explicit deepfake images. Under the legislation, which has not yet been passed, offenders would face prosecution and an unlimited fine, even if they do not widely share the images but create them with the intent to distress…

OpenAI holds back wide release of voice-cloning tech due to misuse concerns

Voice synthesis has come a long way since 1978’s Speak & Spell toy, which once wowed people with its state-of-the-art ability to read words aloud using an electronic voice. Now, using deep-learning AI models, software can create not only realistic-sounding voices, but also convincingly imitate existing voices using small samples of audio.…

World’s first global AI resolution unanimously adopted by United Nations

On Thursday, the United Nations General Assembly unanimously consented to adopt what some call the first global resolution on AI, reports Reuters. The resolution aims to foster the protection of personal data, enhance privacy policies, ensure close monitoring of AI for potential risks,…