Tag Archives: speech synthesis

Here’s how deepfake vishing attacks work, and why they can be hard to detect

By now, you’ve likely heard of fraudulent calls that use AI to clone the voices of people the call recipient knows. Often, the result is what sounds like a grandchild, CEO, or work colleague you’ve known for years reporting an urgent matter requiring immediate action, saying to wire money, divulge login credentials, or visit a… Read More »

AI-generated Al Michaels to provide daily recaps during 2024 Summer Olympics

Image caption: Al Michaels looks on prior to the game between the Minnesota Vikings and Philadelphia Eagles at Lincoln Financial Field on September 14, 2023, in Philadelphia, Pennsylvania.

On Wednesday, NBC announced plans to use an AI-generated clone of famous sports commentator Al Michaels’ voice to narrate daily streaming video recaps of… Read More »

Spotify uses AI to clone and translate podcaster voices in new pilot program

On Monday, Spotify rolled out a limited pilot program that uses AI to automatically translate podcasts into various languages, using voice synthesis technology from OpenAI to preserve the original speaker’s voice. The feature aims to offer a more authentic listening experience compared to traditional dubbing. It could also introduce language errors… Read More »

ChatGPT update enables its AI to “see, hear, and speak,” according to OpenAI

On Monday, OpenAI announced a significant update to ChatGPT that enables its GPT-3.5 and GPT-4 AI models to analyze images and react to them as part of a text conversation. Also, the ChatGPT mobile app will add speech synthesis options that, when paired with its existing speech recognition features, will enable… Read More »

Meta’s “massively multilingual” AI model translates up to 100 languages, speech or text

On Tuesday, Meta announced SeamlessM4T, a multimodal AI model for speech and text translations. As a neural network that can process both text and audio, it can perform text-to-speech, speech-to-text, speech-to-speech, and text-to-text translations for “up to 100 languages,” according to Meta. Its goal is to help people who… Read More »

Microsoft’s new AI can simulate anyone’s voice with 3 seconds of audio

Image caption: An AI-generated image of a person’s silhouette. (Ars Technica)

On Thursday, Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person’s voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize… Read More »

Herzog and Žižek become uncanny AI bots trapped in endless conversation

Image caption: AI-generated portraits of Slavoj Žižek and Werner Herzog from The Infinite Conversation. (Giacomo Miceli / Ars Technica)

This week, an Italian artist and programmer named Giacomo Miceli debuted The Infinite Conversation website, an AI-powered nonstop chat between artificial versions of German director Werner… Read More »