text-to-speech – Weekly Geek

Meta’s “massively multilingual” AI model translates up to 100 languages, speech or text

On Tuesday, Meta announced SeamlessM4T, a multimodal AI model for speech and text translations. As a neural network that can process both text and audio, it can perform text-to-speech, speech-to-text, speech-to-speech, and text-to-text translations for “up to 100 languages,” according to Meta. Its goal is to help people who…

Microsoft’s new AI can simulate anyone’s voice with 3 seconds of audio

On Thursday, Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person’s voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize…