VALL-E – Weekly Geek

Microsoft’s new AI can simulate anyone’s voice with 3 seconds of audio

On Thursday, Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person’s voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize…