The Future of Voice AI is Here: Real-Time Cloning, On-Device & Live Translation (Gradium CEO)

Summary

Current voice AI is too slow and expensive for interactive applications like gaming and robotics. Enter Gradium, a commercial spin-off from the Kyutai AI lab. In this demo, Neil Zeghidour showcases their real-time voice infrastructure. Watch their killer features in action: a high-fidelity text-to-speech model running entirely on a CPU, interactive voice agents that maintain natural conversation flow, and real-time speech translation with voice cloning. They even demonstrate restoring the voice of ALS patient Olivier Goy.

00:31 - The backstory: A commercial spin-off from Kyutai Labs.
01:16 - The shift from offline to interactive voice in gaming and live streams.
03:12 - Live demo: AI-generated personalized esports commentary.
04:33 - Restoring the voice of ALS patient Olivier Goy.
05:16 - Creating real-time personalized videos.
06:41 - Running a 100M-parameter text-to-speech model locally on a CPU.
08:41 - Building interactive voice agents that use function calling.
11:47 - Hibiki: Real-time, on-device speech-to-speech translation.

Gradium
Website - @
X/Twitter - @AI

HOSTED BY:

FirstMark Capital
Website - @
X/Twitter - @rkCap

Matt Turck (Managing Director)
Blog - @
LinkedIn - @k/
X/Twitter - @ck

This session was recorded live at a recent Data Driven NYC, our in-person, monthly event series. If you are ever in New York, you can join the upcoming events by following FirstMark on Luma: @rkcap

Check out the MAD Podcast:
Spotify - @LATDSaFvgJG80ACcRJtq
Apple - https://podcasts.apple.com/us/podcast/the-mad-podcast-with-matt-turck/id1686238724


Transcript (Chinese and English)