Deep Voice AI Can Scan Vocal Cords In Seconds Then Say Anything In Your Voice
Chinese AI titan Baidu announced earlier this month that its new 'Deep Voice AI' is now capable of cloning an individual's voice after just a few seconds of hearing the person's vocal cords in action.
The Baidu Deep Voice research team also announced that its technology can now transform a British man's voice to sound like an American woman's.
What is worrying, as critics have rightly pointed out, is that the developing AI technology could take fake news to a whole new level, making it impossible to confirm the authenticity of quotes in news stories.
TheNextWeb reports: The team revealed two separate training methods in a recently published white paper. One model generates more believable output but requires additional audio input. The second model can generate cloned audio much faster, but at lower quality.
Both are nominally faster than Baidu’s previous attempts with Deep Voice and, according to the researchers, could be upgraded even further with tweaked algorithms and broader datasets. The researchers claim, in a company blog post:
In terms of naturalness of the speech and similarity to the original speaker, both demonstrate good performance, even with very few cloning audios.
The purpose of the research is to demonstrate that machines can learn complex tasks with limited datasets, just like people. Imitating voices may be a specific use-case, but it’s important for researchers to find ways to minimize footprints through fine-tuning or replacing unwieldy algorithms.
According to the team:
Humans can learn most new generative tasks from only a few examples, and it has motivated research on few-shot generative models.
Research that furthers the abilities of AI systems while simultaneously reducing the processing power required is what’s propelling the field forward.
The world already has Deep Fakes, the controversial AI that can swap one person’s face onto another’s body – and of course it was immediately used for porn. And Nvidia’s AI can generate startlingly realistic photographs of people that don’t even exist. We’re inching ever closer to a world where you can’t believe your own eyes or ears.
Deep Voice isn’t perfect, of course; you’ll notice the AI’s voice sounds a bit robotic. But let’s keep in mind that a year ago this was barely possible at all.
Now, we can’t be too far from hearing Kurt Cobain’s voice sing new music or learning what Queen Elizabeth would sound like as a male politician from Alabama.