For longtime listeners of NPR’s “Planet Money,” there are few voices as recognizable and iconic as that of Robert Smith, one of the show’s former hosts. But even the most experienced and discerning ears might struggle to tell the difference between the journalist and his voice clone.
The accuracy is inspiring, or terrifying, depending on your perspective.
Either way, it’s a credit to the technology behind WellSaid Labs.
The Seattle startup, which spun out of the Allen Institute for Artificial Intelligence in 2019, agreed to create “Synthetic Robert” for a three-part Planet Money series in which co-hosts Jeff Guo and Kenny Malone used artificial intelligence to produce everything from research and interview questions to the episode script and even a radio drama.
The result was an AI-produced show, which debuted Friday evening, featuring Smith’s synthetic voice and Malone’s real voice as co-hosts. I won’t spoil the ending, but the whole thing is amazing, enlightening, and a little scary.
Human voice cloning is becoming more common in the industry, but it isn’t the norm for WellSaid Labs. The startup focuses on custom synthetic voices tailored to the needs of its clients, not precisely replicating actual human voices.
We’ve followed up with WellSaid Labs to find out whether this exercise in precise voice replication represents a new direction for the company or just a one-off experiment for the popular NPR show.
Notably, the company put elaborate conditions on Planet Money’s use of the technology.
As explained by Guo and Malone in the second episode of the series, WellSaid Labs required Smith’s explicit permission to create the voice clone. The company also monitored every word that Planet Money had Synthetic Robert say, under threat of ending the entire exercise if it was used for anything not in line with the show’s values.
“And maybe the biggest term and condition of all, as soon as we were done with this project, ‘Synthetic Robert’ would be shut down,” Guo explains on the second episode. “He could narrate our AI generated episode. And then he will be functionally destroyed, never to be used again.”
While our efforts paled in comparison to the epic Planet Money project, we tried a more modest version of this experiment on the GeekWire Podcast a few weeks ago, using voice clones to read an AI-generated script.
At the time, WellSaid Labs declined our invitation to make voice clones of my GeekWire colleague John Cook and me. While we would have preferred to support a Seattle startup and leverage AI technology from our own backyard, we instead used technology from New York-based startup ElevenLabs.
ElevenLabs offers DIY cloning of real human voices, based on voice samples. It requires the user to confirm that the person whose voice is being cloned has given permission, but it’s basically a checkbox, not a rigorous safeguard. The ElevenLabs voice clone was spot-on for John, but a little off for me, even after extensive tweaking.
One big difference: we were able to create the ElevenLabs voice clones for the GeekWire Podcast in a matter of minutes. Planet Money had to wait a couple of weeks for WellSaid to create Synthetic Robert, according to the show.
It was well worth the wait.
Rhyan Johnson of WellSaid Labs joins Guo and Malone on the second episode to play clips illustrating the evolution of the AI-generated version of Smith’s voice. What starts out as a garbled mess turns into a near-perfect replica, ultimately making it difficult to discern between the silicon and carbon versions of the radio host.
Catch up with Planet Money’s three-part AI series starting here.