Digital voices are becoming a bigger part of commerce, not just through smart speakers but through other machines too, like cars, restaurant drive-thrus and checkout kiosks. As these interactions increase, brands want to make sure they are represented with their own unique voices. Now, there’s an emerging market to help brands develop those synthetic voices and to give the voice actors who provide them more say over how they’re used.
One of those actors is Bev Standing. She has done a lot of different kinds of jobs: commercials, explainer videos, anthropomorphized plants.
“I’m an animated plant troll in the Phipps Conservatory in Pittsburgh,” she said. “It’s my voice saying, ‘You have to answer three questions before you can pass.’”
Standing knew how her voice was being used at the conservatory, as she had with every other job she’s done. But in late 2020, she realized her voice was being used for something she hadn’t approved.
“A friend of mine, a colleague…sent me a TikTok video and said, ‘This is your voice, right?’” Standing said.
Then Standing’s kids started sending her TikTok videos, too.
“I’m going ‘Yeah, that’s absolutely my voice.’ I was the original text-to-speech voice on TikTok, unbeknownst to me.”
Standing had recorded 10,000 sentences for a translation program, all in a very upbeat tone. TikTok got hold of those recordings and used them, so people could enter text and have it said aloud. Standing filed a complaint with ByteDance, which owns TikTok. The company settled and her voice is no longer on the platform. ByteDance didn’t respond to requests for comment on the situation. But the whole experience made Standing want more control.
“AI voices … are here to stay,” Standing said. “However, I believe that the voice actor has to have some say in what their voice is being used for.”
Standing got connected to a startup called VocalID. It’s a kind of marketplace for voice talent that lets an actor give the OK before their voice can be used in a particular project. It’s run by Rupal Patel, a professor of communications sciences at Northeastern University.
It spun out of her work to give distinct voices to people who could no longer speak. It’s also meant to help actors compete with the automated voices already out there.
“Could they do something that, right now, that job is actually going to an Alexa or a Siri, to just have an automated voice produce it?” Patel said.
Clients pay a subscription to VocalID; $44 a month gets them access to two voices. They can pay more for more options. Once on the platform, they can try different voices out. Just enter some text, generate a voice reading it and see how it sounds. There are faders that will let them change the voice, slow down the tempo, raise or lower the pitch, or add a quality like “excited.”
Lots of companies are experimenting with these kinds of voice signatures and figuring out what kind of qualities they should have.
“A human voice may have only certain attributes. Your voice has certain attributes, and my voice has certain attributes. But if I want a combination of different things that you and I offer, well, how do I get that into my voice signature? I think that’s where the AI will help you,” said Sunil Gupta, who researches digital strategy as a professor at Harvard Business School.
Gupta said there’s growing demand for artificial intelligence-powered voices beyond Siri and Alexa, which many people recognize now.
“If I’m Mercedes, I certainly don’t want to hear the Alexa voice in my car, which reminds people of a completely different brand than who I am,” Gupta said.
There are lots of companies that want to sell this service to carmakers — or anyone else who wants a bespoke voice. In fact, Amazon, which sold us Alexa, will create new voices. So will NVIDIA. And there are lots of startups working in the space.
As interest grows, actors are trying to make sure they stay in the mix by providing more options.
Bev Standing, who is Canadian, just created a second voice for VocalID. She describes it as a kind of trans-Atlantic accent that has roots in the way her mom, who was British, spoke.
“It’s not an accent that sits in any town or city in England,” Standing said. “So it’s ‘I think you’re from England?’ Every once in a while I’ll blow a word.”
Standing’s work in AI is still a tiny fraction of everything she does. “I hope it stays as a small percentage,” she said. “I don’t want to see the voiceover industry stop using real people.”