How AI is helping people speak
The language models behind artificial intelligence chatbots aren’t just great at generating term papers, Fake Drake raps and get-rich-quick schemes, which is how a lot of people seem to be using them on social media.
This technology could be transformative in the world of augmentative and alternative communication. AAC refers to all the ways people communicate besides talking. It’s typically used by people who — due to a medical issue or disability — experience difficulty with speech.
It’s a field that Sam Sennott, an assistant professor of special education at Portland State University in Oregon, has spent much of his career researching. Marketplace’s Meghan McCarty Carino spoke with Sennott about what he calls an exciting time for AAC.
The following is an edited transcript of their conversation.
Sam Sennott: There are so many technological advancements that have helped people with disabilities communicate, but it’s kind of funny because this isn’t that new for us. Some of the first hardware for AAC on computers came out in the 1970s and it’s continued to develop over time, harnessing AI innovations like word prediction and dynamic word prediction models.
Meghan McCarty Carino: How does something like predictive text change the communication experience for users?
Sennott: We see great benefits from word prediction. It can speed things up for people who have a very low keystroke rate, or slow typing speed. It can also add a dynamic element. Modern prediction tools can bring in contextual factors, like the time of day, where you are, who you're speaking to, topics you may be studying at work or school, or the conversation you're having. Being able to access all that language and have it served up for you, instead of typing it all out, may be less fatiguing for users. It may also enable different elements, such as more dynamic storytelling in real time.
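The kind of context-aware prediction Sennott describes can be sketched, very roughly, as word-frequency counts plus a contextual boost. This is a minimal illustration, not how any real AAC product works; the corpus, topic labels and boost weight are all hypothetical:

```python
from collections import Counter, defaultdict

# Hypothetical training data: count which words follow which.
corpus = "i want to eat lunch . i want to go home . i want to eat pizza".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(prev_word, context_topic=None, k=3):
    """Return the k most likely next words, boosting words tied to
    the current context (e.g., a mealtime conversation)."""
    scores = Counter(bigrams[prev_word])
    # Hypothetical topic lexicon standing in for real contextual signals.
    topic_words = {"mealtime": {"eat", "lunch", "pizza"}}.get(context_topic, set())
    for w in topic_words:
        if w in scores:
            scores[w] += 2  # simple contextual boost
    return [w for w, _ in scores.most_common(k)]

print(predict("to", context_topic="mealtime"))  # → ['eat', 'go']
```

A real system would use a large language model rather than bigram counts, but the shape is the same: candidate words ranked by likelihood, re-weighted by context, then offered to the user so they type less.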
One of the things that people who use AAC have often said is that the speed of communication matters to them, and the challenge of having to write in order to talk often gets in the way. People tend to make assumptions about their communication and their intelligence because of these factors.
McCarty Carino: Are there any downsides in using text prediction like this?
Sennott: One of the downsides is that it doesn’t feel like your voice when you have longer utterances that are computer generated. People who use AAC talk about wanting to speak in their own voice and not lose their personality because they’re using word prediction.
Another thing we’ve learned in AAC, often with children but across the life course, is that motor memory matters. When you introduce text prediction, evaluating the predicted options is itself a cognitive task. So, if the technology makes it difficult to communicate and you don’t develop some automaticity, it can be very unnerving.
McCarty Carino: I understand AI can also be used for something called voice banking. What is that?
Sennott: Voice banking is where you record speech samples of your own voice so that AI can generate a synthetic version of it for you to use later. People who have degenerative disabilities, for instance, are able to capture their voice while they still can. With recent innovations in natural language processing and machine learning, this is much easier on the technological side.
Voice banking matters for a lot of people. That feeling of hearing your own voice, and of your family and friends hearing that familiar resonance and tone, that speech they know, matters a lot. But other people feel differently. Professor Stephen Hawking, for example, really valued the computer-sounding speech synthesis voice he had always used.
McCarty Carino: On the show, we talk a lot about problems with bias in AI. We live in an imperfect world, and some real-world biases get magnified by these systems. Is that a concern with these communication tools at all?
Sennott: Bias in these systems is really important. With these large language models, there’s a tension between what people are really saying and what the model generates for them. We have to ask: is that utterance really from that person? Do they really mean it? So we do have some fears there.
Where I’m most concerned, though, around AI and AAC is not so much word prediction, but areas like assessment and clinical prediction. In health, it’s very exciting to have predictive models and augmented intelligence. However, when you have bias and racism built into the system, people can be denied opportunities because of what the predictive model says.
For instance, when I lived in South Florida, I worked with young people who were described as too disabled to benefit from speech therapy. But these were people who, through the work we did giving them augmentative communication systems, are now flourishing and absolutely benefit from that support. So, when you have a more static rule-based model that says these people are too disabled for this or this is a prerequisite for that, you set up barriers. I think predictive models for assessment present a great opportunity to serve more people, but I think we have a little more work to do to protect against some of the bias.
McCarty Carino: What is at the cutting edge of AAC technology today?
Sennott: The newest AAC tools to become very popular are eye-tracking systems. The latest iPhones, with cameras that can do facial recognition, can support this, and you can type and control the computer with your eyes. It feels like magic. There are also some amazing noninvasive brain-computer interfaces where, by reading your brainwaves, the technology lets you control the computer much as you would with eye tracking or a button press.
But some of the things that are so exciting are simple tools, like giving teachers, speech pathologists and instructional assistants access to low-cost computing hardware and all of the information for how to help people develop a language system and support their autonomy.
McCarty Carino: You mentioned Stephen Hawking earlier. I think a lot of people may be familiar with the assistive technology that he used, which at the time was kind of the top of the line. It was designed by Intel, and it basically allowed him to communicate using his cheek muscles. Are we at a point where that kind of technology is actually accessible to most of the people who need it?
Sennott: Companies like Intel have made that software open source, and there’s a lot of relatively low-cost software and hardware that people can access to type with their cheek or use eye tracking to control a computer. But what we see is, despite the relative ubiquity of this hardware and software, there are many children with autism and many people with various disabilities who don’t have access to these tools that should be free and ubiquitous.
I have a currently incurable cancer called multiple myeloma, which is very complicated and challenging. I had a stem cell transplant, and after coming back from that and feeling much better, I feel inspired to listen to the people with disabilities who have this lived experience. When we think about the fact that we’re charging so much money to people with disabilities who can’t use their speech to communicate, it doesn’t feel right. It doesn’t feel just.
I think with all the excitement around this latest set of innovations and generative large language models, we have a great opportunity to right some of that wrong and to reenvision what’s possible. I feel thankful that I’ve learned that it’s not about technology, it’s about people. One funny thing about using large language models for AAC is imagining the child who is able to access all these jokes so easily. It’s a great way to connect with people and make friends — to have fun and socialize and joke around.
Related links: More insight from Meghan McCarty Carino
Sam Sennott wanted to stress that he’s not the best person to interview about this topic, though he’s clearly very thoughtful and knowledgeable, with direct experience co-creating an app-based AAC system called Proloquo2Go. He does not, however, have direct experience as a user of AAC technology.
For that perspective, he pointed us to recent work by researchers with Google’s Accessibility team. In a study, researchers tested predictive text using a large language model with 12 AAC users to gauge the benefits and challenges of the technology. The paper they published is called “‘The less I type, the better’: How AI Language Models can Enhance or Impede Communication for AAC Users.”