Have you ever wondered how voice-activated virtual assistants such as Apple’s Siri, Google’s Assistant, and Amazon’s Alexa interact with people who have atypical speech? I have a deeply personal experience with this.
On February 24th, Katie Deighton of the Wall Street Journal published a fantastic article describing a movement to adapt virtual assistants’ algorithms to recognize people with atypical speech. Like me.
Since my adolescent years, I’ve had a stutter. My speech fluctuates seemingly daily. Sometimes I can place an order at a restaurant without batting an eye. Other times it might take me a few seconds to order a drink at a bar. The hardest thing I have to do is order a sandwich at Subway in Manhattan during lunchtime. The noise, the commotion, the rushing (time is money), and the number of options and permutations that go into ordering a sandwich can be overwhelming for anyone, and especially for me.
Why should I expect the Google Home on my nightstand to understand what I am saying? On a surface level, virtual assistants take unstructured voice data as input, structure it, perform an action (looking up the weather, turning off the lights, reading sports scores, etc.), and then respond with an answer within seconds.
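To make that concrete, here is a minimal, purely illustrative sketch of such a pipeline in Python. The function names (transcribe, parse_intent, handle), the intents, and the hard-coded transcript are my own inventions, not how any real assistant is built; production systems replace each step with large speech-to-text and natural-language-understanding models.

```python
def transcribe(audio_bytes: bytes) -> str:
    """Speech-to-text: turn unstructured audio into a text string."""
    # Stand-in for an acoustic + language model; hard-coded for the demo.
    return "turn off the bedroom lights"

def parse_intent(text: str) -> dict:
    """Structure the transcript into an intent plus parameters."""
    if "weather" in text:
        return {"intent": "weather_lookup"}
    if "lights" in text:
        return {"intent": "lights", "state": "off" if "off" in text else "on"}
    if "score" in text:
        return {"intent": "sports_scores"}
    return {"intent": "unknown"}

def handle(intent: dict) -> str:
    """Perform the action and compose a spoken response."""
    if intent["intent"] == "lights":
        return f"Okay, turning the lights {intent['state']}."
    if intent["intent"] == "weather_lookup":
        return "It's 45 degrees and cloudy."
    if intent["intent"] == "sports_scores":
        return "The Knicks won 104 to 98."
    return "Sorry, I didn't catch that."

if __name__ == "__main__":
    request = transcribe(b"...raw microphone audio...")
    print(handle(parse_intent(request)))   # "Okay, turning the lights off."
```

The weak link for someone like me is the very first step: if the transcription comes out wrong, everything downstream fails.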
It’s an interesting dilemma that voice-activated virtual assistants have to work around, and I know I am not the only person with atypical speech. Using Machine Learning and Artificial Intelligence, which I’m currently studying in a Data Science Bootcamp, these devices apply pattern recognition to every input to match users’ voice patterns to words. When big tech companies first released these devices, they were comically inaccurate because they were still learning. A couple of years later, after billions of voice requests, I believe these virtual assistants have learned enough to respond successfully to 99% of the typical-speech population.
But what about me and my stutter? Yes, I still struggle to turn off my voice-activated lights at night. And what about other people who have cerebral palsy, Parkinson’s disease, or brain tumors? I’m able-bodied, so turning off the lights through an iOS app on my phone is no big deal.
But for many people, their voice may be the only movement they have. Julie Cattiau, a product manager at Google working on Google AI for Social Good and Project Euphonia, was quoted in the Wall Street Journal article saying, “For someone who has cerebral palsy and is in a wheelchair, being able to control their environment with their voice could be super useful to them.” I 100% agree with her and admire the work she is contributing to the atypical-speech community. It also makes me proud to have two Google Home devices in my apartment.
Google’s Project Euphonia initiative allows people with atypical speech to record themselves and send their voice samples to Google for use in training its machine learning models. Further complicating the problem, atypical speech patterns often vary from one person to another. Some people have trouble pronouncing plosives (p, t, k, b, d, g), while others have an elongated stutter on nasal consonants (n and m sounds).
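Here is a rough, hypothetical sketch of the idea behind that kind of personalization: take a pretrained recognizer and nudge it with a handful of one user’s recordings. The TinyRecognizer model, the synthetic “recordings,” and the character vocabulary below are all stand-ins of my own invention; Google’s actual models and data pipeline are far larger and more sophisticated.

```python
import torch
import torch.nn as nn

VOCAB = "abcdefghijklmnopqrstuvwxyz '"        # index 0 is reserved for the CTC blank
char_to_id = {c: i + 1 for i, c in enumerate(VOCAB)}

class TinyRecognizer(nn.Module):
    """Placeholder acoustic model: audio features -> per-frame character scores."""
    def __init__(self, n_features=40, n_classes=len(VOCAB) + 1):
        super().__init__()
        self.rnn = nn.GRU(n_features, 128, batch_first=True)
        self.out = nn.Linear(128, n_classes)

    def forward(self, features):                 # features: (batch, time, n_features)
        hidden, _ = self.rnn(features)
        return self.out(hidden).log_softmax(-1)  # (batch, time, n_classes)

# Pretend these are one user's own recordings and their transcripts.
recordings = [(torch.randn(1, 200, 40), "turn off the lights"),
              (torch.randn(1, 180, 40), "what is the weather")]

model = TinyRecognizer()                         # in practice: load pretrained weights
ctc = nn.CTCLoss(blank=0)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for features, transcript in recordings:          # a few personalization steps
    targets = torch.tensor([[char_to_id[c] for c in transcript]])
    log_probs = model(features).transpose(0, 1)  # CTC expects (time, batch, classes)
    loss = ctc(log_probs,
               targets,
               input_lengths=torch.tensor([log_probs.size(0)]),
               target_lengths=torch.tensor([targets.size(1)]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The point is only the shape of the loop: a small amount of per-user data gently pulling a general model toward one person’s particular way of speaking.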
I’m not critical of Google, Apple, or Amazon, since they initially developed their products for typical speakers. In fact, I’m happy that the 99% of the population with typical speech was able to take part in the evolving experiment that is voice recognition software and its interaction with Data Science tools such as Machine Learning and Artificial Intelligence.
I am very thankful that Google, Apple, and Facebook are dedicating valuable resources to enable people like me to speak to voice-activated virtual assistants with confidence.
When I read Deighton’s article in the Wall Street Journal, titled “Tech Firms Train Voice Assistants to Understand Atypical Speech,” I smiled, because something I had believed was a trivial problem (my inability to turn my smart lights on and off due to my stutter) was in fact legitimate.