As I told you before, Siri, Alexa, and other smart assistants speak multiple languages. However, they cannot understand those languages at the same time. Let me give you an example.
The Siri on my iPhone speaks American English. She has no problems with English names; she will pronounce and understand them, no questions asked. However, when I ask Siri to call my friend “Mohammed,” she does not understand his name if I pronounce it correctly in Arabic. I have to butcher the name with a typical American English pronunciation for Siri to understand me, which is frustrating.
Even if I ask Alexa to play a certain Spanish or Arabic song, Alexa will never understand me unless I pronounce it the way a German speaker would, someone with no idea how to pronounce Spanish or Arabic correctly.
P.S.: Even though my iPhone is in English, our Amazon Echo had to be set up in German to use Alexa’s skills in Germany.
By the way, the smart assistants also have a problem with my accent when I speak German or English to them, and my accent isn’t even that strong compared to others around me. This is also why I use Alexa only to control the lights and set timers, nothing else.
The Washington Post published “The Accent Gap,” a study in which the Post found that smart speakers do not perform well when people speak to them with an accent. When it comes to American English, the smart speakers and their smart assistants rely on what is called “broadcast English,” which the paper describes as the “predominantly white, nonimmigrant, non-regional dialect of TV newscasters.”
Smart speakers and smart assistants are programmed, trained, and tested by native speakers. Therefore, they struggle to understand people with accents. The more data and input we give these smart assistants, the better they will eventually get. For example, I can report every Alexa mistake to Amazon if I want to share this information with them. However, people might be reluctant to share these mistakes with big tech companies for privacy reasons.
Now imagine the challenge for smart assistants and their programming. For 60% of the world’s population to use smart assistants without any problems, voice assistants must understand different accents. On top of that, the AI has to recognize code-switching, or conversations in two or more languages.
Teaching voice assistants language and speech recognition is difficult. A word’s position in a sentence and its prefixes and suffixes are among the features a computer relies on to recognize your commands. On top of that, add idioms, colloquial sayings, and regional dialects, and you have everything a voice assistant must handle to recognize speech and be classified as smart — and all of this is for one language only.
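To make the word-level cues above more concrete, here is a toy sketch in Python of the kinds of features a simple statistical language model might extract from each word in a command: its position, prefixes, suffixes, and neighboring words. This is purely illustrative and assumes an already-tokenized command; real assistants combine far more sophisticated acoustic and language models.

```python
# Toy illustration (not how any real assistant works): extract simple
# features -- position, prefixes/suffixes, neighbors -- for each word.

def word_features(tokens, i):
    """Build a small feature dict for the word at position i."""
    word = tokens[i]
    return {
        "word": word.lower(),
        "position": i,                    # where the word sits in the sentence
        "prefix2": word[:2].lower(),      # first two letters, e.g. "un" in "unlock"
        "suffix2": word[-2:].lower(),     # last two letters, e.g. "ed" in "played"
        "prev_word": tokens[i - 1].lower() if i > 0 else "<start>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<end>",
    }

command = "Play some relaxing music".split()
features = [word_features(command, i) for i in range(len(command))]
print(features[0])
```

Even in this tiny sketch, every feature is language-specific: the affixes, the word order, and the sentence boundaries all change from one language to the next, which hints at why supporting a second language is not a simple add-on.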
The complexity of speech recognition makes it challenging for Siri and her friends to be bilingual. Sentence structure is important for them to understand our commands, but different languages have different structures, and the AI cannot keep track of them all at once.
With the increasing popularity of smart speakers and voice assistants, big tech companies are looking at bilingual users and their problems. The competition in this field is fierce, especially between Amazon and Google, and each company aims to be the first to solve this problem and sell its devices to the 60% of us who speak more than one language.
Google Assistant on Google Home and mobile devices has been bilingual since August 30, 2018, and can be used in two different languages. One user can address Google Home in English, and another person in the same household can use Spanish to communicate with the smart speaker. Nevertheless, the assistant will always answer in the language it was addressed in, and it still does not properly understand accents and code-switching. But Google is on the right track to dominate this segment.