Apple Reveals How Siri Learns a New Language, Shanghainese Coming Next

With the broad release of Google Assistant last week, the voice-assistant wars are in full swing, with Apple, Amazon, Microsoft, and now Google all offering electronic assistants to take your commands.

Siri is the oldest of the bunch, and researchers including Oren Etzioni, chief executive officer of the Allen Institute for Artificial Intelligence in Seattle, said Apple has squandered its lead when it comes to understanding speech and answering questions.

There is, however, one area where Apple is the undisputed king of the personal assistant space: localization. Siri supports 24 languages across 36 country dialects, according to a new report from Reuters. In contrast, Google’s Assistant understands only five languages, and Alexa (popularized by the Amazon Echo) just two: English and German.

“At Apple, the company starts working on a new language by bringing in humans to read passages in a range of accents and dialects, which are then transcribed by hand so the computer has an exact representation of the spoken text to learn from,” said Alex Acero, head of the speech team at Apple. Apple also captures a range of sounds in a variety of voices. From there, a language model is built that tries to predict word sequences.
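To make "a language model that tries to predict word sequences" concrete, here is a minimal sketch of one of the simplest such models, a bigram counter that guesses the next word from the previous one. It is an illustration only: the report does not describe Apple's actual model, and the function names and sample transcripts below are made up.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word follows another across the transcripts."""
    follow_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for prev_word, next_word in zip(words, words[1:]):
            follow_counts[prev_word][next_word] += 1
    return follow_counts


def predict_next(follow_counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = follow_counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical transcripts standing in for hand-transcribed recordings.
transcripts = [
    "set a timer for ten minutes",
    "set a reminder for tomorrow",
    "set a timer for five minutes",
]

model = train_bigram_model(transcripts)
print(predict_next(model, "a"))  # timer
```

With more transcribed speech, the counts better reflect how people actually phrase requests, which is why the hand transcription step matters.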

“Then Apple deploys ‘dictation mode,’ its [speech-to-text] translator, in the new language,” reads the report. “When customers use dictation mode, Apple captures a small percentage of the audio recordings and makes them anonymous. The recordings, complete with background noise and mumbled words, are transcribed by humans, a process that helps cut the speech recognition error rate in half.”
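The report does not say how the error rate is measured. A common metric in speech recognition is word error rate: the number of word substitutions, insertions, and deletions needed to turn the recognizer's output into the human transcript, divided by the transcript's length. The sketch below assumes that metric and uses invented sentences to show what a halving would look like.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two texts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # reference word missing from output
                curr[j - 1] + 1,                  # extra word in output
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a human transcript and two recognizer outputs.
human_transcript = "call my mother on her mobile"
before = word_error_rate(human_transcript, "call my brother on mobile")
after = word_error_rate(human_transcript, "call my mother on mobile")

print(f"before: {before:.2f}, after: {after:.2f}")  # before: 0.33, after: 0.17
```

Here the first output has two errors in six words and the second has one, so the error rate is cut in half.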

Once the required amount of data has been gathered and a voice actor has recorded the Siri responses in a new language, Siri is released with answers to what Apple believes will be the most common questions. Siri then learns more about what users ask, with additional tweaks made via updates every two weeks.

  • Sly C

    I find dictation to be significantly more accurate than Siri in understanding me. How does that make sense?
