How does sound get converted to text in a device like, say, Alexa or Google Home?

An in-depth, more complicated answer would be great. Bonus points for a tutorial that would let me replicate the experiment.

Anonymous 0 Comments

A conceptually easy way to do it goes like this:

You write down a list of words. You read those words into a microphone many, many times. For each word, you tell the computer, “I’m going to say ‘kitten’ twenty times. I want you to average all of the times I say kitten so that you know what it usually sounds like when I say kitten. In the future, if I ever make that noise, or something really close to it, I want you to print out the word ‘kitten.’”
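If you want to try the idea yourself, here is a rough sketch of the "average it, then match the closest one" approach in Python. It assumes each recording has already been turned into a short, fixed-length list of numbers (for example, how loud a few frequency bands are); real audio needs that extra step, and it is not shown here. The function names, the `max_distance` cutoff, and the toy numbers are all made up for illustration. This is not how Alexa or Google Home actually do it.

```python
import math

def average_template(recordings):
    """Average several equal-length feature lists into one template."""
    count = len(recordings)
    length = len(recordings[0])
    return [sum(rec[i] for rec in recordings) / count for i in range(length)]

def distance(a, b):
    """Euclidean distance between two equal-length feature lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train(examples):
    """examples maps each word to a list of recordings of that word."""
    return {word: average_template(recs) for word, recs in examples.items()}

def recognize(templates, sound, max_distance=1.0):
    """Return the word with the closest template, or None if nothing is close."""
    best_word, best_dist = None, float("inf")
    for word, template in templates.items():
        d = distance(template, sound)
        if d < best_dist:
            best_word, best_dist = word, d
    # max_distance is an arbitrary cutoff chosen for this toy data.
    return best_word if best_dist <= max_distance else None

# Made-up toy data: three numbers stand in for each recording.
examples = {
    "kitten": [[1.0, 0.0, 0.2], [0.8, 0.2, 0.0], [0.9, 0.1, 0.1]],
    "puppy": [[0.0, 1.0, 0.9], [0.2, 0.8, 1.0], [0.1, 0.9, 1.1]],
}
templates = train(examples)
print(recognize(templates, [0.95, 0.05, 0.1]))  # closest to the "kitten" average
print(recognize(templates, [5.0, 5.0, 5.0]))    # far from everything
```

The cutoff is what covers the "or something really close to it" part: a noise that is not near any stored average gets no word at all instead of a wrong guess.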

If you go an extra step, you can have the computer repeat back to you what it thinks you said, and you can see if it's right.

If you do that enough times, with a lot of words, computers get pretty good at matching sounds to words.
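The "repeat it back and see if it's right" step can also be sketched in a few lines of Python. This is only an illustration: `toy_recognize` is a made-up stand-in for whatever recognizer you have built, and the test sounds are invented numbers, not real audio.

```python
def accuracy(recognize, test_set):
    """Fraction of (sound, expected_word) pairs the recognizer gets right."""
    correct = 0
    for sound, expected in test_set:
        guess = recognize(sound)
        # "Repeat back" what the computer thinks it heard.
        print(f"heard {guess!r}, expected {expected!r}")
        if guess == expected:
            correct += 1
    return correct / len(test_set)

def toy_recognize(sound):
    """Hypothetical recognizer: guesses from the first number only."""
    return "kitten" if sound[0] > 0.5 else "puppy"

# Invented test sounds, each paired with the word that was really said.
test_set = [
    ([0.9, 0.1], "kitten"),
    ([0.2, 0.8], "puppy"),
    ([0.4, 0.3], "kitten"),  # the toy recognizer gets this one wrong
]
score = accuracy(toy_recognize, test_set)
print(f"accuracy: {score:.2f}")
```

Checking against sounds the computer was not trained on is the useful version of this test, since it shows whether the matching works on new recordings and not only on the ones it averaged.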