How does voice recognition work?
Your device (Google, Apple, Amazon) has stored recordings of you saying some letters and words. When you ask for something, it compares what you are saying with those recordings, and that’s how it works. Also, every time you use voice recognition you are improving it, because your new voice recordings are saved and then analysed too.
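To make the "compare what you say with stored recordings" idea concrete, here is a minimal sketch using dynamic time warping (DTW), a classic template-matching technique. The answer above doesn't name a method, so DTW is my assumption, and the `STORED` recordings are made-up toy numbers standing in for real audio features.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # step both
    return cost[n][m]

# Hypothetical stored recordings: one toy feature value per time step.
STORED = {
    "yes": [1, 3, 4, 3, 1],
    "no": [4, 4, 2, 1, 1],
}

def best_match(utterance):
    """Return the stored word whose recording is closest to the utterance."""
    return min(STORED, key=lambda word: dtw_distance(utterance, STORED[word]))

# The same "yes" spoken at half speed: every value lasts twice as long.
slow_yes = [1, 1, 3, 3, 4, 4, 3, 3, 1, 1]
print(best_match(slow_yes))
```

DTW is used here because it lets the two sequences be stretched against each other, so a slower or faster version of the same word still lines up with its stored recording.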
Nowadays it mostly works the way it happens in our brains: the stream of sound is separated into a set of different frequencies, which is then fed to a neural network that remembers what each word sounds like (or, to be specific, what it looks like as a sequence of sets of frequencies). The neural network can compress, scale or transform sounds, so it can recognize words even if there’s some noise in the recording, or if they are said at a different pitch characteristic of a person’s voice, or at a different tempo.
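The "sequence of sets of frequencies" is usually called a spectrogram. Here is a rough sketch of that first step only (not the neural network), assuming a naive discrete Fourier transform over short overlapping frames. The frame size, hop and test tone are arbitrary values I picked for illustration, and real systems would use a fast FFT library instead.

```python
import cmath
import math

def frame_signal(samples, frame_size, hop):
    """Split a list of samples into overlapping frames."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, hop)]

def magnitude_spectrum(frame):
    """Naive DFT: strength of each frequency bin from 0 up to Nyquist."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags

def spectrogram(samples, frame_size=64, hop=32):
    """One set of frequency strengths per frame, in time order."""
    return [magnitude_spectrum(f)
            for f in frame_signal(samples, frame_size, hop)]

# Toy input: a pure 1000 Hz tone sampled at 8000 Hz (assumed values).
SAMPLE_RATE = 8000
FRAME_SIZE = 64
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]

spec = spectrogram(tone, frame_size=FRAME_SIZE, hop=32)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames, strongest frequency:", peak_hz, "Hz")
```

Each row of `spec` is one "set of frequencies", and the rows in order are the sequence a network would be trained on.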
It essentially breaks the sound recording down into very tiny pieces and compares each piece with known sounds, or phonemes (the sounds you make when saying specific letters like p or t). It then tries to combine these sounds into words and sentences that make sense in the context in which they were used.
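A toy sketch of that pipeline, under heavy assumptions: each tiny piece is already reduced to three made-up feature numbers, `PHONEME_TEMPLATES` and `LEXICON` are hypothetical hand-written tables, and a plain dictionary lookup stands in for the "makes sense in context" step that real systems handle with a language model.

```python
import math

# Hypothetical reference features for three phonemes (invented numbers).
PHONEME_TEMPLATES = {
    "k": (0.9, 0.1, 0.1),
    "ae": (0.1, 0.9, 0.2),
    "t": (0.2, 0.1, 0.9),
}

# Hypothetical pronunciation dictionary: phoneme sequence -> word.
LEXICON = {
    ("k", "ae", "t"): "cat",
    ("t", "ae", "k"): "tack",
}

def nearest_phoneme(features):
    """Pick the known phoneme whose template is closest to this piece."""
    return min(PHONEME_TEMPLATES,
               key=lambda p: math.dist(features, PHONEME_TEMPLATES[p]))

def collapse_repeats(labels):
    """A phoneme spans several pieces, so merge consecutive duplicates."""
    merged = []
    for label in labels:
        if not merged or merged[-1] != label:
            merged.append(label)
    return merged

def recognize(frames):
    """Label each piece, merge repeats, then look the sequence up."""
    phonemes = collapse_repeats([nearest_phoneme(f) for f in frames])
    return LEXICON.get(tuple(phonemes), "<unknown>")

# Five noisy pieces: two of "k", two of "ae", one of "t".
frames = [
    (0.8, 0.2, 0.1),
    (0.85, 0.15, 0.1),
    (0.2, 0.8, 0.3),
    (0.1, 0.95, 0.2),
    (0.3, 0.1, 0.8),
]
print(recognize(frames))
```

Reversing the pieces gives the phonemes t, ae, k and therefore "tack", which shows that the order of the sounds matters as much as the sounds themselves.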