There are two chips.
The first chip listens for the “start phrase”. It processes every sound it receives, but it is a “dumb device” that can’t do anything other than match the start phrase.
Once the start phrase is matched, the chip will *pass through* the sound data to the main processor. The main processor runs the required software and has the internet connection needed to decode the sound into an interpretation. The interpretation could be an instruction, a message, or anything else the software supports.
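If it helps to see the gating idea as code, here is a rough Python sketch. It is only an illustration, not how any real device is implemented: sound is modelled as a list of words rather than audio, and the names `WakeWordChip`, `MainProcessor`, `run` and the phrase "hey device" are all made up for the example.

```python
from collections import deque

WAKE_PHRASE = ("hey", "device")  # hypothetical start phrase, chosen for the example


class MainProcessor:
    """Stands in for the main chip: it only ever sees what is forwarded to it."""

    def __init__(self):
        self.received = []

    def handle(self, frame):
        self.received.append(frame)


class WakeWordChip:
    """Stands in for the first chip: hears everything, can only match the phrase."""

    def __init__(self, main, phrase=WAKE_PHRASE):
        self.main = main
        self.phrase = tuple(phrase)
        # Rolling buffer just long enough to hold the phrase; older sound falls out.
        self.recent = deque(maxlen=len(self.phrase))
        self.awake = False

    def hear(self, frame):
        if self.awake:
            # Pass-through mode: forward the sound without interpreting it.
            self.main.handle(frame)
            return
        self.recent.append(frame)
        if tuple(self.recent) == self.phrase:
            self.awake = True
            self.recent.clear()


def run(frames):
    """Feed a sequence of frames to the first chip; return what the main chip got."""
    main = MainProcessor()
    chip = WakeWordChip(main)
    for frame in frames:
        chip.hear(frame)
    return main.received
```

Calling `run(["private", "chat", "hey", "device", "play", "music"])` hands the main processor only the frames that come after the phrase; everything before it is dropped by the first chip. The sketch never goes back to sleep after waking, which a real device would presumably do once the request is finished.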
TLDR: the first chip hears everything, but can’t do anything except check for the activating phrase. Only after the phrase is detected is the sound passed on to the main chip, which can understand all the data in the audio.