It is constantly listening. But it is also constantly forgetting: everything it hears is discarded after six seconds.
Only when it hears its name does it decide to pay attention to what it’s hearing and retain and act on that information.
All of the subconscious hearing and forgetting happens locally on the device. When it needs to act (after hearing its name), its conscious brain kicks in – its connection to the Siri servers over the Internet – and it starts to actually process what it's hearing.
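A rough sketch of that "hear, forget, only pay attention to your name" loop, in Python. The buffer size, the word-by-word simulation, and the function names are all made up for illustration; the real detector works on raw audio and is far more sophisticated.

```python
from collections import deque

BUFFER_SECONDS = 6                       # the "forgetting" window from above
WAKE_PHRASE = ("hey", "siri")

# A fixed-size buffer: old audio falls off the back as new audio arrives,
# so nothing older than ~6 seconds is ever retained.
recent_audio = deque(maxlen=BUFFER_SECONDS)

def heard_wake_word(buffer) -> bool:
    """Stand-in for the local, offline wake-word check."""
    return tuple(buffer)[-2:] == WAKE_PHRASE

def send_to_cloud(buffer) -> None:
    """Stand-in for the 'conscious brain': only now does anything leave the device."""
    print("Waking up – sending what follows to the servers:", list(buffer))

# Simulate one "word" of audio arriving per second.
for word in ["blah", "blah", "hey", "siri", "what's", "the", "weather"]:
    recent_audio.append(word)            # constantly listening...
    if heard_wake_word(recent_audio):    # ...but only acting on its name
        send_to_cloud(recent_audio)
```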
It is constantly listening for the trigger word/phrase. But this is all done locally, and it isn't recording what you say until you say the trigger word/phrase (or it thinks you did!). At that point it activates and starts recording; what you say is then sent to Apple's servers to be processed to work out what you wanted, a response is sent back, and the device vocalizes it.
There is a term for this: it's called a "**Wake Word**." This is "Hey Siri," "Hey Alexa," "OK Google," etc.
The wake word is *local*, meaning the physical device can listen for this word *regardless of online status*. Once it hears the wake word, *then* the cloud language models come into play to work out what you're asking.
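As a tiny illustration of that split (the names, strings, and return values here are all made up), the detection step needs no network at all; only the "what did you actually ask" step does:

```python
def detect_wake_word(audio: str) -> bool:
    """Toy stand-in for the local detector – runs entirely on the device."""
    return "ok google" in audio.lower()

def handle_audio(audio: str, online: bool) -> str:
    if not detect_wake_word(audio):
        return ""                                   # no wake word: nothing is kept or sent
    if not online:
        return "Sorry, I can't connect right now."  # the wake-up worked, the cloud part can't
    return "...answer worked out by the cloud language models..."

print(handle_audio("ok google what's the time", online=False))
print(handle_audio("just chatting about dinner", online=True))
```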
I think where there is a lot of confusion or concern (and rightfully so) is that the wake word can be misheard, so Siri may respond to something you said because IT THOUGHT you said "Siri" or "Alexa." This sometimes misleads people into thinking that everything they say is being transmitted to the cloud.
If the devices were listening all the time and transmitting that audio to servers, it would use up massive amounts of bandwidth and battery life, which is why the wake word detection is done locally.
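Some rough, assumed numbers to show the scale (speech-quality audio, uncompressed – not anything Apple has published):

```python
# Back-of-envelope figures – assumptions for illustration only.
SAMPLE_RATE_HZ = 16_000        # samples per second of speech-quality audio
BYTES_PER_SAMPLE = 2           # 16-bit mono
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_day = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * SECONDS_PER_DAY
print(f"{bytes_per_day / 1e9:.1f} GB of raw audio per device per day")  # ~2.8 GB
```

Compression would shrink that a lot, but keeping the radio transmitting around the clock is exactly the kind of thing that drains a phone battery.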
Who’s gonna tell them?
Long story short, your phone is constantly listening to you, and as for the people below saying it doesn't transmit anything until after you say "Hey Siri" – I'm not sure how true that is. There are lots of stories of people getting ads for things they were talking about but never actually typed into a computer or a phone.
Your parents are talking and you are doing your own stuff, not caring. Your ears are still hearing your parents talk, and your brain is also processing it, but since it does not involve you, the brain lets you focus on your stuff. But when your parent calls out "Hey /u/TheNaughtyDefection", your brain now becomes focused on the conversation.
If it was constantly listening and analyzing what you were saying, your battery would be dead quickly and your data would get eaten. So basically there's a low-power listening device in the phone that is hardcoded to only listen for a single phrase. This is also why most of these devices only let you use specific, unchangeable phrases to activate them.
The low-power device only has a buffer of a few seconds. Once it hears the trigger word, it wakes up the rest of the software and hands the recording over to it, after which the request is analyzed and responded to.
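A minimal sketch of that two-stage design, with made-up names and text standing in for audio: a tiny always-on listener with a short buffer, and the heavyweight software that stays asleep until it gets handed the recording.

```python
from collections import deque

class LowPowerListener:
    """Stand-in for the always-on chip: a few seconds of buffer and one
    hardcoded job – spot the trigger phrase."""
    def __init__(self, buffer_chunks: int = 4):
        self.buffer = deque(maxlen=buffer_chunks)   # only a few seconds of audio

    def hears_trigger(self, chunk: str) -> bool:
        self.buffer.append(chunk)
        return "hey siri" in " ".join(self.buffer)

class MainAssistant:
    """Stand-in for the rest of the software, normally asleep."""
    def wake_and_handle(self, buffered_audio) -> None:
        print("Main software woken up, analysing:", list(buffered_audio))

listener = LowPowerListener()
assistant = MainAssistant()

# Simulate chunks of audio arriving one after another.
for chunk in ["blah blah", "something else", "hey", "siri set a timer"]:
    if listener.hears_trigger(chunk):
        # Hand the short recording over; the power-hungry part takes it from here.
        assistant.wake_and_handle(listener.buffer)
```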
To top that off, the low-powered listener probably doesn't even know exactly what words you're saying. It would just be programmed to hear the general tones that make up "hey siri," kind of like "high pitch going down, followed by two pitches going up within two seconds of each other, with no gap between the two sounds." That's why you get lots of false positives, and why you can say something *close* to "hey siri" and it will still wake up.
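You can get a feel for that with a toy "close enough" check. This one compares spelled-out text with Python's difflib, which is nothing like real acoustic matching, but it shows how a similarity threshold lets near-misses through:

```python
from difflib import SequenceMatcher

THRESHOLD = 0.6   # made-up cut-off: "sounds similar enough"

def trips_the_detector(heard: str, wake_word: str = "hey siri") -> bool:
    """Score rough similarity instead of recognising the exact words."""
    similarity = SequenceMatcher(None, heard.lower(), wake_word).ratio()
    return similarity > THRESHOLD

print(trips_the_detector("hey siri"))            # True – the real thing
print(trips_the_detector("hey seriously"))       # True – close enough: a false positive
print(trips_the_detector("what's for dinner"))   # False – nothing happens
```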
So to answer the question, it *is* constantly listening to you, but it has no idea what you're saying until it hears the magic sound to wake up its better half.