Speech assistants today have two components: a small device with a speaker, a microphone, and Internet access that records your speech, and a collection of servers that analyze those recordings and turn them into commands. Full speech analysis takes too much computing power to run on the small device, but the device can do a little on its own. If it only has to listen for one specific, easy-to-detect phrase, it can be set to watch for just that. Then, whenever it hears something that might match the phrase, it sends the recording to the service to verify whether it was a real match.
What you end up with is a device that is constantly listening to your conversations but not constantly uploading them to the Internet: it only uploads when it thinks you said the key phrase. The catch is that, because of its limited processing power, the device produces a lot of false positives. You might be having a normal conversation, but the device, with its crude detector, thinks you said "Hey Siri". It uploads that recording to the speech recognition service, which analyzes it on better hardware and determines that you were just having a normal conversation. The service then stores this recording for later training without your knowledge.
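The two-stage flow described above can be sketched in a few lines of Python. Everything here is illustrative: the function names, the fake "rough transcript" stand-in for actual audio, and the scores are assumptions for the sketch, not any vendor's real wake-word API. The cheap on-device check deliberately matches anything starting with "hey s", which is how a similar-sounding phrase ends up uploaded and then rejected by the server.

```python
def cheap_on_device_score(chunk):
    # Stand-in for a tiny, low-power on-device model. It fires on anything
    # that merely *starts like* the wake phrase, which is why it produces
    # false positives such as "hey seriously".
    return 0.9 if chunk["rough_transcript"].startswith("hey s") else 0.1

def cloud_verify(chunk):
    # Stand-in for the far more accurate server-side model; here we just
    # read off a label we attached to the fake audio chunk.
    return chunk["actually_wake_word"]

def listen_loop(stream, threshold=0.5):
    """Constantly listen; upload only chunks the cheap detector flags."""
    uploads, activations = 0, 0
    for chunk in stream:
        if cheap_on_device_score(chunk) >= threshold:  # local guess
            uploads += 1          # only now does audio leave the device
            if cloud_verify(chunk):   # server double-checks the match
                activations += 1
    return uploads, activations

stream = [
    {"rough_transcript": "hey siri what time is it", "actually_wake_word": True},
    {"rough_transcript": "hey seriously stop", "actually_wake_word": False},
    {"rough_transcript": "nice weather today", "actually_wake_word": False},
]
print(listen_loop(stream))  # (2, 1): two uploads, one real activation
```

Note the gap between the two numbers: the second chunk never contained the wake phrase, but it was still recorded and uploaded before the server rejected it, which is exactly the privacy concern described above.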