How do phones constantly listen for voice commands like ‘Hey Siri’ or ‘Ok Google’ without draining the battery quickly?

11 Answers

Anonymous 0 Comments

The more “passive” things can be designed to consume very little power. The main power hogs are intensive CPU, intensive graphics and use of wireless networks. Monitoring a microphone (passive) and running low intensity sound processing consumes very little power relative to the batteries on a modern phone.

Anonymous 0 Comments

Usually, this is implemented with specialized hardware like a digital signal processor (DSP) that uses minimal power doing this one specialized job while the rest of the system (like the main CPU) is suspended. When the DSP detects the wake phrase, it can wake or interrupt the main CPU to handle the request. This DSP is usually part of the system on chip (SoC) that contains the main CPU.

Stuff like the display, cellular modem, wifi chipset, and main CPU are typically the power hogs.
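To make the "small chip wakes the big chip" idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any real phone firmware is written: `LowPowerListener`, `loud_enough`, and the threshold value are hypothetical names and numbers, and the stand-in detector only checks loudness, whereas a real DSP would match the actual wake phrase.

```python
from collections import deque
from typing import Callable, List

Frame = List[float]  # one short chunk of audio samples


def frame_energy(frame: Frame) -> float:
    """Mean squared amplitude: a very cheap per-frame measurement."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def loud_enough(frames: List[Frame], threshold: float = 0.1) -> bool:
    """Hypothetical stand-in detector: fires when the newest frame is loud.

    A real always-on detector would look for the wake phrase instead.
    """
    return frame_energy(frames[-1]) > threshold


class LowPowerListener:
    """Plays the role of the DSP: it sees every frame and does very little work."""

    def __init__(
        self,
        detector: Callable[[List[Frame]], bool],
        wake_main_cpu: Callable[[List[Frame]], None],
        history_frames: int = 4,
    ) -> None:
        self.detector = detector
        self.wake_main_cpu = wake_main_cpu
        # Small ring buffer: only the last few frames are ever kept.
        self.history: deque = deque(maxlen=history_frames)
        self.wake_count = 0

    def feed(self, frame: Frame) -> bool:
        """Process one frame; return True if the main CPU was woken."""
        self.history.append(frame)
        if self.detector(list(self.history)):
            self.wake_count += 1
            # Hand the buffered audio to the expensive side of the system.
            self.wake_main_cpu(list(self.history))
            self.history.clear()
            return True
        return False


if __name__ == "__main__":
    woken: List[List[Frame]] = []
    listener = LowPowerListener(loud_enough, woken.append)
    listener.feed([0.0, 0.01])   # quiet: main CPU stays asleep
    listener.feed([0.9, -0.8])   # loud: main CPU is woken with the buffer
    print(listener.wake_count, len(woken))
```

The point of the design is that the expensive callback runs only when the cheap check passes, so almost all of the time only the small loop is doing any work.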

Anonymous 0 Comments

Adding to the above comments: this is all done locally on a chip in the phone. The phone only connects to the internet to “translate” the command after it has detected the initial wake phrase (“Hey Siri” or “Ok Google”).

Anonymous 0 Comments

There are things very easy for you to do without thinking about them, like walking or balancing. You have brain cells that are dedicated specifically for these tasks. They don’t require much thought for them to do their job and they don’t need as much food or oxygen as the rest of your brain does.

Electronics in your phone or your computer eat more electricity the harder they have to think. It used to be that even simple things required phones and computers to think hard, so listening for speech used a lot of energy.

Nowadays we’ve figured out how to make parts of computers think the same way that your brain thinks. We can create neural networks that understand speech, and those networks can run very efficiently on special processors in your phone or computer that don’t require a lot of energy. They can now listen for those special words, or even identify songs, and they don’t need your phone to think about it, and so they’re very energy efficient.

If you want to learn more, search for stuff about “machine learning” and technologies like TensorFlow.

Anonymous 0 Comments

The device isn’t listening to your words and actively translating them at all times. It’s listening for a pattern, like “Al-ex-uh.” Once it hears the right pattern, it switches to listening mode, where it can process your command.

It’s kinda like tuning something out, like my kids pretending to be streamers and talking to no one. I am not listening to them and my brain is ignoring their input until they say, “mom.” When they say mom, I know I need to listen to their request.
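The "tune everything out until the pattern shows up" behavior can be sketched as a tiny two-mode state machine. This is a toy illustration only: `WakeWordGate` and `Mode` are hypothetical names, and it matches text syllables rather than real audio, which an actual device would compare acoustically.

```python
from enum import Enum
from typing import List, Sequence


class Mode(Enum):
    IGNORING = "ignoring"    # only watching for the wake pattern
    LISTENING = "listening"  # wake pattern heard; capturing the command


class WakeWordGate:
    def __init__(self, wake_pattern: Sequence[str] = ("al", "ex", "uh")) -> None:
        self.wake_pattern = tuple(wake_pattern)
        self.mode = Mode.IGNORING
        self.recent: List[str] = []   # last few sounds, just enough to match
        self.command: List[str] = []  # filled only while LISTENING

    def hear(self, sound: str) -> None:
        if self.mode is Mode.LISTENING:
            self.command.append(sound)
            return
        # While ignoring, keep only as many sounds as the pattern is long.
        self.recent.append(sound)
        self.recent = self.recent[-len(self.wake_pattern):]
        if tuple(self.recent) == self.wake_pattern:
            self.mode = Mode.LISTENING
            self.recent = []

    def finish(self) -> List[str]:
        """End the request: return the captured command and go back to ignoring."""
        command, self.command = self.command, []
        self.mode = Mode.IGNORING
        return command


if __name__ == "__main__":
    gate = WakeWordGate()
    for sound in ["blah", "al", "ex", "blah", "al", "ex", "uh", "play", "music"]:
        gate.hear(sound)
    print(gate.finish())
```

Everything heard before the pattern is discarded after a few sounds; only what follows the pattern is kept and passed on.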

Anonymous 0 Comments

What this comes down to is the difference between software and hardware.

Software running on a general-purpose processor is slower and uses more energy. Hardware built for a single task is specialized and efficient.

Picture this: Have you ever used a coin sorter? You dump the coins in the trough and gravity and the size of the coins are used to send the different coins down different paths and they all end up ready to be put into tubes. This works quickly and efficiently and no real thinking is done.

Compare that to a robot arm with a camera that looks at the coins, picks each one up, and puts it into a pile. This can get fast, but it will never be as fast and efficient as the sorter. Think of all the wasted energy in just moving the arm back and forth. This is like software. There is just all this overhead and baggage that slows the process down.

Your phone has a hardware chip that does one thing and that is to listen for that command. It is efficient so it doesn’t consume very much power.

Anonymous 0 Comments

The big brain of the device does all of the stuff that uses lots of power, like running the display and connecting to the internet. The manufacturers build a little brain into the device whose only job is to listen for the phrase. It takes almost no power. When the little brain hears the phrase, it wakes up the big brain.

Anonymous 0 Comments

Modern cell phones use multi-core CPUs. The high power functioning cores can sleep while a low power core runs the clock & stands guard – looking for a reason to wake up the other cores.

Anonymous 0 Comments

Because they were already listening way, way back. They just needed an excuse to ‘formalize’ and legalize it.

Anonymous 0 Comments

So…not sure what you’re thinking draws a lot of power here, but…as others have said, the thing only connects to the internet when it has to, which is when it finally hears the right command. The “hearing” happens on the device, as does the first interpretation of what was heard. Meaning it hears you and figures out IF you said the command word before doing anything else. No command word, no increased power draw.

But…also worth knowing is that the microphone itself needs very little power. Some microphone types (dynamic ones) even generate their tiny signal from the sound itself, though the MEMS microphones typical in phones do draw a small current. Either way it’s a tiny, tiny amount, and the first-line detection is cheap partly because it doesn’t need much amplification before going to the interpreter.