I’m not an expert on this, but here’s what I’ve heard. When you speak, your brain is simultaneously listening and checking your speech for accuracy, making adjustments and error corrections as necessary. This happens pretty fast, basically at the same time you’re speaking. (Since your brain already knows what it’s going to say beforehand, it sort of has a transcript of your words before you even speak.) This process is mostly automatic and unconscious. I can only assume that hearing your voice a second time, or at a different time than your brain is expecting, interferes with this error correction process. By the time you hear yourself, you’ve moved on to another word, and your brain is hearing the “wrong” thing. If any neurologists are here, feel free to correct me. Lol
Brain tells Mouth to say “Apple”
Ears check that Mouth said Apple correctly. They hear “Apple” because it came out of your mouth, so they tell Brain everything is ok.
Brain tells Mouth to say “Tree”
Ears check that Mouth said Tree correctly. They hear both Tree and Apple because Tree came from your mouth, but Apple came from the slightly delayed speakers behind you. Ears tell Brain you messed up because they heard something that kind of sounded like Tree, but wasn’t quite right.
Brain tries to verify that you did in fact say Tree and it successfully does so, but that sure was confusing and a lot more effort than normal conversation.
If this happens for a whole sentence, your brain runs out of processing power and freezes like a computer.
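If it helps to see the Apple/Tree walk-through as a timeline, here is a rough toy sketch in Python. It is not a model of real hearing: the function name `heard_while_speaking`, the fixed word length, and the back-to-back timing are all assumptions made up for illustration. It just lists what reaches the ears while each word is being spoken.

```python
def heard_while_speaking(words, word_ms, delay_ms):
    """For each spoken word, list what reaches the ears during it.

    Toy assumptions (not from the thread): every word lasts word_ms,
    words are spoken back to back, and the speakers replay each word
    delay_ms after it started.
    """
    timeline = []
    for i, word in enumerate(words):
        start, end = i * word_ms, (i + 1) * word_ms
        heard = [word]  # live voice, straight from the mouth
        for j, earlier in enumerate(words):
            echo_start = j * word_ms + delay_ms
            echo_end = echo_start + word_ms
            # keep the delayed copy if it overlaps this word's time slot
            if echo_start < end and echo_end > start:
                heard.append(earlier + " (delayed)")
        timeline.append((word, heard))
    return timeline


for word, heard in heard_while_speaking(["Apple", "Tree"], word_ms=200, delay_ms=200):
    print(word, "->", heard)
```

With those made-up numbers, the ears get only “Apple” during the first word, but both “Tree” and the delayed “Apple” during the second, which is the mismatch described above.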
Most of us are taught not to speak when someone else is speaking. It’s usually an awkward form of confrontation if two people are speaking simultaneously at the same volume and saying different things.
When your own voice gets played back at you, it’s like someone else is talking at the same time. Most people will not be able to properly focus on their own thoughts in such a situation.
Speech jamming is actually best done with about 200ms of delay in adults. Shorter times are more likely to be processed as echo and discarded; longer times (like a few seconds) don’t match up closely enough to trigger the reflex that subconsciously monitors your speech and corrects errors without you thinking about it. 200ms is long enough that the sound at your ear is different from the sound at your mouth, so the brain processes that as an error.
Pro tip: if you’re working an event and a rowdy guest crashes the party and grabs the mic uninvited, throw 200ms onto that feed.
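For anyone curious what “throwing 200ms onto a feed” amounts to, here is a minimal sketch of a fixed delay line in Python. The name `make_delay` and the 48 kHz default sample rate are my own assumptions, not anything a particular mixer or audio library provides, and a real setup would process audio in blocks rather than one sample at a time.

```python
from collections import deque


def make_delay(delay_ms, sample_rate=48000):
    """Return a function that delays a stream of samples by delay_ms.

    Hypothetical helper for illustration: it holds the most recent
    samples in a FIFO buffer and plays them back delay_ms later.
    """
    n = int(sample_rate * delay_ms / 1000)  # delay expressed in samples
    buffer = deque([0.0] * n)  # starts as silence

    def process(sample):
        if n == 0:
            return sample  # no delay requested
        buffer.append(sample)    # newest sample goes in at the back
        return buffer.popleft()  # sample from delay_ms ago comes out

    return process


# Usage: a single click at time 0 comes back out 200 ms later.
delay = make_delay(200, sample_rate=1000)  # 1000 Hz keeps the demo small
signal = [1.0] + [0.0] * 299
delayed = [delay(s) for s in signal]
print(delayed.index(1.0))  # position of the click in the delayed feed
```

At 48 kHz, a 200ms delay means holding 9600 samples, so the buffer is tiny; the effect comes from the timing, not from anything fancy done to the sound.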