Simplifying hugely: it’s probably because there’s a bit of our memory called the **phonological loop** that can store 1-2 seconds of audio.
Psychologist Alan Baddeley found this via some brilliant experiments getting children to memorise sequences of numbers. It turns out that Welsh-speaking children can memorise shorter strings than English-speaking children, while Chinese-speaking children can memorise longer strings. That’s because Welsh digits take longer, on average, to say aloud than English digits, while Chinese digits take less time. In all cases, the children could memorise about as many digits as they could say in two seconds or so.
See https://www.annualreviews.org/doi/pdf/10.1146/annurev-psych-120710-100422 for a non-ELI5 explanation!