When we sing words, we change the shape of our mouth, which is the space the tones from our vocal cords pass through. Most musical instruments don't have such an easily modifiable space for the sound to pass through.
However, you can attach a tube to the speaker of your instrument, put it in your mouth, and then mouth words (without using your voice) while the instrument is playing. This is called a talkbox, and it's a really fun way to play an instrument. Look it up on YouTube!
There are several different things that are modulated in forming phonetic sounds. The [International Phonetic Alphabet](https://en.wikipedia.org/wiki/International_Phonetic_Alphabet#:~:text=The%20International%20Phonetic%20Alphabet%20(IPA,speech%20sounds%20in%20written%20form.) categorizes all possible combinations of these to form a script that can accurately describe the pronunciation of words in any language. Notably, some combinations are impossible, and some possible combinations are not known to occur in any language.
It would be very difficult to create an instrument that could be manipulated to modulate enough variables to recreate speech. However, a talkbox is a really neat little device that plays the sound of an instrument into your mouth, so you can use your own mouth to modulate the sound and make words that sound like a guitar or keyboard or whatever.
A lot of organs have a voice register or choir register. These can be confused with a real human voice, at least for the duration of a note. The issue is that it is a lot of work to recreate all the possible sounds a human can sing. It is not just the 26 letters of the alphabet: each letter can be pronounced in different ways, and there are unique ways of transitioning from one sound to the next. So while we know how to make an organ sing like a choir, it is too much work to make it sing more than one or two vowel sounds. This is something that computers have only been able to do faithfully for about ten years, and they are still not quite there.
Not an ELI5 but still: equal notes do not mean equal sound. A guitar and a human can both produce the same “note” defined by its fundamental frequency (for example, middle A is 440 Hz, and humans and guitars can both produce this). However, the harmonic frequencies (integer multiples of the fundamental: 880 Hz, 1320 Hz, etc.) will differ in relative strength. Therefore the formant (an envelope around the frequency peaks) will be different. A singer can vary their formant by changing the shape of their mouth and vocal tract, but a guitar cannot vary its formant by nearly as much. As others have said, there are ways that different instruments can approximate this: think of jazz trumpeters playing with a cup mute, which adds a human-like “wah” to the sound they play.
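To make that concrete, here's a minimal numpy sketch of the idea: two waveforms share the same 440 Hz fundamental, but weighting the integer multiples differently gives each a different spectral envelope, so they'd sound different despite being the same "note." (The amplitude lists are invented for illustration, not measured from real instruments or voices.)

```python
import numpy as np

SR = 44100                        # samples per second
t = np.linspace(0, 1.0, SR, endpoint=False)
f0 = 440.0                        # shared fundamental: middle A

def tone(harmonic_amps):
    """Sum integer multiples of f0, each weighted by an amplitude."""
    wave = sum(a * np.sin(2 * np.pi * f0 * (k + 1) * t)
               for k, a in enumerate(harmonic_amps))
    return wave / np.max(np.abs(wave))   # normalize to [-1, 1]

# Same pitch, different harmonic envelopes -> different timbre.
# (These amplitudes are made up for illustration.)
guitar_like = tone([1.0, 0.6, 0.4, 0.25, 0.15, 0.1])
voice_like  = tone([1.0, 0.2, 0.9, 0.1, 0.5, 0.05])
```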
Tone is one of many aspects of a sound that make it… well, sound the way it does. Timbre, harmonics, color, and I'm sure there are others a sound engineer could name off. If you've ever heard a sine wave generator (you can Google this and have a listen), that's the only “pure” tone: just one frequency. Every single other sound you've ever heard is a combination of frequencies. The exact combination of frequencies is what makes a sound different while still maintaining the same “base” frequency or “tone.” People can sound like guitars, but guitars can't sound like people. That's because guitars, once made, have a fixed set of timbre/harmonics/color/whatever, because they are generally very rigid objects.
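If you'd rather generate that pure tone than Google it, here's a small sketch using numpy and Python's standard wave module (the filename is just an example): it writes two seconds of a lone 440 Hz sine wave to a file you can open and listen to.

```python
import wave
import numpy as np

SR = 44100
t = np.linspace(0, 2.0, 2 * SR, endpoint=False)
pure = 0.5 * np.sin(2 * np.pi * 440.0 * t)   # a single frequency, nothing else

# Convert to 16-bit little-endian samples and write a mono WAV file.
samples = (pure * 32767).astype('<i2')
with wave.open("pure_tone.wav", "wb") as f:
    f.setnchannels(1)        # mono
    f.setsampwidth(2)        # 16-bit
    f.setframerate(SR)
    f.writeframes(samples.tobytes())
```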
Humans, on the other hand, can change the shape of their mouths, the position of their tongues, the pathways air takes through their throats, and can even change the rigidity of various parts of the process by flexing or relaxing certain muscles. We can make soooooo many different sounds. But whatever sound you make can be recorded. If it can be recorded, we can analyze what weird combination of frequencies is stacked up to create that specific sound, and then we could theoretically play back each of those frequencies, one per speaker, and it would sound like a human… or a guitar… or whatever you recorded. (We tend to take a shortcut and add up all the frequencies at once to play them out of one speaker… or just play the raw audio before we've picked apart which frequencies make it up.)
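That analyze-then-play-back idea is basically a Fourier transform. A rough sketch with numpy, using a synthetic two-frequency signal as a stand-in for a real recording:

```python
import numpy as np

SR = 44100
t = np.linspace(0, 1.0, SR, endpoint=False)
# Stand-in for a recording: in practice this array would hold real audio.
recorded = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1320 * t)

# Analyze: break the recording into its component frequencies.
spectrum = np.fft.rfft(recorded)

# Resynthesize: adding all the frequencies back together recovers the
# original sound (the "shortcut" of playing the sum from one speaker).
resynthesized = np.fft.irfft(spectrum, n=len(recorded))

assert np.allclose(recorded, resynthesized)
```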
So essentially, a guitar only has one kind of jumbled-up combination of frequencies that it plays for any given tone, and it's not exactly what a human does for that same tone. (Actually, you can probably play the same tone on two different strings of a guitar, and there may be just enough of a subtle difference to pick up on.) Also, theoretically, if you had, like, 100 guitarists and each had 6 different frequencies they could play, and they all coordinated to take the frequencies that make up your voice and played them at the same time, the resulting combination could sound like a voice as well. Theoretically you'd need infinitely many guitars to sound exactly like a human, but finite approximations are more than adequate for most things. Check out Mark Rober's self-playing piano video for an example where he made a piano talk… sort of… like I said, it's only a finite approximation.
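The "100 guitarists" thought experiment maps onto additive synthesis: keep only the N strongest frequency components and you get a finite approximation that improves as N grows. A sketch, again assuming numpy (`recorded_voice` below is a hypothetical array of audio samples, not a real recording):

```python
import numpy as np

def approximate(signal, n_partials):
    """Keep only the n strongest frequency components, as if a finite
    group of players each contributed one frequency."""
    spectrum = np.fft.rfft(signal)
    keep = np.argsort(np.abs(spectrum))[-n_partials:]   # strongest bins
    pruned = np.zeros_like(spectrum)
    pruned[keep] = spectrum[keep]
    return np.fft.irfft(pruned, n=len(signal))

# More partials -> a closer (but always finite) approximation, e.g.:
# approximate(recorded_voice, 100) vs approximate(recorded_voice, 600)
```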
Because to do that, you would need an extremely complex instrument, one that would be damn near impossible to make or play.
To produce our vocals, we maintain a complex and delicate balance of airflow, the muscles that control our vocal folds, our throat muscles, our tongue movements, our jaw position, our lip shape, and even how much air flows through our mouth versus our nose.
Speech and singing are far more complex and amazing than most people think.