Is it possible to interpret an audio waveform to determine its timbre?

I understand that audio waveforms can show the volume (amplitude) and pitch (frequency), but does the waveform also carry information about a sound’s timbre, such that an advanced computer could read a waveform and produce music with the correct timbre?

4 Answers

Anonymous 0 Comments

In principle, yes: all the information is there. Everything that we hear is in that waveform. However, timbre isn’t something that can be defined or measured in a precise way.

That said, if someone has analyzed the waveforms of known timbres and identified which characteristics of the waveform correspond to a certain timbre, you could use that information to guess at the timbre of a particular waveform.

Using a small input signal to modify a larger piece of data is an ongoing area of research in machine learning (you may have heard of deepfakes; it is kind of similar, but for faces rather than sound). I’m not personally aware of research that targets timbre specifically, but I’d be surprised if no one was working on it.

TL;DR: Yes, the information is all there, but it is difficult to interpret.

Anonymous 0 Comments

A waveform only shows amplitude over time. Pitch is not directly visible unless you measure it, but it is visible in a spectrogram. In a spectrogram, timbre is basically which other frequencies are present on top of the main frequency. You can relatively easily build a machine learning program that can tell you which instrument is playing in a given sample.

But even without any of that, you can absolutely record an audio sample and then play it like an instrument with the same sound. That’s what sampling keyboards do.

Anonymous 0 Comments

Well, if you play the waveform exactly as it is, you produce music with the correct timbre. But I guess you mean replicate the timbre to be used in new, unrelated music. That’s one topic of machine learning currently being researched, and it seems so far that yes, it is possible.

The waveform is only one possible representation of the sound. Transforming it into frequency information better represents how we hear sound, so processing of this nature often works with frequency information rather than a waveform.

Anonymous 0 Comments

Yes. A waveform is simply a graph of amplitude over time, and with the right processing you can determine average volume, frequency, and timbre.

First, we need to take a look at what timbre actually is. Every repeating audio waveform, such as a simple waveform from a synthesizer, can be represented as the sum of a series of sine waves with different amplitudes and frequencies. These sine waves’ frequencies are all whole-number multiples of one single “main” frequency, known as the fundamental frequency. In a repeating wave cycle, the pitch is determined by the fundamental frequency, and the timbre is determined by those other frequencies (the harmonics) stacked on top of it. For example, compare [sawtooth waves](https://youtu.be/98yuxSxvx5U) and [triangle waves](https://youtu.be/XtEuEqEiuxM). The examples I linked both have the same fundamental frequency, but saw waves and triangle waves have different timbres because they are made up of different mixes of harmonics: a sawtooth contains every harmonic, while a triangle contains only the odd ones, and they fade out much faster.
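To make the sawtooth versus triangle comparison concrete, here is a minimal Python sketch that builds both shapes by adding up sine waves. It assumes the textbook Fourier series for ideal waves (sawtooth: every harmonic at 1/k strength; triangle: odd harmonics only, at 1/k² strength). The function names, the 100 Hz fundamental, and the 8000 Hz sample rate are all made up for illustration.

```python
import math

def sawtooth_partials(max_harmonic):
    """Harmonic number -> signed amplitude for an ideal sawtooth (textbook series)."""
    return {k: (2 / math.pi) * (-1) ** (k + 1) / k
            for k in range(1, max_harmonic + 1)}

def triangle_partials(max_harmonic):
    """Harmonic number -> signed amplitude for an ideal triangle (odd harmonics only)."""
    return {k: (8 / math.pi ** 2) * (-1) ** ((k - 1) // 2) / k ** 2
            for k in range(1, max_harmonic + 1, 2)}

def synthesize(partials, fundamental_hz, sample_rate, n_samples):
    """Add up one sine wave per harmonic; harmonic k sits at k * fundamental_hz."""
    return [
        sum(amp * math.sin(2 * math.pi * k * fundamental_hz * n / sample_rate)
            for k, amp in partials.items())
        for n in range(n_samples)
    ]

# Illustrative values: the same 100 Hz fundamental (same pitch) for both shapes.
saw = synthesize(sawtooth_partials(20), 100, 8000, 160)
tri = synthesize(triangle_partials(20), 100, 8000, 160)
```

Both lists repeat every 80 samples (one 100 Hz cycle at 8000 samples per second), so they share a pitch, but the sample values differ because the harmonic recipes differ. That difference in recipe is the difference in timbre.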

Things that don’t perfectly repeat can be represented as sine waves too – noise, for example, is a bunch of random sine wave frequencies, each one appearing for only an instant. The timbre of the noise is again determined by the frequency and amplitude of the sine waves comprising it – if they are all the same volume across the entire audible range, then you have white noise; and if the volume falls off steadily as frequency increases, by about 3 dB per octave (so the lower frequencies are the loudest, and the higher frequencies are the quietest), then you have pink noise.
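As a rough illustration of the white versus pink distinction, here is a sketch that fakes noise by summing many steady sine waves with random phases. That is a simplification (real noise is not a fixed set of sines), and the names, the 20 Hz spacing, and the sample rate are assumptions for the example. The only difference between the two colors is the amplitude given to each frequency: equal for white, proportional to 1/sqrt(f) for pink, which works out to roughly 3 dB less per octave.

```python
import math
import random

def noise_amplitudes(freqs_hz, color):
    """Per-frequency amplitude for the two noise colors discussed above."""
    if color == "white":
        return [1.0 for _ in freqs_hz]                 # same level everywhere
    if color == "pink":
        return [1.0 / math.sqrt(f) for f in freqs_hz]  # power falls off as 1/f
    raise ValueError("color must be 'white' or 'pink'")

def make_noise(color, sample_rate=8000, n_samples=400, seed=0):
    """Sum evenly spaced sine waves with random phases (a crude noise stand-in)."""
    rng = random.Random(seed)
    freqs = [20.0 * i for i in range(1, 200)]          # 20 Hz up to 3980 Hz
    amps = noise_amplitudes(freqs, color)
    phases = [rng.uniform(0, 2 * math.pi) for _ in freqs]
    return [
        sum(a * math.sin(2 * math.pi * f * n / sample_rate + p)
            for f, a, p in zip(freqs, amps, phases))
        for n in range(n_samples)
    ]

white = make_noise("white")
pink = make_noise("pink")
```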

There are algorithms, such as the Fast Fourier Transform, that can analyze a waveform to determine the frequencies it is made up of. This can be used to produce a [detailed spectrum for careful analysis of audio](https://www.voxengo.com/files/news/newTextSID383/screenshot.jpg/getbyname/697a0/screenshot.jpg) or even just create cool visuals that respond to the sound.
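Here is a small sketch of that kind of analysis. To stay dependency-free it uses a naive discrete Fourier transform rather than an actual FFT; an FFT computes the same numbers, just much faster. The test signal (a 200 Hz sine plus a quieter 600 Hz sine), the sample rate, and the function names are assumptions chosen so each frequency lands exactly on an analysis bin.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive DFT: one magnitude per frequency bin from 0 Hz up to Nyquist.

    Scaled so a sine of amplitude A sitting exactly on a bin reads A
    (the 0 Hz and Nyquist bins would read double with this scaling).
    """
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        mags.append(2 * abs(total) / n)
    return mags

def strong_frequencies(samples, sample_rate, threshold=0.1):
    """List (frequency in Hz, amplitude) for every bin above the threshold."""
    n = len(samples)
    return [(k * sample_rate / n, round(m, 3))
            for k, m in enumerate(dft_magnitudes(samples)) if m > threshold]

SAMPLE_RATE = 1600   # illustrative; gives 10 Hz per bin with 160 samples
N = 160
signal = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 600 * t / SAMPLE_RATE)
          for t in range(N)]

found = strong_frequencies(signal, SAMPLE_RATE)
print(found)
```

The waveform itself only stores amplitude over time, yet the analysis recovers both the 200 Hz component and the quieter 600 Hz one. With real recordings the frequencies rarely line up with the bins this neatly, so the peaks come out smeared across neighboring bins.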

Using the timbre of an audio file to play your own music is simple: just put the recording into a sampler. It’ll replay the exact audio, and use pitch-shifting algorithms to stretch the audio and change the frequency so you can play different notes.
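The simplest version of that idea is to read through the recording faster or slower, the way a tape machine would. The sketch below does this with linear interpolation. It is an assumption about the most basic approach, not how any particular sampler works; real samplers typically use better interpolation and may correct the change in length. `repitch` is a made-up name.

```python
def repitch(samples, semitones):
    """Resample so the audio plays `semitones` higher (or lower if negative).

    Each semitone multiplies frequency by 2 ** (1/12). Raising the pitch
    this way also shortens the sound, and lowering it lengthens it.
    """
    ratio = 2 ** (semitones / 12)          # playback speed factor
    out_len = int((len(samples) - 1) / ratio) + 1
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * ratio                    # fractional read position
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, last)]
        # Linear interpolation between the two nearest original samples.
        out.append(samples[j] * (1 - frac) + nxt * frac)
    return out

recording = [0, 1, 2, 3, 4, 5, 6, 7, 8]    # stand-in for real audio samples
octave_up = repitch(recording, 12)         # reads every second sample
octave_down = repitch(recording, -12)      # reads in half-sample steps
```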

Recreating a timbre from scratch is a little bit trickier. There are synthesizer programs, like Image-Line’s Harmor, that can analyze the frequencies in an audio sample and then recreate it from scratch by playing all of the same sine wave frequencies. However, the algorithm isn’t perfect, and the recreated timbre won’t always be perfectly accurate.
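A toy version of analyze-then-rebuild, to show where the imperfection can come from. This is not Harmor’s algorithm, just a sketch under simple assumptions: one cycle of a sound made of five harmonics is analyzed with a naive DFT, then rebuilt from only the strongest few sine waves. All names and values here are invented for the example.

```python
import cmath
import math

def analyze(samples):
    """Return (bin, amplitude, phase) for each bin between 0 Hz and Nyquist."""
    n = len(samples)
    partials = []
    for k in range(1, n // 2):
        c = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(samples))
        partials.append((k, 2 * abs(c) / n, cmath.phase(c)))
    return partials

def resynthesize(partials, n, keep):
    """Rebuild n samples from only the `keep` strongest partials."""
    strongest = sorted(partials, key=lambda p: p[1], reverse=True)[:keep]
    return [sum(a * math.cos(2 * math.pi * k * t / n + ph)
                for k, a, ph in strongest)
            for t in range(n)]

def rms_error(a, b):
    """Root-mean-square difference between two equal-length signals."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

N = 64  # one cycle of the test sound
original = [sum(math.sin(2 * math.pi * k * t / N) / k for k in range(1, 6))
            for t in range(N)]

partials = analyze(original)
full = resynthesize(partials, N, keep=5)    # all five harmonics
rough = resynthesize(partials, N, keep=2)   # only the two strongest
```

Comparing `rms_error(original, full)` with `rms_error(original, rough)` shows the trade-off: with all five harmonics the rebuilt wave matches the original almost exactly, and with two it is noticeably off. Real sounds have far more components, and they change over time, which is part of why a resynthesized timbre is not always accurate.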