How do AI song covers work?

I’m not very knowledgeable about AI or music, but I’m really astounded by how well AI song covers can replicate a person’s or character’s voice, down to its really distinctive qualities.

For example, this AI song cover of Mr Krabs singing My Way (https://youtu.be/AklZTEMTzHE) really nails the rough, gravelly quality of Mr Krabs’ voice, which Frank Sinatra doesn’t have at all. Other AI covers I’ve heard also replicate the accent a character talks with, even when the original singer of the song has a completely different accent.

My guess is that when the AI is trained on a certain character’s voice, it identifies specific patterns in that voice that can be translated into a waveform, and somehow combines them with the waveform of the original singer’s performance? I’ve learned that it’s possible to mathematically combine multiple audio waveforms into one, and also to run that process in reverse to break a song’s waveform down into its different components. So I would guess that the AI can isolate the singer’s voice from the sounds of the instruments, generate a waveform of the character singing the song, and then combine that with the instruments to create the finished song?
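To make the "combine and break down waveforms" idea concrete, here is a minimal Python sketch using NumPy. It is only a toy under stated assumptions: the two "sources" are plain sine tones (the names `voice` and `piano`, the 8000 Hz sample rate, and the helper functions are all made up for illustration), mixing is done by simple addition, and the breakdown uses a Fourier transform to find which frequencies are present. Real vocal-separation tools generally rely on trained models rather than anything this simple.

```python
import numpy as np

SAMPLE_RATE = 8000  # samples per second (assumed value for this toy example)
DURATION = 1.0      # length of each tone in seconds


def tone(freq_hz, amplitude=1.0):
    """Return a pure sine tone as an array of samples."""
    t = np.arange(int(SAMPLE_RATE * DURATION)) / SAMPLE_RATE
    return amplitude * np.sin(2 * np.pi * freq_hz * t)


def mix(*waves):
    """Combine waveforms by adding them sample by sample."""
    return np.sum(waves, axis=0)


def strongest_frequencies(wave, count):
    """Use the Fourier transform to find the `count` loudest frequencies."""
    spectrum = np.abs(np.fft.rfft(wave))
    freqs = np.fft.rfftfreq(len(wave), d=1.0 / SAMPLE_RATE)
    top = np.argsort(spectrum)[-count:]
    return sorted(float(f) for f in freqs[top])


# Hypothetical stand-ins for a voice and an instrument.
voice = tone(220.0)
piano = tone(440.0, amplitude=0.5)
song = mix(voice, piano)

# Recovers the two frequencies that went into the mix (about 220 and 440 Hz).
print([round(f) for f in strongest_frequencies(song, 2)])
```

The sketch only shows that a mixed waveform still contains recoverable information about its parts. Separating a real voice from real instruments is much harder, because their frequencies overlap and change over time.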

And I guess the AI would somehow find the patterns in the waveform of a character’s voice that make it sound gravelly, or that determine how the character pronounces certain words in a particular accent, extrapolate those patterns to words the character has never said before, and then tune the voice to the specific pitch that the original singer sang in the song?

As an aside: I’m also curious how AI music that generates a song from a text prompt works. I’ve learned that AI art generated from a text prompt works by assigning certain mathematical values to words in its data set, and then repeatedly refining an image that starts as just noise until it produces a result that it thinks matches the given text prompt. So I would assume that AI music works in a similar way, assigning relationships between words and audio waveform patterns?
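The "start from noise and refine repeatedly" idea can be illustrated with a toy loop in Python. This is not a real diffusion model: the assumption here is that the desired waveform (`target`) is already known, and the line computing `predicted_noise` is a stand-in for what a trained network would have to estimate from the text prompt. The function name, step count, and step size are all hypothetical.

```python
import numpy as np


def refine_from_noise(target, steps=50, step_size=0.2, seed=0):
    """Start from random noise and nudge it toward `target` a bit per step."""
    rng = np.random.default_rng(seed)
    sample = rng.normal(size=target.shape)  # pure noise to begin with
    for _ in range(steps):
        # Stand-in for a trained network: here we cheat and compute the
        # remaining "noise" directly from the known target.
        predicted_noise = sample - target
        sample = sample - step_size * predicted_noise
    return sample


# Hypothetical target: one second of a 5 Hz sine wave.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
target = np.sin(2 * np.pi * 5 * t)

result = refine_from_noise(target)
print("largest remaining error:", float(np.max(np.abs(result - target))))
```

Each pass removes a fraction of the remaining difference, so the noise gradually turns into the target shape. In a real generator the hard part is the piece this sketch fakes: learning to predict, from the prompt alone, which part of the current sample is noise.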

**PLEASE READ THIS ENTIRE MESSAGE**

Your submission has been removed. Questions about AI, how it works, when it works, why it doesn’t work, why it doesn’t exist yet, whether it’s going to take over and why various people like or dislike it are asked *very often*. **Please search before posting**, and also note that many of these questions **cannot be answered in an objective fashion**.

If you would like this removal reviewed, please read the [detailed rules](https://www.reddit.com/r/explainlikeimfive/wiki/detailed_rules) first. **If you believe this submission was removed erroneously**, please [use this form](https://old.reddit.com/message/compose?to=%2Fr%2Fexplainlikeimfive&subject=Please%20review%20my%20thread?&message=Link:%20/r/explainlikeimfive/comments/1cy2o8d/eli5_how_do_ai_song_covers_work/%0A%0APlease%20answer%20the%20following%203%20questions:%0A%0A1.%20The%20concept%20I%20want%20explained:%0A%0A2.%20List%20the%20search%20terms%20you%20used%20to%20look%20for%20past%20posts%20on%20ELI5:%0A%0A3.%20How%20does%20your%20post%20differ%20from%20your%20recent%20search%20results%20on%20the%20sub:) and we will review your submission. Note that **if you do not fill out the form completely**, your message **will not be reviewed**.

*I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/explainlikeimfive) if you have any questions or concerns.*