It depends greatly on the microphone quality, background noise, use of proper nouns/non-English words, and above all, the pronunciation and speed of the person speaking. TikTok people tend to be using high-quality phones/professional mics AND are speaking directly into the mic, for an audience.
Imagine this fast, casual conversation in a loud bar:
* “Did you eat yet?”
* “No, but I could go for some Poke Bros or Hibachi.”
* “Let’s go eat!”
What this conversation ACTUALLY sounds like is:
* “Jeweet-yet?”
* “Nah but I could gopher some pokey prose or he bot chee”
* “Skweet!”
That’s why the computer can have trouble with auto-generation.