A Large Language Model is basically a really advanced auto-complete. If I show you a hundred fairy tales, all of which begin with “Once upon a time”, and I ask you to give me the first word of the next fairy tale, you’ll probably guess “Once” and be right. When I tell you the first word is “Once” and ask you to predict the next word, “upon” is a virtual certainty, and so on. You can do that even if you don’t know what any of those words mean. You just know it’s the right next word because you’ve seen it many times.
LLMs are like that, but vastly more complex. They’ve seen enough internet text to recognize many different kinds of writing—stories, questions and answers, essays, legal briefs—and so when you ask them to do their “auto-complete” task, they can do a very good job of producing text that looks like the text they’ve been trained on.
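If you want to see the counting trick in action, here’s a deliberately tiny sketch in Python. It is not how a real LLM works inside (real models use neural networks and probabilities over pieces of words, learned from enormous amounts of text), but it captures the same basic job: look at what usually comes next, and predict it.

```python
from collections import Counter, defaultdict

# A tiny "training set": a few fairy-tale openings.
training_text = [
    "once upon a time there was a princess",
    "once upon a time there was a dragon",
    "once upon a time there lived a king",
]

# Count, for each word, which words come right after it.
next_word_counts = defaultdict(Counter)
for line in training_text:
    words = line.split()
    for current, following in zip(words, words[1:]):
        next_word_counts[current][following] += 1

def predict_next(word):
    """Return the word most often seen after `word`, or None if we've never seen it."""
    followers = next_word_counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# "Auto-complete" a sentence one word at a time, starting from "once".
word = "once"
sentence = [word]
for _ in range(6):
    word = predict_next(word)
    if word is None:
        break
    sentence.append(word)

print(" ".join(sentence))  # prints: once upon a time there was a
```

Run it and it happily writes “once upon a time there was a” without having any idea what a princess or a dragon is. It only knows which words tend to follow which.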
But LLMs don’t really *understand* what they’re saying. They’re just mimicking patterns they’ve been trained on. It’s just that they’re *so good at it* that the output looks like (and usually *is*) knowledgeable, relevant information. Even when you ask one a basic math problem, it isn’t actually doing the math; it treats the problem as just more text to auto-complete, and ends up at something that may or may not be the right answer. And it’s often wrong.
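As an exaggerated illustration of that point, the same counting trick can “answer” math problems it has already seen while having no way to compute anything new. This toy keys on the exact question, which real models do not (they generalize across patterns far better than this), but the answer still comes from pattern completion rather than from running a calculation.

```python
from collections import Counter, defaultdict

# "Training" text containing a few worked-out sums the model has seen before.
seen_math = [
    "2 + 2 = 4",
    "2 + 3 = 5",
    "3 + 3 = 6",
]

# Same counting trick: which "answer" followed which question in the training text?
answers_seen = defaultdict(Counter)
for line in seen_math:
    question, answer = line.split(" = ")
    answers_seen[question][answer] += 1

def autocomplete_answer(question):
    """Recall the most common answer seen after this exact question, if any."""
    seen = answers_seen.get(question)
    if not seen:
        return "???"  # never seen it; a real LLM would still guess something plausible-looking
    return seen.most_common(1)[0][0]

print("2 + 2 =", autocomplete_answer("2 + 2"))      # "4"  -- looks like it did math
print("17 + 25 =", autocomplete_answer("17 + 25"))  # "???" -- nothing was ever calculated
```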
This also means it can’t really handle prompts unlike anything it has seen on the internet before, like “count how many words are in this sentence” or “write a paragraph about economics that doesn’t use the letter C”. There’s a lot of knowledge that comes out in its auto-completions, but there isn’t any *intelligence* behind them, setting a goal and deciding what it ought to say.