Suppose you need to interact briefly with people who speak a language you don’t understand. Let’s say you’re “undercover,” and you don’t want them to suspect that you aren’t one of them. You want to appear to belong.
One way to do that would be to learn the language.
We have a mental model of the world, and when someone tells us something, we add (maybe temporarily, maybe permanently) some new pieces to that model. We know a lot about how the model behaves — what is possible and what isn’t — so we can reason based on it. Learning a language means learning how to translate sentences in that language into our own mental model. When we respond to what someone says, we translate some information about our (updated) model into words.
No one (yet) knows how to do that with a computer. Computers aren’t even close to being able to model the world as humans experience it. It’s not just that we don’t know how to “translate” language; there is nothing to translate it into that could represent the world the way our minds do.
So if learning the language isn’t an option, what else could you do? You’d observe common interactions. Someone says, “Hellozon der,” and the response is almost always, “Hidoyuh too!” When a person says, “Houzay bee-in hangeen?” the answer is usually “Goodz gold, danken fer asken!” You might be able to memorize enough common phrases and responses to fake your way through. You might even start to get a little sense of context — maybe, after one minute or more of conversation, if someone says, “Hellozon der,” the response is, “Gooden hellozen, morrow zeeya,” and it’s time to walk away.
While computer programmers don’t know how to make a computer “understand” anything the way humans do, rapid, systematic processing of massive amounts of data — even millions of times what any one human being could manage — is what computers do very well. What current AI does is like observing common phrases and responses — but far more of them than any human could, using existing data like Reddit posts and responses — and tabulating the connections to create a “large language model.” Then it searches for patterns in the input and computes the most likely output.
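To make “tabulating the connections” concrete, here is a toy sketch in Python. It is not how real large language models work under the hood (they use neural networks, not a literal lookup table), but it captures the flavor: count which word tends to follow which, then guess the most likely next word. The tiny corpus and all the names in it are invented for illustration.

```python
from collections import Counter, defaultdict

# A toy "corpus" of observed exchanges, standing in for the
# millions of real conversations a large language model learns from.
corpus = (
    "hello there . hi to you too . "
    "how have you been ? good as gold , thanks for asking ! "
    "hello there . hi to you too ."
).split()

# Tabulate the connections: for each word, count what tends to follow it.
follow_counts = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follow_counts[current_word][next_word] += 1

def most_likely_next(word):
    """Guess the next word: whichever most often followed `word` in the corpus."""
    counts = follow_counts.get(word)
    return counts.most_common(1)[0][0] if counts else None

# Generate a "response" by repeatedly guessing the most likely next word.
word, output = "hello", ["hello"]
for _ in range(5):
    word = most_likely_next(word)
    if word is None:
        break
    output.append(word)

print(" ".join(output))  # prints: hello there . hi to you
```

The program never knows what “hello” means; it only knows that “there” usually came next. Scale the table up enormously, replace the literal counts with learned statistical patterns, and you have the rough idea behind the “really good guess” described below.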
No one can give a satisfying answer to “how it works” because it doesn’t work the way it appears to work. Like the undercover agent, it appears to understand the language and give meaningful responses, but it doesn’t. It just uses a staggering amount of data — and some very sophisticated statistical analysis — to make a really good guess about what output you would expect. A literal accounting of “how it works” on any given input would run to millions of lines, but it still wouldn’t tell you anything you cared to know, because it’s just the process of making a “guess” based on tabulated statistics. At no point does it “understand” anything, or “draw conclusions” in the sense that a thinking human being does.