A large language model is a model that has been trained on a very, very large dataset. Think of a big chunk of the internet. From that, the model learns what text should look like, simply because it has seen so much of it.
Models like ChatGPT can then predict the most probable next word in a sentence. So when you type your question to ChatGPT, the model doesn't understand the question. It doesn't even understand that it's a question. All it does is predict the most probable word that should follow the question, then the next one, and the one after, and so on, until it has produced the answer.
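If you want to see this loop in action, here's a rough sketch using the small, open-source GPT-2 model through the Hugging Face transformers library (not ChatGPT itself, whose internals aren't public). The model name, prompt, and the 20-word limit are just illustrative choices; the point is that each step only asks "what word is most likely to come next?"

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a small publicly available language model (illustrative choice).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Why is the sky blue?"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):                            # generate 20 tokens, one at a time
        logits = model(input_ids).logits           # a score for every word in the vocabulary
        next_id = logits[0, -1].argmax()           # pick the single most probable next word
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```

Real chatbots don't always pick the single top word (they sample from the probabilities to sound less repetitive), but the basic idea is the same.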
Image-generating models work the same way, but instead of predicting words, they predict pixels.
In both cases, the models are "dumb" in the sense that they don't analyze the question or prompt. They just base their answers on probabilities computed over an enormous amount of data.