Why do models like ChatGPT forget things during conversations or make things up that are not true?

Anonymous

Since most comments here state that ChatGPT is stupid and doesn't know anything: there is an interesting phenomenon in nature that is pretty close to how ChatGPT works, namely swarm intelligence (in ChatGPT's case it's a large stack of transformer layers working together). This has been shown time and time again with ants and many other naturally occurring systems. Even cells (yes, your cells too) are individually really simple and stupid, but by combining many stupid things you get something not so stupid, something some would even consider smart.

It is true that ChatGPT "only" predicts the next word, and that it uses numbers (tokens) to represent those words, but I would not call that simple or stupid. The reason is that to predict the next token well, the model has to "understand", in some sense, the relationships between those tokens. ChatGPT has no model of the world inside it, so it doesn't know what a word actually means or what the object it names is, but it still has to capture how each word relates to other words; otherwise it couldn't produce coherent sentences. That doesn't mean it understands the words, but it must, at least to some degree, capture the relationships between them.

Now here comes the interesting part: LLMs show "emergent abilities" that were never explicitly trained into the model. (Google's claim that Bard picked up a language it supposedly had no reference to in its training data would be one example.) The same thing happens in swarm intelligence: a single ant is extremely stupid, but a swarm of them can do amazing things.
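
If you want to see the "predict the next token" part concretely, here is a minimal toy sketch in Python. To be clear, this is nothing like ChatGPT's actual internals (which use a transformer with billions of learned parameters); the corpus and function name here are made up purely for illustration. It just counts which token follows which and picks the most frequent follower, which already shows why prediction forces a model to learn relationships between tokens:

```python
# A minimal sketch, NOT ChatGPT's actual method: a toy "next token" predictor
# that only counts which token tends to follow which. Real LLMs do the same
# job with a transformer and billions of parameters, but the task has the
# same shape: turn words into tokens, learn relationships, predict the next one.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the mat .".split()

# Learn pairwise relationships: for each token, which tokens follow it and how often?
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(token: str):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    if token not in follows:
        return None  # nothing learned about this token -> no sensible prediction
    return follows[token].most_common(1)[0][0]

print(predict_next("sat"))    # -> 'on'   (seen twice after 'sat')
print(predict_next("on"))     # -> 'the'  (seen twice after 'on')
print(predict_next("xyzzy"))  # -> None   (a real LLM would still guess something,
                              #    which is one reason it can "make things up")
```

A real LLM replaces the counting with a neural network that outputs a probability for every token in its vocabulary and then samples from that distribution, but the point stands: you can't predict the next token well without having learned how tokens relate to each other.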
So, coming full circle: yes, ChatGPT has no concept of our world whatsoever. That said, it has an internal "world view" (I'm calling it a world view for simplicity; it's really an understanding of the relationships between tokens). This "world view" sometimes lets it solve things that are not in its training data, purely through the relationships between its tokens. Does that make ChatGPT, or LLMs in general, smart? I wouldn't say so, but I also wouldn't call them stupid.
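
To make the "relationships between tokens stored as numbers" idea less abstract, here is a tiny hand-made embedding sketch. The words, dimensions, and numbers are all invented for illustration; a real model learns vectors with hundreds or thousands of dimensions from data rather than having them written by hand. Related words end up pointing in similar directions, and some relationships become simple vector arithmetic:

```python
# A toy sketch of embeddings: words become vectors of numbers, and relationships
# between words show up as geometry between vectors. All values here are made up.
import math

embeddings = {
    # made-up dimensions: [royalty, gender, animal]
    "king":  [0.9,  0.3, 0.0],
    "queen": [0.9, -0.3, 0.0],
    "man":   [0.1,  0.3, 0.0],
    "woman": [0.1, -0.3, 0.0],
    "cat":   [0.0,  0.0, 0.9],
}

def cosine(a, b):
    """Cosine similarity: close to 1.0 means 'points the same way', 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

print(cosine(embeddings["king"], embeddings["queen"]))  # 0.8 -> closely related
print(cosine(embeddings["king"], embeddings["cat"]))    # 0.0 -> unrelated

# The classic "king - man + woman ~ queen" relationship is just vector arithmetic:
shifted = [k - m + w for k, m, w in zip(embeddings["king"], embeddings["man"], embeddings["woman"])]
print(cosine(shifted, embeddings["queen"]))  # 1.0 with these toy numbers
```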

(An article with links to the papers on emergent abilities: https://virtualizationreview.com/articles/2023/04/21/llm-emergence.aspx?m=1)
