Why does ChatGPT lie? And why can’t this be fixed easily?

I’ve tried asking it to write arguments and support them but the references are fake. It apologizes when confronted but does it over and over again even when I ask it not to provide fake references.

45 Answers

Anonymous

Great question! Think of ChatGPT like this: it’s a big collection of learned associations between words. So when I’ve browsed Reddit for far longer than intended (as usual, dangit), I might notice that “abusive” and “relationship” (as well as gyms and lawyers, apparently) are often paired and together carry a negative connotation, whereas “relationship” can also be a positive thing in combination with other words. “Abusive”, however, is almost never positive in any combination. Imagine a massive collection of combinations and connotations like that, ready to understand sentences through that lens. That’s ChatGPT. Except, besides understanding language, it can also generate it.
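If it helps to see the "collection of associations" idea concretely, here is a minimal sketch in Python. It only counts which words show up in the same sentence, which is a huge simplification of what ChatGPT actually learns. The three-sentence corpus and the function names `count_pairs` and `partners` are made up for this illustration.

```python
from collections import Counter
from itertools import combinations

# Tiny made-up corpus; a real model learns from vastly more text.
corpus = [
    "she left an abusive relationship",
    "an abusive relationship is harmful",
    "a healthy relationship takes work",
]

def count_pairs(sentences):
    """Count how often two words appear in the same sentence."""
    pairs = Counter()
    for sentence in sentences:
        # Sort the unique words so each pair is always stored in the same order.
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            pairs[(a, b)] += 1
    return pairs

def partners(word, counts):
    """Return the words seen alongside `word`, most frequent first."""
    seen = Counter()
    for (a, b), n in counts.items():
        if a == word:
            seen[b] += n
        elif b == word:
            seen[a] += n
    return seen.most_common()

pair_counts = count_pairs(corpus)
print(partners("abusive", pair_counts))
print(partners("healthy", pair_counts))
```

Even in this toy, “relationship” ends up linked to both “abusive” and “healthy”, while “abusive” never appears next to anything pleasant. Scale that up by many orders of magnitude and you get the kind of association table I'm describing.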

So how does one generate words as a model? Well, basically, you do the same trick as before when trying to understand language, except now you’ve been taught combinations of words in RESPONSE to sentences. You don’t just try to comprehend them: you’re trying to say meaningful things as a reaction to what users say or ask. The folks over at OpenAI trained ChatGPT to do just that, feeding it prompts and giving it feedback on the quality of its responses. Makes sense, right? If you want to learn German, someone has to tell you “Nein. Das sagen wir so nicht.” (“No. We don’t say it like that.”) for you to ponder blankly until you finally get the highly coveted “Gut. Jetzt lass mich in Ruhe.” (“Good. Now leave me alone.”) every now and then; same goes for the model.
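The German-teacher loop can be sketched in a few lines. To be clear, this is a toy I made up to show the shape of "try something, get feedback, adjust"; it is not how OpenAI actually trains ChatGPT. The candidate replies, the `teacher` function and `train_with_feedback` are all hypothetical.

```python
def train_with_feedback(candidates, rate, rounds=10, step=1.0):
    """Repeatedly try the best-scoring reply and nudge its score by the feedback."""
    scores = {c: 0.0 for c in candidates}
    for _ in range(rounds):
        # max() returns the earliest candidate when scores are tied.
        attempt = max(candidates, key=lambda c: scores[c])
        scores[attempt] += step * rate(attempt)
    return scores

# Hypothetical learner attempts at saying "I'm doing well" in German.
candidates = ["Ich bin gut", "Mir geht es gut", "Ich habe gut"]

def teacher(reply):
    """Stand-in for human feedback: +1 for the accepted phrase, -1 otherwise."""
    return 1.0 if reply == "Mir geht es gut" else -1.0

scores = train_with_feedback(candidates, teacher)
best = max(candidates, key=lambda c: scores[c])
print(best, scores)
```

Notice that the learner never finds out WHY one phrase is right. It only ends up with a higher score for whatever the teacher approved of, and “Ich habe gut” is never even tried, so it gets no feedback at all.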

What’s important to understand, then, is that the model doesn’t actually have any comprehension of what it’s generating. Its outputs are just learned combinations of words that work well given the examples it has been shown. It can generate some VERY accurate things all on its own from the patterns it has learned, but the further a prompt moves away from what it has been taught, the less ChatGPT has to go on to avoid or correct mistakes. After all, it can’t learn much beyond what it was taught before being deployed to you. It isn’t so much lying as grasping at straws: it strings together words that should theoretically belong together and gives you a very compelling story that’s, well, nonsense, because it simply hasn’t learned a correct answer to your prompt. A fake reference is a typical result: an author, a year and a title that each look plausible but were never published together.
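Here is a rough sketch of why the references come out fake. It is a simple "which word came next" chain, far cruder than ChatGPT, but it fails in a similar way. The three training "references" are invented for this example (they are not real papers), and `learn_next_words` and `generate` are names I picked.

```python
import random
from collections import defaultdict

# Made-up "references" for illustration only; none of these are real papers.
training = [
    "Smith 2019 Deep Learning for Cats",
    "Jones 2021 Deep Learning for Dogs",
    "Smith 2020 Statistical Methods for Dogs",
]

def learn_next_words(sentences):
    """Record, for each word, which words were seen directly after it."""
    followers = defaultdict(list)
    for sentence in sentences:
        words = ["<start>"] + sentence.split() + ["<end>"]
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers

def generate(followers, rng, max_words=20):
    """Chain learned word pairs together until <end> (or a length cap)."""
    word, out = "<start>", []
    for _ in range(max_words):
        word = rng.choice(followers[word])
        if word == "<end>":
            break
        out.append(word)
    return " ".join(out)

followers = learn_next_words(training)
rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [generate(followers, rng) for _ in range(200)]

# Outputs that look like references but were never in the training data.
invented = sorted(set(samples) - set(training))
print(invented)
```

Every adjacent pair of words in every output was genuinely seen during training, so each invented "reference" looks perfectly reasonable. Nothing in the chain checks whether the whole thing exists, and asking it nicely to stop making things up doesn't add that check.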

I hope that helps it make a little more sense to you!
