When OpenAI makes an improved model, do they just correct its “wrong answers” till it spits out more correct-ness?


Anonymous

I wasn’t gonna comment, but the other reply is, ironically, obviously copy-pasted from ChatGPT and doesn’t properly answer the question.

So, “correcting wrong answers” is a fundamentally flawed way to think about deep learning AI models. You can’t just go in and flip some switches to make it give the “right answer” to an input, and even if you could, that would only fix that single input, which is useless in practice, since there are effectively infinite possible inputs.

Furthermore, the engineers don’t actually know what any of the individual “switches” in the AI model *do*. (Sort of: you can perform certain kinds of analysis to get some understanding of what the model is doing, but it’s nowhere near the point where you could dive into the model and manually change some numbers to make it do what you want.)

Essentially, machine learning models are trained to recognize *patterns*. A model is a big bundle of math that can be fed lots and lots of examples of inputs and “correct” outputs (as well as incorrect outputs in many cases, so it can learn what *not* to do).

The math is designed so that, in theory, the model will learn the underlying *patterns* that connect inputs with outputs, rather than simply memorizing the answers to the training data. That way, when you give it new inputs it has never seen before, it can use those *patterns* it has learned to give “correct” outputs. If it only memorizes the training data, then it won’t be able to give meaningful answers on new data.
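To make that concrete, here’s a minimal sketch in Python. The “pattern” here is a made-up toy rule (y = 2x + 1) standing in for the vastly more complicated patterns real models learn, but the mechanism is the same: training just nudges a few adjustable numbers until the model’s outputs match the examples. Nothing ever stores the answers themselves:

```python
import numpy as np

# Made-up toy "pattern" hidden in the training data: y = 2x + 1.
rng = np.random.default_rng(0)
x_train = rng.uniform(-5, 5, size=200)
y_train = 2 * x_train + 1 + rng.normal(0, 0.1, size=200)  # examples of "correct" outputs

# A tiny model with exactly two adjustable numbers ("switches"): y_hat = w * x + b.
w, b = 0.0, 0.0

# Training: repeatedly nudge w and b in whatever direction shrinks the average error.
learning_rate = 0.01
for _ in range(1000):
    error = (w * x_train + b) - y_train
    w -= learning_rate * 2 * np.mean(error * x_train)  # gradient of mean squared error wrt w
    b -= learning_rate * 2 * np.mean(error)            # gradient of mean squared error wrt b

print(f"learned w={w:.2f}, b={b:.2f}")          # close to the true 2 and 1
print(f"prediction at x=100: {w*100 + b:.1f}")  # ~201, for an input it never saw in training
```

The model ends up predicting well at x = 100 even though no training example went anywhere near there, because it learned the *pattern*, not the answers. A real model like GPT has billions of these adjustable numbers instead of two, which is exactly why nobody can hand-edit them to “fix” a wrong answer.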

So basically, improving an AI model is not about making it “know more right answers” per se. It’s about improving the *patterns* it learns.

Broadly speaking, ways of doing that include:

1. More data. Basically as simple as it sounds, and it improves performance the vast majority of the time.
2. *Better* data. For example, when training a language model, you might try to filter out low-quality writing from the training data, because you don’t want the model to learn from that. You might also filter out offensive content so the model doesn’t learn that either. (There’s a sketch of this idea right after this list.)
3. A bigger model. It turns out that just having more parameters (those adjustable “switches” from earlier) can do a lot for performance.
4. Improved model architecture. This requires the most innovation, and is pretty hard to explain at an ELI5 level.
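As promised, here’s a toy illustration of point 2. The heuristics and thresholds below are invented for illustration; real data pipelines use trained quality classifiers, deduplication, and much more, but the shape of the idea is the same: decide, document by document, what gets into the training set.

```python
# Hypothetical, oversimplified quality filter for training text.
def looks_low_quality(text: str) -> bool:
    words = text.split()
    if len(words) < 5:                                    # too short to teach anything
        return True
    if sum(c.isupper() for c in text) > 0.5 * len(text):  # mostly shouting
        return True
    return len(set(words)) / len(words) < 0.3             # highly repetitive

raw_corpus = [
    "The mitochondria is the powerhouse of the cell.",
    "BUY NOW!!! BUY NOW!!! BUY NOW!!! BUY NOW!!! BUY NOW!!!",
    "ok lol",
]
training_corpus = [doc for doc in raw_corpus if not looks_low_quality(doc)]
print(training_corpus)  # only the first document survives the filter
```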

When OpenAI releases a new model, they’re most likely doing a combination of the above.
