How have we gone from zero ChatGPT-style programs to so many in such a short time?

Like the title says: how have we gone from ChatGPT being the apex of its type (and therefore presumably very complex and rare) to seeing so many clones in so short a time? Wouldn’t the code or system that makes up ChatGPT be under copyright, or at least be difficult to mimic? And wouldn’t the data scraping take forever?

14 Answers

Anonymous

ChatGPT and other AI models are the results of programs. People have been researching how to best write programs that make AI models for a long time, with some nice breakthroughs in the last decade.

Yet, there is nothing magical about ChatGPT other than that we finally got over a threshold. Using the same AI training on a computer that’s 5 years older would produce an unusable model that barely manages to string 5 words together. Not because the older computer does something different, but because it is slower.

Note that at the scale of computing power an AI needs, even small differences in hardware can make a huge difference in the outcome. When training a model already takes several months, you can’t easily double the compute by simply letting it train twice as long.

The same goes double for the amount of training data that can be used. Getting a 50 TB SSD nowadays is expensive. Getting one 20 years ago was impossible. Even collecting all the data got easier.

And it goes triple for the size of the model. You can’t train a model as big as ChatGPT on a GPU with 8 GB of memory. So the available training hardware limited the size of the models, and bigger models unsurprisingly work better (brain size matters; try asking a mouse about it!).
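To put a rough number on that, here is a back-of-the-envelope sketch. It assumes a model with 175 billion parameters (the published size of GPT-3; ChatGPT's exact size is an assumption here) stored as 16-bit numbers, and it only counts the memory needed to hold the weights. Training needs a good deal more on top of that, so treat this as a lower bound, not a real hardware plan.

```python
import math


def weights_memory_gb(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Memory needed just to hold the model weights, in GB (10^9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Assumption: 175 billion parameters (GPT-3's published size), 2 bytes each.
ASSUMED_PARAMETERS = 175e9
GPU_MEMORY_GB = 8  # the 8 GB GPU from the answer above

needed_gb = weights_memory_gb(ASSUMED_PARAMETERS)
gpus_needed = math.ceil(needed_gb / GPU_MEMORY_GB)

print(f"Weights alone: {needed_gb:.0f} GB")
print(f"8 GB GPUs needed just to store them: {gpus_needed}")
```

Under those assumptions the weights alone are dozens of times larger than a single 8 GB card, before any of the extra memory that training itself requires.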

So, in short: We got to the point where we have the hardware to train a big enough model with enough data for long enough to get a good result.

Additionally, models don’t scale linearly but appear to have thresholds where they suddenly get better.
