How have we gone from zero ChatGPT-style programs to there being so many in such a short time?


Like the title says, how have we gone from ChatGPT being the apex of its type (and therefore presumably very complex and rare) to seeing so many clones in so short a time? Wouldn't the code/system that makes up ChatGPT be under copyright, or the code be difficult to mimic? Wouldn't the amount of data scraping take forever?


14 Answers

Anonymous 0 Comments

Because the technology wasn't publicly available until November 2022. Most comparable AIs before that were closely held company secrets, with no API access for outside users. API access lets developers write programs and apps that use the technology.
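
For a sense of scale: once API access exists, a working chat app is little more than this (a minimal sketch using OpenAI's Python SDK; the model name is illustrative):

```python
# pip install openai
# A minimal sketch of building on the API; model name is just an example.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative; use whatever model the API offers
    messages=[{"role": "user", "content": "Explain APIs like I'm five."}],
)
print(response.choices[0].message.content)
```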

Anonymous 0 Comments

Everything behind ChatGPT (the maths, the models, the training data) was created in labs that publish their results. There are both scientific and commercial motivations behind that.

And as such, published science gets analyzed, replicated, and improved.

Anonymous 0 Comments

My thought: ChatGPT is an interface that sits on top of a machine-learning model and an underlying data set. There are lots of new interfaces out there, but there aren't that many (legit) models-plus-datasets. So everyone's drinking from the same fountain, essentially.

Anonymous 0 Comments

> Wouldn't the code/system that makes up ChatGPT be under copyright, or the code be difficult to mimic

The code is surprisingly simple. OpenAI is a non-profit research group (now with a for-profit arm), and they've [published](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf) all the initial research that went into GPT. Nowadays you can create an LLM in just a [couple hundred lines of code](https://github.com/karpathy/nanoGPT/blob/master/train.py).
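
For a feel of how small the core really is, here is the heart of a GPT-style model, a single causal self-attention head, sketched in PyTorch in the spirit of nanoGPT (not its actual code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """One attention head: the core building block of a GPT-style model."""

    def __init__(self, n_embd, head_size, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        # Causal mask: each position may only attend to earlier positions.
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k, q, v = self.key(x), self.query(x), self.value(x)
        att = q @ k.transpose(-2, -1) / (k.shape[-1] ** 0.5)  # scaled dot product
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        return att @ v  # weighted sum of values
```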

What's prohibitive about creating them is the training data and compute resources. GPT-4 reportedly cost more than $100 million to train. Getting models that high in quality is really only doable with Microsoft or Google money, but everything's accessible enough that *anyone* can create something that works *reasonably* well for specific uses.
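
A back-of-the-envelope calculation shows why. A common rule of thumb is that training takes about 6 FLOPs per parameter per token; every number below is illustrative (roughly GPT-3 scale), not OpenAI's actual figures:

```python
# Rough training-compute estimate; all numbers are illustrative.
params = 175e9       # parameter count, roughly GPT-3 scale
tokens = 300e9       # training tokens
train_flops = 6 * params * tokens   # common ~6 * N * D approximation
gpu_flops = 300e12   # assumed sustained FLOP/s of one high-end datacenter GPU
gpu_years = train_flops / gpu_flops / (86400 * 365)
print(f"{train_flops:.1e} FLOPs is about {gpu_years:.0f} years on a single GPU")
# => you need thousands of GPUs running for months, i.e. Microsoft/Google money
```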

Anonymous 0 Comments

OpenAI lets anybody build on top of ChatGPT.

Imagine a blank t-shirt store opened their doors and said “anybody who wants to buy our blank t-shirts can draw whatever they want on them and do what they wish with them”.

The underlying t-shirt doesn’t change, but whatever goes on top of it is up to any developer.
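
In code, many of those "shirts" are little more than a custom system prompt wrapped around the same API. A hypothetical sketch (the persona and model name are made up for illustration):

```python
from openai import OpenAI

client = OpenAI()

def pirate_chat(user_message: str) -> str:
    """A whole 'new AI app' that is really just ChatGPT plus a system prompt."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # example model name
        messages=[
            {"role": "system", "content": "You are PirateBot. Answer everything as a pirate."},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content
```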

Anonymous 0 Comments

ChatGPT and other AI models are the output of training programs. People have been researching how best to write programs that produce AI models for a long time, with some major breakthroughs in the last decade.

Yet, there is nothing magical about ChatGPT other than that we finally got over a threshold. Using the same AI training on a computer that’s 5 years older would produce an unusable model that barely manages to string 5 words together. Not because the older computer does something different, but because it is slower.

Note that at the scale of computing power an AI needs, even small differences in hardware can make a huge difference in the outcome. When training a model already takes several months, doubling the total compute by simply training for twice as long isn't practical.

The same goes double for the amount of training data that can be used. Getting a 50 TB SSD nowadays is expensive. Getting it 20 years ago was impossible. Even collecting all the data got easier.

And it goes triple for the size of the model. You can't train a model as big as ChatGPT on a GPU with 8 GB of RAM. So the available training hardware capped the size of the models, and bigger models unsurprisingly work better (brain size matters; try asking a mouse about it!).
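
The arithmetic makes the point; the parameter count and precision below are illustrative:

```python
# Why a ChatGPT-sized model can't even fit on an 8 GB GPU (illustrative numbers).
params = 175e9           # roughly GPT-3-scale parameter count
bytes_per_param = 2      # fp16/bf16 weights
weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB vs. 8 GB of GPU RAM")  # ~350 GB
# Training needs several times more again: gradients, optimizer state, activations.
```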

So, in short: We got to the point where we have the hardware to train a big enough model with enough data for long enough to get a good result.

Additionally, models don’t scale linearly but appear to have thresholds where they suddenly get better.

Anonymous 0 Comments

Because the vast majority of them are just using ChatGPT under the hood. OpenAI sells API access, so you can make whatever app you want with it.

The few that use something else had been in development for years and rushed their releases out to compete with ChatGPT before it was too late.

Anonymous 0 Comments

It's relatively simple technology. The breakthrough Transformer architecture that GPT is built on was created at Google and published in 2017, in the paper "Attention Is All You Need"; OpenAI published the first GPT paper building on it in 2018. Facebook (Meta) released its open-source LLaMA models right after the ChatGPT surge. Everything else is training, data collection, and fine-tuning. It's expensive, but nothing complex is going on here.
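
The attention operation from that paper is so standard now that it ships as a single library call (assuming PyTorch 2.0 or later):

```python
import torch
import torch.nn.functional as F

# Scaled dot-product attention from "Attention Is All You Need",
# built into PyTorch 2.0+ as one function.
q = k = v = torch.randn(1, 8, 16, 64)  # (batch, heads, sequence, head_dim)
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```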

Anonymous 0 Comments

Don't mistake "not interesting to the general public" for "zero".

Just because the media is focused on AI right now does not mean the functionality came out of nowhere overnight; there were many breakthroughs during the "quiet" period that didn't grab the public's interest.