ELI5: How do LLM models only a few GB in size have information about almost everything?


I tried running Llama 2 on my local machine; the model is roughly 4 GB in size, and it runs offline.

It has so far answered every question I’ve asked, on diverse topics in computer science, literature, philosophy, and biology. How is so much information stored in such a small size?


4 Answers

Anonymous 0 Comments

It doesn’t have information, it just knows what sentences look like. It seems to give you answers because it makes good sentences. Alas, a lot of the time those sentences are false. I asked ChatGPT last week what the odds of 90 heads out of 100 flips are, and it was off by a mile, but the punctuation was perfect.
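For reference, the true answer is easy to work out with the binomial formula; here is a minimal Python sketch, assuming a fair coin and asking for exactly 90 heads:

```python
# Probability of exactly 90 heads in 100 fair coin flips,
# via the binomial formula: C(100, 90) * (1/2)^100.
from math import comb

n, k = 100, 90
p = comb(n, k) * 0.5 ** n
print(f"P(exactly {k} heads in {n} flips) = {p:.2e}")  # about 1.37e-17
```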

Anonymous 0 Comments

Valid sentences written in English are not very information dense, about 1 or 2 bits per letter (there’s a very good ELI5 on that here: https://what-if.xkcd.com/34/). So if you’ve got 4 * 2^30 bytes times 8 bits per byte, divided by about 1.5 bits per letter and 5 letters per word, you can fit several billion words (tens of thousands of books) of English into 4GB if the information is stored in the absolute most efficient way possible. That’s quite a lot, especially if you do a good job of picking which bits of writing are the most important.
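Here is that back-of-envelope estimate spelled out in Python, using the same rough assumptions (about 1.5 bits of information per letter, 5 letters per word, plus a hypothetical 100,000 words per book):

```python
# Back-of-envelope estimate of how much maximally compressed English text
# could fit in a 4 GiB model, under the rough assumptions above.
GIB = 2 ** 30
model_bits = 4 * GIB * 8        # 4 GiB expressed in bits
bits_per_word = 1.5 * 5         # ~1.5 bits/letter * ~5 letters/word
words = model_bits / bits_per_word
books = words / 100_000         # assuming roughly 100k words per book
print(f"~{words / 1e9:.1f} billion words, ~{books:,.0f} books")
```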

Anonymous 0 Comments

It’s a *language* model. It stores information about how words are typically arranged in sentences, and what sort of sentences would follow a given question/prompt.

If you ask it a question about biology, it doesn’t understand what your question means or what its response means. It just understands how people on the internet typically respond to a question like that.

And you actually *could* fit a boatload of knowledge in a few GB if you are just storing text, no photos or videos. But it doesn’t really keep a record of everything it was trained on. It just “remembers” the typical patterns and arrangements of words in the model. This is how it can come up with new responses or ideas that *potentially* nobody has ever said before, but this isn’t because it understands new ideas or is actually creative. It’s just spitting out words in a way similar to how humans write, which can be convincing and sometimes right, but is very often wrong.
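To make “it just remembers the typical patterns and arrangements of words” a bit more concrete, here is a deliberately tiny toy sketch in Python. It is a word-level bigram table, nothing like the neural network inside Llama 2, but it shows the same idea in miniature: store statistics about which word tends to follow which, not the text itself.

```python
# Toy "language model": count which word follows which in the training
# text, then generate new word sequences from those counts alone.
from collections import defaultdict, Counter
import random

training_text = (
    "the cat sat on the mat . the dog sat on the rug . "
    "the cat chased the dog ."
)

# Build a table: for each word, how often each possible next word follows it.
follows = defaultdict(Counter)
words = training_text.split()
for current, nxt in zip(words, words[1:]):
    follows[current][nxt] += 1

# Generate a "new" sentence by repeatedly sampling a likely next word.
word = "the"
sentence = [word]
for _ in range(8):
    nexts = follows[word]
    word = random.choices(list(nexts), weights=list(nexts.values()))[0]
    sentence.append(word)
print(" ".join(sentence))
```

Note that the table never stores the training sentences themselves, yet it can emit word sequences that never appeared verbatim in the training text.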

Anonymous 0 Comments

>I tried running Llama 2 on my local machine; the model is roughly 4 GB in size, and it runs offline.

>It has so far answered every question I’ve asked, on diverse topics in computer science, literature, philosophy, and biology. How is so much information stored in such a small size?

Now other answers have tried to explain the nature of LLMs, but I think the most crucial thing here is that in an age of 80GB Blu-ray rips and 50GB game patches it can be easy to lose track of **how much freaking data** 4GB actually is.
The entire text of the English Wikipedia is about 13GB, for comparison.
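For a rough sense of scale, here is the same point as arithmetic, assuming plain uncompressed text at 1 byte per character and a hypothetical 100,000 words per novel:

```python
# How much plain, uncompressed text fits in 4 GB at 1 byte per character.
model_bytes = 4 * 10 ** 9        # ~4 GB
words = model_bytes / 6          # ~5 letters plus a space per word
novels = words / 100_000         # assuming roughly 100k words per novel
print(f"~{words / 1e6:.0f} million words, roughly {novels:,.0f} novels")
```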