Today, Google announced the release of Gemini 1.5 Pro, its next-generation large language model (LLM).
Sundar Pichai posted: “Gemini 1.5 Pro, our mid-sized model, will soon come standard with a 128K-token context window, but starting today, developers + customers can sign up for the limited Private Preview to try out 1.5 Pro with a groundbreaking and experimental 1 million token context window!”
What does it mean to have a 1-million-token context window, and how does it compare with the previous Gemini 1.0 Pro and OpenAI's GPT-4?
The way LLMs work is that you give them a piece of text and have them predict the next word, over and over, so that the predictions form sentences and answer you. But an AI model doesn't really understand raw text, so the text is first encoded as tokens. Each token is just a number: the word “tokens”, for example, might translate to an integer ID like 24952. The entire text you give the model is encoded as a series of such numbers, and the next word it generates is also a number. A dictionary, the model's vocabulary, translates words to numbers and back.
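To make that concrete, here is a minimal sketch with a made-up five-word vocabulary. Real tokenizers, such as SentencePiece, split text into subword pieces rather than whole words, but the round trip between text and numbers looks the same:

```python
# Toy tokenizer: a vocabulary maps text pieces to integer IDs and back.
# The vocabulary here is invented for illustration.
vocab = {"what": 0, "does": 1, "tokens": 2, "mean": 3, "?": 4}
id_to_word = {i: w for w, i in vocab.items()}

def encode(words):
    # Text in, numbers out: this is what the model actually sees.
    return [vocab[w] for w in words]

def decode(ids):
    # Numbers in, text out: how the model's output becomes readable.
    return [id_to_word[i] for i in ids]

ids = encode(["what", "does", "tokens", "mean", "?"])
print(ids)          # [0, 1, 2, 3, 4]
print(decode(ids))  # ['what', 'does', 'tokens', 'mean', '?']
```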
Now, the maximum length of that array, i.e. how many numbers you can give the model at once, is fixed by the model's design; that limit is the context window. If your model can work with a 10-token input and you give it 20 tokens, you might as well not give it the first 10, because it can't use them. In text generation you see this as the model “forgetting” the start of the conversation once it gets too long.
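As a rough sketch of that forgetting, here is what dropping the oldest tokens looks like, with a made-up tiny window of 10 tokens:

```python
# If the conversation grows past the context window, the oldest tokens
# are simply cut off before the model ever sees them.
CONTEXT_WINDOW = 10  # hypothetical tiny window, for illustration

def fit_to_window(token_ids, window=CONTEXT_WINDOW):
    # Keep only the most recent `window` tokens.
    return token_ids[-window:]

conversation = list(range(20))          # 20 tokens of "history"
print(fit_to_window(conversation))      # [10, 11, ..., 19]; the first 10 are gone
```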
Do you need million-word-long text generations? Probably not too often. But the context window is useful in other ways too: you can put all sorts of information into it. If you want to ask a question about a law, for example, you paste the relevant legal text first and then ask your question about it. That way the AI can reference it, the same way it can reference any past conversation you have had with it.
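Here is a sketch of that pattern; the law text, question, and prompt layout are all invented for illustration. With a 1-million-token window, the reference material could be an entire statute book rather than a short excerpt:

```python
# Put reference material into the context ahead of the question.
law_text = "Section 1. Original works of authorship are protected..."  # stand-in
question = "Does this law cover AI-generated images?"

prompt = (
    "Use the following legal text to answer the question.\n\n"
    f"--- REFERENCE ---\n{law_text}\n--- END REFERENCE ---\n\n"
    f"Question: {question}"
)
# `prompt` is sent to the model as one input; the model can now "reference"
# the law the same way it references earlier conversation.
print(prompt[:80])
```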
So the size of the context window is a pretty important property of an LLM, but it comes at a cost: a bigger context window means more memory and far more compute, because the model has to relate every token in the window to every other token.
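To get a feel for why, a back-of-envelope comparison: in a standard transformer, attention does pairwise work that grows with the square of the context length (real long-context models lean on optimizations to soften this, so take the numbers as an upper-bound intuition):

```python
# Going from 128K to 1M tokens is ~8x the length but ~61x the
# pairwise attention comparisons.
for n in (128_000, 1_000_000):
    print(f"{n:>9} tokens -> {n * n:.2e} pairwise comparisons")
```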