eli5: Why does ChatGPT give responses word-by-word, instead of the whole answer straight away?


This goes for almost all AI language models that I’ve used.

I ask it a question, and instead of giving me a paragraph instantly, it generates a response word by word, sometimes sticking on a word for a second or two. Why can’t it just paste the entire answer straight away?

In: Technology

28 Answers

Anonymous 0 Comments

Modern AI is really truly just an advanced version of that thing where you keep tapping the middle word in autocomplete. It doesn’t know what word it will use next until it sees the word that came before it. It’s generating as it’s showing.

Anonymous 0 Comments

Because that’s how these answers are generated: such a language model does not generate an entire paragraph of text at once, but instead generates one word, then generates the next word that fits with what it has previously generated, while also trying to stay within the context of your prompt.

It helps to stop thinking of these language model AIs as some kind of program acting like a person who writes you a response, and to think of them instead as a program designed to make text that feels natural to read.

It’s like when you are just learning a new language and trying to form a sentence: you would most likely also go word by word, making sure each next word fits into the sentence.

That’s also why these language models can make totally wrong answers seem like they are correct: everything is nicely put together and fits into the sentences and paragraphs, but the underlying information used to generate that text can be entirely made up.
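To make “picks the next word that fits” concrete, here is a toy sketch. The words and probabilities below are completely made up for illustration; a real model scores tens of thousands of candidate words:

```python
import random

# Given the text so far, the model scores every candidate next word
# and samples one. These numbers are invented purely for illustration.
text_so_far = "The sky is"
candidates = {"blue": 0.70, "clear": 0.15, "cloudy": 0.10, "falling": 0.05}

words, weights = zip(*candidates.items())
next_word = random.choices(words, weights=weights)[0]
print(text_so_far, next_word)  # e.g. "The sky is blue"
```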

edit:

Just wanna take a moment here to say these are really great discussions down here. Even if we are not all in agreement, there’s a ton of perspective to be gained.

Anonymous 0 Comments

It could just give you the whole thing after it is done, but then you would be waiting for a while.

The response is generated word by word anyway, and seeing progress keeps you engaged while you wait. So there is no reason for them to delay showing you the response.
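You can see why with a toy generator. Everything here is made up for illustration; the `time.sleep` stands in for the time each word takes to generate:

```python
import time

# Hypothetical stand-in for the model: yields one word at a time,
# with a pause to mimic generation time per word.
def stream_words():
    for word in ["The", "answer", "appears", "one", "word", "at", "a", "time."]:
        time.sleep(0.3)
        yield word + " "

# Option 1: show each word as it arrives (what chat UIs do).
for word in stream_words():
    print(word, end="", flush=True)
print()

# Option 2: wait for the whole thing, then show it all at once.
print("".join(stream_words()))
```

Both options take the same total time; the first just lets you start reading immediately.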

Anonymous 0 Comments

Mostly, it’s a design choice for user comfort. Computers generate most things in sequence, but AI is quite slow compared to other software, so the designers chose to show every bit right away as it’s generated, so users don’t get impatient, especially since LLMs generate quite long texts to begin with. It’s also more impressive. If the LLM sat on “thinking…” for 10+ seconds before each answer, you wouldn’t find it as cool.

Anonymous 0 Comments

It’s just not fast enough to give the whole answer straight away. Getting the LLM to give you one ‘word’ at a time is called “streaming”, and in some cases it is something you have to deliberately turn on; otherwise you’d just be sitting there looking at a blank space for a minute before the whole paragraph pops out.
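For the curious, here is roughly what turning streaming on looks like with the OpenAI Python client. The model name is just an example, and most other providers’ APIs have the same pattern:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

# stream=True asks the API to send pieces of text as they are generated,
# instead of waiting for the full response.
stream = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": "Why do you answer word by word?"}],
    stream=True,
)

for chunk in stream:
    # each chunk carries the next little piece of text (often a word or less)
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```

Without `stream=True`, the same call simply blocks until the whole answer is finished.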

Anonymous 0 Comments

A lot of these answers that you’re getting are incorrect.

You see responses appear “word by word” so that you can begin reading as quickly as possible. Because most chat wrappers don’t allow the AI to edit previously written words, it doesn’t make sense to force the user to wait until the entire response is written to actually see it.

It takes actual time for the response to be written. When the response slowly trickles in, you’re seeing in real time how long it takes for that response to be generated. Depending on which model you use, responses might appear to form complete paragraphs instantly. This is merely because those models run so quickly that you can’t perceive the amount of time it took to write.

But if you’re using something like GPT-4, you see the response slowly trickle in because that’s literally how long it takes the AI to write it, and because right now ChatGPT isn’t allowed to edit words it’s already written, there is no point in waiting until it’s “done” before sending it over to you. Keep in mind that its inability to edit words as it goes is an _implementation detail_ that may well change in future models.

Anonymous 0 Comments

Based on all the text that has been written so far, it predicts the next word.
So when you ask “Who is Michael Jordan?”, it takes that sentence and predicts the next word: “Michael”. Then, to predict the word after that, it takes the text “Who is Michael Jordan? Michael” and predicts “Jordan”. Then it starts over again with the text “Who is Michael Jordan? Michael Jordan”. In the end it has produced “Who is Michael Jordan? Michael Jordan is a former basketball player for the Chicago Bulls.” So basically it takes a text and predicts the next word, over and over. That is why you get it word by word. It’s not really that advanced.
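In code, that loop looks something like this bare-bones sketch. `predict_next_word` is a hypothetical stand-in for the real model, not an actual API:

```python
def generate(prompt, predict_next_word, max_words=100):
    """Autoregressive generation: predict one word, append it, repeat."""
    text = prompt
    for _ in range(max_words):
        word = predict_next_word(text)    # the model only sees the text so far
        if word == "<end>":               # the model signals it is finished
            break
        text += " " + word                # feed the new word back in
        print(word, end=" ", flush=True)  # show each word as soon as we have it
    return text
```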

Anonymous 0 Comments

I do find that quite interesting. I have some idea of where I am going as I write. Does the AI have no idea?

Anonymous 0 Comments

You could get a 30-second-long loading bar for every reply… But most people would drop the tool almost instantly, as our attention spans keep shrinking at a staggering pace.

As things stand, it is much more desirable to have *immediate* output than *complete* output.

Also, LLM technology works one word at a time at the moment, so the visual output reflects the actual output of the algorithm.

Anonymous 0 Comments

That’s the whole game… it’s doing massive amounts of math to decide the next word that makes sense.