The software produces its response one “word” at a time. I put “word” in quotes because the unit it actually produces (usually called a token) is not always a word that you see on the screen. It can be a piece of a word or a punctuation mark, and one special “word” simply means “this is the end of the message”.
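As a rough illustration of that loop, here is a toy sketch in Python. The `next_token` function and the `<end>` marker are made up for this example; a real model computes each token with a neural network rather than reading from a fixed script, but the "keep asking for the next token until the end marker shows up" shape is the same.

```python
END = "<end>"  # hypothetical marker meaning "this is the end of the message"

def next_token(tokens_so_far):
    """Toy stand-in for the model: pick the next token given what is written so far."""
    script = ["Hello", ",", " world", "!", END]
    return script[len(tokens_so_far)]

def generate():
    """Ask for one token at a time until the end marker is produced."""
    tokens = []
    while True:
        token = next_token(tokens)
        if token == END:
            break  # the end marker is never shown to the reader
        tokens.append(token)
    return tokens

print("".join(generate()))  # prints "Hello, world!"
```

Note that the end marker counts as a "word" the model has to produce, even though nothing appears on screen for it.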
Each word takes a lot of computation. That means time, energy, computing resources such as CPUs and GPUs running on a server somewhere, and cooling. Compared to other things that computers do, computing the next word in the GPT-4 version of ChatGPT takes a large amount of computation, and that cost is multiplied by how many people are using the service at the same time.
If it waited and sent the entire message at once, you would just be sitting there until the whole thing was finished. So it sends the reply one word at a time, and you can start reading while it is still writing. Another benefit is that you can see it is successfully writing and not just stuck.
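The difference can be sketched with a Python generator. This is only an illustration under assumptions: `generate_words` and `reply_streamed` are hypothetical names, and `time.sleep` stands in for the real per-word computation. Each word is written out and flushed as soon as it exists, instead of being collected and sent at the end.

```python
import sys
import time

def generate_words(words, delay=0.0):
    """Yield one word at a time; the sleep is a stand-in for the model's computation."""
    for word in words:
        time.sleep(delay)
        yield word

def reply_streamed(words, delay=0.0, out=sys.stdout):
    """Show each word as soon as it is ready, so the reader is never staring at nothing."""
    shown = 0
    for word in generate_words(words, delay):
        out.write(word)
        out.flush()  # push the word to the reader right away
        shown += 1
    out.write("\n")
    return shown

if __name__ == "__main__":
    reply_streamed(["It ", "is ", "still ", "writing."], delay=0.2)
```

With a non-zero `delay`, the words appear one by one. Joining them all first and writing once would take the same total time, but the reader would see nothing until the very end.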