This is a pretty simple answer. LLMs like ChatGPT don't have fully formed thoughts; they generate the text you read one word at a time (technically, one token at a time), based on patterns in their training data.
The model literally has no idea which word comes next until the one before it has been generated.
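To make that concrete, here's a toy sketch in Python. The `next_token` function is a made-up stand-in for the real neural network (a real LLM samples from learned probabilities, not at random), but the loop is the part that matters: each new word is only picked after all the previous ones exist, and it can be sent to your screen the moment it's picked.

```python
import random

def next_token(context):
    # Stand-in for the real model: a real LLM would run a neural
    # network over `context` and sample from its output probabilities.
    vocab = ["the", "cat", "sat", "on", "the", "mat", "."]
    return random.choice(vocab)

def generate_streamed(prompt_tokens, max_tokens=10):
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        tok = next_token(tokens)         # depends only on what came before
        tokens.append(tok)
        print(tok, end=" ", flush=True)  # stream it out as soon as it exists
        if tok == ".":                   # stop token ends the answer
            break
    print()
    return tokens

generate_streamed(["the"])
```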
Now, they could, if they wanted, program in a delay, where the server buffers up all the words and sends them at once. But that's extra complexity and server overhead that nobody wants to pay for, and it wouldn't make the answer arrive any sooner: you'd just stare at a blank screen for the same amount of time instead of watching the words appear.
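Here's what that buffered version would look like, reusing the toy `next_token` from the sketch above. It does exactly the same work in the same amount of time; the only difference is that nothing reaches you until the very end.

```python
def generate_buffered(prompt_tokens, max_tokens=10):
    # Same generation loop, but hold everything back and print once
    # at the end. The total wait is identical; the server just has to
    # hold the whole answer in memory in the meantime.
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):
        tok = next_token(tokens)
        tokens.append(tok)
        if tok == ".":
            break
    print(" ".join(tokens))
    return tokens

generate_buffered(["the"])
```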