Basically I’ve been trying to sort a list of foods by date. I took a rough note of each food and date, gave it to ChatGPT, and asked it to format and order the list. It formatted it just fine, but it couldn’t order it by date. Most entries were in the right place, but a few were out of place. For example, at one point it gave me:
– 1st February 2024 – Cookies
– 1st March 2024 – Biscuits
– 1st June 2024 – Soup
– 3rd June 2024 – Chocolate
– 9th May 2024 – Chocolate
– 1st August 2024 – Eggs
– 1st August 2024 – Chicken
– 15th September 2024 – Yogurt
– 25th November 2024 – Sauce
– 16th November 2024 – Soup
– 19th November 2024 – Apple Juice
– 1st November 2024 – Potatoes
– 1st November 2024 – Soup
– 1st May 2024 – Carrots
– 1st January 2025 – Shortbread
– 1st January 2025 – Pasta
– 11th January 2025 – Noodles
– 1st January 2025 – Carrots
– 2nd February 2025 – Cereal
– 7th April 2025 – Green Beans
– 26th March 2025 – Rice
– 28th April 2025 – Pasta
– 1st May 2025 – Stock Cubes
I tried both written and numerical date formats. I also tried asking it to format the list and then order it in separate queries, so it was only doing one thing at a time. I’ve tried a few separate lists and it happened with each one. I also got the same results with Copilot. When I pointed out the mistake it would say something like “sorry, here’s the correct list” and output the exact same thing. I then remembered something similar happened about a year ago, when I asked it to list the Agatha Christie books in publication order and tell me which ones were in the public domain. It listed them all, but there were mistakes in the order. It would then tell me that only books published in (for example) 1926 or later are in the public domain, and then tell me that a book published in 1925 was.
So why can’t it do this? It seems like a very basic task, one that much less sophisticated programs can do. It has so much information; surely some of that information includes the order the months come in, and that 25 comes after 16. I’ve had it do relatively complicated calculations based on a rough written description, so ordering a few dates that are all formatted the same should be a walk in the park, right?
The task requires a two-step process: think of an algorithm, then execute it. ChatGPT can’t do that automatically. You can often get a better result if you ask it to “think step by step”, but in the case of a complicated algorithm and/or a large amount of data, it’s better to ask it to produce a script in a programming language.
I asked it “Write a python script to sort the following data in ascending order” followed by your data (in a single prompt). It wrote a script that failed. I told it “the script failed with the following error:
ValueError: time data ‘1st February 2024’ does not match format ‘%d %B %Y’
Please fix it.” It then produced correct code, which you can run online: https://www.onlinegdb.com/KhL9WFO1D
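That error message tells you exactly what the fix needed to be: the ordinal suffixes (“st”, “nd”, “rd”, “th”) have to be stripped before `%d %B %Y` can parse the date. A minimal sketch of that approach, using a few of your entries (this is my own reconstruction, not necessarily the exact code at the link):

```python
from datetime import datetime
import re

data = [
    "1st February 2024 - Cookies",
    "9th May 2024 - Chocolate",
    "1st May 2024 - Carrots",
    "16th November 2024 - Soup",
]

def parse_date(line):
    # Take the date part before the separator, e.g. "1st February 2024"
    date_part = line.split(" - ")[0]
    # Strip the ordinal suffix: "1st February 2024" -> "1 February 2024"
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", date_part)
    # Now the string matches the %d %B %Y format
    return datetime.strptime(cleaned, "%d %B %Y")

for line in sorted(data, key=parse_date):
    print(line)
```

Once each line is converted into a real `datetime` object, `sorted` compares them chronologically and gets the order right every time.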
ChatGPT actually “knows” very little. It’s been programmed specifically to be able to do some basic math, but the core of what makes ChatGPT useful is a thing called an LLM: a generative language model.
What this means is that ChatGPT doesn’t know what it means to sort dates, or even what dates are. What it does have is an unimaginably huge amount of human writing, and it can use that to try to predict what a human would say when asked a specific question.
Its predictions can get relatively close to a sorted list, but it isn’t actually sorting anything itself. It’s just generating words based on what it thinks a human would reply to your question, and that just happens to be somewhat close to what you wanted.
ChatGPT doesn’t really “know” anything; it uses word frequency to find words that correlate with each other. For example, if you ask ChatGPT, *”what do you think about kittens?”*, it will probably return something like, *”I think kittens are cute and fluffy”*. ChatGPT doesn’t know what words like “kitten”, “cute”, or “fluffy” mean. It has, in effect, observed that (say) 75% of articles containing the word “kitten” also contain the word “cute” and 53% also contain “fluffy”, and that, at a high level, is how it chooses which words to string together to form new text.
With things like dates or math problems, ChatGPT doesn’t have any built-in logic to understand what they mean. It’s simply been trained on text documents and articles that talk about dates, and it stitches bits of text together into something that reads like real human writing. It’s not like a program such as Excel, where dates can be stored as a date data type and compared with logic that produces results with 100% accuracy.
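You can see that data-type distinction directly in Python (a small illustrative sketch): comparing dates as text gives an alphabetical answer, while comparing them as a date type gives a chronological one.

```python
from datetime import date

# As plain text, comparison is alphabetical, and '9' sorts after '1',
# so this gives the chronologically wrong answer:
print("9th May 2024" < "1st June 2024")        # False

# As a real date data type, comparison follows actual chronology:
print(date(2024, 5, 9) < date(2024, 6, 1))     # True
```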
It’s because, at its core, ChatGPT is just a text prediction machine. It has no idea what it means to sort a list. It has no idea what a date is, or what a typical ordering of dates looks like. It just generates what it thinks is the most likely output for your query, based on all its training data.
>So why can’t it do this? It seems like a very basic task, one that much less sophisticated programs could do.
To put it into context, it’s like trying to answer an addition problem using the autocomplete on your phone. ChatGPT isn’t built to do anything other than generate text. There are sorting algorithms that exist for your desired use case, but you wouldn’t ask them to write a haiku about Abraham Lincoln.
The thing about machine learning models like GPT is that they work from a statistical model rather than code that is correct by construction.
When you ask a human programmer to make an app that sorts dates, they will make one that is mathematically guaranteed to work 100% of the time. (At least, if they are a competent programmer.)
ChatGPT, meanwhile, has a model that relates your text input to an appropriate response, and it tries to give the most likely output. It trains on a huge amount of data to minimize the error of that model. When it receives an input it has never seen before (like your list of dates), it answers based on that model, and the resulting “sorted” dates are nowhere near as accurate as human-written code.
ChatGPT is trained on language, so for questions with a definite answer, like sorting, there is no guarantee that its replies will be accurate. It does generally become more accurate with more training data.
The other explanations are correct and probably easier to understand, but there is another very specific reason that LLMs aren’t good at this, at least not within a single prompt iteration:
Algorithmic complexity, specifically time complexity.
Sorting algorithms inherently take multiple steps to complete. There are many different sorting algorithms, but all of them (with some exceptions that aren’t worth getting into) require you to do some comparison and then swap elements around, many times in some order. Even if you try to parallelize this you still have to do it in multiple steps. You would probably not be surprised to learn that as the size of your list grows, the number of steps does, too.
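You can watch that step count grow with list size by instrumenting Python’s built-in sort to count its comparisons (a rough sketch; the exact counts vary with the random input):

```python
import random
from functools import cmp_to_key

def count_comparisons(n):
    """Sort n random numbers with Python's built-in sort, counting comparisons."""
    count = 0
    def cmp(a, b):
        nonlocal count
        count += 1
        return (a > b) - (a < b)
    data = random.sample(range(10 * n), n)
    sorted(data, key=cmp_to_key(cmp))
    return count

# The number of comparison steps grows as the list does:
for n in (10, 100, 1000):
    print(n, "items ->", count_comparisons(n), "comparisons")
```

No fixed budget of steps can cover every list length, which is exactly the constraint a fixed-depth network runs into.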
LLMs, however, have a fixed “depth” to them. That is, they always process information in the same number of steps. Just to throw some jargon out there: they are “transformers”, which have “self-attention” layers that essentially let the LLM connect information in one part of the text to other parts of the text. It repeats this process over multiple layers, but crucially it’s the same number of steps every time, regardless of what your prompt is.
It’s quite possible that, despite just being text prediction machines, LLMs actually have learned the concept of sorting lists, but this limited number of computation steps when predicting what to say next means that they will never be able to solve complex computations in a single prompt cycle.
Given this insight, one experiment you can try to get around it is responding with “That list isn’t actually sorted. Fix it.” and repeating until it either gets it right or keeps failing. I haven’t actually tried that with date sorting, so I’m not sure it works here, but the technique applies to many situations, not just sorting.