It doesn’t know what facts are. It has algorithmically determined what facts look like, and has been given a huge bank of largely plagiarised material to paint a picture of a fact with. It literally isn’t capable of repeating facts except by accident, or when a hugely weighted sample gives it overwhelming bias toward that fact. It’s been fed a load of stolen shit from the internet, so that second one isn’t that likely to happen lol.
It probably made up the address too; it just did so because people who talk about your area talk about certain streets a lot, and the numbers average out to a certain range. They probably talk about the neighbouring town’s restaurant, so hey, throw a mention of it in there.
This is, for example, why you can’t use it to write stories or essays and have it actually write anything good. It doesn’t know what a story or essay is, and it doesn’t know the techniques involved in writing them properly; it just has an algorithm telling it what a story and an essay look like. If you tell it to write a story, it’ll write the most derivative crap possible, maybe throw in Jon Snow or a hyperspecific fanfiction porn trope that it couldn’t possibly have in its database without theft for good measure, and there won’t be anything resembling a coherent character arc. If you tell it to write an essay, it’ll ‘hallucinate’ all its sources, because it knows what a citation looks like and maybe how citations are formatted, but it is incapable of understanding sources or even finding them properly.
Because while these devices are trained on internet data, the internet itself is full of misinformation, and on top of that, learning algorithms are still kind of in their infancy. Despite how promising they are, they certainly still have a lot of issues that can result in a lot of false positives.
Although, on the other hand, in my experience ChatGPT and similar AIs do not seem to lie nearly as much as some news articles claim.
I even tested this recently when there were news articles claiming that ChatGPT almost always gets math wrong. I started asking it math questions of varying difficulty, and it only ended up getting one kind of wrong, and that was due to a mistake I could see even a human making.
And if I just ask it something basic, like who the 30th president of the USA was, it usually gets that right. It typically just seems to have issues with more logic-related questions, because the AI itself is not really designed to be logical; it’s designed to be conversational.
I recently attended an event where the head of the Microsoft Copilot team was the keynote speaker. During her presentation she stressed that the biggest issue with AI adoption was that people were using it like a search engine. This is your problem: ChatGPT and Llama3 are not built to search the internet for you. It’s like using a screwdriver to hammer a nail; you’re using the tool wrong. These tools are meant to be used to create new ideas. The other posts talk about HOW the tools create new ideas, but the key takeaway here is that these are GENERATIVE tools; that’s what the ‘G’ stands for in ChatGPT. Ask them to create a meal plan for your specific dietary needs or to invent a new recipe given a list of ingredients. Do not ask them to find you a restaurant to eat at.
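As a rough sketch of the “generative, not search” use the poster describes, here is roughly what that kind of prompt looks like in code. This assumes the openai Python package (v1+) with an API key in the environment; the model name is just an example, not a recommendation from the original post.

```python
# Minimal sketch: asking an LLM to GENERATE something (a meal plan)
# rather than to look up a fact or find a restaurant.
# Assumes the openai package >= 1.0 and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[
        {
            "role": "user",
            "content": "Create a one-day vegetarian meal plan using only rice, "
                       "chickpeas, spinach and tomatoes.",
        },
    ],
)
print(response.choices[0].message.content)
```

A prompt like “find the best restaurant near me” sent the same way would still get an answer, but it would be generated rather than looked up, which is exactly the failure mode the thread is about.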
First off, I’d just like to define a word to add some context. Semantics is the study of meaning. In simple terms, semantics refers to the meaning behind the words that you write and say. When you speak, you naturally think about the meaning of your words. When you ask an LLM a question, you are logically expecting it to answer with a semantically coherent and correct response.
As others have mentioned, LLMs are just adding the next word based on the probability calculated from the context. However, this probability is calculated in very complex ways in the background. LLMs seem to be able to generalize certain semantic information within their neural networks, to the point where they appear to reason and connect seemingly disconnected pieces of information. However, this phenomenon is not fully understood at the moment. It also means that when you ask the model something it doesn’t know, it will always give you its best guess based on probability. Another weird pattern you might see when using LLMs is that a model might tell you it doesn’t know something even when it does; this is probably because the original training data contained text that biased the model into answering in a particular way.
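To make the “best guess from a probability distribution” point concrete, here is a tiny toy sketch of next-word sampling. The words and probabilities are invented for illustration; a real LLM computes a distribution over its whole vocabulary with a neural network.

```python
import random

# Hypothetical next-word distribution a model might assign after the
# prompt "The capital of France is". Numbers are made up for demonstration.
next_word_probs = {
    "Paris": 0.86,
    "a": 0.05,
    "the": 0.04,
    "Lyon": 0.03,
    "located": 0.02,
}

def sample_next_word(probs, temperature=1.0):
    """Sample the next word from the (temperature-adjusted) distribution."""
    words = list(probs)
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(words, weights=weights, k=1)[0]

# Usually prints "Paris", but it can and occasionally will pick a
# lower-probability word: the model never "knows" the fact, it only
# has a best guess it samples from.
for _ in range(5):
    print(sample_next_word(next_word_probs, temperature=0.8))
```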
Artificial Intelligence is often used as a term when referring to LLMs and Machine Learning. However, there are several other branches of AI that are actively being explored which I think are worth mentioning in this thread.
Knowledge graphs are a different approach to semantic data analysis and usage. With knowledge graphs it’s easier to determine what the system knows and what it doesn’t know, so it’s easier to keep the system from hallucinating. However, knowledge graphs are usually harder to create and harder to use for more casual things.
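A minimal sketch of the knowledge-graph idea: facts stored as explicit (subject, predicate, object) triples, so the system can say outright when it has no matching fact instead of guessing. The entities below are just examples, not taken from any real knowledge graph.

```python
# Facts as explicit triples: the system either has a matching fact or it doesn't.
triples = {
    ("Calvin Coolidge", "held_office", "30th President of the USA"),
    ("Calvin Coolidge", "born_in", "Plymouth Notch, Vermont"),
    ("Paris", "capital_of", "France"),
}

def lookup(subject, predicate):
    """Return matching facts; an empty list means the system knows it doesn't know."""
    return [obj for (s, p, obj) in triples if s == subject and p == predicate]

print(lookup("Calvin Coolidge", "held_office"))     # ['30th President of the USA']
print(lookup("Calvin Coolidge", "favourite_food"))  # [] -> explicitly unknown, no guessing
```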
Another interesting branch of AI is logic programming. With logic programming you define the rules of the problem you are trying to solve and let the system interpret those rules to find a solution. You can solve complex problems this way; however, similar to knowledge graphs, logic programming languages tend to require a lot of time and aren’t really convenient for day-to-day use.
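To illustrate the “state rules, let the system derive answers” idea, here is a toy sketch of a single rule being applied to some facts. Real logic programming would normally use a language like Prolog or Datalog; this Python forward-chainer only shows the principle, and the names are invented.

```python
# Facts the programmer states up front.
facts = {("parent", "alice", "bob"), ("parent", "bob", "carol")}

def derive_grandparents(facts):
    """Rule: parent(X, Y) and parent(Y, Z) implies grandparent(X, Z)."""
    derived = set()
    for (rel1, x, y1) in facts:
        for (rel2, y2, z) in facts:
            if rel1 == rel2 == "parent" and y1 == y2:
                derived.add(("grandparent", x, z))
    return derived

# The system applies the rule itself rather than being told the answer.
print(derive_grandparents(facts))  # {('grandparent', 'alice', 'carol')}
```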
I believe future research into AI will combine these technologies in smart ways, leveraging each of their strengths and compensating for their weaknesses.
Imagine a box you can put instructions written in Chinese into. Out comes the same writing, only in English you can now read.
Because you can’t see inside the box, you may think a Chinese national is inside, translating for you. In reality, it’s a random American with a Chinese-to-English dictionary. This person has the *tools* but not the intelligence to understand what they’re doing.
Because ChatGPT is not a search engine, it’s not a research tool, and it’s not actually intelligent. It is a chat tool, and while it’s pretty cool, there isn’t an actual brain behind it. You give it the conditions you want and ask it to do something, and it’ll draw on patterns from the data used to train it and form sentences based on those conditions. It’s basically a next-word-prediction program backed by a *hell* of a lot of statistical weights learned from its training data.
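A crude way to see what “next-word prediction” means is a word-level Markov chain: count which word follows which in some text, then generate by repeatedly picking a likely next word. Real LLMs use neural networks rather than simple counts, but the basic loop of predict-append-repeat is the same idea. The training sentence here is made up.

```python
import random
from collections import Counter, defaultdict

# Toy "training data": count which word follows which.
training_text = "the cat sat on the mat the cat ate the fish".split()

follows = defaultdict(Counter)
for current, nxt in zip(training_text, training_text[1:]):
    follows[current][nxt] += 1

def generate(start, length=6):
    """Repeatedly predict a likely next word and append it."""
    word, output = start, [start]
    for _ in range(length):
        options = follows.get(word)
        if not options:
            break
        words, counts = zip(*options.items())
        word = random.choices(words, weights=counts, k=1)[0]
        output.append(word)
    return " ".join(output)

print(generate("the"))  # e.g. "the cat sat on the mat the"
```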