ELI5: What is the difference between correlation and causation?


Correlation means there is a link between two events. Causation means that one of the two events is causing the other.

A cool example is from New York City, where they found that when more ice cream was sold, there was more crime. This is correlation: when one event (ice cream sales) increases, another event (crime) also increases along with it.

But of course there isn’t any causation between the two. Selling more ice cream doesn’t cause more crime, and committing more crime doesn’t increase ice cream sales.

Sometimes it’s just random chance that there is a correlation between two events. Other times, like with ice cream and crime, a third event is causing both. In this example, summer was the cause of both more crime and more ice cream sold.
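If it helps to see the "third event" idea in numbers, here is a rough simulation. Everything in it is made up for illustration (the temperature range, the slopes, the noise levels); it is not real New York data. Temperature drives both ice cream sales and crime, and the two never touch each other directly, yet they come out strongly correlated. When you only compare days with about the same temperature, the link should mostly disappear.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up model: daily temperature drives both quantities independently.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream = [10 * t + rng.gauss(0, 40) for t in temperature]
crime = [2 * t + rng.gauss(0, 10) for t in temperature]

# Correlation across all days, hot and cold mixed together.
r_all = pearson(ice_cream, crime)

# Keep only days with nearly the same temperature (20 to 22 degrees).
# If ice cream really caused crime, the link should survive this filter.
similar = [(i, c) for t, i, c in zip(temperature, ice_cream, crime) if 20 <= t <= 22]
r_similar = pearson([i for i, _ in similar], [c for _, c in similar])

print(f"all days: r = {r_all:.2f}")
print(f"similar-temperature days: r = {r_similar:.2f}")
```

The first number should come out large and the second close to zero, which is the usual sign that a third factor is doing the work.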

Causation is the ability of one variable to influence the other. What that means is that by changing one thing, it CAUSES the other to change.

Correlation is the relationship between two sets of data. You can have positive correlation, negative correlation, or NO correlation.
Just because there is correlation, it does not mean there is CAUSATION. A great example is this graph of global temperature and the number of pirates ( https://churchoftheflyingspaghettimonsteraustralia.files.wordpress.com/2013/12/pchart1.jpg )
This shows a correlation between the number of pirates and global temperature. It’s jokingly implying that if we increase the number of pirates, we will decrease global warming. The data is correlated, but the relationship is not causal.
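To make positive, negative, and no correlation concrete, here is a small sketch. It uses invented data (hours studied, test score, free time, shoe size are all hypothetical) and the standard Pearson coefficient, which runs from -1 to +1.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable

# Invented data for 1000 imaginary students.
hours_studied = [rng.uniform(0, 10) for _ in range(1000)]
test_score = [50 + 4 * h + rng.gauss(0, 5) for h in hours_studied]  # rises with hours
free_time = [12 - h + rng.gauss(0, 1) for h in hours_studied]       # falls with hours
shoe_size = [rng.gauss(42, 2) for _ in hours_studied]               # unrelated to hours

r_positive = pearson(hours_studied, test_score)
r_negative = pearson(hours_studied, free_time)
r_none = pearson(hours_studied, shoe_size)

print(f"positive: {r_positive:.2f}")
print(f"negative: {r_negative:.2f}")
print(f"none:     {r_none:.2f}")
```

A value near +1 means the two go up together, near -1 means one goes up as the other goes down, and near 0 means no linear relationship. None of the three numbers says anything about which one causes which.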

Correlation means two things have been observed to occur together.

This may mean one is causing the other, but there may also be some third factor causing them both, or it may simply be coincidence.

Causation means that the causative link has been established – one thing is definitely causing the other.

Causation: A causes B. This means that having A leads to B, and not having A leads to not B, through an invariant mechanism.

Correlation: A and B occur together. This means that having A indicates B is more likely than it would be without A, and not having A indicates B is less likely. No mechanism is needed; it’s a backward-looking statistical fact.

Causation is predictive: if you see A in the future, you know B is coming. Correlation is unreliable for prediction, though it’s still often misused that way.
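One way to picture the "having A leads to B, not having A leads to not B" test is to force A on or off yourself and watch what B does. The sketch below is a toy model with two hypothetical worlds and invented probabilities: in the "causal" world B depends on A, and in the "confounded" world B depends only on a hidden third factor.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable

def b_rate_when_forcing_a(world, force_a, trials=5000):
    """Force A on or off ourselves, then report how often B happens."""
    b_count = 0
    for _ in range(trials):
        hidden = rng.random() < 0.5  # hypothetical hidden third factor
        a = force_a                  # we set A directly, whatever would normally cause it
        if world == "causal":
            b = a and rng.random() < 0.9       # B can only happen when A is present
        else:  # "confounded"
            b = hidden and rng.random() < 0.9  # B depends only on the hidden factor
        b_count += b
    return b_count / trials

causal_on = b_rate_when_forcing_a("causal", True)
causal_off = b_rate_when_forcing_a("causal", False)
confounded_on = b_rate_when_forcing_a("confounded", True)
confounded_off = b_rate_when_forcing_a("confounded", False)

print(f"causal world:     B rate with A on = {causal_on:.2f}, A off = {causal_off:.2f}")
print(f"confounded world: B rate with A on = {confounded_on:.2f}, A off = {confounded_off:.2f}")
```

In the causal world, switching A changes how often B happens. In the confounded world, switching A should make roughly no difference, even though A and B could still look correlated in data where the hidden factor sets both.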

Causation: one thing happens because of the other.
Example: somebody hits you in the face, your face turns red.

~~Correlation: two things happen at the same time, not because of one another.
Example: You walk through a park and a leaf falls off a distant tree.~~ What the others said.

Correlation means two events move together, but don’t necessarily have a direct connection. Maybe you find a chart showing that the stock market goes up in years when a particular football team has a winning record and goes down in years when that team has a losing record. But there is nothing in the way the stock market operates that would be affected by how a football team performs.

Causation is when there is a direct tie. If you look at sports bar revenue in the football team’s city and see that bars sell more beer when the team plays well and sell less beer when the team plays poorly, that would be causation because the team’s performance impacts whether people go out to watch the games and drink or not.