What do p(x), p(x, y) and p(x | y) mean?


I am in my master’s now and didn’t have any probability theory in my Bachelor’s. Out of nowhere the lecture slides are full of this notation and I am lost. I think p(x) just means the probability of the variable x. One of the others means the probability of x given y, I think. But when they start with:

p(y|x) = p(y) * p(x|y)/p(x)

I am completely lost.


3 Answers

Anonymous 0 Comments

What you’ve put there is Bayes’ formula, and you basically have it right. It follows directly from the definition of conditional probability: the joint probability p(x, y) can be written either as p(y) * p(x|y) or as p(x) * p(y|x), and dividing both by p(x) gives exactly the formula you wrote.

p(x) means the base probability of x. That is, the probability of x happening regardless of any other factors.

p(x|y) means the probability of x given that y is true.

Generally when you see the above equation you are trying to figure out p(y|x). That is, some evidence x has been observed, and you want to know the probability that, given that observation, some related condition y is also true (there’s a short code sketch after this list). To calculate this you need to know:

The base probability of x occurring (p(x)).

The probability that x would be observed if y were true (p(x|y)).

The base probability of y occurring, whether or not x is observed (p(y)).
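
Putting those three pieces together is just one line of arithmetic. Here’s a minimal Python sketch; the function name and the example numbers are made up purely for illustration:

```python
def bayes(p_y, p_x_given_y, p_x):
    """Bayes' formula: p(y | x) = p(y) * p(x | y) / p(x)."""
    return p_y * p_x_given_y / p_x

# Made-up numbers: p(y) = 0.3, p(x | y) = 0.5, p(x) = 0.4
print(bayes(0.3, 0.5, 0.4))  # 0.375
```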

Anonymous 0 Comments

This is known as Bayes’ theorem. It is a way to compute the probability of an event based on prior knowledge of conditions that might be related to the event. Here p(x|y) is the likelihood that x is true given that y is true. You will probably get a better answer by watching some Khan Academy videos on this topic.

Anonymous 0 Comments

p(x) is the probability of x occurring, where x is some event. For example, if x is that a coin lands on heads then p(x) is 0.5.

p(x, y) is the probability of x *and* y. If x is that a coin lands on heads and y is that a 6-sided die rolls a 6 then p(x, y) would be 1/12.
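
In code that’s just a product, assuming (as the example does) a fair coin, a fair die, and independence between the two:

```python
# p(x, y) = p(x) * p(y) holds because the flip and the roll are independent.
p_heads = 1 / 2  # p(x): fair coin lands heads
p_six = 1 / 6    # p(y): fair six-sided die rolls a 6

p_joint = p_heads * p_six
print(p_joint)   # 0.0833... = 1/12
```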

p(x | y) is the probability of x given that y has occurred. This is the core concept in Bayesian statistics. Instead of asking globally what a probability is, you consider what some other piece of information tells you about that probability. If we use the same coin flip and die roll example, then we would recognize that those events are independent, so p(x | y) (the probability of heads, given that you rolled a 6) is the same as p(x), so 1/2.
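
You can check that independence claim by simulation. This is a rough sketch (the trial count is arbitrary): estimate p(heads | rolled a 6) from counts and compare it to p(heads) = 0.5.

```python
import random

random.seed(0)
trials = 1_000_000
sixes = 0           # how often y happened (die rolled a 6)
heads_and_six = 0   # how often x and y happened together

for _ in range(trials):
    heads = random.random() < 0.5      # x: coin lands heads
    six = random.randint(1, 6) == 6    # y: die rolls a 6
    sixes += six
    heads_and_six += heads and six

# p(x | y) = p(x, y) / p(y), estimated from the counts
print(heads_and_six / sixes)  # ~0.5, the same as p(heads)
```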

A classic example of this in practice is a medical test for a rare disease. Imagine that the disease has a prevalence of 1 in a million people, so p(d) is 1/1,000,000. You have a test that is 99% accurate, so p(t | d) is 0.99 (i.e. the probability of getting a positive result, given that you have the disease, is 99%). You want to know “if I get a positive test result, how likely is it that I actually have the disease?” This would be written as p(d | t).

To compute this you’d use Bayes’ theorem, which you wrote out: p(d | t) = p(d) * p(t | d) / p(t). We know that p(d) is 1/1,000,000 and p(t | d) is 0.99. To compute p(t) we consider that 99% of the 1 in a million who have the disease will get a positive result, as will 1% of the 999,999 in a million who don’t (aside: real tests would have different error rates here, but we set the test as 99% specific and 99% sensitive for simplicity). We then compute p(t) = 0.99 * 0.000001 + 0.01 * 0.999999 = 0.01000098.

Plugging this all in we find that p(d | t) = 0.000001 * 0.99 / 0.01000098 ≈ 0.00009899, or about 1 in 10,000. This 99% test said you had the disease, yet it’s much more likely that the test was wrong than that you actually have the disease.
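
If you want to check that arithmetic, here is the same computation as a short Python sketch (the variable names are mine, and the 99%-sensitive / 99%-specific test is the hypothetical one from the paragraph above):

```python
p_d = 1 / 1_000_000       # p(d): prevalence of the disease
p_t_given_d = 0.99        # p(t | d): positive test given disease (sensitivity)
p_t_given_not_d = 0.01    # positive test given no disease (1 - specificity)

# Total probability of a positive test, p(t):
p_t = p_t_given_d * p_d + p_t_given_not_d * (1 - p_d)

# Bayes' theorem: p(d | t) = p(d) * p(t | d) / p(t)
p_d_given_t = p_d * p_t_given_d / p_t

print(p_t)          # 0.01000098
print(p_d_given_t)  # ~9.899e-05, about 1 in 10,000
```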

That example leads to comics like [this](https://xkcd.com/2545/), where the out-of-frame speaker reasons that P(false positive | chapter on Bayes’ theorem) is nearly 1.