The p-value and the level of significance both get at the same idea:
We have some assumption about how the world works (a hypothesis). We then run an experiment to obtain data about the world that will allow us to test that hypothesis. The data don't let us say, for certain, whether our assumption was correct. However, we can ask: "what is the probability of obtaining results at least as extreme as the ones we saw if our assumption were true?" This probability is the p-value. If it is very low, we may say that our assumption isn't very good and we should throw it out.
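As a concrete illustration, here's a minimal Python sketch of that calculation for a coin-flipping experiment. The counts (60 heads in 100 flips) and the fair-coin assumption are hypothetical, chosen just for illustration:

```python
from math import comb

def binomial_p_value(heads, flips, p_assumed=0.5):
    """Two-sided p-value: probability, under the assumed coin,
    of an outcome at least as extreme (no more likely) as ours."""
    def pmf(k):
        return comb(flips, k) * p_assumed**k * (1 - p_assumed)**(flips - k)
    observed = pmf(heads)
    # Sum the probability of every outcome no more likely than the one we saw.
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed + 1e-12)

# Hypothetical experiment: 60 heads in 100 flips of a coin assumed fair.
print(binomial_p_value(60, 100))  # ~0.057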
The level of significance is closely related. It is a parameter set by the researcher: a bound on the p-value such that, if the p-value is any smaller than that bound, we are willing to reject our assumption. What it's really saying is "even if data like ours had a probability of 'a' of occurring under our assumption, we will still reject it." 5% and 10% are frequent choices. That is, researchers often say that even if the world under their assumption could have generated data as extreme as what they saw 5% of the time, they would still reject the assumption.
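In code, the decision rule is just a comparison. This sketch reuses the hypothetical binomial_p_value function from above; the 5% threshold is one common choice, not a law:

```python
alpha = 0.05                          # level of significance, chosen in advance
p_value = binomial_p_value(60, 100)   # ~0.057 from the earlier sketch

if p_value < alpha:
    print("Reject the assumption that the coin is fair.")
else:
    print("Can't reject: the data are plausible under a fair coin.")
```

Note that with these hypothetical numbers the p-value (~0.057) just misses the 5% cutoff, so we would fail to reject at the 5% level but reject at the 10% level.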
How do we determine the p-value, though? Well, we have knowledge about the process that generates our estimate. This knowledge is written as a function called the probability density function (PDF). It tells us the relative likelihood (probability density) of our parameter taking on any given value under our assumption. A PDF doesn't take a value and spit out a probability, though. I won't get into the math, but a critical value basically says "here's the point on the PDF such that, for all values beyond it, there is an x% chance of observing a value in that range," where x is your significance level.
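For example, here's a sketch using the standard normal distribution, a common (but not the only) choice of sampling distribution. scipy's ppf is the inverse of the cumulative distribution, so it turns a tail probability into a critical value:

```python
from scipy.stats import norm

alpha = 0.05
# One-sided critical value: the point with exactly alpha of the
# standard normal's probability mass beyond it.
z_one_sided = norm.ppf(1 - alpha)       # ~1.645
# Two-sided test: split alpha between the two tails.
z_two_sided = norm.ppf(1 - alpha / 2)   # ~1.960
print(z_one_sided, z_two_sided)
```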
The level of confidence is a probability associated with a confidence interval. Remember, our coin from before has some "true" probability of heads. The p we estimated from our experiment is just that, an estimate. Based on how much data we observe, we might be more or less confident that this estimate is a precise one.
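To see how sample size affects precision, here's a sketch using the normal-approximation interval half-width; the estimate of 0.6 and the two sample sizes are hypothetical:

```python
from math import sqrt

p_hat, z = 0.6, 1.96  # same hypothetical estimate, z for ~95% confidence
for flips in (100, 10_000):
    half = z * sqrt(p_hat * (1 - p_hat) / flips)  # normal-approx. half-width
    print(f"n={flips}: p_hat = {p_hat} +/- {half:.3f}")
# n=100:    +/- 0.096  (wide, low precision)
# n=10000:  +/- 0.010  (narrow, high precision)
```

A hundred times more data shrinks the interval by a factor of ten, since the width scales with 1/sqrt(n).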
A 95% confidence interval says that, if you were to run this experiment 100 times, an interval this wide around your estimate would contain the true value of the parameter (in this case p) in roughly 95 of those 100 times*.
*Note: as a matter of strict interpretation, the CI around the estimate from your single experiment either contains the true value or it doesn't, which is why we talk about re-running the experiment 100 times.
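That re-running interpretation is easy to simulate. Here's a sketch (the true p, sample size, and seed are all arbitrary) that repeats the coin experiment 100 times and counts how often a 95% normal-approximation interval covers the true value:

```python
import random
from math import sqrt

random.seed(1)
true_p, flips, z = 0.5, 1000, 1.96   # z gives ~95% confidence

covered = 0
runs = 100
for _ in range(runs):
    heads = sum(random.random() < true_p for _ in range(flips))
    p_hat = heads / flips
    half = z * sqrt(p_hat * (1 - p_hat) / flips)  # interval half-width
    covered += (p_hat - half) <= true_p <= (p_hat + half)

print(f"{covered} of {runs} intervals contained the true p")  # typically ~95
```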