What is p-hacking, how does it work, and what does it mean for science in general?

I’ve been reading about how some studies that we assume provide us with information about the world actually don’t teach us anything because of statistical manipulation that makes them look more relevant than they are. How does this work?

12 Answers

Anonymous 0 Comments

P-hacking happens when you have enough data, and enough ways to slice it, that you can almost always find something that looks surprising. Sometimes it’s intentional, but sometimes it’s just overenthusiasm combined with poor documentation of the experiment or study.

The core of the problem is that the “p” in p-hacking represents a probability, as in “there’s less than a 5% chance of seeing a result like this by random chance alone.” But if I have 100 different ways I can analyze the data, I should expect about 5 of them to look significant even when nothing real is going on.

Example:

I think vitamins are good for you, and I want to help everyone by doing science on vitamins. So I get data from people all over the USA:

* what vitamins they take.
* where they live.
* how old they are.
* how healthy they are.

If I look for patterns in the data, I may find patterns like this:

* **AMAZING MEDICAL NEWS: Older women in Southern states that take a daily dose of vitamin C are less likely to have a heart attack compared to those that don’t. Scientists conclude that vitamin C stops sunburn from causing heart problems in women.**
* (Sad face) Children in cities who take vitamin B daily are more likely to be overweight than those who don’t. I conclude that parents of overweight kids must be trying to help them by giving them extra vitamins. I don’t think that’s an interesting result on vitamins, so you never see that result.
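
To see how slicing alone produces “findings” like these, here is a rough simulation sketch. Everything in it is made up for illustration: the function names, the 10% heart attack rate, and the group sizes are my assumptions, not real data. Vitamins do nothing in this fake world, yet some slices still cross the p < 0.05 line.

```python
import math
import random


def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value for a difference between two proportions
    (normal approximation, good enough for a rough illustration)."""
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (hits_a / n_a - hits_b / n_b) / std_err
    return math.erfc(abs(z) / math.sqrt(2))


def count_chance_findings(n_slices=100, n_per_group=500, base_rate=0.1,
                          alpha=0.05, seed=0):
    """Count slices that look 'significant' even though vitamins do nothing."""
    rng = random.Random(seed)
    findings = 0
    for _ in range(n_slices):
        # Hypothetical slice (e.g. one age/sex/region combination).
        # Takers and non-takers share the same true heart attack rate.
        takers = sum(rng.random() < base_rate for _ in range(n_per_group))
        non_takers = sum(rng.random() < base_rate for _ in range(n_per_group))
        p = two_proportion_p_value(takers, n_per_group, non_takers, n_per_group)
        if p < alpha:
            findings += 1
    return findings


if __name__ == "__main__":
    print(count_chance_findings(), "of 100 slices look significant by luck alone")
```

The exact count changes with the seed, but over many runs it hovers around 5 per 100 slices, which is just what a 5% threshold implies.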

The main way to prevent p-hacking is therefore this:

* Before I analyze the data, maybe even before I get the data, I write down every analysis I plan to do and how I will judge what’s interesting. I commit to including that writeup in every publication I make.
* Before getting the data, I probably didn’t have a theory that there would be an interesting effect for the specific combo of vitamin C – women – South – heart attacks. So I’m not allowed to draw any conclusions about that.
* I may have had a theory about vitamin B and children’s obesity. If I did and included it in the plan, I’m compelled to discuss that. It might be “negative health outcome in urban children, no outcome in rural children. Possible confounders are vitamin prescriptions for obese children.”

Edit to add:

After finding the older women – vitamin C – South – heart result, I *AM* allowed to run a new study on that, but I have to start from scratch. I can’t use the same data. Possibly I can’t even use the same method of collecting data. But if I come up with a totally different sample of older Southern women and I see the effect *there*, then I do have a valid result. I can even say “I saw this in the data from an earlier study and I thought it was interesting <insert details>, so I came up with this study to test it.”
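
Here is a small sketch of why the fresh sample matters. It is a toy model under my own assumptions: the “health outcome” is plain Gaussian noise with no real effect anywhere, and the function names are hypothetical. I test 100 subgroups, keep the best-looking one, then test that one subgroup again on brand-new data.

```python
import random
from statistics import NormalDist, fmean


def z_test_p_value(sample_a, sample_b):
    """Two-sided p-value for a difference in means,
    assuming the noise has a known standard deviation of 1."""
    n_a, n_b = len(sample_a), len(sample_b)
    z = (fmean(sample_a) - fmean(sample_b)) / (1 / n_a + 1 / n_b) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def discover_then_replicate(n_subgroups=100, n=200, seed=1):
    """Return (best p-value found by slicing, p-value on a fresh sample)."""
    rng = random.Random(seed)

    def draw():
        # Pure noise: takers and non-takers come from the same distribution.
        return [rng.gauss(0, 1) for _ in range(n)]

    # Discovery: test every subgroup in the first data set, keep the best one.
    discovery_p = min(z_test_p_value(draw(), draw()) for _ in range(n_subgroups))
    # Replication: a completely new sample for that single subgroup.
    replication_p = z_test_p_value(draw(), draw())
    return discovery_p, replication_p


if __name__ == "__main__":
    found, held_up = discover_then_replicate()
    print(f"best p-value after slicing: {found:.4f}")
    print(f"p-value on fresh data:      {held_up:.4f}")
```

In this setup the cherry-picked p-value is almost always below 0.05, while the fresh-sample p-value clears that bar only about 5% of the time. A result that survives the second test is far more believable than one that was only ever seen in the data it was fished out of.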
