I do not understand why it’s harder to find a significant difference in data when you do more comparisons

I am a grad student desperately trying to analyze her data. I am having a hard time understanding why correcting for the number of tests I’m doing (Bonferroni and Tukey) is taking away my significance. I have 4 factors across 3 timepoints, and when I run stats on each factor across the timepoints separately, they are significant. When I put them all together on one graph (all four factors across all 3 timepoints), they are no longer significant. I understand how Bonferroni works; what I am asking is why it feels like I am being punished with stricter p-values when I am being more thorough. I feel like this correction encourages people to break down their data in order to get significance, which feels icky. I’m wishing I had studied just one of the factors across the timepoints instead of all 4.

4 Answers

Anonymous

Bonferroni is too strict and has been superseded by Holm-Bonferroni anyway, which is less strict but still holds the familywise type I error rate at alpha.
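To make the difference concrete, here is a minimal sketch of both procedures in plain Python. The four p-values are made up for illustration (they are not the poster's data), and the function names are just ones I picked. Bonferroni compares every p-value to alpha/m, while Holm sorts the p-values and relaxes the threshold step by step (alpha/m, then alpha/(m-1), and so on), stopping at the first one that fails.

```python
def bonferroni_reject(pvals, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def holm_reject(pvals, alpha=0.05):
    """Holm step-down: test the smallest p-value against alpha / m,
    the next against alpha / (m - 1), and so on; stop at the first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    reject = [False] * m
    for rank, idx in enumerate(order):
        if pvals[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger p-value is retained as well
    return reject


# Hypothetical p-values, each "significant" on its own at 0.05.
example_pvals = [0.010, 0.015, 0.020, 0.030]
print("Bonferroni:", bonferroni_reject(example_pvals))
print("Holm:      ", holm_reject(example_pvals))
```

With these made-up numbers Bonferroni keeps only the smallest p-value, while Holm keeps all four, even though both control the same familywise error rate.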

Anyway, it depends on what exactly you were investigating with the multiple factors. E.g., in the extreme case, each is exactly identical, and that would make correction pointless (because it’s really just one factor). So the fact that Bonferroni makes them not significant isn’t necessarily a big deal. I’d have larger questions about how separate the factors really are. If they’re numeric, are they highly correlated, like .8? Or are they truly unique? If they’re highly correlated, then Bonferroni overcorrects (meaning it pushes the type I error rate well below alpha and unnecessarily makes it harder to detect anything). There are methods above my head that take into account the correlation of the “independent” tests through bootstrapping.
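A rough way to see how correlation between tests changes what Bonferroni does is to simulate it. The sketch below is my own toy setup, not the poster's design: it assumes m two-sided z-tests with no real effects, where every pair of test statistics shares the same correlation rho, and it counts how often at least one test comes out "significant" (the familywise error rate). The function name and parameters are hypothetical.

```python
import random
from statistics import NormalDist


def familywise_error_rate(m, rho, per_test_alpha, n_sim=20000, seed=0):
    """Estimate P(at least one false positive) across m null z-tests
    whose statistics are equicorrelated with correlation rho (0 <= rho <= 1)."""
    rng = random.Random(seed)
    # Two-sided critical value for the per-test threshold.
    crit = NormalDist().inv_cdf(1 - per_test_alpha / 2)
    a, b = rho ** 0.5, (1 - rho) ** 0.5
    false_alarms = 0
    for _ in range(n_sim):
        shared = rng.gauss(0, 1)  # component common to all m tests
        # Each statistic a*shared + b*noise is standard normal under the null.
        if any(abs(a * shared + b * rng.gauss(0, 1)) > crit for _ in range(m)):
            false_alarms += 1
    return false_alarms / n_sim


if __name__ == "__main__":
    m, alpha = 4, 0.05
    print("uncorrected, rho=0.0:", familywise_error_rate(m, 0.0, alpha))
    for rho in (0.0, 0.8, 1.0):
        print(f"Bonferroni,  rho={rho}:", familywise_error_rate(m, rho, alpha / m))
```

With four independent tests and no correction, the chance of at least one false positive is 1 - 0.95^4, about 0.19, which is why a correction is applied at all. With Bonferroni and independent tests it should land near 0.05, and as rho moves toward 1 it should drop toward alpha/m = 0.0125, which is the sense in which the correction costs more power than it needs to when the tests are really measuring the same thing.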

If they’re unique, then as the other responder said, it’s fair to just say these factors are individually significant. Even if Bonferroni (or a better correction) eliminates the effect, that doesn’t mean it isn’t interesting and worth following up on. Maybe you don’t have the power for four separate tests, but all of them were individually significant, and that’s still interesting. Correction is just another tool for decision making. I’d be more skeptical of the results if just one of the four effects were significant and then it didn’t survive correction.
