When study results are reported, what does it mean when authors say “results upon controlling for XYZ factors”?

I don’t fully understand what controlling for a factor in an experiment means, especially when it comes to real-world studies with large numbers of people in the trials. For example: “Yogurt consumers had a higher DGAI score (ie, better diet quality) than nonconsumers. *Adjusted for demographic and lifestyle factors and DGAI*, yogurt consumers, compared with nonconsumers”
Looking for an intuitive way to understand what controlling for factors means.
Thank you in advance!

7 Answers

Anonymous

Let’s pretend that we think owning a super-fancy-super-speedy sportscar is somehow “good for your health” so we do a study of a million people’s car purchases and compare that to their ages at death.

Out of 1,000,000 folks, only 10 had super-fancy-super-speedy sportscars, and on average they lived to be 85 years old. The other 999,990 people did not have super-fancy-super-speedy sportscars, and they lived to be 75 years old on average. Wow! Super-fancy-super-speedy sportscar “owners” live 10 years longer on average than “non-owners”! We have confirmed our hypothesis!

Or have we? What else could have caused the difference? Well, for one thing, super-fancy-super-speedy sportscars are super expensive… so what happens if, instead of just “owner” versus “non-owner”, we also study income tax data so that we can take “rich” versus “not-rich” into account?

In that case we might find that there are 10 “rich owners”, 10 “rich non-owners”, 999,980 “non-rich non-owners”, and 0 “non-rich owners” of super-fancy-super-speedy sportscars. So what were the average lifetimes for each of those sub-groups? The “non-rich non-owners” are still 75 on average, there are no “non-rich owners”, the 10 “rich owners” lived to 85 on average (they are the same 10 owners as before), and the 10 “rich non-owners” lived to 95 on average… so what does it all mean now?

Well, when wealth is taken into account… it looks (on average) like the “rich” live longer than the “non-rich” regardless of whether they bought a super-fancy-super-speedy sportscar or not. Furthermore, if we compare only “rich non-owners” versus “rich owners” we see that (adjusted for “rich”) the “owners” actually live 10 years less on average than the “non-owners”!
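If it helps to see the arithmetic, here is a minimal Python sketch of the same comparison done both ways. The subgroup table, the function names, and the choice of 85 for the rich owners (so that they match the 85-year owner average from the first comparison, with the rich non-owners set 10 years above them) are assumptions made for illustration only. None of it comes from a real study.

```python
# Each row: (wealth, ownership, number of people, average age at death).
# Illustrative figures for the made-up sportscar story, not real data.
groups = [
    ("rich", "owner", 10, 85),
    ("rich", "non-owner", 10, 95),
    ("non-rich", "non-owner", 999_980, 75),
]


def mean_age(rows):
    """Average age at death across rows, weighted by subgroup size."""
    total_people = sum(count for _, _, count, _ in rows)
    total_years = sum(count * age for _, _, count, age in rows)
    return total_years / total_people


def owner_gap(rows):
    """Owners' mean age minus non-owners' mean age, or None if a side is empty."""
    owners = [r for r in rows if r[1] == "owner"]
    non_owners = [r for r in rows if r[1] == "non-owner"]
    if not owners or not non_owners:
        return None  # nothing to compare against in this subgroup
    return mean_age(owners) - mean_age(non_owners)


def owner_gap_within(rows, wealth):
    """The same gap, restricted to one wealth subgroup ("controlling for" wealth)."""
    return owner_gap([r for r in rows if r[0] == wealth])


# Unadjusted: everyone lumped together, owners look about 10 years better off.
print("unadjusted gap:", round(owner_gap(groups), 2))

# Adjusted: compare like with like inside each wealth subgroup.
print("gap among the rich:", owner_gap_within(groups, "rich"))
print("gap among the non-rich:", owner_gap_within(groups, "non-rich"))
```

The unadjusted gap comes out at roughly +10 years, the gap among the rich at -10 years, and the non-rich subgroup returns `None` because it has no owners to compare against. Real studies usually do this with a statistical model rather than by hand-splitting a table, but the idea of comparing like with like is the same.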

So, “controlling for XYZ… we find ABC” (as a summary statement) essentially means that the researchers made sure to separate groups into smaller subgroups (based on XYZ) to check whether the statistical effect (ABC) still persists. It doesn’t mean the study accounted for any-and-every possibility (maybe their critics believe they should have considered IJK, too), but it does mean that they did the work to show that, at the very least, ABC isn’t just XYZ in disguise (the way “owners” was actually “rich” in disguise in the silly sportscar example).

—————————–

As for what “…demographic and lifestyle factors…” specifically means, you’d have to dive into the methodological details of the specific article. Maybe they were really, really thorough and broke the data down into a multitude of sub-sub-sub-groups based on gender, age, wealth, ethnicity, sexual orientation, left/right-handedness, favorite color, etc. Or maybe they were lazy, split it only by gender, and called it a day, hoping nobody would dig into the specifics. That same “adjusted for demographic and lifestyle factors…” line might be an understatement or an overreach, and the only way to know for sure is to check the specifics of the methodology.
