Can a data sample size ever be large enough to compensate for a lack of diversity within the sample?

I’m doing some research for an unrelated piece and came across this idea, but because I am not really proficient in stats, I don’t know how to refute it. But I feel I could understand if explained to me logically.

I’d also be curious to know if there are analytic concerns with extremely large sample sizes.

4 Answers

Anonymous 0 Comments

Yes and no.

If you’ve got a sample of 500 Democrats and 50 Republicans, that’s not going to reveal different information about electoral outcomes than a sample of 500,000 Democrats and 50,000 Republicans. What matters is the makeup of the sample, not its raw size. Indeed, statistics depends on this: that’s *why* you can interview so few people and still get a clear picture of the overall population.
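To make the 500/50 versus 500,000/50,000 comparison concrete, here is a minimal Python sketch. The function name `share_and_margin` is made up for illustration, and the margin of error uses the usual normal approximation, which assumes a simple random sample (an assumption a 10:1 skewed sample would not actually satisfy).

```python
import math

def share_and_margin(dem, rep, z=1.96):
    """Return (Democrat share, approximate 95% margin of error) for a sample."""
    n = dem + rep
    p = dem / n
    # Normal-approximation margin of error. It assumes a simple random sample,
    # so it measures random noise only and says nothing about the skew itself.
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, margin

small_share, small_margin = share_and_margin(500, 50)
large_share, large_margin = share_and_margin(500_000, 50_000)

print(f"small sample: share={small_share:.3f}, margin={small_margin:.4f}")
print(f"large sample: share={large_share:.3f}, margin={large_margin:.4f}")
```

Both samples report the same Democrat share. The larger one only narrows the margin around that same number, so if the 10:1 ratio doesn't match the population, the big sample is just more confidently wrong.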

With that in mind, you’d actually need to work pretty hard to ensure that the 10:1 bias in the small sample was replicated in the larger sample. If you’re not explicitly selecting for such bias, the closer your sample size gets to the population size, the less bias you’ll have in the sample. If your sample size becomes equal to your population size, it should be obvious that there cannot possibly be any sort of sampling bias: your sample exactly mirrors all the important demographics of your population because they’re the same set.
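The point about sample size approaching population size can be sketched with a small simulation. Everything here is assumed for illustration: a hypothetical population of 10,000 voters with 52% support, sampled at random without replacement, and a helper called `sample_share`.

```python
import random

def sample_share(population, n, rng):
    """Share of supporters in a random sample of size n, drawn without replacement."""
    sample = rng.sample(population, n)
    return sum(sample) / n

# Hypothetical population: 1 = supporter, 0 = non-supporter (52% support).
population = [1] * 5200 + [0] * 4800
true_share = sum(population) / len(population)

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (50, 500, 5000, len(population)):
    estimate = sample_share(population, n, rng)
    print(f"n={n:>6}: estimate={estimate:.4f}, error={abs(estimate - true_share):.4f}")
```

The errors for the smaller samples depend on the random draw, but the last row is always exact: once the sample is the whole population, the estimate and the true share are the same number.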

Consider the difference between “I polled everyone in my office”, “I polled everyone in the building” and “I polled everyone in the city”. You’re not intentionally introducing or preserving bias, so simply expanding your sample will normally reduce the bias (increase the diversity) of your sample.
