Can a data sample size ever be large enough to compensate for a lack of diversity within the sample?


I’m doing some research for an unrelated piece and came across this idea, but because I am not really proficient in stats, I don’t know how to refute it. I think I could follow it if it were explained to me logically.

I’d also be curious to know if there are analytic concerns with extremely large sample sizes.

In: Mathematics

4 Answers

Anonymous

Your color example perhaps needs some adjusting. Suppose there is a “Truth in the Universe” (TITU): the answer we would get if we asked every single person what their favorite color is or was (FYI we can speak to the dead, just check out The Dead Files on the Travel Channel).

If we then ask 1,000 random living Americans what their favorite color is, we will have some level of confidence that their answers reflect the actual TITU. We have biased the sample by using only Americans, and only those who are alive (perhaps quite a few of the departed favor sepia), but how important is nationality with regard to color? If it is unimportant, we can increase the sample size to 100,000 random Americans and be more confident that the answer closely approximates the TITU.

But if nationality IS a biasing factor, then increasing the sample size while still only including Americans leaves that source of error untouched. The confidence interval does get narrower, but it narrows around the American answer rather than the TITU, so the extra precision does not bring us any closer to the truth.
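If it helps to see this with numbers, here is a rough simulation sketch. The two percentages are made up purely for illustration (nobody has measured them), and `estimate_share` is a hypothetical helper, not anything from a stats library. It polls only "Americans" at both sample sizes and compares the result to the assumed TITU value.

```python
import random

def estimate_share(true_share, n, rng):
    """Poll n random respondents, each of whom picks the color
    with probability true_share; return the observed fraction."""
    hits = sum(1 for _ in range(n) if rng.random() < true_share)
    return hits / n

# Hypothetical numbers, chosen only for illustration.
TITU_SHARE = 0.30      # assumed share of everyone who favors blue
AMERICAN_SHARE = 0.40  # assumed share of Americans who favor blue

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (1_000, 100_000):
    # We can only ever sample Americans, so the poll draws from AMERICAN_SHARE.
    p_hat = estimate_share(AMERICAN_SHARE, n, rng)
    # Approximate 95% margin of error for a proportion (normal approximation).
    margin = 1.96 * (p_hat * (1 - p_hat) / n) ** 0.5
    results[n] = (p_hat, margin)
    print(f"n={n}: estimate={p_hat:.3f} +/- {margin:.3f}, TITU={TITU_SHARE}")
```

The margin of error shrinks by roughly a factor of ten when the sample grows from 1,000 to 100,000, but the estimate stays near the American value, so the gap to the assumed TITU value stays about the same.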
