Because statistics is counterintuitive.
A RANDOM SAMPLE of around 1,000 people (in practice it's usually more like 1,200, and that matters) will give you answers that are only about 3% off from reality.
The important part is the random part. If you grab, say, 1000 people from downtown Manhattan, you won't get a picture of the country. You absolutely must get as close to a perfectly random sample as you can. Which is actually fairly difficult.
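You can see the problem with a toy simulation (all the numbers here are made up purely for illustration): a population where opinion differs between two groups, polled once with a truly random sample and once with a sample drawn from only one group.

```python
import random

# Made-up population: 70% belong to group A (60% approve of something),
# 30% belong to group B (20% approve).
random.seed(1)
population = (
    [("A", random.random() < 0.60) for _ in range(70_000)]
    + [("B", random.random() < 0.20) for _ in range(30_000)]
)
true_rate = sum(approve for _, approve in population) / len(population)

# A truly random sample of 1000 lands close to the real number:
random_rate = sum(a for _, a in random.sample(population, 1000)) / 1000

# A sample of 1000 drawn only from group A (like polling one neighborhood)
# confidently reports the wrong number:
group_a = [p for p in population if p[0] == "A"]
biased_rate = sum(a for _, a in random.sample(group_a, 1000)) / 1000
```

The random sample of 1000 comes out within a few points of the truth; the group-A-only sample overshoots by about twelve points no matter how big you make it.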
A really great example of this comes from the first real scientific polling done on a Presidential campaign. The 1936 Presidential election was between FDR and a guy you've probably never heard of named Alf Landon.
A magazine called Literary Digest had been polling its readers (and it had a LOT of readers) for several Presidential elections, and it had been right every time.
In 1936 Literary Digest sent out 10 million polls and got back 2.27 million answers. Based on that, they said Alf Landon was going to kick FDR's ass, because their polling showed a massive win for Landon.
This other guy, George Gallup, did a scientific poll of around 1000 people, and he said that FDR would win in a landslide. This prompted a great deal of mockery: how could he possibly say something like that with his measly 1000 people?
The answer was: randomness.
Turns out that Literary Digest readers were mostly richer than average, and concentrated in certain geographic areas. They'd gotten lucky in the past, but in 1936 the election was decided by poor people, often first-time voters, who had been totally ignored by the Literary Digest poll.
It's counterintuitive, you wouldn't think that 1000 would be enough, but nope, it really is, as long as it's random enough.
To get an accurate poll of the entire 8 billion people on Earth you’d only need to sample around 2400 people, as long as you got a completely random sample.
And it's randomness that's vastly more important than size. If it's not random it doesn't matter how big your sample is, it's going to be wrong. A smaller random sample just gives slightly bigger error bars. You could sample 1,000 truly random people on Earth and your margin of error would only be around 3%. That 2,400 I mentioned earlier was for a 2% margin of error.
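Those numbers come from the standard margin-of-error formula for a yes/no question. A quick sketch (1.96 is the usual 95% confidence z-score, and the formula assumes the worst case of a 50/50 split):

```python
import math

def margin_of_error(n, z=1.96):
    # Worst-case (50/50 split) margin of error for a simple random
    # sample of size n, at 95% confidence by default:
    # z * sqrt(p * (1 - p) / n) with p = 0.5.
    return z * math.sqrt(0.25 / n)

# Notice the population size never appears in the formula: 1000 random
# people works the same whether it's one country or all 8 billion of us.
print(f"{margin_of_error(1000):.1%}")  # about 3.1%
print(f"{margin_of_error(2400):.1%}")  # about 2.0%
```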
Now, there are all sorts of other factors involved. For example, people tend to be agreeable. If you ask "Do you agree Bob Dole should be declared to be a twit?" you'll get a lot of people saying yes even if they don't necessarily agree, just because people tend to say what they think you want to hear.
Examining the questions a poll asks is as essential as examining the sample size. A better question would be something more like "Some people say Bob Dole is a twit, others say he isn't. What do you think?" and would flip that 50% of the time, so the question was phrased "Some people think Bob Dole is not a twit, others say he is. What do you think?"
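That 50/50 flip is just randomizing which phrasing each respondent sees, so any bias from the wording averages out across the whole sample. A minimal sketch, using the example wording above:

```python
import random

# Two phrasings of the same question; each respondent gets one at
# random so that wording bias cancels out over the full sample.
PHRASINGS = (
    "Some people say Bob Dole is a twit, others say he isn't. What do you think?",
    "Some people think Bob Dole is not a twit, others say he is. What do you think?",
)

def pick_phrasing(rng=random):
    # 50/50 choice between the two phrasings for each respondent.
    return rng.choice(PHRASINGS)
```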
Bias also kicks in depending on whether the person is answering questions on a computer or on paper versus answering in person. For example, if a Black pollster asks questions about race, surprise: a lot of white respondents lie and give much more racially progressive answers than they would if a white pollster were doing the questioning.
In general if you drill into the poll questions and find that they’re all pretty biased (“Donald Trump thinks America is the greatest country to ever exist, do you also love America or are you a commie?”) then it’s an indication that the poll isn’t actually designed to get accurate answers.
There’s also a practice called push polling, in which people are told that they’re being asked questions on a poll but in fact the purpose is to propagandize them and their answers are irrelevant.
“Scientists say that Diet Coke gives you cancer and Diet Pepsi will make you live forever, do you prefer to drink Diet Pepsi?” is a great way to advertise for Diet Pepsi, but a lousy way to find out how many people drink which soda.