It is a bitter reality for the candidates who failed to qualify for next month’s debate, and griping has been in plentiful supply. Some argue that the Democratic National Committee’s qualification criteria are arbitrary — candidates must reach 2% support in at least four polls by approved organizations — and that the committee is prematurely culling the field. Others want the DNC to count a wider range of surveys.
But among voters upset that their preferred candidate didn’t make the cut, some objections stem from misunderstandings of how polling works. We talked to pollsters to clear up some points of confusion.
You probably won’t be polled.
As the story goes, George Gallup, a pioneer of modern polling, once told a woman who questioned his sampling methods — nobody she knew had ever been polled, she said — that being selected for a poll was about as likely as being struck by lightning.
It’s true. If a poll has a sample size of 1,000 — and many are smaller — that’s 1,000 people out of millions. Your chances of being chosen for that poll might be less than one-thousandth of a percentage point.
Dozens of polls have been conducted by DNC-approved organizations this year, but one-thousandth of a percentage point multiplied dozens of times is still a fraction of a percentage point. Your chances look marginally better if you count only the voting-age population or registered voters, but not better enough to be meaningful.
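That arithmetic is easy to check. The sketch below uses rough, assumed figures — a voting-age population of about 250 million and 50 qualifying polls — rather than anything sourced from the article:

```python
# Back-of-the-envelope odds of being chosen for a poll.
# Population figure and poll count are rough assumptions, not sourced data.
voting_age_adults = 250_000_000   # approximate U.S. voting-age population
sample_size = 1_000
qualifying_polls = 50             # "dozens" of polls

per_poll = sample_size / voting_age_adults
across_all = 1 - (1 - per_poll) ** qualifying_polls

print(f"One poll: {per_poll:.6%}")        # well under a thousandth of a percent
print(f"All {qualifying_polls} polls: {across_all:.4%}")
```

Even across all the qualifying polls combined, the odds stay a small fraction of a percentage point.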
So if nobody has ever polled you, that’s not surprising at all.
But that doesn’t make the polls invalid.
Without a statistical background, it can be hard to understand how such a tiny piece of the electorate can represent the whole. But it absolutely can.
“One of the first things that you learn in probability statistics is that the accuracy of your sample is based on the size of the sample itself and not the proportion of the population that it represents,” said Patrick Murray, director of the Monmouth University Polling Institute.
Imagine you wanted to know how many blades of grass were in your lawn. If you divided the lawn into 10,000 equally sized squares and counted the blades in 500 of those squares, you could extrapolate that number to the whole lawn. If you made 1 million squares, you could still extrapolate by picking 500 of them. The total number of squares on the lawn doesn’t matter; what matters is that you randomly choose 500 of them to sample.
Similarly, the total number of voters in the United States doesn’t matter. What matters is that each voter has an equal chance of being surveyed, even if that chance is infinitesimal.
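The lawn analogy can be simulated in a few lines. This sketch invents the blade counts, plants a million squares, randomly samples 500 of them, and extrapolates:

```python
import random

random.seed(7)

# A hypothetical lawn of 1,000,000 squares with made-up blade counts.
lawn = [random.randint(80, 120) for _ in range(1_000_000)]
true_total = sum(lawn)

# Randomly sample 500 squares and scale the sample mean up to the whole lawn.
sample = random.sample(lawn, 500)
estimate = sum(sample) / len(sample) * len(lawn)

error = abs(estimate - true_total) / true_total
print(f"True total: {true_total:,}  estimate: {estimate:,.0f}  error: {error:.2%}")
```

Despite sampling only 500 of a million squares, the estimate typically lands within a couple of percent of the true total — and it would do no worse if the lawn had ten million squares.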
Sample size does matter, of course, but only to a point. A 500-person sample is much better than a 200-person sample. But there isn’t much difference in accuracy between a 1,000-person sample and a 2,000-person sample, said J. Ann Selzer, president of the Iowa polling firm Selzer & Co.
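That diminishing return falls out of the standard margin-of-error formula, which shrinks with the square root of the sample size. A quick illustration, computed for an evenly split question at 95% confidence:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of sampling error for a proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Going from 200 to 500 respondents tightens the margin by about 2.5 points;
# going from 1,000 to 2,000 gains less than a single point.
for n in (200, 500, 1000, 2000):
    print(f"n = {n:>4}: +/- {margin_of_error(n) * 100:.1f} points")
```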
You can’t volunteer.
Random sampling is an essential part of polling. You can’t conduct a reliable poll with volunteers, because the people who take the initiative to volunteer won’t be representative of the full electorate.
“You can’t invite yourself to the party,” Selzer said, “because the people who would choose to invite themselves to the party may have some existing bias.”
Some pollsters have a database of registered voters and randomly select some of them. Others use random digit dialing, meaning they have a computer generate random phone numbers.
Online organizations like SurveyMonkey and YouGov operate a bit differently: They take samples from the pool of people who fill out their user-generated surveys, so people can put themselves in the running by taking such surveys. But they can’t make themselves be chosen: Within those pools, participants are still selected randomly, and the pools are still very large.
This month, Rep. Tulsi Gabbard of Hawaii — who ended up falling two polls short of qualifying for the September debate — urged her supporters to take SurveyMonkey and YouGov surveys in hopes of being chosen for an official poll. Her supporters appear to have gone a step further: Selzer said they had been calling her office, asking to be included in her polls, which are not conducted online.
There is no way to increase your odds of being chosen for one of Selzer’s polls, which have been among the most accurate in the country in past elections. They would be less reliable if there were.
Good polls are weighted, but only lightly.
Because demographics like race, sex, age and education level are associated with differences in political opinion, polls are more accurate when the demographic breakdown is accurate. If, for example, women or white people or baby boomers are over- or underrepresented in a sample, the poll results are likely to be skewed.
Pollsters address that problem through a process called weighting. If, say, college-educated men account for a larger percentage of the sample than they do of the full electorate, the pollster will count their responses a little less in the results.
In high-quality polls, these adjustments tend to be small, because a properly selected sample will have a close-to-accurate representation of most demographics to begin with. So a reputable organization would take a sample that was 54% women and weight it to match the percentage shown in the census, but a sample that was 75% women would be fatally flawed.
“If your original sample is way off on a certain demographic, no amount of weighting is going to help you,” said Doug Schwartz, director of the Quinnipiac University Poll. But if you choose your sample properly, he added, “you’re not going to be way off on any of the demographics.”
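A minimal sketch of that weighting step, using the article's 54%-women sample as the starting point — the electorate share and the support numbers below are invented for illustration:

```python
# Weight = population share / sample share, computed per demographic group.
sample_share = {"women": 0.54, "men": 0.46}        # what the poll drew
population_share = {"women": 0.52, "men": 0.48}    # assumed electorate

weights = {g: population_share[g] / sample_share[g] for g in sample_share}

# Hypothetical candidate support within each group.
support = {"women": 0.05, "men": 0.03}

unweighted = sum(support[g] * sample_share[g] for g in support)
weighted = sum(support[g] * sample_share[g] * weights[g] for g in support)

# With a close-to-accurate sample, the adjustment barely moves the topline.
print(f"Unweighted: {unweighted:.2%}  weighted: {weighted:.2%}")
```

Because the sample was only slightly off, the weighted result moves by a few hundredths of a point — exactly the kind of small correction a reputable poll makes.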
Every poll can’t be accurate. But averages usually are.
Many people are familiar with the margin of sampling error, which reflects the uncertainty involved in taking a sample of a larger population. A margin of error of plus or minus three percentage points means that a candidate’s actual support could easily be as much as three percentage points higher or lower than what the poll says.
That margin is tied to something called a confidence interval. Most polls report a 95% confidence interval, which means that 5% of the time, the interval won't contain the true value simply because of sampling error. And even that rate doesn't account for every possible polling problem, like the bias introduced when certain kinds of people don't pick up the phone.
“You cannot treat polls as crystal balls no matter how well they are done,” Murray said. “Statistics tell us that polls cannot be accurate 100% of the time.”
But while you shouldn’t put too much faith in a single poll, averages of many polls are pretty reliable.
To qualify for next month’s debate, candidates needed to register 2% support in at least four of the 21 polls that met the DNC’s criteria. A candidate supported by 2% of all Democratic voters very likely got less than that in some polls. But it would be exceedingly unlikely for them not to reach the bar in at least four.
“If you see in poll after poll that a candidate is not getting the 2%,” Schwartz said, “that will give you confidence that they’re really not at that 2%.”
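That intuition can be made concrete with a binomial model. The sketch below assumes, for simplicity, 21 independent polls of 1,000 respondents each, a candidate whose true support is exactly 2%, and no rounding in the published toplines — all simplifications, not the DNC's actual math:

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Chance a single 1,000-person poll shows a true-2% candidate at 2% or better
# (i.e., at least 20 of 1,000 respondents).
n, true_support = 1000, 0.02
p_single = 1 - sum(binom_pmf(k, n, true_support) for k in range(20))

# Chance of clearing 2% in at least 4 of 21 such polls.
polls = 21
p_qualify = 1 - sum(binom_pmf(k, polls, p_single) for k in range(4))

print(f"Single poll at 2%+: {p_single:.0%}")
print(f"At least 4 of 21:   {p_qualify:.2%}")
```

Under these assumptions, any single poll is close to a coin flip for a true-2% candidate, but the chance of missing the four-poll bar across all 21 is vanishingly small.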