Across the board, public opinion surveys overstated Democratic support in the 2020 election.
This is a disaster for the polling industry and for media outlets and analysts that package and interpret the polls for public consumption, such as FiveThirtyEight, The New York Times’ Upshot, and The Economist’s election unit. They now face serious existential questions. But the greatest problem posed by the polling crisis is not in the presidential election, where the snapshots provided by polling are ultimately measured against an actual tally of votes: As the political cliché goes, the only poll that matters is on Election Day. The real catastrophe is that the failure of the polls leaves Americans with no reliable way to understand what we as a people think outside of elections—which in turn threatens our ability to make choices, or to cohere as a nation.
Some pollsters have come to believe that distrust of institutions is more pervasive than anticipated across many voter groups, and that it leads conservative voters, even those with college degrees and urban addresses, to avoid participating in polls in disproportionate numbers. If so, they argue, the problem likely can’t be corrected simply by adding more members of any one demographic group to a polling sample.
“I readily admit that there were problems this year, but it is too soon to know the extent of the problems, or what caused them,” said Courtney Kennedy, who supervises poll methodology for the Pew Research Center. In both 2016 and 2020, she said, “there was a widespread overstatement of Democratic support,” but the causes this time around could be different.
Without doing a deep dive, here’s the nutshell answer. Not everyone answers the telephone when pollsters call, and the people who do answer aren’t a random sample. Pollsters correct for this by using a model to reweight the sample so that it matches the actual electorate.
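The reweighting step can be sketched in a few lines. This is a minimal illustration, not any pollster’s actual method, and every number below (the groups, their shares, and their candidate support) is made up for the example:

```python
# Sketch of post-stratification reweighting with hypothetical numbers.

# Share of each group among the people who actually answered the poll
sample_share = {"college": 0.60, "non_college": 0.40}

# Share of each group in the modeled electorate (e.g., from a turnout model)
electorate_share = {"college": 0.40, "non_college": 0.60}

# Each respondent's weight scales their group up or down to match the model
weights = {g: electorate_share[g] / sample_share[g] for g in sample_share}

# Suppose 70% of college and 40% of non-college respondents back Candidate A
support = {"college": 0.70, "non_college": 0.40}

# The published topline is the model-weighted average, not the raw average
estimate = sum(electorate_share[g] * support[g] for g in support)
print(round(estimate, 3))  # 0.4*0.7 + 0.6*0.4 = 0.52
```

The raw sample here would have shown 58 percent support; the weights pull it down to 52 because college-educated respondents were overrepresented among those who picked up the phone.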
But in recent years this problem has become acute: response rates to polling calls have plummeted to around 5 percent. That produces a massively lopsided sample, which in turn puts enormous pressure on the model weights to correct things. It has gotten to the point where everything depends on the accuracy of the model, and if the model is off, the polling numbers are worthless. This produces something of a tautology: the goal of the model is to emulate the “real” electorate, but there’s never any way to be sure you’ve done that, since the real electorate is exactly what you’re trying to measure in the first place.
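That model dependence is easy to demonstrate. In the hedged sketch below (all numbers again hypothetical), the same poll responses yield different toplines depending solely on what share of the electorate the model assumes is college-educated:

```python
# Sketch: the same poll data under two different electorate models.
# All group shares and support figures are hypothetical.

def weighted_estimate(electorate_share, support):
    """Topline implied by a given assumed electorate composition."""
    return sum(electorate_share[g] * support[g] for g in support)

support = {"college": 0.70, "non_college": 0.40}  # made-up subgroup support

# Model A assumes a 40% college-educated electorate; Model B assumes 45%.
a = weighted_estimate({"college": 0.40, "non_college": 0.60}, support)
b = weighted_estimate({"college": 0.45, "non_college": 0.55}, support)

print(round(a, 3), round(b, 3), round(b - a, 3))
```

A five-point error in the assumed composition shifts the published number by a point and a half, with no change in what any respondent said. And because the true composition is only known after the election, the pollster can’t check the assumption in advance.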
In 2016 the models failed to adjust properly for educational levels among likely voters. In 2020 pollsters corrected for that but evidently failed to account for something else. Eventually they’ll figure out what it was. But there’s no guarantee they’ll get it right in 2024, which might present some entirely different problem.