This post is written by Ben Recht and Leif Weatherby.
Ben was briefly quoted in the New York Times this morning, arguing that election forecasting is “more akin to astrology than meteorology.” Given the lack of space, the Times didn’t provide his justification for such an assertion. Serendipitously, we had already written a piece summarizing our dissatisfaction with polls. Before we all move on to Thanksgiving and forget about the ridiculousness of presidential horseraces for two years, let us flag a few points about the absurdity of polls and the media that demands them.
2024 was the year the election forecasters gave up. On the Monday before the election, the New York Times polling average showed Donald Trump and Kamala Harris within one point of each other in six critical swing states. They put the final popular vote prediction at 49-48 in favor of Harris. Effectively, a tie. Poll aggregator Real Clear Politics split the difference even finer, predicting the result 48.5-48.5. Poll forecaster Nate Silver put the probability of either candidate winning at exactly 50-50. Whatever happened, it was supposed to be razor thin.
We now know that these polls were hiding a clear Republican sweep of the presidency, the House, and the Senate. As political scientist Tom Wood showed in a devastating graphic, forty-seven states trended more Republican than every polling average, meaning that Trump has now been systematically underestimated in every election he’s run in. The ambitious program to roll back the administrative state got a democratic mandate from the most diverse coalition a modern Republican has ever amassed. The result was that, in a “close” race, he won every swing state. That stark truth seems like precisely the sort of thing the prognosticators should have been able to tell us, at least in the aggregate. Instead, the Republicans defeated the pollsters.
Polls attempt to divine big-picture answers about the sentiment of millions of people from the responses of a vastly smaller group, many of whom aren’t especially eager to tell the truth about their opinions. The internet, ubiquitous cell phones, and the widely varying use of technology among different age demographics have all contributed to the problem, thwarting techniques honed at a time when landlines were in every American household.
With response rates in the single digits, pollsters are now forced to apply “statistical corrections,” backed by a series of guesses about the pre-existing beliefs and tendencies of the populace. Let us make it concrete through a somewhat caricatured example. A pollster calls thousands of numbers until they get 800 people to actually talk to them for 20 minutes. 600 of these are Democrats, and they all say they will vote for the Democratic candidate. 200 of these are Republicans, and they all say they will vote for the Republican candidate. The pollster looks at their data and thinks there’s no way the electorate will be 75% Democratic. They reweight their sample to look like 50-50 and declare the election a tie.
This last step has a technical name in statistics: weighting. The rest of us call it guessing. Low response rates and statistical corrections make polling into a special kind of obfuscated punditry, undermining its claim to neutral objectivity and rendering it useless.
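To make the arithmetic of that caricature explicit, here is a minimal sketch in Python. The counts are the made-up ones from the example above, and the 50-50 electorate is the pollster’s assumption, not a measurement:

```python
# Made-up poll from the caricature above: 800 respondents,
# every Democrat backs the Democrat, every Republican the Republican.
respondents = {"D": 600, "R": 200}
dem_support = {"D": 1.0, "R": 0.0}  # share of each group voting Democratic

# Raw estimate: the unadjusted sample proportion.
total = sum(respondents.values())
raw = sum(n * dem_support[party] for party, n in respondents.items()) / total
print(f"raw: {raw:.0%} Democratic")  # 75% -- an implausible electorate

# Weighted estimate: the pollster *assumes* a 50-50 electorate and
# reweights each group's answers to match that guess.
assumed_mix = {"D": 0.5, "R": 0.5}  # the guess doing all the work
weighted = sum(assumed_mix[party] * dem_support[party] for party in respondents)
print(f"weighted: {weighted:.0%} Democratic")  # 50% -- a "tie"
```

The entire move from 75 to 50 comes from `assumed_mix`, which is exactly the guessing in question.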
Given this statistical mess, we need to ask what these polls are supposed to do in the first place. What is the value of pollsters harassing swing state voters? The polling industry came into existence about a century ago with high ideals. At the beginning stands the journalist and commentator Walter Lippmann, whose 1922 work, Public Opinion, provided the blueprint for modern polling and kicked off a century of political data collection. Lippmann saw surveys and polls as a means to inform citizens and politicians about their shared society, giving them ways to act rationally by knowing the viewpoints of their fellow humans. Lippmann thought he was proposing a tool for a better democracy. He pointed out that the sheer size of the United States made it necessary to create a social scientific lens to see what the public really believes or wants.
But at some point, the polls became the point. In the 1990s, with Bill Clinton’s “third way” and his numbers gurus like Dick Morris, politics was gamified, reduced to a wild-goose chase after micro-segments of the population and their allegedly isolated attachment to fragmented issues, like abortion or guns. Lippmann hoped data would help the population see itself. The metastasized data science inspired by Clintonism instead let politicians see what they wanted to see about their citizenry. It created a video game version of politics that targeted a simulation of reality. This warped mindset has ballooned in the intervening decades, and this election seems like its logical end. By fitting politics to up-to-date data about public opinion, Democrats believed they could outflank their innumerate Republican counterparts and – it was often said – effectively remain in power continuously in one way or another. Given the outcomes of the last decade, it’s time for a gut renovation of the relationship between politics and polling.
Democrats’ belief in numbers mirrors the broader triumph of quantification for quantification’s sake in the professional class and tech elite. We think of this as “argument by numbers,” a ubiquitous rhetorical style used across industry, sports, and politics, in which only opinions or decisions backed by “data” are seen as serious or actionable. But “Data is better than no data” is a vacuous assertion. We deceive ourselves with a data fetish, letting numbers provide the simulacrum of objectivity rather than the real thing. The information we get from pollsters is simply not the type we need to do what Lippmann imagined: support substantial politics or enlighten the electorate. Instead of serving this high-minded function, polls have become infotainment and anxiety control, a surrogate for building coalitions and fighting for what we believe. Polling as a social process is irretrievably broken.
This is why the survey industry needs a tear-down. For too long, its promises have received little scrutiny. If you tell pollsters that they were wrong, they invariably respond with a condescending technical explanation beyond the limits of your energy and patience. But make no mistake: the industry is in a crisis that can’t be fixed with better regression analysis. This latest polling failure demonstrates that it’s time for liberal wonks to come around.
It’s time to turn back the tide and reclaim a politics unbeholden to these false spreadsheet idols. For too long, data science has prevented Democrats from engaging in self-reflection about their technocratic failures. Strategy by numbers punts on the actual project of politics—persuading voters that your vision is best for the country—and buries the crucial work of civic coalition-building in a margin of error. College-educated urbanites need to find some way to connect with the rural working class. The left and the right need some way to understand each other and voters without recourse to faulty data analysis that persistently leads us down the wrong path. Data actively prevents us from having these much-needed conversations and encounters. If a response to Trump is on the 2028 horizon, quantitative wonks have little to contribute to that fight. Which means it’s time to put an end to the Era of the Pollster.
Interesting piece!
I'm a little unclear why we'd call this a polling failure. Nate Silver did in fact suggest that the single most likely outcome was Trump winning every swing state (https://www.newsweek.com/donald-trump-kamala-harris-polls-swing-states-1974158). You seem to be arguing that because Silver and other pollsters predicted the election would be very close, one party winning all the swing states indicates a "failure", but Silver at least clearly understood that the biggest source of randomness was the latent bias the polls had no way of capturing, and expected that bias to be strongly correlated across swing states.
I'd also object to the sentence "The ambitious program to roll back the administrative state got a democratic mandate from the most diverse coalition a modern Republican has ever amassed." The popular vote margin here was about 1.6%, which makes it the tightest popular vote margin since 2000 and the second tightest since 1970 (https://en.wikipedia.org/wiki/List_of_United_States_presidential_elections_by_popular_vote_margin; the 1960s were tighter). The Republicans are of course turning this into a "broad mandate" claim, but it's nothing like (say) 1984, when Reagan won the popular vote for his second term by 18%.
There's something important and correct about what you're saying, which is broadly that these polls aren't very useful and we've turned them into a big stupid circus. Ultimately, noisy measurements of a number (the percentage of Republican votes) are pretty low value when all we care about is one bit of information (whether that number is above the 50% threshold) and the measurement noise is large compared to the distance to the threshold; that's the situation we were in for this race. And the right answer is to pay less attention to them, not to try to read the tea leaves. I certainly agree this isn't supporting substantive politics or an informed electorate.
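To put a rough number on that (my illustration, with made-up noise levels): model the poll as an unbiased normal measurement of the true vote share and ask what it tells you about the one bit we care about.

```python
from math import erf, sqrt

def p_above_threshold(estimate, noise_sd, threshold=50.0):
    """P(true share > threshold) if the measurement error is normal."""
    z = (estimate - threshold) / noise_sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# A one-point lead is nearly decisive when the noise is small...
print(p_above_threshold(51.0, noise_sd=0.5))  # ~0.98
# ...and close to a coin flip when the noise dwarfs the lead.
print(p_above_threshold(51.0, noise_sd=3.0))  # ~0.63
```

With polling errors of a few points and a race decided by one or two, the measurement barely moves you off 50-50, which is exactly the situation you describe.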
I'm ultimately agnostic on your claim that weighting renders polling useless. Again, at the object level, Silver suggested the actual outcome as most likely, exactly through his fancypants weighting process. (I don't think he or Gelman would actually make a claim of "neutral objectivity"; I think they'd agree they're engaging in an at-best semi-scientific guessing process?) In this case, the weighting doesn't get you a clear answer, but it's not obviously not working; it seems plausible to me that if the election were a little less close some of these methods might "amplify signal"? But we of course have no way of knowing.
Certainly I agree that everyone is paying way too much attention to the polls.
Annoying that the NYT article implies that the point of election forecasting is to get a probability of winning: " forecasts go a step further, analyzing the polling and other data to make a prediction about who is most likely to win, and how likely."
I doubt many forecasters would agree that this is their goal. It's more about predicting vote margins, trying to get insight into how close things might be across different locations. The forecasters I know are hesitant to even provide probability-of-win information because it's so easy for people to lose sight of the uncertainty. E.g., a vote share prediction that's off by half a percentage point can change the probability of winning by six percentage points, which seems like a lot! But half a percentage point would not at all be surprising given what we know about sources of uncertainty. (Example from here: http://www.stat.columbia.edu/~gelman/research/published/jdm200907b.pdf)
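For concreteness, that sensitivity is easy to reproduce with a toy normal forecast (the ~3-point error scale below is my own illustrative assumption; see the linked paper for the real analysis):

```python
from math import erf, sqrt

def win_prob(predicted_share, error_sd=3.2, threshold=50.0):
    """P(win) from a normal forecast of the two-party vote share."""
    z = (predicted_share - threshold) / error_sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Nudging the vote-share forecast by half a point...
print(win_prob(50.0))  # 0.50
print(win_prob(50.5))  # ~0.56 -- the win probability moves ~6 points
```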
I find election forecasts interesting because the demand is so out of whack with what they could actually provide. I think it is possible to learn a lot from them if you take them less as oracles and more as tools for thinking. They offer a way to explore different possible outcomes under different assumptions about sources of error, giving some people a way to engage more than they would with politics. But, that's just not what most people want from them, and I doubt it ever will be. So I agree they are kind of doomed to failure, but it's not necessarily because they have no potential to be useful.