Late in the fourth quarter on Monday Night Football, the Baltimore Ravens found themselves deep in a two-touchdown hole against the Detroit Lions. With only a minute thirteen left to go, superhero quarterback Lamar Jackson completed two miracle passes and ran the ball into the end zone to bring the Ravens within eight. And now, the three American football fans (the sport, not the band) who read this Substack of course know what happened… The Ravens attempted a two-point conversion.
They failed, though the attempt was negated by offsetting penalties. Given a second chance, the Ravens tried to convert again, and Jackson missed a wide open receiver. They then attempted and failed an onside kick, losing one of the best games I’ve watched this year by a score of 38-30.
As soon as Ravens coach John Harbaugh started signaling for the two-point attempt, hall-of-fame quarterback Peyton Manning was exasperated. On the wonderfully bizarre nationally televised Zoom call with his brother Eli, he conceded, “I don’t know why you wouldn’t just kick the point here, E. These analytics go over my head.”
Me too, Peyton! Me too. I’ve written about this particularly puzzling gamble before. To review, the main idea is that two-point conversions succeed about half the time. Extra points succeed over 95% of the time. All extra point attempts and two-point conversions are independent, identically distributed stochastic events. Since you need to score two touchdowns anyway, converting now means a late extra point can win the game outright, while a miss still leaves you a two-point try to tie, so you are more likely to win if you go for two. If you do some back-of-the-envelope scribbling, the chances of winning, assuming that you recover the onside kick and score another touchdown, are 61% if you go for two, but only 46% if you kick the extra point. You can read more about the rest of the mathematical sophistry behind the strategy in this Seth Walder article.
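If you want to see where those numbers come from, here’s a toy Python sketch of the arithmetic. Everything in it is an assumption lifted from the paragraph above: 50% two-point tries, 95% extra points, coin-flip overtime, and a guaranteed second touchdown. The function names and structure are mine, not anyone’s official win-probability model.

```python
# A back-of-the-envelope model of the trailing team's win probability,
# assuming 50% two-point tries, 95% extra points, a coin-flip overtime,
# and that the team gets the ball back and scores a second touchdown.

def win_prob_go_for_two(p2=0.50, xp=0.95, ot=0.50):
    # Convert now: a make means an extra point wins it after the next TD;
    # a miss means you need a second two-pointer just to force overtime.
    make = p2 * (xp + (1 - xp) * ot)
    miss = (1 - p2) * (p2 * ot)
    return make + miss

def win_prob_kick_first(p2=0.50, xp=0.95, ot=0.50):
    # Kick now: two made extra points only get you to overtime;
    # a missed first kick forces a two-pointer later just to tie.
    both_kicks = xp * (xp * ot)
    missed_first = (1 - xp) * (p2 * ot)
    return both_kicks + missed_first

print(f"go for two first: {win_prob_go_for_two():.0%}")      # ~61%
print(f"kick the extra point: {win_prob_kick_first():.0%}")  # ~46%
```

Note that the whole argument rides on that 50% input. Hold that thought.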
Rather than rehashing my critique of Walder’s silly argument here, I want you to note how I slipped a patently false offhand remark into that summary: “All extra point attempts and two-point conversions are independent, identically distributed stochastic events.” In my previous two-point post, I explained why this wasn’t remotely true. But when I looked to write a fresh complaint about this analytics-brained strategy, I found something even more interesting. It’s not clear that two-point conversions are remotely well modeled as 50-50 coin flips.
If we grant the analytics nerds the premise that we can treat football like an actuarial problem, then the events they study must have temporally stable frequencies. A strategy only works if the odds of a conversion today are the same as the odds of a conversion tomorrow. How could we, um, test whether two-point attempts are well-modeled as independent random events?
Aha! We get to play with my favorite mathematical object, the statistical significance test (lol, sarcasm). Here’s a case where we have a clear statistical null hypothesis: the rate of two-point conversions is stable over time. Let’s find out if this hypothesis holds up to severe testing.
In 2024, there were 55 successful conversions out of 135 attempts. The success rate was 40%. Oh yikes. That’s a big gap. A fifteen-percentage-point difference takes the “chances of winning” from 60% down to about 50%, the same “chances” as just kicking the extra point.
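Plugging that 2024 rate into my toy model from above (my sketch, remember, same assumptions otherwise) shows the edge evaporating:

```python
# Same toy model as above, with the 2024 conversion rate plugged in.
p2, xp, ot = 0.40, 0.95, 0.50
go_for_two = p2 * (xp + (1 - xp) * ot) + (1 - p2) * (p2 * ot)
print(f"go for two first at a 40% conversion rate: {go_for_two:.0%}")  # ~51%
```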
Analytics folks might complain, “Those sample sizes are too small to tell if there’s a real difference. When you flip 135 coins, it’s not that unlikely that you’ll see only 55 heads!” I mean, this is false. The odds of seeing 55 or fewer heads in 135 coin flips are more than 50 to 1.
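Don’t take my word for it; the binomial tail is a one-liner (scipy here, but any binomial calculator works):

```python
# Probability of 55 or fewer heads in 135 fair coin flips.
from scipy.stats import binom

p_tail = binom.cdf(55, 135, 0.5)
print(f"P(X <= 55) = {p_tail:.3f}")                          # about 0.02
print(f"roughly {(1 - p_tail) / p_tail:.0f} to 1 against")   # ~50 to 1
```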
They might then say, “Well, it’s not exactly a coin flip, but there’s some stable distribution from year to year that helps us inform strategy.” That doesn’t hold water either. I can play yet another statistics game and test whether the difference between the years is statistically significant. Under a two-tailed proportions z-test, the p-value is less than 0.02. With that p-value, I could publish a paper causally attributing the drop in two-point conversion success rate to analytics people overhyping the value of two-point attempts. Sloan Conference, here I come.1
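For the curious, the test itself is nothing fancy. Here’s a sketch of a two-tailed two-proportion z-test; the 2024 counts are the ones above, but the earlier-season line in the example call is a stand-in I made up purely to show the mechanics, so the printed p-value is illustrative rather than the one I’m quoting.

```python
# Two-tailed z-test for equality of two proportions (pooled standard error).
from math import sqrt
from scipy.stats import norm

def two_prop_ztest(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * norm.sf(abs(z))

# 55 of 135 in 2024 (real), versus a made-up earlier-season line of 60 of 110.
# Swap in the actual counts to reproduce the p-value quoted above.
z, p = two_prop_ztest(60, 110, 55, 135)
print(f"z = {z:.2f}, two-tailed p = {p:.3f}")
```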
For those who are not longtime argmin readers, know that I don’t believe in this sort of statistical testing or causal inference at all. I only bring it up because sports analytics nerds want to play math games, and I can play them too! I think we’re better off not playing statistical games, but live by the statistics sword, die by the statistics sword. Analytics is predicated on the world being modelable like a casino. Statistical testing suggests it’s not. Moreover, we know that as soon as you identify a statistical quirk in a complex game like football, teams can adapt to neutralize the strategy. Everyone knows the game adapts from year to year. Analytical edges can only exist for short blips in time.
Football analytics is a rich source of statistical abuse and misuse. It’s baffling to me that the analytics crowd thinks the same rules that apply to blackjack can be applied to as dynamic and complex a game as football. They want you to believe that pure rational thought can devise optimal strategies in a game where big angry dudes run into each other for three hours. But the math doesn’t check out.
Anyway, somebody tell Mina Kimes.
1. Analytics folks might want to watch the percentages in 2025, where the success rate on the 22 attempts so far has been only 32%.