Algorithmic Impacts of Jocks versus Nerds
A sports star calls out the negative consequences of infotainment
Sometimes confirmation bias comes from the oddest of places. Yesterday on the Pat McAfee Show, future hall of famer JJ Watt went on a long tirade against statistics and how they become self-fulfilling prophecies. For those readers who are not fans of American Football (I’m guessing that’s a lot of you), JJ Watt is one of the best defensive players of all time. I had the pleasure of watching him wreak havoc at the University of Wisconsin, carrying the team to the Rose Bowl and being named the team MVP in 2010. He then went on to dominate the NFL for 12 seasons.
Watt, now retired, has transitioned into the world of sports punditry. On McAfee’s show, he angrily reacted to a question about how Pro Football Focus (PFF) thinks CJ Stroud, the quarterback of Watt’s former team, the Houston Texans, stinks. It’s a bit ironic that PFF thinks so little of Stroud, whereas the rest of the sports babbling world thinks Stroud had the greatest rookie season ever.
PFF is a sports analytics company that produces rankings of all NFL players. On their website, they boast of “over 15 years as the world leader in advanced football data, tools, and analysis.” PFF arose out of the big data moment in the late aughts. At that point, people were becoming enamored with advanced statistics in sports, but football had proven hard to quantify. Then the NFL started releasing “coaches tape” for download.
The coaches tape shows a specific view of the field that isn’t particularly cinematic or entertaining but lets coaches see what every player is doing. Internet nerds started “charting” this tape as if they were coaches themselves. They would invent statistics and blog about their over-analyses.
This was all harmless fun, but because of the big data hype, media companies wanted in on these analytics so they could add data-backed infotainment to their broadcasts. PFF caught the attention of Sunday Night Football broadcaster Cris Collinsworth, and by 2016 the “PFF grades” were being shown on national television every Sunday.
Watt went on a long, expletive-laden rant about how these rankings were not only meaningless but harmful. For what it’s worth, Watt had some of the highest PFF scores of all time, and his off-the-charts performances forced the company to adjust their rating system.
Watt complains that PFF asserts that they know what should have happened in plays, even though they can’t actually know those counterfactuals. Football is complicated. In each play, each of the 22 players on the field has a particular idea of what should happen. But what actually happens and what should have happened always diverge. The two teams are trying to outsmart each other and disrupt the other’s plans. PFF doesn’t know what the plays are supposed to be, but they argue that they can infer what should have happened by watching enough plays. Statistics should suffice to fill in the gaps.
PFF has stats about “pass rush win percentage” that grade how frequently defensive players overpower offensive players to disrupt plays. They have stats about “interceptable passes,” arguing that some quarterbacks should have more interceptions than the numbers reveal. And from these different statistics, they think they can rank all of the players in the NFL.
Watt rejects all of this. If a player is always “winning” pass rushes, he should be getting a lot of sacks. If he’s not getting those sacks, then there’s probably something wrong with his technique. In the same way, if a player is throwing a lot of “interceptable passes” but somehow doesn’t have that many interceptions, maybe those balls weren’t actually catchable by the defensive backs.
PFF likes to think about everything in football as regression to the mean. They posit that there’s statistical luck in each play, and if a player is too lucky or unlucky, it will even out eventually. But football outcomes aren’t simple coin tosses. The variability and intricacies of different game plans make “any given Sunday” different from the one before it. The games change not only with the opponents and their plans but also as seasons progress and players get injured or dramatically improve. As my Philadelphia readers know too well, what happens at the beginning of a season can be completely non-predictive of what happens at the end.
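To make that contrast concrete, here’s a toy simulation. Every number in it is invented, and it isn’t anything PFF actually computes. Under a stationary “coin toss” model of pass rushing, hot streaks really do even out. But the moment the underlying process changes midseason, nothing regresses to anything, and the early-season numbers simply stop describing the late season.

```python
import numpy as np

rng = np.random.default_rng(0)
trials, rushes = 100_000, 30      # ~30 pass rushes per game (made-up number)

# Stationary "coin toss" model: a rusher wins 15% of his rushes,
# every game, for all 17 games.
wins = rng.binomial(rushes, 0.15, size=(trials, 17))

# Regression to the mean: the "luckiest" rushers through 4 games look
# perfectly ordinary over the remaining 13.
hot_start = wins[:, :4].sum(axis=1) >= 25     # arbitrary hot-start cutoff
print(f"rest-of-season win rate, everyone:   {wins[:, 4:].mean() / rushes:.3f}")
print(f"rest-of-season win rate, hot starts: {wins[hot_start, 4:].mean() / rushes:.3f}")

# But if the process itself changes midseason (injury, new blocking scheme),
# nothing "evens out": the old rate simply stops describing the new games.
late = rng.binomial(rushes, 0.08, size=(trials, 8))
print(f"win rate after the midseason change: {late.mean() / rushes:.3f}")
```

The point isn’t that PFF’s models are this crude; it’s that “it’ll even out” is only a theorem about processes that stay put.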
Now, if these stats were just fun for internet posters, this argument would be purely academic. But the statistics have real impact. These companies sell their stats to gamblers. I’m already not sure that this sort of profit-seeking is ethical. But Watt complains that since PFF’s grades air on NBC, their numbers and rankings shape public perception. This has downstream consequences. PFF rankings end up affecting which players get awards. And this then bleeds into future contract negotiations, affecting player compensation.
It’s a beautiful example of how statistics, even if they are meaningless, can have tangible negative consequences. Players have been complaining about advanced statistics since those stats first gained popularity on the internet, but this is the first time I’ve seen players call out the ramifications of media leaning too hard on stats randomly chosen by superfans. It struck me as incongruous to hear arguments from the ACM FAccT conference echoed on a popular sports broadcast. But perhaps that means we’re finally approaching the point where normal people are not only growing tired of the data nerds but are worried that their net impact is negative.
So you're definitely agreeing with Watt that these stats are meaningless? I have to say ... I don't know!
I'd love to get your thoughts on a couple specific examples.
Imagine a defensive player P who is extremely fast and strong, and gets a lot of sacks. The offense adjusts by consistently double-teaming P. As a result, P gets very few sacks. It seems likely to me that PFF can and does observe this, and they ought to grade P highly because of it. It might not show up in P's team winning more, because maybe the rest of P's team isn't very good, or maybe the number of wins is just too noisy (it's important to note that football has an order of magnitude fewer games per season than baseball or basketball), but there is some real sense in which P is doing a better job than a different defensive player P' who's only getting single-teamed but posting the same numbers.
I'm also interested in your thoughts on the "interceptable passes" stat. This passes my smell test: NFL receivers are good but not arbitrarily good, and how often they catch a pass "to them" sure looks like it's related to how accurate or tight the pass is. It seems quite plausible to me that someone who watches a lot of football and says "That pass wasn't intercepted because it was nowhere near the defensive player, but *that* pass wasn't intercepted only because the offense got lucky" is saying something that isn't total nonsense. (I guess they're saying "Basically no passes that look like the first one get intercepted, but a good fraction of passes that look like the second get intercepted.") Combine this with, again, the relatively small numbers involved in NFL football, which imply that even if we were observing some perfect random process we'd expect a lot of variance around the infinite-population average over the course of a season, and I can easily imagine this stat is doing something that points in the direction of meaningful.
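Just to put numbers on that small-sample point, here's a quick back-of-the-envelope simulation (every rate is made up for illustration) of how much a season's interception count, or a team's win total, can wander around its long-run average even when the underlying process is a perfectly boring coin flip:

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 100_000

# Interceptions: say a QB throws 550 passes, 6% of them are "interceptable,"
# and defenders convert those chances 35% of the time. All rates invented.
picks = rng.binomial(550, 0.06 * 0.35, size=trials)
print(f"interceptions: mean {picks.mean():.1f}, sd {picks.std():.1f}, "
      f"90% of seasons between {np.percentile(picks, 5):.0f} and {np.percentile(picks, 95):.0f}")

# Wins: a team that "should" win 60% of its games, over a 17-game season.
team_wins = rng.binomial(17, 0.60, size=trials)
print(f"wins: mean {team_wins.mean():.1f}, sd {team_wins.std():.1f}, "
      f"90% of seasons between {np.percentile(team_wins, 5):.0f} and {np.percentile(team_wins, 95):.0f}")
```

In this toy model, a "true" 11-interception quarterback can plausibly post anywhere from roughly 6 to 17 picks in a season, and a "true" 10-win team routinely finishes a few games above or below that, which is why I'm not ready to dismiss a stat just because it disagrees with one season's raw counts.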
Or maybe it's all garbage!
I've watched lots of Texans games (big Houston sports fan here), and I can only imagine CJ Stroud has a low PFF score because he throws really risky balls that somehow always end up landing perfectly in the WR's hands. It's what makes him so special: he makes low-to-medium risk, high reward plays that for 99% of QBs would be high risk, high reward plays. Some are crazy behind-the-shoulder throws that are inches away from the CB's or safety's hands, but no cigar. And he does this on a really consistent basis. This is where I suppose the eye test beats out statistics, and statistics just can't capture the nuances of how good CJ Stroud is.