Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes, I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions. On a case-by-case basis, it's easy to find reasons why any given player is going to buck the trend and sustain production. So I constrain myself and remove my ability to rationalize on a case-by-case basis.
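For readers who like to see the mechanics, here is a minimal sketch of that selection process. The player names and numbers are invented for illustration, and the fixed `group_size` is an assumption; the actual cutoffs vary by prediction.

```python
from typing import NamedTuple


class Player(NamedTuple):
    name: str
    metric: float  # statistic under test, e.g. yards per carry
    points: float  # fantasy points per game so far


def split_groups(players, group_size):
    """Rank by metric: the top `group_size` form Group A, the bottom `group_size` form Group B."""
    if group_size < 1 or 2 * group_size > len(players):
        raise ValueError("group_size must be positive and the groups must not overlap")
    ranked = sorted(players, key=lambda p: p.metric, reverse=True)
    return ranked[:group_size], ranked[-group_size:]


def average_points(group):
    """Average fantasy points per game across a group."""
    return sum(p.points for p in group) / len(group)


# Invented numbers, purely for illustration.
players = [
    Player("Back 1", 5.8, 15.0),
    Player("Back 2", 5.1, 13.0),
    Player("Back 3", 4.4, 11.0),
    Player("Back 4", 4.0, 10.0),
    Player("Back 5", 3.6, 9.0),
    Player("Back 6", 3.1, 8.0),
]

group_a, group_b = split_groups(players, group_size=2)
a_avg = average_points(group_a)
b_avg = average_points(group_b)

# The prediction is only made once Group A has verifiably outscored Group B to date.
prediction_applies = a_avg > b_avg
print(f"Group A: {a_avg:.1f} ppg, Group B: {b_avg:.1f} ppg, prediction applies: {prediction_applies}")
```

The point of sorting on the metric alone is that nothing else about a player can sneak into the decision: whoever lands at the top goes into Group A, no exceptions.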
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared. Here's a similar list from 2017.
The Scorecard
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I explained why touchdowns follow yards (but yards don't follow back), and predicted that the players with the fewest touchdowns per yard gained would outscore the players with the most touchdowns per yard gained going forward.
In Week 5, I talked about how preseason expectations still held as much predictive power as performance through four weeks. No specific prediction was made.
In Week 6, I talked about why quarterbacks tended to regress less than other positions but nevertheless predicted that Patrick Mahomes II would somehow manage to get even better and score ten touchdowns over the next four weeks.
In Week 7, I talked about why watching the game and forming opinions about players makes it harder to trust the cold hard numbers when the time comes to put our chips on the table. (I did not recommend against watching football; football is wonderful and should be enjoyed to its fullest.)
In Week 8, I discussed how yard-to-touchdown ratios can be applied to tight ends, but the players most likely to regress positively were already the top performers at the position. I made a novel prediction to try to overcome this quandary.
In Week 9, I discussed several of the challenges in predicting regression for wide receiver "efficiency" stats such as yards per target. No specific prediction was made.
In Week 10, I proposed a "leaderboard test" to quickly tell whether a statistic was noisy (and more prone to regression) or stable (and less prone to regression). I illustrated this test in action and made another prediction that yards per carry would regress.
In Week 11, I mentioned that many unexpected things were at the mercy of regression to the mean, highlighting how the average age of players at a given position tends to regress over time as incoming talent ebbs and flows.
In Week 12, I predicted that because players regress, and units are made up of players, units should regress, too. I identified the top five offenses, bottom five offenses, top five defenses, and bottom five defenses, and predicted that after four weeks those twenty units would collectively be less "extreme" (defined as closer to league average). Because offense tends to be more stable than defense, I added a bonus prediction that the defenses would regress more than the offenses.
| Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
|---|---|---|---|
| Yards per Carry | Group A had 20% more rushing yards per game | Group B has 30% more rushing yards per game | Success! |
| Yard:Touchdown Ratio | Group A had 23% more points per game | Group B has 47% more points per game | Success! |
| | Mahomes averaged 2.2 touchdowns per game | Mahomes averages 2.0 touchdowns per game | Failure |
| Yard:Touchdown Ratios | Group B had 76% more points per game | Group B has 146% more points per game | Success! |
| Mahomes TDs Redux | Mahomes averaged 2.2 touchdowns per game | Mahomes averages 2.3 touchdowns per game | 1 |
| Yards per Carry Redux | Group A had 22% more rushing yards per game | Group B has 12% more rushing yards per game | 1 |
| "Extreme" performance | "Extreme" units were ~6.4 ppg from average | "Extreme" units are 95% as "extreme" | 3 |
| Defense vs. Offense | | Defenses have regressed 4% more than Offenses | 3 |
With Mahomes on bye, our touchdown prediction earned a one-week reprieve. He will need to reach the end zone three times next week to earn a victory for regression.
A second straight strong showing from our Group A backs puts us as close to a failed yards per carry prediction as we've ever been since this column's inception. We're still... not all that close. If Group B maintains its 70 yards-per-game average next week, Group A backs will collectively need to average 87 yards per game to take the lead, or 91 yards per game if Matt Breida misses another contest. It's firmly within the realm of possibility, but still looking like a longshot.
The most extreme units actually managed to get even more extreme after one week. Baltimore's offense went from 11.0 points better than average to 12.7, New England's defense went from 9.2 to 10.8, and Chicago's defense went from 4.9 points worse than average to 5.9 points worse than average. (The fourth "extreme" unit, Arizona's defense, improved slightly from 4.3 points worse than average to 3.9 points worse than average.) But this is why we use larger samples; the entire sample of offenses went from 6.8 points away from average down to 6.6 points away from average, while the defenses went from 6.0 to 5.6.
The most interesting change came from the Kansas City Chiefs, whose offense fell from 6.4 points better than average down to 5.8 points better than average despite the team being on bye. This is because many of the defenses Kansas City played earlier in the year played poorly in Week 12, so Kansas City's performance looks less impressive in hindsight.
Misconceptions in our Perceptions of Interceptions
I discussed earlier this year how most quarterback stats are stickier (more predictive) than stats at other positions, with one giant exception: interceptions. We don't talk about interceptions much in fantasy because in most scoring systems they're functionally irrelevant. Some leagues don't penalize them at all, while most others only give -1 or -2 points apiece.
But while they're not especially fantasy relevant, they play a massive role in determining which quarterbacks we think are good and which we think are bad. And this is a shame because study after study tells us that interceptions are the single noisiest quarterback stat we have and it's not really close. If you use the "leaderboard test" we discussed a few weeks ago on single-season interception rate you'll find a list pretty evenly split between all-time greats (Tom Brady, Johnny Unitas, Fran Tarkenton), solid starters (Phil Simms, Roman Gabriel, Ken O'Brien, Steve Bartkowski), and journeymen (Steve DeBerg, Brad Johnson, Nick Foles, Brian Griese, David Garrard).
Fellow Footballguy Danny Tuccitto once calculated how many attempts a player needed before that player's "per-attempt" statistics were 50% the result of the player's underlying skill and 50% the result of random chance. Yards per attempt stabilizes in just 396 attempts, which means midway through the season we probably have a pretty good idea of which differences are meaningful. Interception%, on the other hand, takes 1,681 pass attempts to stabilize. To put this into context, yards per carry (my absolute favorite statistical punching bag) stabilizes in 1,978 carries.
More importantly, that's not 1,681 career attempts, that's 1,681 attempts in the same system and with (largely) the same supporting cast. So while Jameis Winston has 2,356 career pass attempts, he's nowhere near the point where we can say that his career interception percentage is mostly representative of his true level of play because he only has 434 attempts in his current system. Which seems like an important point to make given that Jameis Winston has become the avatar for otherwise good quarterbacks who just can't stop throwing interceptions. (Not without reason.)
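One way to make those stabilization numbers concrete is the common shrinkage reading, where the weight on a player's observed rate is n / (n + k), with n the attempts so far and k the stabilization point. This is a rough sketch under that assumption; it is not necessarily the exact method behind the figures above, and the function names are made up for illustration.

```python
def skill_weight(attempts: int, stabilization_point: int) -> float:
    """Weight on the observed rate, assuming the n / (n + k) shrinkage form.

    At attempts == stabilization_point the weight is exactly 0.5,
    matching the "50% skill, 50% chance" definition.
    """
    return attempts / (attempts + stabilization_point)


def regressed_rate(observed: float, league_average: float,
                   attempts: int, stabilization_point: int) -> float:
    """Blend the observed rate with the league average using that weight."""
    w = skill_weight(attempts, stabilization_point)
    return w * observed + (1 - w) * league_average


# Figures quoted in the article.
YPA_STABILIZATION = 396      # yards per attempt
INT_STABILIZATION = 1681     # interception percentage
WINSTON_ATTEMPTS = 434       # attempts in his current system

ypa_weight = skill_weight(WINSTON_ATTEMPTS, YPA_STABILIZATION)
int_weight = skill_weight(WINSTON_ATTEMPTS, INT_STABILIZATION)

print(f"Weight on observed yards per attempt: {ypa_weight:.0%}")
print(f"Weight on observed interception rate: {int_weight:.0%}")
```

Under this reading, the same 434 attempts put a bit more than half the weight on Winston's observed yards per attempt but only about a fifth on his observed interception rate, with the rest going to the league average.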
I feel quite confident that Jameis Winston's "true" interception rate is higher than, say, Aaron Rodgers' or Russell Wilson's. But I also feel confident that the difference isn't anywhere near as extreme as the results to this point would suggest. So far, Winston is on pace to be the first player with 29 interceptions since Brett Favre back in 2005 (and the second since Vinny Testaverde in 1988). Over the last decade, only Eli Manning has thrown 25 interceptions in a season (2010 and 2013). Winston has as many interceptions as the six best teams combined. That's not going to hold up!
So let's put a prediction on paper and see what happens.
As of today, Green Bay, Kansas City, Minnesota, Seattle, Jacksonville, Arizona, Oakland, Tennessee, New Orleans, New England, Baltimore, Denver, Philadelphia, Chicago, Indianapolis, Houston, Buffalo, and Detroit have combined to throw 106 interceptions. This is our Group A.
As of today, Tampa Bay, Miami, the Los Angeles Chargers, the Los Angeles Rams, Cleveland, the New York Jets, Pittsburgh, Atlanta, and Washington have combined to throw 122 interceptions. This is our Group B.
To this point Group B has thrown 15% more interceptions than Group A. But there are a lot more teams in Group A, and when everyone's interception rate regresses going forward, Group A teams are going to start throwing a lot more interceptions than Group B. When it comes to protecting the football, nobody will ever mistake Jameis Winston for Aaron Rodgers or Tom Brady, but from the point where I made this prediction last year, his interception rate was cut in half. We'll just have to see if he's up for an encore performance.
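The arithmetic behind this prediction can be checked quickly. The sketch below uses only the team lists and interception totals given above; the break-even ratio is simple algebra on the group sizes, not a forecast.

```python
group_a = [
    "Green Bay", "Kansas City", "Minnesota", "Seattle", "Jacksonville",
    "Arizona", "Oakland", "Tennessee", "New Orleans", "New England",
    "Baltimore", "Denver", "Philadelphia", "Chicago", "Indianapolis",
    "Houston", "Buffalo", "Detroit",
]
group_b = [
    "Tampa Bay", "Miami", "Los Angeles Chargers", "Los Angeles Rams",
    "Cleveland", "New York Jets", "Pittsburgh", "Atlanta", "Washington",
]
group_a_ints = 106
group_b_ints = 122

# How many more total interceptions Group B has thrown so far.
total_gap = group_b_ints / group_a_ints - 1

# Interceptions per team in each group.
per_team_a = group_a_ints / len(group_a)
per_team_b = group_b_ints / len(group_b)

# Group A out-throws Group B in total whenever its per-team rate
# exceeds this fraction of Group B's per-team rate.
break_even_ratio = len(group_b) / len(group_a)
current_ratio = per_team_a / per_team_b

print(f"Group B total lead: {total_gap:.0%}")
print(f"Per team: Group A {per_team_a:.1f}, Group B {per_team_b:.1f}")
print(f"Current ratio {current_ratio:.2f} vs. break-even {break_even_ratio:.2f}")
```

With twice as many teams, Group A doesn't need to match Group B's rate. It only needs its per-team rate to climb from roughly 43% of Group B's to anything above half.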