Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
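For readers who think better in code, here is a minimal sketch of that Group A / Group B procedure. The player records, the field names (`td_rate`, `ppg_before`, `ppg_after`), and the helper functions are all hypothetical and invented for illustration. This is not the actual data or tooling behind the column.

```python
from statistics import mean


def split_groups(players, metric, group_size):
    """Rank players by `metric` (highest first) and return (Group A, Group B)."""
    if 2 * group_size > len(players):
        raise ValueError("groups would overlap; use a smaller group_size")
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]


def pct_edge(group_a, group_b, key):
    """Percentage by which Group A's average of `key` exceeds Group B's."""
    a = mean(p[key] for p in group_a)
    b = mean(p[key] for p in group_b)
    return 100 * (a - b) / b


# Made-up numbers purely for illustration; not real player data.
players = [
    {"name": "P1", "td_rate": 0.09, "ppg_before": 18.0, "ppg_after": 13.0},
    {"name": "P2", "td_rate": 0.08, "ppg_before": 16.0, "ppg_after": 14.0},
    {"name": "P3", "td_rate": 0.03, "ppg_before": 14.0, "ppg_after": 15.0},
    {"name": "P4", "td_rate": 0.02, "ppg_before": 13.0, "ppg_after": 15.0},
]

group_a, group_b = split_groups(players, "td_rate", group_size=2)
before = pct_edge(group_a, group_b, "ppg_before")  # Group A's lead when the prediction is made
after = pct_edge(group_a, group_b, "ppg_after")    # Group A's lead afterward (negative means Group B leads)
print(f"Before: {before:+.1f}%  After: {after:+.1f}%")
```

A prediction counts as a win in this sketch when `before` is positive and `after` is negative, meaning Group B has overtaken Group A.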
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is touchdown rate, and Christian McCaffrey is one of the high outliers in touchdown rate, then Christian McCaffrey goes into Group A and may the fantasy gods show mercy on my predictions.
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of my predictions from 2020 and their final results. Here's the same list from 2019 and their final results, here's the list from 2018, and here's the list from 2017. Over four seasons, I have made 30 specific predictions and 24 of them have proven correct, a hit rate of 80%.
The Scorecard
In Week 2, I broke down what regression to the mean really is, what causes it, how we can benefit from it, and what the guiding philosophy of this column would be. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I talked about yard-to-touchdown ratios and why they were the most powerful regression target in football that absolutely no one talks about, then predicted that touchdowns were going to follow yards going forward (but the yards wouldn't follow back).
In Week 5, we looked at ten years' worth of data to see whether early-season results better predicted rest-of-year performance than preseason ADP and we found that, while the exact details fluctuated from year to year, overall they did not. No specific prediction was made.
In Week 6, I taught a quick trick to tell how well a new statistic actually measures what you think it measures. No specific prediction was made.
In Week 7, I went over the process of finding a good statistic for regression and used team rushing vs. passing touchdowns as an example.
In Week 8, I talked about how interceptions were an unstable statistic for quarterbacks, but also for defenses.
In Week 9, we took a look at Ja'Marr Chase's season so far. He was outperforming his opportunities, which is not sustainable in the long term, but I offered a reminder that everyone regresses to a different mean, and the "true performance level" that Chase will trend towards over a long timeline is likely a lot higher than for most other receivers. No specific prediction was made.
In Week 10, I talked about how schedule luck in fantasy football was entirely driven by chance and, as such, should be completely random from one sample to the next. Then I checked Footballguys' staff leagues and predicted that the teams with the worst schedule luck would outperform the teams with the best schedule luck once that random element was removed from their favor.
In Week 11, I walked through how to tell the difference between regression to the mean and gambler's fallacy (which is essentially a belief in regression past the mean). No specific prediction was made.
In Week 12, I showed how to use the concept of regression to the mean to make predictions about the past and explained why the average fantasy teams were close but the average fantasy games were not. As a bonus, I threw in another quick prediction on touchdown over- and underachievers (based on yardage gained).
In Week 13, I went down the rabbit hole and investigated how performance in Regression Alert was also subject to regression to the mean, and how our current winning streak was unsustainable and destined to end sometime.
In Week 14, I talked about why larger samples were almost always better than smaller subsamples and how "hot streaks" were often just an illusion. I made two specific predictions: that a group of "hot" players would cool back down to their season average, and that the ice-cold Ja'Marr Chase would heat back up again.
In Week 15, I walked through the sad math predicting that your best fantasy team was going to lose in the playoffs.
In Week 16, I walked through how small edges become big edges when you exploit them over and over and over again. I also predicted that kicker performance over the final two weeks would be more a function of the team's offense than the opposing teams' defenses.
Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
---|---|---|---|
Yards per Carry | Group A had 10% more rushing yards per game | Group B has 4% more rushing yards per game | None (Win!) |
Yards per Touchdown | Group A scored 9% more fantasy points per game | Group B scored 13% more fantasy points per game | None (Win!) |
Passing vs. Rushing TDs | Group A scored 42% more RUSHING TDs | Group A is scoring 33% more PASSING TDs | None (Win!) |
Defensive Interceptions | Group A had 33% more interceptions | Group B had 24% more interceptions | None (Win!) |
Schedule Luck | Group A had a 3.7% better win% | Group B has an 18.5% better win% | None (Win!) |
Yards per Touchdown | Group A scored 10% more fantasy points per game | Group B has 19% more fantasy points per game | None (Win!) |
"Hot" Players Regress | Players were performing at an elevated level | They have regressed 77.4% to their season average | 1 |
Yards per Route Run | Group A led by 109% | Group A leads by 35% | 1 |
Kicker Production | n/a | 44% of games are closer to offense | 1 |
Our formerly hot players had a strong week, spearheaded by Tee Higgins' 43.4 points, but they're still performing significantly closer to their season average than they are to their hot stretch. In fact, 11 of the 20 players are averaging fewer points over the last three weeks than they are over the full season.
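One plausible way to put a number on "how far back toward the season average" is to measure how much of the gap between the hot-stretch average and the season average has closed. The sketch below assumes that definition. The function name and sample values are hypothetical, and the figure in the table may be calculated differently.

```python
def pct_regressed(hot_avg, season_avg, since_avg):
    """Share of the gap between the hot-stretch average and the season average
    that has closed since the prediction (100 means fully back to season average)."""
    gap = hot_avg - season_avg
    if gap == 0:
        raise ValueError("hot stretch and season average are identical")
    return 100 * (hot_avg - since_avg) / gap


# Hypothetical points-per-game averages, for illustration only.
example = pct_regressed(hot_avg=20.0, season_avg=15.0, since_avg=16.0)
print(f"{example:.1f}% regressed")
```

Under this definition, anything above 50% means the players are closer to their season average than to their hot stretch, and anything above 100% means they have fallen below their season average.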
As for our rookies, Chase has shaken off his cold streak and is performing much closer to his full-season average again, but Waddle and Smith have both maintained their elevated play and, since our samples so far are 3 games each, things aren't looking great for our prediction. But the nice thing about such small samples is there's potential for a lot of movement in one week, so I'm not closing the book on this yet.
As for our kickers, I predicted that in 55% of games, kicker scoring would be determined more by the offense's points-scored average than by the opposing defense's points-allowed average. In our first week, the bet went 14-18. We'll need to go 21-11 in Week 17 to meet our 55% target. I'm not optimistic.
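The arithmetic behind that target is easy to check. This short sketch uses only the two records quoted above; the variable names are mine.

```python
# Records quoted in the text: (games closer to offense, games closer to defense).
first_week = (14, 18)    # result so far
second_week = (21, 11)   # record needed in the final week

first_games = sum(first_week)
first_rate = first_week[0] / first_games

total_wins = first_week[0] + second_week[0]
total_games = first_games + sum(second_week)
combined_rate = total_wins / total_games

print(f"First week: {first_week[0]}/{first_games} = {first_rate:.1%}")
print(f"Combined:   {total_wins}/{total_games} = {combined_rate:.1%}")
```

A 21-11 finish would bring the combined record to 35 of 64, which is just under 54.7% and rounds to the 55% target.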
Careers Aren't Getting Longer, Talent Is Just Regressing
Thing I forgot I did.
Are NFL careers getting longer? Here's the age of the oldest "fantasy-relevant" player in every year from 1970-2018, excluding Jerry Rice.
("Fantasy-relevant" basically means "minimally-productive" in this context.)
I'd say... maybe? But not really. pic.twitter.com/VxGMEskLWZ
— Adam Harstad (@AdamHarstad) January 18, 2020
In 2018, I wrote about the perception that NFL careers were longer than ever before. Perhaps surprisingly, I found that they most certainly were not getting any longer (at least among the very oldest players), and that any perceptions to the contrary were mostly driven by a super-talented group of future Hall of Fame quarterbacks, headlined by Tom Brady.
In fact, there's no other position where careers are getting longer like they are at quarterback. In the last decade, eight different offensive linemen have started at least half a season at age 36 or older. Six different players did it in the year 2000 alone. There were seven different double-digit sack seasons by a 36-year-old player between 1997 and 2000. There has been one in the 17 years since. Even kickers aren't seeing any major improvements. From 2000-2009, the league averaged three kickers and punters per year over the age of 40. From 2010-2017, it averaged 2.5. (Old placekickers were slightly up, but old punters were way down.)
I then used this as a springboard to discuss one of my favorite concepts in fantasy football: the idea that, on average, a certain amount of talent enters the NFL every season, but that overall average hides huge fluctuations from year to year. Sometimes we see five Hall of Fame quarterbacks enter the league in a five-year span. Other times we see one, or even zero. And because those ultra-talented players are more productive and because they play longer, average positional age today is largely driven by the quality of draft classes a decade ago.
(An illustration: in 2018 there were more starting quarterbacks from the 2000-2005 classes than there were from the 2006-2011 classes, despite the latter quarterbacks theoretically being five years younger.)
This is one of my favorite concepts because I used the theory to predict in 2013 that we were likely to see an influx of talent at running back that left the position much younger and more talented overall. It's rare to get a chance to make a long-term prediction like this and revisit it a half-decade later. It's rarer still for the prediction to prove correct.
And since it's too late in the year for any new predictions, I like looking back at the season as it wraps up and recalculating the average age at each position. It gives us a good snapshot of the state of the talent in the league. We can see which draft classes have been strong, where there's a surplus of young stars and where there's a shortage.
So here's the average age of the Top 24 quarterbacks, Top 36 running backs, Top 36 wide receivers, and Top 24 tight ends. These ages have all been weighted by production so that top performers exert more influence. In each case, RED means that the position is younger than historical averages while BLUE means it's older.
Displayed like this, a number of things leap out. For instance, you can clearly see the strength of the 2017 running back class immediately flipping the position from blue to red.
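As a rough sketch of what "weighted by production" can mean, the snippet below computes an average age in which each player counts in proportion to his fantasy points. Using raw fantasy points as the weight is my assumption here, and the sample ages and point totals are invented for illustration.

```python
def weighted_average_age(players):
    """Average age with each player weighted by fantasy production.

    `players` is a list of (age, fantasy_points) pairs.
    """
    total_points = sum(points for _, points in players)
    if total_points <= 0:
        raise ValueError("total production must be positive")
    return sum(age * points for age, points in players) / total_points


# (age, fantasy points) pairs; invented values for illustration only.
sample = [(24, 300.0), (26, 200.0), (31, 100.0)]
simple = sum(age for age, _ in sample) / len(sample)
weighted = weighted_average_age(sample)
print(f"Simple average age:   {simple:.2f}")
print(f"Weighted average age: {weighted:.2f}")
```

Because the youngest player in this made-up sample is also the most productive, the weighted average comes out younger than the simple average, which is the effect a strong young draft class has on a position.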
If you count Cordarrelle Patterson as a WR, then 7 of the top 12 fantasy RBs right now came from the 2017 class… and Christian McCaffrey isn’t even one of them.
“Sell Year 5 RBs” theory running face-first into the greatest RB crop in history.
— Adam Harstad (@AdamHarstad) December 13, 2021
Also your semi-regular reminder that
Christian McCaffrey
Alvin Kamara
Dalvin Cook
Austin Ekeler
Joe Mixon
Aaron Jones
Leonard Fournette
James Conner
Kareem Hunt and
Chris Carson
all came from the same NFL draft class.
That's ten *different* RBs with at least one Top 12 finish.
— Adam Harstad (@AdamHarstad) December 14, 2021
You can also see the incredible year-after-year influx of talent at receiver in 2019 (DK Metcalf, A.J. Brown, Deebo Samuel, Terry McLaurin, Diontae Johnson, Marquise Brown, among others), 2020 (Justin Jefferson, CeeDee Lamb, Tee Higgins, et al.), and 2021 (Ja'Marr Chase, Jaylen Waddle, DeVonta Smith, and likely more to come).
As I alluded to in the first tweet above, recognizing that talent is not evenly distributed across seasons makes certain things more obvious. For instance, "fade Year 5 running backs" is a great strategy five years after a weak class, but a terrible strategy five years after the greatest running back class in history.
It also shows the payoff of rookie picks in dynasty leagues can swing wildly from year to year. If you've been over-invested in the last four draft classes, your team is probably doing great right now. But that doesn't mean "over-invest in rookie picks" is the ideal strategy. Early reports of the 2022 class suggest it's a weaker group, and if so, we'll probably start to see these positions trending a bit more towards the blue in the coming years.
When it comes to long-term trends, talent drives everything, but you can add incoming talent to the list of things that regress to the mean.