Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
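The group-construction step described above is mechanical, which is the whole point: the metric picks the players, not me. Here's a minimal Python sketch of that process; the player names and touchdown rates are made up purely for illustration:

```python
# Sketch of the Group A / Group B construction described above.
# The player data here is hypothetical; in practice you'd pull real
# season-to-date stats for every qualifying player.

# (player, touchdown rate through Week N)
players = [
    ("Player 1", 0.081), ("Player 2", 0.019), ("Player 3", 0.064),
    ("Player 4", 0.027), ("Player 5", 0.055), ("Player 6", 0.033),
]

# Rank everyone by the chosen metric -- no cherry-picking allowed.
ranked = sorted(players, key=lambda p: p[1], reverse=True)

half = len(ranked) // 2
group_a = ranked[:half]   # the high outliers (predicted to come back down)
group_b = ranked[half:]   # the low outliers (predicted to come back up)
```

The key design choice is that the split is determined entirely by the ranking, so whoever happens to sit at the extremes (a Christian McCaffrey included) lands in a group automatically.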
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is touchdown rate, and Christian McCaffrey is one of the high outliers in touchdown rate, then Christian McCaffrey goes into Group A and may the fantasy gods show mercy on my predictions.
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of my predictions from 2020 and their final results. Here's the same list from 2019 and their final results, here's the list from 2018, and here's the list from 2017. Over four seasons, I have made 30 specific predictions and 24 of them have proven correct, a hit rate of 80%.
The Scorecard
In Week 2, I broke down what regression to the mean really is, what causes it, how we can benefit from it, and what the guiding philosophy of this column would be. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I talked about yard-to-touchdown ratios and why they were the most powerful regression target in football that absolutely no one talks about, then predicted that touchdowns were going to follow yards going forward (but the yards wouldn't follow back).
In Week 5, we looked at ten years' worth of data to see whether early-season results better predicted rest-of-year performance than preseason ADP and we found that, while the exact details fluctuated from year to year, overall they did not. No specific prediction was made.
In Week 6, I taught a quick trick to tell how well a new statistic actually measures what you think it measures. No specific prediction was made.
In Week 7, I went over the process of finding a good statistic for regression and used team rushing vs. passing touchdowns as an example.
In Week 8, I talked about how interceptions were an unstable statistic for quarterbacks, but also for defenses.
In Week 9, we took a look at Ja'Marr Chase's season so far. He was outperforming his opportunities, which is not sustainable in the long term, but I offered a reminder that everyone regresses to a different mean, and the "true performance level" that Chase will trend towards over a long timeline is likely a lot higher than for most other receivers. No specific prediction was made.
In Week 10, I talked about how schedule luck in fantasy football was entirely driven by chance and, as such, should be completely random from one sample to the next. Then I checked Footballguys' staff leagues and predicted that the teams with the worst schedule luck would outperform the teams with the best schedule luck once that random element was removed from their favor.
In Week 11, I walked through how to tell the difference between regression to the mean and gambler's fallacy (which is essentially a belief in regression past the mean). No specific prediction was made.
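The Week 11 distinction can be demonstrated with a quick coin-flip simulation (my own illustrative sketch, not something from the column): after a run of heads, the next flip is still 50/50 because the coin doesn't "owe" you any tails, but the running average of all flips still drifts back toward 50% simply because the streak gets diluted by new, unbiased flips. That dilution is regression to the mean; expecting extra tails is the gambler's fallacy.

```python
import random

random.seed(1)

# Start with an observed "hot streak": 8 heads in 10 flips (80%).
flips = [1] * 8 + [0] * 2

# The gambler's fallacy expects tails to be MORE likely now.
# Regression to the mean just expects future flips to run ~50%,
# which drags the cumulative average back toward 0.5.
future = [random.randint(0, 1) for _ in range(10_000)]

future_rate = sum(future) / len(future)   # ~0.5: no "correction" at all
overall_rate = (sum(flips) + sum(future)) / (len(flips) + len(future))

print(round(future_rate, 2))    # close to 0.50
print(round(overall_rate, 2))   # streak diluted: also close to 0.50
```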
In Week 12, I showed how to use the concept of regression to the mean to make predictions about the past and explained why the average fantasy teams were close but the average fantasy games were not. As a bonus, I threw in another quick prediction on touchdown over- and underachievers (based on yardage gained).
In Week 13, I went down the rabbit hole and investigated how performance in Regression Alert was itself subject to regression to the mean, and how our current winning streak was unsustainable and destined to end sometime.
In Week 14, I talked about why larger samples were almost always better than smaller subsamples and how "hot streaks" were often just an illusion. I made two specific predictions: that a group of "hot" players would cool back down to their season average, and that the ice-cold Ja'Marr Chase would heat back up again.
In Week 15, I walked through the sad math predicting that your best fantasy team was going to lose in the playoffs.
In Week 16, I walked through how small edges become big edges when you exploit them over and over and over again. I also predicted that kicker performance over the final two weeks would be more a function of the team's offense than the opposing teams' defenses.
In Week 17, I talked about how the quality of players entering the league varies a lot from year to year and showed how so many of the things we think are structural trends (careers getting longer, receivers producing at a younger age, etc.) are really just the natural result of stringing several good or bad draft classes in a row.
| Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
| --- | --- | --- | --- |
| Yards per Carry | Group A had 10% more rushing yards per game | Group B has 4% more rushing yards per game | None (Win!) |
| Yards per Touchdown | Group A scored 9% more fantasy points per game | Group B scored 13% more fantasy points per game | None (Win!) |
| Passing vs. Rushing TDs | Group A scored 42% more RUSHING TDs | Group A is scoring 33% more PASSING TDs | None (Win!) |
| Defensive Interceptions | Group A had 33% more interceptions | Group B had 24% more interceptions | None (Win!) |
| Schedule Luck | Group A had a 3.7% better win% | Group B has an 18.5% better win% | None (Win!) |
| Yards per Touchdown | Group A scored 10% more fantasy points per game | Group B has 19% more fantasy points per game | None (Win!) |
| "Hot" Players Regress | Players were performing at an elevated level | They have regressed 75.5% to their season average | None (Win!) |
| Yards per Route Run | Group A led by 93% | Group B led by 49% | None (Win!) |
| Kicker Production | n/a | 50% of games are closer to offense | None (Loss) |
Amon-Ra St. Brown did his level best to carry the hot players across the finish line on his back. He averaged 8.42 points over the full season but 11.75 points from weeks 10-13, a 40% increase that made him the 10th-hottest player in our sample. But he was just getting started, as his point total increased in every subsequent week, from 15.3 in Week 14, to 23.5 in Week 15, to 26.0 in Week 16, to 33.4 in Week 17. If you were in the playoffs and you had him on your roster, there's a very good chance you'll have to make room on your mantle for another championship trophy.
But St. Brown is a sample size of one. Nine out of the 19 "hot" players who appeared in at least three weeks actually performed below their season-long average, to say nothing of their recent hot streak. (~50% of players performing above their average and 50% performing below is about what you'd expect if performance was distributed randomly.) If you remove St. Brown, the "hot" players averaged 11.8 points per game over the full season... and 11.9 points per game over the last four weeks. They regressed all the way back to their previous average.
(Of course, this is cheating. If you remove the player who regressed the most, Jonathan Taylor, whose totally respectable 17.4 ppg was nevertheless more than six points below his season average, the "hot" players would have only regressed by 65%, which would have gone down as a narrow loss since I stated I needed 66% to win.)
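For anyone who wants to replicate the "% regressed" figures, the math is just how much of the gap between hot-streak scoring and season-long scoring got closed in the following weeks: 100% means a player fell all the way back to his season average, and more than 100% means he overshot it. Here's a small sketch with made-up numbers:

```python
def pct_regressed(season_avg, hot_avg, later_avg):
    """Share of the hot-streak premium that was given back afterward."""
    premium = hot_avg - season_avg     # how far above his mean he was running
    given_back = hot_avg - later_avg   # how much of that he gave back
    return 100 * given_back / premium

# Hypothetical player: 12.0 ppg on the season, 16.0 ppg during the streak.
print(pct_regressed(12.0, 16.0, 13.0))   # cooled most of the way: 75.0
print(pct_regressed(12.0, 16.0, 12.0))   # fully back to his mean: 100.0
print(pct_regressed(12.0, 16.0, 11.0))   # overshot the mean: 125.0
```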
Speaking of hot and cold players, no player in our sample was colder from Weeks 10-13 than Ja'Marr Chase, who was performing at just 60% of his full-season average. I predicted that his yards per route run would rebound. Things weren't looking fantastic through three weeks, but I did note that sample sizes were still small so a single big game could change things. Well, I think 11 catches for 266 yards and 3 touchdowns qualifies as a "big game". (Actually, it's the most receiving yards by a rookie in NFL history. In PPR scoring, it's the highest-scoring game by a rookie at any position, and the third-best game of all time in the traditional fantasy championship week, trailing only 1995 Jerry Rice and 2020 Alvin Kamara.)
As a result, we went from Jaylen Waddle and DeVonta Smith leading Chase in yards per route run by 93% from weeks 10-13 to Chase leading Waddle and Smith by 49% from weeks 14-17, despite Waddle and Smith actually playing quite well themselves. (A note: I removed Elijah Moore from the comparison since he didn't play any games after Week 13.) This goes down as the biggest reversal in Regression Alert history.
Alas, no Week 17 miracle was in the cards for our other prediction on the ropes. I predicted kicker performance would be closer to offensive expectations than defensive expectations at least 60% of the time. Instead, it was a 50/50 split and our first loss of the year. I'll have more thoughts on this specific prediction below.
Our Final Report Card
To wrap up the season, I wanted to look back not just at this year's predictions, but at every prediction over the last four years. Remember, I'm not picking individual players, I'm just identifying unstable statistics and predicting that the best and the worst players in those statistics will both regress towards the mean, no matter who those best and worst players might be.
Sometimes this feels a bit scary. Predicting that stars like Jonathan Taylor and Cooper Kupp, in the middle of historically great seasons, are going to start falling off is an uncomfortable position. But looking back at our hit rate over time makes it a bit easier to swallow.
Top-line Record
- 2017: 6-2
- 2018: 5-1
- 2019: 7-2
- 2020: 6-1
- 2021: 8-1
- Overall: 32-7 (82%)
The Misses
2017 Passing Yards per Touchdown Part 1
2017 Passing Yards per Touchdown Part 2
When I made our first prediction, Group A was outscoring Group B by 13%. I picked a bad four-week span: Group A outscored Group B by 17% over our prediction window, though over the full season its edge fell to just 3%. Solid regression, but not enough to count the prediction as a win. When I repeated the prediction later in the season, it once again went poorly. My takeaway from this experience was that quarterback yard-to-touchdown ratios are much more skill-based than running back or receiver ratios (an idea that's backed up by looking at the leaderboard in the statistic), so I've stopped making this prediction.
2018 Yards per Target
Just like with the last miss, I tried to make a prediction out of a statistic that had a large skill element to it. Over the full season, Group A's edge fell from 16% to 7%, which was at least movement in the right direction, but not enough to qualify as a win. Once again, I've stopped trying to figure out clever ways to make this prediction work, because the skill signal is just too strong, which means the movement going forward tends to be far less dramatic and the prediction is a bit less reliable. (We did log one hit to offset this one miss before I discontinued the prediction.)
2019 Patrick Mahomes II Touchdown Regression
I knew going into this prediction that it wasn't a great bet; in fact, I preceded the prediction with 18 paragraphs and 4 charts detailing the three biggest issues with the prediction I was about to make. I then compounded those issues by breaking best practices twice: first by making a prediction about a single player rather than a large sample (where the ups and downs would have more chance to even out), and again by hand-picking that player rather than sticking with whoever happened to be most extreme in the statistic I was betting on. When the original prediction lost, in part because Mahomes was injured during the sample, I doubled down upon his return and ran it again; this prediction was responsible for both of my losses that year. Really just a disaster from start to finish, with a pair of humbling and well-deserved losses to show for it.
2020 Point Differential vs. Record
I paired teams who had the same record despite wildly different point differentials and predicted that the teams that were winning by bigger margins would win more games going forward than the teams that were winning by smaller margins. Not only did that prediction not work out over the four-week sample, extending it out through the entire season didn't help any; our Group A teams have actually won one more game than our Group B teams since the prediction. The lesson I take away from this failure is... nothing. Sometimes predictions fail because I got greedy or made an ill-advised design choice. But sometimes we just get unlucky. In the future, I'd be happy to make this bet again.
2021 Kickers (Offense vs. Defense)
The point I wanted to make is that a team's own offense predicts future performance better than the opposing team's defense does. It's a point I've made in the past using offensive and defensive production directly, but this time I wanted to add a twist by focusing on "matchups". I think it's a sound point and I'd be happy to make the prediction again, but my mistake was focusing on kickers; matchups aren't a big deal in fantasy football, but to the extent they matter at all, they matter most at kicker and fantasy defense (which actually reinforces the underlying point that offense is more predictable than defense). If and when I run this back, I'll pick a different position to focus on.
The Hits
Here's the outcome of all of my "Yards per Carry" predictions over the years, with the average at the time of the prediction, the average in the four weeks after the prediction, and the total swing.
- Group A had a 60% lead, Group B had a 16% lead, +76% total swing
- Group A had a 25% lead, Group B had a 16% lead, +41% total swing
- Group A had a 24% lead, Group B had a 4% lead, +28% total swing
- Group A had a 9% lead, Group B had a 23% lead, +32% total swing
- Group A had a 20% lead, Group B had a 30% lead, +50% total swing
- Group A had a 22% lead, Group B had a 23% lead, +45% total swing
- Group A had a 3% lead, Group B had a 36% lead, +39% total swing
- Group A had a 10% lead, Group B had a 4% lead, +14% total swing
We can't directly compare the total swings since the sample sizes vary so much (a 30% swing over a large sample might be more impressive than a 50% swing over a small one), but this prediction has gone 8-0 for me over the years with a median swing from Group A to Group B of 40%. The minimum swing was 14%, but that was mostly just bad luck with the selected sample; over the full season, the swing would have been 25%. I've made a lot of jokes about yards per carry over the years. I've called it "pseudoscience" and said it's "not a thing" or even "maximally not a thing". Some people find these statements provocative, but they're not intended to provoke. Yards per carry genuinely is almost entirely noise, especially over the kinds of samples we're dealing with inside a single season. Here are the receipts.
Here's the outcome of all of my "Yard to Touchdown Ratio" predictions over the years. (Where necessary, I've reworked some of the predictions to adhere to our traditional "Group A vs. Group B" format. This is a purely cosmetic change for comparison; the underlying data remains untouched.)
- Group A had a 28% lead, Group B had a 1% lead, +29% total swing
- Group A had a 21% lead, Group B had an 8% lead, +29% total swing
- Group A had a 7% lead, Group B had a 20% lead, +27% total swing
- Group A had a 28% lead, Group B had a 23% lead, +51% total swing
- Group A had a 26% lead, Group B had a 4% lead, +30% total swing
- Group A had a 23% lead, Group B had a 47% lead, +70% total swing
- Group A had a 22% lead, Group B had a 23% lead, +45% total swing
- Group A had a 2% lead, Group B had a 40% lead, +42% total swing
- Group A had a 15% lead, Group B had an 11% lead, +26% total swing
- Group A had a 9% lead, Group B had a 13% lead, +22% total swing
- Group A had a 10% lead, Group B had a 19% lead, +29% total swing
This is my favorite prediction for a number of reasons. For starters, this entire column was inspired by a pair of articles I wrote on this ratio back in 2015. But mostly I love it because it's such an incredible, slam-dunk regression target that virtually no one pays any attention to. Statistically-minded writers have known about the issues with yards per carry for decades now, but when I started writing this column virtually none of the discussion of players who were scoring "too many" or "too few" touchdowns linked that judgment to their yardage profile. But as you can see, that's exactly the link we should be making; predictions that touchdowns will follow yards are 11-0 with a median swing of 29% and a minimum swing of 22%.
Here are the various other miscellaneous (successful) predictions from the past four seasons:
- Group A had 16% more yards per target, Group B had 11% more yards per target, +27% total swing
- Group A had 17% fewer interceptions, Group B had 57% fewer interceptions, +74% total swing
- Group A had 13% fewer interceptions, Group B had 17% fewer interceptions, +30% total swing
- Group A had 20% more kicker points per game, Group B had 36% more kicker points per game, +56% total swing
- Group A had 42% more rushing TDs per game, Group A had 33% more passing TDs per game, +75% total swing
- Group A recorded 33% more interceptions, Group B recorded 24% more interceptions, +57% total swing
- Group A won 4% more fantasy matchups, Group B won 19% more fantasy matchups, +23% total swing
- Group A averaged 93% more yards per route run, Group B averaged 49% more yards per route run, +142% total swing
And here are the general regression predictions that didn't follow the typical "Group A vs. Group B" format, instead predicting unidirectional regression for a single group:
- "Extreme" offenses and defenses regressed 11% toward the league average performance, as predicted.
- Defenses regressed 12% more than offenses, as predicted.
- Group A averaged 14% more passing yards per game, Group A continued to average 28% more passing yards per game, as predicted.
- "Hot" players regressed 108% of the way back to their full-season averages, as predicted.
- "Hot" players regressed 75% of the way back to their full-season averages, as predicted.
Anyway, the whole point of this column is to convince you that regression to the mean is real, it's implacable, and it's actionable with very little effort on our own part. Accountability is crucial to making that point, which is why I go to such great lengths to track and report my results. You don't have to take my word on the subject, you can go back and check my track record for yourself. You can see why I'm such a big believer in the power of regression, and hopefully, you become something of a believer yourself.
As always, I appreciate you reading along this season, and look forward to doing it all over again in 2022.