For those who are new to the feature, here's the deal: every week, I break down a topic related to regression to the mean. Some weeks, I'll explain what it is, how it works, why you hear so much about it, and how you can harness its power for yourself. In other weeks, I'll give practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If I'm looking at receivers and Justin Jefferson is one of the top performers in my sample, then Justin Jefferson goes into Group A, and may the fantasy gods show mercy on my predictions.
And then, because predictions are meaningless without accountability, I track and report my results. Here's last year's season-ending recap, which covered the outcome of every prediction made in our seven-year history, giving our top-line record (41-13, a 76% hit rate) and lessons learned along the way.
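For readers who think in code, the Group A / Group B procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names (`split_groups`, `avg`), the dictionary keys, the group size, and every number in the sample data are invented for the example and are not the column's actual tooling or data.

```python
from statistics import mean

def split_groups(players, metric, group_size):
    """Rank players by `metric`, highest first; the top slice is Group A, the bottom slice is Group B."""
    if group_size <= 0 or 2 * group_size > len(players):
        raise ValueError("groups must be non-empty and must not overlap")
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]

def avg(group, key):
    """Average of one field across a group of players."""
    return mean(p[key] for p in group)

# Invented players and numbers, purely for illustration.
players = [
    {"name": "P1", "metric": 9.0, "ppg_before": 18.0, "ppg_after": 13.0},
    {"name": "P2", "metric": 8.0, "ppg_before": 16.0, "ppg_after": 12.0},
    {"name": "P3", "metric": 5.0, "ppg_before": 12.0, "ppg_after": 12.5},
    {"name": "P4", "metric": 2.0, "ppg_before": 9.0, "ppg_after": 13.5},
    {"name": "P5", "metric": 1.0, "ppg_before": 8.0, "ppg_after": 14.0},
]

group_a, group_b = split_groups(players, "metric", group_size=2)

# Step 1: verify Group A has outscored Group B up to the prediction.
a_led_before = avg(group_a, "ppg_before") > avg(group_b, "ppg_before")
# Step 2: grade the prediction that Group B outscores Group A afterward.
b_led_after = avg(group_b, "ppg_after") > avg(group_a, "ppg_after")

print("Group A led before the prediction:", a_led_before)
print("Group B led after the prediction:", b_led_after)
```

The point of the sketch is that nothing in it depends on who the players are: the only choice is the metric, and the groups fall out of the ranking.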
Our Year to Date
Sometimes, I use this column to explain the concept of regression to the mean. In Week 2, I discussed what it is and what this column's primary goals would be. In Week 3, I explained how we could use regression to predict changes in future performance-- who would improve, who would decline-- without knowing anything about the players themselves. In Week 7, I explained why large samples are our biggest asset when attempting to benefit from regression.
In Week 9, I gave a quick trick for evaluating whether unfamiliar statistics are likely stable or unstable. In Week 11, I explained the difference between regression and the gambler's fallacy, or the idea that players are "due" to perform a certain way. And in Week 12, I showed how understanding regression can allow us to predict the past as easily as the future.
Sometimes, I point out broad trends. In Week 5, I shared twelve years' worth of data demonstrating that preseason ADP held as much predictive power as performance to date through the first four weeks of the season. In Week 15, I offered sobering data on why the best team usually loses in the fantasy football playoffs.
Other times, I use this column to make specific predictions. In Week 4, I explained that touchdowns tend to follow yards and predicted that the players with the highest yard-to-touchdown ratios would begin outscoring the players with the lowest. In Week 6, I explained that yards per carry was a step away from a random number generator and predicted the players with the lowest averages would outrush those with the highest going forward.
In Week 8, I broke down how teams with unusual home/road splits usually performed going forward and predicted the Cowboys would be better at home than on the road for the rest of the season. In Week 10, I explained why interceptions varied so much from sample to sample and predicted that the teams throwing the fewest interceptions would throw more going forward than the teams throwing the most.
In Week 13, I explained that rookies were the only players whose production increased as the season went on and predicted that this year's rookie receivers would score more down the stretch. And in Week 14, I noted that large samples were almost always more predictive than small ones, and therefore "hot" players would likely regress toward their full-season averages.
The Scorecard
Statistic Being Tracked | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
---|---|---|---|
Yard-to-TD Ratio | Group A averaged 17% more PPG | Group B averages 10% more PPG | None (Win!) |
Yards per carry | Group A averaged 22% more yards per game | Group B averages 38% more yards per game | None (Win!) |
Cowboys Point Differential | Cowboys were 90 points better on the road than at home | Cowboys are 64 points better on the road than at home | 2 |
Team Interceptions | Group A threw 58% as many interceptions | Group B has thrown 66% as many interceptions | None (Win!) |
Rookie PPG | Group A averaged 8.23 ppg | Group A averages 9.26 ppg | 1 |
Rookie Improvement | | 40% are beating their prior average | 1 |
Hot Players Regress | Players were performing at an elevated level | Players have regressed 82% to their season avg | 2 |
Our rookie receiver predictions are going in different directions. The group as a whole continues to heat up with almost all of the top performers up big; Malik Nabers, Rome Odunze, Brian Thomas Jr., Xavier Worthy, and Ladd McConkey have seen their ppg average rise from 12.3 to 16.4, a 33% bump. Marvin Harrison Jr. is the only top receiver who is underperforming his full-season average.
On the other hand, virtually all of the lower-performing receivers have underperformed since the prediction, mostly because they can't even get on the field. This is fairly unusual, and unless it reverses course, the second leg of our prediction will fail.
Our "hot" players are regressing nicely, though; they've returned 81.7% of the way to their full-season average, which is above the 66% we'll need to count this as a win in two weeks.
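One plausible way to express "how far back toward the season average" as a number is the share of the gap that has closed. The sketch below assumes that definition; the function name `regression_fraction` and the sample averages are hypothetical, and the column's actual calculation may differ.

```python
def regression_fraction(hot_avg: float, since_avg: float, season_avg: float) -> float:
    """Share of the gap between the hot-streak average and the season average that has closed.

    0.0 means no regression at all; 1.0 means all the way back to the season average.
    """
    gap = hot_avg - season_avg
    if gap == 0:
        raise ValueError("hot-streak and season averages are identical; nothing to regress")
    return (hot_avg - since_avg) / gap

# Hypothetical averages, not taken from the scorecard.
frac = regression_fraction(hot_avg=20.0, since_avg=14.0, season_avg=12.0)
print(f"Regressed {frac:.0%} of the way back; clears the 66% bar: {frac >= 0.66}")
```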
Time is On Your Side
Why do I make predictions in this column about what will happen over the next four weeks? Because five is too many, and three is not enough.
If I could, I'd make every prediction for the entire season-- or better, for the rest of a player's career. But the key to this column is accountability-- I know that regression to the mean works, and I want to show it in action. For accountability, each prediction needs to have an end date so it can be graded and scored.
Given that each prediction needs a stated end, when should that end be? And here we get into the power of time. The more weeks a prediction covers, the more likely it is that random noise will wash out and regression will dominate the results.
Every extra week a prediction runs increases the chances that the prediction pays off. This is also why I try to make my comparison groups as large as feasible-- the more players involved, the more likely regression shows up. Every observation-- every game from every player-- is a coinflip that's weighted slightly in my favor.
If I bet heads on a single flip of a coin that's weighted 55% to heads, I'll lose that bet 45% of the time. If I make that same bet a million times, I'm all but guaranteed to walk out a winner.
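To put rough numbers on the weighted-coin analogy, here is a minimal sketch that computes the chance heads outnumber tails for a given number of flips, using the exact binomial tail. The function name `prob_ahead` and the flip counts in the loop are my own choices for illustration; only the 55% weighting comes from the text above.

```python
import math

def prob_ahead(n_flips: int, p_heads: float = 0.55) -> float:
    """Probability that heads strictly outnumber tails in n_flips independent flips."""
    log_p = math.log(p_heads)
    log_q = math.log(1.0 - p_heads)
    total = 0.0
    # Sum the binomial probabilities for every head count that is a strict majority.
    for k in range(n_flips // 2 + 1, n_flips + 1):
        # Work in log space so large flip counts don't overflow or underflow.
        log_pmf = (
            math.lgamma(n_flips + 1)
            - math.lgamma(k + 1)
            - math.lgamma(n_flips - k + 1)
            + k * log_p
            + (n_flips - k) * log_q
        )
        total += math.exp(log_pmf)
    return total

for n in (1, 11, 101, 1001):
    print(f"{n:>5} flips: {prob_ahead(n):.3f}")
```

The odd flip counts avoid ties, so each result reads directly as the chance of finishing ahead. The probability starts at 0.55 for a single flip and climbs toward 1 as the flips pile up, which is the whole case for longer prediction windows and larger groups.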
This is all great in theory, but fantasy football isn't iterated a million times; we choose our players and live with their results. I can note that players with few touchdowns tend to start reaching the end zone, but if you start one in your league's championship as a result and he doesn't score, the fact that he's even more likely to score next week doesn't help you; by next week, your championship is over and the trophy has already been awarded.