Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is touchdown rate, and Christian McCaffrey is one of the high outliers in touchdown rate, then Christian McCaffrey goes into Group A and may the fantasy gods show mercy on my predictions.
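For readers who like to see the mechanics, here is a minimal sketch of that selection rule in Python. The function names, the `td_rate` and `ppg` fields, and every number in it are hypothetical and made up for illustration; none of it is real player data. The point is only that group membership falls out of the ranking, with no hand-picking.

```python
from statistics import mean


def split_groups(players, metric, group_size):
    """Rank players by `metric`; return (top group_size, bottom group_size)."""
    if 2 * group_size > len(players):
        raise ValueError("groups would overlap; use a smaller group_size")
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]


def points_per_game(group, key="ppg"):
    """Average fantasy points per game across a group."""
    return mean(p[key] for p in group)


# Hypothetical, made-up numbers (not real players or real stats).
players = [
    {"name": "Player A", "td_rate": 0.12, "ppg": 18.0},
    {"name": "Player B", "td_rate": 0.10, "ppg": 16.5},
    {"name": "Player C", "td_rate": 0.07, "ppg": 15.0},
    {"name": "Player D", "td_rate": 0.06, "ppg": 14.0},
    {"name": "Player E", "td_rate": 0.03, "ppg": 15.5},
    {"name": "Player F", "td_rate": 0.02, "ppg": 14.5},
]

group_a, group_b = split_groups(players, "td_rate", 2)

# Step 1: confirm Group A has outscored Group B so far.
# Step 2 (the prediction) is that this gap shrinks or flips going forward.
print(points_per_game(group_a), points_per_game(group_b))
```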
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of my predictions from 2019 and their final results, here's the list from 2018, and here's the list from 2017.
THE SCORECARD
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I talked about how the ability to convert yards into touchdowns was most certainly a skill, but it was a skill that operated within a fairly narrow and clearly-defined range, and any values outside of that range were probably just random noise and therefore due to regress. I predicted that high-yardage, low-touchdown receivers would outscore low-yardage, high-touchdown receivers going forward.
| Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining | 
|---|---|---|---|
| Yards per Carry | Group A had 3% more rushing yards per game | Group B has 28% more rushing yards per game | 2 | 
| Yard to Touchdown Ratio | Group A averaged 2% more fantasy points per game | Group B averages 50% more fantasy points per game | 3 | 
In the two weeks before our prediction, our high-ypc sample averaged 2.06 more yards per carry than our low-ypc sample. In the two weeks since our prediction, our low-ypc sample averages 0.38 more yards per carry than our high-ypc sample. There's a reason why I always start the season off with a yards per carry prediction; I want to get an easy win under our belt so new readers can begin to get a feel for how regression to the mean operates. And historically yards per carry has been the biggest slam dunk there is.
In the three weeks before our prediction, our "high-touchdown" receivers averaged 0.78 touchdowns per game and one touchdown for every 57 yards. Our "low-touchdown" receivers averaged 0.17 touchdowns per game and one touchdown for every 462 yards. Last week, the high-touchdown receivers scored twice in twelve games, needing 312 yards for every touchdown. Our low-touchdown receivers, however, scored eight times in nineteen games and needed just 167 yards per touchdown. Because "touchdown scoring" is not a skill (or at least not at the levels we singled out through three weeks), our Group B receivers dominated our Group A receivers in fantasy.
Revisiting Preseason Expectations
In October of 2013, I wondered just how many weeks it took before the early-season performance wasn't a fluke anymore. In "Revisiting Preseason Expectations", I looked back at the 2012 season and compared how well production in a player's first four games predicted production in his last 12 games. And since that number was meaningless without context, I also measured how well his preseason ADP predicted production in his last 12 games.
It was a fortuitous time to ask that question, as it turns out, because I discovered that after four weeks in 2012, preseason ADP still predicted performance going forward better than early-season production did.
This is the kind of surprising result that I love, but the thing about surprising results is that sometimes the reason they're surprising is really just because they're flukes. So in October of 2014, I revisited "Revisiting Preseason Expectations". This time I found that in the 2013 season, preseason ADP and week 1-4 performance held essentially identical predictive power for the rest of the season.
With two different results in two years, I decided to keep up my quest for a definitive answer about whether early-season results or preseason expectations were more predictive down the stretch. In October of 2015, I revisited my revisitation of "Revisiting Preseason Expectations". This time, I found that early-season performance held a slight predictive edge over preseason ADP.
With things still so inconclusive, in October of 2016, I decided to revisit my revisitation of the revisited "Revisiting Preseason Expectations". As in 2015, I found that early-season performance carried slightly more predictive power than preseason ADP.
To no one's surprise, I couldn't leave well enough alone in October 2017, once more revisiting the revisited revisitation of the revisited "Revisiting Preseason Expectations". This time I once again found that preseason ADP and early-season performance were roughly equally predictive, with a slight edge to preseason ADP.
And of course, as a creature of habit, when October 2018 rolled around I simply had to revisit my revisitation of the revisited revisited revisitation of "Revisiting Preseason Expectations". And then in October 2019 I... well, you get the idea.
And now, as you've probably guessed, it's time for an autumn tradition as sacred as turning off the lights and pretending I'm not home on October 31st. It's time for "Revisiting Preseason Expectations"! (Or, I guess technically for Revisiting Revisiting Revisiting Revisiting Revisiting Revisiting Revisiting Revisiting Preseason Expectations.)
METHODOLOGY
If you've read the previous pieces, you have a rough idea of how this works, but here's a quick rundown of the methodology. I have compiled a list of the top 24 quarterbacks, 36 running backs, 48 wide receivers, and 24 tight ends by 2019 preseason ADP.
From that list, I have removed any player who missed more than one of his team’s first four games or more than two of his team’s last twelve games so that any fluctuations represent performance and not injury. As always, we’re looking by team games rather than by week, so players with an early bye aren't skewing the comparisons.
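The availability filter above can be written as a one-line rule. This is just a sketch; the function name and its two arguments (team games the player appeared in during each window) are hypothetical.

```python
def is_eligible(early_games_played, late_games_played):
    """Keep a player only if he missed at most one of his team's first four
    games and at most two of his team's last twelve games."""
    missed_early = 4 - early_games_played
    missed_late = 12 - late_games_played
    return missed_early <= 1 and missed_late <= 2


# Missed one early game and two late games: still in the sample.
print(is_eligible(3, 10))
# Missed two of the first four games: removed.
print(is_eligible(2, 12))
```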
I’ve used PPR scoring for this exercise because that was easier for me to look up with the databases I had on hand. For the remaining players, I tracked where they ranked at their position over the first four games and over the final twelve games. Finally, I’ve calculated the correlation between preseason ADP and stretch performance, as well as the correlation between early performance and stretch performance.
Correlation is a measure of how strongly one list resembles another list. The highest possible correlation is 1.000, which is what you get when two lists are identical. For our purposes, the floor is 0.000, which is what you get when you compare one list of numbers to a second list that has no relationship whatsoever to the first. (Correlations can actually go down to -1.000, which means the higher something ranks in one list the lower it tends to rank in the other, but negative correlations aren’t really relevant for this exercise.)
So if guys who were drafted high in preseason tend to score a lot of points from weeks 5-16, and this tendency is strong, we’ll see correlations closer to 1. If they don’t tend to score more points, or they do but the tendency is very weak, we’ll see correlations closer to zero. The numbers themselves don’t matter beyond “higher = more predictable”.
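If you'd like to compute a correlation like this yourself, here is a minimal sketch. It assumes a plain Pearson correlation applied to two lists of positional ranks (which amounts to a rank correlation); the column doesn't say exactly which formula was used, so treat that choice, and the `pearson` helper name, as assumptions. The toy lists are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length (at least 2 items)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list never varies")
    return cov / sqrt(var_x * var_y)


# Toy ranks, not real data.
preseason_rank = [1, 2, 3, 4]
same_order = [1, 2, 3, 4]      # identical list: correlation at the ceiling
reverse_order = [4, 3, 2, 1]   # exactly backwards: correlation at the floor

print(pearson(preseason_rank, same_order))
print(pearson(preseason_rank, reverse_order))
```

To use it on the raw data, pass the ADP column and the Games 5-16 column for one position as the two lists, then repeat with the Games 1-4 column in place of ADP.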
Here's the raw data for anyone curious. If you're willing to take my word for it, I'd recommend just skipping ahead to the "Overall Correlations" section below for averages and key takeaways.
Quarterback
| Player | ADP | Games 1-4 | Games 5-16 |
|---|---|---|---|
| | 1 | 2 | 19 |
| | 2 | 6 | 3 |
| | 3 | 16 | 10 |
| | 4 | 24 | 20 |
| | 5 | 10 | 14 |
| | 7 | 5 | 13 |
| | 8 | 18 | 18 |
| | 9 | 3 | 4 |
| | 11 | 1 | 1 |
| | 12 | 11 | 6 |
| | 14 | 14 | 9 |
| | 15 | 13 | 17 |
| | 16 | 12 | 25 |
| | 17 | 4 | 2 |
| | 18 | 26 | 12 |
| Mitchell Trubisky | 19 | 28 | 23 |
| | 20 | 17 | 8 |
| | 21 | 21 | 16 |
| | 22 | 23 | 15 |
Running Back
| Player | ADP | Games 1-4 | Games 5-16 |
|---|---|---|---|
| | 1 | 24 | 6 |
| | 2 | 5 | 14 |
| | 3 | 1 | 1 |
| | 4 | 11 | 3 |
| LeVeon Bell | 7 | 8 | 20 |
| | 8 | 4 | 15 |
| | 9 | 3 | 7 |
| | 10 | 15 | 17 |
| | 11 | 35 | 10 |
| | 12 | 10 | 8 |
| | 14 | 17 | 13 |
| | 16 | 12 | 2 |
| | 17 | 26 | 19 |
| | 18 | 30 | 25 |
| | 20 | 46 | 30 |
| Mark Ingram | 21 | 6 | 16 |
| | 23 | 9 | 4 |
| | 24 | 19 | 27 |
| | 25 | 32 | 18 |
| | 26 | 40 | 31 |
| | 27 | 14 | 21 |
| | 29 | 41 | 9 |
| | 30 | 2 | 5 |
| | 32 | 39 | 24 |
| | 33 | 59 | 23 |
| | 34 | 38 | 11 |
| | 35 | 45 | 53 |
Wide Receiver
| Player | ADP | Games 1-4 | Games 5-16 |
|---|---|---|---|
| | 2 | 22 | 3 |
| | 3 | 5 | 6 |
| | 4 | 6 | 1 |
| Odell Beckham | 7 | 24 | 32 |
| | 11 | 1 | 20 |
| | 12 | 56 | 14 |
| | 13 | 8 | 15 |
| | 14 | 33 | 4 |
| | 15 | 27 | 78 |
| | 16 | 3 | 7 |
| | 17 | 19 | 17 |
| | 18 | 9 | 19 |
| | 19 | 2 | 10 |
| | 20 | 11 | 11 |
| | 22 | 26 | 18 |
| | 25 | 74 | 35 |
| | 26 | 53 | 43 |
| | 28 | 34 | 8 |
| D.J. Moore | 29 | 35 | 12 |
| Robby Anderson | 31 | 71 | 34 |
| Allen Robinson | 32 | 32 | 5 |
| | 33 | 12 | 40 |
| | 35 | 45 | 39 |
| | 38 | 7 | 77 |
| | 39 | 68 | 95 |
| | 40 | 44 | 108 |
| | 41 | 95 | 44 |
| | 43 | 13 | 49 |
| | 45 | 62 | 65 |
Tight End
| Player | ADP | Games 1-4 | Games 5-16 |
|---|---|---|---|
| | 1 | 4 | 1 |
| | 2 | 9 | 3 |
| | 3 | 7 | 2 |
| | 4 | 28 | 27 |
| | 6 | 33 | 5 |
| | 7 | 16 | 34 |
| | 11 | 52 | 12 |
| | 12 | 8 | 22 |
| | 14 | 3 | 7 |
| | 15 | 17 | 23 |
| | 16 | 36 | 9 |
| | 17 | 21 | 19 |
| | 18 | 20 | 18 |
| | 19 | 5 | 4 |
| | 20 | 24 | 20 |
| | 21 | 12 | 13 |
| | 22 | 41 | 10 |
| Irv Smith | 23 | 40 | 25 |
| | 24 | 83 | 45 |
Overall Correlations
Quarterback
| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2014 | 0.422 | -0.019 | |
| 2015 | 0.260 | 0.215 | |
| 2016 | 0.200 | 0.404 | 0.367 |
| 2017 | 0.252 | 0.431 | 0.442 |
| 2018 | 0.435 | 0.505 | 0.579 |
| 2019 | 0.093 | 0.539 | 0.395 |
| Average | 0.277 | 0.346 | 0.446 |
Running Back
| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2014 | 0.568 | 0.472 | |
| 2015 | 0.309 | 0.644 | |
| 2016 | 0.597 | 0.768 | 0.821 |
| 2017 | 0.540 | 0.447 | 0.610 |
| 2018 | 0.428 | 0.387 | 0.447 |
| 2019 | 0.490 | 0.579 | 0.603 |
| Average | 0.489 | 0.550 | 0.621 |
Wide Receiver
| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2014 | 0.333 | 0.477 | |
| 2015 | 0.648 | 0.632 | |
| 2016 | 0.551 | 0.447 | 0.576 |
| 2017 | 0.349 | 0.412 | 0.443 |
| 2018 | 0.645 | 0.568 | 0.650 |
| 2019 | 0.640 | 0.387 | 0.533 |
| Average | 0.528 | 0.487 | 0.551 |
Tight End
| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2014 | -0.051 | 0.416 | |
| 2015 | 0.295 | 0.559 | |
| 2016 | 0.461 | 0.723 | 0.716 |
| 2017 | 0.634 | 0.857 | 0.891 |
| 2018 | 0.537 | 0.856 | 0.708 |
| 2019 | 0.310 | 0.135 | 0.578 |
| Average | 0.364 | 0.591 | 0.723 |
Overall
| Season | ADP | Early-Season | Avg of Both |
|---|---|---|---|
| 2010-2012 | 0.578 | 0.471 | |
| 2013 | 0.649 | 0.655 | |
| 2014 | 0.466 | 0.560 | |
| 2015 | 0.548 | 0.659 | |
| 2016 | 0.599 | 0.585 | 0.682 |
| 2017 | 0.456 | 0.570 | 0.608 |
| 2018 | 0.642 | 0.598 | 0.668 |
| 2019 | 0.589 | 0.486 | 0.586 |
| Average | 0.568 | 0.553 | 0.636 |
2019 was the worst year on record for the predictive power of quarterback ADP and the worst year on record for the predictive power of early-season tight end performance. In fact, 2019 was the first year on record where preseason ADP outperformed early-season production at the tight end position. Last year I reached the conclusion that tight end performance probably stabilized earlier than the other positions, so this is unfortunate. Still, five years of evidence isn't overruled by one year to the contrary, so I still think the position probably stabilizes a bit faster.
The major takeaway, though, is on the very bottom line of the chart, comparing how well preseason ADP correlates with stretch performance, how well early-season performance correlates with stretch performance, and how well an average of the two correlates with stretch performance. The first two correlations (0.568 and 0.553) are basically identical. The third correlation (0.636) is slightly higher.
We now have a decade's worth of data and it's abundantly clear on one point: preseason ADP carries almost exactly the same amount of predictive power as production through four weeks. If you were going to draft a new team tomorrow, you would be just as well off drafting off of preseason ADP as you would be drafting off of production to date. (Although simply averaging the two measures would leave you better off still.)
Some years skew more towards early-season production, some years skew more towards preseason ADP, but there's really no pattern to which is which, so any variation is probably just noise. Every year I revisit this, we get one more year of evidence to strengthen our central thesis.
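Here's a rough sketch of what "averaging the two measures" could look like in practice. It assumes the blend is a simple mean of each player's preseason ADP rank and his early-season rank; the column doesn't spell out the exact formula behind the "Avg of Both" numbers, so that is an assumption, and the `blend_ranks` name and the sample ranks are made up.

```python
def blend_ranks(adp_ranks, early_ranks):
    """Average each player's preseason ADP rank with his early-season rank."""
    if len(adp_ranks) != len(early_ranks):
        raise ValueError("both lists must cover the same players")
    return [(adp + early) / 2 for adp, early in zip(adp_ranks, early_ranks)]


# Hypothetical players and ranks, for illustration only.
names = ["Player A", "Player B", "Player C", "Player D"]
adp_ranks = [1, 2, 3, 4]
early_ranks = [3, 1, 4, 2]

blended = blend_ranks(adp_ranks, early_ranks)

# Lower blended rank = better; sort players from best to worst.
draft_order = [name for _, name in sorted(zip(blended, names))]
print(draft_order)
```

A player who was drafted early but started slowly lands in the middle of the blended list rather than at either extreme, which is the whole idea.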
But Will This Year Be Different?
Yes.
2020 is unlike any other season we've ever seen. There was no preseason, games are being moved, games are being canceled. When I run this analysis next year I don't know what I expect I'll find. To be honest, I don't even know if I'll be able to run this analysis next year, especially if teams wind up playing a different number of games. But a global pandemic is too big and too disruptive to not have an impact on these results.
So What Are You Doing Differently This Year?
Absolutely nothing.
Let me give you an analogy. I show you a coin and ask you what chance it will come up heads. You tell me 50%. I then tell you that this coin is actually weighted and ask you once again how likely it is to come up heads when I flip it. And you again tell me 50%.
Why? Because knowing a coin is weighted is useless unless you know which side it's weighted toward. Given what you know in the coin-flip hypothetical, the coin is as likely to be weighted toward heads as it is to be weighted toward tails, so knowing the coin is special doesn't change the estimated odds.
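The arithmetic behind that answer fits in a few lines. In this sketch, `bias` is a hypothetical number for how far the coin's odds are shifted toward its favored side, and `prob_weighted_toward_heads` is your belief that the favored side is heads; both names are made up for illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def prob_heads(bias, prob_weighted_toward_heads=HALF):
    """Overall chance of heads for a coin that lands on its favored side
    with probability 1/2 + bias, when we only have a belief about which
    side is favored."""
    p = prob_weighted_toward_heads
    return p * (HALF + bias) + (1 - p) * (HALF - bias)


# No idea which way the coin leans: the two cases cancel out exactly.
print(prob_heads(Fraction(1, 5)))

# Some reason to think it leans toward heads: now the estimate moves.
print(prob_heads(Fraction(1, 5), prob_weighted_toward_heads=Fraction(3, 4)))
```

The estimate only moves off 50% once the belief about direction moves off 50/50, no matter how large the bias is.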
Similarly, I know this year is different and I expect that to manifest somehow. But I don't know whether that will favor preseason ADP or early-season performance. There are equally-plausible arguments in favor of either effect. So my estimate of how things will play out in expectation doesn't actually change any.
Now, once you start flipping the coin once or twice, you can quickly update your estimated odds. And similarly, once the season is over I can probably tell you whether the unique circumstances made early-season performance more or less predictive. But of course, by that point, it's too late for that knowledge to do any good.
But as of this moment, I'm operating under the assumption that preseason ADP and performance-to-date are both equally valid estimates of player value, which means I'll probably wind up buying low on a bunch of players who have disappointed so far this year.
