Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions. On a case-by-case basis, it's easy to find reasons why any given player is going to buck the trend and sustain production. So I constrain myself and remove my ability to rationalize on a case-by-case basis.
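The selection procedure is mechanical enough to sketch in a few lines of Python. This is only an illustration, not the code behind the column: the `Player` record, its field names, the `split_groups` helper, the group size, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    metric: float           # the statistic being ranked, e.g. yards per carry
    points_per_game: float  # fantasy scoring to date


def split_groups(players, group_size):
    """Rank by the metric; the top slice is Group A, the bottom slice is Group B."""
    if 2 * group_size > len(players):
        raise ValueError("groups would overlap")
    ranked = sorted(players, key=lambda p: p.metric, reverse=True)
    return ranked[:group_size], ranked[-group_size:]


def average_ppg(group):
    """Mean fantasy points per game across a group."""
    return sum(p.points_per_game for p in group) / len(group)


# Made-up numbers, purely for illustration.
players = [
    Player("RB1", 5.9, 14.0),
    Player("RB2", 5.4, 12.5),
    Player("RB3", 4.4, 11.0),
    Player("RB4", 4.1, 10.0),
    Player("RB5", 3.5, 9.5),
    Player("RB6", 3.2, 8.0),
]

group_a, group_b = split_groups(players, group_size=2)

# Sanity check before predicting: Group A should be ahead so far.
a_leads_so_far = average_ppg(group_a) > average_ppg(group_b)
print([p.name for p in group_a], [p.name for p in group_b], a_leads_so_far)
```

Because the groups fall straight out of the sort, there is no step where a favorite player can be quietly moved from one side to the other.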
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared. Here's a similar list from 2017.
The Scorecard
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I explained why touchdowns follow yards (but yards don't follow back), and predicted that the players with the fewest touchdowns per yard gained would outscore the players with the most touchdowns per yard gained going forward.
In Week 5, I talked about how preseason expectations still held as much predictive power as performance through four weeks. No specific prediction was made.
In Week 6, I talked about why quarterbacks tended to regress less than other positions but nevertheless predicted that Patrick Mahomes II would somehow manage to get even better and score ten touchdowns over the next four weeks.
In Week 7, I talked about why watching the game and forming opinions about players makes it harder to trust the cold hard numbers when the time comes to put our chips on the table. (I did not recommend against watching football; football is wonderful and should be enjoyed to its fullest.)
| Statistic For Regression | Performance Before Prediction | Performance Since Prediction | Weeks Remaining |
|---|---|---|---|
| Yards per Carry | Group A had 20% more rushing yards per game | Group B has 30% more rushing yards per game | None (Success!) |
| Yard:Touchdown Ratio | Group A had 23% more points per game | Group B has 47% more points per game | None (Success!) |
| | Mahomes averaged 2.2 touchdowns per game | Mahomes averages 2.0 touchdowns per game | 2 (Failure?) |
Another prediction on the books and another decisive victory for regression to the mean. Over the first three weeks, Group A receivers averaged 0.83 touchdowns per game while Group B receivers averaged 0.19. In the four weeks since, Group A receivers averaged 0.35 touchdowns per game, while Group B averaged 0.33.
Here's the really neat thing, though. Remember, Group A consisted exclusively of receivers averaging fewer than 100 yards for every touchdown, while Group B was made up of receivers averaging more than 200 yards per touchdown. I stipulated that differences between receivers in yard-to-touchdown ratio could absolutely be meaningful, but almost all sustainable ratios would fall between 100 and 200 yards per touchdown.
What have we seen since then? Over the last four weeks, Group A receivers averaged 107 yards for every touchdown and Group B receivers averaged 197 yards for every touchdown. Exactly what regression predicted would happen wound up happening. The between-group differences remained, but both groups regressed from outside the sustainable range to inside it. Once that happened, the dominant volume edge for Group B (who averaged 75% more yards per game over the last four weeks) made them much better fantasy options.
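A back-of-the-envelope check shows how a volume edge turns those ratios into a scoring gap. The sketch below assumes standard scoring on receiving production only (0.1 points per yard, 6 points per touchdown), which is not spelled out here. The function name and the 60-yard baseline are hypothetical, and the baseline cancels out of the comparison anyway.

```python
POINTS_PER_YARD = 0.1       # assumed standard scoring
POINTS_PER_TOUCHDOWN = 6.0  # assumed standard scoring


def points_per_game(yards_per_game, yards_per_touchdown):
    """Fantasy points per game implied by a yardage rate and a yard-to-touchdown ratio."""
    touchdowns_per_game = yards_per_game / yards_per_touchdown
    return yards_per_game * POINTS_PER_YARD + touchdowns_per_game * POINTS_PER_TOUCHDOWN


baseline_yards = 60.0  # arbitrary; any positive value gives the same edge

group_a_ppg = points_per_game(baseline_yards, 107)         # 107 yards per touchdown
group_b_ppg = points_per_game(baseline_yards * 1.75, 197)  # 75% more yards, 197 per touchdown

edge = group_b_ppg / group_a_ppg - 1
print(f"Group B edge: {edge:.0%}")
```

Under those assumptions the edge comes out to roughly 46%, close to the 47% reported in the scorecard.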
I love it when regression proves right on the overall prediction. But I love it even more when regression proves right on all of the individual details that lead to the overall prediction, too.
As for our other live prediction... Patrick Mahomes II needed 5 touchdowns through 2 games to give us a win. He scored three in his first game, scored one more early against the Broncos, and had 1st-and-goal from the three early in the 2nd quarter when he left with an injury. A fifth touchdown would have been all but assured had he stayed in the game, and Mahomes will almost certainly miss the last two games in the window. By all accounts this prediction should be a moral victory for regression.
But we don't care about moral victories here. "Moral victory" is just another way of saying failure, and we don't hide from failure. Failure is a chance to learn, and Mahomes reinforces a lesson that we've talked about before. Regression operates best over large samples. On a player-by-player basis, regression is maybe a 60/40 prospect. If you pile a bunch of 60/40 prospects together into a single prediction it becomes more like a 90/10 prospect. If you hang your hopes on a single player, though... well, sometimes he's going to dislocate his kneecap in the second quarter and leave you holding the bag. That's the risk you run.
Just look at our Yard-to-Touchdown Ratio prediction. Overall, Group B dominated Group A. But some individual Group A receivers were just fine! Adam Thielen scored four touchdowns in the last four weeks. Kenny Golladay had 209 yards and 2 touchdowns in three games.
Similarly, some individual Group B receivers were busts. Christian Kirk had 43 yards and then got hurt and missed the rest of the sample. Odell Beckham (of all people) averaged 49 yards with no touchdowns and was held under 30 yards in two of his three games. Most of the possible one-on-one predictions I could have made still would have come up in favor of Group B. But some wouldn't have. Had I pitted Adam Thielen against Odell Beckham, Group B would have gotten trounced.
The idea that individual players are maybe 60/40 bets but making a bunch of 60/40 bets can turn into an overall 90/10 bet is cold comfort when one of those 60/40 bets fails, which will happen quite often (40% of the time, in fact). Again, this is why I spend so much time tracking the results and detailing them here, to provide some comfort that even when we go astray, the underlying process is still sound.
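To make the 60/40 versus 90/10 intuition concrete, here is a simplified sketch. It assumes every bet is independent with the same 60% chance of success, and it treats the combined prediction as a win when a majority of the individual bets win. Neither assumption comes from the column: real players are correlated, and the actual predictions compare group averages, not head-to-head counts. `majority_win_probability` is a hypothetical helper.

```python
from math import comb


def majority_win_probability(num_bets, win_chance=0.6):
    """Chance that more than half of num_bets independent bets succeed."""
    needed = num_bets // 2 + 1
    return sum(
        comb(num_bets, wins) * win_chance**wins * (1 - win_chance) ** (num_bets - wins)
        for wins in range(needed, num_bets + 1)
    )


for num_bets in (1, 5, 15, 41):
    print(num_bets, round(majority_win_probability(num_bets), 3))
```

A single bet stays at 60%, and the probability climbs as more bets are pooled. Under these toy assumptions it takes somewhere around 40 bets to get near 90%, so the exact figures depend heavily on how the bets are combined.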
Tight Ends Also Regress!
So far we've had a prediction for regression from running backs, wide receivers, and quarterbacks. (Well, one quarterback at least.) I wouldn't want you to think I've been avoiding tight ends so far, but... well, I kind of have been avoiding tight ends so far.
As with quarterbacks, predicting regression for tight ends has some challenges. As I mentioned when looking at preseason ADP, tight end production seems to stabilize much earlier in the season than quarterback, running back, or wide receiver production. Tight ends also typically get less volume than receivers: when I made my yard-to-touchdown ratio prediction after three weeks, there were 29 wide receivers with at least 200 receiving yards. After seven weeks, there are still only 17 tight ends with at least 200 yards.
There's also one unique challenge this season. If you look objectively at which tight ends are most likely to regress, the players who I'd most want to put in my Group B, the top three names are... Travis Kelce, Zach Ertz, and George Kittle. Which would certainly give off the appearance of stacking the deck in my favor and would make any resulting prediction much less impressive.
But here's the deal... those three tight ends are due to regress. Out of 31 tight ends with at least 20 fantasy points, those three guys rank 1st, 2nd, and 3rd in yard-to-touchdown ratio. They have combined for 1321 yards but only three touchdowns, a whopping 440 yards per touchdown. And more than that, the only other tight end with a yard-to-touchdown ratio over 200 is Darren Waller!
What are we to do when the players most likely to regress all happen to be the best players? It seems like a shame to back away from some genuine regression targets just because they're highly regarded.
So let's perhaps stack the deck against ourselves until our prediction becomes suitably difficult. Kelce, Ertz, Kittle, and Waller currently have 1806 yards and 5 touchdowns. They've scored a combined 213.1 fantasy points in 26 games, good for 8.2 fantasy points per game. They're going to score more touchdowns going forward, which means they're our Group B. Now we just need to find a worthy Group A to compare them to.
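The arithmetic behind those figures is easy to reproduce. The sketch below uses only the totals quoted above. The 100-to-200 band is the sustainable range from the earlier receiver prediction, and applying it to tight ends is an assumption made for illustration, as are the variable names.

```python
yards, touchdowns = 1806, 5
fantasy_points, games = 213.1, 26

yards_per_touchdown = yards / touchdowns  # far above the assumed 200 ceiling
points_per_game = fantasy_points / games

# Touchdowns the same yardage would imply at each edge of the assumed band.
implied_at_200 = yards / 200
implied_at_100 = yards / 100

print(f"{yards_per_touchdown:.1f} yards per touchdown")
print(f"{points_per_game:.1f} fantasy points per game")
print(f"{implied_at_200:.1f} to {implied_at_100:.1f} touchdowns at sustainable ratios")
```

Even at the stingy end of that band, the same yardage would imply about nine touchdowns where the group has only five.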
Ricky Seals-Jones, Jimmy Graham, Eric Ebron, Vance McDonald, Darren Fells, Foster Moreau, Blake Jarwin, and Cameron Brate have the fewest yards per touchdown among tight ends with at least 20 fantasy points (not counting Will Dissly, who is done for the year with an Achilles injury). They are going to be our Group A.
I get that the players in Group A certainly don't look as imposing, but there are twice as many of them, and collectively they've produced 1236 yards and 19 touchdowns, outscoring Group B by roughly 12% in total standard scoring in the process. Sure, Group B is outscoring Group A by 75% per game to this point, but those touchdown totals are flukes and Group B should be dominating much more thoroughly than they currently are.
So here's the prediction: over the next four weeks, Group B tight ends will score at least twice as much per game as those in Group A. Anything less than a full-on doubling of Group A's production counts as a loss. Double or better counts as a win.
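Stated as a rule, the win condition might look like the following sketch. The function name and the sample inputs are hypothetical; only the "at least twice as much per game" threshold comes from the prediction itself.

```python
def prediction_wins(group_a_points, group_a_games, group_b_points, group_b_games, multiple=2.0):
    """True if Group B's points per game are at least `multiple` times Group A's."""
    group_a_ppg = group_a_points / group_a_games
    group_b_ppg = group_b_points / group_b_games
    return group_b_ppg >= multiple * group_a_ppg


# Made-up four-week totals, purely for illustration.
print(prediction_wins(120.0, 30, 140.0, 16))  # 8.75 per game vs. 4.0 per game
print(prediction_wins(120.0, 30, 100.0, 16))  # 6.25 per game vs. 4.0 per game
```

Comparing per-game averages, not raw totals, matters here because Group A has twice as many players and will log far more games over the four weeks.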