Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is touchdown rate, and Christian McCaffrey is one of the high outliers in touchdown rate, then Christian McCaffrey goes into Group A, and may the fantasy gods show mercy on my predictions.
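For the curious, the group-splitting procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `split_groups` helper, the `td_rate` and `ppg` field names, and every number in the sample data are made up for the example and are not the actual data or tooling behind this column.

```python
from statistics import mean

def split_groups(players, metric, group_size):
    """Rank players by `metric`, highest first.

    The top `group_size` players become Group A and the bottom
    `group_size` become Group B. No hand-picking allowed.
    """
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return ranked[:group_size], ranked[-group_size:]

# Hypothetical players with invented numbers, for illustration only.
players = [
    {"name": "Player 1", "td_rate": 0.09, "ppg": 14.0},
    {"name": "Player 2", "td_rate": 0.07, "ppg": 12.5},
    {"name": "Player 3", "td_rate": 0.05, "ppg": 11.0},
    {"name": "Player 4", "td_rate": 0.03, "ppg": 10.5},
    {"name": "Player 5", "td_rate": 0.02, "ppg": 10.0},
    {"name": "Player 6", "td_rate": 0.01, "ppg": 9.0},
]

group_a, group_b = split_groups(players, "td_rate", group_size=2)

# Step two: verify that Group A has outscored Group B so far.
group_a_ppg = mean(p["ppg"] for p in group_a)
group_b_ppg = mean(p["ppg"] for p in group_b)
print(group_a_ppg > group_b_ppg)
```

The only choice left to me in that sketch is the metric itself, which mirrors the rule above: once the metric is picked, the groups pick themselves.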
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of my predictions from 2019 and their final results, here's the list from 2018, and here's the list from 2017.
THE SCORECARD
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I talked about how the ability to convert yards into touchdowns was most certainly a skill, but it was a skill that operated within a fairly narrow and clearly-defined range, and any values outside of that range were probably just random noise and therefore due to regress. I predicted that high-yardage, low-touchdown receivers would outscore low-yardage, high-touchdown receivers going forward.
In Week 5, I talked about how historical patterns suggested we had just reached the informational tipping point, the time when performance to this point in the season carried as much predictive power as ADP. In general, I predicted that players whose early performance differed substantially from their ADP would tend to move toward a point between their early performance and their draft position, but no specific prediction was made.
In Week 6, I talked about simple ways to tell whether a statistic was especially likely to regress or not. No specific prediction was made.
In Week 7, I speculated that kickers were people, too, and lamented the fact that I'd never discussed them in this column before. To remedy that, I identified teams that were scoring "too many" field goals relative to touchdowns and "too many" touchdowns relative to field goals and predicted that scoring mix would regress and kickers from the latter teams would outperform kickers from the former going forward.
In Week 8, I noted that more-granular measures of performance tended to be more stable than less-granular measures and predicted that teams with a great point differential would win more games going forward than teams with an identical record, but substantially worse point differential.
In Week 9, I talked about the interesting role regression to the mean plays in dynasty, where the mere fact that a player is likely to regress sends signals that that player is probably quite good and worth rostering long-term, anyway. No specific prediction was made.
In Week 10, I explained why Group B's lead in these predictions tended to get smaller the longer each prediction ran and showed how a small edge over a huge sample could easily be more impressive than a huge edge over a small sample. No specific prediction was made.
In Week 11, I wrote that yards per pass attempt was an example of a statistic that was significantly less prone to regression, and for the first time I bet against it regressing.
In Week 12, I talked about "on pace" stats and how many of the players who wound up setting records were not the players who were "on pace" to do so.
In Week 13, I came up with a list of players who were getting hot just in time for the playoffs... and then explained why they probably weren't getting hot just in time for the playoffs, predicting that they'd cool off back to their normal production level going forward.
In Week 14, I offered the cold comfort that if you lose in the fantasy playoffs, the odds were never in your favor, anyway.
Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
---|---|---|---|
Yards per Carry | Group A had 3% more rushing yards per game | Group B has 36% more rushing yards per game | Success! |
Yard to Touchdown Ratio | Group A averaged 2% more fantasy points per game | Group B averages 40% more fantasy points per game | Success! |
TD to FG ratio | Group A averaged 20% more points per game | Group B averages 36% more points per game | Success! |
Wins vs. Points | Both groups had an identical win% | Group B has a 4% higher win% | Failure |
Yards per Attempt | Group B had 14% more yards per game | Group B has 28% more yards per game | Success! |
Recent Performances | Players were "hot" for the playoffs | Players regressed 98% back to their previous avg | 2 |
Four weeks ago we looked at yards per pass attempt and found that Ben Roethlisberger surprisingly ranked among the worst in the league. At the time, the Steelers were undefeated and Roethlisberger was starting to get MVP buzz. But we trusted the evidence that yards per attempt was a "sticky" stat that didn't tend to strongly regress across samples. In the intervening four weeks, Roethlisberger's low YPA average has gone from an oddity to a serious cause for concern, falling all the way down to 5.48.
(To put that value into context: the last five quarterbacks to finish a season with 200+ pass attempts and a YPA average below 5.5 were four rookies— Jared Goff, Derek Carr, Blaine Gabbert, and Jimmy Clausen— along with Ryan Mallett in a year where he played for two different teams after being cut at midseason.)
Now, regression to the mean didn't suggest that Roethlisberger's struggles would somehow get worse; if anything, it predicted that Roethlisberger's yards per attempt average would improve, albeit only very slightly. But mostly it predicted that all ten quarterbacks in our sample would largely be the same players after the prediction than they were before. And that's what we saw. At the time of the prediction, our Group A quarterbacks averaged 6.63 yards per attempt; since our prediction, that has fallen to 6.02 (although without Roethlisberger it would instead be 6.39). Our Group B quarterbacks went from 8.37 yards per attempt to 8.24 yards per attempt. As a result, despite Group A maintaining their approximately 10% edge in passing volume, Group B remained comfortably ahead in passing yards per game.
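As a rough sanity check, passing yards per game is just yards per attempt times attempts per game, so Group B's edge can be approximated from the figures above. The sketch below assumes Group A's volume edge was exactly 10% in both periods, which is my simplification of "approximately 10%"; the `yardage_edge` helper is hypothetical, and the results will not match the scorecard exactly because the scorecard is built from actual per-game totals.

```python
def yardage_edge(ypa_a, ypa_b, volume_edge_a):
    """Approximate Group B's yards-per-game edge over Group A.

    yards per game = yards per attempt * attempts per game, so if
    Group A throws (1 + volume_edge_a) times as often as Group B,
    the yardage ratio is ypa_b / (ypa_a * (1 + volume_edge_a)).
    """
    return ypa_b / (ypa_a * (1 + volume_edge_a)) - 1

# Assumption: Group A's volume edge is exactly 10% in both periods.
edge_before = yardage_edge(ypa_a=6.63, ypa_b=8.37, volume_edge_a=0.10)
edge_since = yardage_edge(ypa_a=6.02, ypa_b=8.24, volume_edge_a=0.10)

print(f"Before the prediction: Group B ahead by about {edge_before:.0%}")
print(f"Since the prediction:  Group B ahead by about {edge_since:.0%}")
```

Under that assumption, Group B's per-attempt advantage is more than large enough to overcome Group A's extra volume in both periods.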
As for our sample of players who were "heating up" just in time for the playoffs, they've cooled off almost entirely back to their overall season-long level of performance. They averaged 11.08 fantasy points per game over the full season but 15.63 points per game in the four weeks before our column. They average 11.19 points per game in the two weeks since.
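The "98%" figure in the scorecard follows from those three averages. Here is a minimal sketch of the arithmetic; the `fraction_regressed` function name is my own, and the formula is simply the share of the hot-streak gap that has since disappeared.

```python
def fraction_regressed(baseline, hot, after):
    """Share of the gap between the hot streak and the baseline
    that has been given back since the prediction."""
    return (hot - after) / (hot - baseline)

share = fraction_regressed(baseline=11.08, hot=15.63, after=11.19)
print(f"{share:.0%}")
```

A value of 100% would mean the players landed exactly on their season-long average, and 0% would mean they stayed every bit as hot as they were.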
One Final Prediction
As you've probably noticed, I like to make predictions over a four-week sample to give variance enough time to even out. As you've probably also noticed, we don't have four weeks left together this season, so satisfying that preference is impossible.
In the face of this, I could opt to wrap up all predictions for the year, but I'd like to get one last win on the scoreboard if possible. So instead, I'll go in another direction. Halving the time for regression to operate shouldn't be a problem if I merely double the sample size of my prediction.
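The intuition behind that trade can be sketched with the standard error of a group average. This assumes every player-game is an independent draw with the same spread, which real football certainly violates, so treat it as a back-of-the-envelope illustration rather than a proof. The `sigma` value and group sizes below are hypothetical.

```python
import math

def standard_error(sigma, players, weeks):
    """Standard error of a group's per-game average, assuming each
    player-game is independent with standard deviation `sigma`."""
    return sigma / math.sqrt(players * weeks)

# Hypothetical spread and group sizes, for illustration only.
usual = standard_error(sigma=8.0, players=13, weeks=4)
doubled = standard_error(sigma=8.0, players=26, weeks=2)

print(math.isclose(usual, doubled))
```

Under that assumption the noise depends only on the total number of player-games, so twice the players over half the weeks leaves it unchanged.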
The easiest way to double my sample size is to issue a prediction that covers both running backs and wide receivers (and even tight ends, too). Which means, as much as I'd like to dunk on yards per carry again this year, it's probably best to turn to my other standby, yard-to-touchdown ratios.
If you remember from earlier in the year, some of the variation in how frequently players score touchdowns relative to the rate they gain yards is meaningful. Julio Jones is less of a touchdown threat than Calvin Ridley. But the range of that variation is generally constrained over a long timeline. Typically, players will average between 100 and 220 yards for every touchdown they score.
In the short run, touchdowns are pretty random and the resulting ratios can wind up well outside that range. In the long run, things even out, and when they do, it's because the touchdowns follow the yards more than because the yards follow the touchdowns.
At the moment, there are 27 players who have scored at least 70 fantasy points and are averaging fewer than 100 yards rushing and receiving for every touchdown. These players are: Jonnu Smith, Jared Cook, Robert Tonyan Jr, Jimmy Graham, Mike Evans, Christian McCaffrey, Jeff Wilson, Adam Thielen, Chase Claypool, David Moore, Gabriel Davis, Rex Burkhead, Tyreek Hill, Antonio Gibson, Davante Adams, Chris Carson, Todd Gurley, Jerick McKinnon, Christian Kirk, Mark Andrews, Nelson Agholor, A.J. Brown, JuJu Smith-Schuster, D'Andre Swift, Logan Thomas, Nyheim Hines, and Gus Edwards. That's our Group A.
On the other end, there are 27 players who have scored at least 70 fantasy points while averaging more than 200 yards from scrimmage for every touchdown. These players are: Devin Singletary, Austin Ekeler, J.D. McKissic, Robby Anderson, Damien Harris, Jarvis Landry, Myles Gaskin, Terry McLaurin, Michael Gallup, Jerry Jeudy, Russell Gage, Cooper Kupp, Hunter Renfrow, Brandin Cooks, Julio Jones, D.J. Moore, Stefon Diggs, Jamaal Williams, DeAndre Hopkins, Tyler Boyd, Cole Beasley, Darius Slayton, Corey Davis, Clyde Edwards-Helaire, Raheem Mostert, CeeDee Lamb, and Cam Akers. (Whew.) That's our Group B.
To this point of the season, players in Group A average 53.8 yards and 0.70 touchdowns per game, good for a ratio of 76.9 yards per touchdown and a fantasy average of 9.58 points per game. Meanwhile, Group B players average 68.2 yards and 0.25 touchdowns per game, good for a ratio of 277.2 yards per touchdown and a fantasy average of 8.36 points per game. Group A is averaging 15% more points per game to this point largely on the back of an unsustainably high touchdown rate. Because touchdown rates tend to regress pretty strongly, I'm betting that over the next two weeks, Group B will average more fantasy points per game than Group A.
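For anyone who wants to check the math, here is a small sketch that recomputes the ratios from the rounded per-game figures above. The `yards_per_td` helper is my own naming. Note that the rounded inputs reproduce Group A's ratio but give roughly 272.8 for Group B rather than 277.2, presumably because the published 0.25 touchdowns per game is itself rounded.

```python
def yards_per_td(yards_per_game, tds_per_game):
    """Yards gained for every touchdown scored."""
    return yards_per_game / tds_per_game

# Inputs are the rounded per-game averages quoted in the text.
group_a_ratio = yards_per_td(53.8, 0.70)
group_b_ratio = yards_per_td(68.2, 0.25)

# Group A's current scoring edge, from the quoted fantasy averages.
points_edge = 9.58 / 8.36 - 1

print(f"Group A: {group_a_ratio:.1f} yards per touchdown")
print(f"Group B: {group_b_ratio:.1f} yards per touchdown (from rounded inputs)")
print(f"Group A scoring edge: {points_edge:.0%}")
```

Both ratios sit well outside the typical long-run range, which is the whole basis for the bet.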