Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions.
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of all my predictions from last year and how they fared.
The Elephant In the Room
If you have even passing familiarity with what regression to the mean is all about, you probably know who I'm going to talk about this week. There are two players whose performance through two weeks stands out as just glaringly unsustainable, two guys who are miles ahead of their peers at their position. I'm talking, of course, about Matt Breida and Phillip Lindsay.
Okay, okay, so maybe you expected me to talk about Patrick Mahomes and Ryan Fitzpatrick this week. That'd be a reasonable assumption, and believe me, I'll be getting to them sooner or later. In the meantime, I'll say that Mahomes probably won't throw for 80 touchdowns this year, and Fitzpatrick seems like a pretty reasonable bet to finish with fewer than 6,000 passing yards. #Regression
But for the first prediction of the year I wanted to start this column off with a bang by focusing on the single worst, least-predictive, most useless stat out there: yards per carry. Yards per carry is not a "sticky" stat; that is, it doesn't tend to stay stable between one sample and the next. It is extremely sensitive to outliers. It jumps all over the map not just from one season to the next, but even within a single season. It takes 177 games or 1978 carries in the same system (essentially eleven years) before a player's yards-per-carry average "stabilizes" and reaches a point where it represents more skill than chance.
So if a player has a high yards-per-carry average (and especially if that average comes over a small sample), that average means essentially nothing going forward. The league-average yards per carry typically fluctuates between 4.1 and 4.3. Most players with very high and very low averages early in the year will perform somewhere near that range going forward. This presents an opportunity for savvy fantasy owners to trade players with few carries but lots of yards for players with lots of carries but few yards.
Which brings me to Breida, Lindsay, and my first prediction of the season. There are currently 21 players with at least 100 rushing yards. Here they are, sorted from highest to lowest yards per carry.
Now, how these predictions are going to work is I'm going to take the top performers in a statistic, lump them into Group A, take the bottom performers, lump them into Group B, and then predict that Group B will outperform Group A going forward. In many ways, this is a very restrictive format. I don't get to pick and choose who I want in my Group B or who I'd rather exclude from Group A.
But on the other hand, it does offer me quite a bit of leeway. Beyond picking which statistic to focus on, I also get to select my cutoffs. The easiest thing to do would be to just say anyone with a yards-per-carry average over 5 is bound to regress; this nets me a Group A of Breida, Ekeler, Crowell, Lindsay, and Coleman. Or I could say that I want to divide my sample into thirds, with the top third representing Group A and the bottom third comprising Group B. That adds Lamar Miller and Joe Mixon to Group A, but Mixon is going to miss much of the next four weeks anyway, so that's no big deal.
Those are things I might consider doing if I didn't actually believe in regression to the mean, or if I were just trying to make things look as impressive as possible. But I'm a true believer, so let's crank the difficulty settings all the way up. The midpoint of those 21 players is Jordan Wilkins with 4.2 yards per carry. The top 10 players all have 4.5 yards per carry or better, while the bottom 10 all have 4.0 yards per carry or worse.
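For anyone who wants to run this kind of split on their own numbers, here's a minimal sketch of one way to do it in Python. The `Rusher` type, the `median_split` function, and the sample data are all made up for illustration; the stat lines are placeholders, not real players.

```python
from typing import NamedTuple


class Rusher(NamedTuple):
    """Hypothetical record for one player's rushing line."""
    name: str
    carries: int
    yards: int

    @property
    def ypc(self) -> float:
        # Yards per carry for this player.
        return self.yards / self.carries


def median_split(rushers):
    """Rank by yards per carry and split around the midpoint.

    Returns (group_a, middle, group_b). With an odd number of players,
    the single midpoint player lands in `middle` and joins neither group.
    """
    ranked = sorted(rushers, key=lambda r: r.ypc, reverse=True)
    half = len(ranked) // 2
    group_a = ranked[:half]                    # highest YPC
    middle = ranked[half:len(ranked) - half]   # empty when the count is even
    group_b = ranked[len(ranked) - half:]      # lowest YPC
    return group_a, middle, group_b


# Placeholder data only: 21 invented rushers, not real stat lines.
sample = [Rusher(f"Player {i}", 20, 60 + 4 * i) for i in range(1, 22)]
group_a, middle, group_b = median_split(sample)
```

With 21 players this yields two groups of 10 and leaves the midpoint player out of both, which mirrors the cutoff described above. The point of splitting this way is that it leaves no room to cherry-pick who lands where.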
Grouping the players this way gets some heavy hitters into Group A; Saquon Barkley and Ezekiel Elliott were first-round picks in fantasy drafts this last offseason because they're expected to rush for a ton of yards. But again, I believe in regression to the mean enough to really put it to the test. So these are your groups.
Matt Breida, Austin Ekeler, Isaiah Crowell, Phillip Lindsay, Tevin Coleman, Lamar Miller, Joe Mixon, Ezekiel Elliott, Saquon Barkley, and T.J. Yeldon have combined to carry the ball 271 times for 1476 yards, a 5.4 YPC average. This is your Group A.
Jordan Howard, Kenyan Drake, James Conner, Dion Lewis, Todd Gurley, Marshawn Lynch, Kareem Hunt, Jamaal Williams, Adrian Peterson, and Carlos Hyde have combined to carry the ball 331 times for 1194 yards, a 3.6 YPC average. This is your Group B.
To this point in the season, Group A has outrushed Group B by 24%. I predict that, through the magic of regression to the mean, Group B will average more rushing yards per game over the next four weeks. Be sure to check back in the weeks to come to see how well my prediction has fared.
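If you'd like to check my math, here's a quick sketch in Python using only the group totals quoted above. The variable and function names are my own; note that the 24% figure compares total rushing yards, not yards per carry.

```python
def yards_per_carry(carries, yards):
    """Total rushing yards divided by total carries."""
    return yards / carries


# Combined totals for each group, as quoted in the column.
group_a = {"carries": 271, "yards": 1476}
group_b = {"carries": 331, "yards": 1194}

ypc_a = yards_per_carry(**group_a)
ypc_b = yards_per_carry(**group_b)

# How much more total yardage Group A has than Group B, as a fraction.
yardage_edge = group_a["yards"] / group_b["yards"] - 1

print(f"Group A: {ypc_a:.1f} YPC")               # Group A: 5.4 YPC
print(f"Group B: {ypc_b:.1f} YPC")               # Group B: 3.6 YPC
print(f"Group A yardage edge: {yardage_edge:.0%}")  # Group A yardage edge: 24%
```

Group B gets there on 60 more carries, which is the whole bet: volume tends to stick around, and efficiency tends not to.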