Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
For those who are new to the feature, here's the deal: every week, I dive into the topic of regression to the mean. Sometimes I'll explain what it really is, why you hear so much about it, and how you can harness its power for yourself. Sometimes I'll give some practical examples of regression at work.
In weeks where I'm giving practical examples, I will select a metric to focus on. I'll rank all players in the league according to that metric, and separate the top players into Group A and the bottom players into Group B. I will verify that the players in Group A have outscored the players in Group B to that point in the season. And then I will predict that, by the magic of regression, Group B will outscore Group A going forward.
Crucially, I don't get to pick my samples (other than choosing which metric to focus on). If the metric I'm focusing on is touchdown rate, and Christian McCaffrey is one of the high outliers in touchdown rate, then Christian McCaffrey goes into Group A, and may the fantasy gods show mercy on my predictions.
Most importantly, because predictions mean nothing without accountability, I track the results of my predictions over the course of the season and highlight when they prove correct and also when they prove incorrect. Here's a list of my predictions from 2019 and their final results, here's the list from 2018, and here's the list from 2017.
THE SCORECARD
In Week 2, I opened with a primer on what regression to the mean was, how it worked, and how we would use it to our advantage. No specific prediction was made.
In Week 3, I dove into the reasons why yards per carry is almost entirely noise, shared some research to that effect, and predicted that the sample of backs with lots of carries but a poor per-carry average would outrush the sample with fewer carries but more yards per carry.
In Week 4, I talked about how the ability to convert yards into touchdowns was most certainly a skill, but it was a skill that operated within a fairly narrow and clearly-defined range, and any values outside of that range were probably just random noise and therefore due to regress. I predicted that high-yardage, low-touchdown receivers would outscore low-yardage, high-touchdown receivers going forward.
In Week 5, I talked about how historical patterns suggested we had just reached the informational tipping point, the time when performance to this point in the season carried as much predictive power as ADP. In general, I predicted that players whose early performance differed substantially from their ADP would tend to move toward a point between their early performance and their draft position, but no specific prediction was made.
In Week 6, I talked about simple ways to tell whether a statistic was especially likely to regress or not. No specific prediction was made.
In Week 7, I speculated that kickers were people, too, and lamented the fact that I'd never discussed them in this column before. To remedy that, I identified teams that were scoring "too many" field goals relative to touchdowns and "too many" touchdowns relative to field goals and predicted that scoring mix would regress and kickers from the latter teams would outperform kickers from the former going forward.
In Week 8, I noted that more-granular measures of performance tended to be more stable than less-granular measures and predicted that teams with a great point differential would win more games going forward than teams with an identical record, but substantially worse point differential.
In Week 9, I talked about the interesting role regression to the mean plays in dynasty, where the mere fact that a player is likely to regress sends signals that that player is probably quite good and worth rostering long-term, anyway. No specific prediction was made.
In Week 10, I explained why Group B's lead in these predictions tended to get smaller the longer each prediction ran and showed how a small edge over a huge sample could easily be more impressive than a huge edge over a small sample. No specific prediction was made.
In Week 11, I wrote that yards per pass attempt was an example of a statistic that was significantly less prone to regression, and for the first time I bet against it regressing.
In Week 12, I talked about "on pace" stats and how many of the players who wound up setting records were not the players who were "on pace" to do so.
In Week 13, I came up with a list of players who were getting hot just in time for the playoffs... and then explained why they probably weren't getting hot just in time for the playoffs, predicting that they'd cool off back to their normal production level going forward.
Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
---|---|---|---|
Yards per Carry | Group A had 3% more rushing yards per game | Group B has 36% more rushing yards per game | Success! |
Yard to Touchdown Ratio | Group A averaged 2% more fantasy points per game | Group B averages 40% more fantasy points per game | Success! |
TD to FG ratio | Group A averaged 20% more points per game | Group B averages 36% more points per game | Success! |
Wins vs. Points | Both groups had an identical win% | Group B has a 4% higher win% | Failure |
Yards per Attempt | Group B had 14% more yards per game | Group B has 21% more yards per game | 1 |
Recent Performances | Players were "hot" for the playoffs | Players regressed 87% back to their previous avg | 3 |
Contrary to the chart above, our "high yards per attempt" group technically leads our "low yards per attempt" group by 30% since the prediction, but this number is inflated because Carson Wentz was benched halfway through the third quarter this week and finished the game with just 79 yards. If this had happened to a Group B quarterback, I absolutely would have counted it against my prediction. But the goal is to always make the strongest possible case for regression, so I'm throwing it out rather than leaving Group A operating at a handicap.
With Joe Burrow and Daniel Jones getting hurt and Carson Wentz getting benched, most of the Group A games are going to come from Ben Roethlisberger and Tom Brady, two unambiguously good quarterbacks. But all of the games could come from Brady and Roethlisberger and it still wouldn't make a difference; since our prediction, Brady is averaging just 6.3 yards per attempt and Roethlisberger is netting just 5.6. Both quarterbacks are shoo-ins for the Hall of Fame, but the fact that they had such low yards per attempt averages early in the year indicated that they were likely to maintain low yards per attempt averages going forward, because yards per attempt is one of the stickiest, most predictive stats we have.
As for our "hot" players, they've already cooled off considerably. They averaged 11.06 points over the full season, but 15.54 points in their last four games. In Week 13, they averaged 11.63 points, almost all the way back to their original performance level (87% of the way back, to be exact). Remember, they need to finish at a mark of at least 67%, which represents the point twice as close to their original level as their most recent level. It's only one week, but we're off to a promising start.
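For anyone who wants to check the arithmetic, here is a minimal sketch of how that 87% figure falls out of the three averages quoted above. The function and variable names are mine, and I'm assuming the 67% target means two-thirds of the gap between the hot-streak average and the full-season average.

```python
def regression_fraction(baseline, hot, current):
    """Share of the gap between the hot streak and the baseline that has closed."""
    return (hot - current) / (hot - baseline)

baseline = 11.06  # full-season average quoted above
hot = 15.54       # average over the four games before the prediction
week13 = 11.63    # average in Week 13

frac = regression_fraction(baseline, hot, week13)
print(f"Regressed {frac:.0%} of the way back")

# Assumption: the 67% target is two-thirds of the gap, i.e. the scoring level
# that sits twice as close to the baseline as to the hot streak.
threshold = hot - (2 / 3) * (hot - baseline)
print(f"Group must average {threshold:.2f} or fewer points to hit the target")
```

On these numbers, the group needs to finish the prediction window at roughly 12.55 points per game or lower for the prediction to count as a win.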
May the Best Team Win. (Though it Probably Won't.)
There's an old joke that fantasy football is 50% skill and 50% luck. When you win, that's the skill, and when you lose, that's the luck.
If that's the case, then the fantasy playoffs are way more luck than that. You might think that the best team has a decent shot at winning the title, but (depending on your definition of decent, I suppose), you'd probably be surprised.
You're almost certainly familiar with Bill James, the father of sabermetrics in baseball and probably the man who has done more for sports analytics than anyone else in history. Bill James has a formula he calls his "favorite toy", a simple tool to estimate how likely a given player is to reach a given statistical milestone. For instance, Matt Stafford has 44,000 career passing yards and is about to turn 33; given what we know about aging in football, how likely is it that Stafford reaches 60,000 passing yards for his career? As the name "favorite toy" suggests, the formula isn't meant to be very rigorous and scientific. It's mostly just meant as a fun way to get into the right ballpark.
If I had a "favorite toy" like James, it would probably be a tool to quickly estimate a team's championship odds based on its regular-season production. I've already written six articles on the topic: using a team's share of pooled all-play wins, discussing the capriciousness of playoff matchups, using probability trees based on existing brackets and future projected points, multiplying out theoretical probabilities, correlating regular-season performance to playoff performance on an individual level, and calculating actual titles won by teams based on their historical percentile ranking.
Every time I address this topic, I try to stress that if you have a good team, your championship odds are probably nowhere near as good as you think they are. And if you have a bad team they're probably nowhere near as bad as you think they are, either. Each of these methods yielded high-end title odds between 30 and 50%. The best odds I've ever found using any method were 56%, and that was using a truly historic team that had a bye and a soft bracket.
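To see why the numbers land where they do, here is a rough sketch of the "multiplying out theoretical probabilities" idea. It is a toy version, not the method from the earlier articles: it assumes a single-elimination bracket, a constant chance of winning each game, and independent games, and the per-game win probabilities are hypothetical values I picked for illustration.

```python
def title_odds(per_game_win_prob, rounds):
    """Chance of winning every game in a single-elimination bracket,
    assuming the same win probability in each independent round."""
    return per_game_win_prob ** rounds

# Hypothetical per-game win probabilities for a very good team.
for p in (0.60, 0.70, 0.80):
    no_bye = title_odds(p, 3)    # three playoff wins needed
    with_bye = title_odds(p, 2)  # a bye removes one round
    print(f"win {p:.0%} of games: {no_bye:.1%} without a bye, {with_bye:.1%} with a bye")
```

Even a team that wins 70% of its games, which is a lot against other playoff teams, comes out at about a one-in-three shot over three rounds and just under a coin flip with a bye.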
The last method I used (historical odds based on percentile rank) was actually the most depressing of the bunch. Going through the history of my oldest dynasty league, out of 130 possible team-seasons, only eight times did a team score 20% more points than the league average. Of those eight teams (the eight best teams in league history), as many lost their first playoff game as won a championship (3 teams each).
I'm obviously not Bill James; if I were, I probably would have found a satisfactory quick-and-dirty estimate my first time out, but instead, I'm out here continuing to hack away at the problem. Every approach I've used has its flaws, but the fact that they all point in the same direction certainly says something. And that thing is this: if you have the best team in your league, even if you have the best team your league has ever seen, you're probably going to lose in the next three weeks.
The whole focus of this column is on how random results are over short timelines, in football more than in any other fantasy sport. Patrick Mahomes II, Kyler Murray, and Russell Wilson are much better fantasy quarterbacks than Derek Carr, and yet last week Carr was the #1 fantasy quarterback while Mahomes ranked 15th, Murray was 19th, and Wilson was 20th. Justin Herbert was the #4 quarterback from Weeks 2-12, but the #29 quarterback in Week 13.
And of course, there's nothing special about quarterbacks that makes them prone to swings like this. David Montgomery was the #1 fantasy running back in PPR leagues last week, while Derrick Henry ranked 35th. Corey Davis, Cole Beasley, Marvin Jones, Keke Coutee, Jamison Crowder, Michael Gallup, and Rashard Higgins were top ten fantasy receivers last week; Keenan Allen and Terry McLaurin weren't in the Top 50. In standard scoring, Darren Waller outscored T.J. Hockenson by about a point over the first 12 weeks... and by about 24 points in Week 13.
Weird stuff like this happens every single week of the season. When it happens in a regular-season week, it's quickly forgotten. When it happens in the playoffs (and it will happen in the playoffs), seasons will be ended.
There's an upside to all this doom and gloom, however. Fantasy football is zero-sum; everyone's title odds always add up to 100%. If the best team's chances are usually worse than people might think, that means every other team's odds must be better. While top teams are usually a 35-40% bet to win it all, the 2nd- and 3rd-best teams still have around a 20% chance of getting the glory. And even the worst teams still have a decent shot.
The most dominant playoff performance in my oldest dynasty league came from a team that had the top weekly score in Week 14, Week 15, and Week 16. During the regular season, that team ranked 109th out of the 130 teams in league history in points scored. It only made the playoffs because of an extremely fortunate schedule, but once there it caught the positive side of variance and proceeded to dismantle all challengers.
You're probably not going to catch positive variance like that. Rare outcomes are rare for a reason. But the key point to remember is that you might. Or your opponent might, instead. We do everything we can to put ourselves in a position to win a title, but when we make it to the end it's largely out of our hands. I always enter the playoffs with the expectation that I'll lose, and then I'm never upset to discover I was wrong.
To all of you readers who are still alive in your fantasy leagues, I hope you buck the odds. I would wish you good luck, but as we all know, it's only luck when you lose.
So instead, I'll wish you good skill.