Welcome to Regression Alert, your weekly guide to using regression to predict the future with uncanny accuracy.
The Scorecard
Returning readers, you know how this works by now, but for new readers here's the deal. Every week I take a look at a specific statistic that is prone to regression and identify high and low outliers in that statistic, and then I wave my hands in the air and shout “regression!”
But since predictions aren't any fun without someone holding your feet to the fire afterward, I don't stop there. I lump all of the high outliers into Group A. I lump all of the low outliers into Group B. I verify that Group A is outperforming Group B. And then I predict that Group B will outperform Group A over the next four weeks.
I don't get to pick and choose my groups, beyond being free to pick and choose what statistics are especially prone to regression. If I'm tracking yards per target, and Antonio Brown is one of the high outliers in yards per target, then Antonio Brown goes into Group A and may the fantasy gods show mercy on my predictions.
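For readers who like to see the mechanics, here is a minimal sketch of that grouping step. Everything in it is an assumption for illustration: the player names, the numbers, and the `split_outliers` helper are hypothetical, not the actual data or tooling behind this column.

```python
from statistics import mean

def split_outliers(players, stat, n):
    """Return (group_a, group_b): the n highest and n lowest players by `stat`."""
    ranked = sorted(players, key=lambda p: p[stat], reverse=True)
    return ranked[:n], ranked[-n:]

# Hypothetical running backs; none of these figures come from real players.
players = [
    {"name": "RB1", "ypc": 5.6, "yards_per_game": 88.0},
    {"name": "RB2", "ypc": 5.1, "yards_per_game": 74.0},
    {"name": "RB3", "ypc": 4.3, "yards_per_game": 61.0},
    {"name": "RB4", "ypc": 4.0, "yards_per_game": 57.0},
    {"name": "RB5", "ypc": 3.3, "yards_per_game": 52.0},
    {"name": "RB6", "ypc": 3.0, "yards_per_game": 45.0},
]

# No hand-picking: whoever ranks highest or lowest goes into the group.
group_a, group_b = split_outliers(players, "ypc", 2)

a_avg = mean(p["yards_per_game"] for p in group_a)
b_avg = mean(p["yards_per_game"] for p in group_b)

# Verify that Group A is outperforming Group B before predicting the reversal.
print(f"Group A: {a_avg:.1f} yards per game, Group B: {b_avg:.1f} yards per game")
```

The only judgment call in the sketch is which statistic to rank by; group membership falls out of the sort.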
And then, groups chosen and predictions made, I track my progress. That's this.
In Week 2, I outlined what regression was, what it wasn't, and how it worked. No prediction was made.
In Week 3, I listed running backs with exceptionally high and low yards per carry averages and predicted that the low-ypc cohort would outperform the high-ypc cohort over the next four weeks.
In Week 4, I looked at receivers who were overperforming and underperforming in yards per target and predicted that the underperformers would outperform the overperformers over the next four weeks.
In Week 5, I compared the predictive accuracy of in-season results to the predictive accuracy of preseason ADP. Outside of a general prediction that players would tend to regress in the direction of their preseason ADP, no specific prediction was made.
In Week 6, I looked at quarterbacks who were throwing too many or too few touchdowns given the amount of passing yards they were accumulating, then predicted that the underperformers would score more fantasy points than the overperformers going forward.
In Week 7, I looked at receivers who were catching too many or too few touchdowns based on their yardage total, then predicted that the underperformers would score more fantasy points than the overperformers going forward.
In Week 8, I revisited yards per carry, again predicting that the high-carry, low-ypc group would outrush the low-carry, high-ypc group going forward.
In Week 9, I went back to yard to touchdown ratios, predicting that the low-touchdown group would close the gap substantially with the high-touchdown group going forward.
In Week 10, I discussed the pitfalls of predicting regression over 4-week windows. No specific prediction was made.
In Week 11, I once more delved into the theory behind regression and highlighted the importance of not cherry-picking which players are “too good” or “not good enough” to regress.
Statistic for regression | Performance before prediction | Performance since prediction | Weeks remaining |
---|---|---|---|
yards per carry | Group A had 60% more rushing yards per game | Group B has 16% more rushing yards per game | None (Win!) |
yards per target | Group A had 16% more receiving yards per game | Group B has 11% more receiving yards per game | None (Win!) |
passing yards per touchdown | Group A had 13% more fantasy points per game | Group A has 17% more fantasy points per game | None (Loss) |
receiving yards per touchdown | Group A had 28% more fantasy points per game | Group B has 1% more fantasy points per game | None (Win!) |
yards per carry | Group A had 25% more fantasy points per game | Group B has 16% more fantasy points per game | None (Win!) |
rushing yards per touchdown | Group A had 21% more fantasy points per game | Group B has 20% more fantasy points per game | 1 |
Let's talk about Samaje Perine. Through seven weeks he was averaging 3.02 yards per carry. In the last of those weeks, he didn't even get a touch. I wrote last week about not cherry-picking the samples, and nothing has tested that resolve more than running the yards per carry list in week 8 and seeing Perine pop up in my Group B.
It would have been easy to rationalize Perine's exclusion. I mean, the guy had just lost his role and didn't even touch the ball the week before. But I refrained because this column isn't about me, it's about regression. The point isn't to prove that Adam is very good at predicting the future. (He's not all that great, to be honest.) The point is to prove that *regression to the mean* is very, very good at predicting the future. So into Group B Perine went.
Perine was still the weakest back in Group B, especially after putting up a second straight zero in week 8 and following it up with a 0.9 in week 9. At that point, I figured he'd be an anchor, and the rest of Group B would essentially have to score more with four players than Group A did with five to pull things out.
And then a funny thing happened. Perine had a decent game in week 10, and then he had the best game in either sample in week 11, gaining 126 yards and scoring a touchdown.
Yes, injuries in front of Perine played a role in his rise, just as injuries hurt Group A when it lost Aaron Jones (though it should be noted that Group B lost Ty Montgomery at the same time). But it wasn't just that Perine got a role; he performed very well in that role, too.
Again, Perine's yards per carry through seven weeks was 3.02. His yards per carry average over the four-week sample was 4.74. Yards per carry isn't really “a thing”; it's mostly just random noise.
And the next time you believe a player is playing too poorly to regress... remember Samaje Perine.
Now on to the prediction.
Yes, Carson Wentz, Touchdowns Do Follow Yards.
Now that the wounds have had a little bit of time to heal, I wanted to revisit regression to the mean's greatest defeat of the season: quarterback yard-to-touchdown ratios.
As I said back in week 6, historically most quarterbacks have averaged around 160-180 passing yards for every touchdown. Truly exceptional quarterbacks like Peyton Manning, Tom Brady, and Drew Brees average around 140 yards per touchdown, instead. Aaron Rodgers— and only Aaron Rodgers— averages about 125 yards per touchdown.
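To make the yards-per-touchdown idea concrete, here is a rough sketch of how "touchdowns over expectation" could be computed. The function names and the example stat line are hypothetical, and the 160-yard default is just the low end of the historical range above, not a precise constant.

```python
def expected_touchdowns(passing_yards, yards_per_td=160.0):
    """Touchdowns we'd expect from a yardage total at a given yards-per-TD rate."""
    return passing_yards / yards_per_td

def touchdowns_over_expectation(passing_yards, touchdowns, yards_per_td=160.0):
    """Positive means more touchdowns than the yardage suggests; negative means fewer."""
    return touchdowns - expected_touchdowns(passing_yards, yards_per_td)

# Hypothetical quarterback: 2,400 passing yards and 18 touchdowns.
yards, tds = 2400, 18

# Against the ordinary 160-yard baseline.
print(touchdowns_over_expectation(yards, tds))

# Against the 140-yard baseline reserved for truly exceptional quarterbacks.
print(touchdowns_over_expectation(yards, tds, yards_per_td=140.0))
```

Switching the baseline from 160 to 140 shrinks the apparent overperformance, which is why the choice of baseline matters for any one quarterback.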
As the season has gone on, quarterbacks near the middle of this distribution have trended even more towards the middle, but a few persistent outliers have remained.
Now, the whole purpose of this column is to stand as a champion for the concept of regression to the mean. In order to accomplish that purpose, I have two mandates. First, my predictions should be correct. Obviously, I'm never going to convince anyone of the value of a concept that's never actually right.
Second, however, my predictions should be relatively dramatic. If Group A outscores Group B by 4% in-sample, and Group B outscores Group A by 3% out of sample, that's a correct prediction... but not a very dramatic one. The entire thing could be simplified as “these two groups were scoring about the same, and after regression... they were still scoring about the same”.
These two mandates are often at odds with each other. As I detailed last week, the best way to improve my accuracy is to increase my sample size. But larger samples don't diverge as sharply, so the results will necessarily be less dramatic.
The “safe” play here is to compare the top 10 quarterbacks in touchdowns over expectations to the bottom 10 quarterbacks in touchdowns over expectations. But Jared Goff, the 10th-biggest underperformer, has three-tenths of a passing touchdown fewer than we'd expect based on his passing yardage total. Calling him an underperformer at all seems kind of silly.
Similarly, “overperformer” Cam Newton has an expected yard-to-touchdown ratio of 160, and an actual yard-to-touchdown ratio of... 159.43. Calling that overperformance seems a bit finicky. My “most quarterbacks will average around 160 yards per passing touchdown” heuristic is a hand-wave, and quibbling over a difference of a fraction of a yard implies a level of precision that just isn't there.
So instead, in honor of Samaje Perine and his newfound fantasy relevance, we'll take a risk and go for the dramatic. There are three quarterbacks who have overperformed their expected total by at least one touchdown, and there are likewise three quarterbacks who have underperformed by at least one touchdown.
Even if you believe Carson Wentz is inherently a 140-yard-to-touchdown guy— and remember, “140 yards per touchdown” guys are basically all inner-circle Hall of Famers— he still has a whopping 3.1 more passing touchdowns than you'd expect based on his passing yardage total.
At the same time, Andy Dalton and Dak Prescott are also both under the 140-yard-to-touchdown line. (So is Russell Wilson, but Wilson is barely under, and he's been a 140-yard-to-touchdown guy for his entire career, so it's not much of an overperformance.) So there's your Group A: Wentz, Dalton, and Prescott.
On the other end of the spectrum, Marcus Mariota and Jacoby Brissett are both averaging more than 240 yards for every passing touchdown, and Drew Brees is averaging 185. Even if you believe that Brees is no longer a 140-yard-to-touchdown guy like he's been for his entire career with the Saints, he's still probably at least 25 yards per touchdown above where he should be. Meet your Group B: Mariota, Brissett, and Brees.
Through eleven weeks, Group A averages 21.63 fantasy points per game, while Group B averages 18.94, a 14% edge for Group A. Meanwhile, if every quarterback had performed to expectations in yards per touchdown (again, counting Wentz as a 140-yard-to-touchdown player and Brees as a 160-yard-to-touchdown player), Group A would be averaging 19.86 fantasy points per game and Group B would be averaging 20.45, a 3% edge for Group B.
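Those edges are just ratios of the per-game averages. As a quick check, here is a small sketch using the four averages quoted above; the `pct_edge` helper is a hypothetical name introduced for illustration.

```python
def pct_edge(leader, trailer):
    """Percent by which the leader's per-game average exceeds the trailer's."""
    return (leader / trailer - 1) * 100

# Actual fantasy points per game through eleven weeks.
actual_a, actual_b = 21.63, 18.94

# Points per game if every quarterback had hit his expected yards per touchdown.
expected_a, expected_b = 19.86, 20.45

print(f"Actual: Group A leads by {pct_edge(actual_a, actual_b):.0f}%")
print(f"Expected: Group B leads by {pct_edge(expected_b, expected_a):.0f}%")
```

Rounded to whole numbers, this reproduces the 14% edge for Group A and the 3% edge for Group B.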
So based on current passing yard-to-touchdown ratios (and with a bit of trepidation given the size of the samples), I predict that Group B will outscore Group A over weeks 12-15.
Be sure to tune in to find out if Carson Wentz can remain undefeated against the forces of regression!