Running Back Deterioration, Part II
by Doug Drinen, Exclusive to Footballguys.com
For reference, here is a link to Part I.
The general question we want to answer here is: assuming age and talent are equal, does previous workload help us
predict future career length?
There is a mathematical technique called regression whose exact purpose is to answer questions like this. Suppose
Factors A, B, and C play a role in determining Quantity D. Assuming you've got enough past data and assuming
certain technical conditions are met, regression will give you a formula that tells you how to take a known A, B,
and C and use them to predict the value of Quantity D.
And that's exactly what we want to do. We want a formula that will predict the future career length of a back given
his level of quality and his previous workload. The formula we get will tell us how important previous workload
is (if at all).
The big problem here is that we can't just input each running back's "level of quality" into the formula. We have
to decide on how to measure this. I'm going to use career-to-date VBD value as my measure of quality. While not perfect, I
believe it does a pretty decent job of giving us a rough estimate of a running back's quality.
So I took all running back seasons since 1978 by running backs age 27 or older, and I recorded the following data:
- His VBD value for that year
- His career VBD prior to that year
- His career workload prior to that year
- His age
- The number of career rushes he had after that season
I plugged all that data into the computer and it spit out the following formula:
Future rushes =~ 3203 - 104*age + 2.3*VBDLastYr + .813*PreviousVBD - .13*PreviousRsh
For the purposes of this discussion, the key number is the -.13. It says: all else equal, every rushing attempt you
had before last year will cost you .13 predicted future rushes. So if two backs are completely equal in every way,
but one of them had an extra 500 rushes when he was young, you would expect the player with the higher workload to
have 500*.13 = 65 fewer rushes during the rest of his career. The 104 next to "age" indicates that, all else equal,
a player who is one year older will expect to have 104 fewer carries left in the tank. Combining these two numbers,
we could infer that it would take about 800 previous rushes to age a back as much as one chronological year does.
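To make the arithmetic concrete, here is a minimal sketch of the formula as a Python function. The coefficients come straight from the formula above; the function name, the argument names, and the two example backs are hypothetical, made up only to show how the numbers combine.

```python
def projected_future_rushes(age, vbd_last_year, previous_vbd, previous_rushes):
    """Expected future career rushes, using the coefficients from the formula above."""
    return (3203
            - 104 * age
            + 2.3 * vbd_last_year
            + 0.813 * previous_vbd
            - 0.13 * previous_rushes)

# Two HYPOTHETICAL backs, identical except that one has 500 extra previous rushes.
light = projected_future_rushes(age=28, vbd_last_year=100,
                                previous_vbd=300, previous_rushes=1000)
heavy = projected_future_rushes(age=28, vbd_last_year=100,
                                previous_vbd=300, previous_rushes=1500)

# Cost of the extra workload: 500 * 0.13 future rushes.
workload_penalty = light - heavy

# Previous rushes that cost as much as one chronological year of age.
rushes_per_year_of_age = 104 / 0.13

print(round(workload_penalty), round(rushes_per_year_of_age))
```

Whatever values you pick for the other inputs, the gap between the two backs stays the same, because the formula is linear: only the 500-rush difference matters.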
Just for grins, let's see what this formula predicts for some of today's backs. The formula was created using data
from backs who had completed their age 27 season, had at least 100 rushes the previous season, and at least 400
rushes prior to that, so we should only apply it to players meeting those conditions. Here they are:
Player             Age   Projected future rushes
=================================================
Shaun Alexander     29   973
Edgerrin James      28   946
Tiki Barber         31   636
Thomas Jones        28   624
Ricky Williams      29   564
Fred Taylor         30   486
Michael Bennett     28   467
Marcel Shipp        28   466
Warrick Dunn        31   350
Priest Holmes       33   318
Curtis Martin       33   291
Corey Dillon        32   217
Stephen Davis       32   140
Mike Anderson       33   105
You might think that Alexander's projection of 973 future rushing attempts seems a little low, and you might think
Edgerrin James' 946 seems even lower. But remember that this isn't supposed to be interpreted as the most likely
outcome. Rather, it's an expected value, or a weighted average. The formula is not saying, "I project Shaun
Alexander to have 973 more rushes in his career." It's saying something closer to, "there is some chance that
Alexander will suffer a catastrophic injury early next year and never play again, there is some chance that he will
lose effectiveness and only play for two more unimpressive seasons, there is some chance that he will play five
more seasons, and there is some chance that he will play eight more seasons and shatter Emmitt Smith's rushing
record. When I average these possible outcomes together, taking into account my best guess at the probabilities of
each, I get 973 future rushes."
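Here is a small sketch of what "expected value" means in this sense. The scenarios and probabilities below are invented purely for illustration; they are not outputs of the regression and are not meant to reproduce Alexander's 973. The point is only that a weighted average need not match the single most likely outcome.

```python
# HYPOTHETICAL scenarios as (probability, future rushes); invented for illustration.
scenarios = [
    (0.10, 0),     # catastrophic injury, never plays again
    (0.30, 450),   # loses effectiveness, a couple of unimpressive seasons
    (0.45, 1100),  # several more solid seasons
    (0.15, 2400),  # a very long, productive career
]

# The probabilities must account for every possibility.
assert abs(sum(p for p, _ in scenarios) - 1.0) < 1e-9

# Expected value: each outcome weighted by its probability.
expected_rushes = sum(p * rushes for p, rushes in scenarios)

# The single most likely outcome is a different question.
most_likely_rushes = max(scenarios, key=lambda s: s[0])[1]

print(round(expected_rushes), most_likely_rushes)
```

In this made-up example the expected value lands below the most likely outcome, because the small chance of a career-ending injury pulls the average down.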
While we're pretending the computer can talk, I may as well let you know that it also told me, "this formula is
the best I can do with the data you gave me, but there was some wacky stuff in there and there will be some more
wacky stuff in the future, so don't expect perfection or anything close to it. Being a computer just means I'm
able to do computations quickly --- it doesn't mean I can predict the unpredictable."
In some ways, the formula seems smart. Even though Thomas Jones is three years younger than Tiki Barber, the
formula "recognizes" that Barber has a much longer history of excellence than Jones does, and so it projects Barber
to get more future carries. Of course, the formula doesn't really recognize anything; it doesn't know Thomas Jones
from a hole in the ground (or even from a binary string of 1s and 0s that represents a hole in the ground). All
it's doing is attempting to predict the future in the way that best mimics the past. The past data we fed into the
computer said that, in general, players who didn't accumulate much value earlier in their career --- like Thomas
Jones --- don't have careers as long as those who did (like Barber).
The formula estimates that Tiki Barber has 636 carries left in him right now. It's instructive to look at what
Tiki's projection will look like at the beginning of next year. If he gets hurt, let's say after 130
carries and zero VBD, then this time next year the formula will project that he is essentially finished: about 70
carries left. If, on the other hand, he has a year just like 2005, then the formula will project him to have about
500 more carries remaining.
No matter how old you are (within reason), as long as you were productive in your most recent season, the formula
thinks you've got something left. But if you're on the north side of 30 and have a bad season, it will turn on you
in a hurry. Since the formula was generated in such a way as to best fit the past data, the lesson is clear: age
isn't much of a problem --- and neither is workload --- if you're productive. But once you start sliding, it's
hard to put the brakes on.
Unfortunately, what I just said amounts to: old-but-productive running backs will continue to be productive right
up until the point that they cease being productive. Genius. The trick is to figure out when the productivity
will stop, and this does not help us with that at all.
But we've gotten off track. This post was supposed to be about age vs. workload and for the first time we can
actually put a number on it. The number is .13. That's how many future rushes each past rush costs you.
Let's talk a bit about that number and the uncertainty associated with it. Regression answers two basic
questions:
- What is our best guess at the number?
- Given the sample size and the amount of variation we saw in our input data, how sure are we that the number
isn't zero?
We answered the first question above: it's .13. I didn't tell you, though, that the answer to the second is "not very." [For regression
buffs, the p-value is about .22.] The point is: even though we have an estimate of .13, we do not have
statistically significant evidence, in the generally agreed-upon sense, that workload has any effect on future
career length.
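For a rough feel for what a p-value of about .22 means, the sketch below converts p-values into "how many standard errors from zero" using the standard normal approximation. That approximation is my assumption: a regression would normally use a t distribution, and neither the sample size nor the standard error is reported here. The function names are hypothetical.

```python
import math

def two_sided_p_value(z):
    """Two-sided p-value for a z-statistic under the standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))

def z_for_p_value(p, lo=0.0, hi=10.0):
    """Invert two_sided_p_value by bisection (the p-value shrinks as z grows)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if two_sided_p_value(mid) > p:
            lo = mid   # p-value still too large, so z must be bigger
        else:
            hi = mid
    return (lo + hi) / 2

# Distance from zero, in standard errors, implied by p = .22 (ASSUMES normality).
z_reported = z_for_p_value(0.22)

# Distance needed for the conventional .05 significance threshold.
z_needed = z_for_p_value(0.05)

print(round(z_reported, 2), round(z_needed, 2))
```

Under that assumption, the workload estimate sits a bit over one standard error from zero, while the usual standard for significance asks for roughly two.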