A statistical state of mind
In recent years, statistics have become a larger and larger part of daily life in America. The 2012 presidential election and the bestseller-turned-blockbuster “Moneyball” have helped turn statisticians like Nate Silver and Bill James into household names. We are reaching a point where phrases like “confidence intervals” and “correlation coefficients” are penetrating the public consciousness in a way that seemed unthinkable ten years ago.
Most modern statistical analysis uses something called “frequentist inference”, which entails a very specific process. In the most basic version, you select two variables to test for a relationship. You create a “null hypothesis”, which is just a formal statement that there is no relationship between the two. You gather data, and then calculate the probability of getting that particular set of data if the null hypothesis were true. If the odds are sufficiently low, you can say there’s enough evidence to reject the null hypothesis.

To give an example of how this might look in practice, let’s say that I want to test whether there’s a relationship between being tall and playing in the NBA. I would start by creating a null hypothesis- in this case, “There is no relationship between being tall and playing in the NBA”. Then I would gather the data, which will surely show a very strong correlation between height and NBA participation (for instance, some estimates suggest that 17% of all men in America who are 7’ or taller will play in the NBA). After I have gathered my data, I will note that the relationship is very strong and the sample size is very large, so it is exceedingly unlikely that I would have seen a relationship like that if my null hypothesis were true. In effect, I am saying “based on the data, I have enough evidence to conclude that it is exceedingly unlikely that there is no relationship at all between being tall and playing in the NBA”. There’s a lot of math and other steps involved, but that’s the underlying idea. When most people perform statistical analysis, this is what they are doing: state a null hypothesis, gather the data, and, if possible, reject the null.
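For those who want to see what that recipe looks like in code, here is a minimal sketch in Python. The counts below are invented for illustration- they are not real census or NBA figures- and Fisher’s exact test is just one of many tests a frequentist might reach for:

```python
from scipy.stats import fisher_exact

# Invented 2x2 table: NBA participation among men 7' or taller vs. everyone else.
tall_nba, tall_not = 17, 83          # pretend ~17% of 7-footers reach the NBA
short_nba, short_not = 3, 99_997     # NBA players are vanishingly rare otherwise

# Null hypothesis: height has no relationship to NBA participation.
odds_ratio, p_value = fisher_exact([[tall_nba, tall_not],
                                    [short_nba, short_not]])
print(f"p-value: {p_value:.2e}")

# A tiny p-value means data this lopsided would be exceedingly unlikely
# if the null were true, so we reject the null.
```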
Frequentist inference is not the only form of statistical analysis, though. There is a competing school of thought called “Bayesian inference” that operates in a different manner. Frequentist inference leaves no room for prior beliefs- you gather the data and then make all judgments of probability based exclusively on the data you gathered. Bayesian inference, by contrast, combines the data you gather with your pre-existing beliefs in order to reach a new, updated belief. Again, there is a lot of messy math involved, but the math only codifies the process. We can understand the underlying principles without performing a single calculation.
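For the mathematically curious, the process being codified is Bayes’ rule. Here is a minimal sketch- the function name and the example numbers are mine, purely for illustration:

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior belief in a hypothesis after one piece of evidence.

    Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E).
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator

# Start at 10% belief; see evidence that is 5x more likely if the hypothesis is true.
print(bayes_update(0.10, 0.50, 0.10))  # ~0.36 -- belief rises, but well short of certainty
```

The prior belief never disappears; it simply gets blended with whatever the new evidence says.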
Applying the principle
If this sounds like gibberish, perhaps it’s best to illustrate with another example. At the risk of getting myself into trouble, let’s pretend that I wanted to know whether my wife was a talented cook. Now, I am quite sure she is- she feels about Food Network much the same way that I feel about the Red Zone Channel- but let’s all pretend in the name of science. The frequentist approach would be to create a null hypothesis (“my wife is not a talented cook”), gather data (ask my wife to cook me something), and then calculate whether the data gathered were sufficient to reject my null hypothesis. If my wife produced a flawlessly executed five-course meal, I might say “there is only a 1% chance that someone who was not a talented cook could produce such a feast; therefore, I can say with 99% confidence that my wife is a talented cook.” If she accidentally over-salted the sweet potato risotto, I might say “this risotto is pretty good, but there’s a 65% chance that someone who was not a talented cook could make something this tasty entirely by chance; therefore, I do not have enough evidence to reject my null hypothesis. My wife may or may not be a talented cook. The data are inconclusive.”
If I were a Bayesian, though, I would eat the meal my wife cooked and afterwards say “the risotto might have been over-salted, but I have tasted my wife’s risotto hundreds of times before this and it has always been delicious; therefore, I conclude that it’s far more likely that she’s a talented cook who accidentally over-salted the risotto in this one particular instance”. In other words, the new evidence was considered, but so was the entire wealth of pre-existing evidence I already had. If the pre-existing evidence was scant (say, if I had only had my wife’s cooking once before), then this new piece of evidence would carry a lot of weight. If the pre-existing evidence was abundant (say, if I’d been eating her cooking for over a decade), then one more risotto isn’t going to sway my belief very much one way or the other. Or, in other words, if she burnt the second thing she ever cooked for me, I might be worried about the third. If she burnt the 1,002nd thing she ever cooked for me, I probably would not be particularly worried about the 1,003rd. My wife is an excellent cook, and on the exceedingly rare occasions when something falls flat, I do not discard that belief so easily. All of this probably sounds quite logical and intuitive to you; in fact, you probably operate in this manner in much of your life. I’ll bet you are surprised to learn that you’ve been a closet Bayesian for years.
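If you want to see that intuition in miniature, here is a toy Beta-binomial sketch- a standard Bayesian model for estimating a success rate- with all meal counts invented for the occasion:

```python
# Toy Beta-binomial model: belief that the next meal will be good, starting
# from a uniform Beta(1, 1) prior. (All meal counts are invented.)
def p_next_meal_good(good_meals, bad_meals):
    # Posterior mean of a Beta(1 + good, 1 + bad) distribution.
    return (good_meals + 1) / (good_meals + bad_meals + 2)

# One good meal on record, then a burnt one: genuine cause for worry.
print(p_next_meal_good(1, 1))      # 0.50
# 1,001 good meals on record, then a burnt one: barely a ripple.
print(p_next_meal_good(1001, 1))   # ~0.998
```

The same bad meal moves the needle a lot or hardly at all, depending entirely on how much evidence came before it.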
What does my wife’s (excellent!) cooking have to do with fantasy football? Well, simply put, when we are valuing players after week 1, it is possible to approach the job like a Bayesian or like a frequentist. Allen Hurns had 110 receiving yards and 2 touchdowns in his very first game as a rookie; he became just the second player in the last 30 years to accomplish both feats in a debut, joining Anquan Boldin. If I were a frequentist, I would create a null hypothesis (“Allen Hurns is not a great wide receiver”), and then calculate the odds that Hurns could have accomplished this feat if that null were true. Of the more than 1,000 wide receivers who have entered the league over that span, only two accomplished it. This was extremely unlikely to happen if the null were true. Therefore, I can say with 99% confidence that Allen Hurns is a great receiver!
But let’s instead approach this question as a Bayesian. Instead of forgetting everything we knew about Hurns heading into week 1, let’s keep it and merely add this new data point to the pile. Thirty-three receivers were selected in the 2014 NFL draft, and Allen Hurns was not among them. That suggests he is not likely to be a great receiver. Hurns was the #4 receiver (and 3rd-best rookie) on Jacksonville’s roster this preseason. That suggests he is not likely to be a great receiver. Hurns had a historically great professional debut. That suggests he is likely to be a great receiver. The positive indicator gets added to the negative indicators. The result is that we are higher on Hurns than we were before his historically great debut, but not as high on him as we would be on, say, fellow Jaguars receiver Marqise Lee had he had a historically great debut. Or, in other words, it takes a lot less to convince us that a highly-regarded rookie WR is very good than it takes to convince us that a lightly-regarded rookie WR is very good.
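In Bayesian terms, each indicator multiplies our odds on Hurns by a likelihood ratio. Here is a minimal sketch of that bookkeeping; the prior and the ratios below are invented to show the mechanics, not real estimates of anyone’s chances:

```python
def combine_evidence(prior_prob, likelihood_ratios):
    """Update a prior probability with several pieces of evidence.

    Each ratio is P(evidence | great WR) / P(evidence | not a great WR).
    """
    odds = prior_prob / (1 - prior_prob)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1 + odds)

# Invented numbers: 10% prior, dinged for going undrafted (0.4) and sitting
# 4th on the depth chart (0.4), boosted by a historic debut (8.0).
print(combine_evidence(0.10, [0.4, 0.4, 8.0]))  # ~0.12 -- higher than before, hardly a lock

# A highly-regarded rookie (30% prior, no negative indicators), same debut:
print(combine_evidence(0.30, [8.0]))            # ~0.77
```

The same spectacular debut produces very different conclusions depending on what we already believed, which is exactly the point.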
Some dynasty owners feel that it is inappropriate to consider indicators like draft position once a player has made a roster and started to play games, but that’s not how Bayesian inference works. All evidence is counted and used to form a belief. As more evidence comes in, it gets integrated and beliefs get updated, but the old evidence does not vanish or drop out of consideration; instead, it becomes a progressively smaller piece of the pie. For a player with one career game played, draft position represents a large percentage of the total information we have on him. For a player with one hundred career games played, draft position represents a vanishingly small percentage of the total information available. For someone like Larry Fitzgerald, the fact that he was a top-5 overall selection is perhaps the tiniest and least meaningful piece of information we have on him… but we still do not actively discard or disregard it.
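To put rough numbers on that shrinking pie, here is one more sketch. Pricing draft position as a fixed number of “pseudo-games” is purely my illustrative device, and the weight is invented:

```python
# Treat draft position as a fixed lump of pseudo-evidence, then watch its
# share of everything we know shrink as real games accumulate.
DRAFT_WEIGHT = 16  # invented: pretend draft position is worth one season of games

for games_played in (1, 16, 100):
    share = DRAFT_WEIGHT / (DRAFT_WEIGHT + games_played)
    print(f"{games_played:3d} games played: draft position is {share:.0%} of the pie")
# -> 94% after one game, 50% after a season, ~14% after a hundred games
```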
A time for Bayes
The week after week 1 is when it is easiest to think like a frequentist, and when it is most important to think like a Bayesian. The return of football feels so significant that it’s tempting to discard months’ worth of work over the results of a small handful of touches. I wrote last year that preseason ADP predicts rest-of-year production better than early-season production does, and that fact illustrates the importance of keeping a Bayesian mindset. We need to cling to our initial evaluations and merely update them for the new information we have received. That is just one more reason why it’s so important to write down our initial evaluations before the season starts, before new information begins biasing our memories.
Like every coin, this one has a flip side. If we must guard against the temptation to overwrite our existing beliefs with new information, we can rest assured that many of our league mates will fall into that same trap. While we are behaving like proper Bayesians, we should keep an eye out for those in our leagues who are overreacting to a single week and replacing their previous expectations wholesale. Most of the biggest steals in dynasty will be bought and sold in the first few weeks of the season by owners who were a little too zealous in discarding everything they previously knew over a few stellar minutes of football.