Statistics Is Useless Part II: The Consequences
As we saw in Part I, statistics is hard. Used improperly, it can give us a false sense of security, lead us into mistakes, and even end in disaster, financial or otherwise.
Uncertainty management
In a financial context, this comes down to a failure of risk management or, more specifically, a failure to understand which domain one is operating in.
There are essentially three domains that one can find oneself in (see Keynes and Knight for a more in-depth discussion):
- Certainty. Existence of A gives certainty about existence of B. In probability terms, P(B|A) = 1 (or 0). If I wake up in the morning and see my laptop still sitting on my desk, I know with certainty that someone didn’t steal it overnight.
- Risk. Existence of A gives certainty about the probability of B. P(B|A) = C, where C is known. If I know a fair die has 6 faces, I know the probability of rolling a 3 is 1/6.
- Uncertainty. Even if we know that A exists, the probability of B is still unknown.
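The risk domain can be illustrated with a quick simulation. The sketch below assumes a fair six-sided die and uses Python's standard `random` module; the function name `roll_frequency` and the seed are illustrative choices. Because the generator is known, the observed frequency settles near the deduced probability of 1/6.

```python
import random

def roll_frequency(target: int, n_rolls: int, seed: int = 0) -> float:
    """Fraction of n_rolls of a fair six-sided die that show `target`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == target)
    return hits / n_rolls

if __name__ == "__main__":
    freq = roll_frequency(target=3, n_rolls=60_000)
    print(f"observed frequency of a 3: {freq:.4f} (deduced: {1/6:.4f})")
```

Under uncertainty there is no equivalent of the `1/6` to compare against, which is exactly the difficulty.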
The problem is that some believe they are operating in the domain of risk when, in reality, they are operating in the domain of uncertainty.
Generate THIS
One reason this happens is that we misidentify the generating function, that is, the process that produces the observations we see.
If one considers FTSE 100 price returns for the roughly 10-year period from the start of 2010 to late February 2020, one could be forgiven for assuming that the generating function produces almost normal observations. The distribution is pretty symmetrical with rare events roughly as uncommon as one would expect. From these 2,555 daily observations, it seems reasonable to conclude that the sample mean of 0.00016 (0.016%) and sample standard deviation of 0.0093 (0.93%) are a pretty good approximation for the true moments.
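For reference, the sample moments quoted above are the ordinary mean and standard deviation of daily returns. A minimal sketch, assuming a plain list of closing prices and simple (rather than log) returns; the `prices` values are made-up placeholders, not FTSE 100 data, and the function names are hypothetical:

```python
from statistics import mean, stdev

def daily_returns(prices: list[float]) -> list[float]:
    """Simple returns: (p_t - p_{t-1}) / p_{t-1}."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]

def sample_moments(returns: list[float]) -> tuple[float, float]:
    """Sample mean and sample (n-1) standard deviation of the returns."""
    return mean(returns), stdev(returns)

if __name__ == "__main__":
    prices = [100.0, 101.0, 100.5, 102.0, 101.2]  # placeholder, not real data
    mu, sigma = sample_moments(daily_returns(prices))
    print(f"mean = {mu:.5f}, std = {sigma:.5f}")
```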
However, if we extend the sample period by just one month, we see that our normal approximation might not be that accurate after all. Something happened in late February 2020 (any ideas?) that produced some rather non-normal observations.


According to our sample standard deviation, here we have a 10-sigma event. Something that should happen roughly once every 524,900,000,000,000,000,000 days. Maybe the generating function isn’t normal after all…
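The waiting time for such an event under a normal assumption can be sketched as follows. This is an illustration rather than a reproduction of the figure above: the exact number depends on the precise z-score and on whether one or both tails are counted. The function names are hypothetical; `math.erfc` comes from the Python standard library.

```python
import math

def normal_tail_probability(z: float, two_sided: bool = False) -> float:
    """P(Z > z) for a standard normal, or P(|Z| > z) if two_sided (z >= 0)."""
    one_tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    return 2.0 * one_tail if two_sided else one_tail

def expected_wait_in_days(z: float, two_sided: bool = False) -> float:
    """Average number of daily observations per event at least this extreme."""
    return 1.0 / normal_tail_probability(z, two_sided)

if __name__ == "__main__":
    for z in (3, 5, 10):
        print(f"{z}-sigma: about one day in {expected_wait_in_days(z):.3e}")
```

Whatever convention is used, the waiting time grows so quickly with the z-score that observing such a day at all is strong evidence against the normal assumption.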
Another particularly nasty problem that we talked about in Part I is that the generating function can change. Even if the generator is of Type 1 or 2, it can morph into something substantially less fun to play with. One can be operating under risk, only to go back to work on Monday after a quiet weekend in the countryside to find oneself operating under uncertainty. Models designed around classical statistics developed for dealing with problems of risk would no longer work. Bugger.
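To see why a changing generator hurts, consider a toy simulation. Everything in it is an assumption made for illustration: a calm regime of normal returns with the 0.0093 standard deviation quoted earlier, followed by a regime with three times the volatility. Sigma is estimated on the calm period only and then used to count "extreme" days in both.

```python
import random
from statistics import stdev

def count_extremes(returns, sigma, threshold=4.0):
    """Number of returns more than `threshold` sigmas away from zero."""
    return sum(1 for r in returns if abs(r) > threshold * sigma)

def simulate_regime_change(seed=1):
    rng = random.Random(seed)
    # Hypothetical calm regime: roughly ten years of daily returns.
    calm = [rng.gauss(0.0, 0.0093) for _ in range(2500)]
    # Hypothetical new regime: same mean, three times the volatility.
    stressed = [rng.gauss(0.0, 3 * 0.0093) for _ in range(250)]
    sigma_hat = stdev(calm)  # estimated before the change
    return count_extremes(calm, sigma_hat), count_extremes(stressed, sigma_hat)

if __name__ == "__main__":
    calm_hits, stressed_hits = simulate_regime_change()
    print(f"4-sigma days: calm={calm_hits}, stressed={stressed_hits}")
```

A model calibrated on the calm period treats 4-sigma days as near impossible, yet they become routine once the generator shifts.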
When we CAN say something
When we know that the generating function is of Type 1 or 2, we are dealing with a situation of risk. How do we know this? Via our old friend deduction.
We know that a coin flip will land either heads or tails. We know that drawing balls from an urn will only produce a finite, known set of colours. We know that a ball in roulette can only land on one of 37 slots.
We also know things about the physical world. Mother Nature tends to be bounded. We know humans can’t grow to be 100 ft tall. We know insects can’t be as big as humans (their respiratory/circulatory system won’t allow it). We know that nothing can travel faster than the speed of light.
Physical systems are bounded and, therefore, much easier to define statistically.
Some artificial systems in the real world are also bounded to some extent. Stocks cannot decrease by more than 100% of their value, for example. However, this is not usually the case in our artificial world. Socioeconomic variables tend to be of Type 3 or 4, meaning that we are operating in the domain of uncertainty.
But all hope is not lost. Uncertainty simplifies decision making. If I know an asset can go to 0 tomorrow, I probably will not invest all my savings into it a month before retirement. If the generating function of the share price of a company is unknown, I probably would not heavily short that company and be exposed to huge losses. I probably would not rely on some historic level of returns from the market in general.
In any case, knowing the likelihood of an event isn’t enough to make a decision. One also needs to consider goals, options, and consequences of one’s choices. By adding more information to your beliefs, you are exposing yourself to the risk that you are wrong. The benefits of adding this new information must outweigh the risks associated with it.