Coin Tossing ... and Reversion to the Mean
motivated by a discussion on Morningstar

Once upon a time there was an article where the author, in explaining Reversion to the Mean (RTM), said something like:

The mathematical principle of reversion (or regression) to the mean states that "the greater the deviation of a random variate from its mean, the greater the probability that the next measured variate will deviate less far."

The classic example is a series of coin tosses. If a coin comes up Heads 90 times out of the first 100 tosses, look for Tails to make a comeback over the next 100.

>And you're saying that ain't true?
No, not at all. I was intrigued by the response to that claim.
The gambler's fallacy is the mistaken notion that the odds for something with a fixed probability increase or decrease depending upon recent occurrences. After 90 Heads (that's the recent occurrence), the gambler assumes that "Tails will make a comeback over the next 100".

>And you're saying that ain't true?
No, I'm not. Give me a chance to explain!
Consider five tosses of a "fair" coin ("fair", meaning there's a 50% chance of getting Heads, on each toss).
There are six possible results, ranging from 0 Heads to 5 Heads. See Figure 1.
There's only one way to get 5 Heads, but several ways to end up with, say, 3 Heads.
Figure 1 shows one way: you toss H T T H H.
But you could also toss H H T T H and you could also toss H H H T T and you ...

>Yeah, so?
So, as it turns out, there are precisely 5C3 ways to get 3 Heads.
Hence, it's more likely you'll get 3 Heads than 5 Heads since there are more ways to get there.

>5C3? Huh?
5C3 is a binomial coefficient: 5C3 = 5!/(3! 2!) = 5(4)/2 = 10.


Figure 1

By the binomial coefficient 5C3, I mean you multiply out (1+x)^5 and you get 1 + 5x + 10x^2 + 10x^3 + 5x^4 + x^5.
See the coefficients of the various powers of x?
They're 1, 5, 10, 10, 5 and 1 and them's the binomial coefficients 5C0, 5C1, 5C2, 5C3, 5C4 and 5C5.
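If you'd rather count than multiply out, here's a minimal Python sketch that lists every possible sequence of five tosses and tallies how many contain 0, 1, ... 5 Heads. The names `sequences`, `ways` and `coefficients` are just labels invented for this illustration.

```python
from itertools import product
from math import comb

# Every possible sequence of 5 tosses, e.g. ('H', 'T', 'T', 'H', 'H').
sequences = list(product("HT", repeat=5))

# Count how many sequences contain exactly k Heads, for k = 0..5.
ways = [sum(1 for s in sequences if s.count("H") == k) for k in range(6)]

# The same counts, taken from the binomial coefficients 5C0 ... 5C5.
coefficients = [comb(5, k) for k in range(6)]

print(len(sequences))   # total number of sequences
print(ways)             # tally by brute-force counting
print(coefficients)     # tally by binomial coefficient
```

The two tallies should agree, which is just another way of saying that 5C3 counts the ways to get 3 Heads.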

>And 5C3 = 10 and that's the probability of getting 3 Heads ?
No! It just means that, of all the possible ways to toss a coin five times, 10 of them will give 3 Heads.
>Yeah, so?
So we can calculate the probability of getting 3 Heads when tossing a coin five times.
We just calculate the totality of possible sequences of Heads and Tails ... and that's 2^5 = 32.
Hence the probability is ...

>It's 10/32, right?
Right, and that's 0.3125 or about 31.3%.
So we can calculate the probability of tossing 0, 1, 2, 3, 4 or 5 Heads in five tosses and get the probability distribution in Figure 2.


Figure 2
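The numbers behind a chart like Figure 2 can be reproduced with a few lines of Python. This is only a sketch, assuming a fair coin; the helper name `heads_distribution` is made up for this example.

```python
from fractions import Fraction
from math import comb

def heads_distribution(n):
    """Probability of k Heads in n tosses of a fair coin, for k = 0..n."""
    total = 2 ** n   # number of equally likely Heads/Tails sequences
    return [Fraction(comb(n, k), total) for k in range(n + 1)]

dist = heads_distribution(5)
for k, p in enumerate(dist):
    # Show each probability as an exact fraction and as a decimal.
    print(f"{k} Heads: {p} = {float(p):.4f}")
```

Using exact fractions avoids any rounding, so the six probabilities add up to exactly 1.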

>So what about that 90 Heads in 100 tosses?
The number of ways to toss a hundred coins - the totality of possible sequences of Heads and Tails - that's 2^100 ... about 1.3x10^30.
How many will end up with 90 Heads? That's 100C90 or about 1.7x10^13.
The probability of getting 90 Heads is then ...

>That's ... uh, (1.7x10^13) / (1.3x10^30)?
Yes, and that's about 0.0000000000000000137.
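Python's integers don't overflow, so the same arithmetic can be checked directly. A small sketch, with variable names chosen just for this example:

```python
from math import comb

n, k = 100, 90
total_sequences = 2 ** n      # every possible Heads/Tails sequence of 100 tosses
ways_to_get_k = comb(n, k)    # sequences with exactly 90 Heads (100C90)
p_exactly_k = ways_to_get_k / total_sequences

print(f"{total_sequences:.3e}")
print(f"{ways_to_get_k:.3e}")
print(f"{p_exactly_k:.3e}")
```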

>And for the next hundred tosses?
There's exactly the same probability of tossing 90 Heads, but the probability of tossing fewer is ...

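>I'd say it's 0.9999999999999999863, right?
Near enough. (That's really the chance of not getting exactly 90 Heads; ruling out 91 or more as well changes only the last few digits.) For the next hundred tosses, we'd expect ...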

>We'd expect Tails to make a comeback!
Exactly. In fact, there's a 99.9% probability that we'd get 65 Heads or fewer the next time we toss a hundred coins.
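Cumulative figures like these come from adding up binomial coefficients. Here's a hedged sketch, assuming a fair coin; `prob_at_most` is a helper invented for this example. Note that "fewer than 65" and "65 or fewer" give slightly different answers, so it pays to be explicit about which one is meant.

```python
from fractions import Fraction
from math import comb

def prob_at_most(n, k):
    """P(number of Heads <= k) in n tosses of a fair coin, as an exact fraction."""
    return Fraction(sum(comb(n, j) for j in range(k + 1)), 2 ** n)

p_fewer_than_90 = prob_at_most(100, 89)   # 89 Heads or fewer
p_at_most_65 = prob_at_most(100, 65)      # 65 Heads or fewer
p_fewer_than_65 = prob_at_most(100, 64)   # 64 Heads or fewer

print(float(1 - p_fewer_than_90))   # chance of 90 or more Heads
print(float(p_at_most_65))
print(float(p_fewer_than_65))
```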
>Are you saying that the author was right about RTM?
Yup! It has nothing to do with whether or not the coins have a memory.
This coin-tossing RTM has everything to do with probability distributions.

For our hundred tosses of a "fair" coin, the distributions look like Figure 3 ... and you can see that, for N = 90, the probabilities are microscopic.

>It looks like there's a bit over a 50% chance of getting 50 Heads or fewer, right?
Yes, and ...

>And a little under a 20% chance of getting 45 Heads or fewer, right?
Yes, and ...

>And about a 98% chance of getting 60 or fewer ...
Yes, but aren't you going to ask about the red curve in the upper picture?

>What's that red curve in the upper picture?
That's our old friend the Normal distribution, with Mean = 50 and Standard Deviation = 5.

For the Binomial distribution (with a fair coin), the Mean is N/2 = 100/2 = 50 and the SD is SQRT(N)/2 = 10/2 = 5. For large values of N (like N = 100 tosses) the Binomial distribution looks like the Normal distribution.

>Is that it?
That's it!  


Figure 3
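To see how closely the two curves agree, here's a rough Python sketch that compares the exact Binomial probability of k Heads with the Normal density at k, using Mean = 50 and SD = 5. It relies on `NormalDist` from the standard `statistics` module (Python 3.8 or later); the particular values of k are picked just for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n = 100
mean = n / 2          # 50
sd = sqrt(n) / 2      # 5
normal = NormalDist(mu=mean, sigma=sd)

for k in (40, 45, 50, 55, 60):
    binomial_p = comb(n, k) / 2 ** n   # exact chance of exactly k Heads
    normal_p = normal.pdf(k)           # Normal density at k, used as an approximation
    print(f"{k}: binomial {binomial_p:.4f}, normal {normal_p:.4f}")
```

The two columns should agree closely near the middle of the distribution, which is why the red curve hugs the bars.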

>But what about that Gambler's Fallacy?
As NewGuy said (on M*):

The gambler's fallacy is about the probability of the next **single** event. In this case, the next **single** coin toss.

Suppose the author had reasoned as follows.
The coin has come up 90 Heads in the last 100 tosses.
Thus, the coin is more likely to come up Tails on the next toss. Now that would be the Gambler's Fallacy.
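A quick simulation makes the point about the next single toss. A run of 90 Heads in 100 is far too rare to wait for, so this sketch substitutes a shorter streak (5 Heads in a row, an assumption made purely for illustration) and looks at the toss that comes right after each such streak.

```python
import random

rng = random.Random(12345)   # fixed seed so the run is repeatable
streak_length = 5            # hypothetical "recent occurrence": 5 Heads in a row
tosses = [rng.random() < 0.5 for _ in range(200_000)]   # True means Heads

# Collect the toss that immediately follows every run of 5 straight Heads.
next_after_streak = [
    tosses[i]
    for i in range(streak_length, len(tosses))
    if all(tosses[i - streak_length:i])
]

fraction_heads = sum(next_after_streak) / len(next_after_streak)
print(len(next_after_streak), round(fraction_heads, 3))
```

The fraction of Heads after a streak should hover around one half, since the simulated coin has no memory either.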