Coin Tossing     an appendix to Binomial Distribution

Part 1

Here's something interesting:

We start with $1.00 and flip a coin once each year ... for N successive years.
We want to calculate some sort of "average" return, but first some terminology:

  • If $1.00 grows to $G in N years, we'll call G the Cumulative Gain Factor.
    (Example: $1.00 grows to $2.34 in 5 years, so G = 2.34.)
  • One can achieve this N-year Gain Factor with an Annual Gain Factor of G^{1/N}.
    (Example: For G = 2.34 in 5 years, 2.34^{1/5} = 1.185, meaning that multiplying $1.00 by 1.185 five times gives $2.34.)
  • We'll call G^{1/N} - 1 the Annualization of the Gain Factor G.
    (Example: For G = 2.34 in 5 years, G^{1/5} - 1 = 0.185, so an annual 18.5% return yields 2.34 after 5 years.) This may or may not be an "Annualized Return" in the usual sense. For example, consider the AVERAGE of a thousand 30-year returns achieved by a thousand investors. If we calculate the "Annualization" of this AVERAGE 30-year Gain Factor, it may not be the Annualized Return of any of the investors. It will, however, be the annual return necessary to achieve the 30-year AVERAGE return.
  • If there are several Gain Factors, say G_1, G_2, ..., G_n, then the Average Gain Factor is (1/n){G_1 + G_2 + ... + G_n} = (1/n) Σ G_k.
  • With this background we consider the following:

    • Each time we get Heads, our money gets multiplied by 2.
    • Each time we get Tails, our money gets multiplied by 1/2.
    • There are 2^N possible sequences of Heads and Tails.
    • The Value of our moneypot after m Heads and N-m Tails is 2^m (1/2)^{N-m}
    • The Number of Ways to get m Heads and N-m Tails is C(N,m) = N! / (m! (N-m)!)
      These are the binomial coefficients, which, for N = 10, are: 1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1
    • The Average moneypot, after N coin tosses, is then:     (1/2^N) Σ (Number of Ways)(Value), namely
      (1) ....     (1/2^N) Σ C(N,m) 2^m (1/2)^{N-m} = (1/2^N){2 + (1/2)}^N
      ... where we used the magic formula:
      [M]     (x+y)^N = C(N,0) x^N + C(N,1) x^{N-1} y + C(N,2) x^{N-2} y^2 + ... + C(N,N) y^N = Σ C(N,m) x^{N-m} y^m
      with x = 1/2 and y = 2.
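
The definitions above are easy to check numerically. Here's a minimal Python sketch (the function names are ours, made up for illustration) that adds up (Number of Ways) x (Value) term by term and compares the result with the closed form in equation (1):

```python
from math import comb

def annualization(G, N):
    """Annual return that compounds to the cumulative gain factor G in N years."""
    return G ** (1.0 / N) - 1.0

def average_moneypot(N, heads=2.0, tails=0.5):
    """Unweighted average final moneypot over all 2**N Heads/Tails sequences."""
    total = sum(comb(N, m) * heads**m * tails**(N - m) for m in range(N + 1))
    return total / 2**N

def average_moneypot_closed_form(N, heads=2.0, tails=0.5):
    """Right-hand side of equation (1): ((heads + tails) / 2) ** N."""
    return ((heads + tails) / 2.0) ** N

print(round(annualization(2.34, 5), 3))   # the 5-year example: 0.185
print(average_moneypot(10), average_moneypot_closed_form(10))
```

The two numbers on the last line should agree (up to rounding), whatever N you pick.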

To be specific, we'll pick N = 10 so there are 2^{10} = 1024 possible Heads/Tails sequences.
Consider the final moneypot for 1024 investors, each of whom gets one of the 1024 sequences of Heads/Tails.

We'll consider the final value of the moneypot for each of these 1024 people.
We'll call these final moneypots: $M(1), $M(2), ... $M(1024).
We want to calculate (1/1024){$M(1) + $M(2) + ... + $M(1024)}, the Average Final Moneypot.
However, there are only 11 different moneypots (for m = 0, 1, 2, ... 10 Heads), which we'll call M(m=0), M(m=1), M(m=2), ..., M(m=10):
$0.0010, $0.0039, $0.0156, $0.0625, $0.2500, $1.00, $4.00, $16.00, $64.00, $256.00, $1,024.00
corresponding to m = 0 heads to m = 10 heads. (The case m = 4, a moneypot of $0.2500, is the one we'll follow below.)

These 11 different moneypots occur several times among the 1024 people we're considering. In fact, the frequency with which each occurs is given by the binomial coefficients: 1, 10, 45, 120, 210, 252, 210, 120, 45, 10, 1 (The case m = 4 occurs 210 times.)

The 11 different moneypots and the number of ways that each occurs are shown in Figure 1.

Figure 1

The median of the 11 possible moneypot values is just $1.00 (that's the $1.00 we started with ... see the big red dot?) and the most common outcome (in fact, 252 of the 1024 coin-toss sequences) is 5 Heads and 5 Tails (and a final moneypot of just the original $1.00).

Since 210 of our 1024 people receive $0.2500 after getting only 4 heads (and 6 tails), the TOTAL moneypot for these 210 people is then
210*($0.25) = $52.50.

To find the sum of the 1024 final moneypots, we just add: frequency*moneypot, like so:

1*M(m=0) + 10*M(m=1) + 45*M(m=2) + 120*M(m=3)+ 210* M(m=4) + 252*M(m=5) + ... + 10*M(m=9) + 1*M(m=10)
    using the binomial coefficients ... and highlighting the case m = 4

This gives the sum of the 1024 moneypots, namely:
$0.001 + $0.039 + $0.703 + $7.500 + $52.50 + $252.00 + $840.00 + $1,920.00 + $2,880.00 + $2,560.00 + $1,024.00 = $9,536.74

Dividing by 1024 we get the Average Final Moneypot: $9,536.74/1024 = $9.313.
(We could also have used equation (1), above, giving   (1/2^{10}){2 + (1/2)}^{10} = 9.313.)

Since each of these 1024 people started with just $1.00, the average Cumulative Gain Factor is 9.313.
Since this Gain is over N = 10 years, the Annualization of this Gain Factor is 9.313^{1/10} - 1 = 0.25 or 25%.
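
If you don't trust the binomial coefficients, you can let the computer play all 1024 investors. The following is a brute-force Python sketch (variable names are ours) that walks through every Heads/Tails sequence for N = 10, then computes the total, the average and the annualization:

```python
from itertools import product
from collections import Counter

N = 10
pots = []
for seq in product("HT", repeat=N):       # every one of the 2**N sequences
    pot = 1.0                              # each investor starts with $1.00
    for toss in seq:
        pot *= 2.0 if toss == "H" else 0.5
    pots.append(pot)

# How many sequences have m Heads? (these should be the binomial coefficients)
ways = Counter(seq.count("H") for seq in product("HT", repeat=N))

total = sum(pots)
average = total / len(pots)
annualized = average ** (1.0 / N) - 1.0

print(len(pots))
print([ways[m] for m in range(N + 1)])
print(round(total, 2), round(average, 3), round(annualized, 2))
```

This only works for small N, of course: the number of sequences doubles with every extra toss.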

>Huh? I'd expect that, on average, half heads, half tails ... we'd end up with our original $1.00 and that's a gain of 0%!
If every person on Earth were to toss a "fair" coin, once, then every sequence of Heads and Tails is equally probable. If we were to calculate the AVERAGE moneypot it'd be $1.25 (or very close to it, assuming there are lots of people on this small planet :^).
Or, you could toss the "fair" coin a jillion times, yourself. Then the AVERAGE 1-toss Gain Factor would be 1.25 so, on average, you'd make $1.25 for every $1.00 invested.

On the other hand, if we gain 100% for getting Heads and lose 100% for getting Tails (instead of just losing 50%), then the AVERAGE
1-toss Gain Factor would be 1.00 and the "Annualization" would be 0%.

Now, to get this 1.25 average Gain Factor, we've assumed a "fair" coin, where the probability of heads is 50%. However, we'd also get 1.25 if we averaged all possible sequences of Heads and Tails ... even if heads comes up 99% of the time. The 1024 final moneypots, corresponding to all possible sequences of Heads and Tails, depend only upon the numbers 2 and 1/2. How often each occurs (in the 1024 sequences) depends only upon the binomial coefficients. The probability of getting heads doesn't even show up in these calculations. A damaged coin that comes up Heads 99% of the time still has the same set of possible sequences.

>But what about every person on Earth, when they toss ...
Aah, that's a horse of a different hue. In that case we'd be considering not the ordinary average of all possible sequences, but the probability-weighted average. If the coin was so crooked that Tails never came up, then the totality of Heads/Tails sequences doesn't change, but there'd be nobuddy getting Tails.

Later we'll consider a coin which gives heads, say, 65% of the time ... more like the stock market.

Look at how our "return" was calculated:

  1. Calculate the average of all 1024 possible final money pots, after 10 years, starting with $1.00.
  2. Each of these moneypots is a Cumulative Gain Factor (since each person started with $1.00)
  3. Their average is {M(1) + M(2) + ... + M(1024)}/1024
    Remember! Each of M(1), M(2), etc. is an N-year Cumulative Gain Factor (since the initial moneypot was just $1.00).
  4. If we "Annualize" this average Cumulative Gain Factor we get:
    [{M(1) + M(2) + ... + M(1024)}/1024 ]^{1/10}.

This is what we calculated above:
the Annualization of the Average Cumulative Gain Factor = [(1/2^N) Σ M(k)]^{1/N}
We can also do the following:
  1. The 1024 Annualized Gain Factors (for each of our 1024 investors) are:
    M(1)^{1/10}, M(2)^{1/10}, M(3)^{1/10}, ..., M(1024)^{1/10}.
  2. The average of these 1024 Annualized Gain Factors is:
    {M(1)^{1/10} + M(2)^{1/10} + M(3)^{1/10} + ... + M(1024)^{1/10}}/1024.
the Average of the Annualized Gain Factors = (1/2^N) Σ M(k)^{1/N}
The 11 different Annualized Returns (for N = 10) are shown in Figure 2a.
They run from -50% (m = 0 heads) to 100% (m = 10 heads).

>Yeah, so?
So, what is the better estimate of how well these 1024 investors did?
In one case we averaged then annualized.
In the other we annualized (see Figure 2b), then averaged.
Multiplying (Annualized Gain Factors) x (Number of Ways) gives Figure 2c.

Figure 2c

Figure 2a

Figure 2b

Notice that Figure 2c is no longer symmetrical.

>And for our 1024 investers? What are these returns?
It's investors.
The first we've already done: the annualization of the average gain factor is 1.25 meaning a 25% return.
The second, the average of the annualized Gain Factors (see Figure 2b), is 1.024 meaning a 2.4% annual return.
Which is the better estimate of how well these 1024 investors did: 25% or 2.4%, annually?
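
The two recipes (average then annualize, or annualize then average) can be compared side by side. Here's a short Python sketch for N = 10, grouping the 1024 investors by their number of Heads (variable names are ours):

```python
from math import comb

N = 10
ways = [comb(N, m) for m in range(N + 1)]            # Number of Ways for m Heads
gain = [2.0**m * 0.5**(N - m) for m in range(N + 1)]  # Cumulative Gain Factor

# Recipe 1: average the Cumulative Gain Factors first, then annualize.
average_gain = sum(w * g for w, g in zip(ways, gain)) / 2**N
annualization_of_average = average_gain ** (1.0 / N)

# Recipe 2: annualize each investor's Gain Factor first, then average.
average_of_annualized = sum(w * g ** (1.0 / N) for w, g in zip(ways, gain)) / 2**N

print(round(annualization_of_average, 3))   # 1.25
print(round(average_of_annualized, 3))      # 1.024
```

The gap comes from the few investors with huge moneypots: they dominate the average in Recipe 1, but count for much less once each moneypot is annualized.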

>I like the 25%.
Yes, you would.
>So why do you do this "Annualization" thing?
If I told you that a 20-year return was 1000%, would you say it was a good return?
>How would I know?
A 1000% return means a "Cumulative Gain Factor" of 11 which corresponds to an "Annualization" of 11^{1/20} - 1 = 0.1274 or 12.74%.
>Hey! That's good, eh?

Part 2

Okay, now we'll consider the case where the probability of getting Heads is p, which may or may not be equal to 1/2.
If p = 1/2 (a "fair" coin) then all 2^N possible Head/Tail sequences are equally probable, the probability of each being 1/2^N.

>And if p isn't 1/2, then all the stuff you did above is out the window, eh?
Not at all. For N = 10, there are still 1024 possible Head/Tail sequences and, among the 1024 sequences, m = 4 heads still occurs 210 times (just as before) and the Value of the associated moneypot (after getting m = 4 heads) is still 2^4 (1/2)^6 = $0.2500 and the average of all the 2^N possible Cumulative Gain Factors (after N coin tosses) is still:

(1) ...     (1/2^N) Σ (Number of Ways)(Value) = (1/2^N) Σ C(N,m) 2^m (1/2)^{N-m} = (1/2^N){2 + (1/2)}^N

>So where does p come in?
Aah, before, all 2^N possible sequences were equally probable (with probability 1/2^N). In fact, we weren't even considering probabilities!
We just looked at the total number of sequences of heads and tails, determined the Gain Factor for each, then divided the sum of these Factors by 2^N to get an average.

Now some of the sequences are more probable than others.
In fact, the probability of getting any one particular sequence with m heads (hence N-m tails) is now p^m (1-p)^{N-m} (which, if p = 1/2, is your old friend 1/2^N).

Most important, our formula now becomes:

the average of the probability-weighted Cumulative Gain Factors (after N coin tosses) is:
(2) ...     Σ (Number of Ways)[Probability of m heads] (Value) = Σ C(N,m) [ p^m (1-p)^{N-m} ] 2^m (1/2)^{N-m}
= Σ C(N,m) [2p]^m [(1/2)(1-p)]^{N-m}
= { 2p + (1/2)(1-p) }^N

We can achieve this AVERAGE Gain Factor with an Annual Gain Factor of 2p + (1/2)(1-p)
... and the Annualization is then   2p + (1/2)(1-p) - 1.
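
Equation (2) can be checked the same way as equation (1). The Python sketch below (function names are ours) adds up (Number of Ways) x (Probability) x (Value) for a few values of p and compares the sum with the closed form {2p + (1/2)(1-p)}^N:

```python
from math import comb

def weighted_average_gain(N, p, heads=2.0, tails=0.5):
    """Probability-weighted average Cumulative Gain Factor, summed term by term."""
    return sum(
        comb(N, m) * (p**m * (1 - p)**(N - m)) * (heads**m * tails**(N - m))
        for m in range(N + 1)
    )

def weighted_annualization(p, heads=2.0, tails=0.5):
    """Annualization of the weighted average: heads*p + tails*(1 - p) - 1."""
    return heads * p + tails * (1 - p) - 1.0

N = 10
for p in (0.5, 0.65, 0.99):
    closed_form = (2 * p + 0.5 * (1 - p)) ** N
    print(p, weighted_average_gain(N, p), closed_form, weighted_annualization(p))
```

For p = 0.5 the annualization comes out as 0.25, the 25% from Part 1.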

>What happened to the 1/2^N out in front?
That was before.
Now it's replaced by Probability of m heads, namely p^m (1-p)^{N-m}, which changes with m ... so it appears inside the summation.

>Yes, I notice that, for p = 1/2, we'd get p^m (1-p)^{N-m} = 1/2^N ... your old freind.
It's friend ... and it's your old friend.
But notice that, for p = 1/2, the Probability-weighted Average Cumulative Gain is ...

>That's a mouthfull. Can't we call it the P-wACG?
You mean mouthful ... and the Probability-weighted Average Cumulative Gain is { 2(1/2) + (1/2)(1-(1/2)) }^N = 1.25^N
and that's your old Annualization of 25%.

>That's your old Annualized Return, not mine. So, where's the pictures?
It's not an annualized return. It's the "annualization" of an average return.

If you tossed 5 heads and a tail you'd end up with $2^5 (1/2) = $16, for each $1.00 invested. For one toss per year, that's a 6-year Gain Factor of 16 and an Annualized Return of 16^{1/6} - 1 = 0.587 or 58.7%.

>I'll take it!
Pay attention! The point is, this is a real, live "Annualized" return and it ain't 25%. That 25% was the "Annualization of the Average Return" ... the annual return that would achieve the AVERAGE return, after N years.

If we have a time sequence of returns then we can calculate a real, live Annualized Return.
If we have a bunch of N-year returns we can compute an AVERAGE
... and invent an "Annual Return" which would generate that AVERAGE.

>And that's your "annualization" thingy, eh?
Now you got it.

To generalize, we change the Gain Factor for Heads, namely 2, to (1+H) and the Gain Factor for Tails, namely 1/2, to (1+T), and the Annualization formula
2p + (1/2)(1-p) - 1 becomes (1+H) p + (1+T)(1-p) - 1, which is just: H p + T (1-p)

Here are a few pictures, in Figure 3 ... and a formula to play with:
Annualization = H p + T (1-p)
  p = Probability of Heads
  H = Return for Heads, expressed as a percentage
  T = Return for Tails, expressed as a percentage
The Annualization is the Annual Return that would achieve the same result as the AVERAGE Return.
Try p = 50%, H = 100% and T = -100% (you should get 0%).
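
If you'd rather compute than click, here's a tiny Python stand-in for the formula H p + T (1-p). The function name and the choice to take everything as percentages are ours:

```python
def annualization_percent(p_pct, heads_pct, tails_pct):
    """H p + T (1-p), with every input and the result expressed as a percentage."""
    p = p_pct / 100.0
    return heads_pct * p + tails_pct * (1.0 - p)

print(annualization_percent(50, 100, -100))   # 0.0
print(annualization_percent(50, 100, -50))    # 25.0  (the double-or-halve coin)
```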

Figure 3

>What would such an investment look like, if, say, p = 60% and you consider real, live annualized returns instead of that annualization stuff?

Here's a sample portfolio, starting with $1.00, and going for N = 40 years,
where we get a 20% return for Heads and lose 5% for Tails. We select a Heads/Tails sequence at random, with Heads showing up 60% of the time (remembering that it's just one of 2^{40} possible sequences of Heads/Tails). Since we're now considering a time series of returns, we get a real, live annualized return of ...

>Yes, it's on the chart. I can read!

... but you can't spell.

Figure 4
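
You can generate your own sample portfolio. The following Python sketch is only an illustration: the function name, the seed and the use of Python's `random` module are our assumptions, and a different seed gives a different sequence (hence a different annualized return). It uses the same numbers as above: N = 40, p = 60%, +20% for Heads, -5% for Tails.

```python
import random

def simulate_portfolio(N=40, p=0.60, heads=0.20, tails=-0.05, seed=None):
    """Year-by-year value of $1.00 for one random Heads/Tails sequence."""
    rng = random.Random(seed)
    value, path = 1.0, [1.0]
    for _ in range(N):
        # Heads with probability p, otherwise Tails
        value *= 1.0 + (heads if rng.random() < p else tails)
        path.append(value)
    return path

path = simulate_portfolio(seed=1)      # seed chosen arbitrarily, for repeatability
N = len(path) - 1
annualized_return = path[-1] ** (1.0 / N) - 1.0

print(f"final value ${path[-1]:.2f}, annualized return {annualized_return:.2%}")
print(f"annualization of the average: {0.20 * 0.60 + (-0.05) * 0.40:.2%}")
```

The first line is a real, live annualized return for one sequence. The second is H p + T (1-p), which doesn't depend on the sequence at all.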
You might also want to look at Figure 5.
It shows (for N = 40 coin tosses), the Number of Ways each of the 41 different Cumulative Gains occur, in the sum:

Σ C(N,m) [2p]^m [(1/2)(1-p)]^{N-m}

as well as the terms themselves (that's the C(N,m) [2p]^m [(1/2)(1-p)]^{N-m}). Notice how the terms shift right when the probability of Heads increases. Although the Number of Ways that m heads can occur (in the 2^{40} possible Heads/Tails sequences) doesn't change (when p changes), the probability of getting a particular sequence DOES change ... hence the average Cumulative Gain (and Annualized Return) will change.

Figure 5
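
To see the shift without the picture, here's a Python sketch that computes the 41 terms for N = 40 and reports where the largest one sits. We're assuming each term includes the Number of Ways factor, as in equation (2); the function names are ours.

```python
from math import comb

def terms(N, p):
    """Terms (Number of Ways) * [2p]^m * [(1/2)(1-p)]^(N-m), for m = 0..N."""
    return [comb(N, m) * (2 * p)**m * (0.5 * (1 - p))**(N - m) for m in range(N + 1)]

def peak_index(N, p):
    """Number of Heads m at which the term is largest."""
    t = terms(N, p)
    return max(range(N + 1), key=lambda m: t[m])

N = 40
for p in (0.50, 0.65):
    print(f"p = {p:.2f}: largest term at m = {peak_index(N, p)}, "
          f"sum of terms = {sum(terms(N, p)):.4g}")
```

The sum of the terms is the probability-weighted average Cumulative Gain, so it should match {2p + (1/2)(1-p)}^40 for each p.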

>Isn't that obvious?
Uh ... yeah, I guess so.

But just remember: It's not too useful to calculate the average gain for all possible sequences of heads and tails (as we did in Part 1) since that's independent of whether Heads occurs 50% of the time ... or 95% of the time. In particular, if we're using this coin toss scenario to mimic real world investments, we must assume different probabilities for Heads and Tails. Further, recall that the probability-weighted result is a Binomial Distribution and, for large N, it looks much like a Normal distribution. Also ...


Figure 6

See also: Kelly Ratio

Thanks to Ron for suggesting Buffon & Kelly

There's this neat problem called Buffon's Needle.
You drop a needle on a lined sheet of paper and determine the probability that the needle crosses one of the lines on the page.

>You're going to calculate the probability?
No. It's 2/π ... about 64% (for a needle exactly as long as the spacing between the lines). However, suppose you lose $1.00 every time the needle crosses a line and win $2.00 when it doesn't.
What are the chances of winning over the long haul ... say 50 tosses of the needle?

>Winning what?
I mean ending up with more money than you started.

The probability of winning is only 1- 2/π (about 36%) but you win twice as much. The expected win (per toss) is
(probability of winning)*(win amount) +(probability of losing)*(loss amount) = (1- 2/π)($2.00) + (2/π)(-$1.00) = $0.09 or 9 cents per toss.
But you can lose 50 times (that's a loss of $50) or win 50 times (that's $100 in winnings) and there's a bunch of win/loss sequences in between (as shown in Figure 7a), but there's a probability associated with each sequence ... and some hardly ever occur !!

>Umpteen possible sequences?
Yes. There are 2^{50} possible sequences of wins and losses and, for m wins (and N-m losses), we'd make
$2.00 (m times) and lose $1.00 (N-m times) for a total win of 2m - (N-m) = 3m - N = 3m - 50 dollars.

>So, if we toss the needle 50 times and avoid those lines m = 0 times, we'd get -50 dollars and ...
And for avoiding the lines all m = 50 times we'd get 3(50)-50 = 100 dollars.
But they're not equally probable. Indeed, although you'd get $(3m-50) for m wins, the probability of getting
any one particular sequence with m wins in 50 tosses is (1 - 2/π)^m (2/π)^{50-m}, but there are C(50,m) = 50!/(m! (50-m)!) sequences (of the 2^{50}) that give m wins, and the total probability of getting m = 0 or 1 or 2 ... or 50 wins is the sum Σ C(N,m) (1 - 2/π)^m (2/π)^{N-m} and ...

>Getting m = 0 or 1 or 2 ... shouldn't that be 100%?
Yes. Σ C(N,m) (1 - 2/π)^m (2/π)^{N-m} = [(1 - 2/π) + (2/π)]^N = 1^N = 1, for any N ... using the magic formula [M], above.

But, we'd lose money whenever 3m-50 < 0 or m < 50/3, that is, whenever m is 16 or less.
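
The needle game can be worked out numerically. The Python sketch below (names are ours) computes the expected win per toss, checks that the probabilities for m = 0, 1, ..., 50 wins add up to 1, and adds up the probabilities of all the outcomes where 3m - 50 < 0:

```python
from math import comb, pi

P_WIN = 1 - 2 / pi      # probability the needle misses the lines (a win)
N = 50                  # number of tosses

expected_win_per_toss = P_WIN * 2.00 + (1 - P_WIN) * (-1.00)

def prob_wins(m, n=N, q=P_WIN):
    """Probability of exactly m wins in n tosses: (Number of Ways) x q^m (1-q)^(n-m)."""
    return comb(n, m) * q**m * (1 - q)**(n - m)

# Net winnings are 3m - 50 dollars, so we lose money when 3m - 50 < 0.
prob_losing_money = sum(prob_wins(m) for m in range(N + 1) if 3 * m - N < 0)

print(round(expected_win_per_toss, 2))            # 0.09
print(sum(prob_wins(m) for m in range(N + 1)))    # total probability
print(prob_losing_money)
```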

Figure 7a

Figure 7b

>Where does Kelly come in?

Probability of a Win:   p   (here p = 1 - 2/π = 0.3634)
Average amount of winnings (when you win):   W   (here W = $2.00)
Average amount of losses (when you lose):   L   (here L = $1.00)
the Kelly Ratio = p - (1-p)/{ W/L } = 0.3634 - 0.6366/2 = 0.045 or 4.5%

Kelly% = the percentage of your capital to be put into a single trade
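
Here's the Kelly Ratio formula as a small Python function (the name `kelly_ratio` is ours), applied to the needle game where you win $2.00 with probability 1 - 2/π and lose $1.00 otherwise:

```python
from math import pi

def kelly_ratio(p, W, L):
    """Kelly Ratio = p - (1-p)/(W/L), as a fraction of your capital."""
    return p - (1.0 - p) / (W / L)

p = 1 - 2 / pi                                   # about 0.3634
print(f"{kelly_ratio(p, W=2.00, L=1.00):.1%}")   # 4.5%
```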

>So I bet 4.5% of my bankroll on each toss?
According to Kelly. That 4.5% comes from:

  • Your expected win per toss is $0.09 (as we saw above).
  • If you win, you'd win $2.00.
  • $0.09 is 4.5% of $2.00