the Median Return  
motivated by a comment by Loren C.

While I was writing Ito Calculus I learned something I didn't know and ...
>Is that unusual?
No. I should have known it, but these aging grey cells ...
>Just tell us, okay?
Suppose the returns for some stock (daily, weekly, monthly, whatever) ... suppose they have a lognormal distribution.

  • If r is a return, then log(1+r) has a normal distribution. That's what "lognormal" means.
  • Hence 1+r = e^x or r = e^x - 1, where x is normally distributed.
  • So, if M[x] is the Mean of the x's, then half the x's lie below M[x]. For a normal distribution the Mean is also the Median.
  • Then half the returns are less than e^(M[x]) - 1.

Okay, now we find out what M[x] is.
M[x] is the Mean of the logarithms of 1+r so we consider a jillion rs and take the Mean, like so:

                (1/N)[log(1+r1) + log(1+r2) + ... +log(1+rN) ]
                = (1/N)log[(1+r1)(1+r2)...(1+rN) ]   since log(A) + log(B) + log(C) + ... = log(A*B*C...)
                = log[(1+r1)(1+r2)...(1+rN) ]^(1/N)   since k log(A) = log(A^k)

>N is a jillion?
Pay attention.
G(N) = (1+r1)(1+r2)...(1+rN) is the total Gain Factor over N time periods.
That means that $1.00 will grow to $G(N) after N time periods.

>Time periods? Like months or years or ...
Yes, yes ... let's say years, okay?

Then G(N)^(1/N) = [(1+r1)(1+r2)...(1+rN) ]^(1/N) is the Gain Factor over one year ... the Annualized Gain Factor.

This Annualized Gain Factor is 1 + Annualized Return.
(The Annualized Return is also called the Compound Annual Growth Rate or CAGR).

We then get M[x], the Mean of logs of 1+r, as M[x] = log{1+ CAGR}.
Let's write things down:
                M[x] = log{1+CAGR}
                Half of the annual returns are less than e^(M[x]) - 1
                Half of the annual returns are less than CAGR.

Unless the annualized return is very large ...
>Are you kidding? My returns are more like ...
Pay attention!
Unless the annualized return is very large then this should be a good approximation even if the returns aren't exactly lognormally distributed.
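
Here is a minimal sketch that checks M[x] = log{1+CAGR} numerically. The six annual returns are made up for illustration, and the names `returns`, `mean_log`, `gain` and `cagr` are not from the original.

```python
import math

# Hypothetical annual returns, made up for illustration only.
returns = [0.12, -0.05, 0.20, 0.03, -0.10, 0.15]
n = len(returns)

# M[x]: the Mean of log(1+r).
mean_log = sum(math.log(1 + r) for r in returns) / n

# G(N): the total Gain Factor, and the CAGR from its N-th root.
gain = math.prod(1 + r for r in returns)
cagr = gain ** (1 / n) - 1

# The two sides of M[x] = log(1 + CAGR) agree up to rounding error.
print(mean_log, math.log(1 + cagr))
```

The identity holds for any set of returns greater than -100%. Only the "half the returns are less than CAGR" step needs the lognormal assumption.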

Now, if the returns are lognormally distributed, then (very nearly):
                CAGR = M[r] - (1/2) V[r]     where M[r] and V[r] are the Mean and Variance of the annual returns.

The Variance V[r] is (Standard Deviation)^2.

>So you conclude ... what?
If M[r] and V[r] are the Mean and Variance of returns, then
half the returns are less than M[r] - (1/2) V[r]     approximately

>And if the returns are weekly or monthly or ...
Then use them in the formula.

>And that formula is exact?
For real live returns? No. I said approximately didn't I?
There are assumptions made about the distribution of returns that may or may not be valid.

We might also note that, if the returns r are lognormally distributed, there's a theoretical relation between the Means and Variances which goes like so:

  1. Suppose returns r have a Mean M[r] and Variance V[r].
  2. If the returns r are lognormally distributed, then 1 + r = ex where the x's are normally distributed.
  3. Suppose the Mean and Variance for the x's are M[x] and V[x].
  4. Because of the lognormal association, the relation between M[r], V[r], M[x] and V[x] is
          V[x] = log[1 + V[r]/{1+M[r]}^2]
          M[x] = log[1+M[r]] - V[x]/2
  5. Hence e^(M[x]) = (1+M[r]) e^(-V[x]/2) = (1+M[r]) [1 + V[r]/{1+M[r]}^2]^(-1/2)
  6. The Median return is then r = e^(M[x]) - 1 = (1+M[r]) [1 + V[r]/{1+M[r]}^2]^(-1/2) - 1
  7. For small M[r] and V[r], r is approximately
    r = (1+M[r]) [1 - (1/2)V[r]] - 1 which is approximately r = M[r] - V[r]/2.
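
The formulas in items 6 and 7 can be compared side by side with a short sketch. The function names `median_exact` and `median_easy` and the sample inputs (Mean 8%, Standard Deviation 15%) are assumptions chosen for illustration.

```python
import math

def median_exact(mean_r, var_r):
    """Median return from item 6: (1+M)[1 + V/(1+M)^2]^(-1/2) - 1."""
    return (1 + mean_r) * (1 + var_r / (1 + mean_r) ** 2) ** -0.5 - 1

def median_easy(mean_r, var_r):
    """Small-M, small-V approximation from item 7: M - V/2."""
    return mean_r - var_r / 2

# Assumed inputs: Mean 8%, Standard Deviation 15% (so Variance = 0.15^2).
m, v = 0.08, 0.15 ** 2
print(median_exact(m, v), median_easy(m, v))
```

For inputs of this size the two values differ by roughly a tenth of a percentage point. The gap grows as M[r] and V[r] get larger.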

>How good is it ... I mean how good are they, the easy guy and the weird guy?
You mean M[r] - V[r]/2 or the more complicated (1+M[r]) [1 + V[r]/{1+M[r]}^2]^(-1/2) - 1?

Okay, let's try it out. Here's what we'll do:

  1. We'll collect a bunch of returns and calculate their Mean and Variance.
  2. We'll count how many are less than Mean - (1/2)Variance.
  3. We'll repeat for a bunch of stocks.

Stock        Time Period                 Percentage less than Mean - (1/2)Variance
GE           weekly: Jan/95-Jan/03       49.4%
GE           daily: Jan/95-Jan/03        50.1%
Microsoft    monthly: Jan/90-Jan/03      50.6%
Exxon        daily: Jan/98-Jan/03        50.3%
GM           monthly: Jan/50-Jan/03      51.2%
Coca Cola    monthly: Jan/50-Jan/03      49.4%
S&P500       monthly: Jan/50-Jan/03      47.8%
S&P500       yearly: 1928-2000           52.0%
DOW          monthly: Jan/50-Jan/03      47.0%
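
The counting procedure in steps 1 and 2 might look like the sketch below. It runs on synthetic lognormal returns, not on the stock data in the table. The function name `fraction_below_estimate` and the simulation parameters are assumptions made for illustration.

```python
import math
import random
import statistics

def fraction_below_estimate(returns):
    """Fraction of returns below Mean - (1/2)Variance (steps 1 and 2)."""
    mean_r = statistics.fmean(returns)
    var_r = statistics.pvariance(returns)
    cutoff = mean_r - var_r / 2
    return sum(r < cutoff for r in returns) / len(returns)

# Synthetic data, NOT real stock returns: 1 + r = e^x with x normal.
# The parameters (mean 0.01, SD 0.05, 20000 samples) are arbitrary.
rng = random.Random(0)
simulated = [math.exp(rng.gauss(0.01, 0.05)) - 1 for _ in range(20000)]
print(fraction_below_estimate(simulated))
```

To repeat the experiment on real data, pass a list of actual period returns (as decimals, so 2% is 0.02) to the same function.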

>Which formula did you use, the easy one or the weird one?
They're the same, to one decimal place.
>They do better for stocks than for an index, eh?
Looks like it.
>Do you have any pictures?
I'm working on it, but in the meantime you can play with a spreadsheet which arose from the Ito stuff. It gives the Median, not of stock returns, but of the distribution of stock prices umpteen years into the future ... given the Mean and Standard Deviation of the returns. Just click here.

>A picture is worth a thousand ...

Sample (possible) future scenarios and the Median portfolio

Recently I got to thinking about ...
>Thinking? Forgive me, but I think you'd better stop ...
Pay attention!
In that Ito stuff we found the distribution of stock prices (or buy-and-hold portfolios)
assuming the Annual Return and Standard Deviation were r and s respectively, namely

Figure 0
In fact, if we accept the above prescription for the distribution of stock prices, then:

  1. If starting value is P(0) and, T years later, the value is P(T), then
  2. the logarithm of the Gain Factor, namely log(P(T)/P(0)) is Normally distributed
  3. The Mean of this T-year distribution for log(P(T)/P(0)) is:   (r - s^2/2)T
  4. The Standard Deviation of this T-year distribution for log(P(T)/P(0)) is:   s SQRT(T)
  5. The "annualized" Gain Factor is therefore (P(T)/P(0))^(1/T), and the Compound Annual Growth Rate is:   CAGR = (P(T)/P(0))^(1/T) - 1
  6. If we consider the gain distribution over T = 1, 2, 3 etc. years (as given above), we can generate the "annualized" gain.
  7. That means ...

>A picture is worth a thousand words.
Okay, here's a picture if we assume an Annual Return of 8% and SD of 15%.
It shows distribution of the T-year gains, but "annualized".

>I don't understand ...
Okay, look at the green curve.
We calculate the probability that, in 5 years, our portfolio will have an annualized gain less than, say 17%. That's the yellow dot on the green curve. It says the probability is (about) 91%.

>Let me do one. There's a 61% probability that, in 2 years the annualized return is less than 10%.
Yes. The black dot.
>And the probability that my annualized gain is negative is ... all the other dots.
Yes, those located at r = 0%.
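
The dots can be approximated with a few lines of code. This is a sketch under two assumptions: the curves are the cumulative distribution of the annualized log Gain Factor, and that quantity is (1/T) log(P(T)/P(0)). Dividing numbers 3 and 4 above by T then gives a Mean of r - s^2/2 and a Standard Deviation of s / SQRT(T). The function name `prob_annualized_below` is made up for illustration.

```python
import math
from statistics import NormalDist

def prob_annualized_below(target, years, r=0.08, s=0.15):
    """P(annualized gain over `years` is less than `target`)."""
    # The annualized log Gain Factor, (1/T) log(P(T)/P(0)), is Normal with
    # Mean r - s^2/2 and Standard Deviation s / SQRT(T).
    dist = NormalDist(mu=r - s ** 2 / 2, sigma=s / math.sqrt(years))
    return dist.cdf(math.log(1 + target))

# The yellow dot: 5 years, annualized gain below 17%.
print(prob_annualized_below(0.17, 5))
```

Values read off the chart are approximate, so expect the computed probabilities to differ from them by a percentage point or two.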

Figure 1
But don't you see something interesting?
All the curves pass through one point, at probability = 50%.
So 50% of the annualized returns are less than some magic return, regardless of the number of years.
>And that magic return is ... what?
I'll give you a hint. In Excel, it'd be: NORMINV(0.50,r-s^2/2,s)
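
For those without Excel, here is a rough Python counterpart of that hint, using `statistics.NormalDist.inv_cdf` in place of NORMINV. The loop over T and the variable names are assumptions for illustration. The third argument is taken as s / SQRT(T) here, the annualized Standard Deviation, but it has no effect on the 50% point.

```python
import math
from statistics import NormalDist

r, s = 0.08, 0.15  # Annual Return and SD assumed for Figure 1

for years in (2, 5, 10, 15):
    # Python counterpart of NORMINV(0.50, r - s^2/2, s / SQRT(T)).
    dist = NormalDist(mu=r - s ** 2 / 2, sigma=s / math.sqrt(years))
    median_log = dist.inv_cdf(0.50)
    # Print T, the median log gain, and the matching median annualized return.
    print(years, median_log, math.exp(median_log) - 1)
```

Every row prints the same median, which is why all the curves cross at one point.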

>I notice that, in Figure 1, the curve for 2-year is spread out.
Yes, the volatility of annualized returns is larger for T = 2 years than for 5, 10, 15 etc. and that means ...
>That means the result is less certain, eh?
For the annualized return? Yes. If you annualize 2-year returns the distribution has a higher volatility than if you annualize 5- or 10-year returns. But what you're really interested in is your return at the end of T = 2 years, or 5 or 10 years.

>And that's less certain, eh?
Less certain for T = 2 years than for 5, 10, 15 ...?
No. It's the other way around. Now it's the longer time periods that have less certainty
... as shown in Figure 2  

>Less certainty, meaning higher volatility?
Yes, because ...

>That's weird.
Weird? Look again at number 4, above. It says:
The Standard Deviation of the T-year distribution is s SQRT(T).
For smaller T you get smaller volatility. After all, if you wait just T = 1 minute, the stock price won't change much from P(0), eh? Look at Figure 4 which shows the distribution of prices after 1 and 5 years. After T = 1 year the price distribution doesn't vary much from the initial price of $1.00, but, give it time and, after 5 years ...

>I get it, but for smaller T, it's the "annualized" return that has the larger volatility? That's weird.
Look at the 5-year chart in Figure 3. It actually gives 5-year Gain Factors (since we started with $1.00). For example, if, after 5 years, our $1.00 stock is worth $2.25 then the 5-year Gain Factor is 2.25, but the "annualized" Gain Factor is obtained by taking the 5th root:   2.25^(1/5) = 1.176.

>That's 17.6% annualized, eh?
Yes. So now we can shrink the horizontal scale for the 5-year chart by taking the 5th root of all the numbers there ...

Figure 2

Figure 3

>To get annualized Gain Factors!
Yes, and for a 10-year chart we'd shrink by taking the 10th root and for a 15-year chart we'd shrink ...

>Yeah, yeah. I get it. So?
So you wouldn't believe how that shrinks the horizontal scale! It turns a wide curve, meaning large volatility, into a skinny curve, meaning small volatility.
>And that's what annualizing does? Shrinks to a skinny curve?
Yes, it shrinks the horizontal scale ... in the distribution chart.
For example, the horizontal distance between 3 and 7 is 7 - 3 = 4.
Now take the 5th root of each. Now the horizontal distance between 3^(1/5) = 1.246
and 7^(1/5) = 1.476 is 1.476 - 1.246 = 0.23
See? And notice that the Gain Factor = "1" doesn't change since 1^(1/5) is still 1.
See the blue dot?
But horizontal deviations from this point get shrunk. See the red arrow?
That's a Gain Factor of 2 getting shrunk to 2^(1/5) = 1.149 so ...
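
The shrinking is easy to reproduce. This small sketch takes the 5th root of the Gain Factors mentioned in the text. The variable names are made up for illustration.

```python
years = 5

# Gain Factors picked from the examples in the text.
for gain in (1, 2, 2.25, 3, 7):
    annualized = gain ** (1 / years)  # 5th root of the 5-year Gain Factor
    print(f"{gain:5.2f} -> {annualized:.3f}")
```

Changing `years` to 10 or 15 shows how much more the horizontal scale shrinks for longer periods.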
>Shrunk is a technical term?
>And fat curves get skinny?

Figure 4
>And if I know the Standard Deviation for T-years, what is it ... annualized?
Look at numbers 2, 3 and 4, above, namely:
  2.  The logarithm of the Gain Factor, namely log(P(T)/P(0)) is Normally distributed
  3.  The Mean of this T-year distribution for log(P(T)/P(0)) is:   (r - s^2/2)T
  4.  The Standard Deviation of this T-year distribution for log(P(T)/P(0)) is:   s SQRT(T)

Note that P(T)/P(0) is the T-year Gain Factor and [P(T)/P(0)]1/T is the annualized Gain Factor.
Take logarithms.
The logarithms are log[P(T)/P(0)] and (1/T)log[P(T)/P(0)] respectively, so we just divide by T to change from T-year to annualized.
The T-year Mean of (r - s^2/2)T becomes (r - s^2/2) when annualized (by dividing by T).
The T-year Standard Deviation of s SQRT(T) becomes s / SQRT(T) when annualized (by dividing by T).
>So when annualizing the 5-year distribution, the 5-year volatility gets divided by SQRT(5)?
No, it gets divided by 5. However, what you get is the annual Standard Deviation (that's our s) divided by SQRT(5).

>So for an annual volatility of 15% I'd get 0.15/SQRT(5) if I annualized the 5-year gains?
Yes, and 0.15/SQRT(5) = 0.067 or 6.7% ... so annualizing 5-year gains gives a skinnier, less volatile distribution than even the 1-year distribution.
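
The arithmetic for several values of T, as a small sketch (the variable names are made up for illustration):

```python
import math

s = 0.15  # annual Standard Deviation (our s)

for years in (1, 2, 5, 10, 15):
    sd_total = s * math.sqrt(years)   # SD of log(P(T)/P(0))
    sd_annualized = sd_total / years  # divide by T, giving s / SQRT(T)
    print(years, round(sd_total, 4), round(sd_annualized, 4))
```

The T-year column grows with T while the annualized column shrinks, which is the "weird" contrast discussed above.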
>So if you wait, your chances of losing are less. Right?
Yes, but the chances of losing BIG TIME are greater.
In Figure 2 it appears that the chance of losing on your investment is smaller if you wait longer.
That is, 15 years as opposed to 5 years or 2 years.
However, if we blow up the tails of the distributions (in Figure 2), we get Figure 5 which shows that, at large negative returns, the ordering changes.

Now the chance of getting a devastating, negative return over 15 years is larger than over 5 years.
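
The reversal can be illustrated with the lognormal model itself. This is a sketch, not a reconstruction of Figure 5. It assumes the 8% Annual Return and 15% SD used earlier, and it defines a "devastating" loss as ending with less than half the starting value. That threshold and the function name `prob_gain_factor_below` are choices made here for illustration.

```python
import math
from statistics import NormalDist

def prob_gain_factor_below(level, years, r=0.08, s=0.15):
    """P(P(T)/P(0) < level) when log(P(T)/P(0)) is Normal."""
    dist = NormalDist(mu=(r - s ** 2 / 2) * years, sigma=s * math.sqrt(years))
    return dist.cdf(math.log(level))

for years in (2, 5, 15):
    any_loss = prob_gain_factor_below(1.0, years)  # ending below $1.00
    big_loss = prob_gain_factor_below(0.5, years)  # losing more than half
    print(years, any_loss, big_loss)
```

With these inputs the chance of any loss falls as T grows, while the (much smaller) chance of losing more than half rises.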

>Now that IS weird, but you're talking about some mathematical model, eh?
Well, we could look at daily, weekly and monthly returns for some stock like GE, over the past few years.
Notice that the Standard Deviation increases when we consider returns over longer time intervals.

Figure 5A

>How rapidly does the SD increase as you ...?
As you increase the time interval? Roughly as the square root of the time interval.

>So the longer you wait the greater the risk, eh?
It depends upon your definition of "risk". The Standard Deviation surely has increased for GE stock.
Figure 6 shows the SD as well as a square-root curve, for comparison.

>Shouldn't the red curve start at 0%?
Actually, the first point is for 1 day ... about 3%.
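
The square-root growth can be seen in a simulation. The sketch below uses synthetic, independent daily log returns, not GE data. The 3% daily SD comes from the text, while the block lengths of 5 and 21 trading days (standing in for a week and a month) and the name `sd_over` are assumptions.

```python
import math
import random
import statistics

rng = random.Random(1)
daily_sd = 0.03  # roughly the 1-day SD quoted for GE; data below is synthetic
daily = [rng.gauss(0.0, daily_sd) for _ in range(50000)]  # daily log returns

def sd_over(days):
    """SD of log returns summed over non-overlapping blocks of `days` days."""
    blocks = [sum(daily[i:i + days]) for i in range(0, len(daily) - days + 1, days)]
    return statistics.pstdev(blocks)

for days in (1, 5, 21):
    # Compare the measured SD with daily_sd * SQRT(days).
    print(days, sd_over(days), daily_sd * math.sqrt(days))
```

Real returns are not perfectly independent from day to day, so actual data will follow the square-root curve only roughly, as the chart suggests.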

Figure 6

>Okay. You assumed a lognormal distribution. What about a NORMAL distribution of returns?
That'd be a bad idea. Look at the probability of having a negative portfolio, starting with $1.00 (after 10, 12, 14 etc. years).

Figure 7

>A negative portfolio? How is that possible?
Easy. Assume a Normal distribution ... instead of Lognormal.