Bollinger Bands revisited

A continuation of Standard Deviation ... of Prices and Returns.
Bollinger Bands: Introduction

Years ago, when I first ran across
Bollinger Bands,
I thought they were pretty neat ... stock prices bouncing between two
curves, the Upper and Lower boundaries of the "band" and ...
>Remind me. Bollinger bands?
We look at the last n stock prices P_{1}, P_{2}, ... P_{n} (where we include
P_{n}, today's price, and where P_{0} is the price n days ago) and we calculate their average,
P_{av}, and their Standard Deviation, SD:
P_{av} = (1/n)(P_{1}+ P_{2}+ ... +P_{n}) = (1/n)ΣP_{k}
SD^{2} = (1/n) [ (P_{1} - P_{av})^{2} + (P_{2} - P_{av})^{2} + ... + (P_{n} - P_{av})^{2} ] = (1/n)ΣP_{k}^{2} - P_{av}^{2}


(See SD stuff.)
Then, each day, we plot the two points:
[a] U = P_{av} + k SD
[b] L = P_{av} - k SD
These points trace out two curves and we see the current stock price bounce between
the two curves, U and L, as in Figure 1.
>So what are n and k?
You can pick anything, but we'll choose n = 20 days and k = 2 standard deviations.
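If you'd like to try [a] and [b] yourself, here's a minimal Python sketch. The function name bollinger and its arguments are made up for illustration (they're not from any spreadsheet or library), and it uses the (1/n) form of the SD given above.

```python
from math import sqrt

def bollinger(prices, n=20, k=2.0):
    """Return a list of (P_av, U, L), one entry per day that has n prices available.

    `prices` is a list of daily prices, oldest first (a hypothetical input).
    """
    bands = []
    for end in range(n, len(prices) + 1):
        window = prices[end - n:end]            # the last n prices, today's included
        p_av = sum(window) / n
        # SD^2 = (1/n) * sum(P_k^2) - P_av^2
        var = sum(p * p for p in window) / n - p_av * p_av
        sd = sqrt(max(var, 0.0))                # guard against tiny negative round-off
        bands.append((p_av, p_av + k * sd, p_av - k * sd))
    return bands
```

With n = 20 and a year of daily prices, you'd get one (P_av, U, L) triple for each day from the 20th onward.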
>So you buy at L and sell at U?
I didn't say that!
 Figure 1

>So what are you saying?
I just want to look at Bollinger Bands, again, because
although one often calculates the SD (or Volatility) of stock returns, it's strange to see the
SD of stock prices and ...
>As in Bolli bands?
Yes, as in Bollinger Bands. We did, at one time, try to find a relationship between the statistical properties of returns
and of prices,
here. What we want to do now is investigate WHY one would expect stock prices
to oscillate between U and L.
When we consider the SD of daily returns, we often assume they have a Normal distribution ... in which case it's unlikely that returns
will lie too far from the Mean return. In fact, we would expect most returns to lie within 2 Standard Deviations of the Mean return:
if they were Normally distributed, the probability that a return lies within two SDs of the Mean is about 95%. However, if we
consider prices, do we also expect them to lie (mostly) within 2 Standard Deviations of the Mean price P_{av}?
>That'd be like choosing k = 2, eh?
Exactly! When today's price is larger than U or smaller than L, then it's outside that 2 SD band
centred on P_{av} ... so we might expect tomorrow's price to return to the band.
That says something about tomorrow's price, eh?
>And the last n = 20 prices are Normally distributed?
What do you think?
>Huh? You're asking me?
That was a rhetorical question.
>Can I just go to the final result ... huh?
Sure. Just click here
The Distribution of Stock Prices

Suppose that, over the last n days, the daily Gain Factors are g_{1}, g_{2}, g_{3}, ... g_{n}.
>Gain Factors?
Yes, if a stock price goes from $P to $Pg in a day, then g is the Gain Factor for that day.
For example, g = 1.056 corresponds to a 5.6% daily return.
Then n successive daily stock prices (after the starting price of $P_{0})
are P_{0}g_{1}, P_{0}g_{1}g_{2},
P_{0}g_{1}g_{2}g_{3}, ...
P_{0}g_{1}g_{2}g_{3}...g_{n}
... the last being today's stock price.
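In code, going from Gain Factors to prices (and back) is just a running product. Here's a small Python sketch; the helper names prices_from_gains and gains_from_prices are invented for this example.

```python
from itertools import accumulate
from operator import mul

def prices_from_gains(p0, gains):
    """Return [P_1, ..., P_n] where P_k = p0 * g_1 * g_2 * ... * g_k."""
    return [p0 * G for G in accumulate(gains, mul)]

def gains_from_prices(prices):
    """Daily Gain Factors g_k = P_k / P_(k-1) from a list [P_0, P_1, ..., P_n]."""
    return [later / earlier for earlier, later in zip(prices, prices[1:])]
```

The running products produced by accumulate are the numbers G_{1}, G_{2}, ... G_{n}; the last one is the n-day Gain Factor.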
So here's the question:
What's the distribution of the numbers G_{n} = g_{1}g_{2}g_{3}...g_{n} ??

We have the following:
Results
 1. The price n days ago is given as P_{0}.
 2. The daily Gain Factors, g, have Mean[g] = M and Variance[g] = Var = S^{2}.
    These are determined from historical data, and the g's are assumed to be independent!
 3. The n-day Gain Factor G_{n} = P_{n} / P_{0} = g_{1}g_{2}g_{3}...g_{n} has Mean M_{n} and Variance S_{n}^{2}, where:
    M_{n} = Mean[G_{n}] = Mean[g_{1}]Mean[g_{2}]...Mean[g_{n}] = M^{n}   (for independent g's, the Mean of a Product = the Product of the Means)
    S_{n}^{2} = Variance[G_{n}] = (M^{2}+S^{2})^{n} - M^{2n}
See this
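If you don't trust the formulas for the Mean and Variance of the n-day Gain Factor, you can check them on a toy example. The sketch below (function names are my own invention) compares the formulas with a brute-force enumeration, assuming each daily g is equally likely to be any value in a short list and that the days are independent.

```python
from itertools import product
from math import prod

def n_day_moments(M, S, n):
    """Mean and Variance of G_n = g_1*...*g_n for independent g's with Mean M and SD S."""
    mean_n = M ** n
    var_n = (M * M + S * S) ** n - M ** (2 * n)
    return mean_n, var_n

def enumerate_moments(values, n):
    """Exact Mean and Variance of the product of n independent draws,
    each equally likely to be any entry of `values` (a toy distribution)."""
    products = [prod(combo) for combo in product(values, repeat=n)]
    mean = sum(products) / len(products)
    var = sum(p * p for p in products) / len(products) - mean * mean
    return mean, var
```

For example, if g is 0.9 or 1.2 with equal probability, then M = 1.05 and S = 0.15, and the two functions should agree (to round-off) for any n.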

Okay, now we'll assume that the g's are Lognormally distributed.
>Lognormal? I thought you wanted Normal?
Well, it's common practice to consider daily gains (or, in our case, Gain Factors) to be Lognormally distributed.
Besides, it makes the math easier.
In any case, if g has a Lognormal distribution, then g = e^{y} where y = log(g) has a Normal distribution.
That's the definition of Lognormal!
Further, if we let y_{k} = log(g_{k}), then
log(G_{n}) = log(g_{1}g_{2}g_{3}...g_{n}) =
log(g_{1}) + log(g_{2}) + ... + log(g_{n}) = y_{1} + y_{2} + ... +y_{n}.
Since the y's are independent Normally distributed numbers, their sum is also Normally distributed.
That makes log(G_{n}) Normally distributed hence G_{n} itself is Lognormally distributed.
Remember what it means to say that F(x) is the cumulative distribution for a variable Y:
It means that the probability that a randomly chosen Y is less than some x is F(x) (as in Figure 2).
So, for any x and n random g's, what is the probability that G_{n} =
g_{1}g_{2}g_{3}...g_{n} < x ?
That requires that log(G_{n}) < log(x).
 Figure 2

But, as we've said, G_{n} is Lognormally distributed, so Y = log(G_{n}) is Normally distributed.
Suppose we call N[u,Mean,SD] the cumulative Normal
distribution function with prescribed mean and Standard Deviation.
Then log(G_{n}) has a cumulative distribution described by N[u,Mean,SD]
where Mean and SD are the Mean and Standard Deviation of log(G_{n}).
>We know the mean and SD of G_{n}. That's results #3 ... but what about log(G_{n})?
Good question. In fact, for Lognormally distributed G_{n} there's a relation between the Means and Standard Deviations ... like so:
If M_{n} and S_{n} are the Mean and Standard Deviation of G_{n}, then the Mean and Standard Deviation of the logarithm, M_{log} and S_{log}, are given by:
M_{log} = Mean[log(G_{n})] = log(M_{n}) - (1/2)S_{log}^{2}
S_{log}^{2} = Variance[log(G_{n})] = log(1 + S_{n}^{2} / M_{n}^{2})
... assuming that G_{n} is Lognormally distributed.
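Here's that conversion as a short Python sketch (log_moments is a made-up name). Note the order: the Variance of the logarithm is computed first, because the Mean of the logarithm needs it.

```python
from math import exp, log, sqrt

def log_moments(mean_G, sd_G):
    """Mean and SD of log(G) when G is Lognormal with the given Mean and SD."""
    var_log = log(1.0 + (sd_G / mean_G) ** 2)   # Variance[log(G)]
    mean_log = log(mean_G) - 0.5 * var_log      # Mean[log(G)]
    return mean_log, sqrt(var_log)
```

A sanity check is to go backwards: for a Lognormal variable, exp(mean_log + var_log/2) should give back the Mean of G.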

Since we now have labels for the Mean and SD of log(G_{n}), namely M_{log} and S_{log}, we can write the cumulative distribution for log(G_{n}) as:
N[u, M_{log}, S_{log}]
Probability that Today's Price lies within some interval centred on the n-day Mean: P_{av}

Notice that, if P_{0} = $1.00, then the numbers G_{1}, G_{2}, G_{3}, etc.,
are just the subsequent stock prices. We'll assume that's the case.
>Huh? What's the case?
That P_{0} = $1.00 so the products G_{k} = g_{1}g_{2}...g_{k} are the stock prices. We'll stick P_{0} in our formulas ... later.
Okay, we have that today's price P_{n} = g_{1}g_{2}...g_{n}
is Lognormally distributed with a known Mean and Variance as given in Results #5.
Now we ask:
If the random variable G has a Lognormal distribution with given Mean and Variance,
what is the probability that G < x ... for a given number x ??

>You're asking me?
That was a rhetorical question. Now pay attention. We've been here before.
 If G < x then log(G) < log(x)
 Since G is Lognormal then log(G) is Normal
 The distribution of log(G) is then described by
N[u,Mean,SD]
the Normal cumulative distribution function
and Mean and Standard Deviation are the Mean and Standard Deviation of log(G) ... not of G itself!
 The probability that log(G) < log(x) is then N[log(x),Mean,SD]
 But log(G) < log(x) is the same condition as G < x so the probability is the same:
N[log(x),Mean,SD]
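The steps above can be sketched in a few lines of Python. The names normal_cdf and prob_G_below are hypothetical; the Normal cumulative distribution is written in terms of the error function erf from Python's math module.

```python
from math import erf, log, sqrt

def normal_cdf(u, mean, sd):
    """N[u, Mean, SD]: probability that a Normal variable is less than u (sd > 0)."""
    return 0.5 * (1.0 + erf((u - mean) / (sd * sqrt(2.0))))

def prob_G_below(x, mean_log, sd_log):
    """Probability that a Lognormal G is less than x (x > 0).

    mean_log and sd_log are the Mean and SD of log(G) ... not of G itself!
    """
    return normal_cdf(log(x), mean_log, sd_log)
```

As a check on normal_cdf, the probability of lying within 2 SDs of the Mean, normal_cdf(2, 0, 1) - normal_cdf(-2, 0, 1), comes out at roughly 0.954.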
>What about our stock prices?
Yes, of course. I'm sure you've recognized our G. It's today's stock price G_{n} ... assuming the starting price was P_{0} = $1.00, n days ago.
In fact, we know the Mean and SD to use in this formula: M_{log} and S_{log}, the Mean and SD of log(G_{n}).
In other words:
The probability that G < x is
N[log(x), M_{log}, S_{log}]
>So the chances of being in that Bollinger band is ... what?
If A is the probability of being less than U and B is the probability of being less than L, then ...
>It's B - A, eh?
Actually, it's A - B as in:
N[log(U), M_{log}, S_{log}] - N[log(L), M_{log}, S_{log}]
Note:
 Remember that we're talking about the probability that the n-day Gain Factor lies between two numbers.
 Don't confuse Gain Factors with daily returns.
 In fact, a Gain Factor is 1 + (daily return).
 The Mean of the Gain Factors, that's M, is "1" greater than the mean of the daily returns.
 If the Mean of the daily returns is 0.0123 (that's 1.23%), then M = 1.0123.
>When are you going to insert some other starting price ... P_{0}?
Other than $1.00? Right now.
The numbers U and L given in [a] and [b] assume an arbitrary P_{0} value.
To generate the appropriate numbers for the case P_{0} = $1.00, we'd divide each of U and L by P_{0}.
Assume we've divided U and L by P_{0}. We'll call these U' = U/P_{0}
and L' = L/P_{0}, okay?
Now we're talking about the case where P_{0} = $1.00 (as we did above).
As we've seen above, the probability that G = P_{n}/P_{0} < U' is N[log(U'), M_{log}, S_{log}], where M_{log} and S_{log} are the Mean and SD of log(G_{n}).
But U' = U/P_{0} so P_{n}/P_{0} < U' is the same condition as P_{n} < U.
If we then want the probability that P_{n} lies within the Bollinger Band for arbitrary starting Price, then ...
>Why don't you just give the result, okay?
Here's our final result:
Assuming n random daily Gain Factors which are Lognormally distributed with Mean = M
and Standard Deviation = S, then the probability that the price,
P_{n}, will lie between L and U is given by:
Prob[L < P_{n} < U] = N[log(U/P_{0}), M_{log}, S_{log}] - N[log(L/P_{0}), M_{log}, S_{log}]
where P_{n} is the stock price at the end of the n day period
P_{0} is the stock price at the start of the n day period
N[x, Mean, SD] is the Normal cumulative distribution function
M_{n} = M^{n}   (the Mean of the n-day Gain Factor)
S_{n}^{2} = (M^{2}+S^{2})^{n} - M^{2n}   (the Variance of the n-day Gain Factor)
M_{log} = log(M_{n}) - (1/2)S_{log}^{2}   (the Mean of log(G_{n}))
S_{log}^{2} = log(1 + S_{n}^{2} / M_{n}^{2})   (the Variance of log(G_{n}))
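For those who prefer code to formulas, here's the whole chain as one Python function. It's only a sketch: band_probability and the variable names are my own labels, the Normal cumulative distribution is built from math.erf, and it assumes S > 0 and 0 < L < U.

```python
from math import erf, log, sqrt

def normal_cdf(u, mean, sd):
    """N[u, Mean, SD], the Normal cumulative distribution function (sd > 0)."""
    return 0.5 * (1.0 + erf((u - mean) / (sd * sqrt(2.0))))

def band_probability(p0, lower, upper, M, S, n):
    """Prob[lower < P_n < upper], given the daily Gain Factor Mean M and SD S."""
    # Mean and Variance of the n-day Gain Factor G_n
    mean_n = M ** n
    var_n = (M * M + S * S) ** n - M ** (2 * n)
    # Mean and SD of log(G_n), assuming G_n is Lognormal
    var_log = log(1.0 + var_n / (mean_n * mean_n))
    mean_log = log(mean_n) - 0.5 * var_log
    sd_log = sqrt(var_log)
    return (normal_cdf(log(upper / p0), mean_log, sd_log)
            - normal_cdf(log(lower / p0), mean_log, sd_log))
```

A call such as band_probability(30.0, 31.0, 33.0, 1.0005, 0.02, 20) asks: starting at $30, what's the chance of sitting between $31 and $33 after 20 days? The inputs there are invented numbers, not GE data.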
 

>There are a lot of coloured numbers!
Sure.
Remember: the probability (in the Magic Formula) is really the probability
that the Gain Factor (over n days) lies between L/P_{0} and U/P_{0}.
That's the same as:
Prob[L/P_{0} < P_{n}/P_{0} < U/P_{0}].
If, for example, both L and U are less than the starting Price P_{0}, then you're asking for
the probability that, after n random gains, the price P_{n} has dropped to some range of lower values.
Similarly, if both L and U are greater than P_{0} and ...
>Do you have an example?
Okay, we'll consider GE over the past n = 20 days and ...
>Doesn't that Magic Formula work for the next n days?
Okay let's start with today's GE price (that's our P_{0}).
We'll look 20 days into the future to get a probability
that the stock price will lie between L and U.
But we have to decide what historical data we use to calculate that
Mean and Standard Deviation ... like maybe the past 150 days or maybe ...
>Don't you have a spreadsheet?
Yeah, it gives a picture ... like Figure 3.
>That 27% probability ... do you really believe that?
Of course! Don't I offer a money-back guarantee on my spreadsheets?
 Figure 3

To download the ZIP'd spreadsheet, RIGHT-click on Figure 3 and Save Target ...
>Is it any good ... that probability?
Okay, here's what we'll do:
 We'll use a Mean and Standard Deviation based upon the past D days (example: D = 150 days).
 We'll start a year ago (that's December, 2002) and look at the price of GE stock at that time: That's our P_{0}.
 We then look ahead 20 days and look at the stock price. That's P_{n} for n = 20.
 Then we see if the stock price P_{n} is $1.00 to $3.00 higher than P_{0}. That is: L = P_{0} + 1 and U = P_{0} + 3.
 We repeat this for every day from Dec/02 to Dec/03.
 Then we see how many times the stock price did lie in the prescribed range L < P_{n} < U, 20 days later.
 Was it the percent suggested by our Magic Formula?
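The procedure above could be sketched in Python along these lines. This is not the spreadsheet's code: the function names are invented, the Mean and SD of the daily Gain Factors use the (1/D) form, and averaging the formula's probability over all the starting days is my own assumption about how to boil the predictions down to one number.

```python
from math import erf, log, sqrt

def normal_cdf(u, mean, sd):
    return 0.5 * (1.0 + erf((u - mean) / (sd * sqrt(2.0))))

def band_probability(p0, lower, upper, M, S, n):
    """Probability that lower < P_n < upper (assumes S > 0 and lower > 0)."""
    mean_n = M ** n
    var_n = (M * M + S * S) ** n - M ** (2 * n)
    var_log = log(1.0 + var_n / (mean_n * mean_n))
    mean_log = log(mean_n) - 0.5 * var_log
    sd_log = sqrt(var_log)
    return (normal_cdf(log(upper / p0), mean_log, sd_log)
            - normal_cdf(log(lower / p0), mean_log, sd_log))

def backtest(prices, D=150, n=20, lo=1.0, hi=3.0):
    """Return (average predicted probability, actual fraction of hits).

    `prices` is a list of daily prices, oldest first. For each starting day
    the band is L = P_0 + lo and U = P_0 + hi, checked n days later.
    """
    predicted, hits, trials = 0.0, 0, 0
    for t in range(D, len(prices) - n):
        # daily Gain Factors over the D days ending at day t
        gains = [prices[i] / prices[i - 1] for i in range(t - D + 1, t + 1)]
        M = sum(gains) / D
        S = sqrt(sum((g - M) ** 2 for g in gains) / D)
        p0, pn = prices[t], prices[t + n]
        predicted += band_probability(p0, p0 + lo, p0 + hi, M, S, n)
        hits += (p0 + lo < pn < p0 + hi)     # True counts as 1
        trials += 1
    if trials == 0:
        raise ValueError("need more than D + n prices")
    return predicted / trials, hits / trials
```

Running backtest for several values of D (with n, lo and hi held fixed) gives one predicted number per D to set beside the actual hit fraction.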
>So U and L will always be $1 and $3 higher than the starting price, right?
Right, and n will stay fixed at 20 days.
>But the Magic Formula percent will depend upon your D, right?
Yes, that determines M and S ... so we'll do this for various values of D.
>But the actual percent won't change, right?
No. It depends only upon the past year's daily prices. They won't change.
>And?
Here's the result for various values of D:
Predicting n = 20 days ahead: L = P_{0} + 1 and U = P_{0} + 3
>So you're checking to see if the price, 20 days hence, is between $1 and $3 higher than the current price.
Well, we're seeing what the Magic Formula says and what the actual result was, over the past year.
Here's another result for different U and L and n:
Predicting n = 30 days ahead: L = P_{0} + 2 and U = P_{0} + 5
>That's looking 30 days into the future to see if the price has increased by between $2 and $5.
Yes.
>Some are pretty lousy, eh?
Ya win some, ya lose some ...
