Variance of a Special SUM

We assume that g_{1}, g_{2}, g_{3}, ... are independent random variables with
Mean and Variance (= StandardDeviation^{2}) given by:
M[g] = g 
VAR[g] = V = s^{2}

We want to determine the Variance of a Special Sum, namely:
[1a] SUM(n) = g_{1} + g_{1}g_{2} + g_{1}g_{2}g_{3} + ... + g_{1}g_{2}g_{3}...g_{n}
>What's so special about ...?
Pay attention. We'll assume that the gs are the daily Gain Factors for some stock price.
If the gs are daily Gain Factors, then this corresponds to the sum of stock prices over the past n days, assuming the price started at $1.00 n days ago
... or it's the sum of the prices over the next n days, if today's price is $1.00 ... or they're the numbers g_{1} = P_{1}/P_{0},
g_{2} = P_{2}/P_{1}, ... g_{n} = P_{n}/P_{n-1} where the P's are the stock prices.
In what follows we'll assume that the starting price P_{0} = $1.00.


For convenience, we'll set: G_{m} = g_{1}g_{2}g_{3}...g_{m} for m = 1, 2, 3, ... n
(This corresponds to the price after m days: G_{m} = P_{m}/P_{0})
We can then write:
[1b] SUM(n) = G_{1} + G_{2} + G_{3} + ... + G_{n}
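If it helps to see [1a] and [1b] with actual numbers, here's a minimal Python sketch. The function names and the sample gain factors are made up for illustration; they aren't part of the original write-up.

```python
from itertools import accumulate
from operator import mul

def cumulative_gain_factors(gains):
    """Return [G_1, ..., G_n] where G_m = g_1 * g_2 * ... * g_m."""
    return list(accumulate(gains, mul))

def special_sum(gains):
    """SUM(n) = G_1 + G_2 + ... + G_n, as in [1b]."""
    return sum(cumulative_gain_factors(gains))

# Hypothetical daily gain factors (a 1% gain, a 2% loss, a 3% gain)
gains = [1.01, 0.98, 1.03]
print(cumulative_gain_factors(gains))  # prices after 1, 2, 3 days if P_0 = $1.00
print(special_sum(gains))              # the Special Sum for n = 3
```

With P_{0} = $1.00 the list of G's is just the list of daily prices, so SUM(n) is the sum of those prices.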
Since we're assuming that the gs are independent, then the Mean of G_{m} = g_{1}g_{2}g_{3}...g_{m} is g^{m}.
>Huh?
Okay, we'll recall some magic Stat Stuff regarding the
Mean, Variance, Standard Deviation and Covariance of random variables (which we'll call M[x], VAR[x], S[x] and COVAR[x]):
Stat Stuff
If x, y, x_{1}, x_{2} etc. are random variables and C is a constant, then:
 1. M[x+y] = M[x] + M[y] and M[x+C] = M[x] + C since M[C] = C
 2. VAR[x] = S^{2}[x] = M[(x - M[x])^{2}] = M[x^{2}] - (M[x])^{2} so VAR[x - M[x]] = VAR[x+C] = VAR[x]
 3. COVAR[x,y] = M[xy] - M[x] M[y] = COVAR[x+C, y]
and notice that COVAR[x,x] = VAR[x]
 4. VAR[x+y] = VAR[x] + VAR[y] + 2 COVAR[x,y] = VAR[x] + VAR[y] + 2 r(x,y) S[x]S[y]
where r(x,y) = COVAR[x,y] / (S[x]S[y]) is the Correlation Coefficient
 5. VAR[x_{1}+x_{2}+...+x_{m}] =
ΣVAR[x_{i}] + 2 ΣCOVAR[x_{j}, x_{k}]
i = 1 to m, k = 2 to m and j < k
 6. COVAR[x_{1}+x_{2}+...+x_{n},y] = COVAR[x_{1},y]+COVAR[x_{2},y]+...+COVAR[x_{n},y]
 7. If x and y (and the x_{i}) are independent, so that COVAR[x,y] = 0 and r(x,y) = 0, then:
 M[xy] = M[x] M[y] and M[x_{1}x_{2}...x_{m}] = M[x_{1}]M[x_{2}]...M[x_{m}]
 VAR[x+y] = VAR[x] + VAR[y] and VAR[x_{1}+x_{2}+...+x_{m}] = VAR[x_{1}]+VAR[x_{2}]+...+VAR[x_{m}]
 VAR[xy] = M^{2}[x]VAR[y] + M^{2}[y]VAR[x] + VAR[x]VAR[y]
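The last rule, the one for VAR[xy], is the least familiar, so here's a small Python sketch that checks it exactly for two independent variables with a handful of outcomes each. The two distributions and all the function names are invented for this check; nothing here comes from the original.

```python
from itertools import product

def mean(dist):
    """Mean of a discrete distribution given as [(value, probability), ...]."""
    return sum(p * v for v, p in dist)

def variance(dist):
    """Variance of a discrete distribution: M[(x - M[x])^2]."""
    m = mean(dist)
    return sum(p * (v - m) ** 2 for v, p in dist)

def product_distribution(dist_x, dist_y):
    """Distribution of x*y when x and y are independent."""
    return [(x * y, px * py) for (x, px), (y, py) in product(dist_x, dist_y)]

def check_product_variance(dist_x, dist_y):
    """Return (VAR[xy] computed directly, VAR[xy] from the Stat Stuff rule)."""
    lhs = variance(product_distribution(dist_x, dist_y))
    mx, my = mean(dist_x), mean(dist_y)
    vx, vy = variance(dist_x), variance(dist_y)
    rhs = mx**2 * vy + my**2 * vx + vx * vy
    return lhs, rhs

# Hypothetical example distributions (not from the original text)
X = [(0.99, 0.5), (1.03, 0.5)]
Y = [(0.95, 0.25), (1.00, 0.5), (1.10, 0.25)]
lhs, rhs = check_product_variance(X, Y)
print(lhs, rhs)  # the two numbers should agree up to rounding
```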

In addition, we'll need some other magic formulas:
Magic Formulas
 1. (1 + x)^{n} = 1 + nx approximately, when nx is small.
 2. 1 + 2 + 3 + ... + n = n(n + 1)/2
 3. 1 + x + x^{2} + ... + x^{m-1} = (x^{m} - 1) / (x - 1)
 4. 1 + 2x + 3x^{2} + ... + (m-1)x^{m-2} =
[(m-1)x^{m} - m x^{m-1} + 1] / (x - 1)^{2}
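If you don't trust magic, the two sum formulas are easy to test by brute force. A minimal Python sketch follows; the function names and the test values of x, m and n are arbitrary choices for illustration.

```python
def geometric_direct(x, m):
    """1 + x + x^2 + ... + x^(m-1), added term by term."""
    return sum(x**k for k in range(m))

def geometric_closed(x, m):
    """Closed form (x^m - 1)/(x - 1); needs x != 1."""
    return (x**m - 1) / (x - 1)

def weighted_direct(x, m):
    """1 + 2x + 3x^2 + ... + (m-1)x^(m-2), added term by term."""
    return sum(k * x**(k - 1) for k in range(1, m))

def weighted_closed(x, m):
    """Closed form [(m-1)x^m - m x^(m-1) + 1]/(x - 1)^2; needs x != 1."""
    return ((m - 1) * x**m - m * x**(m - 1) + 1) / (x - 1)**2

x, m = 1.5, 6
print(geometric_direct(x, m), geometric_closed(x, m))
print(weighted_direct(x, m), weighted_closed(x, m))

# The approximation (1 + x)^n = 1 + nx is only good while n*x is small
small_x, n = 0.0004, 40
print((1 + small_x)**n, 1 + n * small_x)
```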

>Can I just bypass the math and go directly to the result ... please?
Well ... okay. Skip ahead to the final result, [!!!].
Continuing ... to get the Variance of the SUM(n) we'll use Stat Stuff #5:
[2a] VAR[G_{1} + G_{2} + ... + G_{n}] =
ΣVAR[G_{i} ]+ 2 ΣCOVAR[G_{j}, G_{k}]
where i goes from 1 to n and the latter sum is for j < k and k goes from 2 to n
>Huh? Do you really expect me to ...?
Okay, in all its grandeur, it looks like:
VAR[G_{1} + G_{2} + ... + G_{n}]
  = VAR[G_{1}] + VAR[G_{2}] + ... + VAR[G_{n}]
  + 2COVAR[G_{1}, G_{2}]
  + 2COVAR[G_{1}, G_{3}] + 2COVAR[G_{2}, G_{3}]
  + 2COVAR[G_{1}, G_{4}] + 2COVAR[G_{2}, G_{4}] + 2COVAR[G_{3}, G_{4}]
  ...
  + 2COVAR[G_{1}, G_{n}] + 2COVAR[G_{2}, G_{n}] + ... + 2COVAR[G_{n-1}, G_{n}]

Consider COVAR[G_{j},G_{k}]. Remember that k = 2, 3, ... n and j < k.
From Stat Stuff #3:
COVAR[G_{j},G_{k}] = M[G_{j}G_{k}] - M[G_{j}]M[G_{k}]
Mean of the Product - the Product of the Means
But the g's are independent, so that
M[G_{k}] = M[g_{1}g_{2}...g_{k}] = M[g_{1}]M[g_{2}]...M[g_{k}] = g^{k}
Mean of a Product = the Product of the Means
So we can rewrite: COVAR[G_{j},G_{k}] = M[G_{j}G_{k}] - g^{j+k} so ...
>If Mean of a Product equals the Product of the Means, why isn't that COVAR[G_{j},G_{k}] zero?
Because G_{j} and G_{k} aren't independent since G_{k} contains all the factors of G_{j} ... and more!
>Huh?
Pay attention:
Consider the term M[G_{j}G_{k}] = M[(g_{1}g_{2}...g_{j})*(g_{1}g_{2}...g_{k})] =
M[(g_{1}g_{2}...g_{j})^{2}g_{j+1}g_{j+2}...g_{k}] noting that j is less than k.
Now we use "Mean of a Product equals the Product of the Means" because
(g_{1}g_{2}...g_{j})^{2} and g_{j+1}g_{j+2}...g_{k} are independent:
M[(g_{1}g_{2}...g_{j})^{2}g_{j+1}g_{j+2}...g_{k}] =
M[(g_{1}g_{2}...g_{j})^{2}] M[g_{j+1}g_{j+2}...g_{k}]
However, for the second factor we have:
M[g_{j+1}g_{j+2}...g_{k}] = M[g_{j+1}]M[g_{j+2}]...M[g_{k}] = g^{k-j}.
For the first factor we use Stat Stuff #2: Mean[x^{2}] = (Mean[x])^{2}+ VAR[x] with x = G_{j} = g_{1}g_{2}...g_{j} and get:
Mean[(g_{1}g_{2}...g_{j})^{2}] = (Mean[g_{1}g_{2}...g_{j}])^{2} +
VAR[g_{1}g_{2}...g_{j}] = (g^{j})^{2} + VAR[g_{1}g_{2}...g_{j}]
where (again!) the Mean of a Product = the Product of the Means (since the g's are independent).
It might look more elegant if we rewrite this like so:
Mean[G_{j}^{2}] = (Mean[G_{j}])^{2} +
VAR[G_{j}] = g^{2j} + VAR[G_{j}]
>Let's forget elegance, okay?
Putting it all together:
COVAR[G_{j},G_{k}] = M[G_{j}G_{k}] - M[G_{j}]M[G_{k}]
  = M[(g_{1}g_{2}...g_{j})^{2}g_{j+1}g_{j+2}...g_{k}] - g^{j+k}
  = M[G_{j}^{2}] M[g_{j+1}g_{j+2}...g_{k}] - g^{j+k}
  = [g^{2j} + VAR[G_{j}]] g^{k-j} - g^{j+k}
  = VAR[G_{j}] g^{k-j}
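That covariance result can be checked exactly, without any simulation, if we pretend each daily Gain Factor takes only a few values. The sketch below assumes (purely for illustration) that each g is 0.99 or 1.03 with equal probability, which gives Mean 1.01 and Standard Deviation 0.02, and then averages over every possible path. The names are made up for this check.

```python
from itertools import product
from math import prod

# Assumption for illustration only: two equally likely daily gain factors.
OUTCOMES = (0.99, 1.03)
g = sum(OUTCOMES) / len(OUTCOMES)   # Mean gain factor, 1.01

def expectation(f, days):
    """Exact mean of f(path) over all equally likely paths of the given length."""
    paths = list(product(OUTCOMES, repeat=days))
    return sum(f(path) for path in paths) / len(paths)

def G(path, m):
    """Cumulative gain factor G_m = g_1 g_2 ... g_m along one path."""
    return prod(path[:m])

def covar_G(j, k):
    """COVAR[G_j, G_k] = M[G_j G_k] - M[G_j] M[G_k], for j <= k."""
    mean_of_product = expectation(lambda p: G(p, j) * G(p, k), k)
    product_of_means = (expectation(lambda p: G(p, j), k)
                        * expectation(lambda p: G(p, k), k))
    return mean_of_product - product_of_means

def var_G(j):
    """VAR[G_j] = COVAR[G_j, G_j]."""
    return covar_G(j, j)

j, k = 2, 5
print(covar_G(j, k))             # left side
print(var_G(j) * g ** (k - j))   # right side: VAR[G_j] g^(k-j)
```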

So far we have:
VAR[G_{1} + G_{2} + ... + G_{n}] =
ΣVAR[G_{i}] + 2 ΣVAR[G_{j}] g^{k-j}
But we know those VAR[G_{i}] for each i = 1, 2, 3, ... n
>We do?
Yes, we did it here and it looks like this:
[3] VAR[G_{m}]
= VAR[g_{1}g_{2}g_{3}...g_{m}]
= (g^{2}+s^{2})^{m} - g^{2m}
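For anyone who doesn't want to chase the link, [3] can be recovered from what's already on this page. This is a sketch using only Stat Stuff #2 and the independence of the g's; it may not be how the linked derivation goes.

```latex
\begin{aligned}
M[g_i^2] &= (M[g_i])^2 + \mathrm{VAR}[g_i] = g^2 + s^2 \\
M[G_m^2] &= M[g_1^2\, g_2^2 \cdots g_m^2]
          = M[g_1^2]\, M[g_2^2] \cdots M[g_m^2] = (g^2 + s^2)^m \\
\mathrm{VAR}[G_m] &= M[G_m^2] - (M[G_m])^2 = (g^2 + s^2)^m - g^{2m}
\end{aligned}
```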

For typical parameters, namely daily Gain Factors and Standard Deviations, we'd have g = 1+r with r small (r is the daily return, say 0.01 or less) so g is close to "1"
and s small (say 0.02 or less) ... so s/g is small ... so we can use Magic Formula #1, like so:
VAR[G_{m}] = (g^{2}+s^{2})^{m} - g^{2m} =
g^{2m}[ (1+s^{2}/g^{2})^{m} - 1 ] ≈
g^{2m}[ (1+ms^{2}/g^{2}) - 1 ] = m g^{2m-2} s^{2}
This says that (approximately) the Standard Deviation of m-day gains is SQRT(m g^{2m-2} s^{2}) = SQRT(m) g^{m-1} s.
That's just the 1-day Standard Deviation, s, increased by a factor: the square root of the time period, SQRT(m) ... a familiar result.
>And increased by g^{m-1}, too.
Yes. That's like applying the average 1-day Gain Factor m-1 times.
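To get a feel for how much the approximation costs, here's a minimal Python sketch comparing the exact [3] with m g^{2m-2} s^{2}. The values g = 1.01 and s = 0.02 are the "typical" ones mentioned above; the function names are just for illustration.

```python
def var_G_exact(m, g, s):
    """Exact VAR[G_m] from [3]: (g^2 + s^2)^m - g^(2m)."""
    return (g * g + s * s) ** m - g ** (2 * m)

def var_G_approx(m, g, s):
    """Approximate VAR[G_m]: m g^(2m-2) s^2."""
    return m * g ** (2 * m - 2) * s * s

g, s = 1.01, 0.02
for m in (1, 10, 40):
    exact = var_G_exact(m, g, s)
    approx = var_G_approx(m, g, s)
    # relative shortfall of the approximation, as a fraction of the exact value
    print(m, exact, approx, (exact - approx) / exact)
```

The approximation drops the higher powers of s^{2}/g^{2}, so it always comes in a little low, and the shortfall grows with m.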
Anyway, [2a] becomes ...
>We're talking approximation, right?
Yes, but I won't keep repeating that word. Anyway, [2a] becomes ... approximately:
VAR[G_{1} + G_{2} + ... + G_{n}] =
ΣVAR[G_{i}] + 2 ΣCOVAR[G_{j}, G_{k}]
  =
ΣVAR[G_{i}] + 2 ΣVAR[G_{j}] g^{k-j}
  =
Σ[ i g^{2i-2} s^{2} ] + 2 Σ[ j g^{2j-2} s^{2} ] g^{k-j}
  =
(s^{2}/g^{2}) Σ[ i g^{2i} ] + 2(s^{2}/g^{2}) Σ[ j g^{k+j} ]
where i = 1 to n, k = 2 to n and j < k

[!] VAR[G_{1} + G_{2} + ... + G_{n}] =
ΣVAR[G_{i}] + 2 ΣVAR[G_{j}] g^{k-j} =
(s^{2}/g^{2}) Σ[ i g^{2i} ] + 2(s^{2}/g^{2}) Σ[ j g^{k+j} ] approx
i from 1 to n, k from 2 to n and j < k (meaning j = 1, 2, ... k-1)

From [!], we have two sums to evaluate:
Σ[ i g^{2i} ]
and Σ[ j g^{k+j}]
We have:
Σ[ i g^{2i} ] = g^{2} + 2g^{4} + 3g^{6} + ... + n g^{2n}
  =
x + 2x^{2} + 3x^{3} + ... + n x^{n} = x (1 + 2x + 3x^{2} + ... + n x^{n-1}) where x = g^{2} ... and we have a magic formula for that sum
  = x [n x^{n+1} - (n+1)x^{n} + 1]/(x-1)^{2}
  = g^{2} [n g^{2n+2} - (n+1)g^{2n} + 1]/(g^{2} - 1)^{2}

Magic Formula #4 was used (with m = n+1):
1+2x+3x^{2}+...+nx^{n-1} =
[nx^{n+1} - (n+1)x^{n} + 1] / (x-1)^{2}
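A quick brute-force check of that closed form, as a Python sketch (the function names and test values are arbitrary; g must not equal 1 or the denominator vanishes):

```python
def sum_i_g2i_direct(n, g):
    """g^2 + 2g^4 + 3g^6 + ... + n g^(2n), added term by term."""
    return sum(i * g ** (2 * i) for i in range(1, n + 1))

def sum_i_g2i_closed(n, g):
    """Closed form g^2 [n g^(2n+2) - (n+1) g^(2n) + 1]/(g^2 - 1)^2; needs g != 1."""
    x = g * g
    return x * (n * x ** (n + 1) - (n + 1) * x ** n + 1) / (x - 1) ** 2

for n, g in ((5, 1.5), (20, 1.01), (40, 1.01)):
    print(n, g, sum_i_g2i_direct(n, g), sum_i_g2i_closed(n, g))
```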
>That looks awful. How about in all its grandeur, eh?
Okay, in all its grandeur, it looks like:
Σ[ j g^{k+j} ] =
[g^{3}] + [g^{4}+2g^{5}] + [g^{5}+2g^{6}+3g^{7}] + [g^{6}+2g^{7}+3g^{8}+4g^{9}] + ... +
[g^{n+1}+2g^{n+2}+3g^{n+3}+...+(n-1)g^{2n-1}]
  = g^{3} + g^{4}[1+2g] + g^{5}[1+2g+3g^{2}] + ... + g^{n+1}[1+2g+3g^{2}+...+(n-1)g^{n-2}] for n-1 terms
  = (g^{3}+g^{4}+g^{5}+...+g^{n+1}) +
(g^{4}+g^{5}+g^{6}+...+g^{n+1})2g +
(g^{5}+g^{6}+...+g^{n+1})3g^{2} + ... +
g^{n+1}(n-1)g^{n-2}
where we've collected the terms multiplying 1 then 2g then 3g^{2} etc. ... ending with the term multiplying (n-1)g^{n-2}
  = g^{3}[1+g+g^{2}+...+g^{n-2}] +
2g^{5}[1+g+g^{2}+...+g^{n-3}] +
3g^{7}[1+g+g^{2}+...+g^{n-4}] + ... +
(n-1)g^{2n-1}[1]
  = g^{3}[(g^{n-1} - 1)/(g-1)] +
2g^{5}[(g^{n-2} - 1)/(g-1)] +
3g^{7}[(g^{n-3} - 1)/(g-1)] + ... +
(n-1)g^{2n-1}[(g-1)/(g-1)]
where we've used another magic formula: 1 + x + x^{2} + ... + x^{m-1} = (x^{m} - 1)/(x-1)
  = [
[g^{n+2} - g^{3}] +
[2g^{n+3} - 2g^{5}] +
[3g^{n+4} - 3g^{7}] +
[4g^{n+5} - 4g^{9}] + ... +
[(n-1)g^{2n} - (n-1)g^{2n-1}]
] / (g-1)
  = [
g^{n+2}[1+2g+3g^{2}+...+(n-1)g^{n-2}] -
g^{3}[1+2g^{2}+3g^{4}+...+(n-1)g^{2n-4}]
] / (g-1)
where we've collected like terms
  = [
g^{n+2}[(n-1)g^{n} - ng^{n-1} + 1] / (g-1)^{2} -
g^{3}[(n-1)g^{2n} - ng^{2n-2} + 1] / (g^{2}-1)^{2}
] / (g-1)
where we've used Magic Formula #4 (with m = n), once with x = g and again with x = g^{2}
  = [
{ (n-1) g^{2n+2} - n g^{2n+1} + g^{n+2} } (g+1)^{2} - (n-1) g^{2n+3} + n g^{2n+1} - g^{3}
] / [ (g - 1)(g^{2} - 1)^{2} ]
where we've taken a (g^{2} - 1)^{2} out, to the right
  = ...
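With that much algebra it's worth confirming that the last expression really equals the double sum we started with. A minimal Python sketch (function names and test values are arbitrary; g must not equal 1):

```python
def double_sum_direct(n, g):
    """Sum of j g^(k+j) over k = 2..n and j = 1..k-1, added term by term."""
    return sum(j * g ** (k + j) for k in range(2, n + 1) for j in range(1, k))

def double_sum_closed(n, g):
    """The closed form reached at the end of the derivation; needs g != 1."""
    numerator = (((n - 1) * g ** (2 * n + 2) - n * g ** (2 * n + 1) + g ** (n + 2))
                 * (g + 1) ** 2
                 - (n - 1) * g ** (2 * n + 3) + n * g ** (2 * n + 1) - g ** 3)
    return numerator / ((g - 1) * (g * g - 1) ** 2)

for n, g in ((2, 1.5), (6, 1.5), (20, 1.01), (40, 1.01)):
    print(n, g, double_sum_direct(n, g), double_sum_closed(n, g))
```

For n = 2 the double sum has the single term g^{3}, which makes a handy spot check. When g is very close to 1 the closed form subtracts nearly equal numbers, so expect a few digits of rounding loss there.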

>Can't you just give the final result?!
Okay, we've calculated both sums ... so here it is:
[!!!] VAR[G_{1} + G_{2} + ... + G_{n}] = s^{2} [n g^{2n+2} - (n+1)g^{2n} + 1]/(g^{2} - 1)^{2}
+
2(s^{2}/g^{2}) [
{(n-1) g^{2n+2} - n g^{2n+1} + g^{n+2}} (g+1)^{2} - (n-1) g^{2n+3} + n g^{2n+1} - g^{3}
] / [ (g-1)(g^{2}-1)^{2} ]
where
g_{1}, g_{2}, ...g_{n} are random Gain Factors over n days,
they are from a distribution with Mean = g and Standard Deviation = s,
G_{m} = g_{1}g_{2} ...g_{m} are the cumulative Gain Factors
and the formula is good for daily gains and n not too large (say n < 50)
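Since nobody will evaluate [!!!] by hand, here it is as a Python sketch, next to the "exact" variance built from [3] and the covariance result (no small-s approximation). The function names are made up for illustration, and g must not equal 1.

```python
def var_sum_formula(n, g, s):
    """Approximate VAR[G_1 + ... + G_n] from [!!!]; needs g != 1."""
    first = s**2 * (n * g**(2*n + 2) - (n + 1) * g**(2*n) + 1) / (g**2 - 1)**2
    numerator = (((n - 1) * g**(2*n + 2) - n * g**(2*n + 1) + g**(n + 2))
                 * (g + 1)**2
                 - (n - 1) * g**(2*n + 3) + n * g**(2*n + 1) - g**3)
    second = 2 * (s**2 / g**2) * numerator / ((g - 1) * (g**2 - 1)**2)
    return first + second

def var_sum_exact(n, g, s):
    """Sum of VAR[G_i] plus 2 * sum of VAR[G_j] g^(k-j), with VAR[G_m] from [3]."""
    def var_G(m):
        return (g * g + s * s)**m - g**(2 * m)
    total = sum(var_G(i) for i in range(1, n + 1))
    total += 2 * sum(var_G(j) * g**(k - j)
                     for k in range(2, n + 1) for j in range(1, k))
    return total

g, s = 1.01, 0.02
for n in (1, 2, 10, 40):
    approx, exact = var_sum_formula(n, g, s), var_sum_exact(n, g, s)
    print(n, approx, exact, (exact - approx) / exact)
```

For n = 1 the formula collapses to s^{2}, and for n = 2 to s^{2}(1 + 2g + 2g^{2}), which are easy to confirm by hand.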

>Isn't there something more elegant?
You said to forget elegance. Besides, we won't be using it ... not with pencil and paper. We'll use a spreadsheet and ...
>How good ... uh, how bad is it?
Okay, here's what we'll do (again!):
 1. Generate n daily Gain Factors: g_{1}, g_{2}, ... g_{n}.
 2. With these, construct the numbers G_{1}=g_{1}, G_{2}=g_{1}g_{2}, ... G_{n}=g_{1}g_{2}...g_{n}.
 3. Calculate SUM(n) = G_{1} + G_{2} + ... + G_{n}.
 4. Repeat steps 1, 2 and 3 ten thousand times and calculate the Variance of the 10,000 numbers SUM(n).
 5. Repeat steps 1, 2, 3 and 4 for n = 1, 2, 3, ... 40.
 6. Compare the Variances obtained (from this simulated data) with the formula [!!!].
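The original experiment was done in a spreadsheet; the steps above can be sketched in Python like so. This version assumes the daily Gain Factors are normally distributed with Mean g and Standard Deviation s (the text doesn't say which distribution was used), and the function names, seed and chosen values of n are all illustrative.

```python
import random
import statistics

def special_sum(gains):
    """Steps 2 and 3: build G_1, G_2, ... on the fly and add them up."""
    total, G = 0.0, 1.0
    for gain in gains:
        G *= gain
        total += G
    return total

def simulated_variance(n, g, s, trials=10_000, seed=0):
    """Steps 1 to 4: Variance of SUM(n) over many random trials."""
    rng = random.Random(seed)
    sums = [special_sum(rng.gauss(g, s) for _ in range(n)) for _ in range(trials)]
    return statistics.pvariance(sums)

def formula_variance(n, g, s):
    """The approximate formula [!!!]; needs g != 1."""
    first = s**2 * (n * g**(2*n + 2) - (n + 1) * g**(2*n) + 1) / (g**2 - 1)**2
    numerator = (((n - 1) * g**(2*n + 2) - n * g**(2*n + 1) + g**(n + 2))
                 * (g + 1)**2
                 - (n - 1) * g**(2*n + 3) + n * g**(2*n + 1) - g**3)
    return first + 2 * (s**2 / g**2) * numerator / ((g - 1) * (g**2 - 1)**2)

# Steps 5 and 6, for a few values of n rather than all of 1..40
g, s = 1.01, 0.02   # 1% average daily return, 2% Standard Deviation
for n in (5, 20, 40):
    print(n, simulated_variance(n, g, s), formula_variance(n, g, s))
```

With 10,000 trials the simulated Variance itself wobbles by a percent or two from run to run, so don't expect agreement to many digits.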
The result is shown below where we also plot the Standard Deviation = SQRT(Variance).
It assumes an average daily return of 1% and a Standard Deviation (of daily returns) of 2%:
Figure 2
Notice an interesting thing, in [!!!].
The Variance of the sum of Gain Factors for the past n days is proportional to the Variance of the Returns, namely s^{2}.
That means that the Standard Deviation of this Special Sum is proportional to s.
It looks like:
SD[G_{1} + G_{2} + ... + G_{n}] = f(n,g)s.
If we assume that the starting stock price was P_{0}, n days ago, then we have:
SD[P_{1} + P_{2} + ... + P_{n}] = P_{0}f(n,g)s.
>Yeah, so what good is it?
Some time ago I was looking for the Variance of stock prices over the past n days, in connection with Bollinger Bands,
here.
>I remember. You got a lousy result.
Uh ... yes, thanks. I took "the Variance of a Sum = the Sum of the Variances" as an approximation and ...
>That's your creeping senility ... again?
Yes, thanks ... again.
