I got email from Ron McEwan about generating ellipses that contain points in a scatter plot where ...
>Wait! What are you talking about?
Okay, we do this:
 Plot a bunch of points (x_{1}, y_{1}), (x_{2}, y_{2}) ... (x_{n}, y_{n}) (that's our scatter plot)
 Generate an ellipse centred on the origin.
 Vary the lengths of the axes so that it'll contain "most" of the points.
>Most?
Okay, let's do this:
 Assume we have stock returns for two stocks: p_{k} and q_{k} for k = 1, 2, ... n.
 Assume that they are random selections from some distribution.
 We plot the deviations from the Means.
That is, if M[p] = (p_{1} + p_{2} + ... + p_{n}) / n, then we actually plot x_{k} = p_{k} - M[p].
 Do the same for the y-values. That is, plot y_{k} = q_{k} - M[q].
Note: The origin (0,0) represents a point where both returns are at their respective Means.
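Here's a minimal sketch of that recipe in Python. The return numbers are made up for illustration (they are not real stock data) and the function name is just a convenient label:

```python
from statistics import mean

def deviations_from_mean(returns):
    """Return each value minus the arithmetic mean of the list."""
    m = mean(returns)
    return [r - m for r in returns]

# Hypothetical returns for two stocks (made-up numbers, not real data).
p = [0.012, -0.020, 0.015, 0.005, -0.002]
q = [0.020, -0.010, 0.010, 0.000, -0.005]

x = deviations_from_mean(p)   # x_k = p_k - M[p]
y = deviations_from_mean(q)   # y_k = q_k - M[q]
points = list(zip(x, y))      # the (x_k, y_k) pairs for the scatter plot
```

By construction the x-values and the y-values each average to zero, which is why the origin sits at the Means.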
 Fig. 1 Ellipse and scatter plot 
If the returns have a normal distribution, then each set of x- and y-values has Mean = 0
and, let's say, Standard Deviations S_{x} and S_{y}.
Then there's a probability of finding the x-value between x and x + dx given by the magic formula (see Fig. 2):
 f(x) dx = (1 / (S_{x} √(2π))) e^{-(1/2) (x/S_{x})^{2}} dx
Similarly for y-values, so we imagine a wee rectangle and ...
>Rectangle?
Can you see that wee rectangle in Figure 1? The width is dx and the height is dy and it's located at (x,y).
>I really think that ...

Fig. 2 Normal distribution (zero mean): f(x) dx is the probability of x being in an interval about x of width dx 

Patience. All will be clear in a minute or three.
Okay, knowing the probability of the horizontal value being in a wee interval at x and the vertical value being in a wee interval about y,
the probability of finding a point in a wee rectangle at (x,y) should be (using the magic formula in Fig 2) proportional to:
e^{-(1/2) { (x/S_{x})^{2} + (y/S_{y})^{2} } } 
 For sanitary reasons, we'll let S_{x} = a and S_{y} = b so we get:
 e^{-(1/2) { (x/a)^{2} + (y/b)^{2} } } 

>You just multiply the probabilities, right?
Yes, that's what I've done.
Then there are curves where the probability is constant, namely ...
>Don't tell me! They're the curves (x/a)^{2} + (y/b)^{2} = constant.
Indubitably! And those curves have a name. They're called ...
>Don't tell me! They're ellipses.
You got it.
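To put a number on "most", here's a rough sketch that counts the fraction of points inside the ellipse (x/a)^{2} + (y/b)^{2} = r^{2}. It assumes the x- and y-values are independent and normal, as above; the values of a and b are made up, and the points are simulated rather than real returns. Under those assumptions the fraction inside should be close to 1 - e^{-r^{2}/2}, so the sketch prints both for comparison:

```python
import math
import random

def fraction_inside(points, a, b, r):
    """Fraction of (x, y) points with (x/a)^2 + (y/b)^2 <= r^2."""
    inside = sum(1 for x, y in points if (x / a) ** 2 + (y / b) ** 2 <= r * r)
    return inside / len(points)

random.seed(1)
a, b = 0.02, 0.01   # made-up standard deviations S_x and S_y

# Simulated independent, zero-mean normal deviations (an assumption).
points = [(random.gauss(0, a), random.gauss(0, b)) for _ in range(20000)]

for r in (1.0, 2.0, 3.0):
    observed = fraction_inside(points, a, b, r)
    expected = 1 - math.exp(-r * r / 2)   # holds for independent normals
    print(r, round(observed, 3), round(expected, 3))
```

So if you want the ellipse to hold a given fraction of the points, you scale both axes by the same factor r.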
Okay, now comes the interesting part ... rotating the ellipse.
>And the significance of that is ... what?
Patience.
 Fig. 3 A rotated ellipse 
Let's look at a real, live scatter plot, say GE versus the DOW.
Fig. 4 shows the relationship between the daily returns, over the past year. See the four wee rectangles?
According to the above prescription, we should associate with each the same probability, but we see that ...
>But the rectangles in the 2nd and 4th quadrant don't have too many points. In fact, the points seem to ...
They seem to crowd together along the "regression line", right?
(See this or this.)
So, if the returns are not independent, but somehow tend to be linearly related (as implied by their hugging the regression line), then the prescription
above for the probability of finding a point in a wee rectangle needs to be modified.
Indeed, if we have a randomly selected x-value, the corresponding y-value is likely to be close to that regression line.
If the equation of that line is y = mx + k, then for a given x-value, we expect the y-value to be close to mx + k.
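If you want m and k for your own data, an ordinary least-squares fit is one common way to get them. What follows is a sketch under that assumption (the text above doesn't say how the regression line was computed), and the sample numbers are made up:

```python
def regression_line(xs, ys):
    """Least-squares slope m and intercept k for the line y = m*x + k."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    k = mean_y - m * mean_x   # the fitted line passes through (mean_x, mean_y)
    return m, k

# Made-up values lying exactly on y = 1.2 x + 0.001, for illustration only.
xs = [-0.02, -0.01, 0.0, 0.01, 0.02]
ys = [1.2 * x + 0.001 for x in xs]
m, k = regression_line(xs, ys)
```

Note that a least-squares line always passes through the point of Means, so if x and y are already deviations from their Means, this fit gives k = 0 (up to rounding).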
 Fig. 4 GE returns vs DOW returns 
>What's that tall skinny rectangle doing in Fig. 4?
Patience. By the way, if you want to see a scatter plot where the returns really hug the regression line, check this out:
Okay, we'll study Bayes' Theorem. It gives us the probability of getting a particular pair (x,y) if y depends (in some way) upon x.
We'll call P[x] the probability of getting a particular x-value. (Actually, P[x] will be the probability of getting a value in some wee interval about x.)
The probability of getting that x-value together with a particular y-value is (according to Bayes):
[A] P[x and y] = P[x] P[y, given x] 
The guy called P[x] can be obtained from Fig. 2 (assuming the xvalues are normally distributed):
[B] P[x] is proportional to e^{-(1/2) (x/a)^{2}} 
To get P[y, given x] we have to stare carefully at the distribution of y-values for a given x-value. (Remember the tall, skinny rectangle?)
Since we're doing the math snowjob thing (by assuming normal distributions for convenience and simplicity), we'll assume that the y-values are normally distributed about mx + k.
(Them's the y-values in the tall, skinny rectangle in Fig. 4 ... distributed about the value mx + k).
That is, the probability P[y, given x] looks like what's in Fig. 2, with x replaced by y - mx - k:
[C] P[y, given x] is proportional to
e^{-(1/2) {(y - mx - k)/c}^{2}}
where c is the standard deviation associated with the distribution of y-values for a given x-value. 
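One rough way to estimate c from data is to measure how far the y-values stray from the line, pooling all the points together. This sketch assumes the scatter about the line is the same for every x-value and that m and k are already known; the function name and the numbers are made up for illustration:

```python
import math

def residual_spread(xs, ys, m, k):
    """Root-mean-square of the residuals y - (m*x + k) about a given line."""
    residuals = [y - (m * x + k) for x, y in zip(xs, ys)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Made-up points that sit 0.1 above or below the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.1, 6.9]
c = residual_spread(xs, ys, m=2.0, k=1.0)
```

With real returns you would pass in the (x, y) deviations and the slope and intercept of the regression line.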
>Huh? Replace y by y - (mx + k)? Can you do that?
In Fig. 2, the variable x appears all by itself 'cause its mean is 0.
In general (for a nonzero mean M and standard deviation S), we'd have this:
 f(x) dx = (1 / (S √(2π))) e^{-(1/2) {(x - M)/S}^{2}} dx
In our investigation of the distribution of y-values, given x, the Mean value of the y-values is assumed to be (mx + k).
>Can you do that?
This is mathmanship, remember?
Now we plug [B] and [C] into [A]
and get something that looks like this:
[D] P[x and y] is proportional to
e^{-(1/2) (x/a)^{2}} e^{-(1/2) {(y - mx - k)/c}^{2}} 
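To see what [D] does to those wee rectangles, here's a small sketch that evaluates the right-hand side of [D] (without the normalizing constant) at a point near the line and at a point well away from it. The values of a, m, k and c are made up for illustration:

```python
import math

def joint_weight(x, y, a, m, k, c):
    """Unnormalized P[x and y] from [D]: the product of [B] and [C]."""
    weight_x = math.exp(-0.5 * (x / a) ** 2)                        # [B]
    weight_y_given_x = math.exp(-0.5 * ((y - m * x - k) / c) ** 2)  # [C]
    return weight_x * weight_y_given_x

a, m, k, c = 0.01, 1.2, 0.0, 0.005   # hypothetical parameter values

on_line = joint_weight(0.01, 0.012, a, m, k, c)    # sits on y = mx + k
off_line = joint_weight(0.01, -0.012, a, m, k, c)  # same x, 4th quadrant
```

The point on the line gets a far larger weight than the one in the 4th quadrant, which matches the way the points in Fig. 4 crowd along the regression line.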
>But I don't see any characteristics of the y distribution. I mean, where's the standard deviation of the y-values and ...
They're tied up in the values of m and k. Just wait until we're finished, okay? All will become clear.
>I doubt it!
>So, where's the spreadsheet?
Click on the picture:
>Uh ... is it useful?
How would I know?
