## The distance to a point from an arbitrary line

### An Exercise in Making Things Easier for Yourself

The problem: you have a point and a line, and you want the distance from the point to the line, meaning the shortest distance between the two.

Given: the point $(a_x,a_y)$ and the line $y=mx+b$ as in the picture, we want to find the point $(p_x,p_y)$ where the perpendicular through $a$ meets the line, so we can then find the distance from $a$ to $p$.

### Don’t reinvent the wheel (use what you already know)

Things we remember (hopefully) that could be useful but we won’t prove or explain here:

1. The shortest distance from a point to a line is along the line that is perpendicular to the given line, and passes through the given point. This is a pretty good guess based on the picture, and conveniently enough happens to be true.
2. Perpendicular lines have negative reciprocal slopes. This one isn’t so obvious, and maybe I’ll have an entry for it someday, but not today.

If we do nothing to make the problem easier, then things will get ugly. Therefore, that’s exactly what we will do! After all, whoever learned something by watching someone else do it the easiest way first?

### The plan

A little planning first: we’ll find the equation of the red line, then use that to find the point of intersection, then use the Pythagorean Theorem to find the distance, and last of all try to clean things up a little bit. Also, relax! This is a first draft, so with a little luck we will be able to clean it up a bit.

### The Red Line

Let’s find the equation of the red line. We know that it has slope $-\frac{1}{m}$ (the negative reciprocal of $m$) and goes through the point $(a_x,a_y)$, so let’s do a little math and find an equation that fits. Find $\color{purple}b$, which is the $y$-intercept of the red line:

\begin{align*} y&=-\frac{1}{m}x+\color{purple}b \\ a_y&=-\frac{1}{m}a_x+\color{purple}b \\ a_y&=-\frac{a_x}{m}+\color{purple}b \\ a_y+\frac{a_x}{m}&=\color{purple}b \\ \end{align*}

We start with the equation of some line having slope $-\frac{1}{m}$, then put in the point $(a_x,a_y)$ because we know the line passes through it, so $x=a_x$ and $y=a_y$ must make the equation true. The key here is to recognize that when we know what everything is except for $b$, that tells us we should probably solve for $b$. Now we can write the equation of our line with slope $-\frac{1}{m}$ and passing through point $(a_x,a_y)$ as:

\begin{align} y=-\frac{1}{m}x + \frac{a_x}{m}+a_y \label{a1} \end{align}

### Point of Intersection

Great! Now all we have to do is find the point of intersection and then use the Pythagorean Theorem to find the distance between the two points. There are lots of ways to find the intersection of two lines; we’ll just use one of them. According to $\eqref{a1}$, another way to write $y$ is as $-\frac{1}{m}x + \frac{a_x}{m}+a_y$, so let’s do a little substitution:

\begin{align} {\color{blue}y}&={\color{blue}-\frac{1}{m}x + \frac{a_x}{m}+a_y} \notag \\ {\color{blue}y}&=mx+b \label{a2} \\ {\color{blue}-\frac{1}{m}x + \frac{a_x}{m}+a_y}&=mx+b \notag \\ \end{align}

Then all we need to do is to solve for $\color{red}x$:

\begin{align*} \def\x{{\color{red}x}} -\frac{1}{m}\x + \frac{a_x}{m}+a_y&=m\x +b \\ \frac{a_x}{m} + a_y - b &= m\x + \frac{1}{m}\x \\ \frac{a_x}{m} + \frac{ma_y}{m} - \frac{mb}{m} &= \x (m+\frac{1}{m}) \\ \frac{a_x + ma_y - mb}{m} &= \x (m\cdot\frac{m}{m}+\frac{1}{m}) \\ \frac{a_x + ma_y - mb}{m} &= \x (\frac{m^2+1}{m}) \\ \frac{a_x + ma_y - mb}{m}\cdot\frac{m}{m^2+1}&=\x \\ \frac{a_x + ma_y - mb}{m^2+1}&=\x \\ \end{align*}

Now we know what $x$ must be…brilliant! Since we know what $\color{red}x$ is now, let’s just substitute that into $\eqref{a2}$ and see what we get for $y$:

\begin{align*} y&=m{\color{red}x} +b \\ y&=m{\color{red}\frac{a_x + ma_y - mb}{m^2+1}}+b \\ y&=\frac{ma_x + m^2a_y - m^2b}{m^2+1}+b \\ \end{align*}

The point of intersection is a little messy:

$\left(\frac{a_x + ma_y - mb}{m^2+1}, \frac{ma_x + m^2a_y - m^2b}{m^2+1}+b\right)$
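For anyone who would rather check that messy point with numbers than with algebra, here is a minimal Python sketch. The function name `foot_of_perpendicular` and the sample line and point are my own illustrative choices, not values from the picture.

```python
def foot_of_perpendicular(ax, ay, m, b):
    """Point where the perpendicular through (ax, ay) meets y = m*x + b."""
    px = (ax + m * ay - m * b) / (m * m + 1)  # the x found above
    py = m * px + b                           # put x back into the line
    return px, py


# Illustrative values (assumed, not from the article): line y = 2x + 1, point (3, 2).
px, py = foot_of_perpendicular(3.0, 2.0, 2.0, 1.0)

# The segment from (3, 2) to (px, py) should have slope -1/m.
slope_to_point = (py - 2.0) / (px - 3.0)
print(px, py, slope_to_point)
```

The derivation divided by $m$ along the way, but the final expression for $x$ does not, so the sketch also gives a sensible answer for a horizontal line.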

### Last step: Pythagorean Theorem

Well, no one ever said this was going to be easy, but we’re a little crazy so let’s just keep going. One Pythagorean Theorem coming right up! (To be fair, I have a bad feeling about this…)

\begin{align} d^2&=(\Delta x)^2 + (\Delta y)^2 \notag \\ d^2&=\left( \frac{a_x + ma_y - mb}{m^2+1} - a_x \right)^2 + \left(\frac{ma_x + m^2a_y - m^2b}{m^2+1}+b-a_y \right)^2 \notag \\ d^2&=\left( \frac{a_x + ma_y - mb}{m^2+1} - \frac{a_x(m^2+1)}{m^2+1} \right)^2 + \left(\frac{ma_x + m^2a_y - m^2b}{m^2+1}+\frac{(b-a_y)(m^2+1)}{m^2+1} \right)^2 \label{d1} \\ d^2&=\left( \frac{a_x + ma_y - mb - a_x(m^2+1)}{m^2+1} \right)^2 + \left(\frac{ma_x + m^2a_y - m^2b + (b-a_y)(m^2+1)}{m^2+1} \right)^2 \notag \\ (m^2+1)^2\cdot d^2&=( a_x + ma_y - mb - a_x(m^2+1) )^2 + (ma_x + m^2a_y - m^2b + (b-a_y)(m^2+1) )^2 \notag \\ (m^2+1)^2\cdot d^2&=( a_x + ma_y - mb - m^2a_x - a_x )^2 + (ma_x + m^2a_y - m^2b + m^2b + b - m^2a_y - a_y )^2 \notag \\ (m^2+1)^2\cdot d^2&=( ma_y - mb - m^2a_x )^2 + (ma_x + b - a_y )^2 \label{d2} \\ (m^2+1)^2\cdot d^2&=( -m(-a_y + b + ma_x ) )^2 + (ma_x + b - a_y )^2 \notag \\ (m^2+1)^2\cdot d^2&= (-m)^2 (- a_y + b + ma_x )^2 + (ma_x + b - a_y )^2 \notag \\ (m^2+1)^2\cdot d^2&= m^2 (ma_x - a_y + b)^2 + (ma_x + b - a_y )^2 \label{d3} \\ (m^2+1)^2\cdot d^2&= (m^2+1) (ma_x - a_y + b)^2 \notag \\ (m^2+1)\cdot d^2&= (ma_x - a_y + b)^2 \notag \\ d^2&= \frac{(ma_x - a_y + b)^2}{m^2+1} \notag \\ d&= \frac{|ma_x - a_y + b|}{\sqrt{m^2+1}} \label{d4} \\ \end{align}

…and there’s our answer. Wow, that was really ugly. The answer isn’t too bad, but getting from the third line through about the sixth really took being careful.
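A quick sanity check never hurts after that much algebra. The sketch below compares the long way (find the intersection, then use the Pythagorean Theorem) against the one-line result, taking the absolute value of the numerator so the distance never comes out negative. The function name and the sample numbers are assumptions of mine, chosen only for illustration.

```python
import math


def distance_point_to_line(ax, ay, m, b):
    """Distance from (ax, ay) to the line y = m*x + b, using the closed form."""
    return abs(m * ax - ay + b) / math.sqrt(m * m + 1)


# Illustrative values (assumed): line y = 2x + 1, point (3, 2).
ax, ay, m, b = 3.0, 2.0, 2.0, 1.0

# The long way: intersection point first, then Pythagoras.
px = (ax + m * ay - m * b) / (m * m + 1)
py = m * px + b
long_way = math.hypot(px - ax, py - ay)

short_way = distance_point_to_line(ax, ay, m, b)
print(long_way, short_way)
```

Both routes should agree to within floating-point rounding; if they ever disagree, the algebra (or the typing) deserves a second look.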

### Reflection: What went well and what didn’t

End of first draft. Looking back, what could we have done better? To be fair, up on line three, I considered multiplying everything out to make it even tougher, but I’m too lazy for that. Leaving it as it was probably helped, considering that the $m^2+1$ ended up factoring out and then canceling out, but the bigger prize was the other factor: $ma_x - a_y + b$. If I had multiplied out the squares at $\eqref{d1}$, I’m not sure I would have found the factors in $\eqref{d2}$ and $\eqref{d3}$.

What could we have done differently? The math was mostly straightforward: we used a few well-known formulas, did some substitution, and solved for a variable. The biggest obstacle was that it was a little more involved than solving $2x+7 = 5x-2$ for $x$. If I could find a way to reduce the number of variables involved, that would really help.

### Using Zero!

The best way I know of to make something disappear is to define it as being ZERO. Since the origin of a graph is defined as being $(0,0)$, why not take advantage of that? Moving the origin won’t change much, and it may reduce what we need to do.

Whoops! I don’t think that line has the same equation now. It still has the same slope, but the $y$-intercept has changed. When we moved the origin from $(0,0)$ to $(1,4)$, we effectively subtracted 4 from every $y$ value and 1 from every $x$ value. Let’s do a little colorful line graphing to figure out how to fix our equation. We know we want the blue line to pass through (on this graph) $(2,-3)$ and we want to use the 1 and 4 to get there.

To get the purple line, subtract 1 from the $x$-intercept (it sounds a little odd, but bear with me). How much did the $y$-intercept change? It moved up from -1 to $-\frac{1}{3}$, a change of $+\frac{2}{3}$. Coincidence? As one of my students once said, “Nah…there are no coincidences in math.” That looks suspiciously like we added $ma_x$ to $b$. Then to get the green line, we subtract 4 from the $y$-intercept. That change is pretty straightforward.

So to fix the equation of our line, we do the same:

\begin{align*} y&=mx+b \\ y&=mx+b \,{\color{purple}+\,ma_x}\, {\color{green}-\,a_y} \\ \end{align*}

Referring to my student above, is it really coincidence that our new $b$ value is $ma_x - a_y + b$, the same expression that shows up in $\eqref{d4}$?
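The numbers in the graphs can be pushed through that fix as a check. In the sketch below the old intercept $-1$, the new origin $(1,4)$, and the target point $(2,-3)$ come from the text; the slope $\frac{2}{3}$ is inferred from the $+\frac{2}{3}$ change in the intercept when $a_x=1$. The function name is hypothetical.

```python
from fractions import Fraction


def shift_line_to_new_origin(m, b, ax, ay):
    """Slope and intercept of y = m*x + b after moving the origin to (ax, ay)."""
    return m, b + m * ax - ay


# Slope 2/3 is inferred from the intercept change described above; b = -1 is from the text.
m, b = Fraction(2, 3), Fraction(-1)
new_m, new_b = shift_line_to_new_origin(m, b, 1, 4)

# On the shifted graph the line should pass through (2, -3).
y_at_2 = new_m * 2 + new_b
print(new_b, y_at_2)
```

Using `Fraction` keeps the thirds exact, so the comparison with $-3$ is not muddied by rounding.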

### Shorter math is easier

Now that we have a new equation, and some convenient zeroes, let’s do this again, with $p$ being a point that just happens to be on both lines:

\begin{align*} p_y&=-\frac{1}{m}p_x \qquad && \text{(red line)} \\ p_y&=mp_x + (b+ma_x-a_y) &&\text{(blue line)} \\ -\frac{1}{m}p_x &= mp_x + (b+ma_x-a_y) \\ -(b+ma_x-a_y) &= mp_x+\frac{1}{m}p_x \\ a_y-b-ma_x &= p_x\left(m+\frac{1}{m}\right) \\ a_y-b-ma_x &= \frac{m^2+1}{m}p_x \\ p_x&=\frac{m(a_y-b-ma_x)}{m^2+1} &&\text{(x)} \\ p_y&=-\frac{1}{m}p_x \\ p_y&=-\frac{1}{m}\frac{m(a_y-b-ma_x)}{m^2+1} \\ p_y&=\frac{b+ma_x-a_y}{m^2+1} &&\text{(y)} \\ d^2&=\left(\frac{m(a_y-b-ma_x)}{m^2+1}\right)^2 + \left(\frac{b+ma_x-a_y}{m^2+1}\right)^2 \\ (m^2+1)^2\cdot d^2&=(m(a_y-b-ma_x))^2 + (b+ma_x-a_y)^2 \\ \end{align*}

…and we are back to $\eqref{d3}$. That was much easier!

(moral of the story: zero is your friend!)

So last time we were working on finding a good way to say “the darts I throw went more or less HERE”.

| Points | Distance | $(x_i-\bar{x})^2$ | $(y_i-\bar{y})^2$ |
| --- | --- | --- | --- |
| (-2.2,4.1) | 4.6 | 7.8 | 13.7 |
| (-1.1,2.1) | 2.4 | 2.9 | 2.9 |
| (-0.5,0.7) | 1.1 | 1.2 | 0.1 |
| (-2.5,-3.1) | 4.7 | 9.6 | 12.3 |
| (-0.3,-3.5) | 4 | 0.8 | 15.2 |
| (2.2,3.8) | 3.8 | 2.6 | 11.6 |
| (2.4,1.8) | 2.3 | 3.2 | 2 |
| (3.8,-0.4) | 3.3 | 10.2 | 0.6 |
| (3.2,-1.8) | 3.4 | 6.8 | 4.8 |
| Avg=(0.6,0.4) | Max=4.7 | Avg=5 | Avg=7 |
| | | $\sqrt{\text{Avg}}=2.2$ | $\sqrt{\text{Avg}}=2.6$ |

$(\sigma_x,\sigma_y)=\sqrt{\frac{1}{n}\sum_{i=1}^n((x_i-\bar{x})^2,(y_i-\bar{y})^2)}$

And in the end we came up with a formula for “standard deviation”: $\sigma=\sqrt{\frac{1}{n}\sum(x_i-\bar{x})^2}$

It looks like the box is at least moderately representative of where the darts went, but not many of the darts actually landed in the box. Why not? We came up with an oddball average distance away from center in the x- and y-directions, and based our rectangle on that, so why does it contain so few of the points? The box above contains only two of the nine darts, so how can we justify calling it an ‘average’ center of any sort?

Consider for a moment what this would look like if all of the darts had a zero y-value (orange) or a zero x-value (green):

Looking at the orange dots, it looks like there are a bunch to the left and a bunch to the right and not many in the middle, so if our box is in the middle it’s probably OK if it only has a few darts in it. For the greens, five of the nine are in the box so I can’t complain too much about that. So where is the problem? The box we have drawn contains only darts that fit into BOTH categories!

Revisiting basic probability for a moment, recall how we would calculate the probability of rolling a ‘3’ on a normal die while also flipping a coin and getting ‘heads’: the probability of rolling a ‘3’ is $\frac{1}{6}$ and the probability of flipping for ‘heads’ is $\frac{1}{2}$ so the probability of both at the same time is $\frac{1}{6} \times \frac{1}{2} = \frac{1}{12}$. We have to multiply the fractions to find the probability of two independent events happening at the same time.

If we assume that our left-right error has nothing to do with the up-down error in our dart-throwing skills, then we can treat those two axes as independent and do a little multiplication to find out what is likely to be in the box. Just to see what happens, let’s redo the experiment with 100 points and keep an eye on how many points appear in the box when reduced to each axis, and also how many un-moved points stay in the box. We’ll call this our probability of being ‘in the box’. I’ll spare you the listing of 100 points, and finding the averages and standard deviations: we did that in the previous article about darts.

Wow. That graph is a bit of a mess. Even making the blue and red points semi-transparent didn’t help much. If you could count the individual dots, you would find that there are 72 blue dots in the box, 68 red dots in the box, and 50 orange ones. Coincidentally, $\frac{72}{100} \times \frac{68}{100} = 0.4896 \approx \frac{50}{100}$, which is about what we would expect since measured probability is not guaranteed to be exactly the same as a prediction.
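If you would like to repeat the experiment without squinting at dots, here is a rough Python sketch. It is not the data behind the graph: the 100 points are not listed in this article, so I am assuming each axis is an independent Gaussian and using a fixed seed so the run is repeatable. The counts it prints will differ from the 72, 68, and 50 above.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
n = 100

# Assumption: independent Gaussian error on each axis (not the article's actual darts).
darts = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(n)]

mean_x = sum(x for x, _ in darts) / n
mean_y = sum(y for _, y in darts) / n
sigma_x = (sum((x - mean_x) ** 2 for x, _ in darts) / n) ** 0.5
sigma_y = (sum((y - mean_y) ** 2 for _, y in darts) / n) ** 0.5

# Count darts within one sigma left-right, up-down, and both at once (the box).
in_x = sum(abs(x - mean_x) <= sigma_x for x, _ in darts)
in_y = sum(abs(y - mean_y) <= sigma_y for _, y in darts)
in_box = sum(
    abs(x - mean_x) <= sigma_x and abs(y - mean_y) <= sigma_y for x, y in darts
)

# If the axes are independent, the product of the two fractions predicts the box.
predicted = (in_x / n) * (in_y / n)
print(f"x-band: {in_x}, y-band: {in_y}, box: {in_box}, predicted: {predicted * n:.1f}")
```

The box count and the prediction should land in the same neighborhood without matching exactly, which is the same "close, but not guaranteed" behavior described above.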

Arguably, a dart at the far corner of the box is not as ‘good’ a shot as one in the middle of an edge, if you consider the distance from center. Perhaps something useful would be to consider a circle, but if we put a circle inside a rectangle, we lose more darts (we’re already down to about half, not very good if we want to refer to ‘most’ of them) and also circles don’t fit very well inside rectangles. It seems to me that the obvious solution is then to use an ellipse and put it outside, killing both birds with one equation. The math is going to get a little rough here, since finding out whether or not a point is inside an ellipse is a little harder than for a rectangle, and finding the equation of the right ellipse is a little tougher as well. Let’s start with a simple example and work our way up to the darts.

In this graph, there is a rectangle extending 3 units left and right from center, and 2 units up and down. The ellipse fits into the same space, and clearly is smaller than the rectangle. How are we going to figure out how big the ellipse ought to be in order to be outside the rectangle instead of inside? Let’s see if there is a value $k$ by which we can scale both the width $w$ and height $h$ of our ellipse in order to include the point $(w,h)$.

\begin{align*} y&=\frac{h}{w}\sqrt{w^2-x^2} \\ y&=\frac{h\cdot k}{w\cdot k}\sqrt{(w\cdot k)^2-x^2} \\ y&=\frac{h}{w}\sqrt{w^2\cdot k^2-x^2} \\ \end{align*}

Now that we have an equation with the proper scaling factor in it, let’s replace $(x,y)$ with the specific point $(w,h)$ which, coincidentally, has the same values as the width and height (measured from center, not all the way across) of our ellipse.
\begin{align*} y&=\frac{h}{w}\sqrt{w^2\cdot k^2-x^2} \\ h&=\frac{h}{w}\sqrt{w^2\cdot k^2-w^2} \\ h&=\frac{h}{w}\sqrt{w^2(k^2-1)} \\ h&=\frac{h}{w}\cdot w\cdot \sqrt{k^2-1} \\ h&=h\cdot \sqrt{k^2-1} \\ 1&=\sqrt{k^2-1} \\ 1&=k^2-1 \\ 2&=k^2 \\ \sqrt{2}&=k \\ \end{align*}

Well, perhaps that wasn’t too bad. Starting with the equation of an ellipse of width $w$ and height $h$, substitute $w$ in for $x$ and set the result to $h$, and also make sure to use $hk$ and $wk$ for the height and width of the ellipse we want to find. After that, cancel common factors, square everything to get rid of the square root, and solve for $k^2$. Finally, take the square root to get our answer in terms of $k$. Whoever would have guessed that all we have to do is multiply $w$ and $h$ by a simple constant? Then the final equation, which circumscribes the rectangle extending $w$ units left and right and $h$ units up and down, looks like this:

\begin{align*} y&=\frac{h'}{w'}\sqrt{w'^2-x^2} \\ y&=\frac{h\sqrt{2}}{w\sqrt{2}}\sqrt{(w\sqrt{2})^2-x^2} \\ y&=\frac{h}{w}\sqrt{2w^2-x^2} \\ \end{align*}
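Since we will soon need to ask whether a dart is inside the ellipse, it helps to have the test in a form a computer can use. The sketch below uses the equivalent form $\left(\frac{x}{kw}\right)^2+\left(\frac{y}{kh}\right)^2\le 1$ for an ellipse centered at the origin; the function name `ellipse_value` is my own, and the 3-by-2 numbers are the ones from the rectangle described earlier.

```python
import math


def ellipse_value(x, y, w, h, k=1.0):
    """Left side of (x/(k*w))^2 + (y/(k*h))^2 for an ellipse centered at the origin.

    A value below 1 means inside, exactly 1 means on the curve, above 1 means outside.
    """
    return (x / (k * w)) ** 2 + (y / (k * h)) ** 2


w, h = 3.0, 2.0  # half-width and half-height of the rectangle

corner_unscaled = ellipse_value(w, h, w, h)                # corner vs. the small ellipse
corner_scaled = ellipse_value(w, h, w, h, k=math.sqrt(2))  # corner vs. the scaled ellipse
print(corner_unscaled, corner_scaled)
```

The corner of the rectangle sits well outside the small ellipse and lands on the scaled one (up to rounding), which is what $k=\sqrt{2}$ was chosen to do. For darts centered somewhere other than the origin, subtract the center from each point first.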

And the resulting graph looks like this:

Beautiful! Now how many of the points is this ellipse going to contain?

There are 32 orange dots outside the blue ellipse, so there are 68 inside. I just eyeballed that from the graph, so if that’s off by one don’t be too surprised. $100-32 = 68$, so 68% of the darts are within the ellipse. That’s in line with the 72 blue and 68 red dots above! So perhaps we are at the point where we can say, “Most of my darts are inside that ellipse!”

But…

What happens if our high darts tend to be to the right, and low ones to the left?

That is a story for next time. Until then, keep throwing those darts!

## Where did my darts go?

Suppose for a moment that you throw darts at a dartboard occasionally, and want to know ahead of time the most likely place for your darts to go when you are aiming at a particular spot. Assuming you are not a professional, this could be a fairly large patch. With that in mind, let us now take a journey through one of my other self-refocussing exercises that turned out to be useful.

Where is the center of the three darts thrown? That’s pretty easy: take the average of the points, which means take the average of the x-values, and of the y-values, and plot that as another point:

$(\bar{x},\bar{y}) = \frac{1}{n} \sum_{i=1}^{n}(x_i,y_i)$

More darts just means more points on the graph, but doesn’t really change any of the math, so we’ll go on. If we considered the average of the points as the center of a circle with the radius reaching to the farthest point, what would that circle look like?

Great! Now we know that of the darts thrown, all of them landed in that circle. True, yes, useful…not so much. What if we had kept track of more darts and wanted to know where “most” of them went? That question is just a little bit more interesting.

Let’s start with a bunch of (made up, but work with me for a bit here) thrown darts in orange, and the average of them marked in blue:

Points
(-2.2,4.1)
(-1.1,2.1)
(-0.5,0.7)
(-2.5,-3.1)
(-0.3,-3.5)
(2.2,3.8)
(2.4,1.8)
(3.8,-0.4)
(3.2,-1.8)
Average=(0.6,0.4)

The question now is how far away is the farthest point? Sadly, the easiest way to find it is to calculate the distance for each point from center.

| Points | Distance |
| --- | --- |
| (-2.2,4.1) | 4.6 |
| (-1.1,2.1) | 2.4 |
| (-0.5,0.7) | 1.1 |
| (-2.5,-3.1) | 4.7 |
| (-0.3,-3.5) | 4 |
| (2.2,3.8) | 3.8 |
| (2.4,1.8) | 2.3 |
| (3.8,-0.4) | 3.3 |
| (3.2,-1.8) | 3.4 |
| Average=(0.6,0.4) | Maximum=4.7 |
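Doing nine distance calculations by hand is exactly the kind of chore worth handing to a computer. Here is a minimal Python sketch using the nine made-up darts from the table; the variable names are mine.

```python
import math

# The nine made-up darts from the table above.
darts = [
    (-2.2, 4.1), (-1.1, 2.1), (-0.5, 0.7), (-2.5, -3.1), (-0.3, -3.5),
    (2.2, 3.8), (2.4, 1.8), (3.8, -0.4), (3.2, -1.8),
]
n = len(darts)

# Center: average the x-values and the y-values separately.
center_x = sum(x for x, _ in darts) / n
center_y = sum(y for _, y in darts) / n

# Distance of every dart from the center, and the farthest one.
distances = [math.hypot(x - center_x, y - center_y) for x, y in darts]
radius = max(distances)

print(round(center_x, 1), round(center_y, 1), round(radius, 1))
```

The sketch keeps the unrounded center while computing distances and only rounds when printing, so individual distances can differ from a hand calculation in the last digit.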

Since I’m not really sure what to do next, let’s repeat ourselves for a moment and consider taking another average. Let’s add up the distances for each of the $x$ and $y$ values from our center point and find the averages of those, and that will at least tell us whether my aim is better up-and-down or left-to-right. Since I’m not really sure what to call this average distance, I’ll just pick a random letter $\sigma$ and use that. Since we have two distances, one left-right and the other up-down, let’s subscript the $\sigma$ as $\sigma_x, \sigma_y$ so we know which one is which.

$(\sigma_x,\sigma_y)=\frac{1}{n}\sum_{i=1}^n(x_i-\bar{x},y_i-\bar{y})$

| Points | Distance | $x_i-\bar{x}$ | $y_i-\bar{y}$ |
| --- | --- | --- | --- |
| (-2.2,4.1) | 4.6 | -2.8 | 3.7 |
| (-1.1,2.1) | 2.4 | -1.7 | 1.7 |
| (-0.5,0.7) | 1.1 | -1.1 | 0.3 |
| (-2.5,-3.1) | 4.7 | -3.1 | -3.5 |
| (-0.3,-3.5) | 4 | -0.9 | -3.9 |
| (2.2,3.8) | 3.8 | 1.6 | 3.4 |
| (2.4,1.8) | 2.3 | 1.8 | 1.4 |
| (3.8,-0.4) | 3.3 | 3.2 | -0.8 |
| (3.2,-1.8) | 3.4 | 2.6 | -2.2 |
| Avg=(0.6,0.4) | Max=4.7 | Avg=0 | Avg=0 |
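Those zeros are not bad luck with this particular set of darts. A small Python sketch, using the same nine points and my own variable names, shows that the signed differences from the average cancel out on both axes (up to floating-point rounding).

```python
# x- and y-values of the nine made-up darts.
xs = [-2.2, -1.1, -0.5, -2.5, -0.3, 2.2, 2.4, 3.8, 3.2]
ys = [4.1, 2.1, 0.7, -3.1, -3.5, 3.8, 1.8, -0.4, -1.8]


def mean(values):
    """Plain arithmetic average."""
    return sum(values) / len(values)


mean_x, mean_y = mean(xs), mean(ys)

# Average of the signed differences from the mean, one axis at a time.
avg_dev_x = mean([x - mean_x for x in xs])
avg_dev_y = mean([y - mean_y for y in ys])

print(avg_dev_x, avg_dev_y)
```

Whatever numbers go into `xs` and `ys`, the two printed values stay at zero apart from rounding noise, because subtracting the average from every value removes exactly the total that the sum would have had.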

Oops. What happened? Some of our $x_i-\bar{x}$ were positive, and some negative, so in the end they cancelled out. So how do we ensure that all of the differences are positive? The easiest way I can think of is to ignore the sign, but I don’t really know how to do sums when I have to treat some of the terms differently. The next best choice may be to square everything, which we know makes numbers positive.

$(\sigma_x,\sigma_y)=\frac{1}{n}\sum_{i=1}^n((x_i-\bar{x})^2,(y_i-\bar{y})^2)$

| Points | Distance | $(x_i-\bar{x})^2$ | $(y_i-\bar{y})^2$ |
| --- | --- | --- | --- |
| (-2.2,4.1) | 4.6 | 7.8 | 13.7 |
| (-1.1,2.1) | 2.4 | 2.9 | 2.9 |
| (-0.5,0.7) | 1.1 | 1.2 | 0.1 |
| (-2.5,-3.1) | 4.7 | 9.6 | 12.3 |
| (-0.3,-3.5) | 4 | 0.8 | 15.2 |
| (2.2,3.8) | 3.8 | 2.6 | 11.6 |
| (2.4,1.8) | 2.3 | 3.2 | 2 |
| (3.8,-0.4) | 3.3 | 10.2 | 0.6 |
| (3.2,-1.8) | 3.4 | 6.8 | 4.8 |
| Avg=(0.6,0.4) | Max=4.7 | Avg=5 | Avg=7 |

I’m not sure I want to make a box that extends 5 units left and right, and 7 units up and down (the blue box in the graph below). It seems way too big, well outside the darts farthest from center, and doesn’t look at all like an average of the darts. Perhaps I should follow the advice of my science teacher: “When something looks really wrong at the end, follow the units.” Let’s call everything inches even though the graph doesn’t have anything on it (my dartboard doesn’t either). In $(x_i-\bar{x})^2$ inches minus inches is still inches, and inches times inches gives inches squared. Wait! the $n$ in $\frac{1}{n}$ doesn’t have any units – it is just the number of darts thrown, so dividing by $n$ doesn’t change the square inches. So really, what I want to do is take the square roots of 5 and 7 to get the actual size (in inches from center, not square inches) of the box to draw, colored green in the graph below.

| Points | Distance | $(x_i-\bar{x})^2$ | $(y_i-\bar{y})^2$ |
| --- | --- | --- | --- |
| (-2.2,4.1) | 4.6 | 7.8 | 13.7 |
| (-1.1,2.1) | 2.4 | 2.9 | 2.9 |
| (-0.5,0.7) | 1.1 | 1.2 | 0.1 |
| (-2.5,-3.1) | 4.7 | 9.6 | 12.3 |
| (-0.3,-3.5) | 4 | 0.8 | 15.2 |
| (2.2,3.8) | 3.8 | 2.6 | 11.6 |
| (2.4,1.8) | 2.3 | 3.2 | 2 |
| (3.8,-0.4) | 3.3 | 10.2 | 0.6 |
| (3.2,-1.8) | 3.4 | 6.8 | 4.8 |
| Avg=(0.6,0.4) | Max=4.7 | Avg=5 | Avg=7 |
| | | $\sqrt{\text{Avg}}=2.2$ | $\sqrt{\text{Avg}}=2.6$ |

$(\sigma_x,\sigma_y)=\sqrt{\frac{1}{n}\sum_{i=1}^n((x_i-\bar{x})^2,(y_i-\bar{y})^2)}$

I’m still not really happy about those boxes, though. The green one is too small, and the blue one too big. On the bright side, we did ‘accidentally’ come up with the formula for standard deviation: $\sigma=\sqrt{\frac{1}{n}\sum(x_i-\bar{x})^2}$
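To confirm that the accident really is the standard deviation, the sketch below computes it straight from the formula and compares against `statistics.pstdev` from Python's standard library, which is the divide-by-$n$ (population) version. The helper name `sigma` is mine.

```python
import math
import statistics

# x- and y-values of the nine made-up darts.
xs = [-2.2, -1.1, -0.5, -2.5, -0.3, 2.2, 2.4, 3.8, 3.2]
ys = [4.1, 2.1, 0.7, -3.1, -3.5, 3.8, 1.8, -0.4, -1.8]


def sigma(values):
    """Square root of the average squared difference from the mean (divide by n)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


sigma_x, sigma_y = sigma(xs), sigma(ys)
print(round(sigma_x, 1), round(sigma_y, 1))

# Cross-check against the standard library's population standard deviation.
print(statistics.pstdev(xs), statistics.pstdev(ys))
```

The rounded values match the half-width and half-height of the green box. Note that `statistics.stdev` (without the `p`) divides by $n-1$ instead and would give slightly larger numbers.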

The biggest problem that I see is that the green box clearly does not contain enough of the darts to be useful. We’ll figure out why next time, and see if we can work our way towards a solution.

## Fun With Series

In hindsight, one of the good things about teaching where I did was that I had time to think about other stuff during school. This was especially handy when I was studying for the actuarial exams (I passed the first two on first attempt) and needed to practice some of the little things I hadn’t done in a while and that never came up in the course of what little I was able to teach. So just for grins, here’s a little derivation practice.

For instance, deriving the formula for annuities: $\annuai{n}{i}$

(Yes, $i$ is the interest rate. It looks funny not being part of a complex number, but that’s how it goes sometimes.)

Since an annuity is just the sum of the present values of a series of payments, it’s pretty easy to derive. The key is to remember the equation for the present value $PV$ of some amount of money at a later date $FV$, given the interest rate per period $i$ expected between now and later, and the number of periods $t$ between now and later. Valuation equations all derive from this most basic premise.

\begin{align} PV=\frac{1}{(1+i)^t} \cdot FV\label{a1} \end{align}

A useful shortcut is to give the one-period discount factor its own name, $v$. Why? It makes the rearrangements below much easier.
\begin{align} v &=\frac{1}{1+i} \label{v1}\\ (1+i)\cdot v &=1 \notag\\ v + vi &= 1 \notag\\ vi &=1-v \label{v2} \end{align}
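For those who like to see numbers before symbols, here is a minimal Python sketch of the present value formula and the $v$ shortcut. The 5% rate and the function name `present_value` are assumptions of mine for illustration.

```python
import math


def present_value(fv, i, t):
    """Value today of an amount fv received t periods from now at rate i per period."""
    return fv / (1 + i) ** t


i = 0.05         # assumed rate per period, for illustration only
v = 1 / (1 + i)  # the shortcut: one period of discounting

# The rearrangement v*i = 1 - v, checked numerically.
identity_holds = math.isclose(v * i, 1 - v)

# Discounting with v**t is the same thing as dividing by (1 + i)**t.
pv_direct = present_value(50, i, 3)
pv_with_v = 50 * v ** 3
print(identity_holds, pv_direct, pv_with_v)
```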
Consider the worst way to describe the present value of an annuity: “Figure a payment of, say, \$50 at the end of each year for the next 10 years. Take the first payment and find its present value. Then the second, and add it to your first answer…” Ouch. Well, that’s what math is for. How about we write it in math instead?

$PV_{annuity}=PV(\text{first payment})+PV(\text{second payment})+ \dotsb + PV(\text{last payment})$

Slightly closer, but not clean yet. We keep adding up the same sort of thing over and over again. Perhaps there’s a way to write that a little more concisely.

$PV_{annuity}=\sum_{t=1}^{10}PV_{t}$

The jump is a little big, but now it is starting to look like math! Substitute our definition of $PV$ from $\eqref{a1}$, then use our definition of $v$ at $\eqref{v1}$, factor out the constant, and we get something useful. And as a side benefit, we also see that the amount of the payment really is irrelevant.

\begin{align*} PV_{annuity}&=\sum_{t=1}^{10}50\cdot \frac{1}{(1+i)^t} \\ PV_{annuity}&=\sum_{t=1}^{10} 50 \cdot v^t \\ PV_{annuity}&=50 \cdot \sum_{t=1}^{10} v^t \\ \end{align*}

The purpose of this isn’t the deriving of a formula, it’s about how we think about things and make them easier for ourselves. Thinking about it in English is difficult because the language is less than conducive to compacting a complex thought into an easily manipulated form. Crossing the bridge towards mathematical thinking is the single biggest hurdle my students face, and one they typically stumble on. Why? <rant>To paraphrase (and make suitable for a family publication): “Because math is a class to blow off, not a tool to be used to make life easier”</rant> I take every opportunity I can find to point out that formalized math is there to let us both be lazier and also accomplish more at the same time.
Back to the point…

\begin{align} \annuai{n}{i}&=\sum_{t=1}^n v^t \label{q1} \\ \annuai{n}{i}&=v^1+v^2+v^3+\dotsb+v^{n-1}+v^n \label{q2} \\ v\cdot \annuai{n}{i}&=v\cdot v^1+v\cdot v^2+v\cdot v^3+\dotsb+v\cdot v^{n-1}+v\cdot v^n \label{q3} \\ v\cdot \annuai{n}{i}&=v^2+v^3+v^4+\dotsb+v^{n}+v^{n+1} \label{q4} \\ \annuai{n}{i} - v\cdot \annuai{n}{i}&=v^1 - v^{n+1} \label{q5} \\ \annuai{n}{i}(1-v)&=v^1 - v^{n+1} \label{q6} \\ \annuai{n}{i}\cdot i \cdot v&=v\cdot (1-v^n) \label{q7} \\ \annuai{n}{i}&=\frac{1-v^n}{i} \label{q8} \\ \end{align}

In general, some steps are worth writing out even when they are not strictly necessary. One such example here is $\eqref{q3}$, which could easily be omitted. But if what you mean to do is multiply every term by $v$, then perhaps writing it out will help reduce mistakes. Another place is at $\eqref{q5}$, which could easily have another equation in front of it explicitly showing the subtraction.

Getting MathJax up and running, and putting together the other pieces for this blog, have made this post take WAAAY longer than it should have. Doing a little math on paper is a good mind-clearing exercise sometimes, but this was nearly the opposite.

Given our basic definition $\act{a}{n}{i}=\frac{1-v^n}{i}$, there are a few other useful varieties of basic annuities worth having fun with. Consider an annuity that pays 1 at the beginning of each month, instead of at the end. This is referred to as an “annuity due” and uses the symbol $\actd{a}{n}{i}$. Each payment has one fewer compounding period, so

\begin{align*} \actd{a}{n}{i} &= v^0 + v^1 + \dotsb + v^{n-2} + v^{n-1} \\ \actd{a}{n}{i} &= \frac{v^1}{v} + \frac{v^2}{v} + \dotsb + \frac{v^{n-1}}{v} + \frac{v^{n}}{v} \\ \actd{a}{n}{i} &= \frac{\act{a}{n}{i}}{v} \\ \actd{a}{n}{i} &= \frac{1-v^n}{iv} \\ \end{align*}

Consider an annuity that pays 1 at the end of the first month, 2 the second, 3 the third, and so on for $n$ months. The usual name for this is an increasing annuity.
\begin{align*} \act{(Ia)}{n}{i} &= PV(1) + PV(2) + \dotsb + PV(n-1) + PV(n) \\ \act{(Ia)}{n}{i} &= \sum_{t=1}^n PV(t) \\ \act{(Ia)}{n}{i} &= \sum_{t=1}^n tv^t \\ \act{(Ia)}{n}{i} &= 1v^1 + 2v^2 + \dotsb + (n-1)v^{n-1} + nv^n \\ v\left(\act{(Ia)}{n}{i}\right) &= 1v^2 + 2v^3 + \dotsb + (n-1)v^n + nv^{n+1} \\ (1-v)\left(\act{(Ia)}{n}{i}\right) &= v^1 + v^2 + \dotsb + v^{n-1} + v^n - nv^{n+1} \\ (iv)\left(\act{(Ia)}{n}{i}\right) &= \act{a}{n}{i} - nv^{n+1} \\ \act{(Ia)}{n}{i} &= \frac{\act{a}{n}{i} - nv^{n+1}}{iv} \\ \act{(Ia)}{n}{i} &= \frac{\actd{a}{n}{i} - nv^n}{i} \\ \end{align*}

And then there is the decreasing annuity, which starts by paying $n$ at the end of the first month, $(n-1)$ the second, and so on down to 1 the $n$th month.

\begin{align*} \act{(Da)}{n}{i} &= PV(n) + PV(n-1) + \dotsb + PV(2) + PV(1) \\ \act{(Da)}{n}{i} &= \sum_{t=1}^{n} PV(n-t+1) \\ \act{(Da)}{n}{i} &= \sum_{t=1}^{n} (n-t+1)v^t \\ \act{(Da)}{n}{i} &= (n)v^1 + (n-1)v^2 + \dotsb + 2v^{n-1} + 1v^n \\ v\left(\act{(Da)}{n}{i}\right) &= (n)v^2 + (n-1)v^3 + \dotsb + 2v^{n} + 1v^{n+1} \\ (1-v)\left(\act{(Da)}{n}{i}\right) &= nv^1 - 1v^2 - v^3 - \dotsb - v^n - v^{n+1} \\ (iv)\left(\act{(Da)}{n}{i}\right) &= nv - v\left(\act{a}{n}{i}\right) \\ \act{(Da)}{n}{i} &= \frac{nv - v\left(\act{a}{n}{i}\right)}{vi} \\ \act{(Da)}{n}{i} &= \frac{n - \left(\act{a}{n}{i}\right)}{i} \\ \end{align*}

Yes, this is what I did for relaxation when I was stressed at school.

There’s another one that is entertaining (but I’m not sure I see the point) in which the annuity pays 1 the first month, 2 the second, up to $n$ in the $n^{\text{th}}$ month, and then decreases back to 1 in the $(2n-1)^{\text{th}}$ month. I have heard it called a “rainbow annuity” (even though it is a triangle, not an arc) and I don’t know the symbol for it so I’ll just call it $\act{RA}{n}{i}$.
(Mind you, $n$ refers to the peak payment, not the number of payments.)

\begin{align*} \act{RA}{n}{i} &= v^1 + 2v^2 + \dotsb + (n-1)v^{n-1} + nv^n + (n-1)v^{n+1} + \dotsb + 2v^{2n-2} + v^{2n-1} \\ v\left(\act{RA}{n}{i}\right) &= v^2 + 2v^3 + \dotsb + (n-1)v^{n} + nv^{n+1} + (n-1)v^{n+2} + \dotsb + 2v^{2n-1} + v^{2n} \\ (1-v)\left(\act{RA}{n}{i}\right) &= v + v^2 + \dotsb + v^n + (-1)v^{n+1} + (-1)v^{n+2} + \dotsb + (-1)v^{2n-1} + (-1) v^{2n} \\ (iv)\left(\act{RA}{n}{i}\right) &= \act{a}{n}{i} + (-1)v^{n+1} + (-1)v^{n+2} + \dotsb + (-1)v^{2n-1} + (-1)v^{2n} \\ (iv)\left(\act{RA}{n}{i}\right) &= \act{a}{n}{i} - v^n\left(v^1 + v^2 + \dotsb + v^{n-1} + v^{n}\right) \\ (iv)\left(\act{RA}{n}{i}\right) &= \act{a}{n}{i} - v^n\left(\act{a}{n}{i}\right) \\ (iv)\left(\act{RA}{n}{i}\right) &= (1-v^n)\left(\act{a}{n}{i}\right) \\ \act{RA}{n}{i} &= \frac{1-v^n}{iv} \act{a}{n}{i} \\ \act{RA}{n}{i} &= \frac{1}{v} \frac{1-v^n}{i} \act{a}{n}{i} \\ \act{RA}{n}{i} &= \frac{1}{v} \act{a}{n}{i} \act{a}{n}{i} \\ \act{RA}{n}{i} &= \frac{\left(\act{a}{n}{i}\right)^2}{v} \\ \end{align*}

One of the neat bits that shows up in each and every one of these is the ongoing pattern of “take the first line, multiply by $v$, and subtract it off” which collapses the sequence into something more manageable, followed by representing the left side with a factor of $(1-v)$ which turns into $iv$, conveniently canceling out something on the right. One tool, many variations. Isn’t that what math is all about?
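If you want to double-check any of these without redoing the algebra, a brute-force sum is a handy referee. The sketch below adds up the discounted payments one at a time and compares each total with its closed form. The rate, the term, and all the names are my own assumptions for illustration. For the increasing annuity the closed form used is $\frac{\ddot{a} - nv^n}{i}$, where $\ddot{a}$ is the annuity due: the factor of $n$ is what the telescoping subtraction leaves on the last term.

```python
import math


def annuity_checks(n, i):
    """Return (brute_force, closed_form) dictionaries for the annuities above."""
    v = 1 / (1 + i)
    a = (1 - v ** n) / i  # level annuity, payments at the end of each period
    a_due = a / v         # level annuity, payments at the start of each period

    brute_force = {
        "immediate": sum(v ** t for t in range(1, n + 1)),
        "due": sum(v ** t for t in range(0, n)),
        "increasing": sum(t * v ** t for t in range(1, n + 1)),
        "decreasing": sum((n - t + 1) * v ** t for t in range(1, n + 1)),
        # Rainbow: payments 1, 2, ..., n, ..., 2, 1 over 2n - 1 periods.
        "rainbow": sum(min(t, 2 * n - t) * v ** t for t in range(1, 2 * n)),
    }
    closed_form = {
        "immediate": a,
        "due": a_due,
        "increasing": (a_due - n * v ** n) / i,
        "decreasing": (n - a) / i,
        "rainbow": a * a / v,
    }
    return brute_force, closed_form


# Assumed values for illustration: 10 periods at 5% per period.
brute_force, closed_form = annuity_checks(10, 0.05)
for name in brute_force:
    agrees = math.isclose(brute_force[name], closed_form[name], rel_tol=1e-9)
    print(name, agrees)
```

Every line should report agreement; a disagreement means either a slip in the algebra or a typo in the code, and both are worth finding.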