Welcome to the second part of my "It's a Long Way to the Top" series: an absolute beginner's guide to optimisation … by an absolute beginner to optimisation. :-)
In the first installment we went through an intuitive exposition of the logic of optimisation. Now it's time to put that in slightly more formal language. Here is an intuitive explanation of what a function and its derivative are, and of how to find a function's maximum. It's not meant to be complete, and interested readers are advised to consult a text (you can try here, but it's up to you).
Mathematically, one represents Fateful Mountain (from our previous post) like so:
y = f(x).
One reads that as "y is a function of x". x is called the independent variable (the set of all xs is called the function's domain); y is the dependent variable (the set of all ys is called its range). In the mountain example, x is the horizontal distance between a given point and an arbitrary origin (0). y is the height relative to the origin.
One can think of a function in other ways. One is as a black box which, for a given input, produces one and only one output, like so:
Examples of functions:
- A multiple-choice quiz. In the sheet you were given, you select what you think are the answers to the questions posed; the teacher evaluates your answers and returns your mark. Your answers are the independent variable (x), your mark is the dependent variable (y). The teacher is the black box, the function (f).
- Spreadsheet functions. Open your trusty spreadsheet application. Enter 0.012 in cell A1 and "=sin(A1)" (without the quotation marks) in cell B1. A1 is x; B1 shows y (see below). Try entering any values you like (and exploring other spreadsheet functions).
- A vending machine. You insert coins for the amount required for the product you want and press the key combination corresponding to it (both things are x); the machine gives you what you want (y). The machine is the function.
The examples show many things (you should give them some thought and think of examples of your own). One important point is that, for a given input, if the black box outputs something, it produces one and only one output: if you and Ben gave the same answers to the same questions, then you both must get the same mark. Makes sense?
A black box that does not fulfill this requirement is not a function: sin(0.012) expressed with 8 decimals must always be 0.01199971; it cannot at times be 0.01199971 and at other times other values.
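The "same input, same output" requirement can be tried out in a few lines of Python. This is only a sketch: `mark_quiz` and its answer key are made-up stand-ins for the teacher in example 1, not part of any real marking scheme.

```python
import math

# Hypothetical answer key for a three-question quiz (an assumption for illustration).
ANSWER_KEY = ("b", "d", "a")

def mark_quiz(answers):
    """The 'teacher' black box: counts how many answers match the key."""
    return sum(given == correct for given, correct in zip(answers, ANSWER_KEY))

your_answers = ("b", "d", "c")
bens_answers = ("b", "d", "c")

# Same input, same output: you and Ben must get the same mark.
print(mark_quiz(your_answers), mark_quiz(bens_answers))

# The spreadsheet example: sin(0.012) shown with 8 decimals.
print(f"{math.sin(0.012):.8f}")
```

However many times you run it, the two marks agree and the sine value does not change.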
There are other requirements. For the kind of optimisation commonly used in economics, inputs and outputs must be numerical (the mountain: x and y are distances; example 2 is also numerical). Often functions need more than one input (example 3).
Functions with non-numerical inputs (examples 1 and 3) are not further considered here.
Dealing now with numerical functions only, two additional things are required to use the kind of optimisation involved in the Fateful Mountain example:
- whatever inputs and outputs measure, they must admit decimals (again, the mountain and example 2). In fact, they generally must be real numbers, often non-negative (unlike example 2).
- as already mentioned, functions must be "smooth" (more on this at the end).
Just to make sure this all makes sense: only one of the representations below is a function. Which one? What kind of maximum applies to it? Why are the other two not functions? (Hint: think of the quiz example and your and Ben's marks.) Imagine again that you are a termite. Could you apply the procedure described if you found a vertical wall, like Yellow?
Back to the Fateful Mountain. The table below (another useful conceptualisation) shows the heights and horizontal positions of a series of points over Fateful Mountain (leave the arrows for later):
At this point, you might want to quickly re-read the previous post.
Imagine you are at point A (see table above) and you move over the mountain to B. In effect, you moved upwards and eastwards at the same time. One can, however, think of that single movement as formed by two separate components: one horizontal and one vertical.
The vertical component is:
height of point B [i.e. yB = f(xB)] minus height of point A [i.e. yA = f(xA)]:
f(xB) - f(xA) = 0.0025-0.0009 = 0.0016
The horizontal component is:
horizontal position of B (xB) minus horizontal position of A (xA):
xB - xA = -3 - (-4) = +1
(note well: positive and equal to 1 for all points in the table above).
If you climb up the west face of the mountain, you are always moving eastwards: the distance to successive horizontal positions always increases. The horizontal component is always strictly positive and given how the points are spaced, constant and equal to +1. We call this quantity Delta.
The vertical component, though, varies. It can be strictly positive, strictly negative or nil: it will be positive when you are climbing up the mountain (When will it be negative? What happens when it's nil?)
The quotient of the vertical and horizontal components is the Slope (from the previous post):
Slope = [f(xB) - f(xA)] / (xB - xA) = [f(xB) - f(xA)] / Delta = 0.0016 / 1 = 0.0016
That's what the third column in that table indicates. Arithmetically, the first value in that column (0.0016) is the result (to four decimals) of (vertical component) divided by (horizontal component) or (0.0025-0.0009)/[-3-(-4)].
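The same arithmetic can be written as a tiny Python sketch. The helper name `slope` is just a label chosen here; the numbers are those of points A and B in the text.

```python
def slope(x_a, y_a, x_b, y_b):
    """Vertical component divided by horizontal component between two points."""
    return (y_b - y_a) / (x_b - x_a)

# Points A and B from the text: positions -4 and -3, heights 0.0009 and 0.0025.
s = slope(-4, 0.0009, -3, 0.0025)

# Rounded to four decimals, as in the table's third column.
print(round(s, 4))
```

A positive result means you are climbing; a negative one means you are going downhill.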
The vertical component gives the sign of the slope. In the table the sign of the slope changes, yes? Well, what does that mean?
With that one can locate the summit: it's somewhere between 2 and 4. To find the location more precisely one needs intermediate values of x (see why decimals are important?): instead of unit increments (3-2 = 4-3 = 1), smaller increments (say, 0.5, 0.1, 0.01, etc). Let's call those increments Delta, as before:
xB - xA = 1 = Delta  =>  xB = xA + Delta
The table below does some more calculations, using Delta = 0.1:
The logic is the same and there's no point repeating it. You iterate the same process for ever-smaller Delta (a step-by-step procedure like this is called an algorithm). You are doing what the termite did: once you have roughly found the summit, you go back and forth, taking smaller steps, to make sure you are as precise and accurate as possible.
One could perform similar calculations with more decimals. (Given the formula I used to draw that chart, the peak of Fateful Mountain is located 2.653 units east from the origin and its height is 0.385).
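For the curious, here is a minimal Python sketch of that back-and-forth search. It rests on loud assumptions: the mountain `f` below is a made-up single-peaked curve with its summit at x = 2.3, not the formula behind Fateful Mountain, and the names `find_summit`, `lo`, `hi` and `rounds` are invented for this illustration.

```python
import math

def f(x):
    # Hypothetical mountain with its summit at x = 2.3 (an assumption,
    # NOT the formula used to draw Fateful Mountain).
    return math.exp(-((x - 2.3) ** 2) / 8.0)

def slope(f, x, delta):
    """Rise over run for one step of size delta starting at x."""
    return (f(x + delta) - f(x)) / delta

def find_summit(f, lo, hi, delta=1.0, rounds=4):
    """Walk east while the ground rises, then repeat with a smaller Delta."""
    for _ in range(rounds):
        x = lo
        while x + delta <= hi and slope(f, x, delta) > 0:
            x += delta
        # We stopped where the ground no longer rises (or at the edge of
        # the range), so the summit lies within one step of x.
        lo, hi = max(lo, x - delta), min(hi, x + delta)
        delta /= 10  # next round: steps ten times smaller
    return (lo + hi) / 2

summit = find_summit(f, lo=-4.0, hi=8.0)
print(round(summit, 2), round(f(summit), 3))
```

Each round narrows the stretch of ground where the summit can be, which is why the answer gains roughly one decimal per round. The sketch only works for a mountain with a single peak.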
Remember all the talk about "smoothness"? This is what it boils down to. When one calculates the Slope using an infinitesimally small Delta, the Slope gets a special name: the first derivative of f(x), or just the derivative, for short. There are different notations for the derivative of a function f, but a convenient one here is f'(x). To find the global maximum of f(x), where f is continuous, has a single maximum and is continuously differentiable over its whole domain, we just find the x* such that f'(x*) = 0. To assume the curve is "smooth" means one can calculate its first derivative at every x in its domain.
That's what the figure below shows.
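To see the Slope turn into the derivative, one can shrink Delta on a function whose derivative is known. The sketch below reuses sin and the value 0.012 from the spreadsheet example, and relies on a standard result not derived in this post: the derivative of sin is cos. The helper name `slope` is again just a label chosen here.

```python
import math

x = 0.012  # the value used in the spreadsheet example

def slope(f, x, delta):
    """Rise over run for one step of size delta starting at x."""
    return (f(x + delta) - f(x)) / delta

# Ever-smaller Delta: the Slope settles down towards one value.
for delta in (1.0, 0.1, 0.01, 0.001, 1e-6):
    print(delta, slope(math.sin, x, delta))

# The exact derivative of sin is cos, so the slopes should approach this.
print("cos(x) =", math.cos(x))
```

The smaller Delta gets, the closer the printed slopes come to the last line.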
We did that the hard way, by following a numerical solution algorithm. The beauty of this is that if we know the formula operating inside the black box (and it fulfills the requirements), we don't need to use a numerical algorithm. We can actually do that by algebraic manipulation.
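As a small taste of what that looks like, here is a worked sketch for a made-up black box, f(x) = 4x - x^2. The formula is purely an assumption chosen because the algebra is easy; it is not the formula behind Fateful Mountain.

```latex
\begin{aligned}
f(x) &= 4x - x^{2} \\
f'(x) &= 4 - 2x \\
f'(x^{*}) = 0 \;&\Rightarrow\; 4 - 2x^{*} = 0 \;\Rightarrow\; x^{*} = 2, \qquad f(x^{*}) = 4
\end{aligned}
```

No table and no ever-smaller Delta were needed: setting the derivative to zero gives the summit's position directly.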
(TO BE CONTINUED)