Lagrange multiplier example, part 1
So let's say you're running some kind of company, and you guys produce widgets. You produce some little trinket that people enjoy buying. The main costs that you have are labor—you know, the workers that you have creating these—and steel.
Let's just say that your labor costs are $20 per hour, and your steel costs are $2,000 for every ton of steel, just to keep the numbers kind of related to each other. Then, you've had your analysts work a little bit on trying to model the revenues you can make with your widgets as a function of hours of labor and tons of steel.
Let's say the revenue model that they've come up with, the revenue R as a function of H for hours of labor and S for tons of steel, is equal to about 100 times the hours of labor to the power of 2/3 multiplied by the tons of steel to the power of 1/3. That is, R(H, S) = 100 H^(2/3) S^(1/3). If you put in a given amount of labor and a given amount of steel, this is about how much money you're going to expect to earn.
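As a minimal sketch, the revenue model described above could be written as a small Python function. The function name `revenue` and the sample inputs are assumptions made for illustration; only the formula itself comes from the model.

```python
def revenue(h, s):
    """Modeled revenue in dollars for h hours of labor and s tons of steel."""
    return 100 * h ** (2 / 3) * s ** (1 / 3)


# Hypothetical inputs, chosen only for illustration: 8 hours of labor, 1 ton of steel.
example_revenue = revenue(8, 1)
print(example_revenue)
```

Doubling both inputs doubles the modeled revenue, because the exponents 2/3 and 1/3 add up to 1.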
Of course, you want to earn as much as you can, but let's say you actually have a budget for how much you're able to spend on all these things, and your budget is $20,000. You're willing to spend $20,000, and you want to make as much money as you can according to this model based on that.
Now, this is exactly the kind of problem that the Lagrange multiplier technique is made for. We're trying to maximize some kind of function, and we have a constraint. Right now, the constraint isn't written as a formula, but we can pretty easily write it as a formula because what makes up our budget? Well, it's going to be the number of hours of labor multiplied by 20—so that's going to be $20 per hour multiplied by the number of hours you put in—plus $2,000 per ton of steel times the tons of steel that you put in.
So the constraint is basically that you have to have this total equal $20,000: 20H + 2,000S = 20,000. I mean, you could say less than, right? You could say you're not willing to go any more than that, but intuitively and in reality, it's going to be the case that in order to maximize your revenues, you should squeeze every dollar that you have available and actually hit this constraint.
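Here is a rough sketch of that budget constraint in Python. The names `g`, `BUDGET`, and `on_budget` are assumptions for illustration (`g` anticipates the name the spending function is given next); the rates and the budget are the numbers from the problem.

```python
LABOR_RATE = 20     # dollars per hour of labor
STEEL_RATE = 2000   # dollars per ton of steel
BUDGET = 20000      # total dollars available to spend


def g(h, s):
    """Total spending in dollars on h hours of labor and s tons of steel."""
    return LABOR_RATE * h + STEEL_RATE * s


def on_budget(h, s, tol=1e-9):
    """True when the spending hits the budget exactly (up to rounding)."""
    return abs(g(h, s) - BUDGET) <= tol


# Hypothetical plan, chosen only for illustration: 500 hours and 5 tons.
print(g(500, 5), on_budget(500, 5))
```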
So this right here is the constraint of our problem. Let's go ahead and give the function that we're dealing with a name: I'm going to call it G of H, S, which is going to be that left-hand side, 20H + 2,000S. Now, if you'll remember in the last few videos, the way we visualize something like this is to think about the set of all possible inputs.
In this case, you might be thinking about the H-S plane, with the number of hours of labor on one axis and the number of tons of steel on the other. The constraint function here is linear, so this constraint is going to give us a line that tells us which pairs of H and S are going to achieve that constraint.
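To make that line concrete, here is a small sketch that picks a few points on it and evaluates the modeled revenue at each one. The helper name `steel_for_hours` and the sampled hours are assumptions for illustration. Every sampled pair spends the whole budget, yet the revenue changes from point to point, which is why we need a way to pick out the best point on the line.

```python
def revenue(h, s):
    """Modeled revenue in dollars for h hours of labor and s tons of steel."""
    return 100 * h ** (2 / 3) * s ** (1 / 3)


def steel_for_hours(h):
    """Tons of steel that use up the rest of the budget: solves 20*h + 2000*s = 20000 for s."""
    return (20000 - 20 * h) / 2000


# Hypothetical sample of labor hours, chosen only for illustration.
sample_hours = [100, 400, 700, 900]
points_on_line = [(h, steel_for_hours(h)) for h in sample_hours]
revenues_on_line = [revenue(h, s) for h, s in points_on_line]

for (h, s), r in zip(points_on_line, revenues_on_line):
    cost = 20 * h + 2000 * s
    print(f"h={h:4d} hours, s={s:4.1f} tons, cost={cost:.0f}, revenue={r:.0f}")
```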
Then the revenue function that we're dealing with will have certain contours: maybe revenues of $10,000 have a certain contour that looks like this, and revenues of $100,000 have a certain contour that looks like this. But what we want is to find which contour is barely touching the constraint line, just tangent to it at a given point, because that's going to be the contour where if you up the revenue by just a little bit, it would no longer intersect with that line.
There would no longer be values of H and S that satisfy this constraint. The way to think about finding that tangency is to consider the vector perpendicular to the tangent line of the contour at that point, which fortunately is represented by the gradient.
The gradient of R, the revenue function whose contours these are, is perpendicular to its contour, and what it means for that contour to be tangent to the constraint line is that the gradient of G, our constraint function, points in the same direction, so it's proportional to the gradient of R. Typically, the way you write this is to say that the gradient of R is equal to the gradient of G times some constant, and this proportionality constant, lambda, is called our Lagrange multiplier.
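In symbols, the tangency condition just described, paired with the budget constraint, can be written as follows. The LaTeX notation is used only for compactness; R and G are the revenue and constraint functions from above, and lambda is the Lagrange multiplier.

```latex
\nabla R(H, S) = \lambda \, \nabla G(H, S),
\qquad
G(H, S) = 20H + 2000S = 20000
```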
So let's go ahead and start working it out. Let's first compute the gradient of R. The gradient of R is going to be the partial derivative of R with respect to its first variable, which is H. So the partial derivative with respect to H, and the second component is its partial derivative with respect to that second variable S.
In this case, that first partial derivative—if we treat H as a variable and S as a constant—then that 2/3 gets brought down. So that'll be 100 times 2/3 times H to the power of, well, we've got to subtract 1 from 2/3 when we bring it down, so that'll be -1/3 multiplied by S to the 1/3.
Then the second component here, the partial derivative with respect to S, is going to be 100 times, well, now by treating S as the variable, we take down that 1/3. So that's 1/3 times H to the 2/3, which just looks like a constant as far as S is concerned, and then we take S to the 1/3 - 1, which is -2/3.
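One way to sanity-check these two partial derivatives is to compare them against a numerical estimate. The sketch below does that with a central difference. The function names, the step size, and the test point are all assumptions chosen for illustration.

```python
def revenue(h, s):
    """Modeled revenue in dollars for h hours of labor and s tons of steel."""
    return 100 * h ** (2 / 3) * s ** (1 / 3)


def grad_revenue(h, s):
    """Partial derivatives of the revenue model, as worked out by hand above."""
    dR_dh = 100 * (2 / 3) * h ** (-1 / 3) * s ** (1 / 3)
    dR_ds = 100 * (1 / 3) * h ** (2 / 3) * s ** (-2 / 3)
    return dR_dh, dR_ds


def central_difference(f, h, s, step=1e-5):
    """Numerical estimate of both partial derivatives of f at (h, s)."""
    d_dh = (f(h + step, s) - f(h - step, s)) / (2 * step)
    d_ds = (f(h, s + step) - f(h, s - step)) / (2 * step)
    return d_dh, d_ds


# Hypothetical test point, chosen only for illustration.
analytic = grad_revenue(500.0, 5.0)
numeric = central_difference(revenue, 500.0, 5.0)
print(analytic)
print(numeric)
```

If the hand-computed formulas are right, the two printed pairs should agree to several digits.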
So that's the gradient of R. Now we need the gradient of G, and that one's a lot easier, actually, because G is just a linear function. So when we take the gradient of G, which is its partial derivative with respect to H, partial H, and its partial derivative with respect to S, partial S, well, the partial with respect to H is just 20. The function looks like 20 times H plus something that's a constant.
So that ends up being 20, and then the partial with respect to S likewise is just 2,000 because it's just some constant multiplied by S plus a bunch of other stuff that looks like constants. So that's great.
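The same kind of check works for G. Because G is linear, its gradient should come out the same no matter where we evaluate it. In this sketch the names `g`, `grad_g`, and `central_difference`, along with the two test points, are assumptions for illustration.

```python
def g(h, s):
    """Total spending in dollars on h hours of labor and s tons of steel."""
    return 20 * h + 2000 * s


def grad_g(h, s):
    """Gradient of g; the inputs are unused because g is linear."""
    return 20.0, 2000.0


def central_difference(f, h, s, step=1e-3):
    """Numerical estimate of both partial derivatives of f at (h, s)."""
    d_dh = (f(h + step, s) - f(h - step, s)) / (2 * step)
    d_ds = (f(h, s + step) - f(h, s - step)) / (2 * step)
    return d_dh, d_ds


# Two hypothetical points, chosen only for illustration.
estimate_a = central_difference(g, 100.0, 9.0)
estimate_b = central_difference(g, 900.0, 1.0)
print(estimate_a, estimate_b, grad_g(0.0, 0.0))
```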
This means when we set the gradient of R proportional to the gradient of G, we get a pair of equations, and let me just write it all out again. The top one starts with 100 times 2/3, which I'll write as 200/3, and let's go ahead and do a little simplifying while I'm rewriting things here.
So H to the -1/3 is really 1 over H to the 1/3, and that gives us S to the 1/3 over H to the 1/3. All of that first component is being set equal to the first component of the gradient of G, which is 20, times lambda, this Lagrange multiplier, because we're not setting the gradients equal to each other, we're just setting them proportional to each other.
So that's the first equation, and then the second one, where I'll go ahead and do some simplifying while I rewrite it also, is going to be 100/3 times H to the 2/3 over S to the 2/3.
That's because S to the -2/3 is the same as 1 over S to the 2/3. All of that is equal to 2,000 times lambda, and the important thing is it's that same lambda because the entire vector has to be proportional.
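Written out in symbols, the two equations above, together with the budget constraint from earlier, read as follows. The LaTeX notation is used here only for compactness; every term comes from the gradients computed above.

```latex
\begin{aligned}
\frac{200}{3} \cdot \frac{S^{1/3}}{H^{1/3}} &= 20\,\lambda \\
\frac{100}{3} \cdot \frac{H^{2/3}}{S^{2/3}} &= 2000\,\lambda \\
20H + 2000S &= 20000
\end{aligned}
```

That is three equations in the three unknowns H, S, and lambda.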
I think right here is probably a pretty good point to stop, and in the next video, I'll go ahead and work through the details, and we'll land on a solution.