Warm up to the second partial derivative test
So, in single-variable calculus, if you have a function f of x and you want to find its maximum or minimum, what you do is find its derivative and set it equal to zero. Graphically, this has the interpretation that, if you look at the graph of f, setting the derivative equal to zero means you're looking for places where the graph has a flat tangent line.
So, in the graph that I drew, it would be these two flat tangent lines. Once you find these points, for example, here you have one solution that I'll call x1, and here you have another solution x2, you can ask yourself: are these maxima or are they minima? Both kinds of points can have flat tangent lines.
When you do find such a point and want to know whether it's a maximum or a minimum, if you're looking at the graph, you can tell that this point here is a local maximum and this point here is a local minimum. But if you weren't looking at the graph, there's a nice test that will tell you the answer: you look at the second derivative.
In this case, because the concavity is down, that second derivative is going to be less than zero. And then over here, because the concavity is up, that second derivative is greater than zero. By getting this information about the concavity, you can make a conclusion: when the concavity is down, you're at a local maximum; when the concavity is up, you're at a local minimum.
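If you wanted to check that single-variable test by computer, here's a minimal sketch using Python's sympy library; the function f(x) = x^3 - 3x below is just a made-up example, not one from the video.

```python
# A minimal sketch of the single-variable second derivative test,
# using a made-up example function f(x) = x**3 - 3*x.
import sympy as sp

x = sp.symbols('x')
f = x**3 - 3*x

for c in sp.solve(sp.diff(f, x), x):       # points where f'(x) = 0
    concavity = sp.diff(f, x, 2).subs(x, c)
    if concavity > 0:
        print(c, "-> local minimum")       # concave up
    elif concavity < 0:
        print(c, "-> local maximum")       # concave down
    else:
        print(c, "-> test is inconclusive")
```

Running it reports a local maximum at x = -1 and a local minimum at x = 1, matching the concavity reasoning above.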
In the case where the second derivative is zero, the test is inconclusive; you'd have to do more work to figure it out. So, in the multivariable world, the situation is very similar. As I've talked about in previous videos, you'd have some kind of function, and let's say it's a two-variable function. Instead of looking for where the derivative equals zero, you're going to look for where the gradient of your function is equal to the zero vector, which we might write in bold to emphasize that it's a vector.
That corresponds to finding flat tangent planes. If that seems unfamiliar, go back and take a look at the video where I introduced the idea of multivariable maxima and minima. But the subject of this video is the analog of this second derivative test, where in the single-variable world you just find the second derivative and check whether it's greater than or less than zero.
How can we in the multivariable world do something similar to figure out if you have a local minimum, a local maximum, or that new possibility of a saddle point that I talked about in the last video? So there is another test, and it's called the second partial derivative test, and I'll get to the specifics of that at the very end of this video.
But to set the landscape, I want to actually talk through a specific example where we find where the gradient equals zero, just to see what that looks like and to have some concrete formulas to deal with. The function that you're looking at right now is f(x, y) = x^4 - 4x^2 + y^2. Okay, so that's the function we're dealing with.
In order to find where its tangent plane is flat, we're looking for where the gradient equals zero. Remember, this is really just a way of unpacking two requirements: the partial derivative of f with respect to x at some point (x, y) has to be zero, and the partial derivative of f with respect to y at that same point (x, y) also has to be zero. We're looking for the x and y where both of these hold.
So the idea is that this gives us a system of equations that we can solve for x and y. Let's go ahead and actually do that. For the partial derivative with respect to x, we look up here, and the only places where x shows up are x^4 - 4x^2. The x^4 turns into 4x^3, and the -4x^2 becomes -8x.
Then y just looks like a constant, so the y^2 term contributes nothing to this partial derivative. The first requirement is that this expression, 4x^3 - 8x, is equal to zero. Now for the second part, the partial derivative with respect to y: the only place where y shows up is the y^2 term, so that partial derivative is just 2y, and we're setting it equal to zero.
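As a quick symbolic check, a sympy sketch along these lines confirms the two partial derivatives:

```python
# Symbolic check of the two partial derivatives of f(x, y) = x^4 - 4x^2 + y^2.
import sympy as sp

x, y = sp.symbols('x y')
f = x**4 - 4*x**2 + y**2

print(sp.diff(f, x))   # 4*x**3 - 8*x
print(sp.diff(f, y))   # 2*y
```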
I chose a simple example where these partial derivative equations separate nicely: this one only involves x, and this one only involves y. That's not always the case; if the function intermingled the variables a bit more, each equation would involve both x's and y's, and the system would be harder to solve.
But I just wanted something where we can actually find the solutions. So if we solve this system, the equation 2y = 0 just tells us that y has to equal 0. That's nice enough. Then there's the other equation, 4x^3 - 8x = 0. Let's rewrite that by factoring out one of the x's and factoring out a 4.
So this is 4x times (x^2 - 2), and that has to equal zero. There are two different ways this can happen: either x itself is equal to zero, which gives one solution, x = 0, or x^2 - 2 = 0, which means x is plus or minus the square root of 2.
So for the solution to the system of equations, we know that no matter what, y has to equal 0, and then one of three different things can happen: x = 0, x = sqrt(2), or x = -sqrt(2). This gives us three separate solutions, and I'll go ahead and write them down. Our three solutions as ordered pairs are (0, 0), for when x is 0 and y is 0; (sqrt(2), 0); and (-sqrt(2), 0).
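Here's a small sympy sketch that solves that same system; it should return the three critical points, possibly in a different order:

```python
# Solving the system grad f = 0 for f(x, y) = x^4 - 4x^2 + y^2.
import sympy as sp

x, y = sp.symbols('x y')
f = x**4 - 4*x**2 + y**2

critical_points = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y])
print(critical_points)   # [(-sqrt(2), 0), (0, 0), (sqrt(2), 0)]
```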
These are the three different points, the three different values of x and y, that satisfy the requirement that both partial derivatives are zero. What that should mean on the graph, then, is that when we look at those three different inputs, all of them have flat tangent planes.
So the first one (0, 0)—if we kind of look above, I guess we're kind of inside the graph here—(0, 0) is right at the origin, and we can see just looking at the graph that that's actually a saddle point. You know, this is neither a local maximum nor a local minimum; it doesn't look like a peak or like a valley.
Then the other two, where we kind of move along the x-axis, I guess it turns out that this point here is directly below x equals positive square root of 2, and this other minimum is directly below x equals negative square root of 2. I wouldn't have been able to guess that just looking at the graph, but we just figured it out.
We can see visually that both of those are local minima. But the question is: how could we have figured that out once we find these solutions? If you didn't have the graph to look at immediately, how could you have figured out that (0, 0) corresponds to a saddle point and that both of these other solutions correspond to local minima?
Following the idea of the single-variable second derivative test, what you might do is take the second partial derivatives of our function and see how those relate to concavity. For example, let's take the second partial derivative with respect to x, and I'll try to squeeze it up here: the second partial derivative of the function with respect to x, differentiating with respect to x twice, means we take the derivative of this expression, 4x^3 - 8x, with respect to x.
So we bring down that exponent 3, and 3 times 4 gives 12, so we get 12x^2; then the -8x term becomes -8, leaving 12x^2 - 8. What this means in terms of the graph is that if we move purely in the x-direction, which means we cut the graph with a plane representing a constant y-value and look at the resulting slice, this expression tells us the concavity at every given point.
These bottom two points here correspond to x equals plus or minus the square root of 2. So if we think about the case where x equals the square root of 2 and plug that into the expression, what do we get? Well, if x equals sqrt(2), then x^2 equals 2, so that's 12 times 2 minus 8, which is 24 minus 8, and we get 16. That's a positive number, which is why you have positive concavity at each of these points.
So as far as the x-direction is concerned, it feels like, ah yes, both of these have positive concavity, so they should look like local minima. Then, if you plug in x = 0 instead, you'd have 12 times 0 minus 8, and instead of 16 you'd get negative 8. Because that's negative, it gives you negative concavity on the graph, which is why, as far as x is concerned, the origin looks like a local maximum.
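To summarize that arithmetic, here's a short sympy sketch that evaluates the second partial derivative with respect to x at the three critical values of x:

```python
# Evaluating f_xx = 12*x**2 - 8 at the x-values of the three critical points.
import sympy as sp

x, y = sp.symbols('x y')
f = x**4 - 4*x**2 + y**2
fxx = sp.diff(f, x, 2)                    # 12*x**2 - 8

for x0 in [0, sp.sqrt(2), -sp.sqrt(2)]:
    print(x0, fxx.subs(x, x0))            # 0 -> -8, sqrt(2) -> 16, -sqrt(2) -> 16
```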
So let’s actually write that down. If we kind of go down here and we're analyzing each one of these, and we think about what it looks like from the perspective of each variable, as far as x is concerned, that origin should look like a max, and then each of these two points should look like minima. This is kind of what the variable x thinks.
Then, the variable y, if we do something similar and we take the second partial derivative with respect to y, I'll go ahead and write that over here because this will be pretty quick. The second partial derivative with respect to y, we're taking the derivative of this expression with respect to y, and that's just a constant; that's just 2. And because it’s positive, it's telling you that as far as y is concerned, there's positive concavity everywhere.
On the graph, what that would mean, if you just look at things where you're kind of slicing with a constant x value to see pure movement in the y direction, there's always going to be positive concavity. Here, I've only drawn the plane where x is constantly equal to zero, but if you imagine kind of sliding that plane around left and right, you're always getting positive concavity.
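A one-line sympy check of that second partial derivative with respect to y:

```python
# The second partial derivative of f with respect to y is a positive constant.
import sympy as sp

x, y = sp.symbols('x y')
f = x**4 - 4*x**2 + y**2
print(sp.diff(f, y, 2))   # 2, positive everywhere, so positive concavity in y
```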
So as far as y is concerned, everything looks like a local minimum. So we go down here and say everything looks like a local minimum: minimum, minimum, and minimum. It might be tempting at this point to think that you're done, that you've found all the information you need, because you'd say, well, the x and y directions disagree about whether the origin should be a maximum or a minimum, which is why it looks like a saddle point.
And they agree on the other two points, where both directions say minimum, which is why you might conclude that both of those points are local minima. However, that's actually not enough. There are examples I could draw where doing this kind of analysis would lead you to the wrong conclusion: you would conclude that certain points are local minima when in fact they're saddle points.
The basic reason is that you need the information given by the other second partial derivative: in the multivariable world, you can take the partial derivative with respect to one variable and then with respect to another, and you have to take this mixed partial derivative term into account in order to draw full conclusions.
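Just to see what that term looks like for this particular function, here's a sympy sketch of the mixed partial derivative (it happens to come out to zero here):

```python
# The mixed partial derivative of f(x, y) = x^4 - 4x^2 + y^2:
# differentiate with respect to x, then with respect to y.
import sympy as sp

x, y = sp.symbols('x y')
f = x**4 - 4*x**2 + y**2
print(sp.diff(f, x, y))   # 0 for this particular function
```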
I'm a little bit afraid that this video might be running long, so I'll cut it short here, and I'll give you the second partial derivative test in its full glory, accounting for this mixed partial derivative term, in the next video. I'll also give some intuition for where that term comes in and why the simple analysis we did here is close, and does build intuition, but isn't quite complete and won't always give you the right conclusion.
All right, I will see you then.