
The Hessian matrix | Multivariable calculus | Khan Academy


5m read · Nov 11, 2024

Hey guys, so before talking about the vector form for the quadratic approximation of multivariable functions, I've got to introduce this thing called the Hessian matrix. The Hessian matrix: essentially what this is, it's just a way to package all the information of the second derivatives of a function.

So let's say you have some kind of multivariable function, like, I don't know, like the example we had in the last video: e to the x halves multiplied by sine of y. So some kind of multivariable function. What the Hessian matrix is (and it's often denoted with an H, or kind of a bold-faced H) is a matrix, incidentally enough, that contains all the second partial derivatives of f.

So the first component is going to be the partial derivative of f with respect to x, kind of twice in a row. Everything in this first column, it's kind of like you first do it with respect to x, because the next part is the second derivative, where first you do it with respect to x and then you do it with respect to y. So that's kind of the first column of the matrix.

And then up here, it's the partial derivative where first you do it with respect to y and then you do it with respect to x. And then over here, it's where you do it with respect to y both times in a row—so partial with respect to y both times in a row.
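Written out in symbols, the two-by-two matrix just described looks like this (the lower-left entry differentiates first by x and then by y; the upper-right is the other order):

```latex
\mathbf{H}f =
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[10pt]
\dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2}
\end{bmatrix}
```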

So let's go ahead and actually compute this and think about what this would look like in the case of our specific function here. In order to get all the second partial derivatives, we first should just kind of keep a record of the first partial derivatives.

So the partial derivative of f with respect to x: the only place x shows up is in this e to the x halves, so we kind of bring down that 1/2, giving 1/2 e to the x halves, and sine of y just looks like a constant as far as x is concerned, so times sine of y. Then the partial derivative of f with respect to y: now e to the x halves looks like a constant, and it's being multiplied by something that has a y in it.

So that's e to the x halves times the derivative of sine of y, which, since we're doing it with respect to y, is cosine of y. These terms won't be included in the Hessian itself, but we're kind of just keeping a record of them, because now, when we go in to fill in the matrix, this upper left component is the second partial derivative where we do it with respect to x, then x again.
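Keeping that record in one place, for this function:

```latex
f(x, y) = e^{x/2} \sin(y),
\qquad
\frac{\partial f}{\partial x} = \frac{1}{2}\, e^{x/2} \sin(y),
\qquad
\frac{\partial f}{\partial y} = e^{x/2} \cos(y)
```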

So up here, when we did it with respect to x, if we did it with respect to x again, we kind of bring down another 1/2, so that becomes 1/4 e to the x halves, and that sine of y still just looks like a constant: sine of y. Then this mixed partial derivative, where we do it with respect to x then y: so we did it with respect to x here.

When we differentiate this with respect to y, the 1/2 e to the x halves just looks like a constant, but then the derivative of sine of y ends up as cosine of y. And then up here, it's going to be the same thing, but let's kind of see how it goes when you do it in the other direction: first with respect to y, then x.

So over here, we did it first with respect to y; if we took this derivative with respect to x, the half would come down, so that would be 1/2 e to the x halves multiplied by cosine of y because that just looks like a constant since we're doing it with respect to x the second time, so that would be cosine of y.

And it shouldn't feel like a surprise that both of these terms turn out to be the same; with most functions, that's the case. Technically not all functions: you can come up with some crazy functions where this won't be symmetric, where the terms off the diagonal are different. But as long as the second partial derivatives are continuous (that's Schwarz's theorem), the mixed partials are equal, so for the most part you can expect these to be the same.

And then this last term here, where we do it with respect to y twice, we now think of taking the derivative of this whole term with respect to y. That e to the x halves looks like a constant, and the derivative of cosine is negative sine of y—so this whole thing, a matrix each of whose components is a multivariable function, is the Hessian.
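Collecting all four second partials, the Hessian of this particular function is:

```latex
\mathbf{H}f =
\begin{bmatrix}
\frac{1}{4}\, e^{x/2} \sin(y) & \frac{1}{2}\, e^{x/2} \cos(y) \\[6pt]
\frac{1}{2}\, e^{x/2} \cos(y) & -\, e^{x/2} \sin(y)
\end{bmatrix}
```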

This is the Hessian of f, and sometimes people will write it as Hessian of f, kind of specifying what function it's of. You could think of this, I mean, you could think of it as a matrix-valued function, which feels kind of weird, but you know, you plug in two different values x and y and you'll get a matrix.
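To make that matrix-valued-function idea concrete, here's a small numerical sketch in plain Python: it approximates the four second partials of this same f with central differences (the helper name, the step size h, and the sample point are all arbitrary choices for illustration), and plugging in a point gives back an ordinary matrix of numbers.

```python
import math

def f(x, y):
    return math.exp(x / 2) * math.sin(y)

def hessian_2x2(x, y, h=1e-4):
    """Approximate the Hessian of f at (x, y) with central differences."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    # The mixed partial: the same value whichever order you differentiate in
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return [[fxx, fxy], [fxy, fyy]]

# Plugging in the point (0, 0): the hand-computed entries evaluate there to
# (1/4)sin(0) = 0, (1/2)cos(0) = 1/2, and -sin(0) = 0
H = hessian_2x2(0.0, 0.0)
```

So evaluating at (0, 0) should give approximately [[0, 0.5], [0.5, 0]]: a plain matrix of numbers, one per second partial derivative.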

So it's this matrix-valued function, and the nice thing about writing it like this is that you can actually extend it beyond functions that have just two variables. Let's say (I'll kind of clear this up) you had a function that had three variables, or four variables, or kind of any number.

So let's say it was, you know, a function of x, y, and z. Then you can follow this pattern, and following down the first column here, the next term that you would get would be the second partial derivative of f where first you do it with respect to x and then you do it with respect to z.

And then over here, it would be the second partial derivative of f where first you did it with respect to y and then you do it with respect to z. I'll clear up even more room here because you'd have another column where you'd have the second partial derivative where this time everything—you know, first you do it with respect to z and then with respect to x, and then over here you'd have the second partial derivative where first you do it with respect to z and then with respect to y.

And then as the very last component, you'd have the second partial derivative where first you do it with respect to, well I guess you do it with respect to z twice. So this whole thing, this 3x3 matrix, would be the Hessian of a three-variable function, and you can see how you could extend this pattern, where if it was a four-variable function, you'd get a 4x4 matrix of all of the possible second partial derivatives.
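Following that same pattern, the 3x3 Hessian of a function f(x, y, z) fills out as:

```latex
\mathbf{H}f =
\begin{bmatrix}
\dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} & \dfrac{\partial^2 f}{\partial x\,\partial z} \\[10pt]
\dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2} & \dfrac{\partial^2 f}{\partial y\,\partial z} \\[10pt]
\dfrac{\partial^2 f}{\partial z\,\partial x} & \dfrac{\partial^2 f}{\partial z\,\partial y} & \dfrac{\partial^2 f}{\partial z^2}
\end{bmatrix}
```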

And if it was a 100-variable function, you would have a 100 by 100 matrix. So the nice thing about having this is that we can then talk about all of that information just by referencing the symbol. And we'll see in the next video how this makes it very nice to express, for example, the quadratic approximation of any kind of multivariable function, not just a two-variable function, and the symbols don't get way out of hand, because you don't have to reference each one of these individual components.
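The "any number of variables" idea is easy to sketch numerically as well. Here's a minimal plain-Python version (the central-difference formula, the test function, and the step size are assumptions chosen for illustration, not anything from the video): entry (i, j) approximates the second partial derivative with respect to variables i and j.

```python
def hessian(f, point, h=1e-4):
    """Approximate the n-by-n Hessian of f at `point` with central
    differences. `f` maps a list of n floats to a float."""
    n = len(point)

    def shifted(i, di, j, dj):
        # f evaluated with variable i nudged by di*h and variable j by dj*h
        q = list(point)
        q[i] += di * h
        q[j] += dj * h
        return f(q)

    return [[(shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
              - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)) / (4 * h**2)
             for j in range(n)]
            for i in range(n)]

# Hypothetical example: f(x, y, z) = x^2*y + z^3 at (1, 2, 3), where the
# exact second partials are f_xx = 2y = 4, f_xy = 2x = 2, f_zz = 6z = 18,
# and every other entry is 0
H = hessian(lambda p: p[0] ** 2 * p[1] + p[2] ** 3, [1.0, 2.0, 3.0])
```

A 100-variable function would give back a 100 by 100 matrix the same way; each entry is just one of those second partial derivatives.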

You can just reference the matrix as a whole and start doing matrix operations. And I will see you in that next video.
