Vector form of multivariable quadratic approximation


6m read
·Nov 11, 2024

Okay, so we are finally ready to express the quadratic approximation of a multivariable function in vector form. So, I have the whole thing written out here, where ( f ) is the function that we are trying to approximate, ( (X_0, Y_0) ) is the constant point about which we are approximating, and this entire expression is the quadratic approximation, which I've talked about in past videos.

If it seems very complicated or absurd, or you're unfamiliar with it, let's just dissect it real quick. This over here is the constant term; this is just going to evaluate to a constant. Everything over here is the linear term because it just involves taking a variable multiplied by a constant. Then the remainder, every one of these components will have two variables multiplied into it. So, like ( X^2 ) comes up, and ( X \cdot Y ) and ( Y^2 ) comes up, so that's the quadratic term.
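For reference, the expression being dissected here is presumably the standard two-variable quadratic approximation. The transcript does not reproduce it, so the block below is a reconstruction from the description above; the name ( Q_f ) is just a label chosen for this sketch.

```latex
\begin{aligned}
Q_f(X, Y) ={}& f(X_0, Y_0)
  && \text{(constant term)} \\
&+ f_X(X_0, Y_0)\,(X - X_0) + f_Y(X_0, Y_0)\,(Y - Y_0)
  && \text{(linear term)} \\
&+ \tfrac{1}{2} f_{XX}(X_0, Y_0)\,(X - X_0)^2
   + f_{XY}(X_0, Y_0)\,(X - X_0)(Y - Y_0)
   + \tfrac{1}{2} f_{YY}(X_0, Y_0)\,(Y - Y_0)^2
  && \text{(quadratic term)}
\end{aligned}
```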

Now, to vectorize things, first of all, let's write down the input variable ( (X, Y) ) as a vector. Typically, we'll do that with a boldfaced ( \mathbf{X} ) to indicate that it's a vector, and its components are just going to be the single variables ( X ) and ( Y ), the non-boldfaced. So, this is the vector representing the variable input. Correspondingly, a boldfaced ( \mathbf{X} ) with a little subscript ( 0 ) (i.e., ( \mathbf{X}_0 )) is going to be the constant input ( (X_0, Y_0) ), the single point in space near which we are approximating.

When we write things like that, this constant term simply enough is going to look like evaluating your function at that boldfaced ( \mathbf{X}_0 ). So, that's probably the easiest one to handle. Now, the linear term looks like a dot product, and if we kind of expand it out as the dot product, it looks like we're taking the partial derivative of ( f ) with respect to ( X ) and then the partial derivative with respect to ( Y ), and we're evaluating both of those at that boldfaced ( \mathbf{X}_0 ) input.

Now, each one of those partial derivatives is multiplied by the variable minus the constant number. So, this looks like taking the dot product. Here, I'm going to erase the word "linear." We're taking it with the vector whose components are ( X - X_0 ) and ( Y - Y_0 ). This is just expressing the same linear term as a dot product, but the convenience here is that this is totally the same thing as saying the gradient of ( f ).

That's the vector that contains all the partial derivatives, evaluated at the special input ( \mathbf{X}_0 ), and then we're taking the dot product between that and the variable vector boldfaced ( \mathbf{X} - \mathbf{X}_0 ). When you do this component-wise, boldfaced ( \mathbf{X} ) minus ( \mathbf{X}_0 ), if we kind of think here, it'll be ( X ), the variable, minus ( X_0 ), the constant, and ( Y ), the variable, minus ( Y_0 ), the constant, which is what we have up there.
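To make the dot-product claim concrete, here is a minimal sketch in plain Python. The function `f` and the two points are hypothetical, picked only for illustration, and the partial derivatives were worked out by hand for that particular function.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return math.exp(x / 2) * math.sin(y)

def grad_f(x, y):
    # Partial derivatives of the example function, computed by hand.
    f_x = 0.5 * math.exp(x / 2) * math.sin(y)
    f_y = math.exp(x / 2) * math.cos(y)
    return [f_x, f_y]

def dot(u, v):
    # Ordinary dot product of two equal-length lists.
    return sum(a * b for a, b in zip(u, v))

x0 = [0.0, math.pi / 2]  # the constant point (X_0, Y_0)
x = [0.1, 1.5]           # a nearby variable input (X, Y)

g = grad_f(*x0)                               # gradient evaluated at X_0
diff = [xi - x0i for xi, x0i in zip(x, x0)]   # the vector X - X_0

# Component-wise linear term: f_X * (X - X_0) + f_Y * (Y - Y_0)
linear_componentwise = g[0] * (x[0] - x0[0]) + g[1] * (x[1] - x0[1])

# Vector form: gradient dotted with (X - X_0)
linear_vector = dot(g, diff)

print(linear_componentwise, linear_vector)
```

The two printed numbers should agree, since the dot product is just the component-wise sum written more compactly.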

So, this expression kind of vectorizes the whole linear term. And now, the beef here, the hard part, how are we going to vectorize this quadratic term? Now, that's what I was leading to in the last couple of videos where I talked about how you express a quadratic form like this with a matrix.

The way that you do it, I'll just kind of scroll down to give us some room. The way that you do it is we'll have a matrix whose components are all of these constants. It'll be this ( \frac{1}{2} ) times the second partial derivative evaluated there, and I'm just going to, for convenience sake, I'm going to just take ( \frac{1}{2} ) times the second partial derivative with respect to ( X ) and leave it as understood that we're evaluating it at this point.

On the other end of the diagonal, you have ( \frac{1}{2} ) times the other kind of second partial derivative, with respect to ( Y ) two times in a row. Then there's this constant here, the one on the mixed term, but this term kind of gets broken apart into two different components. If you'll remember in the quadratic form video, it was always things where it was ( A ) and then ( 2B ) and ( C ) as your constants for the quadratic form.

So, if we're interpreting this as two times something, then it gets broken down, and on one corner, it shows up as ( \frac{1}{2} f_{XY} ) and on the other one, also ( \frac{1}{2} f_{XY} ). So both of these together are going to constitute the entire mixed partial derivative. The way that we express the quadratic form is we're going to multiply this by a vector: the first component is whatever the thing is that's squared here, so it's going to be that ( X - X_0 ), and the second component is whatever the other thing squared is, which in this case is ( Y - Y_0 ).

Of course, we take that same vector but we put it in on the other side too. So, let me make a little bit of room; this is going to be wide. We're going to take that same vector and kind of put it on its side, so it'll be ( X - X_0 ) as the first component and then ( Y - Y_0 ) as the second component, but it's written horizontally.

If you multiply out the entire matrix, it's going to give us the same expression that you have up here. If that seems unfamiliar, if that seems, you know, "How do you go from there to there?" check out the video on quadratic forms or you can check out the article where I'm talking about the quadratic approximation as a whole. I kind of go through the computation there.
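For readers who want the step spelled out, multiplying the row vector, the matrix, and the column vector described above should give back the quadratic term. This is a sketch of that computation, with every second partial derivative understood to be evaluated at ( (X_0, Y_0) ):

```latex
\begin{bmatrix} X - X_0 & Y - Y_0 \end{bmatrix}
\begin{bmatrix}
  \tfrac{1}{2} f_{XX} & \tfrac{1}{2} f_{XY} \\
  \tfrac{1}{2} f_{XY} & \tfrac{1}{2} f_{YY}
\end{bmatrix}
\begin{bmatrix} X - X_0 \\ Y - Y_0 \end{bmatrix}
= \tfrac{1}{2} f_{XX}\,(X - X_0)^2
+ f_{XY}\,(X - X_0)(Y - Y_0)
+ \tfrac{1}{2} f_{YY}\,(Y - Y_0)^2
```

The two off-diagonal entries each contribute ( \tfrac{1}{2} f_{XY}\,(X - X_0)(Y - Y_0) ), which is why the mixed term ends up with no ( \tfrac{1}{2} ) in front of it.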

Now this matrix right here is almost the Hessian matrix; this is why I made a video about the Hessian matrix. It's not quite because everything has a ( \frac{1}{2} ) multiplied into it, so I'm just going to kind of take that out, and we'll remember we have to multiply a ( \frac{1}{2} ) in at some point. But otherwise, it is the Hessian matrix which we denote with a kind of boldfaced ( \mathbf{H} ), and I emphasize that it's the Hessian of ( f ).

The Hessian is something you take of a function, and like I said, remember each of these terms we should be thinking of as evaluated on the special input point, evaluating it at that special, you know, boldfaced ( \mathbf{X}_0 ) input point. I was just kind of too lazy to write it in each time, the ( (X_0, Y_0) ), all of that, but what we have then is we're multiplying it on the right by this whole vector, the variable vector boldfaced ( \mathbf{X} - \mathbf{X}_0 ).

That's what that entire vector is, and then we kind of have the same thing on the left, you know, boldfaced vector ( \mathbf{X} - \mathbf{X}_0 ), except that we transpose it. We kind of put it on its side, and the way you denote that is you have a little ( T ) there for transpose. So this term captures all of the quadratic information that we need for the approximation.
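As a quick numerical sanity check on pulling the ( \frac{1}{2} ) out of the matrix, the sketch below compares the component-wise quadratic term with one half of the transposed vector times the Hessian times the vector. The second partial derivative values and the displacement are made-up numbers, not taken from any particular function.

```python
def quadratic_form(v, M):
    # Computes v^T M v for a list v and a square list-of-lists M.
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))

# Hypothetical second partial derivatives at the point X_0 (illustration only).
f_xx, f_xy, f_yy = 2.0, -1.0, 3.0

# Hessian: the same matrix as before, but without the 1/2 on each entry.
H = [[f_xx, f_xy],
     [f_xy, f_yy]]

# Hypothetical displacement (X - X_0, Y - Y_0).
dx, dy = 0.1, -0.2

# Component-wise quadratic term.
componentwise = 0.5 * f_xx * dx**2 + f_xy * dx * dy + 0.5 * f_yy * dy**2

# Vector form: 1/2 * (X - X_0)^T H (X - X_0).
vector_form = 0.5 * quadratic_form([dx, dy], H)

print(componentwise, vector_form)
```

Both values should come out equal up to floating-point rounding, which is the whole point of writing the term as a matrix product.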

So just to put it all together, if we go back up, when we put the constant term that we have, the linear term, and this quadratic form that we just found all together, what we get is that the quadratic approximation of ( f ), which is a function, we'll think of it as a vector input ( \mathbf{X} ), equals the function itself evaluated at, you know, whatever point we're approximating near plus the gradient of ( f ), which is kind of its vector analog of a derivative evaluated at that point.

So this is a constant vector dot product with the variable vector ( \mathbf{X} - \mathbf{X}_0 ), that whole thing, plus ( \frac{1}{2} ) the -- we'll just copy down this whole quadratic term up there, the variable minus the constant, transposed, multiplied by the Hessian, which is kind of like an extension of the second derivative to multivariable functions.

We're evaluating that, let's see, we're evaluating it at the constant ( \mathbf{X}_0 ), and then on the right side, we're multiplying it by the variable ( \mathbf{X} - \mathbf{X}_0 ). This is the quadratic approximation in vector form, and the important part is now it doesn't just have to be of a two-variable input.
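Written out in one line, the formula being assembled here would look like the following. The label ( Q_f ) is an assumption of this sketch; the gradient, the Hessian ( \mathbf{H}_f ), and the transpose are as described above.

```latex
Q_f(\mathbf{X}) = f(\mathbf{X}_0)
  + \nabla f(\mathbf{X}_0) \cdot (\mathbf{X} - \mathbf{X}_0)
  + \tfrac{1}{2}\,(\mathbf{X} - \mathbf{X}_0)^{T}\,\mathbf{H}_f(\mathbf{X}_0)\,(\mathbf{X} - \mathbf{X}_0)
```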

You could imagine plugging in a three-variable input or a four-variable input, and all of these terms make sense. You know, you take the gradient of a four-variable function, you'll get a vector with four components. You take the Hessian of a four-variable function, you would get a ( 4 \times 4 ) matrix, and all of these terms make sense.
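Here is a rough sketch of how the vector form carries over to more variables, in plain Python. The helper `quadratic_approx` and the three-variable function `f` are hypothetical names made up for this example, and the gradient and Hessian are entered by hand for that specific function.

```python
import math

def quadratic_approx(f0, grad, hess, x, x0):
    # Evaluates f0 + grad . (x - x0) + 1/2 (x - x0)^T hess (x - x0)
    # for inputs of any dimension n.
    n = len(x)
    d = [x[i] - x0[i] for i in range(n)]
    linear = sum(grad[i] * d[i] for i in range(n))
    quadratic = 0.5 * sum(d[i] * hess[i][j] * d[j]
                          for i in range(n) for j in range(n))
    return f0 + linear + quadratic

def f(x, y, z):
    # Hypothetical three-variable function, chosen only for illustration.
    return x**2 * y + math.sin(z)

x0 = [1.0, 2.0, 0.0]  # the point we approximate near

# Gradient and Hessian of f at x0, worked out by hand for this f.
f0 = f(*x0)
grad = [2 * x0[0] * x0[1], x0[0]**2, math.cos(x0[2])]
hess = [[2 * x0[1], 2 * x0[0], 0.0],
        [2 * x0[0], 0.0,       0.0],
        [0.0,       0.0,       -math.sin(x0[2])]]

x = [1.1, 2.1, 0.1]   # a nearby input
approx = quadratic_approx(f0, grad, hess, x, x0)
exact = f(*x)
print(approx, exact)
```

For this input the approximation comes out to about 2.64 against a true value of about 2.6408. Nothing in `quadratic_approx` depends on the number of variables: the gradient has three components and the Hessian is ( 3 \times 3 ), and a four-variable function would just use longer lists.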

I think it's also prettier to write it this way because it looks a lot more like a Taylor expansion in the single-variable world. You have, you know, a constant term, plus the value of a derivative times ( X ) minus a constant, plus ( \frac{1}{2} ) times what's kind of like the second derivative term times what's kind of like an ( X^2 ). But this is how it looks in the vector world. So, in that way, it's actually maybe a little bit more familiar than writing it out in the full, you know, component by component term where it's easy to kind of get lost in the weeds there.
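To see the resemblance side by side, here is the familiar second-order Taylor polynomial in one variable next to the vector form. The labels ( T_2 ) and ( Q_f ) are assumptions of this sketch.

```latex
\begin{aligned}
T_2(x) &= f(x_0) + f'(x_0)\,(x - x_0) + \tfrac{1}{2}\,f''(x_0)\,(x - x_0)^2 \\
Q_f(\mathbf{X}) &= f(\mathbf{X}_0) + \nabla f(\mathbf{X}_0) \cdot (\mathbf{X} - \mathbf{X}_0)
  + \tfrac{1}{2}\,(\mathbf{X} - \mathbf{X}_0)^{T}\,\mathbf{H}_f(\mathbf{X}_0)\,(\mathbf{X} - \mathbf{X}_0)
\end{aligned}
```

The gradient plays the role of ( f' ), the Hessian plays the role of ( f'' ), and the vector ( \mathbf{X} - \mathbf{X}_0 ) appearing on both sides of the Hessian plays the role of ( (x - x_0)^2 ).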

So, the full vectorized form of the quadratic approximation of a scalar-valued multivariable function. Boy, is that a lot to say!
