
Vector form of multivariable quadratic approximation


6 min read · Nov 11, 2024

Okay, so we are finally ready to express the quadratic approximation of a multivariable function in vector form. So, I have the whole thing written out here, where ( f ) is the function that we are trying to approximate, ( (X_0, Y_0) ) is the constant point about which we are approximating, and then this entire expression is the quadratic approximation, which I've talked about in past videos.
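(For reference, since the on-screen expression isn't reproduced here: the quadratic approximation being described is the standard second-order Taylor expansion about ( (X_0, Y_0) ),

( Q_f(X, Y) = f(X_0, Y_0) + f_X(X_0, Y_0)(X - X_0) + f_Y(X_0, Y_0)(Y - Y_0) + \frac{1}{2} f_{XX}(X_0, Y_0)(X - X_0)^2 + f_{XY}(X_0, Y_0)(X - X_0)(Y - Y_0) + \frac{1}{2} f_{YY}(X_0, Y_0)(Y - Y_0)^2 ).)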

If it seems very complicated or absurd, or you're unfamiliar with it, let's just dissect it real quick. This over here is the constant term; it's just going to evaluate to a constant. Everything over here is the linear term, because it just involves taking a variable multiplied by a constant. Then in the remainder, every one of the components has two variables multiplied into it. So, like ( X^2 ) comes up, and ( X \cdot Y ) and ( Y^2 ) come up, so that's the quadratic term.

Now, to vectorize things, first of all, let's write down the input variable ( (X, Y) ) as a vector. Typically, we'll do that with a boldfaced ( \mathbf{X} ) to indicate that it's a vector, and its components are just going to be the single variables ( X ) and ( Y ), the non-boldfaced ones. So, this is the vector representing the variable input. Correspondingly, a boldfaced ( \mathbf{X} ) with a little subscript ( 0 ), that is ( \mathbf{X}_0 ), is going to be the constant input, the single point in space near which we are approximating.
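Written out, those two vectors are

( \mathbf{X} = \begin{bmatrix} X \\ Y \end{bmatrix}, \qquad \mathbf{X}_0 = \begin{bmatrix} X_0 \\ Y_0 \end{bmatrix} ).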

When we write things like that, this constant term, simply enough, is going to look like evaluating your function at that boldfaced ( \mathbf{X}_0 ). So, that's probably the easiest one to handle. Now, the linear term looks like a dot product, and if we kind of expand it out as a dot product, it looks like we're taking the partial derivative of ( f ) with respect to ( X ) and then the partial derivative with respect to ( Y ), and we're evaluating both of those at that boldfaced ( \mathbf{X}_0 ) input.

Now, each one of those partial derivatives is multiplied by the variable minus the constant number. So, this looks like taking the dot product. Here, I'm going to erase the word "linear." We're taking it with ( X - X_0 ) and ( Y - Y_0 ). This is just expressing the same linear term as a dot product, and the convenience here is that this is totally the same thing as taking the gradient of ( f ).

That's the vector that contains all the partial derivatives, evaluated at the special input ( \mathbf{X}_0 ), and then we're taking the dot product between that and the variable vector ( \mathbf{X} - \mathbf{X}_0 ), since when you do this component-wise, boldfaced ( \mathbf{X} - \mathbf{X}_0 ) will be ( X ), the variable, minus ( X_0 ), the constant, and ( Y ), the variable, minus ( Y_0 ), the constant, which is what we have up there.
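Spelled out, that dot product is

( \nabla f(\mathbf{X}_0) \cdot (\mathbf{X} - \mathbf{X}_0) = f_X(\mathbf{X}_0)\,(X - X_0) + f_Y(\mathbf{X}_0)\,(Y - Y_0) ).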

So, this expression kind of vectorizes the whole linear term. And now, the beef here, the hard part, how are we going to vectorize this quadratic term? Now, that's what I was leading to in the last couple of videos where I talked about how you express a quadratic form like this with a matrix.

The way that you do it, I'll just kind of scroll down to give us some room. The way that you do it is we'll have a matrix whose components are all of these constants. It'll be this ( \frac{1}{2} ) times the second partial derivative evaluated there, and just for convenience's sake, I'm going to write ( \frac{1}{2} ) times the second partial derivative with respect to ( X ), two times in a row, and leave it as understood that we're evaluating it at this point.

In the other diagonal entry, you have ( \frac{1}{2} ) times the other pure second partial derivative, with respect to ( Y ) two times in a row. Then this constant here, the one on the mixed term, kind of gets broken apart into two different components. If you'll remember, in the quadratic form video it was always things where you had ( A ), then ( 2B ), and then ( C ) as your constants for the quadratic form.

So, if we're interpreting this as two times something, then it gets broken down, and on one corner it shows up as ( \frac{1}{2} f_{XY} ), and on the other corner also as ( \frac{1}{2} f_{XY} ). So both of these together are going to constitute the entire mixed partial derivative. The way that we express the quadratic form is we're going to multiply this by, well, the first component is whatever the thing is that's squared here. So it's going to be that ( X - X_0 ), and then the second component is whatever the other thing squared is, which in this case is ( Y - Y_0 ).

Of course, we take that same vector, but we put it in on the other side too. So, let me make a little bit of room; this is going to be wide. We're going to take that same vector and kind of put it on its side, so it'll be ( X - X_0 ) as the first component and then ( Y - Y_0 ) as the second component, but it's written horizontally.
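Putting the row vector, the matrix, and the column vector together, the quadratic term becomes (with each second partial understood to be evaluated at ( (X_0, Y_0) ))

( \begin{bmatrix} X - X_0 & Y - Y_0 \end{bmatrix} \begin{bmatrix} \frac{1}{2} f_{XX} & \frac{1}{2} f_{XY} \\ \frac{1}{2} f_{XY} & \frac{1}{2} f_{YY} \end{bmatrix} \begin{bmatrix} X - X_0 \\ Y - Y_0 \end{bmatrix} ).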

If you multiply out the entire matrix, it's going to give us the same expression that you have up here. If that seems unfamiliar, if that seems, you know, "How do you go from there to there?" check out the video on quadratic forms or you can check out the article where I'm talking about the quadratic approximation as a whole. I kind of go through the computation there.

Now, this matrix right here is almost the Hessian matrix; this is why I made a video about the Hessian matrix. It's not quite the Hessian, because everything has a ( \frac{1}{2} ) multiplied into it, so I'm just going to kind of factor that out, and we'll remember that we have to multiply a ( \frac{1}{2} ) back in at some point. But otherwise, it is the Hessian matrix, which we denote with a kind of boldfaced ( \mathbf{H} ), and I emphasize that it's the Hessian of ( f ).

The Hessian is something you take of a function, and like I said, remember, each of these terms should be thought of as evaluated at that special boldfaced ( \mathbf{X}_0 ) input point; I was just kind of too lazy to write in the ( (X_0, Y_0) ) each time. But what we have, then, is we're multiplying it on the right by this whole vector, the variable vector ( \mathbf{X} - \mathbf{X}_0 ).

That's what that entire vector is, and then we kind of have the same thing on the left, you know, the boldfaced vector ( \mathbf{X} - \mathbf{X}_0 ), except that we transpose it. We kind of put it on its side, and the way you denote that is you put a little ( T ) there for transpose. So this term captures all of the quadratic information that we need for the approximation.
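So, with the ( \frac{1}{2} ) pulled back out front, the whole quadratic chunk compresses to

( \frac{1}{2} (\mathbf{X} - \mathbf{X}_0)^{T} \, \mathbf{H}_f(\mathbf{X}_0) \, (\mathbf{X} - \mathbf{X}_0) ).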

So just to put it all together, if we go back up, when we put the constant term that we have, the linear term, and this quadratic form that we just found all together, what we get is that the quadratic approximation of ( f ), thought of as a function of a vector input ( \mathbf{X} ), equals the function itself evaluated at whatever point we're approximating near, plus the gradient of ( f ), which is kind of its vector analog of a derivative, evaluated at that point.

So this is a constant vector dot product with the variable vector ( \mathbf{X} - \mathbf{X}_0 ), that whole thing, plus ( \frac{1}{2} ) times this whole quadratic term we just found: the variable minus the constant, multiplied by the Hessian, which is kind of like an extension of the second derivative to multivariable functions.

We're evaluating that Hessian at the constant ( \mathbf{X}_0 ), and then on the right side, we're multiplying it by the variable ( \mathbf{X} - \mathbf{X}_0 ). This is the quadratic approximation in vector form, and the important part is that now it doesn't just have to be of a two-variable input.
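Written as a single formula, the whole thing is

( Q_f(\mathbf{X}) = f(\mathbf{X}_0) + \nabla f(\mathbf{X}_0) \cdot (\mathbf{X} - \mathbf{X}_0) + \frac{1}{2} (\mathbf{X} - \mathbf{X}_0)^{T} \, \mathbf{H}_f(\mathbf{X}_0) \, (\mathbf{X} - \mathbf{X}_0) ).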

You could imagine plugging in a three-variable input or a four-variable input. You take the gradient of a four-variable function, you'll get a vector with four components. You take the Hessian of a four-variable function, you get a ( 4 \times 4 ) matrix, and all of these terms still make sense.
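To make that concrete, here is a small numerical sketch (not from the video; the example function, names, and finite-difference step are my own choices) that builds this vector-form approximation for a function of any number of variables, estimating the gradient and Hessian with central differences:

```python
import numpy as np

def quadratic_approximation(f, x0, h=1e-5):
    """Return Q(x) = f(x0) + grad . (x - x0) + 1/2 (x - x0)^T H (x - x0),
    with the gradient and Hessian of f estimated by central differences."""
    x0 = np.asarray(x0, dtype=float)
    n = x0.size

    # Gradient: one central difference per coordinate.
    grad = np.zeros(n)
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        grad[i] = (f(x0 + e) - f(x0 - e)) / (2 * h)

    # Hessian: central differences over pairs of coordinates (symmetric by construction).
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.zeros(n), np.zeros(n)
            ei[i], ej[j] = h, h
            H[i, j] = (f(x0 + ei + ej) - f(x0 + ei - ej)
                       - f(x0 - ei + ej) + f(x0 - ei - ej)) / (4 * h * h)

    def Q(x):
        d = np.asarray(x, dtype=float) - x0
        return f(x0) + grad @ d + 0.5 * d @ H @ d

    return Q

# Example: approximate f(x, y) = sin(x) * e^y near the point (0, 0).
f = lambda v: np.sin(v[0]) * np.exp(v[1])
Q = quadratic_approximation(f, [0.0, 0.0])
print(f(np.array([0.1, 0.2])), Q([0.1, 0.2]))  # the two values should be close
```

Near the expansion point, the printed values should agree to several decimal places, which is exactly what a second-order approximation promises.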

I think it's also prettier to write it this way, because it looks a lot more like a Taylor expansion in the single-variable world. You have, you know, a constant term, plus the value of a derivative times the variable minus the constant, plus ( \frac{1}{2} ) times what's kind of like the second derivative term, multiplied by what's kind of like an ( X^2 ). But this is how it looks in the vector world. So, in that way, it's actually maybe a little bit more familiar than writing it out in the full, you know, component-by-component form, where it's easy to kind of get lost in the weeds.
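For comparison, the single-variable second-order Taylor approximation around a constant ( a ) is ( f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2 ), and the vector form above mirrors it term for term.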

So, um, full vectorized form of the quadratic approximation of a scalar-valued multivariable function — boy, is that a lot to say!
