The Jacobian matrix


5m read
·Nov 11, 2024

In the last video, we were looking at this particular function. It's a very non-linear function, and we were picturing it as a transformation that takes every point (x, y) in space to the point (x + sin(y), y + sin(x)).

Moreover, we zoomed in on a specific point, and let me actually write down what point we zoomed in on. It was (-2, 1). That's something we're going to want to record here: (-2, 1). I added a couple extra grid lines around it just so we can see in detail what the transformation does to points that are in a neighborhood of that point.

Over here, this square shows the zoomed-in version of that neighborhood. What we saw was that even though the function as a whole, as a transformation, looks rather complicated, around that one point it looks like a linear function. It's locally linear.

So what I'll show you here is the matrix that tells you which linear function this looks like. This is going to be some kind of 2 by 2 matrix, and I'll make a lot of room for ourselves here. The way to think about it is to first go back to our original setup before the transformation and think of just a tiny step to the right, what I'm going to think of as a little partial x, a tiny step in the x direction.

What that turns into after the transformation is going to be some tiny step in the output space. Here, let me actually kind of draw on what that tiny step turned into. It's no longer purely in the x direction. It has some rightward component but now also some downward component.

To be able to represent this in a nice way, instead of writing the entire function as something with a vector-valued output, I'm going to represent it as two separate scalar-valued functions, f1(x, y) and f2(x, y). So f1(x, y) is just a name for x + sin(y), and f2(x, y) is a name for y + sin(x). Again, all I'm doing is giving names to the functions we already have written down.

When I look at this vector, the consequence of taking a tiny dx step in the input space is some 2D movement in the output space. If I draw this out and ask, "Hey, what's the x component of that movement?", that's something we think of as a little partial change in f1, the x component of our output.

If we take partial f1 and divide it by the size of that initial tiny change, it basically scales it up to be a normal-sized vector: not a tiny nudge, but something more constant that doesn't shrink as we zoom in further and further. Similarly, the vertical component of that step, which is still caused by that initial dx step to the right, is going to be the tiny partial change in f2, the y component of the output.

Here, everything we're looking at in the output space was caused by a partial change in the x direction. Again, I like to think about it this way: we're dividing by a tiny amount. This partial f2 is really a tiny, tiny nudge, but by dividing by the size of the initial tiny nudge that caused it, we're getting something that's basically a number, something that doesn't shrink when we consider more and more zoomed-in versions.
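The "divide the tiny output nudge by the tiny input nudge" idea can be checked numerically. The following is a minimal sketch, not part of the video: the helper name `x_step_ratio` and the step sizes are my own choices, and only the function (x + sin(y), y + sin(x)) and the point (-2, 1) come from the text.

```python
import math

def f(x, y):
    """The transformation from the text: (x, y) -> (x + sin(y), y + sin(x))."""
    return (x + math.sin(y), y + math.sin(x))

def x_step_ratio(x, y, dx):
    """Change in (f1, f2) caused by a step dx in the x direction, divided by dx."""
    f1_before, f2_before = f(x, y)
    f1_after, f2_after = f(x + dx, y)
    return ((f1_after - f1_before) / dx, (f2_after - f2_before) / dx)

# Hypothetical step sizes: the ratios should settle down as dx shrinks.
for dx in (0.1, 0.01, 0.001):
    print(dx, x_step_ratio(-2.0, 1.0, dx))
```

As dx shrinks, the first ratio stays near 1 and the second settles near cos(-2), which is roughly -0.416. These are numbers that do not shrink as you zoom in, and the negative second value is the downward component of the step.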

So that's everything that happens when we take a tiny step in the x direction. But another thing you can consider is a tiny step in the y direction. We want to know, "Hey, if you take a single step, some tiny unit upward, what does that turn into after the transformation?"

What that looks like is this vector that still has some upward component, but it also has a rightward component. Now, I'm going to write its components as the second column of the matrix because, as we know, when you're representing a linear transformation with a matrix, the first column tells you where the first basis vector goes, and the second column shows where the second basis vector goes.
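To make the column rule concrete, here is a small sketch that multiplies a 2 by 2 matrix by each basis vector. The helper `apply` and the entries of `M` are arbitrary illustrations of mine and have nothing to do with the function in the video.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (given as a list of rows) by a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

# Arbitrary example entries, chosen only so the columns are easy to tell apart.
M = [[3, 5],
     [7, 11]]

print(apply(M, (1, 0)))  # first basis vector lands on the first column: (3, 7)
print(apply(M, (0, 1)))  # second basis vector lands on the second column: (5, 11)
```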

If that feels unfamiliar, either check out the refresher video or maybe go and look at some of the linear algebra content. To figure out the coordinates of this guy, we do basically the same thing. We say, first of all, the change in the x direction here, the x component of this nudge vector, that's going to be given as a partial change to f1, right, to the x component of the output.

Here, we're looking in the output space, so we're dealing with f1 and f2, and we're asking what change was caused by a tiny step in the y direction. So the first entry is the change in f1 caused by some tiny step in the y direction, divided by the size of that tiny step. Then the y component of the step in the output space, the one caused by the initial tiny step upward in the input space, is the change in f2, the second component of our output, as caused by that little partial y, again divided by the size of that step.
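Written out, the four ratios described so far sit in a 2 by 2 grid like this. The label J is my own shorthand, and the layout follows the column rule from the text: the first column holds the effect of the x step, the second column the effect of the y step.

```latex
\[
J(x, y) =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial x} & \dfrac{\partial f_1}{\partial y} \\[1ex]
\dfrac{\partial f_2}{\partial x} & \dfrac{\partial f_2}{\partial y}
\end{bmatrix}
\]
```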

And of course, all of this is very specific to the point that we started at, right? We started at the point (-2, 1). So each of these partial derivatives is really something we evaluate at the point (-2, 1). When you evaluate each one of these at (-2, 1), you'll get some number, and that'll give you a very concrete 2 by 2 matrix that represents the linear transformation this function looks like once you've zoomed in.
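As a rough preview of what "evaluate at (-2, 1)" produces, here is a sketch in which I have differentiated f1 = x + sin(y) and f2 = y + sin(x) by hand. The function name `jacobian` is my own, and this is not the computation from the video itself.

```python
import math

def jacobian(x, y):
    """Partial derivatives of (x + sin(y), y + sin(x)), as a list of rows."""
    return [
        [1.0, math.cos(y)],   # d f1/dx, d f1/dy
        [math.cos(x), 1.0],   # d f2/dx, d f2/dy
    ]

J = jacobian(-2.0, 1.0)
for row in J:
    print(row)
```

The first column comes out as roughly (1, -0.42), a step that goes right and slightly down. The second column is roughly (0.54, 1), a step that goes up and somewhat right. Both match the pictures described above.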

So this matrix here that's full of all of the different partial derivatives has a very special name. It's called, as you may have guessed, the Jacobian, or more fully, you'd call it the Jacobian matrix. One way to think about it is that it carries all of the partial differential information, right? It's taking into account both of these components of the output and both possible inputs, and giving you kind of a grid of what all the partial derivatives are.

But as I hope you see, it's much more than just a way of recording what all the partial derivatives are. There's a reason for organizing it like this, in particular, and it really does come down to this idea of local linearity. If you understand that the Jacobian matrix is fundamentally supposed to represent what a transformation looks like when you zoom in near a specific point, almost everything else about it will start to fall in place.
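One way to test the "zoom in and it looks linear" claim is to compare the true output at a nearby point with the prediction made from the matrix of partial derivatives. This is a sketch under my own assumptions: the helper names and step sizes are hypothetical, and the partial derivatives (1, cos(y), cos(x), 1) are worked out by hand from the formulas in the text.

```python
import math

def f(x, y):
    """The transformation from the text: (x, y) -> (x + sin(y), y + sin(x))."""
    return (x + math.sin(y), y + math.sin(x))

def linear_approx(x0, y0, hx, hy):
    """f(x0, y0) plus the matrix of partials at (x0, y0) applied to (hx, hy)."""
    f1, f2 = f(x0, y0)
    return (f1 + 1.0 * hx + math.cos(y0) * hy,
            f2 + math.cos(x0) * hx + 1.0 * hy)

def approx_error(x0, y0, hx, hy):
    """Distance between the true output and the linear prediction."""
    exact = f(x0 + hx, y0 + hy)
    approx = linear_approx(x0, y0, hx, hy)
    return math.hypot(exact[0] - approx[0], exact[1] - approx[1])

# Hypothetical step sizes: each step is 10 times smaller than the last.
for h in (0.1, 0.01, 0.001):
    print(h, approx_error(-2.0, 1.0, h, h))
```

Each time the step gets 10 times smaller, the error should drop by roughly a factor of 100. The linear picture becomes accurate much faster than the neighborhood shrinks.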

In the next video, I'll go ahead and actually compute this just to show you what the process looks like and how the result we get kind of matches with the picture we're looking at. See you then.
