Gradient


Nov 11, 2024

So here I'm going to talk about the gradient, and in this video I'm only going to describe how you compute the gradient. In the next couple ones, I'm going to give the geometric interpretation. I hate doing this; I hate showing the computation before the geometric intuition since usually it should go the other way around. But the gradient is one of those weird things where the way that you compute it actually seems kind of unrelated to the intuition. You'll see that we'll connect them in the next few videos, but to do that, we need to know what both of them actually are.

So on the computation side of things, let's say you have some sort of function. I'm just going to make it a two-variable function, and let's say it's f of x, y equals x squared sine of y. The gradient is a way of packing together all the partial derivative information of a function. So let's just start by computing the partial derivatives of this guy.

So partial of f with respect to x is equal to... so we look at this and we consider x the variable and y the constant. Well, in that case, sine of y is also a constant, you know, as far as x is concerned. The derivative of x squared is 2x, so we see that this will be 2x times that constant, sine of y. Whereas for the partial derivative with respect to y, now we look up here and we say x is considered a constant. So x squared is also considered a constant, so this is just a constant times sine of y. That's going to equal that same constant times the cosine of y, which is the derivative of sine.
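Written out in symbols, the two partial derivatives just computed for f(x, y) = x squared sine of y are:

```latex
\frac{\partial f}{\partial x} = 2x \sin(y),
\qquad
\frac{\partial f}{\partial y} = x^{2} \cos(y)
```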

Now what the gradient does is it just puts both of these together in a vector. Specifically, let me change colors here. You denote it with a little upside-down triangle. The name of that symbol is nabla, but you often just pronounce it del. You'd say del f, or gradient of f, and what this equals is a vector that has those two partial derivatives in it.

So the first one is the partial derivative with respect to x: 2x times sine of y. The bottom one is the partial derivative with respect to y: x squared cosine of y. Notice, and maybe I should emphasize this, that this is actually a vector-valued function, right? So maybe I'll give it a little bit more room here and emphasize that it's got an x and a y. This is a function that takes in a point in two-dimensional space and outputs a two-dimensional vector.
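As a minimal sketch of that idea, here is the gradient of this particular f written as a Python function that takes a point (x, y) and returns a two-component vector. The names `f`, `grad_f`, and `numeric_partials` are chosen for illustration only, and the finite-difference check is an assumption of this sketch, not something from the video.

```python
import math


def f(x, y):
    """The example function f(x, y) = x^2 * sin(y)."""
    return x**2 * math.sin(y)


def grad_f(x, y):
    """Gradient of f: the two partial derivatives packed into one vector."""
    df_dx = 2 * x * math.sin(y)   # treat y as a constant
    df_dy = x**2 * math.cos(y)    # treat x as a constant
    return (df_dx, df_dy)


def numeric_partials(func, x, y, h=1e-6):
    """Approximate both partials with central differences (sanity check only)."""
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (df_dx, df_dy)


if __name__ == "__main__":
    point = (1.5, 0.7)
    print("hand-computed:", grad_f(*point))
    print("numeric check:", numeric_partials(f, *point))
```

The two printed vectors should agree to several decimal places, which is a quick way to catch a slip in a hand-computed partial derivative.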

So you could also imagine doing this with three different variables. Then you would have three partial derivatives and a three-dimensional output. The way you might write this more generally is we could go down here and say the gradient of any function is equal to a vector with its partial derivatives: partial of f with respect to x and partial of f with respect to y.
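The same pattern extends to any number of inputs: one partial derivative per variable, collected into one vector. The sketch below approximates each partial with a central difference rather than computing it symbolically. The function `g` is a hypothetical three-variable example that does not appear in the video, and `numeric_gradient` is an illustrative name.

```python
def numeric_gradient(func, point, h=1e-6):
    """Approximate the gradient of func at point, one partial per input variable."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h    # nudge only the i-th variable up
        backward[i] -= h   # and down, holding the others constant
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return grad


def g(x, y, z):
    """Hypothetical three-variable function: g(x, y, z) = x*y + z^2."""
    return x * y + z**2


if __name__ == "__main__":
    # By hand, the gradient of g is (y, x, 2z), so at (1, 2, 3) it is (2, 1, 6).
    print(numeric_gradient(g, (1.0, 2.0, 3.0)))
```

Three inputs give a three-component output, matching the three-dimensional case described above.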

We call these partial derivatives; in some sense, I like to think of the gradient as the full derivative because it kind of captures all of the information that you need. A very helpful mnemonic device with the gradient is to think about this triangle, this nabla symbol, as being a vector full of partial derivative operators. And by operator, I just mean, you know, here, let's take partial with respect to x: something where you could give it a function and it gives you another function.

So you give this guy, you know, the function f, and it gives you this expression, this multivariable function as a result. So the nabla symbol is this vector full of different partial derivative operators, and in this case, it might just be two of them. This is kind of a weird thing, right? Because it's like, "What? This is a vector; it's got like operators in it? That's not what I thought vectors do."
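In the two-variable case, the mnemonic can be written like this, with the nabla symbol as a column of operators that each act on f:

```latex
\nabla =
\begin{bmatrix}
  \dfrac{\partial}{\partial x} \\[6pt]
  \dfrac{\partial}{\partial y}
\end{bmatrix},
\qquad
\nabla f =
\begin{bmatrix}
  \dfrac{\partial f}{\partial x} \\[6pt]
  \dfrac{\partial f}{\partial y}
\end{bmatrix}
```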

But you can kind of see where it's going. It's really just a—you could think of it as a memory trick, but it's in some sense a little bit deeper than that. Really, when you take this triangle and you say, "Okay, let's take this triangle," you can kind of imagine multiplying it by f. Really, it's like an operator taking in this function, and it's going to give you another function.

It's like you take this triangle and you put an f right after it, and you can imagine like this part gets, quote unquote, multiplied with f. This part gets, quote unquote, multiplied with f. But really, you're just saying you take the partial derivative with respect to x and then with respect to y, and on and on.
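One way to make the "vector of operators" picture concrete is with higher-order functions: each operator takes a function and hands back another function. This is only a sketch of the mnemonic, with the partial derivatives approximated numerically. The names `partial`, `nabla`, and `apply_nabla` are invented for this illustration.

```python
import math


def partial(i, h=1e-6):
    """Build an operator: give it a function, get back its i-th partial (approximate)."""
    def operator(func):
        def derivative(*point):
            forward = list(point)
            backward = list(point)
            forward[i] += h
            backward[i] -= h
            return (func(*forward) - func(*backward)) / (2 * h)
        return derivative
    return operator


def nabla(n):
    """A 'vector' of n partial derivative operators, one per input variable."""
    return [partial(i) for i in range(n)]


def apply_nabla(operators, func):
    """'Multiply' nabla with func: apply every operator to func, entry by entry."""
    components = [op(func) for op in operators]

    def gradient(*point):
        return [component(*point) for component in components]
    return gradient


def f(x, y):
    return x**2 * math.sin(y)


grad_f = apply_nabla(nabla(2), f)

if __name__ == "__main__":
    # Should be close to [2x sin(y), x^2 cos(y)] evaluated at (2, pi/2), i.e. [4, 0].
    print(grad_f(2.0, math.pi / 2))
```

Here `nabla(2)` holds two operators because f has two inputs; a function with more inputs would simply get a longer list.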

The reason for doing this is this symbol comes up a lot in other contexts. There are two other operators that you're going to learn about called the divergence and the curl. I'll get to those later, all in due time. But it's useful to think about this vector-ish thing of partial derivatives.

I mean, one weird thing about it, you could say, okay, so this nabla symbol is a vector of partial derivative operators; what's its dimension? You know, it's like how many dimensions you got. Because if you had a three-dimensional function, that would mean that you should treat this like it's got three different operators as part of it.

And you know, I'd kind of finish this off down here, and if you had something that was 100-dimensional, it would have 100 dimensions, 100 different operators in it, and that's fine. It's really just, again, kind of a memory trick. So with that, that's how you compute the gradient. Not too much to it; it's pretty much just partial derivatives, but you smack them into a vector. Where it gets fun and where it gets interesting is with the geometric interpretation. I'll get to that in the next couple videos. It's also a super important tool for something called the directional derivative, so you've got a lot of fun stuff ahead.
