Gradient and graphs


5m read
·Nov 11, 2024

So here I'd like to talk about what the gradient means in the context of the graph of a function.

In the last video, I defined the gradient, but let me just take a function here. The one that I have graphed is f(x, y) = x^2 + y^2. So, it has a two-dimensional input, which we think of as the XY plane, and a one-dimensional output that's just the height of the graph above that plane.

In the last video, I defined the gradient to be a certain operator. An operator just means you take in a function and you output another function, and we write it with this upside-down triangle, ∇. So ∇f is another function of X and Y, but this time it has a vector-valued output. The two components of its output are the partial derivatives: the partial of f with respect to X and the partial of f with respect to Y.

For a function like this, we can actually evaluate it. Let's take a look. The first component is the partial derivative with respect to X, so it looks at the x^2 term and says, "You look like a variable to me. I'm going to take your derivative: your derivative is 2x."

But the y^2 term just looks like a constant as far as the partial with respect to X is concerned, and the derivative of a constant is zero. When you take the partial derivative with respect to Y, things reverse. It looks at the x^2 term and says, "You look like a constant, your derivative is zero," but it looks at the y^2 term and says, "Ah, you look like a variable, your derivative is 2y." So the gradient of f is (2x, 2y).
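If you want to check that for yourself, here is a minimal Python sketch. It isn't from the video; it just writes down f(x, y) = x^2 + y^2 and the gradient (2x, 2y), then estimates the two partial derivatives numerically with central differences. The names `f`, `grad_f`, and `numerical_grad`, the step size `h`, and the sample point are all assumptions made for illustration.

```python
def f(x, y):
    """The function graphed in the video: f(x, y) = x^2 + y^2."""
    return x**2 + y**2


def grad_f(x, y):
    """Gradient worked out by hand: (df/dx, df/dy) = (2x, 2y)."""
    return (2 * x, 2 * y)


def numerical_grad(func, x, y, h=1e-6):
    """Estimate both partial derivatives with central differences.

    Each partial holds the other variable fixed, which is exactly the
    "you look like a constant" step from the hand calculation.
    """
    df_dx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    df_dy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (df_dx, df_dy)


point = (1.5, -0.5)  # arbitrary sample point
analytic = grad_f(*point)
numeric = numerical_grad(f, *point)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed vectors should agree to several decimal places, which is a quick sanity check on the hand-computed partial derivatives.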

So this ultimate function that we get, the gradient, takes in a two-variable input (x, y), some point on this plane, and outputs a vector, so it can nicely be visualized with a vector field. I have another video on vector fields if you're feeling unsure, but I want you to just take a moment, pause if you need to, and try to guess what this vector field will look like.

I'm going to show you in a moment, but what's it going to look like, the one that takes in (x, y) and outputs (2x, 2y)? All right, have you done it? Have you thought about what it's going to look like?

Here's what we get: it's a bunch of vectors pointing away from the origin. The basic reason is that if you take any given input point, say with coordinates (x, y), you can think of that point as a vector drawn from the origin out to it.

But the output is two times that vector. So when we attach that output to the original point, you get something that's two times that original vector but pointing in the same direction, which is away from the origin. I kind of drew it poorly here, and of course, when we draw vector fields, we don't usually draw them to scale.

We scale them down just so that things don't look as cluttered. That's why everything here looks the same length, but color indicates length. So you should think of these red guys as being really long, the blue ones as being really short.
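Here is a small Python sketch of that drawing convention, assuming the same gradient (2x, 2y). It isn't the code behind the animation; `field_samples` and the sample points are hypothetical names and values chosen for illustration. For each point it stores a unit-length direction (what you would draw) and the true length separately (what the color would encode).

```python
import math


def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + y^2."""
    return (2 * x, 2 * y)


def field_samples(points):
    """For each point, split the gradient into a drawing direction and a length."""
    samples = []
    for (x, y) in points:
        gx, gy = grad_f(x, y)
        length = math.hypot(gx, gy)
        # Normalize so every arrow is drawn the same size; at the origin the
        # gradient is zero, so there is no direction to draw.
        if length > 0:
            direction = (gx / length, gy / length)
        else:
            direction = (0.0, 0.0)
        samples.append({"point": (x, y), "direction": direction, "length": length})
    return samples


points = [(1.0, 0.0), (0.0, -2.0), (3.0, 4.0)]  # illustrative sample points
samples = field_samples(points)
for s in samples:
    print(s["point"], "->", s["direction"], "length", s["length"])
```

Every direction comes out parallel to the point's own position vector, which is the "pointing away from the origin" pattern, while the stored length grows as the point moves farther out.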

So what does this have to do with the graph of the function? There's actually a really cool interpretation. Imagine that you are just walking along this graph; you know, you're a hiker, and this is a mountain.

You picture yourself at any point on this graph. Let's say—what color should I use?—let's say you’re sitting at a point like this, and you say, "What direction should I walk to increase my altitude the fastest?" You want to get uphill as quickly as possible.

From that point, you might walk, you know, what looks like straight up there. You certainly wouldn't go around in this way; you wouldn't go down. So you might go straight up there. If you project your point down onto the input space—so this is the point above which you are—that vector, the one that's going to get you going uphill the fastest, the direction you should walk for this graph, it should kind of make sense, is directly away from the origin.

Here, I'll erase this, because once I start moving things, it won't stick. If you look at things from the very bottom, then at any point you are on the mountain, on the graph here, if you want to increase your altitude the fastest, you should just go directly away from the origin, because that's where it's steepest. And all of these vectors are also pointing directly away from the origin.

So, people will say, "The gradient points in the direction of steepest ascent." That might even be worth writing down: direction of steepest ascent.
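One way to test the hiker picture numerically is a brute-force search, sketched below in Python. This is an illustration under stated assumptions, not a proof and not something from the video: `best_direction`, the step length, the number of angles tried, and the starting point are all made up for the example. It takes a tiny step in many directions around a point, keeps the direction that gains the most height, and compares that with the direction of the gradient (2x, 2y).

```python
import math


def f(x, y):
    """Height of the graph above the point (x, y)."""
    return x**2 + y**2


def best_direction(func, x, y, step=1e-3, n=3600):
    """Try n evenly spaced unit directions; return the angle with the largest gain."""
    best_angle, best_gain = 0.0, -math.inf
    for k in range(n):
        angle = 2 * math.pi * k / n
        # Height gained by a short walk of length `step` in this direction.
        gain = func(x + step * math.cos(angle), y + step * math.sin(angle)) - func(x, y)
        if gain > best_gain:
            best_angle, best_gain = angle, gain
    return best_angle


x0, y0 = 1.0, 1.0  # where the hiker stands, projected onto the input plane
angle = best_direction(f, x0, y0)
gradient_angle = math.atan2(2 * y0, 2 * x0)  # direction of the gradient (2x, 2y)
print("best walking angle:", angle)
print("gradient angle:    ", gradient_angle)
```

The two angles should match to within the spacing of the search grid, so the best walking direction found by trial and error lines up with the gradient.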

And let's just see what that looks like in the context of another example. So I'll pull up another graph here, along with its vector field. This graph has all negative values; it's all below the XY plane, and it's got these two different peaks.

I've also drawn the gradient field on top, which is the word for the vector field representing the gradient. You'll notice that near a peak, all of the vectors are pointing in the uphill direction, sort of telling you to go towards that peak in some way.

And as you look around, you can see that this very top one, for example, stems from a point that corresponds to something just a little bit shy of the peak there. Every vector is telling you to go uphill; each one tells you which way to walk to increase the altitude on the graph the fastest.

It's the direction of steepest ascent, and that's what the direction means. But what does the length mean? Well, take a look at these red vectors here. Red means that they should be considered very, very long, and the points they correspond to on the graph are way off-screen for us, because this graph gets really steep and really negative very fast.

So the points these correspond to have really, really steep slopes, whereas these blue ones over here—you know, it’s kind of a relatively shallow slope. By the time you’re getting to the peak, things start leveling off.

So the length of the gradient vector actually tells you the steepness of that direction of steepest ascent.
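To see the length claim with numbers, here is a short Python sketch using f(x, y) = x^2 + y^2 from the first example, since the two-peak function isn't given in the transcript. It measures the slope you would actually feel walking in the gradient direction, using a central difference, and compares it with the length of the gradient. The helper name `slope_along_gradient`, the step `h`, and the two sample points are assumptions for illustration.

```python
import math


def f(x, y):
    return x**2 + y**2


def grad_f(x, y):
    return (2 * x, 2 * y)


def slope_along_gradient(x, y, h=1e-5):
    """Rate of change of f when walking in the unit direction of the gradient."""
    gx, gy = grad_f(x, y)
    length = math.hypot(gx, gy)
    ux, uy = gx / length, gy / length  # unit vector; assumes a nonzero gradient
    # Central difference along the line through (x, y) in direction (ux, uy).
    return (f(x + h * ux, y + h * uy) - f(x - h * ux, y - h * uy)) / (2 * h)


far_point = (3.0, 4.0)   # far from the origin, where the graph is steep
near_point = (0.3, 0.4)  # near the origin, where the graph is shallow

far_length = math.hypot(*grad_f(*far_point))
near_length = math.hypot(*grad_f(*near_point))
far_slope = slope_along_gradient(*far_point)
near_slope = slope_along_gradient(*near_point)
print("far:  gradient length", far_length, "measured slope", far_slope)
print("near: gradient length", near_length, "measured slope", near_slope)
```

At both points the measured slope should agree with the gradient's length, and the far point, which would be drawn as a "red" vector, comes out much steeper than the near one.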

But one thing I want to point out here: looking at it, it doesn't immediately make sense why just throwing the partial derivatives into a vector is going to give you this direction of steepest ascent.

Ultimately it will. We're going to talk through that, and I hope to make that connection pretty clear. But unless you're some kind of intuitive genius, I don't think that connection is at all obvious at first.

But you will see it in due time; it’s going to require something called the directional derivative.

See you next video.
