Partial derivatives, introduction


·Nov 11, 2024

So let's say I have some multivariable function like f(x, y). So it'll have a two-variable input, and let's say it's equal to, I don't know, x^2 * y + sin(y). It'll output just a single number, so it's a scalar-valued function.

The question is, how do we take the derivative of an expression like this? There's a certain method called a partial derivative, which is very similar to ordinary derivatives, and I kind of want to show how they're secretly the same thing. So to do that, let me just remind us how we interpret the notation for ordinary derivatives.

So if you have something like f(x) = x^2, and let's say you want to take its derivative. I'll use the Leibniz notation here, df/dx, and let's evaluate it at x = 2. I really like this notation because it's suggestive of what's going on. If we sketch out a graph, this axis represents our output, this one over here represents our input, and x^2 has a certain parabolic shape to it. Something like that.

Then we go to the input x = 2. This little dx here I like to interpret as just a little nudge in the x direction, and it's kind of the size of that nudge. Then df is the resulting change in the output after you make that initial little nudge. So it's this resulting change, and when you're thinking in terms of graphs, this is slope.

You kind of have this rise over run for your ratio between the tiny change of the output that's caused by a tiny change in the input. And of course, this is dependent on where you start. Over here we have x = 2, but you could also think about this without graphs if you really wanted to. You might just think about, you know, your input space as just a number line, and your output space also is just a number line, the output of f over here.

And really, you're just thinking of somehow mapping numbers from here onto the second line. In that case, your initial nudge, your initial little dx, would be some nudge on that number line, and you're wondering how that influences the function itself. So maybe that causes a nudge that's, you know, four times as big, and that would mean your derivative is four at that point.
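To make the nudge picture concrete, here is a small numerical sketch in Python. The helper name `nudge_ratio` and the nudge size `dx = 1e-6` are illustrative choices, not something from the video, and the ratio it returns only approximates the derivative.

```python
def f(x):
    """The single-variable example from above: f(x) = x^2."""
    return x ** 2


def nudge_ratio(func, x, dx=1e-6):
    """Ratio of the output nudge df to the input nudge dx at x.

    Hypothetical helper: a forward difference that approximates df/dx.
    """
    df = func(x + dx) - func(x)  # resulting change in the output
    return df / dx


ratio_at_2 = nudge_ratio(f, 2.0)
print(ratio_at_2)  # close to 4: the output nudge is about four times as big
```

Shrinking `dx` brings the ratio closer to 4, until floating-point rounding starts to dominate.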

So the reason that I'm talking about this is because over in the multivariable world, we can pretty much do the same thing. You know, you could write df/dx and interpret that as saying, "Hey, how does a tiny change in the input in the x direction influence the output?" But this time the way that you might visualize it, you'd be thinking of your input space here.

I'll draw it down here as the xy-plane. So this time, this is not going to be a graph of the function. Instead, every point on the plane is an input, and let's say you were evaluating this at a point like (1, 2). In that case, you go over to the input where x is 1 and y is 2, and then you'd say, "Okay, so this tiny nudge in the input, this tiny change dx, how does that influence the output?"

And in this case, the output — I mean it's still just a number — so maybe we go off to the side here and we draw just like a number line as our output, and somehow we're thinking about the function as mapping points on the plane to the number line. So you say, "Okay, that's your dx, how much does it change the output?" And you know, maybe this time it changes it negatively. It depends on your function, and that would be your df.

And you can also do this with the y variable, right? There's no reason you can't say df/dy, evaluate it at that same point (1, 2), and interpret it in totally the same way. Except this time, your dy would be a change in the y direction. So maybe I should really emphasize here that dx is a change in the x direction, and dy is a change in the y direction.

And maybe when you change your f according to y, it does something different, right? Maybe, you know, the output increases, and it increases by a lot. It's more sensitive to y. Again, it depends on the function. And I'll show you how you can compute something like this in just a moment here.
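The two kinds of nudge can be tried numerically. This is a minimal sketch, assuming the example function is f(x, y) = x^2 * y + sin(y) and using an arbitrary nudge size `h`; the variable names are made up for illustration.

```python
import math


def f(x, y):
    """The two-variable example: f(x, y) = x^2 * y + sin(y)."""
    return x ** 2 * y + math.sin(y)


x0, y0 = 1.0, 2.0  # the input point (1, 2)
h = 1e-6           # size of the nudge (an arbitrary small number)

# Nudge only in the x direction, holding y fixed.
df_from_dx = f(x0 + h, y0) - f(x0, y0)
# Nudge only in the y direction, holding x fixed.
df_from_dy = f(x0, y0 + h) - f(x0, y0)

ratio_x = df_from_dx / h
ratio_y = df_from_dy / h
print(ratio_x, ratio_y)  # the two directions give different sensitivities
```

The same starting point gives two different ratios, one per input direction, which is the sense in which each one tells only part of the story.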

But first, there's kind of an annoying thing associated with partial derivatives, where we don't write them with d's, as in df/dx. People came up with this new notation, mostly just to emphasize to the reader of your equation that there's a multivariable function involved. What you do is you write a d, but it's got kind of a curl at the top. It's this new symbol, ∂, and people will often read it as "partial."

So you might read ∂f/∂y as "partial f, partial y." If you're wondering, by the way, why we call these partial derivatives, it's sort of like ∂f/∂x doesn't tell the full story of how f changes, because it only cares about the x direction. Neither does ∂f/∂y, which only cares about the y direction. So each one is only a small part of the story.

So let's actually evaluate something like this. I'm going to go ahead and clear the board over here. I think the one-dimensional analogy is something we probably have already. So if you're actually evaluating something like this, here I'll write it again up here: ∂f/∂x, and we're doing it at (1, 2). It only cares about movement in the x direction, so it's treating y as a constant. It doesn't even care about the fact that y changes. As far as it's concerned, y is always equal to 2.

So we can just plug that in ahead of time. So I'm going to say ∂/∂x (x^2 * 2 + sin(2)). Instead of writing y, I'm just going to plug in that constant ahead of time, because when you're only moving in the x direction, this is kind of how the multivariable function sees the world. And I'll just keep a little note that we're evaluating this whole thing at x = 1.

And here, this is actually just an ordinary derivative, right? This is an expression that's in x. You're asking how it changes as you shift around x, and you know how to do this. This is just taking the derivative. The derivative of x^2 * 2 is going to be 4x, because x^2 goes to 2x, and sin(2) is just a constant, so its derivative is zero.

And of course, we're evaluating this at x = 1, so your overall answer is going to be 4. And just for practice, let's also do that with the derivative with respect to y. So we look over here, I'm going to write the same thing. You're taking the partial derivative of f with respect to y. We're evaluating it at the same point (1, 2).

This time it doesn't care about movement in the x direction. So as far as it's concerned, that x just stays constant at 1. So we'd write 1^2 * y + sin(y), and you keep a note that you're evaluating this at y = 2.

When you take the derivative, the first term is just 1 * y, so its derivative is 1. The sin(y) over here, its derivative is cosine of y. Again, we're evaluating this whole thing at y = 2, so your overall answer would be, you know, 1 + cosine(2). I'm not sure what the value of cosine(2) is off the top of my head, but that would be your answer.
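The "plug in the constant ahead of time" step can be mirrored in code. The sketch below takes the function to be f(x, y) = x^2 * y + sin(y), freezes one variable to get a single-variable function, and then estimates an ordinary derivative. The helper `derivative` and the function names are hypothetical, and the central difference is only an approximation.

```python
import math


def derivative(func, t, h=1e-6):
    """Central-difference estimate of an ordinary derivative (hypothetical helper)."""
    return (func(t + h) - func(t - h)) / (2 * h)


# Freeze y = 2 ahead of time: what's left is a function of x alone.
def f_with_y_fixed(x):
    return x ** 2 * 2 + math.sin(2)


# Freeze x = 1 ahead of time: what's left is a function of y alone.
def f_with_x_fixed(y):
    return 1 ** 2 * y + math.sin(y)


partial_x_at_point = derivative(f_with_y_fixed, 1.0)  # hand computation gave 4
partial_y_at_point = derivative(f_with_x_fixed, 2.0)  # hand computation gave 1 + cos(2)

print(partial_x_at_point)
print(partial_y_at_point, 1 + math.cos(2))
```

Each frozen function has only one input, so nothing multivariable is left by the time the derivative is taken.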

And this is a partial derivative at a point, but a lot of times you're not asked to just compute it at a point. What you want is a general formula that tells you, "Hey, plug in any point (x, y) and it should spit out the answer." So let me just kind of go over how you would do that. It's actually very similar, but this time, instead of plugging in the constant ahead of time, we just have to pretend that it's a constant.

So let me make a little bit of space for ourselves here. Really, we don't need any of this anymore. I'm going to leave the partial ∂f/∂x, but we want this as a more general function of x and y. Well, we kind of do the same thing. We're going to say that this is, you know, the derivative with respect to x, and I'm using partials just to kind of emphasize that it's a partial derivative, but now we'd write x^2 * y, and kind of emphasize that y is a constant value, plus sin(y).

And again, I'll say y. Here I'm writing the variable y, but we have to pretend like it's a constant. You're pretending that you plugged in 2 or something like that, and you still just take the derivative. So in this case, the derivative of x^2 times a constant is just 2x times that constant, which is 2xy.

And over here, sin(y) is being treated as a constant, and the derivative of a constant is always zero. So 2xy is your partial derivative with respect to x as a more general formula. If you plugged in (1, 2) to this, you'd get 4, which is what we had before. And similarly, if you're doing this with ∂f/∂y, we'd write down all of the same things.

Now you're taking it with respect to y, and I'm just going to copy this formula here actually. But this time, we're considering all of the x's to be constants. So in this case, when you take the derivative with respect to y of some kind of constant times y, it's just going to equal that constant.

So this is going to be x^2, and over here you're taking the derivative of sin(y). There are no x's in there, so it's just an ordinary derivative, and that's cosine(y). And now this is a more general formula: ∂f/∂y is x^2 + cosine(y).

So if you plugged in (1, 2), you know you would get 1 + cosine(2), which is what we had before. So this is really what you'll see for how to compute a partial derivative: you pretend that one of the variables is constant, and you take an ordinary derivative.
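One way to sanity-check the general formulas is to compare them with nudge ratios at a few points. This is a sketch, assuming f(x, y) = x^2 * y + sin(y), with the formulas 2xy and x^2 + cos(y) written out by hand; `numeric_partials` and the sample points are made up for illustration.

```python
import math


def f(x, y):
    return x ** 2 * y + math.sin(y)


def df_dx(x, y):
    """General formula from treating y as a constant: 2xy."""
    return 2 * x * y


def df_dy(x, y):
    """General formula from treating x as a constant: x^2 + cos(y)."""
    return x ** 2 + math.cos(y)


def numeric_partials(x, y, h=1e-6):
    """Central-difference estimates of both partials (hypothetical helper)."""
    px = (f(x + h, y) - f(x - h, y)) / (2 * h)
    py = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return px, py


# A few arbitrary sample points, including (1, 2) from the walkthrough.
for point in [(1.0, 2.0), (0.5, -1.0), (3.0, 0.25)]:
    px, py = numeric_partials(*point)
    print(point, df_dx(*point), px, df_dy(*point), py)
```

At each point the formula value and the numerical estimate should agree to several decimal places; a large mismatch would suggest a slip in the hand computation.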

And in the back of your mind, you're thinking this is because you're just moving in one direction for the input, and you're seeing how that influences things. And then, you know, you might move in the direction of the other input and see how that influences things. In the next video, I'll show you what this means in terms of graphs and slopes, but it's important to understand that graphs and slopes are not the only way to understand derivatives.

Because as soon as you start thinking about vector-valued functions, or functions with inputs of higher dimension than just two, you can no longer think in terms of graphs. But this idea of nudging the input in some direction, seeing how that influences the output, and then taking the ratio of that output nudge to the input nudge, that's a more general way of viewing things.

And that's going to be very helpful moving forward in multivariable calculus.
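The nudge-and-ratio idea carries over to any number of inputs without needing a graph. Here is a minimal sketch: the helper `partial` and the three-input function `g` are assumptions made up for illustration, not something from the video.

```python
def partial(func, point, i, h=1e-6):
    """Estimate the partial derivative of func with respect to input i.

    Hypothetical helper: nudge only coordinate i of `point` by h in each
    direction and take the ratio of the output change to the input change.
    """
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)


# An assumed three-input example, not from the video: g(x, y, z) = x*y + z^2.
def g(x, y, z):
    return x * y + z ** 2


point = (1.0, 2.0, 3.0)
partials = [partial(g, point, i) for i in range(3)]
print(partials)  # approximately [2.0, 1.0, 6.0]
```

Treating the other inputs as constants gives y, x, and 2z by hand, which at (1, 2, 3) matches the three estimates.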
