yego.me

Multivariable chain rule


·Nov 11, 2024

So I've written here three different functions. The first one is a multivariable function; it has a two-variable input, (x, y), and a single-variable output, that's (x^2 \cdot y). That's just a number. And then the other two functions, (x(t) = \cos(t)) and (y(t) = \sin(t)), are each just regular old single-variable functions.

What I want to do is start thinking about the composition of them. So, I'm going to take as the first component the value of the function (x(t)). So you pump (t) through that, and then you make that the first component of (f), and the second component will be the value of the function (y(t)). The image that you might have in your head for something like this is you can think of (t) as just living on a number line of some kind. Then you have (x) and (y), which is just a plane, so that'll be, you know, your (x) coordinate and your (y) coordinate in two-dimensional space.

Then you have your output, which is just whatever the value of (f) is. For this whole function, for this whole composition of functions, you're thinking of (x(t), y(t)) as taking a single point in (t) and kind of moving it over to two-dimensional space somewhere. And then from there, our multivariable function takes that back down. So this is just a single variable function, nothing, you know, nothing too fancy going on in terms of where you start and where you end up. It's just what's happening in the middle.
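As a minimal sketch of this composition, assuming the three functions are (f(x, y) = x^2 \cdot y), (x(t) = \cos(t)), and (y(t) = \sin(t)) as used in the rest of the walkthrough (the Python names `x_of_t`, `y_of_t`, and `g` are illustrative choices, not from the video):

```python
import math

def f(x, y):
    """Two-variable input, single-number output: x^2 * y."""
    return x**2 * y

def x_of_t(t):
    """First intermediary function, assumed to be cos(t)."""
    return math.cos(t)

def y_of_t(t):
    """Second intermediary function, assumed to be sin(t)."""
    return math.sin(t)

def g(t):
    """Composition: t -> (x(t), y(t)) -> f(x(t), y(t))."""
    return f(x_of_t(t), y_of_t(t))

print(g(0.0))  # cos(0)^2 * sin(0) = 0.0
```

The composed function `g` takes one number in and gives one number out, which is why an ordinary derivative applies to it.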

What I want to know is, what's the derivative of this function? If I take this, and it's just an ordinary derivative, not a partial derivative, because this is a single variable function: one variable input, one variable output. How do you take its derivative? There's a special rule for this; it's called the Chain Rule, the multivariable chain rule. But you don't actually need it. So let's actually walk through this, showing that you don't need it. It's not that you'll never need it; it's just for computations like this, you could go without it.

It's a very useful theoretical tool, a very useful model to have in mind for what function composition looks like and implies for derivatives in the multivariable world. So let's just start plugging things in here. If I have (f(x(t), y(t))), the first thing I might do is write (f) and, instead of (x(t)), just write in (\cos(t)), since that's the function that I have for (x(t)), and then (y(t)) we replace with (\sin(t)). Of course, I'm hoping to take the derivative of this, and then from there, we can go to the definition of (f):

[f(x,y) = x^2 \cdot y]

which means we take that first component squared, so we'll take that first component (\cos(t)) and then square it, square that guy, and then we'll multiply it by the second component (\sin(t)). Again, we're just taking this derivative, and you might be wondering, "Okay, why am I doing this? You're just showing me how to take a first derivative, an ordinary derivative." But the pattern that we'll see is going to lead us to the multivariable chain rule, and it's actually kind of surprising when you see it in this context, 'cause it pops out in a way that you might not expect things to pop out.

So, continuing or chugging along, when you take the derivative of this, you do the product rule: left d-right plus right d-left. So in this case, the left is (\cos^2(t)); we just leave that as it is, (\cos^2(t)), and multiply it by the derivative of the right, d-right. The right is (\sin(t)), so its derivative is going to be (\cos(t)). Then we add to that the right, (\sin(t)), which is, you know, keep that right side unchanged, multiplied by the derivative of the left.

For that, we use the chain rule, the single variable chain rule, where you think of taking the derivative of the outside. So you plop that two down like you're taking the derivative of (x^2) to get (2x), but you're just writing in (\cos(t)) instead of (x), so (2\cos(t)), and then you multiply that by the derivative of the inside (that's a tongue twister), which is negative (\sin(t)).
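Written out, the single variable chain rule step for the left factor looks like this (a worked line added for reference, using the same (\cos(t)) as above):

```latex
\[
\frac{d}{dt}\cos^2(t)
  = 2\cos(t)\cdot\frac{d}{dt}\cos(t)
  = 2\cos(t)\cdot\bigl(-\sin(t)\bigr)
\]
```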

I'm afraid I'm going to run off the edge here, certainly with the many, many parentheses that I need. I'm going to rewrite it anyway, because there's a certain pattern that I hope to make clear. So let me just rewrite this side, just copy that down here. You might be wondering why, but it'll become clear in just a moment why I want to do this.

In this case, I'm going to write this as (2 \cdot \cos(t) \cdot \sin(t)), and then all of that multiplied by negative (\sin(t)). So this is the derivative of the composition of functions that ultimately was a single variable function, but it kind of went through two different variables. I just want to make an observation in terms of the partial derivatives of (f).
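Putting the two product rule terms together, the full derivative can be written as follows (a summary of the steps so far, left unsimplified on purpose so the pattern stays visible):

```latex
\[
\frac{d}{dt}\Bigl[\cos^2(t)\,\sin(t)\Bigr]
  = \cos^2(t)\cdot\cos(t)
  \;+\; 2\cos(t)\sin(t)\cdot\bigl(-\sin(t)\bigr)
\]
```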

So let me just make a copy of this guy, give ourselves a little bit of room down here, just paste that over here. So let's look at the partial derivatives of (f) for a second here. If I take the partial derivative with respect to (x), (\partial f/\partial x), that means (y) is treated as a constant. So I take the derivative of (x^2) to get (2x) and then multiply it by that constant, which is just (y), so that's (2xy). If I also do it with respect to (y), (\partial f/\partial y), now (y) looks like a variable and (x) looks like a constant.

So (x^2) also looks like a constant, and for a constant times a variable, the derivative is just that constant, so that's (x^2). These two, their pattern comes up in the ultimate result that we got. This is the whole reason that I rewrote it: if you look at this (2xy), you can see that over here, where (\cos(t)) corresponds to (x) and (\sin(t)) corresponds to (y), based on our original functions. Then (x^2) here corresponds with squaring the (x) that we put in there, the (\cos^2(t)).

If we take the derivative of our two intermediary functions, the ordinary derivative of (x) with respect to (t), that's the derivative of (\cos(t)), which is negative (\sin(t)), and then similarly the derivative of (y), just the ordinary derivative, no partial going on here, with respect to (t); that's equal to (\cos(t)), since the derivative of (\sin) is (\cos).
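For reference, here are the four pieces side by side, assuming (f(x, y) = x^2 \cdot y), (x(t) = \cos(t)), and (y(t) = \sin(t)):

```latex
\begin{align*}
\frac{\partial f}{\partial x} &= 2xy, &
\frac{\partial f}{\partial y} &= x^2, \\
\frac{dx}{dt} &= -\sin(t), &
\frac{dy}{dt} &= \cos(t).
\end{align*}
```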

These guys show up, right? You see (-\sin(t)) over here and you see (\cos(t)) show up over here. We could generalize this; we could write it down and say, at least for this specific example, it looks like the derivative of the composition is this part, which is the partial of (f) with respect to (y), right? That's kind of what it looks like here.

That's once we've plugged in the intermediary functions, and it's multiplied by this guy, which was the ordinary derivative of (y) with respect to (t). Very similarly, this guy was the partial of (f) with respect to (x), (\partial f/\partial x), and we're multiplying it by the ordinary derivative of (x) with respect to (t).

Of course, when I write this (\partial f/\partial y), what I really mean is you plug in for (x) and (y) the two coordinate functions (x(t)) and (y(t)). So if I say (\partial f/\partial y) over here, what I really mean is you take that (x^2) and then you plug in (x(t)) to get (\cos^2(t)), and same deal over here; you're always plugging things in, so you ultimately have a function of (t).

But this right here has a name: this is the multivariable chain rule, and it's important enough that I'll just write it out all on its own here. If we take the ordinary derivative with respect to (t) of a composition of a multivariable function, in this case just two variables, (f(x(t), y(t))), where we're plugging in two intermediary functions (x(t)) and (y(t)), each of which is just single variable, the result is that we take the partial derivative of (f) with respect to (x) and we multiply it by the derivative of (x) with respect to (t), and then we add to that the partial derivative of (f) with respect to (y) multiplied by the derivative of (y) with respect to (t).
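In symbols, the rule just described reads as follows, with both partial derivatives evaluated at the point ((x(t), y(t))):

```latex
\[
\frac{d}{dt}\, f\bigl(x(t),\, y(t)\bigr)
  = \frac{\partial f}{\partial x}\cdot\frac{dx}{dt}
  + \frac{\partial f}{\partial y}\cdot\frac{dy}{dt}
\]
```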

So this entire expression here is what you might call the simple version of the multivariable chain rule. There's a more general version, and we'll kind of build up to it, but this is the simplest example you can think of, where you start with one dimension and then you move over to two dimensions somehow, and then you move from those two dimensions down to one.
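One way to sanity-check the pattern for this specific example is a small numerical comparison. This is only a sketch under the assumption that (f(x, y) = x^2 \cdot y), (x(t) = \cos(t)), and (y(t) = \sin(t)); the function names, the sample values of `t`, and the step size `h` are illustrative choices, not from the video. It compares the direct product rule result, the chain rule expression, and a central finite difference:

```python
import math

def direct_derivative(t):
    # Product rule plus single-variable chain rule on cos^2(t) * sin(t).
    return (math.cos(t) ** 2) * math.cos(t) \
        + 2 * math.cos(t) * math.sin(t) * (-math.sin(t))

def chain_rule_derivative(t):
    # Multivariable chain rule: df/dx * dx/dt + df/dy * dy/dt,
    # with the partials evaluated at (x(t), y(t)).
    x, y = math.cos(t), math.sin(t)
    df_dx = 2 * x * y      # partial of x^2 * y with respect to x
    df_dy = x ** 2         # partial of x^2 * y with respect to y
    dx_dt = -math.sin(t)   # derivative of cos(t)
    dy_dt = math.cos(t)    # derivative of sin(t)
    return df_dx * dx_dt + df_dy * dy_dt

def finite_difference(t, h=1e-6):
    # Central difference approximation of the derivative of the composition.
    def g(s):
        return (math.cos(s) ** 2) * math.sin(s)
    return (g(t + h) - g(t - h)) / (2 * h)

for t in (0.0, 0.7, 2.0):
    print(t, direct_derivative(t), chain_rule_derivative(t), finite_difference(t))
```

The first two columns should agree up to rounding, and the finite difference should land close to both. That agreement is a check on this one example, not a proof of the rule.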

So this is that, and in the next video I'm going to talk about the intuition for why this is true. You know, here I just went through an example and showed, oh, it just happens to be true, it fits this pattern. But there's a very nice line of reasoning for where this comes about, and I'll also talk about a more generalized form, where you'll see we start using vector notation; it makes things look very clean. I might even get around to a more formal argument for why this is true. So see you next video.
