yego.me
Multivariable chain rule


6m read
·Nov 11, 2024

So I've written here three different functions. The first one is a multivariable function; it has a two-variable input, (x, y), and a single-variable output, (x^2 \cdot y). That's just a number. And then the other two functions are each just regular old single-variable functions.

What I want to do is start thinking about the composition of them. So, I'm going to take as the first component the value of the function (x(t)). So you pump (t) through that, and then you make that the first component of (f), and the second component will be the value of the function (y(t)). The image that you might have in your head for something like this is that (t) lives on a number line of some kind. Then you have (x) and (y), which together form a plane, so that'll be your (x) coordinate and your (y) coordinate in two-dimensional space.

Then you have your output, which is just whatever the value of (f) is. For this whole composition of functions, you're thinking of (x(t), y(t)) as taking a single point in (t) and moving it over to two-dimensional space somewhere. And then from there, our multivariable function takes that back down to a single number. So overall this is just a single-variable function; nothing too fancy is going on in terms of where you start and where you end up. It's just what's happening in the middle.
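The picture above (number line, then plane, then number line) can be sketched in a few lines of Python. This is only an illustration, not something from the video: the name `g` is made up here, and the choices x(t) = cos(t) and y(t) = sin(t) are the ones the walkthrough plugs in a little further on.

```python
import math

def f(x, y):
    """Multivariable function: two inputs, one output."""
    return x**2 * y

def x(t):
    """First intermediary function (assumed to be cos, as used later on)."""
    return math.cos(t)

def y(t):
    """Second intermediary function (assumed to be sin, as used later on)."""
    return math.sin(t)

def g(t):
    """Composition: t -> (x(t), y(t)) -> f(x(t), y(t)). One input, one output."""
    return f(x(t), y(t))

print(g(0.5))  # a single number, even though the middle step is a point in the plane
```

Calling `g` never exposes the two-dimensional middle step, which is why an ordinary derivative of `g` makes sense.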

What I want to know is, what's the derivative of this function? It's just an ordinary derivative, not a partial derivative, because this is a single-variable function: one-variable input, one-variable output. How do you take its derivative? There's a special rule for this; it's called the multivariable chain rule. But you don't actually need it here. So let's walk through this, showing that you don't need it. It's not that you'll never need it; it's just that for computations like this, you could go without it.

It's a very useful theoretical tool, a very useful model to have in mind for what function composition looks like and implies for derivatives in the multivariable world. So let's just start plugging things in here. If I have (f(x(t), y(t))), the first thing I might do is write (f) and, instead of (x(t)), just write in (\cos(t)), since that's the function that I have for (x(t)), and then for (y(t)) we write (\sin(t)). Of course, I'm hoping to take the derivative of this, and then from there, we can go to the definition of (f):

[f(x,y) = x^2 \cdot y]

which means we take that first component, (\cos(t)), and square it, and then we'll multiply it by the second component, (\sin(t)), giving (\cos^2(t) \cdot \sin(t)). Again, we're just taking this derivative and you might be wondering, "Okay, why am I doing this? You're just showing me how to take an ordinary derivative." But the pattern that we'll see is going to lead us to the multivariable chain rule, and it's actually kind of surprising when you see it in this context, because it pops out in a way that you might not expect.

So, chugging along, when you take the derivative of this, you do the product rule: left d-right plus right d-left. In this case, the left is (\cos^2(t)); we just leave that as it is and multiply it by the derivative of the right. The right is (\sin(t)), so its derivative is (\cos(t)), and the first term is (\cos^2(t) \cdot \cos(t)). Then we add to that the right, (\sin(t)), kept unchanged and multiplied by the derivative of the left.

For that, we use the chain rule, the single-variable chain rule, where you take the derivative of the outside. So you plop that two down like you're taking the derivative of (x^2) to get (2x), but you're just writing in (\cos(t)) instead of (x), giving (2\cos(t)), and then you multiply that by the derivative of the inside, which is (-\sin(t)).

I'm afraid I'm going to run off the edge here, certainly with the many, many parentheses that I need. I'm going to rewrite it anyway because there's a certain pattern that I hope to make clear. So let me just copy this side down here. You might be wondering why, but it'll become clear in just a moment.

In this case, I'm going to write this as (2 \cdot \cos(t) \cdot \sin(t)), and then all of that multiplied by (-\sin(t)), so the whole derivative reads (\cos^2(t) \cdot \cos(t) + 2\cos(t)\sin(t) \cdot (-\sin(t))). So this is the derivative of the composition of functions that ultimately was a single-variable function, but that went through two different variables along the way. I just want to make an observation in terms of the partial derivatives of (f).
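Written out in one place, the direct computation described so far (product rule, then the single-variable chain rule on the squared cosine) looks roughly like this. It uses x(t) = cos(t) and y(t) = sin(t) from the walkthrough, and the last line is grouped the way the transcript regroups it:

```latex
\begin{aligned}
\frac{d}{dt}\left[\cos^2(t)\,\sin(t)\right]
  &= \cos^2(t)\cdot\frac{d}{dt}\sin(t) \;+\; \sin(t)\cdot\frac{d}{dt}\cos^2(t) \\
  &= \cos^2(t)\cdot\cos(t) \;+\; \sin(t)\cdot 2\cos(t)\cdot\bigl(-\sin(t)\bigr) \\
  &= \cos^2(t)\cdot\cos(t) \;+\; 2\cos(t)\sin(t)\cdot\bigl(-\sin(t)\bigr)
\end{aligned}
```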

So let me just make a copy of this guy and give ourselves a little bit of room down here. Let's look at the partial derivatives of (f) for a second. If I take the partial derivative with respect to (x), (\partial f/\partial x), (y) is treated as a constant. So I take the derivative of (x^2) to get (2x) and then multiply it by that constant, which is just (y), giving (2xy). If I also do it with respect to (y), (\partial f/\partial y), now (y) looks like a variable and (x) looks like a constant.

So (x^2) also looks like a constant, and for a constant times a variable, the derivative is just that constant: (\partial f/\partial y = x^2). These two patterns come up in the ultimate result that we got. This is the whole reason that I rewrote it: if you look at this (2xy), you can see it over here, where (\cos(t)) corresponds to (x) and (\sin(t)) corresponds to (y) based on our original functions. Then (x^2) here corresponds to (\cos^2(t)), squaring the (x) that we put in there.

If we take the derivatives of our two intermediary functions, the ordinary derivative of (x) with respect to (t) is the derivative of (\cos(t)), which is (-\sin(t)). Similarly, the derivative of (y) with respect to (t), just the ordinary derivative with no partials going on, is (\cos(t)), since the derivative of (\sin) is (\cos).
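As a sanity check on these four derivatives, here is a small numerical sketch using central differences. The helper `central_diff`, the step size `h`, and the sample points `x0`, `y0`, `t0` are all arbitrary choices made for this illustration, not anything from the video.

```python
import math

def f(x, y):
    """The multivariable function from the walkthrough."""
    return x**2 * y

def central_diff(func, a, h=1e-6):
    """Approximate the ordinary derivative of a one-variable function at a."""
    return (func(a + h) - func(a - h)) / (2 * h)

# Arbitrary sample point in the plane (assumption, chosen for illustration).
x0, y0 = 0.7, -1.3

# Partial derivatives: vary one input, hold the other fixed.
df_dx_numeric = central_diff(lambda x: f(x, y0), x0)  # should be close to 2*x0*y0
df_dy_numeric = central_diff(lambda y: f(x0, y), y0)  # should be close to x0**2

# Ordinary derivatives of the intermediary functions at an arbitrary t0.
t0 = 0.4
dx_dt_numeric = central_diff(math.cos, t0)  # should be close to -sin(t0)
dy_dt_numeric = central_diff(math.sin, t0)  # should be close to cos(t0)

print(df_dx_numeric, 2 * x0 * y0)
print(df_dy_numeric, x0**2)
print(dx_dt_numeric, -math.sin(t0))
print(dy_dt_numeric, math.cos(t0))
```

Holding one argument fixed inside a `lambda` is the code version of "treat the other variable as a constant".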

These guys show up too: you see (-\sin(t)) over here and you see (\cos(t)) over here. We could generalize this; we could write it down and say that, at least for this specific example, it looks like the derivative of the composition is this part, which is the partial of (f) with respect to (y), once we've plugged in the intermediary functions, multiplied by this guy, which was the ordinary derivative of (y) with respect to (t).

Very similarly, this guy was the partial of (f) with respect to (x), (\partial f/\partial x), and we're multiplying it by the ordinary derivative of (x) with respect to (t).

Of course, when I write (\partial f/\partial y), what I really mean is that you plug in for (x) and (y) the two coordinate functions (x(t)) and (y(t)). So if I say (\partial f/\partial y) over here, what I really mean is you take that (x^2) and plug in (x(t)) to get (\cos^2(t)), and same deal over here; you're always plugging things in, so you ultimately have a function of (t).
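The "plug in the coordinate functions" step can be made concrete with a short Python sketch that assembles the pattern and compares it with the direct product-rule computation. The function names below are invented for this illustration; the formulas are the ones from the walkthrough.

```python
import math

def df_dx(x, y):
    """Partial of f(x, y) = x^2 * y with respect to x (y held constant)."""
    return 2 * x * y

def df_dy(x, y):
    """Partial of f(x, y) = x^2 * y with respect to y (x held constant)."""
    return x**2

def chain_rule_derivative(t):
    """Partials evaluated at (x(t), y(t)), each times the matching ordinary derivative."""
    x, y = math.cos(t), math.sin(t)
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    return df_dx(x, y) * dx_dt + df_dy(x, y) * dy_dt

def direct_derivative(t):
    """Product rule plus single-variable chain rule applied to cos^2(t) * sin(t)."""
    return math.cos(t) ** 2 * math.cos(t) + math.sin(t) * 2 * math.cos(t) * (-math.sin(t))

# The two computations should agree at every t; a few sample values are checked here.
for t in (0.0, 0.5, 1.0, 2.0):
    assert math.isclose(chain_rule_derivative(t), direct_derivative(t), abs_tol=1e-12)
```

Note that `df_dx` and `df_dy` are only ever called with `cos(t)` and `sin(t)` as arguments, so the final result depends on `t` alone.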

But this right here has a name: this is the multivariable chain rule, and it's important enough that I'll write it out all on its own here. If we take the ordinary derivative with respect to (t) of a composition of a multivariable function, in this case with just two variables, where we're plugging in two intermediary functions (x(t)) and (y(t)), each of which is just single-variable, the result is that we take the partial derivative of (f) with respect to (x) and multiply it by the derivative of (x) with respect to (t), and then we add to that the partial derivative of (f) with respect to (y) multiplied by the derivative of (y) with respect to (t).
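In symbols, the rule just stated can be written as follows, with both partial derivatives understood to be evaluated at (x(t), y(t)). The second line specializes it to the example from the walkthrough, where f(x, y) = x^2 y, x(t) = cos(t) and y(t) = sin(t):

```latex
\begin{aligned}
\frac{d}{dt}\, f\bigl(x(t),\, y(t)\bigr)
  &= \frac{\partial f}{\partial x}\cdot\frac{dx}{dt}
   \;+\; \frac{\partial f}{\partial y}\cdot\frac{dy}{dt} \\
  &= 2\cos(t)\sin(t)\cdot\bigl(-\sin(t)\bigr) \;+\; \cos^2(t)\cdot\cos(t)
\end{aligned}
```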

So this entire expression here is what you might call the simple version of the multivariable chain rule. There's a more general version, and we'll build up to it, but this is the simplest example you can think of, where you start with one dimension, move over to two dimensions somehow, and then move from those two dimensions back down to one.

So that's that, and in the next video I'm going to talk about the intuition for why this is true. Here I just went through an example and showed that it happens to be true, that it fits this pattern. But there's a very nice line of reasoning for where this comes from, and I'll also talk about a more general form of the rule.

Once we start using vector notation, it makes things look very clean, and I might even get around to a more formal argument for why this is true. So see you next video.
