Intuition for why independence matters for variance of sum | AP Statistics | Khan Academy
So in previous videos, we talked about the claim that if I have two random variables, X and Y, that are independent, then the variance of the sum of those two random variables, or the difference of those two random variables, is going to be equal to the sum of the variances.
So if you have independent random variables, the variation is going to increase whether you take a sum or a difference. We built a little bit of intuition for that there. What I want to do in this video is build even more intuition and get a gut feeling for why independence is important for making this claim.
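As a quick check of the claim in the independent case, here is a minimal simulation sketch. The normal distributions, the means of 10 and 20, and the standard deviation of 2 are illustrative assumptions, not something from the video; the only thing that matters is that the two lists are generated separately, so they are independent.

```python
import random
from statistics import pvariance

random.seed(0)  # fixed seed so the run is repeatable
n = 50_000

# Assumed for illustration: two independent variables, each with
# standard deviation 2, so each has variance close to 4.
xs = [random.gauss(10, 2) for _ in range(n)]
ys = [random.gauss(20, 2) for _ in range(n)]  # drawn separately from xs

var_x = pvariance(xs)
var_y = pvariance(ys)
var_sum = pvariance([x + y for x, y in zip(xs, ys)])
var_diff = pvariance([x - y for x, y in zip(xs, ys)])

print(f"Var(X) + Var(Y) = {var_x + var_y:.2f}")
print(f"Var(X + Y)      = {var_sum:.2f}")
print(f"Var(X - Y)      = {var_diff:.2f}")
```

All three printed values should land close to 8, up to sampling noise, for both the sum and the difference.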
To get that intuition, let's look at two variables that are definitely random variables but are definitely not independent. Let's let X be equal to the number of hours that the next random person you meet slept yesterday. And let's say that Y is equal to the number of hours that same person was awake yesterday.
Appreciate why these are not independent random variables. One of them is going to completely determine the other. If I slept eight hours yesterday, then I'm going to have been awake for 16 hours. If I slept for 16 hours, then I would have been awake for eight hours.
Even though X and Y are random variables, and there could be variation in X and variation in Y, we know that for any given person X plus Y is always going to be equal to 24 hours. Remember, both are based on that same person.
So these are not independent. If you're given one of the variables, it completely determines what the other variable is. The probability of getting a certain value for one variable is going to be very different depending on what value you got for the other variable. So they're not independent at all.
In this situation, let's just say for the sake of argument that the variance of X is equal to, I don't know, 4. The units for variance would be squared hours, so 4 hours squared. The standard deviation of X in this case would be 2 hours.
And let's say the standard deviation of Y is also equal to 2 hours. The variance of Y would be the square of the standard deviation, so it would also be 4 hours squared.
So if we just tried to blindly apply this claim without thinking about independence, we would say, "Well then, the variance of X plus Y must be equal to the sum of their variances."
So it would be 4 plus 4, which is 8 hours squared. Well, that doesn't make any sense, because we know that the random variable X plus Y is always going to be 24 hours. It's not going to have any variation at all.
So for these two random variables, because they are so connected and not independent at all, the variance of X plus Y is actually going to be zero. There is zero variance here. X plus Y is always going to be 24, at least on Earth, where we have a 24-hour day.
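One way to see where the 8 hours squared went is the general formula for the variance of a sum, which includes a covariance term. This is a standard identity, but it is not covered in this video, so treat the following as a side sketch using the numbers assumed above (variance of X equal to 4, and Y equal to 24 minus X):

```latex
\begin{align*}
\operatorname{Var}(X+Y) &= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y) \\
\operatorname{Var}(Y) &= \operatorname{Var}(24 - X) = \operatorname{Var}(X) = 4 \\
\operatorname{Cov}(X,Y) &= \operatorname{Cov}(X,\,24 - X) = -\operatorname{Var}(X) = -4 \\
\operatorname{Var}(X+Y) &= 4 + 4 + 2(-4) = 0
\end{align*}
```

When X and Y are independent, the covariance is zero, and the formula reduces to the sum of the variances. Here the covariance is as negative as it can be, and it cancels the variance completely.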
I guess if someone lived on another planet or something then it could be slightly different, and we're assuming that we have an exactly 24-hour day on Earth.
So this is to give you a gut sense of why independence matters for making this claim. If you have things that are not independent, the claim doesn't have to hold, and in an example like this one it fails completely.
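The sleep example can also be checked numerically. The sketch below assumes that hours slept follow a normal distribution with mean 8 and standard deviation 2, clipped to the range 0 to 24; that distribution is made up for illustration. Hours awake are then computed as 24 minus hours slept, which is exactly the dependence described above.

```python
import random
from statistics import pvariance

random.seed(1)  # fixed seed so the run is repeatable
n = 50_000

# Assumed sleep distribution: mean 8 hours, standard deviation 2 hours,
# clipped so that no one sleeps less than 0 or more than 24 hours.
slept = [min(24.0, max(0.0, random.gauss(8, 2))) for _ in range(n)]
awake = [24.0 - x for x in slept]  # Y is completely determined by X

var_x = pvariance(slept)
var_y = pvariance(awake)
var_sum = pvariance([x + y for x, y in zip(slept, awake)])
var_diff = pvariance([x - y for x, y in zip(slept, awake)])

print(f"Var(X)          = {var_x:.2f}")
print(f"Var(Y)          = {var_y:.2f}")
print(f"Var(X) + Var(Y) = {var_x + var_y:.2f}  (what the claim would predict)")
print(f"Var(X + Y)      = {var_sum:.6f}  (every total is 24)")
print(f"Var(X - Y)      = {var_diff:.2f}  (X - Y = 2X - 24, so about 4 * Var(X))")
```

The variance of the sum comes out as zero up to floating-point rounding, not the 8 that the claim would predict. The difference fails too, in the other direction: since X minus Y equals 2X minus 24, its variance is four times the variance of X, roughly 16 instead of 8.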