
Example: Analyzing the difference in distributions | Random variables | AP Statistics | Khan Academy


6m read
·Nov 11, 2024

Suppose that men have a mean height of 178 centimeters, with a standard deviation of 8 centimeters. Women have a mean height of 170 centimeters, with a standard deviation of 6 centimeters. The male and female heights are each normally distributed. We independently randomly select a man and a woman. What is the probability that the woman is taller than the man?

So, I encourage you to pause this video and think through it. I'll give you a hint: what if we were to define the random variable M as equal to the height of a randomly selected man? What if we defined the random variable W to be equal to the height of a randomly selected woman? And what if we defined a third random variable in terms of these first two?

So, let me call this D for difference, and it is equal to the difference in height between a randomly selected man and a randomly selected woman. So, D, the random variable D, is equal to the random variable M minus the random variable W.

So, the first two are clearly normally distributed; they tell us that right over here. The male and female heights are each normally distributed. We also know (or you're about to know) that the difference of independent random variables that are each normally distributed is also going to be normally distributed.

So, given this, can you think about how to tackle this question? The probability that the woman is taller than the man?

Alright, now let's work through this together. To help us visualize, I'll draw the normal distribution curves for these three random variables. So, this first one is for the variable M, and so right here in the middle, that is the mean of M, and we know that this is going to be equal to 178 centimeters.

We'll assume everything is in centimeters. We also know that it has a standard deviation of eight centimeters. So, for example, this is one standard deviation above the mean and this is one standard deviation below. This point right over here would be eight centimeters more than 178, so that would be 186, and this would be eight centimeters below 178, so this would be 170 centimeters.

So this is for the random variable M. Now, let's think about the random variable W. The mean of W, they tell us, is 170. The standard deviation is six, so one standard deviation above the mean is six centimeters above the mean, which is 176, and one standard deviation below the mean is six centimeters below, which is 164.

Now, let's think about the difference between the two, the random variable D. So let me think about this a little bit. The mean of D is going to be equal to the difference of the means of these random variables. So it's going to be equal to the mean of M minus the mean of W.

Well, we know both of these; this is going to be 178 minus 170. So let me write that down: this is equal to 178 centimeters minus 170 centimeters, which is going to be equal to… I'll do it in this color. This is going to be equal to 8 centimeters. So this is 8 right over here.

Now, what about the standard deviation? Assuming these two random variables are independent— and they tell us that we are independently randomly selecting a man and a woman— the height of the man shouldn't affect the height of the woman, or vice versa. Assuming that these two are independent variables, if you take the sum or the difference of these, then the spread will increase, but you won't just add the standard deviations.

What you would actually do is say the variance of the difference is going to be the sum of these two variances. So let me write that down. I could write variance with var or I could write it as a standard deviation squared. So let me write that: the standard deviation of D, of our difference squared (which is the variance), is going to be equal to the variance of our variable M plus the variance of our variable W.

Now, this might be a little bit counterintuitive. This might have made sense to you if this was plus right over here, but it doesn't matter if we are adding or subtracting. If these are truly independent variables, then regardless of whether we're adding or subtracting, you would add the variances.
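One way to see why the variances add even for a difference is to treat M minus W as M plus negative W. This is a short sketch that assumes M and W are independent, and it uses the standard fact that scaling a variable by a constant scales its variance by the square of that constant:

```latex
\begin{aligned}
\operatorname{Var}(M - W) &= \operatorname{Var}(M) + \operatorname{Var}(-W) \\
                          &= \operatorname{Var}(M) + (-1)^2 \operatorname{Var}(W) \\
                          &= \sigma_M^2 + \sigma_W^2
\end{aligned}
```

The negative sign gets squared away, which is why the spread of the difference grows just as the spread of a sum would.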

And so we can figure this out. The standard deviation of variable M is 8, so 8 squared is going to be 64. And then the standard deviation of W is 6, and six squared is going to be 36. You add these two together, and this is going to be equal to 100.

And so the variance of this distribution right over here is going to be equal to 100. Well, what's the standard deviation of that distribution? Well, it's going to be equal to the square root of the variance, so the square root of 100, which is equal to 10.
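The arithmetic for the mean and standard deviation of D can be checked with a few lines of Python. This is only a sketch of the calculation described here; the variable names are made up for illustration, and it assumes the two heights are independent:

```python
import math

# Values given in the problem (centimeters).
mean_m, sd_m = 178.0, 8.0   # men
mean_w, sd_w = 170.0, 6.0   # women

# Mean of D = M - W is the difference of the means.
mean_d = mean_m - mean_w

# Assuming independence, the variances add even though we subtract.
var_d = sd_m**2 + sd_w**2
sd_d = math.sqrt(var_d)

print(mean_d, var_d, sd_d)  # 8.0 100.0 10.0
```

Note that simply adding the standard deviations would give 14, not 10, which is why the calculation goes through the variances.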

So, for example, one standard deviation above the mean is going to be 18. One standard deviation below the mean is going to be equal to negative 2. And so now, using this distribution, we can actually answer this question: what is the probability that the woman is taller than the man?

Well, we can rewrite that question as saying: what is the probability that the random variable D is… what conditions would it be? Pause the video and think about it. Well, the situations where the woman is taller than the man; if the woman is taller than the man, then this is going to be a negative value. Then D is going to be less than zero.

So, what we really want to do is figure out the probability that D is less than zero. If we say that 0 is right over here on our distribution, so that is D equal to 0, we want to figure out the area under the curve to the left of that.

So we want to figure out this entire area. There are a couple of ways you could do this. You could figure out the Z-score for D equaling 0, and that's pretty straightforward. You could just say this Z is equal to 0 minus our mean of 8, all divided by our standard deviation of 10.

So it's negative 8 over 10, which is equal to negative 8 tenths. So you could look up a Z table and say, what is the total area under the curve below Z is equal to negative 0.8? Another way you could do this is you could use a graphing calculator. I have a TI-84 here, where you have a normal cumulative distribution function.
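If you don't have a Z table handy, the same lookup can be sketched in Python. The helper name `standard_normal_cdf` is made up for this example; it relies on the standard identity between the normal CDF and the error function, which Python provides as `math.erf`:

```python
import math

def standard_normal_cdf(z: float) -> float:
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Z-score for D = 0, with mean 8 and standard deviation 10.
z = (0 - 8) / 10
p = standard_normal_cdf(z)

print(z, round(p, 4))  # -0.8 0.2119
```

This is the same value a Z table lists for negative 0.8.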

I'm going to press 2nd, vars, and that gets me to distribution. So I have these various functions; I want the normal cumulative distribution function. So that is choice 2. And then the lower bound—well, I want to go to negative infinity. Well, calculators don't have a negative infinity button, but you could put in a very, very, very, very negative number that for our purposes is equivalent to negative infinity.

So, we could say negative 1 times 10 to the 99th power, and the way we do that is 2nd, then the EE key. These two capital E's are saying essentially "times 10 to the", and I'll say 99th power. So this is a very, very, very negative number. For the upper bound here (let me delete this), we want to go to zero.

We're finding the area from negative infinity all the way to 0. The mean here—well, we've already figured that out; the mean is 8. And then the standard deviation here we figured this out too; this is equal to 10.

And so when we pick this, we go back to the main screen. We could have just typed this in directly on the main screen. This says: look, we're looking at a normal distribution, and we want to find the cumulative area between two bounds. In this case, it's from negative infinity to 0, where the mean is 8 and the standard deviation is 10.

We press enter and we get approximately 0.212. So, what is the probability that the woman is taller than the man? Approximately 0.212, or a 21.2 percent chance of that happening; a little better than 1 in 5.
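The calculator step can also be reproduced in Python as a cross-check. This sketch uses `NormalDist` from the standard library's `statistics` module, whose subtraction operator assumes the two variables are independent. The simulation at the end is an extra sanity check that is not part of the video; its sample size and seed are arbitrary choices:

```python
import random
from statistics import NormalDist

# Heights in centimeters, as given in the problem.
men = NormalDist(mu=178, sigma=8)
women = NormalDist(mu=170, sigma=6)

# Subtracting two NormalDist objects treats them as independent,
# giving a distribution with mean 8 and standard deviation 10.
d = men - women

# P(D < 0): the area from negative infinity up to 0.
exact = d.cdf(0)
print(round(exact, 3))  # 0.212

# Sanity check by simulation: draw many independent man/woman pairs
# and count how often the woman is taller.
rng = random.Random(0)
n = 200_000
hits = sum(rng.gauss(170, 6) > rng.gauss(178, 8) for _ in range(n))
estimate = hits / n
print(estimate)  # should land close to the exact value
```

The exact value matches the 0.212 from the calculator, and the simulated fraction should agree to about two decimal places.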
