Sampling distribution of the difference in sample proportions | AP Statistics | Khan Academy


Nov 10, 2024

We're told: suppose that eight percent of all cars produced at plant A have a certain defect, and six percent of all cars produced at plant B have this defect. Each month, a quality control manager takes separate random samples of 200 of the more than 3000 cars produced at each plant. The manager looks at the difference between the proportions of cars with the defect in each sample, so they're looking at the difference of sample proportions every month.

Describe the distribution of the difference of sample proportions in terms of its mean, standard deviation, and shape. So let's take these step by step. First, let's think about the mean of the difference of our sample proportions. Pause this video and try to figure out what that's going to be.

Well, we have seen in previous videos that the mean of the difference of two random variables is the same as the difference of their means. Or another way to think about it: if we want to figure out the mean of the sample proportion from plant A minus the sample proportion from plant B, this is just going to be equal to the mean of the sample proportion from plant A minus the mean of the sample proportion from plant B.

Now, what are these going to be equal to? Well, what's the mean of the sample proportion of plant A? It's just going to be the true population proportion for plant A. And they tell us that. They tell us that 8 percent of all cars produced at plant A have a certain defect. So this could be eight percent or we could write it as 0.08. And then from that, we are going to subtract the mean of the sample proportion from plant B. We know what that mean's going to be. The mean of a sample proportion is going to be the population proportion, the parameter of the population, which we know for plant B is six percent, or 0.06.

And then that gets us a mean of the difference of 0.02 or 2 percent. Having a 2 percent difference in defect rate would be the mean. Now let's think about the standard deviation. Instead of thinking in terms of standard deviation, let's think about the square of the standard deviation, which is variance. From there, we can go back to standard deviation by taking a square root.
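Written out in symbols, the mean calculation above looks like the following. The notation is an assumption for illustration and does not come from the video: $\hat{p}_A$ and $\hat{p}_B$ stand for the two sample proportions, and $p_A$ and $p_B$ for the population proportions.

```latex
\mu_{\hat{p}_A - \hat{p}_B}
  = \mu_{\hat{p}_A} - \mu_{\hat{p}_B}
  = p_A - p_B
  = 0.08 - 0.06
  = 0.02
```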

So if we're looking at the variance, let me write it this way: if we're looking at the variance of the difference of the sample proportions, so the sample proportion from plant A minus the sample proportion from plant B. But just as a review, if you assume that we're sampling independently from each of the plants, so what we're sampling from plant A does not affect what we're sampling from plant B or vice versa, then we can add the variances.

So this is going to be equal to the variance of the sample proportion from plant A plus the variance of the sample proportion from plant B. Some of you might be saying, "Wait, aren't we taking the difference of sample proportions here? Why are we adding?" The thing to remember is that variance is a measure of spread. Whether you're taking the difference of random variables or the sum of them, when you combine more variables, you're going to have more spread.

So regardless of whether the sign between the sample proportions is a minus or a plus, the sign between the variances is going to be a plus. So what is this going to be equal to? Well, we could take each of these terms. What's going to be the variance of the sample proportion from plant A? Well, suppose that every time we looked at one of the cars, we put it back into the mix. If we were sampling with replacement like that, which means that each of our observations is independent of the others, we have a formula.

We know that this variance would be the population proportion of plant A times one minus the population proportion of plant A divided by the number that we sampled from plant A. Now, in the scenario that we are talking about, we didn't sample with replacement. We just took 200 at a time and looked at them. We didn't take one at a time and replace it and do that 200 times. But we also know that this is a pretty good approximation even when you are not sampling with replacement if your sample is less than ten percent of the population.

And 200 is less than ten percent of 3000, so this is a pretty good approximation, and it's the one you would use in a first-year statistics class. Of course, we can use the same logic for plant B. Its variance is going to be equal to the population proportion from plant B times one minus the population proportion from plant B, all of that over the sample size from plant B.
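In symbols, the variance reasoning above can be summarized as follows. The notation is assumed for illustration: $n_A$ and $n_B$ are the two sample sizes. The approximation sign reflects that the formula is exact only for sampling with replacement, and the addition relies on the two samples being independent.

```latex
\sigma^2_{\hat{p}_A - \hat{p}_B}
  = \sigma^2_{\hat{p}_A} + \sigma^2_{\hat{p}_B}
  \approx \frac{p_A (1 - p_A)}{n_A} + \frac{p_B (1 - p_B)}{n_B}
```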

And we know all of these things. We know that your population proportion in plant A is eight percent or 0.08. One minus that is 0.92. We're taking samples of 200 at a time from plant A. And then in plant B, we know the population proportion they told us is six percent or 0.06. One minus that is 0.94. And then the sample size from plant B is also going to be 200.

So on the calculator, we get 0.08 times 0.92 divided by 200, plus 0.06 times 0.94 divided by 200. And that equals 0.00065. From this, we can figure out what the standard deviation is going to be.

The standard deviation of the difference between our sample proportions is going to be just the square root of this. It's going to be the square root of 0.00065. Let's just take the square root, and we get approximately 0.0255, or about 0.025. And there you have it. We have thought about the standard deviation.
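The arithmetic above can be checked with a few lines of Python. This is a minimal sketch using only the numbers from the problem statement. The variable names are made up for illustration.

```python
import math

# Values taken from the problem statement.
p_a, p_b = 0.08, 0.06   # population defect proportions at plants A and B
n_a, n_b = 200, 200     # monthly sample sizes

# Variances of independent sample proportions add, even for a difference.
var_a = p_a * (1 - p_a) / n_a
var_b = p_b * (1 - p_b) / n_b
var_diff = var_a + var_b

# The standard deviation is the square root of the variance.
sd_diff = math.sqrt(var_diff)

print(f"variance = {var_diff:.5f}")
print(f"std dev  = {sd_diff:.4f}")
```

Rounded to four decimal places, the standard deviation comes out to 0.0255, matching the calculator result.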

Then, last but not least, let's think about the shape. Just as a review, we just have to remind ourselves that the distribution of each sample proportion is going to be approximately normal as long as we expect at least 10 successes and 10 failures. Well, let's look at each of these. How many successes do you expect, where a success here would actually be a defect?

For plant A, eight percent of a sample of 200 is 16. So you would expect 16 defects. And then you would expect 200 minus 16, or 184, cars with no defects, which is a lot larger than 10. So both of those are greater than or equal to 10. If you did the same thing for plant B, you get the same idea: six percent of 200 is 12.

And then the ones that have no defects, that's 200 minus 12, or 188, which is way more than 10. So in every situation, we expect to have at least 10 successes and 10 failures, and we can assume that the distribution of each of these sample proportions is going to be approximately normal. We also know that the difference of two independent, normally distributed variables is also normal. So since both samples pass that large counts condition that we just talked about, the difference of the sample proportions is approximately normal too.
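The large counts check described above is easy to sketch in code. The helper names `expected_counts` and `large_counts_ok` are hypothetical, chosen for this example. The threshold of 10 is the one stated in the transcript.

```python
def expected_counts(p, n):
    """Return the expected number of successes and failures in a sample of size n."""
    return n * p, n * (1 - p)


def large_counts_ok(p, n, threshold=10):
    """True when both expected counts reach the threshold."""
    successes, failures = expected_counts(p, n)
    return successes >= threshold and failures >= threshold


# Here a "success" is a car with the defect.
for name, p in [("A", 0.08), ("B", 0.06)]:
    defects, no_defects = expected_counts(p, 200)
    print(f"plant {name}: {defects:.0f} expected defects, "
          f"{no_defects:.0f} without, ok={large_counts_ok(p, 200)}")
```

Both plants pass with samples of 200. A much lower defect rate, such as 2 percent, would give only 4 expected defects and fail the check.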

So, let's draw what this distribution might look like. It might look something like this. It's going to be a normal distribution where you have a mean right over here. I'll do that in that same color: a mean of 0.02. The difference can definitely take on negative values, because there are some situations in which your sample proportion from plant B could actually be larger, just by random chance, than the one from plant A.

So, you can definitely take on negative values. But if I wanted to show where zero is, maybe zero is right over here. So we could draw an axis right over here. And then we know what the standard deviation is; it's 0.025, or it's approximately that. So if we were to go one standard deviation down, we would go right about there, and if we were to go one standard deviation up, we would go right about there.
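To put numbers on the sketch, one option is Python's standard-library `statistics.NormalDist`. This is a rough illustration under the normal approximation discussed above. The question it answers (how often the difference is negative) is added here as an example and is not part of the original problem.

```python
import math
from statistics import NormalDist

# Mean and standard deviation of the difference, from the problem's numbers.
mean_diff = 0.08 - 0.06
sd_diff = math.sqrt(0.08 * 0.92 / 200 + 0.06 * 0.94 / 200)

# Approximate sampling distribution of (sample proportion A - sample proportion B).
dist = NormalDist(mu=mean_diff, sigma=sd_diff)

# One standard deviation below and above the mean.
low = mean_diff - sd_diff
high = mean_diff + sd_diff

# Probability that plant B's sample shows a higher defect rate than plant A's.
p_negative = dist.cdf(0)

print(f"one SD band: {low:.4f} to {high:.4f}")
print(f"P(difference < 0) = {p_negative:.3f}")
```

One standard deviation below the mean is already slightly negative, which is why zero sits so close to the center of the curve. Under this approximation, the difference comes out negative in roughly one month out of five.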

And obviously, we could go more than one standard deviation above or below that mean.
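As a sanity check on the whole description, here is a small simulation sketch. It assumes each car is defective independently with the plant's rate, which amounts to sampling with replacement, so it is only an approximation of the manager's procedure. The function names, the seed, and the number of simulated months are arbitrary choices for this example.

```python
import random
import statistics


def sample_proportion(p, n, rng):
    """Proportion of defective cars in one simulated sample of n cars."""
    defects = sum(rng.random() < p for _ in range(n))
    return defects / n


def simulate_differences(p_a, p_b, n, months, seed=0):
    """Simulate the monthly difference (plant A minus plant B) many times."""
    rng = random.Random(seed)
    return [
        sample_proportion(p_a, n, rng) - sample_proportion(p_b, n, rng)
        for _ in range(months)
    ]


diffs = simulate_differences(0.08, 0.06, n=200, months=2000)

sim_mean = statistics.mean(diffs)
sim_sd = statistics.stdev(diffs)
share_negative = sum(d < 0 for d in diffs) / len(diffs)

print(f"simulated mean: {sim_mean:.4f}")
print(f"simulated SD:   {sim_sd:.4f}")
print(f"share negative: {share_negative:.3f}")
```

The simulated mean and standard deviation should land close to 0.02 and 0.0255, and a histogram of `diffs` should look roughly bell-shaped.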
