
Sampling distribution of the difference in sample means | AP Statistics | Khan Academy


Nov 10, 2024

What we're going to do in this video is explore the sampling distribution for a difference in sample means, and we'll use this example right over here. So it tells us a large bakery makes thousands of cupcakes daily in two shifts: shift A and shift B.

Suppose that on average, cupcakes from shift A weigh 130 grams, with a standard deviation of 4 grams. For shift B, the mean and standard deviation are 125 grams and 3 grams, respectively. Assume independence between shifts. Every day, the bakery takes a simple random sample of 40 cupcakes from each shift. They calculate the mean weight for each sample, then look at the difference A minus B between the sample means.

Find the probability that the mean weights from the samples are more than six grams apart from each other. So I'm actually not going to tell you immediately to pause this video and try to work through this on your own. First, I'm going to think about how we could break this down, and then I'll ask you to pause and try to tackle each of those parts by itself.

So, in order to tackle this eventual question, we're going to have to think about the mean of the sampling distribution for the difference in sample means: sample mean from group A minus sample mean for group B. We're going to have to think about the standard deviation of the sampling distribution for the difference in sample means, and we're going to think about if this distribution is normal.

If we're able to figure out these three things, then we just have to figure out how many standard deviations away from the mean this is. We could use your standard z-table to figure out the probability. So now I encourage you to pause this video and try to tackle this first part: what is the mean of the sampling distribution for the difference in sample means?

All right, now let's work through this together. The mean of the sampling distribution for the difference in sample means (and we have seen this before) is going to be equal to the difference between the means of the sampling distributions for each of the sample means. So that mean minus this mean.

We also know that the mean of the sampling distribution for each of these sample means, that's just going to be the mean of the population that we are sampling from. So this mean right over here is just going to be the population mean for shift A, which is going to be 130 grams. I'll just write that there.

Then the mean of the sampling distribution for the sample means from shift B, we can see that that's just going to be the population mean for shift B, which is right over here, so minus 125 grams. And of course, this is just going to be equal to 5 grams. So we have answered the first part: we know the mean of the sampling distribution of the difference in sample means.
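Written out as one line, the reasoning above looks like this. The symbols are notation assumed here for illustration (the video works from a whiteboard): \bar{x}_A and \bar{x}_B for the two sample means, \mu_A and \mu_B for the population means of each shift.

```latex
\[
\mu_{\bar{x}_A - \bar{x}_B}
  = \mu_{\bar{x}_A} - \mu_{\bar{x}_B}
  = \mu_A - \mu_B
  = 130\,\text{g} - 125\,\text{g}
  = 5\,\text{g}
\]
```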

Now, what about the standard deviation? For that, let's actually think about variances, because the math is a little bit easier with variances, and then from that we can derive standard deviations. We know that the variance of the sampling distribution for the difference in sample means, assuming that your two samples are independent and you're sampling with replacement, is going to be the sum of the variances of the sampling distributions for each of the sample means.

So it's going to be that plus this right over here. Now, you might be saying, wait, we're not sampling with replacement. Well, we also know that if each of the sample sizes is less than 10 percent of its population, then the difference is negligible, and so we can still use this formula.

You could see that the simple random sample here is 40 from each shift, and they say that a large bakery makes thousands of cupcakes daily in two shifts. So even if it was a thousand, ten percent of that would be a hundred. This is less than ten percent, so we meet that condition. We can use the same formula that you would use if you were sampling with replacement.
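The 10 percent check above can be sketched in a few lines of Python. The function name is made up for illustration, and the population of 1,000 per shift is the same conservative "even if it was a thousand" assumption used in the discussion, not a figure given in the problem.

```python
def meets_ten_percent_condition(sample_size, population_size):
    """Return True when the sample is less than 10% of the population."""
    # Integer arithmetic avoids floating-point rounding in the comparison
    return 10 * sample_size < population_size


# Assumed lower bound: the problem only says "thousands" of cupcakes daily
assumed_population_per_shift = 1000
sample_size = 40

print(meets_ten_percent_condition(sample_size, assumed_population_per_shift))
```

With a sample of 40 and a population of at least 1,000, the condition holds, so the with-replacement variance formula is a reasonable approximation.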

So this first variance right over here of the sampling distribution for the sample means from shift A, this is going to be equal to the variance of shift A, the population variance of shift A, divided by your sample size. Then this over here is going to be the same thing for shift B: it's going to be the variance of shift B divided by your sample size.
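In symbols, the relationship just described can be written as follows. The notation is an assumption made here for illustration: \sigma_A and \sigma_B are the population standard deviations for each shift, and n_A and n_B are the two sample sizes.

```latex
\[
\sigma^2_{\bar{x}_A - \bar{x}_B}
  = \sigma^2_{\bar{x}_A} + \sigma^2_{\bar{x}_B}
  = \frac{\sigma_A^2}{n_A} + \frac{\sigma_B^2}{n_B}
\]
```

Note that the variances add even though the means are subtracted; this relies on the two samples being independent.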

So this is going to be equal to what? Well, the variance from shift A is going to be the square of the standard deviation from shift A. The standard deviation's right over there, and so that's going to be 16. We could write grams squared if we want to keep the units there, and then we're going to divide by the sample size.

We know that the sample size in each case is 40 cupcakes at a time for each sample. And then for shift B, we know that the standard deviation, the population standard deviation for shift B is 3 grams. You square that, and you get 9 grams squared. A gram squared is kind of an interesting idea, but that's what the units are working out to be right now, and our sample size is still equal to 40.

And so this is going to be equal to, let's see, 16 plus 9 is 25, over a common denominator of 40. So it's 25 over 40, which is the same thing as 5/8, 5/8 of a gram squared, which is a little bit strange for units. But this now tells us what the standard deviation is going to be, because it's just going to be the square root of all of this business.

So the standard deviation of the sampling distribution for the difference in sample means over here is going to be the square root of 5/8, and now of course the units are back to grams, which makes sense. This is approximately going to be equal to... get my calculator out. 5 divided by 8 equals... and then we take the square root of that, and it's going to be approximately 0.79.
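If you'd rather check the calculator work in code, here is a minimal Python sketch of the same arithmetic. The variable names are chosen here for illustration; the numbers come straight from the bakery problem.

```python
import math

# Population standard deviations (grams) and sample sizes from the problem
sigma_a = 4.0
sigma_b = 3.0
n_a = 40
n_b = 40

# Variance of each sample mean: population variance divided by sample size
var_mean_a = sigma_a**2 / n_a  # 16 / 40
var_mean_b = sigma_b**2 / n_b  # 9 / 40

# For independent samples, the variance of the difference is the sum
var_diff = var_mean_a + var_mean_b  # 25 / 40 = 5 / 8
sd_diff = math.sqrt(var_diff)

print(f"variance = {var_diff:.3f} g^2, sd = {sd_diff:.2f} g")
```

This should agree with the values above: a variance of 5/8 gram squared and a standard deviation of about 0.79 grams.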

So the next question, before we try to figure out the probability, is: are we dealing with a normal distribution here when we think about the sampling distribution for the difference in sample means? So I encourage you to pause the video again and think about that.

So there are two ways that we can justify treating the sampling distribution for the difference in sample means as normal. If the original populations that each of the sample means are being calculated from are normal, then the sampling distribution for each of the sample means is going to be normal, and that means that the sampling distribution of their difference is going to be normal as well.

Now, we don't know for a fact that the weights of the cupcakes from each shift are normally distributed, but we also know that the sampling distribution of the sample means can be modeled as being approximately normal if the two sample sizes are each greater than or equal to 30. We know that each of these sample sizes is definitely greater than or equal to 30; they are 40.

So that tells us that the sampling distribution of the difference in sample means is also approximately normal. We've established the things that we need to then calculate the probability. So I encourage you to pause the video and see if you can use that information to calculate that probability, and we will then do that in the next video.
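Once you've tried it yourself, here is a minimal Python sketch for checking your answer. It is not the video's own working: it assumes that "more than six grams apart" means the absolute difference exceeds 6 grams, and it uses the error function from the standard library in place of a z-table. The helper name `normal_cdf` is made up for illustration.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative probability P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


# Mean and standard deviation of the difference in sample means (A minus B)
mean_diff = 130 - 125                       # 5 grams
sd_diff = math.sqrt(4**2 / 40 + 3**2 / 40)  # about 0.79 grams

# Assumed reading of the question: |A - B| > 6, so add both tails
p_upper = 1.0 - normal_cdf(6, mean_diff, sd_diff)  # A - B > 6
p_lower = normal_cdf(-6, mean_diff, sd_diff)       # A - B < -6
p_apart = p_upper + p_lower

z_upper = (6 - mean_diff) / sd_diff
print(f"z = {z_upper:.2f}, P(more than 6 g apart) = {p_apart:.3f}")
```

The upper tail sits a little over one standard deviation above the mean, so the probability comes out at roughly 0.10. The lower tail (B heavier than A by more than 6 grams) is almost 14 standard deviations below the mean and contributes essentially nothing.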
