Free response example: Significance test for a mean | AP Statistics | Khan Academy
Regulations require that product labels on containers of food that are available for sale to the public accurately state the amount of food in those containers. Specifically, if milk containers are labeled to have 128 fluid ounces and the mean number of fluid ounces of milk in the containers is at least 128, the milk processor is considered to be in compliance with the regulations.
The filling machines can be set to the labeled amount. Variability in the filling process causes the actual contents of milk containers to be normally distributed. A random sample of 12 containers of milk was drawn from the milk processing line in a plant, and the amount of milk in each container was recorded. The sample mean and standard deviation of this sample of 12 containers of milk were 127.2 ounces and 2.1 ounces, respectively.
Is there sufficient evidence to conclude that the packaging plant is not in compliance with the regulations? Provide statistical justification for your answer. So pause this video and see if you can have a go at it.
All right, now let's do this together. So first, let's say what we're talking about. Let me define mu, and this is going to be the mean amount of milk in the population of containers at the plant that we care about.
Then we can set up our hypotheses. Our null hypothesis over here is that we are in compliance. We could say that the mean for our population of containers is actually 128 fluid ounces. That's the minimum we need to be in compliance. Our alternative hypothesis is that we are not in compliance, that is, the true population mean is less than 128 fluid ounces.
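Written out in symbols, the two hypotheses described above would look something like this (the labels H_0 and H_a are standard notation assumed here; the video only states the hypotheses in words):

```latex
H_0:\ \mu = 128 \text{ fl oz} \qquad\qquad H_a:\ \mu < 128 \text{ fl oz}
```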
Now if you're going to do a significance test, you need to set a significance level. So let's do that over here. Significance level. If you haven't noticed, I'm doing what would be expected of you on a test, and this is an actual question from an AP exam.
Our significance level here, I'll just pick it to be 0.05 because that's a fairly typical one, and since they didn't give one to us, it's important to set one ahead of time. Now we want to check our conditions for inference. So let me do that over here. Conditions for inference. This is to feel good that the sample that we're using to make our inference, to do our significance test, is a reasonable one to make inferences from.
The first one is our random condition. Do we meet that? Well, they tell us here it's a random sample of 12 containers of milk. If I was doing this on the AP exam, I would write it out here. I would say, in the passage or in the question, they say a random sample of 12, and then they go on to say more things. So I would say that meets condition.
Now the next one we want to care about is our normal condition. This is to feel good that our sampling distribution is roughly normal. Now, there are a couple of ways that we could do that. One is if our sample size is greater than 30 or greater than or equal to 30, then we say okay, our sampling distribution is going to be roughly normal.
But in this situation, our sample size n is less than 30. But there's another way to meet the normal condition, and that's if the underlying parent data is normally distributed. They actually say it right over here: variability in the filling process causes the actual contents of milk to be normally distributed.
So we could say, in the passage it says, and let's see, I could quote part of this: "actual contents ... normally distributed." So that meets the condition.
The last condition we want to think about is the independence condition. This is to feel good that the individual observations in our sample can be considered to be roughly independent. Now, one way is if they were sampling with replacement, which they're not doing here. It looks like they took all 12 containers at once.
But another way is if this is less than 10 percent of the overall population, then you could say, okay, you can view them as roughly independent. So you say didn’t sample with replacement, but assume that 12 is less than 10 percent of the population. In that case, you would meet this condition as well.
So it looks like we've met the three conditions that we need for inference, or at least we can assume we have. They haven't given us any information to the contrary. Now what we can do is calculate a t statistic and then from that calculate our p-value, compare our p-value to our significance level, and see what kind of conclusions we can make.
So our t statistic right over here—and once again if at any point you're inspired, and if you haven't done so already, try to do it on your own—our t statistic is going to be our sample mean minus the assumed mean from the null hypothesis.
Since I'm introducing this notation, mu with a little sub zero, I'll say that's the assumed mean from my null hypothesis. Then I'll divide. Ideally, if I was doing a z statistic, I would divide by the standard deviation of the sampling distribution of the sample mean, which is often known as the standard error of the mean.
But the whole reason why I'm doing a t statistic is, well, I don't know exactly what that is, but I could estimate the standard deviation of the sampling distribution of the sample mean using the sample standard deviation divided by the square root of n. And once again, it's always good if you're doing this on a test to explain what n is or what some of these things are.
If you're using standard notation, people might assume what they are, but if you have time on these tests, you can always explain more of what these actual variables are. But in this case, the numerator is going to be 127.2, which is our sample mean, minus 128, which is our assumed mean from our null hypothesis. All of that is over our sample standard deviation, which is 2.1, divided by the square root of 12.
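As a written-out version of the setup just described, the t statistic could be laid out like this. The symbols \bar{x} for the sample mean and s for the sample standard deviation are standard notation assumed here; \mu_0 and n are as introduced above.

```latex
t \;=\; \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  \;=\; \frac{127.2 - 128}{2.1 / \sqrt{12}}
```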
And so this is going to be approximately equal to—I got a calculator out here—and so we have seen the numerator. We have 127.2 minus 128, and then we're going to divide that by, I'll do another parenthesis, 2.1 divided by the square root of 12. Let me close my parentheses. Did I type that in correctly? Yeah, that looks right. Click enter.
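The calculator entry above can also be reproduced in a few lines of Python. This is only a sketch of the same arithmetic; the variable names are illustrative and not something from the exam or the video.

```python
import math

# Values quoted in the problem statement.
sample_mean = 127.2   # ounces
null_mean = 128.0     # mu_0, the mean assumed under the null hypothesis
sample_sd = 2.1       # ounces
n = 12                # sample size

# Estimated standard error of the sample mean: s / sqrt(n).
standard_error = sample_sd / math.sqrt(n)

# One-sample t statistic: (sample mean - assumed mean) / standard error.
t_stat = (sample_mean - null_mean) / standard_error

print(round(t_stat, 2))  # prints -1.32
```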
And so this is negative. I'll say it's approximately negative 1.32. Now we can figure out our p-value. Our p-value is the probability of getting a t statistic this low or lower, so we could say it's the probability that t is less than or equal to negative 1.32. To find that, I'll get my calculator back out, and here I would use the cumulative distribution function for t statistics.
So that's that right over there. I do care about the left tail, so I care about the area under the curve from negative infinity up to and including negative 1.32. So for the lower bound let's do negative 1E99, the upper bound is negative 1.32, and then my degrees of freedom, well, it's going to be my sample size minus 1. My sample size was 12, so that minus 1 is 11.
And then I do paste. So I have this tcdf from negative 1E99 to negative 1.32, comma 11. Actually, you would want to write this down on your exam if you were doing it, just so they know where you got that from. This is equal to 0.107.
So let me write it: this is approximately 0.107. It's important to say how you calculated this. So I used tcdf, and we went from negative 1 times 10 to the 99th power up to negative 1.32, and then we had 11 degrees of freedom to get this result right over here.
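For readers without a graphing calculator, the same left-tail area can be approximated in plain Python. This is a rough sketch, not what the calculator does internally: it integrates the t density numerically with Simpson's rule and uses the symmetry of the distribution. The function names `t_pdf` and `t_left_tail` are made up for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_left_tail(t, df, steps=2000):
    """Approximate P(T <= t) for t <= 0 using Simpson's rule and symmetry."""
    if t > 0:
        raise ValueError("this sketch only handles t <= 0")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    upper = abs(t)
    h = upper / steps
    # Simpson's rule for the area under the density between 0 and |t|.
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # Half of the total area lies left of 0; remove the part between t and 0.
    return 0.5 - area_zero_to_t

df = 12 - 1      # degrees of freedom: sample size minus 1
alpha = 0.05     # significance level chosen earlier
p_value = t_left_tail(-1.32, df)

print(round(p_value, 3))
print("fail to reject H0" if p_value > alpha else "reject H0")
```

The printed p-value should land close to the 0.107 read off the calculator. If SciPy happens to be installed, `scipy.stats.t.cdf(-1.32, 11)` is a more direct way to get the same area.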
It also might be good practice to draw your t distribution right over here. So that's our t distribution; that's the mean of our t distribution. We say that this is the area that we care about, so that is that right over there just to make sure people know what we're talking about.
Now we're ready to make a conclusion. We can compare this to our significance level. We can say since our p-value of 0.107 is greater than our significance level, which is 0.05, we fail to reject the null hypothesis.
So let's just make sure we answer their question: is there sufficient evidence to conclude that the packaging plant is not in compliance with the regulations? Since we failed to reject the null hypothesis, our answer is that there is not sufficient evidence to conclude that the plant is not in compliance with the regulations.
And then we are done.