Determining sample size based on confidence and margin of error | AP Statistics | Khan Academy
We're told Della wants to make a one-sample z-interval to estimate what proportion of her community members favor a tax increase for more local school funding. She wants her margin of error to be no more than plus or minus two percent at the 95% confidence level. What is the smallest sample size required to obtain the desired margin of error?
So let's just remind ourselves what the confidence interval will look like and what part of it is the margin of error. Then, we can think about what her sample size should be. She wants to estimate the true population proportion that favors a tax increase. She doesn't know what this is, so she is going to take a sample of size n. In fact, this question is all about what n she needs in order to have the desired margin of error.
Well, whatever sample she takes, she's going to calculate a sample proportion, and then the confidence interval that she's going to construct is going to be that sample proportion plus or minus the critical value. This critical value is based on the confidence level. We'll talk about that in a second.
What z*, what critical value, would correspond to a 95% confidence level? Then, you would multiply that by the standard error of her statistic. In this case, that is the standard error of her sample proportion, which is the square root of the sample proportion times 1 minus the sample proportion, all of that over her sample size.
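Written out, the interval described above would look something like this. The symbols are the usual textbook ones (p-hat for the sample proportion, z* for the critical value, n for the sample size), and the part after the plus-or-minus sign is the margin of error:

```latex
\[
\hat{p} \;\pm\; z^{*}\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
\]
```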
Now, she wants the margin of error to be no more than 2 percent, so the margin of error is this part right over here. She wants it to be less than or equal to two percent. That green color is kind of too shocking; it's unpleasant. All right, less than or equal to two percent right over here.
So how do we figure that out? Well, first, let's just make sure we incorporate the 95% confidence level. We could look at a z-table. Remember, if we have a normal distribution here, a 95% confidence level means the number of standard deviations we need to go above and below the mean in order to capture 95% of the area right over here.
So this would be two and a half percent that is unshaded at the top right over there, and then this would be two and a half percent right over here. We could look up in a z-table, and if you were to look up in a z-table, you wouldn't look up 95; you would look up the percentage that would leave two and a half percent unshaded at the top. So you would actually look up 97.5. But it's good to know in general that at a 95% confidence level, you're looking at a critical value of 1.96, and that's just something good to know.
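If you don't have a z-table handy, here is a minimal sketch of the same lookup in Python, assuming the standard library's `statistics.NormalDist`. The variable names are just illustrative. It asks for the z-value that leaves 97.5% of the area below it, exactly as described above.

```python
from statistics import NormalDist

confidence_level = 0.95
# Area left unshaded in each tail: 2.5% at the top, 2.5% at the bottom.
tail_area = (1 - confidence_level) / 2

# Look up the z-value with 97.5% of the area below it (standard normal).
z_star = NormalDist().inv_cdf(1 - tail_area)

print(round(z_star, 2))  # 1.96
```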
We could, of course, look it up on a z-table, so this is 1.96. And so this is going to be 1.96 right over here. But what about p hat? We don't know what p hat is until we actually take the sample. But this whole question is how large of a sample should we take?
Well, remember we want this stuff right over here that I'm now circling in this less bright color, this blue color, to be less than or equal to two percent. This is our margin of error. So what we could do is pick the sample proportion that maximizes this right over here, even though we don't know what her actual one is going to be. If we maximize this, we're essentially figuring out the largest that the margin of error could end up being, and then we'll be safe.
So the p-hat that maximizes this is 0.5. And I want to emphasize: we don't know what it will be; she hasn't even taken the sample yet. She didn't even take the random sample and calculate the sample proportion. But we want to figure out what n to take. So to be safe, she says, "Okay, what sample proportion would maximize my margin of error?"
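One quick way to convince yourself that 0.5 is the worst case is to try a grid of possible sample proportions and see which one makes p-hat times 1 minus p-hat the largest. This is only a sketch; the grid of hundredths and the helper name `spread` are assumptions for illustration.

```python
# Candidate sample proportions 0.00, 0.01, ..., 1.00.
candidates = [i / 100 for i in range(101)]

def spread(p_hat):
    """The quantity under the square root, ignoring the division by n."""
    return p_hat * (1 - p_hat)

# The candidate that makes the margin of error as large as possible.
worst_case = max(candidates, key=spread)

print(worst_case, spread(worst_case))  # 0.5 0.25
```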
Let me just assume that, and then let me calculate n. So let me set up an inequality here. We want 1.96, that's our critical value, times the square root of 0.5 times 1 minus 0.5, all of that over n, to be less than or equal to 2 percent. We're just going to assume 0.5 for our sample proportion, although, of course, we don't know what it is until we actually take the sample.
So the first 0.5 is our sample proportion, and 1 minus 0.5 is 1 minus our sample proportion. We don't want our margin of error to be any larger than two percent. Let me just write this as a decimal: 0.02.
Now we just have to do a little bit of algebra to calculate this. So let's see how we could do this. We could divide both sides by 1.96. On the left-hand side, we'd have the square root of all of this, but the square root of 0.5 times 0.5 is just 0.5, so that'd be 0.5 over the square root of n. On the right-hand side, 0.02 is the same thing as 2 over 100.
So 0.5 over the square root of n needs to be less than or equal to 2 over 100 times 1 over 1.96, which is 2 over 196. Let me scroll down a little bit; this is fancier algebra than we typically do in statistics, or at least an introductory statistics class.
All right, so let's see. We could take the reciprocal of both sides. On the left we'd have the square root of n over 0.5, and on the right 196 over 2. And let's see, what's 196 divided by 2? That is going to be 98. So this would be 98.
If we take the reciprocal of both sides, then you're going to swap the inequality, so it's going to be greater than or equal to. Then, let's see, I can multiply both sides of this by 0.5. I said 0.5, but my fingers wrote down 0.4.
And see, 0.5. There we get the square root of n needs to be greater than or equal to 49, or n needs to be greater than or equal to 49 squared. And what's 49 squared? Well, you know 50 squared is 2500, so you know it's going to be close to that.
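Since that algebra was all spoken aloud, here it is written out step by step. This is just the same chain of inequalities, assuming p-hat of 0.5 and a critical value of 1.96 as above:

```latex
\begin{align*}
1.96\sqrt{\frac{0.5\,(1-0.5)}{n}} &\le 0.02 \\[4pt]
\frac{0.5}{\sqrt{n}} &\le \frac{2}{100}\cdot\frac{1}{1.96} = \frac{2}{196} \\[4pt]
\frac{\sqrt{n}}{0.5} &\ge \frac{196}{2} = 98 \\[4pt]
\sqrt{n} &\ge 49 \\[4pt]
n &\ge 49^{2}
\end{align*}
```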
So you can already make a pretty good estimate that it's going to be answer choice D. But if you want to multiply it out, we can: 49 times 49.
9 times 9 is 81. 9 times 4 is 36, plus 8 is 44, so the first row is 441. 4 times 9 is 36. 4 times 4 is 16, plus 3 is 19, so the second row is 1960. Then you add all of that together: 4 plus 6 is 10, and so this is a 14.
You do indeed get 2401. So that's the minimum sample size that Della should take if she genuinely wanted her margin of error to be no more than two percent.
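The whole calculation can be sketched as one small function. The name `min_sample_size` and its parameters are made up for this illustration. It uses exact fractions rather than floating-point numbers so that tiny rounding errors can't push a value like 2401 up to 2402 when we round up.

```python
from fractions import Fraction
from math import ceil

def min_sample_size(margin, z_star="1.96", p_hat="0.5"):
    """Smallest n with z* * sqrt(p_hat * (1 - p_hat) / n) <= margin.

    Arguments are decimal strings so they convert to exact fractions.
    """
    z = Fraction(z_star)
    p = Fraction(p_hat)
    m = Fraction(margin)
    # Rearranged inequality: n >= z*^2 * p_hat * (1 - p_hat) / margin^2
    lower_bound = z**2 * p * (1 - p) / m**2
    # A sample size must be a whole number, so round up.
    return ceil(lower_bound)

print(min_sample_size("0.02"))  # 2401
```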
Now, when she actually takes the sample of size 2401, it might turn out that her sample proportion is less than 0.5 or greater than 0.5. In that case, her margin of error is going to be less than this. But she just wanted it to be no more than that.
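To see that effect, here is a rough sketch that plugs a few possible sample proportions into the margin of error formula with n fixed at 2401. The values 0.3 and 0.7 are hypothetical outcomes chosen only for illustration, and `margin_of_error` is a made-up helper name.

```python
from math import sqrt

def margin_of_error(p_hat, n, z_star=1.96):
    """Critical value times the standard error of the sample proportion."""
    return z_star * sqrt(p_hat * (1 - p_hat) / n)

n = 2401
for p_hat in (0.3, 0.5, 0.7):  # hypothetical sample proportions
    print(p_hat, round(margin_of_error(p_hat, n), 4))
```

Only p-hat of 0.5 reaches the full two percent; the other two come in under it.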
Another important thing to appreciate is that the math all worked out very nicely just now, where I got our n to be a whole number. But if I had gotten 2401.5, then you would have to round up to the next whole number, because your sample size is always going to be a whole number value.
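As an example of a case that does not come out to a whole number, suppose the target margin of error were three percent instead. That number is hypothetical and not part of Della's problem. A minimal sketch of the rounding step, again using exact fractions:

```python
from fractions import Fraction
from math import ceil

z_star = Fraction("1.96")
margin = Fraction("0.03")  # hypothetical 3% target, not from the problem

# Same rearranged inequality with p-hat = 0.5: n >= (z* * 0.5 / margin)^2
bound = (z_star * Fraction(1, 2) / margin) ** 2

# Always round up: rounding down would leave the margin slightly too large.
n = ceil(bound)

print(float(bound), n)
```

Here the bound is a little over 1067, so the smallest sample size that works is 1068.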
So I will leave you there.