Sampling distribution of sample proportion part 2 | AP Statistics | Khan Academy
This right over here is a scratch pad on Khan Academy created by Khan Academy user Charlotte Allen. What you see here is a simulation that allows us to keep sampling from our gumball machine and start approximating the sampling distribution of the sample proportion.
Her simulation focuses on green gumballs, but we talked about yellow before. With the yellow gumballs, we said 60 percent were yellow, so let's make 60 percent here green. Then let's take samples of 10, just like we did before, and let's just start with one sample.
So, we're going to draw one sample, and what we want to show is the percentages, which is the proportion of each sample that is green. If we draw that first sample, notice that out of the 10, 5 ended up being green, and then it plotted that right over here under 50 percent. We have one situation where 50 percent were green.
Now let's do another sample. In this sample, 60 percent are green, so let's keep going. Let's draw another sample, and in that one, 50 percent are green. So, notice now we see here on this distribution that two of them had 50 percent green. We could keep drawing samples, so let's really speed things up: we're going to do 50 samples of 10 at a time.
So here we can quickly get to a fairly large number of samples, and here we're over a thousand samples. What's interesting here is we're seeing experimentally that the mean of our sample proportion here is 0.62. What we calculated a few minutes ago was that it should be 0.6.
We also see that the standard deviation of our sample proportion is 0.16, and what we calculated was approximately 0.15. As we draw more and more samples, we should get even closer and closer to those values, and we see that for the most part we are getting closer and closer. In fact, now that it's rounded, we're at exactly those values that we had calculated before.
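The comparison above can be reproduced with a small simulation. This is only a sketch, not the code behind the scratch pad: the function name `simulate_sample_proportions`, the seed, and the choice of 20,000 samples are all assumptions made for illustration. It draws repeated samples of 10 from a population that is 60 percent green and compares the simulated mean and standard deviation with the calculated values of 0.6 and the square root of 0.6 times 0.4 over 10, which is about 0.155.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Hypothetical helper: draw num_samples samples of size n and
    return the proportion of green gumballs in each sample."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    proportions = []
    for _ in range(num_samples):
        # Each gumball is green with probability p.
        greens = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(greens / n)
    return proportions

p, n = 0.6, 10
props = simulate_sample_proportions(p, n, num_samples=20000)

sim_mean = statistics.mean(props)      # mean of the simulated sample proportions
sim_sd = statistics.pstdev(props)      # their standard deviation
theory_mean = p                        # calculated mean
theory_sd = math.sqrt(p * (1 - p) / n) # calculated standard deviation

print(f"simulated mean {sim_mean:.3f}, calculated mean {theory_mean:.3f}")
print(f"simulated sd   {sim_sd:.3f}, calculated sd   {theory_sd:.3f}")
```

With more samples, the simulated values should settle closer to the calculated ones, which is the behavior seen in the scratch pad.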
Now, one interesting thing to observe is when your population proportion is not too close to zero and not too close to one, this looks pretty close to a normal distribution. That makes sense because we saw the relation between the sampling distribution of the sample proportion and a binomial random variable.
But what if our population proportion is closer to zero? So, let's say our population proportion is 10 percent, or 0.1. What do you think the distribution is going to look like then? Well, we know that the mean of our sampling distribution is going to be 10 percent, and so you can imagine that the distribution is going to be right skewed. But let's actually see that.
So here we see that our distribution is indeed right skewed, and that makes sense because you can only get values from 0 to 1. If your mean is closer to zero, then you're going to see the meat of your distribution here, and then you're going to see a long tail to the right, which creates that right skew.
If your population proportion was close to one, well, you can imagine the opposite is going to happen. You're going to end up with a left skew, and we indeed see right over here a left skew. Now, the other interesting thing to appreciate is the larger your samples, the smaller the standard deviation.
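The right skew and left skew described above can also be checked numerically. The sketch below is an illustration under assumptions, not part of the scratch pad: the helper names, the seed, and the proportions 0.1, 0.5, and 0.9 with samples of 10 are chosen just for this example. It measures skewness as the average cubed standardized deviation, so a positive value means a longer tail to the right and a negative value means a longer tail to the left.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Hypothetical helper: proportion of green gumballs in each of num_samples samples."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n
            for _ in range(num_samples)]

def skewness(values):
    """Average cubed standardized deviation (positive = right skew, negative = left skew)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

skews = {}
for p in (0.1, 0.5, 0.9):
    props = simulate_sample_proportions(p, n=10, num_samples=20000)
    skews[p] = skewness(props)
    print(f"p = {p}: skewness = {skews[p]:+.2f}")
```

A proportion near zero should give a clearly positive skewness, a proportion near one a clearly negative one, and a proportion of 0.5 a value close to zero.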
So, let's do a population proportion that is right in between. Here this is similar to what we saw before; this is looking roughly normal. That's when we had a sample size of 10, but what if we have a sample size of 50 every time?
Well, notice now it looks like a much tighter distribution. This isn't even going all the way to one yet, but it is a much tighter distribution. That makes sense because the standard deviation of your sample proportion is inversely proportional to the square root of n.
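To see the effect of sample size in numbers, here is a sketch that compares samples of 10 with samples of 50. It assumes a population proportion of 0.5 to stand in for "right in between", and the helper names, seed, and number of samples are assumptions for illustration. The calculated standard deviation is the square root of p times 1 minus p over n, so going from n = 10 to n = 50 should shrink it by a factor of the square root of 5.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Hypothetical helper: proportion of green gumballs in each of num_samples samples."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n
            for _ in range(num_samples)]

def theoretical_sd(p, n):
    """Calculated standard deviation of the sample proportion."""
    return math.sqrt(p * (1 - p) / n)

p = 0.5  # assumed value for a proportion "right in between"
results = {}
for n in (10, 50):
    props = simulate_sample_proportions(p, n, num_samples=10000)
    results[n] = (statistics.pstdev(props), theoretical_sd(p, n))
    print(f"n = {n}: simulated sd {results[n][0]:.3f}, calculated sd {results[n][1]:.3f}")

# Ratio of the calculated standard deviations: sqrt(50 / 10) = sqrt(5).
ratio = results[10][1] / results[50][1]
print(f"calculated sd shrinks by a factor of {ratio:.3f}")
```

The tighter histogram for samples of 50 in the scratch pad corresponds to the smaller standard deviation printed for n = 50.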
So hopefully, you have a good intuition now for the sample proportion and its distribution, the sampling distribution of the sample proportion. You know that you can calculate its mean and its standard deviation, and you feel good about it because we saw it in a simulation.