Introduction to sampling distributions


Nov 11, 2024

So let's say I have a bag of colored balls here, and we know that 40 percent of the balls are orange. Now imagine defining a random variable X based on a trial where we stick our hand in this bag without looking, randomly pick a ball, record its color, and then put it back. So we're sampling with replacement, and we're going to say that our random variable X is equal to 1 if we pick an orange ball and equal to 0 otherwise.

You might already recognize this as a Bernoulli random variable, and we can construct a probability distribution for X. In fact, let's do this. X is discrete; it can take on only two values, zero or one. If forty percent of the balls are orange, then the one has a forty percent chance of happening. So there's a 0.4 probability of getting a 1, and there's a 60 percent chance, or a 0.6 probability, of getting a 0.
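The distribution just described can be written down and checked with a quick simulation. This is a minimal sketch, not part of the original video: the names `bernoulli_pmf` and `draw_x`, the seed, and the number of trials are all illustrative assumptions.

```python
import random

P_ORANGE = 0.4  # proportion of orange balls, taken from the example

def bernoulli_pmf(p):
    """Probability distribution of X as {value: probability}."""
    return {0: 1 - p, 1: p}

def draw_x(p, rng):
    """One trial with replacement: 1 if the ball is orange, 0 otherwise."""
    return 1 if rng.random() < p else 0

pmf = bernoulli_pmf(P_ORANGE)

# Assumption: 10,000 trials and seed 0 are arbitrary choices for illustration.
rng = random.Random(0)
draws = [draw_x(P_ORANGE, rng) for _ in range(10_000)]
observed_share_of_ones = sum(draws) / len(draws)

print("P(X = 0) =", pmf[0])
print("P(X = 1) =", pmf[1])
print("share of 1s in simulated trials:", observed_share_of_ones)
```

Over many trials the share of 1s should land close to 0.4, which is what the probability distribution says.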

So this right over here, just trying to hand-draw it, is the 0.6 probability of getting a zero. We could call this the probability distribution for X. This is all review so far. But the reason why I did this is because we're now going to introduce ourselves to the notion of a sampling distribution, and it can be a little bit confusing because in our brains, we tend to think in terms of probability distributions and not as much in terms of sampling distributions.

So what you do in a sampling distribution is you still start with a population, but then you take a sample of that population. So let me label things. This is our population here. We take a sample from that population, and it has a certain sample size, n. Then we calculate some statistic for that sample, and we think about the distribution of the statistics that we can get from these samples.

One way to think about this is to keep doing this. So this is our first sample of size n, and we calculate the statistic. Then we take another sample of size n and calculate the statistic again, and we just keep doing this. Let's say we were to do this an infinite number of times and plot the distribution of the statistic that we're calculating; well, then we have our sampling distribution.
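The repeat-and-tally procedure described above can be sketched in a few lines. We can't take infinitely many samples, so this approximates the sampling distribution with a finite number of them. The function name `sampling_distribution`, the 100-ball bag, the seed, and the 5,000 repetitions are assumptions made for illustration only.

```python
import random
from collections import Counter

def sampling_distribution(population, n, statistic, num_samples, rng):
    """Approximate the sampling distribution of `statistic` for samples of size n.

    Draws `num_samples` samples with replacement, computes the statistic for
    each, and returns {statistic value: share of samples with that value}.
    """
    counts = Counter()
    for _ in range(num_samples):
        sample = [rng.choice(population) for _ in range(n)]  # with replacement
        counts[statistic(sample)] += 1
    return {value: count / num_samples for value, count in sorted(counts.items())}

def proportion_of_ones(sample):
    """The statistic: proportion of the sample equal to 1."""
    return sum(sample) / len(sample)

# Assumption: a bag of 100 balls, 40 orange (1) and 60 not orange (0).
population = [1] * 40 + [0] * 60
rng = random.Random(1)  # arbitrary seed

dist = sampling_distribution(population, 10, proportion_of_ones, 5000, rng)
for value, share in dist.items():
    print(f"statistic = {value:.1f}: {share:.3f} of samples")
```

Passing the statistic in as a function keeps the procedure general: swapping `proportion_of_ones` for a mean or a maximum would give the sampling distribution of that statistic instead.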

Let's try to make this a little bit more tangible by going back to our colored balls example and thinking about a sampling distribution for that. So let's say we have our population here, and we know the parameter for this population: the proportion of balls that are orange is forty percent. We don't always know the parameters; oftentimes we're estimating them by looking at samples. But let's say we then take samples of size ten.

Every time, we calculate the statistic for our sample: what percentage are orange. So let's say the first time we take a sample we get three oranges, and the next time we get two oranges. Actually, let me do these as proportions. If my sample size is 10 and I get three oranges, that is thirty percent, and if I do it again and get two oranges, that is twenty percent.

And I just keep doing this, and eventually, I can plot a distribution of these sample proportions. You would end up with some type of a discrete distribution. The way to read this discrete distribution is, let's say this right over here ends up, and I'm just going to make up a number. This isn't going to be the actual number, but let's say that this is 0.15. The way to read that is you have a 15 percent chance of getting a sample where 50 percent of your balls are orange. Or if this right over here is 0.07, that would mean that you have a 7 percent chance of getting a sample where 20 percent of your balls are orange.
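The 0.15 and 0.07 above are made-up heights. Because each ball is drawn with replacement, the number of orange balls in a sample of 10 follows a binomial distribution, so the actual heights can be computed directly. The sketch below does that; the function name `sample_proportion_pmf` is an illustrative assumption, and the calculation is an addition rather than something worked out in the video.

```python
from math import comb

N = 10   # sample size from the example
P = 0.4  # population proportion of orange balls

def sample_proportion_pmf(n, p):
    """Exact sampling distribution of the sample proportion.

    Returns {k: probability that exactly k of the n balls are orange},
    so the sample proportion for key k is k / n.
    """
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

pmf = sample_proportion_pmf(N, P)
for k, prob in pmf.items():
    print(f"{k / N:.0%} orange: probability {prob:.4f}")
```

Read each line the same way as the bars in the plot: the probability is the chance that a single sample of 10 shows that percentage of orange balls.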

Now, to make this even a little bit more tangible, let's run a simulation that actually does this. This right over here is a simulation created on Khan Academy on our computer programming scratchpads by Charlotte Owen. It's a simulation to construct a sampling distribution. Here she's using candies instead of colored balls, but these candies are essentially colored balls. And so here we can set the population proportion. So let's say that the actual proportion, as we saw in our example, is 40 percent, except that here it's green as opposed to orange.

And so let's say in each sample, just as we said, our sample size is ten. And let's just do one sample first. So let's just draw a sample. What we did is we took 10 of these gumballs out, and we are counting how many of them are green. So in this first sample of 10, we see that 1, 2, 3, 4, 5, 6 of them are green. So out of the possible outcomes, we're now going to tally one outcome where we got six of our 10 to be green.

And if we want to show the proportion instead of just the count, we can just pick percentage here. And so here we've had one scenario already where 60 percent were green. But we don't want to do just one sample; we want to keep drawing samples. Let's draw another sample. So in this last sample, fifty percent are green. So now that we have one that was fifty percent green and one that was sixty percent green, let's try another sample.

Now we have another sample where we got sixty percent green. So there are two situations where we had sixty percent green. And so I can keep doing this over and over and over. And so what we're creating right over here is a sampling distribution. If we were to do this an infinite number of times, we would get the true sampling distribution of the sample proportion given the actual population proportion that is green.

And so this is after 77 samples. Notice this is saying that out of our 77 samples, 22 resulted in 40 percent of our gumballs being green. Only one of our samples had 80 percent of our gumballs being green. And if we just want to do a ton more samples, I'll go all the way to drawing 50 samples at a time. So let me just keep increasing this. Notice we have 17 samples now where we had zero percent that are green.

We have 91 of the 2200 samples where 10 percent were green, where one out of the 10 in our sample was green. And we could convert any of these numbers, 17, 91, 256, into percentages by just dividing by the number of samples. But this is fun; we could just keep going and making this larger and larger and larger. I encourage you to play with this; I'll provide a link for it in the description of this video and on Khan Academy.
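Dividing a tally by the number of samples is simple enough to check by hand, but a short sketch makes the conversion concrete. It uses only the two tallies the transcript ties to a specific bar (17 samples at zero percent green and 91 at ten percent), and it assumes both were read off the same run of 2200 samples. The comparison against the exact binomial probability is an addition, not something done in the video.

```python
from math import comb

NUM_SAMPLES = 2200  # total samples drawn, as quoted in the transcript

# Green count in a sample of 10 -> number of samples with that count.
# Assumption: both tallies come from the same 2200-sample run.
tallies = {0: 17, 1: 91}

# Convert each tally to a relative frequency by dividing by the sample count.
relative_freq = {k: count / NUM_SAMPLES for k, count in tallies.items()}

def binomial_prob(k, n=10, p=0.4):
    """Exact probability of k green candies in a sample of n, with replacement."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

for k, freq in relative_freq.items():
    print(f"{k} green of 10: observed {freq:.4f}, exact {binomial_prob(k):.4f}")
```

The observed shares will not match the exact probabilities perfectly after a finite number of samples, but they should get closer as more samples are drawn.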

But the main idea is to get an intuition for how a sampling distribution is different from just a traditional probability distribution; that in a sampling distribution, you're taking samples from a population, calculating some statistic for that sample, and what you're plotting in the sampling distribution are the various probabilities, the various likelihoods of the outcomes for those statistics in those samples.
