yego.me

Techniques for generating a simple random sample | Study design | AP Statistics | Khan Academy


Nov 11, 2024

Let's say that your school has a population of 80 students in it. Maybe it's not your whole school; maybe it's just your grade. So there are 80 students in your population, and you want to get an estimate of the average height in your population. You think it's too hard for you to go and measure the height of all 80 students. So you decide to take a simple random sample. You think it's reasonable for you to measure the heights of 30 of these students.

What you want to do is randomly sample 30 of the 80 students, take their average height, and say, "Well, that's probably a pretty good estimate for the population parameter, the average height of the entire population." So once you decide to do this, you say, "Well, how do I select those 30 students, and how do I select them so that I feel good that the selection is actually random?"

There are several ways that you could approach this. One way to do it is to associate every person in your school with a piece of paper and put them all in a bowl, then pick them out. Let's do that!

So let's say this is alphabetically the first person in the school; they're on a slip of paper. Then the next slip of paper gets the next person, and you're going to go all the way down. So you're going to have 80 pieces of paper; they all should be the same size. Then you throw them all into a bowl of some kind. This seems like a very basic way of doing it, but it's actually a pretty effective way of getting a simple random sample.

I'll try to draw a little bowl that looks like a fish bowl or something. All right, so that's our bowl, and all the pieces of paper go in there. Then you get someone to put a blindfold on, so they can't see the names and can't feel which names are there. They should pick out the first 30 without replacing them, because you obviously don't want to pick the same name twice. Those 30 names that you pick would be your simple random sample, and then you could measure their heights to estimate the average height for the population. This would be a completely legitimate way of doing it.
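The bowl-and-slips procedure amounts to drawing without replacement. Here is a minimal Python sketch of that idea, assuming the 80 students are represented by placeholder names (`student_01` through `student_80` are hypothetical and not from the video). It uses the standard library's `random.sample`, which never returns the same item twice.

```python
import random

# Hypothetical roster: placeholder names standing in for the 80 students.
roster = [f"student_{i:02d}" for i in range(1, 81)]


def draw_from_bowl(names, k, seed=None):
    """Draw k names without replacement, like pulling slips from a bowl."""
    rng = random.Random(seed)  # seed is optional, only for reproducibility
    return rng.sample(names, k)


sample = draw_from_bowl(roster, 30)
print(len(sample), len(set(sample)))  # both counts match: no name is drawn twice
```

Each slip has the same chance of being drawn, which is what makes the result a simple random sample.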

Other ways that you could do it: if you have a computer or calculator, you could use a random number generator. The random functions in programming languages or on your calculator vary; in some places you'll see something like math.rand() (rand is short for random), and in others you might see something like random() without anything passed into it. It might give you a number between 0 and 1, or between 0 and 100.
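If the generator only returns a number between 0 and 1, it has to be mapped onto whole numbers with some care so that each one is equally likely. The sketch below shows one common way to do this in Python for the 80-student population; the function name `pick_student_number` is made up for illustration, and it assumes the generator returns values in the half-open range from 0 up to (but not including) 1, as Python's `random.random()` does.

```python
import math
import random


def pick_student_number(n_students=80, rng=random):
    """Map a uniform float in [0, 1) to a whole number from 1 to n_students."""
    u = rng.random()  # 0.0 <= u < 1.0
    # Scaling by n_students and flooring splits [0, 1) into equal-width bins.
    return math.floor(u * n_students) + 1


number = pick_student_number()
print(f"{number:02d}")  # two-digit label, e.g. 07 or 59
```

Rounding `u * 80` to the nearest whole number instead would give 0 and 80 only half the chance of the other values, which is the kind of uneven chance to watch out for. Python also provides `random.randint(1, 80)`, which includes both endpoints directly.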

You have to be very careful about how you use this to make sure that every number has an even chance of being picked. What you would do in this situation, if you had access to a random number generator that could pick out a random number between 1 and 80, including 1 and 80, is line up all the students' names alphabetically. The first student alphabetically would be assigned the number 01, and you could just say 1 if you're using a random number generator.

But I'll use two digits for it just because it'll be useful and consistent, and in a little bit, we'll use another technique where it's going to be nice to be consistent with our number of digits. So the next one is 02, and you go all the way to 79 and then 80. Then you use your random number generator to keep generating numbers from 1 to 80. Skipping any repeats, you take the first 30 distinct numbers to be your actual random sample.
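The procedure just described (number the students 01 to 80, keep generating numbers, skip repeats, stop at 30) can be sketched in Python as follows. This is only an illustration: the function name `sample_by_rng` and its parameters are assumptions, and `random.randint` is used because it includes both endpoints.

```python
import random


def sample_by_rng(n_students=80, sample_size=30, seed=None):
    """Generate numbers from 1 to n_students, skipping repeats, until
    sample_size distinct numbers have been collected."""
    rng = random.Random(seed)  # seed is optional, only for reproducibility
    chosen = []
    seen = set()
    while len(chosen) < sample_size:
        number = rng.randint(1, n_students)  # includes both 1 and n_students
        if number in seen:
            continue  # a repeat: ignore it and generate again
        seen.add(number)
        chosen.append(number)
    return chosen


picked = sample_by_rng()
print([f"{n:02d}" for n in picked])  # 30 distinct two-digit labels
```

The students whose alphabetical numbers appear in `picked` form the sample.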

Another related technique, which is a little bit more old school, but it's definitely the way that it has been done in the past and even done now sometimes, is to use a random digit table. You still start with these number associations with each student in the class. Then you use a randomly generated list of numbers.

Let's say that’s our randomly generated list of numbers, and it keeps going well beyond this. You start at the beginning and say, "Okay, we're interested in getting 30 two-digit numbers from 1 to 80, including 1 and 80." One technique that you could use is, you start right at the beginning and say, "All right, this is a randomly generated list of numbers."

The first number here is 59. Is 59 between 1 and 80? Sure is! If this was a 01, that would have worked. If this was an 80, that would have worked. If this was a 00, it wouldn't have worked. If this was an 81, it wouldn't have worked. So this right over here would be our first name; you can imagine it as picking that first name out of the hat. It's whoever is associated with number 59. Now you would move on.

You get the next two digits; the next two digits are 83. They don't fall into our range from 1 to 80, so we're not going to use it. Then you look at the next two digits, and we get a 5 and a 9. Well, that fits in our range, but we already picked person 59, so we're not going to pick 59 again. So we keep moving on. Then we get a 37; well that's in our range, and we haven't picked that yet, so we do that.

Then we get a 00; once again, not in our range. Then 91: not in our range. Then 23: it's in our range, and we haven't picked it yet, so we're going to pick the 23. I think you see where this is going; we're going to keep going down this list in the way that I've just described until we get 30 of these.
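The walk through the random digit table can be sketched in Python like this. The function name `sample_from_digit_table` is made up for illustration, and `table_start` contains only the two-digit pairs read out above (59, 83, 59, 37, 00, 91, 23); the rest of the table is not given, so the sketch stops at three picks rather than 30.

```python
def sample_from_digit_table(digits, n_students=80, sample_size=30):
    """Read a string of random digits two at a time, keeping each number
    that is in range (1 to n_students) and has not been picked already."""
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop spaces
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i:i + 2])
        if not 1 <= number <= n_students:
            continue  # out of range, e.g. 00, 83, 91
        if number in chosen:
            continue  # already picked, e.g. the second 59
        chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


# Only the pairs read out in the walkthrough; the real table keeps going.
table_start = "59 83 59 37 00 91 23"
print(sample_from_digit_table(table_start))  # [59, 37, 23]
```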

We've just gotten three; we just have to keep on going. This isn't an exhaustive list of all of the different ways that you can get random numbers, but it starts to give you some techniques in your toolkit. You might say, "Oh, well, why don't I just randomly come up with some numbers in my head?" I would really suggest that you don't do that, because humans are famously bad at being truly random. You might even use something that you think is a random process, only to realize later that it wasn't as random as you thought.

So once again, multiple techniques, but these are some of the, I would say, best practices for actually generating a simple random sample.
