Example of under coverage introducing bias | Study design | AP Statistics | Khan Academy


Nov 11, 2024

A senator wanted to know how people in her state felt about internet privacy issues. She conducted a poll by calling 100 people whose names were randomly sampled from the phone book. Note that mobile phones and unlisted numbers are not in phone books. The senator's office called those numbers until they got a response from all 100 people chosen. The poll showed that 42% of respondents were very concerned about internet privacy.

What is the most concerning source of bias in this scenario? We should also think about what kind of bias is likely introduced. Is 42% likely to be an overestimate or an underestimate of the percentage of people who are very concerned about internet privacy? Maybe you think there is no bias here, but "no bias" is not one of the choices. So, it's going to be one of these three: non-response, undercoverage, or volunteer response sampling.

I encourage you to pause this video and think about what we just said. We're a senator, we're trying to figure out what percentage of our constituents are very concerned about internet privacy, and we go to the phone book, we sample 100 people, we keep calling them until they answer, and we get that 42% are very concerned. So, what's the source of bias?

Now let's work through this together. Non-response would have been the case if we selected these hundred people and, let's say, only 50 people answered the phone, and we didn't keep calling them. Then we'd say, "Well, you know, 50 of the people who we sampled to answer our survey didn't even respond." There was a non-response there. What was it about those 50 people? Maybe there was something about them that skewed the survey. If we had gotten them, it might have yielded better data.

But in this case, they tell us the senator's office called those numbers until they got a response from all 100 people chosen. So, the 100 people that they chose, they made sure they got a response. So, non-response is not going to be an issue here.

Next choice: undercoverage. Well, undercoverage is where you're not able to sample from part of the population. A part of the population that wasn't sampled might introduce bias. Let's think about what happened in this situation. We are a senator, we want to sample all of our constituents, but instead, we sample from the constituents who happen to be listed in the phone book.

So, these are the people who happen to be listed in the phone book. We're not sampling from people who are not in the phone book, who may have landlines and are unlisted. We're also not sampling from people who don't have landlines and only have mobile phones. You might say, "Well, why is that important?"

Well, think about it. Among people who decide not to list in the phone book, or people who don't even have a landline, some might be a little bit more concerned about privacy than everyone else. They explicitly chose not to be listed. So, undercoverage is definitely a very concerning source of bias here.

We are sampling from only a subset of the entire population that we care about. In particular, we're missing out on people who might care about privacy. I would say that because of undercoverage, 42% is likely to be an underestimate of the percentage of people concerned about internet privacy. There is probably a higher proportion of people out there who care about privacy, because they're unlisted or they don't even have a landline.

So, undercoverage probably introduced bias and implies that 42% is an underestimate of the percentage of the senator's constituents who care about internet privacy.
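To see why leaving out part of the population pushes the estimate down, here is a minimal sketch in Python. Every number other than the 42% from the poll is a made-up assumption for illustration: the problem does not tell us what share of constituents are listed in the phone book, or how concerned the unlisted and mobile-only people are. The function name population_proportion is also just a hypothetical helper.

```python
def population_proportion(listed_share, p_listed, p_unlisted):
    """Weighted average of the 'very concerned' rate across both groups."""
    return listed_share * p_listed + (1 - listed_share) * p_unlisted


# Assumption: 60% of constituents appear in the phone book (made-up value).
listed_share = 0.60
# From the poll: 42% of the phone-book respondents were very concerned.
p_listed = 0.42
# Assumption: unlisted / mobile-only people are more concerned (made-up value).
p_unlisted = 0.60

# What the senator would see if she could sample from everyone.
true_p = population_proportion(listed_share, p_listed, p_unlisted)

# Bias of the phone-book estimate: negative means an underestimate.
bias = p_listed - true_p

print(f"Phone-book estimate:    {p_listed:.1%}")
print(f"Whole-population value: {true_p:.1%}")
print(f"Bias:                   {bias:+.1%}")
```

Under these assumed numbers the whole-population value comes out to about 49%, so the 42% from the phone book is too low. The direction of the bias depends only on the uncovered group being more concerned than the covered group; the size of the bias depends on the made-up shares.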

Now, the last choice: volunteer response sampling. This would be the case where the senator, I don't know, put a billboard out or told people, maybe on her website, "Hey, vote for this," or "Give us your information on how much you care about internet privacy." That would have been the source of bias there.

Once again, if you said, "Hey, come to my website and fill this out," you're only getting information from a subset of your population who are choosing to volunteer. That is not the situation here. She didn't ask 100 people to volunteer; her team went out and got them from the phone book.

So, this was definitely a case of undercoverage.
