yego.me

Example: Comparing distributions | AP Statistics | Khan Academy


Nov 11, 2024

What we're going to do in this video is start to compare distributions. So for example, here we have two distributions that show the various temperatures different cities get during the month of January. This is the distribution for Portland; for example, they get eight days between one and four degrees Celsius. They get 12 days between four and seven degrees Celsius, so forth and so on. And then this is the distribution for Minneapolis.

Now when we make these comparisons, what we're going to focus on is the center of the distributions and also the spread. Sometimes people will talk about the variability of the distributions, and so these are the things that we're going to compare. In making the comparison, we're actually just going to try to eyeball it. We're not going to try to pick a measure of central tendency, say the mean or the median, and then calculate precisely what those numbers are for these. We might want to do that if they're close, but if we can eyeball it, that would be even better. Similarly for the spread or variability: in either of these cases, there are multiple measures in our statistical toolkit. For center, the mean and the median are both valuable. For spread or variability, there's the range, the interquartile range, the mean absolute deviation, and the standard deviation. These are all measures, but sometimes you can just kind of gauge it by looking.
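For cases where eyeballing isn't enough, the measures just listed can be computed directly. The following is a minimal sketch using Python's standard library; the `summarize` helper and the `temps` list are hypothetical and are not taken from the video's charts. The standard deviation here is the population version, which is an assumption about which variant is wanted.

```python
import statistics


def summarize(values):
    """Return common measures of center and spread for a list of numbers."""
    ordered = sorted(values)
    mean = statistics.mean(ordered)
    # quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    # Mean absolute deviation: average distance of each value from the mean.
    mean_abs_dev = sum(abs(x - mean) for x in ordered) / len(ordered)
    return {
        "mean": mean,
        "median": statistics.median(ordered),
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "mean_abs_dev": mean_abs_dev,
        "std_dev": statistics.pstdev(ordered),  # population standard deviation
    }


# Hypothetical daily temperatures in degrees Celsius (illustration only).
temps = [2, 3, 5, 6, 6, 7, 9, 12]
summary = summarize(temps)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note that different tools compute quartiles slightly differently, so the interquartile range from this sketch may not match a hand calculation that uses another convention.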

So in this first comparison, which distribution has a higher center, or are they comparable? Well, if you look at the distribution for Portland, the center of this distribution, let's say if we were to just think about the mean, although I think the mean and the median would be reasonably close, seems like it would be around seven or maybe a little bit lower than seven. So our central tendency, whether that's our mean or our median, would be somewhere in that range, maybe between five and seven. While for Minneapolis, it looks like our center is much lower, closer to maybe negative two or negative three degrees Celsius.

So here, even though we don't know precisely what the mean or the median is of each of these distributions, you can say that Portland's distribution has a higher center, however you want to measure it, either mean or median. Now what about the spread or variability? Well, if you just superficially thought about range, you see here that there's nothing below one degree Celsius and nothing above 13. So you have about a 12-degree range at most, right over here. In fact, what might be contributing to this first column might be a bunch of things at 3 degrees or even 3.9 degrees. Similarly, what's contributing to this last column might be a bunch of things at 10.1 degrees, so the actual range could be smaller. But at most, you have a 12-degree range right over here, while for Minneapolis it looks like it's approaching a 27-degree range.
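The "at most" reasoning about range from a histogram can be sketched in code. This is only an illustration: the `range_bounds` helper is hypothetical, and the Portland bin edges of 1, 4, 7, 10, and 13 degrees are an assumption inferred from the description above (nothing below 1, nothing above 13, columns three degrees wide), not values read off the chart.

```python
def range_bounds(bin_edges):
    """Bound the range of data known only through its histogram bins.

    bin_edges lists the edges of the occupied bins in increasing order,
    so the first and last bins each contain at least one value.
    Returns (smallest_possible_range, largest_possible_range).
    """
    if len(bin_edges) < 3:
        raise ValueError("need at least two occupied bins")
    # Largest case: one value at the far left edge, one at the far right edge.
    largest = bin_edges[-1] - bin_edges[0]
    # Smallest case: values pressed against the inner edges of the end bins.
    smallest = bin_edges[-2] - bin_edges[1]
    return smallest, largest


# Assumed bin edges for Portland, in degrees Celsius.
portland_edges = [1, 4, 7, 10, 13]
low, high = range_bounds(portland_edges)
print(f"Portland range is between about {low} and {high} degrees")
```

Under these assumed edges, the range is bounded above by 12 degrees and below by about 6 degrees, which matches the idea that values near 3.9 and 10.1 would give a range well under the maximum.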

So based on that, and even if you just eyeball it, since we're using the same scales for our horizontal axes here, the temperature axes, this is just a much wider distribution than what you see over here. And so you would say that the Minneapolis distribution has more spread, or more variability.

Let's do another example, and we'll use a different representation for the data. Here, we're told at the Olympic Games, many events have several rounds of competition. One of these events is the men's 100-meter backstroke. The upper dot plot shows the times in seconds of the top eight finishers in the final round of the 2012 Olympics, so that's in green right over here—the final round. The lower dot plot shows the times of the same eight swimmers, but in the semi-final round.

So given these distributions, which one has a higher center? Well, once again, here it's actually a little bit easier to eyeball what the median might be. For the mean, I would probably have to do a little bit more mathematics, but let's say the median. Let's see, there's one, two, three, four, five, six, seven, eight data points. So the median is going to sit between the lower four and the upper four, and the central tendency for the final round looks like it's around 57.1 seconds.
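With an even number of data points, the median sits halfway between the two middle values, which is what counting in from each end of the dot plot finds. Here is a minimal sketch; the `median_of_even` helper and the eight times in `sample_times` are hypothetical values chosen for illustration and are not the swimmers' actual results.

```python
def median_of_even(values):
    """Median of an even-sized data set: average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0 or n % 2 != 0:
        raise ValueError("expected a non-empty, even number of values")
    lower_middle = ordered[n // 2 - 1]  # largest value of the lower half
    upper_middle = ordered[n // 2]      # smallest value of the upper half
    return (lower_middle + upper_middle) / 2


# Hypothetical times in seconds for eight swimmers (illustration only).
sample_times = [56.8, 56.9, 57.0, 57.0, 57.2, 57.3, 57.5, 57.9]
sample_median = median_of_even(sample_times)
print(f"median is roughly {sample_median:.1f} seconds")
```

For eight values, the lower four and upper four split the data, and the median lies between the fourth and fifth values.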

While if we think about the median in particular, the central tendency for the semi-final round, let's see, one, two, three, four, five, six, seven, eight, looks like it is right about there, so this is a bit more than 57.3 seconds. So the semi-final round seems to have a higher central tendency, which is a little bit counter-intuitive. You would expect the finalists to be swimming faster on average than the semi-finalists, but that's not what the data is showing. So the semi-final round has a higher center. I just eyeballed the median, and I suspect that the mean would also be higher in this second distribution.

Now what about variability? Well, once again, if you just looked at range, and these are both at the same scale, if you just visually look, the range for the final round is larger than the range for the semi-final round. So you would say that the final round has higher variability. It has a higher range, and eyeballing it, it looks like it has a higher spread.

And there are, of course, times where one distribution could have a higher range but a lower standard deviation. For example, you could have data with two data points that are really far apart, but then all of the other data just sits really, really closely packed. So for example, a distribution like this, and I'll draw the horizontal axis here just so you can imagine it as a distribution, might have a higher range but lower standard deviation than a distribution like this.

I'm just drawing a very rough example. A distribution like this has a lower range but actually might have a higher standard deviation than the one above it. In fact, I can make that even better. A distribution like this would have a lower range, but it would also have a higher standard deviation. So it's not always the case that by looking at just one of these measures, the range or the standard deviation, you'll know for sure which distribution has more spread.
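The point that range and standard deviation can disagree is easy to check numerically. The two data sets below are hypothetical and constructed only to illustrate the idea: the first has two far-apart points with everything else packed together, while the second has a narrower range but its points split into two clusters away from the center. The standard deviation used is the population version.

```python
import statistics


def value_range(values):
    """Range of a data set: largest value minus smallest value."""
    return max(values) - min(values)


# Hypothetical data: two extreme points, everything else packed at 5.
wide_but_tight = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]
# Hypothetical data: narrower range, but no point is near the center.
narrow_but_split = [1, 1, 1, 1, 1, 9, 9, 9, 9, 9]

for name, data in [("wide_but_tight", wide_but_tight),
                   ("narrow_but_split", narrow_but_split)]:
    print(f"{name}: range={value_range(data)}, "
          f"std_dev={statistics.pstdev(data):.2f}")
```

In this sketch the first data set has the larger range while the second has the larger standard deviation, so the two measures rank the spread of these sets in opposite orders.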

But in cases like this, it's safe to say when you're looking at it by inspection that the green data, the final round, does seem to have a higher range and higher variability, and so I'd feel pretty good about this very high-level comparison.
