
Mosaic plots and segmented bar charts | Exploring two-variable data | AP Statistics | Khan Academy


Nov 10, 2024

Let's say we're looking at some type of disease, and we want to see if there's any relationship between people having antibodies for that disease and whether they are adults, children, or infants.

If you don't know what antibodies are, these are things that your immune system keeps around, so it's very easy to recognize future infections. But you don't have to worry too much about that for this video. In this video, we're just trying to think about how we can visualize data to understand if there's a relationship between having antibodies and the age of the individual.

So, let's say we go out and collect a bunch of data. We test 120 adults, and 114 have antibodies; 6 don't. We test 60 children; 54 have antibodies, and 6 don't. We test 20 infants, and then 8 have antibodies, and 12 don't.

We can just look at this data, but this really still doesn't give us a visual representation of what's going on. One step we can take, which still doesn't give us a fully visual representation, is to just think about percentages that might help us think about the likelihood of having antibodies.

If we calculate the percentages, we might see something like this. For example, 114 over 120 is 0.95, so 95% of adults have antibodies. The number that don't have antibodies, this 6 right over here, is 6 over 120, or 5%. You can do that for each of the categories: 54 over 60 is 90%, while 6 over 60, you can do that math in your head, is 10%.

We could do the same thing for the infants: 8 out of 20 is 40%, while 12 out of 20 is 60%. So that helps us a little bit; it helps us think about what percentage of adults, children, or infants have the antibodies.
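The percentage arithmetic above can be sketched in a few lines of Python. This is just an illustration using the counts from the example; the names `counts` and `row_percentages` are made up for this sketch and are not from the video.

```python
# Counts from the example: (has antibodies, no antibodies) per age group.
counts = {
    "adults": (114, 6),
    "children": (54, 6),
    "infants": (8, 12),
}


def row_percentages(table):
    """Return {group: (pct_yes, pct_no)}; each pair sums to 100."""
    result = {}
    for group, (yes, no) in table.items():
        total = yes + no
        result[group] = (100 * yes / total, 100 * no / total)
    return result


percentages = row_percentages(counts)
for group, (pct_yes, pct_no) in percentages.items():
    print(f"{group:<9} yes: {pct_yes:.0f}%  no: {pct_no:.0f}%")
```

Each percentage is taken within one age group, so the two numbers for a group always add up to 100.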

But if we really want to visualize it, we can look at two different types of visualizations. One, we can call a segmented bar chart, and I will show a segmented bar chart for this data right over here.

Now, in a segmented bar chart, we have a bar for each category. Here we're making adults, children, and infants the categories because we think age group might have something to do with the likelihood of having antibodies.

For each bar, for example this adult bar, you can see the percentage that have the antibodies and the percentage that don't. So 95% of the adult bar is filled in blue; that's for yes, they have the antibodies, and 5% is filled in red. For children, you can see that 90% is filled in blue, and 10% is filled in red because 10% don't have the antibodies.

Then for infants, you can see that 40% is filled in blue, and 60% is filled in red because 60% don't have the antibodies. Now, this by itself is pretty useful to visually see: it looks like adults are slightly more likely to have the antibodies than children (95% versus 90%), and children are far more likely to have the antibodies than infants (90% versus 40%).
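To get a feel for what the segmented bar chart encodes, here is a rough text-only sketch, not the chart from the video. Every bar gets the same width (20 characters, an arbitrary choice for this sketch), `#` stands in for blue (has antibodies), and `.` stands in for red (does not). The function name `segmented_bar` is hypothetical.

```python
# Counts from the example: (has antibodies, no antibodies) per age group.
counts = {
    "adults": (114, 6),
    "children": (54, 6),
    "infants": (8, 12),
}


def segmented_bar(yes, no, width=20):
    """Draw one fixed-width bar: '#' for antibodies, '.' for none."""
    filled = round(width * yes / (yes + no))
    return "#" * filled + "." * (width - filled)


bars = {group: segmented_bar(yes, no) for group, (yes, no) in counts.items()}
for group, bar in bars.items():
    print(f"{group:<9} |{bar}|")
```

Because every bar has the same width, only the split inside each bar carries information; the number of people tested in each group is not visible.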

So it looks like this idea of making a bar for each of adults, children, or infants was a good way to start to understand the likelihood of having antibodies. You could have done it other ways; you could have had a bar for having antibodies and another bar for not having antibodies, and then you could have segmented the bar chart by whether they are adults, children, or infants.

But if you did that, you would be trying to understand whether having or not having antibodies is predictive of whether you're an adult, child, or infant. The version here makes, at least to me, a little more sense: whether you're an adult, child, or infant might be predictive of whether or not you have antibodies.
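The alternative layout, one bar per antibody status segmented by age group, would use a different set of proportions. A small sketch of that calculation, with the hypothetical helper `column_shares`, might look like this:

```python
# Counts from the example: (has antibodies, no antibodies) per age group.
counts = {
    "adults": (114, 6),
    "children": (54, 6),
    "infants": (8, 12),
}


def column_shares(table, index):
    """Share of each age group within one antibody status.

    index 0 selects "has antibodies", index 1 selects "no antibodies".
    """
    total = sum(row[index] for row in table.values())
    return {group: row[index] / total for group, row in table.items()}


with_antibodies = column_shares(counts, 0)
without_antibodies = column_shares(counts, 1)

for label, shares in (("yes", with_antibodies), ("no", without_antibodies)):
    parts = ", ".join(f"{g}: {s:.1%}" for g, s in shares.items())
    print(f"{label:<3} -> {parts}")
```

Here the shares within each antibody status sum to 1, so the question being answered is "given the antibody status, which age group is the person likely to be in?", which is the reverse of the chart in the video.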

But there is some information lost in this segmented bar chart. For example, we have lost the fact that we have sampled, or we have tested, a lot more adults than children and far more children than infants.

So to incorporate that information back into the visualization, essentially to show how many people we sampled in each of these categories, we can generate what's known as a mosaic plot. This is a mosaic plot right over here.

One way to think about it is we have just adjusted the width of each of these bars based on how many people we tested. We tested 200 people, and so you can view this width right over here as being 200. You can see that we tested 120 adults, so the width of this first bar, I guess you could say—although now we're dealing with a mosaic plot—this width right over here would be 60 percent of this entire width, which you can see that it is.

Then, the children make up 60 of the 200 that we tested, and so this width right over here would be 60 over the entire 200, or 30 percent of the entire width. We tested the fewest infants, and so this width right over here represents the 20 infants we tested, or 10 percent of the entire width.
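The bar widths described here are just each group's share of everyone tested. A minimal sketch of that calculation follows; the variable names are illustrative only.

```python
# Counts from the example: (has antibodies, no antibodies) per age group.
counts = {
    "adults": (114, 6),
    "children": (54, 6),
    "infants": (8, 12),
}

# Number of people tested in each group, and overall.
group_totals = {group: yes + no for group, (yes, no) in counts.items()}
grand_total = sum(group_totals.values())

# Width of each mosaic column as a fraction of the whole plot width.
widths = {group: total / grand_total for group, total in group_totals.items()}

for group, width in widths.items():
    print(f"{group:<9} tested: {group_totals[group]:>3}  width: {width:.0%}")
```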

The mosaic plot conveys all the same information that our segmented bar chart does, but it also gives us a sense that we tested more adults than children and far more children than infants.

It's also easy to then look at the total number of people who don't have the antibodies, which is the red area right over here, and see that even though we tested the fewest infants, infants make up a large chunk of it: 12 of the 24 people without antibodies.
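One way to check this reading of the red area: in a mosaic plot, each tile's area is its column width times its height within the column, which works out to that cell's share of all 200 people. The sketch below uses exact fractions; the `tiles` dictionary and its keys are assumptions made for this illustration.

```python
from fractions import Fraction

# Counts from the example: (has antibodies, no antibodies) per age group.
counts = {
    "adults": (114, 6),
    "children": (54, 6),
    "infants": (8, 12),
}

grand_total = sum(yes + no for yes, no in counts.values())

# Tile area = column width * segment height, as a fraction of the whole plot.
tiles = {}
for group, (yes, no) in counts.items():
    group_total = yes + no
    width = Fraction(group_total, grand_total)
    tiles[(group, "yes")] = width * Fraction(yes, group_total)
    tiles[(group, "no")] = width * Fraction(no, group_total)

# Total red area (no antibodies) and the infants' share of it.
red_area = sum(area for (_, status), area in tiles.items() if status == "no")
infant_share_of_red = tiles[("infants", "no")] / red_area

print(f"red area: {red_area}, infants' share of red: {infant_share_of_red}")
```

The infants' red tile is as large as the adult and child red tiles combined, even though the infant column is the narrowest.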

So I'll leave you there. The whole point of this video is to just understand why a segmented bar chart or mosaic plot will be useful. In future videos, we'll get more practice analyzing them.
