Analyzing a cumulative relative frequency graph | AP Statistics | Khan Academy
Nutritionists measured the sugar content in grams for 32 drinks at Starbucks. A cumulative relative frequency graph—let me underline that—a cumulative relative frequency graph for the data is shown below.
So they have different amounts of sugar in grams on the horizontal axis, and then we have the cumulative relative frequency. So let's just make sure we understand how to read this. This is saying that zero, or 0%, of the drinks have no sugar content.
This right over here, this data point, looks like it's at the 5 g mark, and then this looks like it's at the 0.1 mark. This says that 0.1, or I guess we would say 10%, of the drinks that Starbucks offers have 5 g of sugar or less. This data point tells us that 100% of the drinks at Starbucks have 50 g of sugar or less.
That is why it's called the cumulative relative frequency: for each of these points, we read off the proportion of drinks that have that much sugar or less. And that's why it just keeps on increasing as we add more sugar. We're going to see that a larger proportion, or a larger relative frequency, has that much sugar or less.
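To make the "that much sugar or less" idea concrete, here is a minimal sketch of how a cumulative relative frequency could be computed from raw measurements. The list `sugar_grams` and the function name are hypothetical, made up purely for illustration; they are not the 32 Starbucks measurements, which the video does not list.

```python
def cumulative_relative_frequency(values, threshold):
    """Proportion of values that are less than or equal to threshold."""
    return sum(1 for v in values if v <= threshold) / len(values)

# Hypothetical sugar contents in grams (illustration only, not the video's data).
sugar_grams = [0, 5, 12, 15, 20, 25, 30, 40, 45, 50]

# Evaluate the proportion at a few sugar amounts along the horizontal axis.
thresholds = [0, 10, 20, 30, 40, 50]
crf = [cumulative_relative_frequency(sugar_grams, t) for t in thresholds]

for t, p in zip(thresholds, crf):
    print(f"{t:>2} g or less: {p:.1f}")
```

Because each proportion counts every drink at or below the threshold, the values can only stay the same or go up as the threshold increases, and they end at 1.0.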
So let's read the first question. An iced coffee has 15 g of sugar. Estimate the percentile of this drink to the nearest whole percent. So iced coffee has 15 g of sugar, which would be right over here.
So let's estimate the percentile. We can see they actually have a data point right over here, and we can see that 20% or 0.2 of the drinks that Starbucks offers have 15 g of sugar or less. So if I were to estimate the percentile of this drink, it looks like a relative frequency of 0.2 of the drinks has that much sugar or less.
And so this would be the 20th percentile. Once again, another way to think about it, to read this, you could convert these to percentages. You could say that 20% of the drinks have this much sugar or less, 15 g of sugar or less. So an iced coffee is at the 20th percentile.
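Reading a percentile off the graph can be sketched in code as a lookup along the curve. This is only a sketch under stated assumptions: `GRAPH_POINTS` holds the five points read aloud in this transcript (the real graph likely has more), the function name `percentile_of` is hypothetical, and straight-line interpolation between points is assumed.

```python
# (sugar in grams, cumulative relative frequency) as read off in the transcript.
# Assumption: the actual graph has more points than these five.
GRAPH_POINTS = [(0, 0.0), (5, 0.1), (15, 0.2), (25, 0.5), (50, 1.0)]

def percentile_of(grams, points=GRAPH_POINTS):
    """Estimate the percentile (nearest whole percent) of a sugar amount,
    interpolating linearly between neighboring graph points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= grams <= x1:
            frequency = y0 + (y1 - y0) * (grams - x0) / (x1 - x0)
            return round(100 * frequency)
    raise ValueError("grams is outside the range of the graph")

iced_coffee_percentile = percentile_of(15)
print(iced_coffee_percentile)  # 20
```

Since 15 g sits exactly on a plotted point, no interpolation is really needed here, and the result matches the 20th percentile read from the graph.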
Let's do another question. So here we are asked to estimate the median of the distribution of drinks. Hint: Think about the 50th percentile. So the median, if you were to line up all of the drinks, you would take the middle drink.
And so you could view that as, well, what drink is exactly at the 50th percentile? Now let's look at the 50th percentile. It would be a cumulative relative frequency of 0.5, which would be right over here on our vertical axis.
Another way to think about it is that 0.5, or 50%, of the drinks: if we go to this point right over here, which has a cumulative relative frequency of 0.5, we see that we are right at what looks like 25 g.
So one way to interpret this is that 50% of the drinks have 25 g of sugar or less. So this looks like a pretty good estimate for the median, for the middle data point. The median is approximately 25 g; that is, half of the drinks have 25 g or less of sugar.
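Finding the median is the same lookup run in the other direction: start from 0.5 on the vertical axis and read across to the sugar amount. A minimal sketch, assuming the same five transcript points and straight-line interpolation; the name `grams_at` is hypothetical.

```python
# (sugar in grams, cumulative relative frequency) as read off in the transcript.
# Assumption: the actual graph has more points than these five.
GRAPH_POINTS = [(0, 0.0), (5, 0.1), (15, 0.2), (25, 0.5), (50, 1.0)]

def grams_at(frequency, points=GRAPH_POINTS):
    """Estimate the sugar amount at a given cumulative relative frequency,
    interpolating linearly between neighboring graph points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= frequency <= y1:
            return x0 + (x1 - x0) * (frequency - y0) / (y1 - y0)
    raise ValueError("frequency must be between 0 and 1")

median_grams = grams_at(0.5)
print(median_grams)  # 25.0
```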
Let's do one more based on the same data set. So here we're asked, what is the best estimate for the interquartile range of the distribution of drinks?
So the interquartile range—we want to figure out, well, what's sitting at the 25th percentile? And we want to think about what's at the 75th percentile, and then we want to take the difference. That's what the interquartile range is.
So let's do that. First, the 25th percentile: we'd want to look at the cumulative relative frequency. This mark would be the 30th, so the 25th would be right around here.
And so it looks like the 25th percentile is about, I don't know, and we're estimating here, this would be 15 g, so I would say maybe 18 g, so approximately 18 g.
Once again, one way to think about it is that 25% of the drinks have 18 g of sugar or less. Now let's look at the 75th percentile. So this is the 70th; the 75th would be right over there.
Actually, I can draw a straighter line than that; I have a line tool here. So the 75th percentile would put me right over there. I don't know, that looks like—I'll go with 39 g, roughly 39 g.
And so what's the difference between these two? Well, the difference between these two looks like it's about 21 g. So that is our estimate of the interquartile range, looking at this cumulative relative frequency distribution, because we're saying, hey, look, it looks like 25% of the drinks have 18 g or less, and 75% of the drinks have 39 g or less.
If we take the difference between these two quartiles (this is the first quartile; this is our third quartile), we're going to get 21 g. Now, if we look at the choices right over here, 20 g definitely seems like the best estimate, the closest to what we were able to estimate based on looking at this cumulative relative frequency graph.
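As a rough cross-check of that estimate, the interquartile range can be sketched in code as the sugar amount at 0.75 minus the sugar amount at 0.25. This assumes only the five points mentioned in the transcript and straight-line interpolation between them, so the quartiles come out a little different from the 18 g and 39 g read off the full graph; `grams_at` is a hypothetical helper name.

```python
# (sugar in grams, cumulative relative frequency) as read off in the transcript.
# Assumption: the actual graph has more points than these five.
GRAPH_POINTS = [(0, 0.0), (5, 0.1), (15, 0.2), (25, 0.5), (50, 1.0)]

def grams_at(frequency, points=GRAPH_POINTS):
    """Estimate the sugar amount at a given cumulative relative frequency,
    interpolating linearly between neighboring graph points."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= frequency <= y1:
            return x0 + (x1 - x0) * (frequency - y0) / (y1 - y0)
    raise ValueError("frequency must be between 0 and 1")

q1 = grams_at(0.25)  # first quartile, about 16.7 g with these sparse points
q3 = grams_at(0.75)  # third quartile, 37.5 g with these sparse points
iqr = q3 - q1
print(f"Q1 = {q1:.1f} g, Q3 = {q3:.1f} g, IQR = {iqr:.1f} g")
```

With these sparse points the interquartile range comes out to about 20.8 g, which is in line with the 21 g estimated from the graph and with 20 g being the closest answer choice.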