Calculating percentile | Modeling data distributions | AP Statistics | Khan Academy
The dot plot shows the number of hours of daily driving time for 14 school bus drivers. Each dot represents a driver. So, for example, one driver drives one hour a day, two drivers drive two hours a day, one driver drives three hours a day, and it looks like there are five drivers that drive 7 hours a day.
Which of the following is the closest estimate to the percentile rank for the driver with a daily driving time of 6 hours? Then they give us some choices. So pause the video and see if you can figure out which of these percentiles is the closest estimate, looking at this data right over here.
All right, now let's work through this together. When we're talking about percentile, we're really talking about a percentage of the data, and there are actually two ways that you could compute it. One is the percentage of the data that is below the amount in question. The other is the percentage of the data that is at or below the amount in question.
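The two definitions can be sketched in code. This is only an illustration: the function names and the small `sample` list are made up for the example and are not the dot plot data from the video.

```python
def percentile_rank_below(data, value):
    """Percentage of data points strictly below `value`."""
    count = sum(1 for x in data if x < value)
    return 100 * count / len(data)


def percentile_rank_at_or_below(data, value):
    """Percentage of data points at or below `value`."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)


# Hypothetical data for illustration only (not the bus driver data).
sample = [1, 2, 2, 3, 5]

below = percentile_rank_below(sample, 3)              # 3 of 5 values are below 3
at_or_below = percentile_rank_at_or_below(sample, 3)  # 4 of 5 values are at or below 3
print(below, at_or_below)  # 60.0 80.0
```

The only difference between the two is `<` versus `<=`, which is why the two conventions can give different percentile ranks for the same data point.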
So if we look at this right over here, let's just figure out how many data points – what percentage of the data points are below 6 hours per day? So let's see, there are – I’m just going to count them – 1, 2, 3, 4, 5, 6, 7. So seven of the 14 are below 6 hours.
If we use this first technique, we would have seven of the 14 that are below 6 hours per day, and that gets us 50%, so 6 hours is at the 50th percentile. If we want to say what percentage is at that number or below, then we would also count this one, the dot at 6 hours. So we would say eight out of 14, which is the same thing as 4 out of 7.
If we want to write that as a decimal, let's see, 7 goes into 4. We just need to estimate. So 7 goes into 40 five times, which is 35. We subtract, we get a five, bring down a zero, and 7 goes into 50 seven times. So it's roughly 0.57, or about 57%. So either of these would actually be a legitimate response to the percentile rank for the driver with a daily driving time of 6 hours.
It depends on whether you include the 6 hours or not, so you could say either the 50th percentile or roughly the 57th percentile. Now if you look at these choices here, lucky for us, there's only one choice that's reasonably close to either one of those, and that's the 55th percentile.
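The arithmetic for both conventions can be checked from the counts read off the dot plot: 7 drivers below 6 hours, 1 driver at 6 hours, 14 drivers in total. The variable names below are just for this sketch.

```python
from fractions import Fraction

total = 14      # drivers in the dot plot
below_six = 7   # drivers with less than 6 hours
at_six = 1      # drivers with exactly 6 hours

# Fraction reduces automatically, so 8/14 becomes 4/7.
rank_below = Fraction(below_six, total)
rank_at_or_below = Fraction(below_six + at_six, total)

pct_below = float(rank_below * 100)
pct_at_or_below = float(rank_at_or_below * 100)

print(rank_below, pct_below)                        # 1/2 50.0
print(rank_at_or_below, round(pct_at_or_below, 1))  # 4/7 57.1
```

Excluding the 6-hour driver gives 50%, and including that driver gives about 57%.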
It looks like the people who wrote this question went with the calculation of percentile where they include the data point in question, so everything at 6 hours or less—what percentage of the total data is that?