
Residual plots | Exploring bivariate numerical data | AP Statistics | Khan Academy


Nov 11, 2024

What we're going to do in this video is talk about the idea of a residual plot for a given regression and the data that it's trying to explain.

So right over here we have a fairly simple least squares regression. We're trying to fit four points. In previous videos, we actually came up with the equation of this least squares regression line. What I'm going to do now is plot the residuals for each of these points.

So what is a residual? Well, just as a reminder, your residual for a given point is equal to the actual minus the expected. So how do I make that tangible? Well, what's the residual for this point right over here? For this point here, the actual y when x equals 1 is one. But the expected when x = 1 for this least squares regression line, 2.5 * 1 - 2, well that's going to be 0.5.

And so our residual is 1 minus 0.5, so we have a positive 0.5 residual. Over here, for this point, you have zero residual; the actual is the expected. For this point right over here, the actual y when x equals 2 is two, but the expected is three.

So our residual over here, once again: the actual is y = 2 when x = 2; the expected, 2.5 * 2 - 2, is 3. So this is going to be 2 - 3, which equals a residual of -1. And then over here, our actual when x = 3 is 6. Our expected when x = 3 is 5.5, so 6 minus 5.5, that is a positive 0.5.
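The arithmetic above can be checked with a short Python sketch. The four data points and the line y = 2.5x - 2 come from the transcript; the names `points`, `predict`, and `residual` are illustrative choices, not anything from the video.

```python
# The four data points mentioned in the transcript.
points = [(1, 1), (2, 2), (2, 3), (3, 6)]

def predict(x):
    """Expected y on the regression line y = 2.5x - 2 quoted in the video."""
    return 2.5 * x - 2

def residual(x, y):
    """Residual = actual minus expected."""
    return y - predict(x)

residuals = [residual(x, y) for x, y in points]
print(residuals)  # [0.5, -1.0, 0.0, 0.5]
```

The residuals sum to zero here, which is what you would expect from a least squares line that includes an intercept.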

So those are the residuals. But how do we plot it? Well, we would set up our axes. Let me do it right over here: one, two, and three. And let's see, the maximum residual here is 0.5 and then the minimum one here is -1.

So let's see, this could be 0.5, 1, 1.5. So this is -1, this is positive one here. And so when x equals 1, what was the residual? Well, the actual was one, expected was 0.5. 1 - 0.5 is 0.5. So this right over here we can plot right over here; the residual is 0.5.

When x equals 2, we actually have two data points. First I'll do this one: when we have the point (2, 3), the residual there is zero, so for one of them, the residual is zero.

Now for the other one, the residual is -1. Let me do that in a different color. For the other one, the residual is negative one, so we would plot it right over here. And then this last point, the residual is positive 0.5, so it is just like that.

And so this thing that I have just created, where for each x that has a corresponding data point we plot a point above or below the horizontal axis based on the residual, is called a residual plot.
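As a rough sketch of the same construction in Python: each data point keeps its x, and its y is replaced by the residual. The helper names `residual_plot_points` and `render_text_plot` are hypothetical, and the text rendering is only a crude stand-in for the hand-drawn axes in the video (it assumes residuals fall exactly on the listed levels, which they do for this data).

```python
# Data points and regression line (y = 2.5x - 2) as given in the transcript.
points = [(1, 1), (2, 2), (2, 3), (3, 6)]

def residual_plot_points(data, slope=2.5, intercept=-2.0):
    """Return (x, residual) pairs, where residual = actual - expected."""
    return [(x, y - (slope * x + intercept)) for x, y in data]

def render_text_plot(pairs, levels=(1.0, 0.5, 0.0, -0.5, -1.0), xs=(1, 2, 3)):
    """Crude text rendering: one row per residual level, one column per x."""
    rows = []
    for level in levels:
        cells = ["*" if (x, level) in pairs else "." for x in xs]
        rows.append(f"{level:+5.1f} | " + "  ".join(cells))
    # Label the columns with their x values under the grid.
    rows.append(" " * 8 + "  ".join(str(x) for x in xs))
    return "\n".join(rows)

pairs = residual_plot_points(points)
print(pairs)  # [(1, 0.5), (2, -1.0), (2, 0.0), (3, 0.5)]
print(render_text_plot(pairs))
```

Note that x = 2 gets two marks, one at 0 and one at -1, matching the two data points that share that x value.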

Now one question is why do people even go through the trouble of creating a residual plot like this? The answer is, regardless of whether the regression line is upward sloping or downward sloping, this gives you a sense of how good a fit it is and whether a line is good at explaining the relationship between the variables.

The general idea is if you see the points pretty evenly scattered or randomly scattered above and below this line, you don't really discern any trend here; then a line is probably a good model for the data. But if you do see some type of trend, if the residuals had an upward trend like this or if they were curving up and then curving down or they had a downward trend, then you might say, "Hey, this line isn't a good fit," and maybe we would have to do a nonlinear model.

What are some examples of other residual plots? And let's try to analyze them a bit. So right here you have a regression line and its corresponding residual plot. And once again, you see here the residual is slightly positive; the actual is slightly above the line, and you see it right over there, it's slightly positive.

This one's even more positive; you see it there. But like the example we just looked at, it looks like these residuals are pretty evenly scattered above and below the line. There isn't any discernible trend, and so I would say that a linear model here, and in particular this regression line, is a good model for this data.

But if we see something like this, a different picture emerges. When I look at just the residual plot, it doesn't look like they're evenly scattered. It looks like there's some type of trend here: I'm going down here, but then I'm going back up.

When you see something like this, where on the residual plot you're going below the x-axis and then above, then you might say, "Hey, a linear model might not be appropriate." Maybe some type of nonlinear model, some type of nonlinear curve, might better fit the data, or the relationship between the y and the x is nonlinear.
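To make this kind of pattern concrete, here is a minimal sketch that fits a least squares line to deliberately curved data. The data set `curved` (y = x squared for x = 0 to 4) is a made-up illustration, not the data shown in the video, and `least_squares_line` is a hypothetical helper using the standard slope and intercept formulas.

```python
def least_squares_line(data):
    """Return (slope, intercept) of the least squares line through data."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical nonlinear data: y = x^2 (an assumption for illustration only).
curved = [(x, x * x) for x in range(5)]

slope, intercept = least_squares_line(curved)
residuals = [y - (slope * x + intercept) for x, y in curved]
print(slope, intercept)  # 4.0 -2.0
print(residuals)         # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals start positive, dip below zero, and come back up, so they follow a curve rather than scattering evenly. Applied to the four points from the first example, the same helper returns a slope of 2.5 and an intercept of -2, matching the line used earlier.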

Another way you could think about it is when you have a lot of residuals that are pretty far away from the x-axis in the residual plot, you would also say this line isn't such a good fit. If you calculate the r value (the correlation coefficient) here, it would only be slightly positive, but it would not be close to one.
