yego.me

Bivariate relationship linearity, strength and direction | AP Statistics | Khan Academy


6m read
·Nov 11, 2024

What we have here is six different scatter plots that show the relationship between different variables.

So for example, in this one here, on the horizontal axis we might have something like age, and on the vertical axis we could have accident frequency. I'm just making this up, but these data points could come from some kind of statistical survey: when the age is this, whatever number this is, maybe 20 years old, this is the accident frequency, and it could be a number of accidents per hundred.

And that when the age is 21 years old, this is the frequency. And so these data scientists or statisticians went and plotted all of these in this scatter plot. This is often known as bivariate data, which is a very fancy way of saying, "Hey, you're plotting things that take two variables into consideration," and you're trying to see whether there's a pattern with how they relate.

What we're going to do in this video is think about, "Well, can we try to fit a line? Does it look like there's a linear or non-linear relationship between the variables on the different axes? How strong is that relationship? Is it a positive or negative relationship?" And then we'll think about this idea of outliers.

So let's just first think about whether there's a linear or non-linear relationship, and I'll get my little ruler tool out here. So this data right over here, it looks like I could put a line through it that gets pretty close to the data. It's very unlikely you're going to be able to go through all of the data points, but you can try to get a line. There are more numerical, more precise ways of doing this, but I'm just eyeballing it right over here, and it looks like I could plot a line that looks something like that, that goes roughly through the data.

So this looks pretty linear, so I would call this a linear relationship. Since as we increase one variable, it looks like the other variable decreases, this is a downward sloping line. I would say this is negative; this is a negative linear relationship. But this one looks pretty strong.

So because the dots aren't that far from my line (this one gets a little bit further, but there aren't any dots way out there), most of them are pretty close to the line. So I'll call this a negative, reasonably strong linear relationship between these two variables.
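The line above was drawn by eye. As a hedged illustration of a more precise numerical approach, here is a minimal Python sketch of an ordinary least-squares fit. Least squares is one common choice; the video does not name a specific method. The data values and the `fit_line` helper are made up for illustration. The sign of the fitted slope gives the direction of the relationship.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: age on the horizontal axis, accidents per hundred on the vertical.
ages = [20, 21, 22, 23, 24, 25]
accidents = [9.0, 8.1, 7.4, 6.0, 5.2, 4.1]

slope, intercept = fit_line(ages, accidents)
direction = "negative" if slope < 0 else "positive"
print(f"slope={slope:.3f}, intercept={intercept:.3f}, direction={direction}")
```

A downward sloping fitted line (negative slope) matches the "negative" label used for this plot.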

Now let's look at this one and pause this video and think about what this one would be for you. Well, let's see, I'll get my ruler tool out again. It looks like I can try to put a line. It looks like, generally speaking, as one variable increases, the other variable increases as well.

So something like this goes through the data and approximates the direction, and this looks positive. As one variable increases, the other variable roughly increases, so this is a positive relationship. But this is weak; a lot of the data is well off of the line, so positive weak. But I'd say this is still linear; it seems that as we increase one, the other one increases at roughly the same rate, although these data points are all over the place, so I would still call this linear.

Now there's also this notion of outliers. If I said, "Hey, this line is trying to describe the data," well, we have some data that is fairly off the line. So for example, even though we're saying it's a positive weak linear relationship, this one over here is reasonably high on the vertical variable, but it's low on the horizontal variable.

So this one right over here is an outlier; it's quite far away from the line. This is a little bit subjective: an outlier is whatever looks pretty far from the rest of the data. This could also be an outlier. Let me label these outliers.
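One rough way to make "quite far away from the line" concrete is to measure each point's vertical distance from a fitted line and flag the points whose distance is large compared with the typical distance. The sketch below is only an illustration under stated assumptions: the data are made up, `fit_line` and `flag_outliers` are hypothetical helpers, and the cutoff of 2 times the typical distance is an arbitrary choice, which fits the point that calling something an outlier is a bit subjective.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def flag_outliers(xs, ys, cutoff=2.0):
    """Return points whose vertical distance from the fitted line exceeds
    `cutoff` times the root-mean-square distance (an assumed rule of thumb)."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    typical = math.sqrt(sum(e * e for e in residuals) / len(residuals))
    return [(x, y) for x, y, e in zip(xs, ys, residuals) if abs(e) > cutoff * typical]

# Made-up data: one point is high on the vertical variable but low on the horizontal one.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 9, 4, 5, 6, 7, 8, 9]

flagged = flag_outliers(xs, ys)
print(flagged)
```

Note that an outlier also pulls the fitted line toward itself, so a different cutoff or a different fitting method could flag a different set of points.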

Now pause the video and see if you can think about this one. Is this positive or negative? Is it linear or non-linear? Is it strong or weak? I'll get my ruler tool out here. So this goes here; it seems like I can fit a line pretty well to this.

So maybe I'll do the line in purple. I could fit a line that looks like that. And so this one looks like it's positive; as one variable increases, the other one does too for these data points. I'd say this is pretty strong; the dots are pretty close to the line. It really does look like a little bit of a fat line if you just look at the dots.

So positive strong linear relationship, and none of these data points are really strong outliers. This one's a little bit further out, but they're all pretty close to the line and seem to describe that trend roughly.

All right, now let's look at this data right over here. So let me get my line tool out again. So it looks like I can fit a line, and it looks like it's a positive relationship. The line would be upward sloping; it would look something like this. And once again, I'm eyeballing it; you can use computers and other methods to actually find a more precise line that minimizes the collective distance to all of the points, but it looks like there is a positive relationship.

But I would say this one is a weak linear relationship because you have a lot of points that are far off the line. So not so strong; I would call this a positive weak linear relationship, and there are a lot of outliers here. You know, this one over here is pretty far out.
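To compare a strong positive relationship with a weak one numerically, one common summary is the Pearson correlation coefficient r. The video does not use it; it is brought in here only as an illustration. It ranges from -1 to 1: its sign gives the direction, and values closer to -1 or 1 mean the points sit closer to a line. The two data sets and the `pearson_r` helper below are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
# Made-up data: dots hugging an upward sloping line.
tight_ys = [1.1, 2.3, 2.8, 4.2, 5.1, 5.8, 7.2, 7.9]
# Made-up data: same general direction, but scattered well off the line.
loose_ys = [3.0, 1.0, 6.0, 2.5, 7.5, 3.5, 5.0, 8.0]

r_tight = pearson_r(xs, tight_ys)
r_loose = pearson_r(xs, loose_ys)
print(f"tight: r={r_tight:.2f}, loose: r={r_loose:.2f}")
```

Both values come out positive, and the tighter cloud has the larger r. Where to draw the boundary between "strong" and "weak" remains a judgment call.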

Now let's look at this one. Pause this video and think about: Is it positive or negative? Is it strong or weak? Is this linear or non-linear? Well, the first thing we want to do is just think about linear or non-linear. I could try to put a line on it, but if I try to put a line on it, it's actually quite difficult.

If I try a line like this, notice everything is kind of bending away from the line. It looks like generally as one variable increases, the other variable decreases, but they're not doing it in a linear fashion. It looks like there's some other type of curve at play. So I could try to do a fancier curve that looks something like this, and this seems to fit the data a lot better.

So this one I would describe as non-linear, and it is a negative relationship: as one variable increases, the other variable decreases. This is subjective, but I'll say it's a negative, reasonably strong non-linear relationship.

And maybe you could call this one an outlier, but it's not that far, and I might even be able to fit a curve that gets a little bit closer to that. Once again, I'm eyeballing this.
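The "bending away from the line" observation can be checked crudely in code: fit a straight line, then look at whether points fall above or below it as you move left to right. With a curved pattern, the points tend to stay on one side of the line for long stretches, so the side changes only a few times. With a roughly linear pattern, the points hop back and forth more often. This is a rough heuristic sketch with made-up data and hypothetical helper names, not a formal test.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def side_changes(xs, ys):
    """Count how often consecutive points (ordered by x) switch between
    lying above and lying below the fitted line."""
    pairs = sorted(zip(xs, ys))
    slope, intercept = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    signs = [1 if y - (slope * x + intercept) > 0 else -1 for x, y in pairs]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
# Made-up data: decreasing along a curve.
curved_ys = [16, 8, 5.3, 4, 3.2, 2.7, 2.3, 2]
# Made-up data: decreasing along a line, with small scatter.
straight_ys = [9.3, 7.8, 7.2, 5.7, 5.3, 3.8, 3.2, 1.7]

curved_changes = side_changes(xs, curved_ys)
straight_changes = side_changes(xs, straight_ys)
print(curved_changes, straight_changes)
```

For the curved data the points sit above the line at both ends and below it in the middle, which is the bending pattern described above.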

Now let's do this last one. So this one looks like a negative linear relationship to me, a fairly strong negative linear relationship, although there are some outliers. So let me draw this line; that seems to fit the data pretty well. So this is a negative, reasonably strong linear relationship.

But these are very clear outliers; these are well away from the cluster where most of the points are. So there are at least these two significant outliers here.

So hopefully this makes you a little bit familiar with some of this terminology. And it's important to keep in mind this is a little bit subjective. There'll be some cases that are more obvious than others. So oftentimes you want to make a comparison: this is a stronger positive linear relationship than this one right over here, because you can see most of the data is closer to the line.

This one, for sure, is more non-linear than linear. It depends on how you want to describe it. Oftentimes you're making a comparison or making a subjective call on how to describe the data.
