Worked example of linear regression using transformed data | AP Statistics | Khan Academy


Nov 11, 2024

We are told that a conservation group with a long-term goal of preserving species believes that all at-risk species will disappear when land inhabited by those species is developed. It has an opportunity to purchase land in an area about to be developed. The group has a choice of creating one large nature preserve with an area of 45 square kilometers and containing 70 at-risk species or five small nature preserves, each with an area of three square kilometers and each containing 16 at-risk species unique to that preserve.

Which choice would you recommend and why?

Here's some interesting data. It looks like data they have gathered for different islands: we have each island's area, the number of species at risk in 1990, and the number of species extinct by 2000. So for these various islands we can see the area and the proportion that went extinct, and it looks like those are plotted on this scatter plot.

Now, be very careful when you look at this because look at the two axes. The vertical axis is the proportion extinct in 2000, so it's these numbers. But the horizontal axis isn't just straight up area; it's the natural log of the area. Why did they do this? Well, notice when you make the horizontal axis the natural log of the area, it looks like there's a linear relationship. But be clear, it's a linear relationship between the natural log of the area and proportion extinct in 2000.

The reason why it's valuable to do this type of transformation is that now we can apply our tools of linear regression to think about what the proportion extinct would be for the 45 square kilometer preserve versus the five small three-square-kilometer preserves. So pause this video and see if you can figure it out on your own.

They give us the regression data for a line that fits this data. All right, now let's work through it together. I'll make some space, because the data is already plotted right over here and we have our regression data.

We know the regression line's slope and y-intercept. The y-intercept is 0.28996, so the line crosses the vertical axis right about here. Its slope is approximately negative 0.05. I could eyeball it; it's probably going to look something like this. That's the regression line.

Another way to think about it is that the regression line tells us, in general, the proportion (I'll just say proportion as shorthand for proportion extinct) is going to be equal to our y-intercept, 0.28996, minus 0.05323 times, and we have to be careful here, not the area. You might be tempted to say times the area, but no: the horizontal axis here is the natural log of the area, so it's times the natural log of the area.
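Written out as an equation, the fitted line described above looks like the following. The hat notation and the variable names are shorthand introduced here, not something shown in the video; the two coefficients are the ones read off the regression output.

```latex
\[
\widehat{\text{proportion extinct}} \;=\; 0.28996 \;-\; 0.05323 \cdot \ln(\text{area})
\]
```

Because the predictor is $\ln(\text{area})$ rather than the area itself, any area has to be passed through the natural log before it is multiplied by the slope.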

We can use this equation for both scenarios to think about what proportion we would expect to go extinct in either situation, and then how many actual species would go extinct. The better choice is the one where fewer species go extinct, or equivalently the one where we preserve more.

So let's look at the two scenarios. The first scenario is the 45 square kilometer island. This is just one, so times one. What is going to be the proportion that we would expect to go extinct based on this regression?

It's going to be 0.28996 minus 0.05323 times the natural log of 45. If we want to know the actual number that go extinct, the number extinct would be equal to that proportion times the number of species, which is 70 at-risk species on that island.

So let's get our calculator out to figure that out. The proportion we would expect to go extinct in the 45 square kilometer island, based on our linear regression, comes out to approximately 0.087, or about nine percent. If we want to figure out the actual number we would expect to go extinct, we multiply that by the number of species on that island, which is 70.

We get approximately 6.11. So this is going to be approximately 6.11; we could say there would be approximately six extinct. This is all very approximate. Thus, we have approximately 64 saved.
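The calculator steps for the large preserve can be sketched in Python. This is a minimal sketch, not something from the video: the function and variable names are made up for illustration, and the coefficients are the ones quoted from the regression output.

```python
import math

# Coefficients quoted from the regression output in the transcript.
INTERCEPT = 0.28996
SLOPE = -0.05323  # change in proportion extinct per unit of ln(area)


def predicted_proportion_extinct(area_km2: float) -> float:
    """Predicted proportion extinct; the model uses ln(area), not area."""
    return INTERCEPT + SLOPE * math.log(area_km2)


# Scenario 1: one large preserve of 45 square kilometers with 70 species.
large_species = 70
large_proportion = predicted_proportion_extinct(45)
large_extinct = large_proportion * large_species
large_saved = large_species - large_extinct

print(f"proportion extinct: {large_proportion:.4f}")
print(f"expected extinct:   {large_extinct:.2f}")
print(f"expected saved:     {large_saved:.2f}")
```

Note that `math.log` is the natural logarithm, which is what the transformed horizontal axis calls for. Using `math.log10` here would give a different and incorrect prediction.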

Now let's think about the other scenario. Let's think about the scenario where we have five small nature preserves. It's going to be three square kilometers times five islands, and we're going to just do the same exercise.

Our proportion that goes extinct is going to be 0.28996, which is just the y-intercept for our regression line, minus 0.05323. There's a negative sign there because we have a negative slope, and this is not just times the area; it's times the natural log of the area.

The area here is three square kilometers. Our number extinct will be equal to the proportion from the line above times the total number of at-risk species across the five small nature preserves, each of which contains 16 at-risk species.

With 16 species on each of 5 islands, that's 5 times 16, which equals 80. Now, let's figure out what this is; get the calculator out again.

So this is going to be the proportion, approximately 0.23; it's a much higher proportion. Then we'll multiply that by our number of species to figure out how many species will go extinct.

We get approximately 18.52. Another way to think about it is, if we round, let's just say 19 extinct. If we have 19 extinct out of 80, how many are we going to save? We're going to have 61 saved.

Even if you said 18 and a half here and 61.5 here, on either measure the large 45 square kilometer preserve is better. You're going to have fewer species that go extinct and more that are saved.
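The small-preserve arithmetic can be written out step by step using the same regression coefficients. The symbol $\hat{p}_{\text{small}}$ is shorthand introduced here, and the intermediate values are rounded to five decimal places.

```latex
\begin{align*}
\hat{p}_{\text{small}} &= 0.28996 - 0.05323 \cdot \ln(3) \approx 0.28996 - 0.05848 = 0.23148 \\
\text{expected extinct} &= \hat{p}_{\text{small}} \times (5 \times 16) \approx 0.23148 \times 80 \approx 18.52 \\
\text{expected saved} &= 80 - 18.52 = 61.48
\end{align*}
```

The multiplication by $5 \times 16 = 80$ relies on the problem's statement that each small preserve's 16 species are unique to that preserve, so the counts add without overlap.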

So which choice would you recommend and why? I'd recommend the one large preserve, because based on this linear regression you would expect to save more species and you would expect fewer to go extinct.
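As a rough check on that recommendation, the two scenarios can be compared side by side. This is a sketch under stated assumptions: the helper `expected_extinct` and the variable names are hypothetical, and the same predicted proportion is applied to each of the five small preserves, whose species are taken to be unique as the problem states.

```python
import math

# Coefficients quoted from the regression output in the transcript.
INTERCEPT = 0.28996
SLOPE = -0.05323


def expected_extinct(area_km2, species_per_preserve, n_preserves=1):
    """Expected number of extinct species across identical preserves."""
    proportion = INTERCEPT + SLOPE * math.log(area_km2)
    return proportion * species_per_preserve * n_preserves


# One large preserve: 45 km^2, 70 at-risk species.
large_total = 70
large_extinct = expected_extinct(45, 70)
large_saved = large_total - large_extinct

# Five small preserves: 3 km^2 each, 16 unique at-risk species each.
small_total = 5 * 16
small_extinct = expected_extinct(3, 16, n_preserves=5)
small_saved = small_total - small_extinct

print(f"large: {large_extinct:.2f} extinct, {large_saved:.2f} saved")
print(f"small: {small_extinct:.2f} extinct, {small_saved:.2f} saved")

if large_saved > small_saved:
    recommendation = "one large preserve"
else:
    recommendation = "five small preserves"
print("recommend:", recommendation)
```

On both measures, fewer extinct and more saved, the large preserve comes out ahead, even though the small preserves start with 80 species in total rather than 70.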
