Generalizability of survey results example | AP Statistics | Khan Academy
Niketi took a random sample of 10 countries to study fertility rate and life expectancy. She noticed a strong negative linear relationship between those variables in the sample data. Here is computer output from a least squares regression analysis for using fertility rate to predict life expectancy. Use this model to predict the life expectancy of a country whose fertility rate is two babies per woman, and you can round your answer to the nearest whole number of years. So pause this video and see if you can do it. You might need to use a calculator.
All right, now let's do this together. So, in general, this computer output is actually giving us a lot of data. More than we need, actually, to do this prediction. But it's giving us the data we need to know the equation for a regression line. So, the general form of a regression line, a linear regression line, would be our estimate.
And that little hat means we're estimating our y value. So ŷ would be equal to our y-intercept, a, plus our slope, b, times our x value. Now, in this situation, we're using fertility to predict life expectancy. Let me circle all of life expectancy. So, the thing that we're trying to predict, that is y, life expectancy, and fertility is the thing that we're using to predict that. So, that is going to be our x right over there.
Now, what are a and b? Well, our computer output gives us that it's these numbers right over here. Our constant coefficient right over here, 89.70, this is a, and our slope, b, is going to be negative 5.97. You could view it as the coefficient on fertility. Remember, this right over here is fertility. You could even rewrite this as our estimated life expectancy.
Estimated life expectancy, and I could put a little hat on it to show this is estimated life expectancy, is going to be equal to 89.70 minus 5.97 times fertility rate. I'll just call it "fert." right over there. Notice this is the coefficient on fertility, and then this is the constant coefficient. We could view it that way right over there.
Now we can use this to estimate the life expectancy of a country whose fertility rate is two babies per woman. For fertility, you just put a two here and then you get your estimated life expectancy. So, what's that going to be? We can get out a calculator.
So we can say 5.97 times 2 is equal to 11.94. And then we want to subtract that from 89.70, so put a negative there and add it to 89.70, which is equal to 77.76, and we want to round to the nearest whole number of years. So, that's approximately 78 years, and we're done.
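The calculator step can also be checked with a few lines of Python. This is only a sketch: the two coefficients are the ones read from the regression output, while the names `INTERCEPT`, `SLOPE`, and `predict_life_expectancy` are made up here for illustration.

```python
# Coefficients read from the regression output in the video.
INTERCEPT = 89.70  # constant coefficient (a)
SLOPE = -5.97      # coefficient on fertility (b)


def predict_life_expectancy(fertility_rate: float) -> float:
    """Return predicted life expectancy in years, using y-hat = a + b * x."""
    return INTERCEPT + SLOPE * fertility_rate


prediction = predict_life_expectancy(2)
print(f"{prediction:.2f}")  # 77.76
print(round(prediction))    # 78
```

Plugging in a fertility rate of 2 gives 89.70 minus 11.94, the same 77.76 that rounds to 78 years.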
Just to be clear, what happened here is that Niketi did a regression. On the x-axis is fertility; on the y-axis is life expectancy, let's call it L.E. That's our y-axis. She took ten data points: one, two, three, four, five, six, seven, eight, nine, ten. Then she tried to fit a regression line. She saw a negative linear relationship, and then we're using this regression line to estimate, "Hey, if fertility is, let's say, this is two right over here, what is the estimated life expectancy?" And we just saw that that would be roughly 78 years.
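To make the "fit a line, then read off a prediction" idea concrete, here is a rough sketch of a least squares fit in plain Python. The ten fertility values below are invented for illustration and are not the sample from the video; the life expectancy values are placed exactly on the line 89.70 - 5.97x, so the fit simply recovers those coefficients. Real sample data would scatter around the line. The helper name `fit_line` is also hypothetical.

```python
def fit_line(xs, ys):
    """Least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical fertility rates for ten countries (NOT the video's data).
fertility = [1.2, 1.5, 1.8, 2.0, 2.4, 3.0, 3.6, 4.2, 5.0, 6.1]
# Illustrative life expectancies placed exactly on the line from the video.
life_expectancy = [89.70 - 5.97 * x for x in fertility]

a, b = fit_line(fertility, life_expectancy)
print(round(a, 2), round(b, 2))  # fitted intercept and slope
print(round(a + b * 2))          # predicted life expectancy at fertility = 2
```

The negative slope is what the strong negative linear relationship looks like in the fitted equation: each additional baby per woman lowers the predicted life expectancy by about 5.97 years.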