Introduction to Type I and Type II errors | AP Statistics | Khan Academy
What we're going to do in this video is talk about type 1 errors and type 2 errors, and this is in the context of significance testing.
So just as a little bit of review, in order to do a significance test, we first come up with a null and an alternative hypothesis about some population in question. These hypotheses make claims about a true parameter for this population. The null hypothesis tends to be what was always assumed, the status quo, while the alternative hypothesis suggests there's news here; there's something different going on.
To test it, and we're really testing the null hypothesis, we're going to decide whether to reject or fail to reject the null hypothesis. We take a sample from this population. Using that sample, we calculate a statistic that's trying to estimate the parameter in question. Then, using that statistic, we try to come up with the probability of getting a statistic at least as extreme as the one we just calculated, from a sample of that size, if we assume that our null hypothesis is true.
If this probability, which is often known as a p-value, is below some threshold that we set ahead of time, which is known as the significance level, then we reject the null hypothesis.
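To make the p-value idea concrete, here is a minimal sketch in Python. It assumes a two-sided one-sample z-test with a known population standard deviation, which is just one of many possible tests; the function name `z_test_p_value` and all the numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(sample_mean, null_mean, sigma, n):
    """Two-sided p-value for a one-sample z-test (sigma assumed known)."""
    # How many standard errors the sample mean sits from the null value.
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # Probability, assuming the null hypothesis is true, of a statistic
    # at least this far from the null value in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical numbers: the null says the mean is 50; our sample of 100 has mean 52.
p = z_test_p_value(sample_mean=52.0, null_mean=50.0, sigma=10.0, n=100)
print(round(p, 4))  # 0.0455
```

Here the sample mean is two standard errors above the null value, so the p-value comes out a little under five percent.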
Let me write this down. So this right over here is our p-value. This should all be review; we introduced it in other videos. If our p-value is less than our significance level (alpha), then we reject our null hypothesis. If our p-value is greater than or equal to our significance level, then we fail to reject our null hypothesis.
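The decision rule just described can be sketched in a few lines of Python. The function name `decide` and the default alpha of 0.05 are illustrative choices, not anything fixed by the method.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the decision rule: reject only when p-value < alpha."""
    if p_value < alpha:
        return "reject H0"
    # p-value greater than or equal to alpha: we do not "accept" H0,
    # we only fail to reject it.
    return "fail to reject H0"


print(decide(0.03))  # reject H0
print(decide(0.05))  # fail to reject H0
print(decide(0.20))  # fail to reject H0
```

Note the boundary case: a p-value exactly equal to alpha falls on the "fail to reject" side.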
When we reject our null hypothesis, some people will say that might suggest the alternative hypothesis. The reason this makes sense is that if the probability of getting this statistic from a sample of a certain size, assuming that the null hypothesis is true, is reasonably low, say below a threshold of five percent, then hey, maybe it's reasonable to reject it.
But we might be wrong in either of these scenarios, and that's where these errors come into play. Let's make a grid to make this clear. So there's the reality; let me put reality up here.
So the reality is there are two possible scenarios. One is that the null hypothesis is true, and the other is that the null hypothesis is false. Then, based on our significance test, there are two things that we might do: we might reject the null hypothesis, or we might fail to reject the null hypothesis.
Let’s put a little grid here to think about the different combinations, the different scenarios here. In a scenario where the null hypothesis is true, but we rejected it, that feels like an error. We shouldn't reject something that is true, and that indeed is a type one error.
A type one error is rejecting the null hypothesis even though it is true. You can even figure out the probability of making a type one error when the null hypothesis is true: that's going to be your significance level. Because if your null hypothesis is true, and let's say your significance level is five percent, then five percent of the time you're going to get a statistic that makes you reject the null hypothesis anyway.
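One way to check this claim is with a quick simulation. The sketch below assumes a world where the null hypothesis really is true (a normal population with mean 0 and known standard deviation 1), runs a two-sided z-test many times, and counts how often we reject. The function name, sample size, trial count, and seed are all arbitrary illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_one_error_rate(alpha=0.05, trials=4000, n=30, seed=0):
    """Fraction of tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)
    null_mean, sigma = 0.0, 1.0  # assumed population; H0 holds by construction
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(null_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - null_mean) / (sigma / sqrt(n))
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        if p_value < alpha:
            rejections += 1  # H0 is true but we rejected: a type one error
    return rejections / trials


rate = type_one_error_rate()
print(rate)  # should land close to alpha = 0.05
```

Every rejection counted here is a type one error, because the data were generated with the null hypothesis true, so the long-run rejection rate should sit near the significance level.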
So one way to think about your significance level is as the probability of a type one error, given that the null hypothesis is true. Now if your null hypothesis is true and you fail to reject it, well, that's good. We could write this as a correct conclusion; the right thing happened this time.
Now if your null hypothesis is false and you reject it, that's also good. That is the correct conclusion. But if your null hypothesis is false and you fail to reject it, well then, that is a type two error.
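The four cells of the grid can be written as a small lookup. This is only a sketch; the function name `classify_outcome` is made up for illustration, and in practice we never get to observe whether the null hypothesis is really true.

```python
def classify_outcome(null_is_true: bool, rejected_null: bool) -> str:
    """Name the cell of the reality-versus-decision grid."""
    if null_is_true and rejected_null:
        return "Type I error"  # rejected a true null hypothesis
    if not null_is_true and not rejected_null:
        return "Type II error"  # failed to reject a false null hypothesis
    return "correct conclusion"


# Print all four combinations of reality and decision.
for null_is_true in (True, False):
    for rejected_null in (True, False):
        reality = "H0 true" if null_is_true else "H0 false"
        decision = "reject" if rejected_null else "fail to reject"
        print(f"{reality:8} | {decision:14} | {classify_outcome(null_is_true, rejected_null)}")
```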
Now with this context, in the next few videos, we will actually do some examples where we try to identify whether an error is occurring and whether that error is a type one or a type two.