Introduction to experiment design | Study design | AP Statistics | Khan Academy


·Nov 11, 2024

So let's say that I am a drug company and I've come up with a medicine that I think will help folks with diabetes. In particular, I think it will help reduce their hemoglobin A1c levels. For those of you who aren't familiar with what hemoglobin A1c is, I encourage you to watch the video we have on that on Khan Academy.

But the general idea is, if you have high average blood sugar over roughly a three-month period, you're going to have a high hemoglobin A1c level. If you have low average blood sugar over roughly a three-month period, you're going to have a lower hemoglobin A1c.

So, if taking the pill seems to lower folks' A1C levels more than is likely to happen due to chance or due to other variables, then that means that your new pill might be effective at controlling folks' diabetes. In this situation, when we're constructing an experiment to test this, we would say that whether or not you are taking the pill is the explanatory variable.

The thing that it is affecting, the thing that you're hoping has some response, is what we call the response variable. In this case, you want the A1C levels to be your indicator of whether the pill is helping control blood sugar, so A1C is the response variable.

So how are we actually going to conduct this experiment? Let's say that we have been given a group of 100 folks who need to control their diabetes. We say, "All right, well, let's take half of this group and put them into, I guess you could say, a treatment group, and put the other half into a control group, and see if the treatment group, the one that actually gets my pill, improves their A1C levels in a way that seems like it would not be just random chance."

So let's do that. We're going to have a control group, and we're going to have a treatment group. You might say, "Okay, we'll just give these folks in the treatment group the pill, and then we won't give the pill that I created to the control group." But that might introduce a psychological aspect: maybe the benefit of the pill is just people feeling, "Hey, I'm taking something that'll control my diabetes."

Maybe that psychologically affects their blood sugar in some way, and this is actually possible. Maybe it makes them act healthier in certain ways; maybe that makes them act unhealthier in certain ways because they think, "Oh, I have a pill to control my diabetes, my blood sugar; I can go eat more sweets now and it'll control it."

To avoid that, so that the very fact that someone says, "Hey, I think I'm taking a medicine," does not cause a bias in behavior, what we want to do is give both groups a pill, and we want to do it in a way that neither group knows which pill they're getting. So what we would do here is give the control group a placebo, and the treatment group would actually get the medicine.

But those pills should look the same, and people should not know which group they are in. When we do that, we call it a blind experiment. Now, you might have heard about double-blind experiments. Well, that would be the case where not only do people not know which group they're in, but even their physician or the person who's administering the experiment doesn't know which one they're giving. They don't know if they're giving the placebo or the actual medicine to the group.

So let's say we want to do that. We could do a double-blind experiment, so even the person giving the pill doesn't know which pill they're giving. You might say, "Well, why is that important?" Well, if the physician knows, it might affect their behavior. The person administering or interfacing with the patient might give a tell somehow; they might not put as much emphasis on the importance of taking the pill if it’s a placebo.

They might, by accident, give away some type of information. To avoid that type of thing happening, you could do a double-blind experiment. Some people even talk about a triple-blind experiment, where even the people analyzing the data don't know which group was the control group and which group was the treatment group, and once again that's another way to avoid bias.

So now we've figured out that we have a control group and a treatment group, and we're using A1C as our response variable. We would want to measure folks' hemoglobin A1c levels before they get either the placebo or the medicine, and then, maybe after three months, we would measure their A1C again.
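As a rough sketch of that before-and-after measurement, here is how the change in A1C could be computed for each person and averaged per group. The readings and the helper name `a1c_changes` are made up purely for illustration; they are not data from any real trial.

```python
from statistics import mean

def a1c_changes(before, after):
    """Return each participant's change in A1C (after minus before)."""
    return [post - pre for pre, post in zip(before, after)]

# Hypothetical A1C readings (percent) for three participants per group.
control_before = [8.0, 7.5, 9.0]
control_after = [8.0, 7.4, 8.9]
treatment_before = [8.2, 7.8, 9.1]
treatment_after = [7.4, 7.1, 8.0]

control_change = a1c_changes(control_before, control_after)
treatment_change = a1c_changes(treatment_before, treatment_after)

# A negative change means A1C went down over the three months.
print("mean change, control:  ", round(mean(control_change), 2))
print("mean change, treatment:", round(mean(treatment_change), 2))
```

Comparing the change rather than the final level accounts for people starting out at different A1C levels.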

But the next question is: how do you divvy these hundred people up into these two groups? You might say, "Well, I would want to do it randomly." And you would be right, because if you didn't do it randomly—if you put all the men here and all the women here—that might introduce bias. Sex might explain it, or the behavior of men versus women might explain the differences or the non-differences you see in A1C levels.

If one group ends up with a lot of people of one age, or one part of the country, or one type of dietary habit, you don't want that. So in order to avoid an imbalance in some of those lurking variables, you would want to assign people to the groups randomly. We've done multiple videos already on ways to randomly sample, and the same idea applies here: you randomly pick who goes into each group.

A very simple way of doing that: you could give everyone here a number from one to one hundred and use a random number generator to pick fifty of those numbers. Those fifty people go into one group, say the control group, and everyone else gets put in the other group.
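A minimal sketch of that simple random assignment might look like the following. The function name `assign_groups` and the seed are assumptions for illustration; the seed is only there so the split can be reproduced.

```python
import random

def assign_groups(participant_ids, seed=None):
    """Randomly pick half of the participants for control; the rest get treatment."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    # Draw half of the IDs without replacement for the control group.
    control = set(rng.sample(ids, len(ids) // 2))
    # Everyone who was not drawn goes into the treatment group.
    treatment = [p for p in ids if p not in control]
    return sorted(control), treatment

# Number the 100 participants from 1 to 100.
participants = list(range(1, 101))
control, treatment = assign_groups(participants, seed=42)

print(len(control), "in control,", len(treatment), "in treatment")
```

Every participant has the same chance of landing in either group, so no trait of the person can influence which pill they get.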

Now, just by chance, a simple random assignment might put disproportionately more men in one group and more women in the other. To avoid that, you could do something similar to the stratified sampling that we've talked about in other videos: you could do what's called a block design for your random assignment, where you first split everyone into men and women.

It might be fifty-fifty, or it might just happen that you've got, say, sixty women and forty men. What you do here is you say, "Okay, let's randomly take thirty of these women and put them in the control group and put the other thirty women in the treatment group, and let's randomly put twenty of the men in the control group and the other twenty men in the treatment group."
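Here is one way the sixty-women, forty-men example could be sketched in code. The function name `block_assign` and the participant labels are hypothetical; the point is only that the random split happens separately inside each block.

```python
import random

def block_assign(blocks, seed=None):
    """Split each block in half at random, so both groups get equal shares of every block."""
    rng = random.Random(seed)
    control, treatment = [], []
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        control.extend(shuffled[:half])    # first half of the shuffled block
        treatment.extend(shuffled[half:])  # remaining half of the block
    return control, treatment

# Hypothetical participant labels: 60 women (W1..W60) and 40 men (M1..M40).
blocks = {
    "women": [f"W{i}" for i in range(1, 61)],
    "men": [f"M{i}" for i in range(1, 41)],
}
control, treatment = block_assign(blocks, seed=7)

women_in_control = sum(1 for p in control if p.startswith("W"))
men_in_control = sum(1 for p in control if p.startswith("M"))
print("control:", women_in_control, "women and", men_in_control, "men")
```

Which thirty women and which twenty men end up in the control group is still random, but the counts are fixed by the design.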

That way, someone's sex is less likely to introduce bias into what actually happens here. So once again, doing this is called a block design, and it's really the random-assignment version of stratified sampling. There might be other lurking variables that you want to make sure don't end up unevenly split just by chance, and so there are other ways of randomly assigning that you might want to use.

Now once you do this, you look at the change in A1C. If you see that there's no difference in the change in A1C levels between these two groups, you'd say, "Hey, there's a good probability that my pill does nothing." And once again, it's all about probabilities; there's some chance that you were just unlucky, though it might be a very small chance.

That's why you want to do this with a good number of people. As we advance our understanding of statistics, we will better understand at what threshold levels we think the probability is high or low enough for us to really feel good about our findings.

But let's say that you do see an improvement. You need to think about whether that improvement could have happened due to random chance, or whether it is very unlikely that it happened due purely to random chance. If it was very unlikely that it happened due purely to random chance, then you would feel pretty good, and other people, when you publish the results, would feel pretty good about your medicine.
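One way to get a feel for the "could this be random chance?" question is a permutation test: shuffle the group labels many times and see how often a random relabeling looks at least as good for the pill as what was observed. This is a sketch of one possible approach, not necessarily the method a real trial would use, and the A1C changes below are invented for illustration.

```python
import random
from statistics import mean

def permutation_p_value(treatment, control, n_shuffles=5000, seed=0):
    """Estimate how often random relabeling gives a difference in mean change
    at least as favorable to the treatment as the observed one (one-sided)."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_treat = len(treatment)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels were handed out at random
        diff = mean(pooled[:n_treat]) - mean(pooled[n_treat:])
        if diff <= observed:  # more negative means a bigger A1C drop
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_shuffles

# Hypothetical changes in A1C (after minus before), eight people per group.
treatment_change = [-1.2, -0.8, -1.0, -0.5, -1.4, -0.9, -0.7, -1.1]
control_change = [-0.2, 0.1, -0.4, 0.0, -0.3, 0.2, -0.1, -0.5]

observed, p_value = permutation_p_value(treatment_change, control_change)
print("observed difference in mean change:", round(observed, 2))
print("share of shuffles at least as extreme:", p_value)
```

A small share means a difference this large rarely shows up when the labels are random, which is what "very unlikely due purely to random chance" is getting at.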

Now even then, you know, science is not done. No one will say that they're 100% sure that your medicine is good; there still might have been some lurking variables that we did not properly adjust for. Even with this block design, we might have randomly gotten disproportionately more older people in one of the groups, or more people from one part of the country in one group than the other.

So there are always things to think about. The most important thing to keep in mind is that, even if you did this as well as you could, random chance might still have given you a false positive, where you got good results even though the difference was just random, or a false negative, where you got bad results even though the medicine actually works.

A very important idea in experiments, and this is true in science in general, is that the experiment should be documented well so that it can be replicated. It's not just about the results; it's about your experiment design. It should be an experiment that other people could and should replicate, hopefully getting consistent results, to reinforce the idea that your results are actually true and not just random or due to some bad administration of the experiment.
