
Red Button: You Live, Blue Button: Everyone Might Live


20m read · Dec 2, 2024

Hello, good morning! It's been a while since I made a video about green beard altruism. Let's not bury the lede, but, uh, it's going to take a while to get there.

There's a puzzle that's been going around social media for a while and recently boiled up again. The puzzle goes like this: you are presented with two buttons, button one and button two. If more than 50% of the players choose the first button, everybody lives; nobody dies. We just go on with our lives. If fewer than 50% of players choose the first button, though, everyone who chose button one will die. So, pretty grim.
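For concreteness, here's a minimal sketch of the rules as code. The function name is mine, and I'm assuming an exact 50/50 split counts as enough for everyone to live, since the statement above only covers "more than" and "fewer than" 50%.

```python
# A minimal sketch of the puzzle's rules as stated above. Assumption: an
# exact 50/50 split counts as "everybody lives"; the puzzle only specifies
# "more than 50%" and "fewer than 50%".

def outcome(n_players: int, n_blue: int) -> int:
    """Return the number of deaths, given how many players chose button one (blue)."""
    if n_blue / n_players >= 0.5:
        return 0       # blue reached the threshold: everybody lives
    return n_blue      # blue fell short: every blue-presser dies

print(outcome(100, 55))  # 0  -- majority blue, nobody dies
print(outcome(100, 45))  # 45 -- all 45 blue-pressers die
print(outcome(100, 0))   # 0  -- nobody pressed blue, so nobody can die
```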

In this first, group-focused framing, I would say the immediate thought I have is that it seems pretty important that we choose button one so that nobody dies. Because if less than half of us choose button one, a terrible, terrible, terrible thing will happen. However, you can alternatively, and equivalently, present this puzzle in a way that is more about yourself and less about the group. Because this first framing is a group-interested explanation; it's talking about what happens to the players in the game.

Or you can also explain the puzzle in a self-interested way. The self-interested way to explain this puzzle is: if you press button two, you will always live. If greater than 50% of players choose button two, the ones who chose button one will all die. If you frame the puzzle in the self-interested way, like this—where your frame of reference is about your decision and what happens to you—it sounds kind of ludicrous not to press button two. If you press button two, nothing bad can happen to you, in quotation marks, as we'll get to.

One of the reasons that this thought experiment, if you can call it that, goes viral on social media a decent amount is that it gets framed in terms of political ideology. So it's often presented with a blue and a red button, where the blue button is the one that can keep everyone alive, and the red button is the one where if you press it, you can't personally die. Then you get people making analyses of the buttons that line up with political ideology. People might say that pressing the blue button is basically like a suicide cult, and it's not their fault if the people who press the blue button all die. They can just press the red button because that's the logical thing to do.

Other people might say the red button is selfish, and if the majority of us pick the blue button, we can just ensure that nobody dies. So, the next part of this—I think, my personal belief—and this is a war that I have been fighting forever and will fight forever: I think there are people who just took a week of game theory and now think that they know what the correct actions to take in games are, and have not continued with any sort of coherent or self-critical analysis in any way. They just studied enough game theory to be able to use words to make it seem like their personal self-interest was justifiable.

And now they continue to use game theory, in quotation marks, to tell other people that they are right. So, the game theory argument, in full quotation marks: game theory often assumes that we are perfectly rational and self-interested actors. As such, people who do not understand game theory very well might claim that it is correct to pick button two, the red button, the one that keeps you alive but may kill up to half of the population, as it results in 100% of the players who pick it living.

They think that "correct" has some sort of abstract meaning here because of the standard assumptions in game theory. However, assuming that players are perfectly rational and self-interested is not necessary in game theory. In fact, making that assumption makes game theory inapplicable to the real world, unless you're in some segment of the real world where actors really are perfectly rational and self-interested. But usually, we understand, in analyzing humans, that humans are not perfectly rational and self-interested. Game theory as a toolkit allows you to also say things about the world with other assumptions, like, for example, that humans are humans, or something like that.

So when you're talking about human behavior and trying to make game theoretical analysis of it, you should match your assumptions that you're making with the population that you're studying. You have to assume that the people you're studying are going to act like humans. You can't assume that they act like perfectly rational and self-interested actors because humans don't.

Okay, so I want to kind of walk through this from a more curious game theoretical perspective. This is only like an hour of work, so it's not meant to be exhaustive and it won't be perfect. But I was just kind of thinking: how would I try to model this situation? What sorts of useful assumptions can I make about the behavior of the population that could lead to some insights? I came up with some different player designators.

So rather than assuming that people are self-interested and rational and that you yourself are self-interested and rational, I thought it would be interesting to think about some things that the players of this game might be, as humans, that were relevant to the decision that they made in the game. I think it's really obvious if you look at conversation about this game online that the players are different.

I had somewhat of a crisis of confidence recently, seeing the result of a poll on this game where about 45% of people chose blue and 55% chose red. Then there were a bunch of people in the comments talking about how they had successfully chosen red and how the people who had chosen blue were idiots. When what has actually happened is that you've just observed humans completely failing a collective action problem, with the result that almost half of humanity has instantly died, which is really, really, really bad.

So, I mean, obviously it's a theoretical thing; it's not actually people making actual decisions, but it says something about the choices those people might make in real-world scenarios where they were tasked with making an actual choice. It feels like a good time to talk through this kind of stuff and talk about why some people are choosing blue, maybe so that we can get to a point where nobody would be dying in a game like that.

One player designator, like a binary thing that I thought would be useful, would be to designate a player as having self-interest, where a player considers winning to be achieving the best outcome for themselves, versus having group interest, where a player considers winning to be achieving the best outcome for the group. And that's present in the original framing of the puzzle, right?

Some of the reasons that your intuition changes depend on how you read the puzzle—I mean, if it does; I'm sure not everybody's will. The first framing presents the situation from the perspective of the group and so kind of primes you to think in terms of group interest, whereas the second framing presents the puzzle from the perspective of the individual and so kind of primes you to think about the puzzle from the perspective of an individual's interest.

So, yeah, while it is a common assumption of game theory that the players are self-interested, humans are not entirely self-interested. I mean, to some extent, yes, but humans who live in societies with other humans also care about people other than themselves. Unless they have some sort of psychopathy or something; I don't know what the mental designation would be. But generally speaking, humans are going to have attachments to other human beings and things other than human beings, which they will care about and be willing to give up parts of their own self-interest to aid.

So that's one kind of player distinction. We can think about how a self-interested player might play the game—maybe they pick red. How does a group-interested player play the game—maybe they pick blue? This is not about political alignment, by the way. Calling them the blue and red buttons is often done online, and so I'm going to do it a bit here, but I'm not trying to say anything about political parties.

Another couple of player designators: we have a disaster-sensitive designator. A disaster-sensitive player thinks, for example, that 30% of all humans dying would be bad for them personally. So a self-interested person who is disaster-sensitive might think, "I need to work out how to not have 30% of humanity die right now," because, whether they stay alive or not, they feel like 30% of humans dying would be bad for them.

You know, a lot of those humans have knowledge about certain things, or work in jobs vital to the safety of the world—they're doctors, they're maintaining nuclear power plants. I don't know, but you know, "it would be bad if lots and lots and lots and lots of humans suddenly died" is the thought here. Then you have disaster-ambivalent players: a disaster-ambivalent player thinks that it would be winning to live, regardless of whether or not a substantial percentage of the population had died, so they'll press the red button.

If they press the red button and everyone else does too, and nobody dies, they're like, "Yeah, what an awesome outcome." But if they press the red button and 45% of the human population dies—or, let's make it even worse, if they press the red button and, because they pressed the red button, there is one more red button pressed than blue button pressed, and almost half the human population dies—a disaster-ambivalent player is like, "Oh, whatever; that doesn't really matter. The important thing was that I lived," which is, you know, the idea of disaster ambivalence taken to the absolute greatest extreme in this puzzle.

Another one is high empathy versus low empathy. I feel like I see this a lot, and a lot of these designators are motivated by reading people's commentary on this puzzle, right? It's really surprising to me how many people think that it's possible for everyone to think the same thing about a puzzle like this, which just—nope.

A high empathy player might recognize that there are players who are differently motivated from themselves, who think about the puzzle in different ways from themselves. And so they'll understand that different people are going to choose the different options for different reasons. Whereas a low empathy player might not understand that there are players differently motivated from themselves, and they might think that it is reasonable for everyone to make the same decision for the same reason.

I think that introductory game theory courses kind of teach low empathy in service of making game theory appear important. Game theory courses present game theory as the right way to make decisions and then send people out into the world thinking, "I understand game theory, and so when I am presented with a decision, I will know the right way to solve it."

Their idea is that the way for everyone to solve it is for everyone to learn game theory and then for everyone to solve the decision using game theory, right? But game theory is an individual technology for decision-making, which does not have to be used for decision-making and which not everybody knows.

So, I think introductory game theory classes do just an incredibly, incredibly poor job of framing what game theory is, if people leave those classes with that idea about what game theory says about how you should make decisions and how other people can be expected to make decisions as well.

In terms of the original puzzle, though, a high empathy player, when thinking about pressing the red button, would not think it is possible for every single person to press the red button. Attached to them pressing the red button would be the realization that some people are going to press the blue button and that those people may die. Whereas a low empathy player might think everybody should press the red button; the only relevant decision-making technology for pressing buttons is game theory, and it says that a self-interested rational actor should press the red button, so I'm going to press the red button, and I can expect everybody else will think that they should too.

Yeah, and then another player designator I have here is high logic. A player with high logic can follow thoughts about motivations through from a starting premise to an ending strategy, versus low logic, where a player cannot follow thoughts about motivations through from a starting premise to an ending strategy. The point here is, if you're thinking, "Oh, everybody should pick red; everybody should think this thing, and then this thing, and then this thing, resulting in them all picking red, and none of us will die," that requires that everyone else be functioning at the logic level which allows them to make that sequence of choices; whereas, in reality, different people apply different amounts of logic to this situation.

For what it's worth, as I was just talking about with game theory, it's unclear that the amount of logic which results in pressing red because of self-interest is the right amount of logic to apply to this puzzle. I would not say that it was a high amount of logic to apply, either; it's more like a high amount of introductory game theory applied to the puzzle, but logic is a different thing from that.

So, given that these designators exist in the population—these are the designators that I'm going to analyze the population with for right now. Now, obviously, there are other ones you could talk about. Obviously, in real life, in different populations, different people will have— I mean, they will be entire human beings; they will not just be a concept of a human being.
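If you want to make this concrete, here's one rough way you could sketch these designators in code. To be clear, this is just a toy illustration: the choice rule is a made-up heuristic, and the only point is that a population with mixed designators produces a mixed vote.

```python
# A rough sketch of the designators as a player model. The choice rule is an
# illustrative heuristic, not a claim about how real people decide; it just
# shows that a mixed population never produces a unanimous vote.
import random
from dataclasses import dataclass

@dataclass
class Player:
    self_interested: bool     # self-interest vs. group interest
    disaster_sensitive: bool  # thinks mass death is bad for them personally
    high_empathy: bool        # models other players as differently motivated
    high_logic: bool          # can follow premises through to a strategy

    def choose(self) -> str:
        if not self.self_interested:
            return "blue"  # group-interested players coordinate on blue
        if self.disaster_sensitive or self.high_empathy:
            return "blue"  # self-interested, but foresees mass death under red
        return "red"

population = [Player(*(random.random() < 0.5 for _ in range(4)))
              for _ in range(10_000)]
votes = [p.choose() for p in population]
print(votes.count("red") / len(votes))  # ~0.125 with coin-flip traits; never 0 or 1
```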

But given these designators, possible solutions: if everyone presses the red button, there will be zero deaths and everyone will live. However, this is not a possible solution—so I buried the lede in this slide. It should really be filed under impossible solutions, given these designators, because there are people with high versus low empathy; there are people with high versus low logic; there are people with group interest versus self-interest.

It's just not feasible that everyone presses the red button here. You can't arrive at a solution where every single person does the same thing with any reasonable applicable strategy here, really.

So, this just can't happen. Its tier rating is invisible; it doesn't go anywhere on the tier list of possible solutions. Next, we have a sliding scale of solutions, where 50 to 99% of people press red. In populations which are self-interested and carry lots of low empathy and disaster-ambivalent players, this range may be seen as the expected or even the only achievable outcome.

So, it might be that like 95% of the population presses red, and it's just thought ludicrous that you could arrive at any solution much different from that. Like, the idea that everyone could select blue seems impossible.

While I am not in general trying to make commentary on political parties, I want to point out that this is essentially exactly how people in America think about third parties when thinking about voting in elections, so I don't know; just leave that there. However, players with any amount of group interest or disaster sensitivity will probably see, you know, 5% of all humans dying as a colossal catastrophe.

Whether it would be worth living in a world where 40% of people had just died is very much a question different people will have different answers to. Imagine if, like, your family was full of people who thought that they should press the blue button here, and you pressed the red button, and like literally all of your family members died. It would be pretty bad, I assume.

So not everyone's necessarily going to think that surviving in this world is even a win, let alone thinking that the outcome as a whole is a win. A suggestion I have here is that if your method of decision-making in a population regularly results in a quarter of it dying in situations where there's no reason for anyone to die, your method of decision-making is very poor.

And I'm just going to say it is very poor. It's not that it, like, may be very poor; this is a very, very, very, very poor outcome. And you can use different technologies to make decisions. If the technology you're choosing to use to make decisions results in this, it's a massive observable flaw in your decision-making technology.

Yeah, like, if you vote for red in the poll that says 55% red, 45% blue on social media, your thought should not be like, "Oh my God, those fools who voted blue," or whatever. Your thought should be, "Oh my God, like the decision-making process that got us here is so bad. How can we repair it?" Clearly, like, some form of logic or thought going into making these choices is horrendously flawed and needs to be improved.

There's another possible solution: anywhere from 0 to 50% red button presses—sorry, that's 0 to 50% red, not 0 to 50% blue—results in zero people dying. That is the good solution. Is it achievable for over half of the population to vote for the best solution? Again, I just said I wasn't trying to make parallels to politics, but that's—I sure just said those words.

I think it's not that hard; you can expect 50% or fewer people to vote red here. I just think, like, yes, you can achieve the situation where nobody dies in this puzzle—a puzzle where nobody needs to die, and all people have to do is press the button that results in nobody dying.

I just don't think that that's hard. So, best to worst: the best is 0 to 50% red (sorry, I did this twice; red, not blue), the worst is 50 to 99% red, and the impossible one is 100% red.
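If you want to see that tier list fall out of the rules mechanically, here's a quick sweep reusing the `outcome` sketch from earlier; the percentages are just illustrative sample points.

```python
# Sweeping the red fraction to reproduce the tier list, reusing the earlier
# outcome() sketch. Sample percentages are illustrative.
N = 100
for pct_red in (0, 25, 50, 55, 75, 99, 100):
    deaths = outcome(N, N - pct_red)  # everyone who isn't red is blue
    print(f"{pct_red:>3}% red -> {deaths} deaths")
# 0-50% red: zero deaths (blue holds the threshold).
# 51-99% red: every blue-presser dies.
# 100% red: zero deaths on paper, but unreachable with a mixed population.
```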

So, in observing this, though, there is still the problem of self-interest to approach, because I've just outlined everything from a group interest perspective, right? This is not self-interest. Self-interest is—maybe you think that self-interest means pressing the red button, so you can't die, and you stop your thought there.

You know, it's easy to keep going and be like, "But all of my family might die," and then think maybe that isn't self-interest anymore. But you know the Paradox of Selfishness is basically that pressing the red button cannot kill you, while pressing the blue button can.

So, you know, you might be thinking, "Don't press the blue button." Of course, it's possible that if you press the red button, you kill your doctor who's curing you of cancer or whatever, and that does kill you, but whatever. Given that this is an immediate choice faced by individual actors, how can we escape what appears to be a logical mandate to choose the red button?

Scare quotes around that; it's more like an intro game theory mandate to choose the red button, again. So I have some options here. Option one: you can have group interest. Just because game theory often assumes that people are self-interested doesn't mean that you have to be self-interested, or that people are self-interested. You can just have group interest.

You can just care about people other than yourself, and given the previously enumerated observations about possible results from this experiment, you can recognize that you are being called upon to do something dangerous to yourself but beneficial to the greater good and choose to do so.

We have a variety of social technologies that encourage people to do stuff like this, justify doing it, and reward people for doing it. This type of decision-making is encoded into both genders here in the United States: you have the idea of men sacrificing themselves for their families; you have the idea of motherhood. I think a lot of people who intuitively think you should press the red button in this scenario are men, but maybe a moment of thought about whether motherhood seems self-interested and safe could be beneficial here.

This idea that you can care about people out there is also coded in a variety of religious tenets. I really like Kant's categorical imperative. I don't like everything about Kant and I don't like everything about the categorical imperative either, but the categorical imperative is an ethical suggestion: the one deciding whether to take an action or not asks themselves—would the world be good if it was normalized for people to take this action in the situation that I'm in?

So when thinking about whether to litter or not in a park, you could think, "If it was normal for people to litter in the park, would the world be good or bad?" And, you know, while littering might save you 30 seconds, from a perspective of self-interest, that is good, and then you might further justify like, "Oh, my time is really important; I have to get back to work where I do an important job," etc., etc., etc.

You might go through all of these logical jumps to get to the conclusion that you should litter. The categorical imperative is just like: no. If everybody littered, the park would be covered in trash, so don't litter. The categorical imperative is a nice one for simplifying things that might at first seem pretty muddy into a very simple analysis, which can end up seeming compelling.
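As a toy illustration only—this is my framing of the decision rule, not anything of Kant's—you can think of the universalization test as scoring the world where everyone takes the action.

```python
# A toy sketch of the categorical imperative as a decision rule: take an
# action only if the world where everyone takes it still scores as good.
# The scoring functions are hypothetical stand-ins, not a serious ethics.

def permitted(world_if_universalized) -> bool:
    """Allow an action only if universalizing it yields a good world."""
    return world_if_universalized() > 0

def everyone_litters() -> int:
    return -10  # the park is buried in trash

def everyone_uses_bins() -> int:
    return +10  # the park stays clean and usable

print(permitted(everyone_litters))   # False -> don't litter
print(permitted(everyone_uses_bins)) # True  -> use the bin
```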

Obviously, though, for what it's worth, again, these are all decision-making, decision-motivating technologies. There's nothing abstractly right here. Rights and wrongs are not these abstract things which we, or I, can apply to the scenario on a PowerPoint presentation. It's just a matter of what your designators are as an individual, I guess. Maybe you think that 45% of the population who pressed the blue button dying is actually a good thing because you think they should die. Maybe that's what you think. I hope not, but maybe you do.

Yeah, so different people have different thoughts, right? Another option for motivating yourself to press the blue button could be disaster sensitivity. If we don't occasionally do things which are outside of our shallow self-interest, there are a variety of ways in which the social structures we inhabit may collapse catastrophically.

I have another categorical imperative kind of example here, where putting trash in appropriate bins instead of on the ground might seem paradoxical from a perspective of self-interest. But when framed as "do you want the things that we live with and enjoy to collapse?", you can see, like, oh, it actually does match my self-interest to do things which keep the things that I use intact.

Option three: you can have high empathy. You can think, when looking at the red button, "I may be condemning 50% of humanity, who think differently from me, to die," and you might think that that is a really bad thing to do. You might think about the suffering that you're causing, etc., etc., etc. It's a pretty easy way to do it.

Option four: you might have low logic. This is an interesting one, because I don't think that high logic leads to you pressing the red button; rather, I think a lot of people who corner themselves into thinking that they should press the red button try to apply logic and just fail. If you are not that great at logic and try to apply logic, that's where you can end up.

So like one thing you can do is just be like, "Well, I'm kind of bad at logic and am aware of that, and I shouldn't use that as a strategy for deciding which button to press," and that would allow you to have thoughts like, "I don't want to live in a world where my wife picks the blue button and I'm left to live without her," and then just pick the blue button because this is a scenario where one of the buttons results in everybody living and the other button results in you living.

And there are other things going on there, but you can approach the scenario with very limited logic and be like: "Oh, everyone living—that seems better than just me living." Anyway, you can also approach this problem with low logic and think that you should press the red button, so I don't think low logic is the best way out of self-interest here, but I think it's worth pointing out that one of the ways that people back themselves into self-interest is trying to apply logic and not getting all the way through the puzzle basically.

I guess, again, it's also possible that you think that almost half of humanity dying is fine and that introductory game theory is the best decision-making technology in the world. You know, good for you. I don't know; again, I can't really tell you that you're abstractly wrong. I just think that most logical analyses of what sort of premises could lead to thinking that don't make any sense.

You probably can't sensibly end up at the idea that almost half of humanity dying is good; it's just really hard for you to think that in any coherent, internally consistent way. And you don't need external good or bad to see that something that's internally inconsistent is fractured and doesn't make sense.

There's a relevant technology here which I think is really important, and I mentioned it at the start of the presentation: it is green beard altruism. This is one of the major vehicles with which humans are able to achieve zero deaths in situations like this. Obviously, this is like a toy game, but in general, humans are faced with many scenarios where they can act selfishly for their own personal gain, but if you do that, lots of people might die, or you can contribute to society and work together, and if you do that, everyone will be fine.

In general, successful human societies have worked out ways to get people to all work together so that everyone's fine. Yeah, green beard altruism is a very relevant technology for human societies that have been able to do this. The idea of green beard altruism is that there are people who are altruistic, which means that they care about others; they will help others, etc., etc., etc. And it's also possible to tell who they are; that's the green beard part.

So there's some sort of visible perceivable designator that shows that people do this thing. An example that was given to me when I was studying this stuff a while ago—I don't know if this is still true or not, but it was having stickers that showed that you supported public radio. People would display their stickers in different places, and if you didn't have one, people would be like, "Why aren't you supporting public radio?" I believe this was a thing in Japan when I was studying it. I don't know if people still do that, but yeah, that kind of idea.

If you're living in a society where there are lots of green beard altruists and you can tell that they exist, you don't have to worry much in a situation like this because you're not really concerned that fewer than half of the people are going to press the blue button. So when you're there and you're presented the red button and the blue button, you can just be like, "Oh, we'll just press the blue button; nobody will die."

Well, everybody else is going to do that too; maybe not everybody else, right? Maybe just like 75% of other people, but still, if you have a high enough density of altruists who you can see are altruists, you get to cooperate with other humans even in scenarios where you have low information. This is what motivates things like being able to trust strangers.

Like, this is a large and important concept. It also feeds back into self-interest because by making altruistic choices, like voting blue here, by pressing the blue button, you are going to in some way have this designator assigned to you which will let other people see that you're an altruist. Then other people are going to want to cooperate with you and help you in the future because altruists are generally going to be helping other altruists before they're helping people who aren't altruists.

Altruism isn't blind here. The idea of green beard altruism is that altruists are altruists toward other people with green beards first and foremost, and so there's a self-interest reason to do something that gets you a green beard, basically.
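To make that concrete, here's a minimal agent-based sketch under some stated assumptions: the "beard" is a perfectly visible tag, altruists help other tagged players first, and reciprocated help pays better than a loner's baseline. The payoff numbers are purely illustrative.

```python
# A minimal agent-based sketch of green beard altruism. Assumptions: the
# "beard" is a perfectly visible tag, altruists help other tagged players
# first, and reciprocated help pays better than a loner's baseline. Payoff
# numbers are illustrative, not from any source.
import random

def play_round(players: list[bool]) -> list[float]:
    """players[i] is True if player i wears a green beard (visible altruist)."""
    payoffs = [0.0] * len(players)
    for i, has_beard in enumerate(players):
        j = random.choice([k for k in range(len(players)) if k != i])
        if has_beard and players[j]:
            payoffs[i] += 3.0  # mutual aid between two visible altruists
        elif has_beard:
            payoffs[i] -= 1.0  # helped a non-altruist, nothing came back
        else:
            payoffs[i] += 1.0  # non-altruist baseline: no help given or received
    return payoffs

def avg(xs):
    return sum(xs) / len(xs)

# With a high enough density of visible altruists, wearing the beard pays.
pop = [random.random() < 0.7 for _ in range(1_000)]  # ~70% green beards
pay = play_round(pop)
print(avg([p for p, b in zip(pay, pop) if b]))      # ~1.8 for beards
print(avg([p for p, b in zip(pay, pop) if not b]))  # ~1.0 for everyone else
```

Under these numbers, the beard only beats the baseline once enough of the population visibly wears it, which is exactly the density point made above.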

Yeah, and concluding slide of the presentation: there's even green beard altruism in nature. So as a decision-making technology, this is something that we have found at a cellular level. So this is real in a way that is physically detectable. This is a powerful way to make decisions which has evolutionary advantages in the real world that we live in.

So, in general, when faced with problems like this button puzzle, if you don't want to overthink things, I think a behavior pattern which generalizes well to pretty much all problems like this one (and there are lots of problems that are similar enough that you could apply this to them) is to act for the common good.

Live your life in a way which makes it clear that that is what you do and cooperate specifically with other people who see you acting in that way, who you see acting in that way alongside you. So like clearly be altruistic and care about others and help other people who you see being altruistic and caring about others. If you do that, a puzzle like this one—puzzle in quotation marks—it’s just like not a puzzle; it's just like, "Oh, we just do this thing together; nobody dies."

You create a society which is resistant to things like this and just ends up at good outcomes. And yeah, that's my thing on that. Hope you're all doing well. Bye-bye.
