A.I. Policy and Public Perception - Miles Brundage and Tim Hwang
Alright guys, I think the most important and pressing question is, now that cryptocurrency gets all the attention and AI is no longer the hottest thing in technology, how are you dealing with it?
Yeah, Ben Hamner of Kaggle had a good line on this. He said something like, "The great thing about cryptocurrency is people no longer ask me about whether there's an AI bubble." And yeah, it's hard to compete with the crypto bubble or phenomenon, whatever you want to call it. It's actually a good development, right? I mean, the history of AI is like all of these winters, and having another hype cycle to kind of balance it out might actually be a good thing.
Yeah, absolutely. Let's talk about your paper to start it off, Miles. So yeah, what is it called, and where do you...
Yeah, it's called "The Malicious Use of Artificial Intelligence," and there's a subtitle: "Forecasting, Prevention, and Mitigation." It's attempting to be the most comprehensive analysis to date of the various ways in which AI could be deliberately misused. So not just things like bias and lack of fairness in an algorithm that are not necessarily intentional, but deliberately using it for things like fake news generation and, you know, combining AI with drones to carry out terrorist attacks or offensive cybersecurity applications.
And, you know, the essential argument that we make is that that needs to be taken seriously: the fact that AI is a dual-use or even omni-use technology and that, similar to other fields like biotechnology and computer security, we need to think about whether there are norms that account for that. So things like responsible disclosure when you find out about a new vulnerability are pervasive in the computer security community, but haven't yet been seriously discussed for things like adversarial examples, where you might want to say, "Hey, there's this new misuse opportunity or way in which you could fool this commercial system that is currently, you know, running driverless cars or whatever." And so there should be some more discussion about those sorts of issues.
Mmm, okay. And so, Tim, have you been focusing on any of this stuff while you've been here at Oxford, or is your work totally unrelated?
It's somewhat related, actually. I mean, I would say that I've mostly been focusing on what you might think of as a subset of the problems that Miles is working on. He's sort of saying, "Look, AI isn't gonna be inherently used for good, and in fact, there's lots of intentional ways to use it for bad," right? One thing I've been thinking about is the sort of interface between these techniques and the problems with disinformation and, like, whether or not you think these techniques will be used to make, you know, ever more believable fakes in the future and what that does to the media ecosystem. So I would say it's like a very particular kind of bad actor use that Miles is talking about.
Mmm-hm. And so when you're doing this research for both of these topics, are you digging into actual code? Like, how are you spotting this in the wild?
Yeah, so, I mean, my methodology is really kind of focused on looking at what is the research that's coming out right now and trying to extrapolate what the uses might be, right? Because I think one of the really interesting things we're seeing in the AI space is that it is becoming more available for people to do, right? Like you've got these cloud services. You know, we've got the tools that are widely available now. And so I think what's really missing is the ability to kind of figure out how you do it, right? Like what is the methodology that you use? And the question is, can you see papers that are coming out and saying, "Hey, we could actually use it for this somewhat disturbing purpose," and then kind of extrapolate from there to say, "Okay, well, what would it mean for it to get used more widely?"
Mmm-hmm, yeah. Yeah. So like reading papers, seeing what the hot areas are and, you know, cases in which some sort of potentially negative or positive application is, you know, on the cusp of getting just efficient enough to be used by a wide array of people or, you know, the hyperparameter optimization problem is close to being solved or whatever sort of trend that you might see might be a sign that certain technologies are going to be more widely usable, not just by experts, but potentially in, you know, a huge range of applications.
For the purpose of this report that I recently wrote, you know, we got a ton of people together, including Tim, at a workshop, and we talked about technical trends, and you had people in cybersecurity and AI and other areas sort of compare best guesses of what's possible and then prioritize what the risks are and what to do about them. So I think pulling together different disciplines is often a good way to think about what's possible. And then one other thing that I'll point out is that you don't necessarily have to even look into the technical literature to find discussion of these sorts of misuse applications today because it's a hot topic already. So things like deepfakes for, you know, face swapping and pornography are a huge media issue right now, and that actually happened while we were writing this report. We added something about it later because we had characterized the general issue of fake videos and misinformation, and AI as making that more scalable because, you know, it potentially requires less expertise. And while we were writing that, this deepfake thing happened, and it's democratizing, in some sense, the ability to create fake videos.
So, yeah, it's quite a live issue, right? I think there's a really interesting question here, particularly when you think about like prediction about like there's the realm of what can be done and then trying to understand like what's likely to actually happen.
Mmm-hmm. This seems to be the really challenging thing because there's like lots of terrible uses for almost every technology, right? But we see certain uses more prominently than others, right? And I think that's actually where the rub on this sort of stuff is, and it's actually part of this prediction problem.
Yeah, yeah, so that's why you kind of have to, first of all, have some humility about what you can predict. If it's a fully general-purpose or a fairly general-purpose technology that can be steered in a bunch of different directions or applied to a bunch of different data sets, then you should expect that, if it's super widely available, a bunch of people are gonna find new uses for it. So I think that's a reason to look upstream at the papers and see where the technical trends are, because then you can say, "Well, you know, maybe this is not yet ready for primetime for any application," or, "This is starting to be fairly general purpose."
Yeah, I mean a good question for you, Miles, is whether or not you think that we'll see the virtual uses be the ones that happen first versus the physical ones, right? So some people have said, "Okay, well, you could use AI to really make, you know, hacking much easier," right? We might be able to use it to create these fakes, right, which we're already... but I'm wondering if those threats kind of evolve in a way that's different or maybe even earlier than, you know, the threats people talk about like, "Oh, what happens if someone builds a drone that goes out and has algorithms to go hurt people?"
Yeah, it's hard to say. I mean, I think one heuristic that I've used is that, you know, stuff in the physical world is often harder. It's both more expensive and less scalable. You have to buy actual robots, and then there are often hardware issues that you run into. And the general problem of perception is much harder in the real world than in, you know, static datasets. But, you know, we're seeing progress. Just a few days ago, there were a bunch of cool videos from Skydio of their autonomous drone for tracking people doing sports and flying around, and it seems to be pretty good at navigating in forests and things like that. So, you know, maybe technologies like that are a sign that there'll be many more uses, both positive and negative, in the real world. But in terms of near-term impact, I think those sorts of things that have those autonomous features still aren't super easy for end users to use outside of a particular domain.
So I'm not sure that anyone could just easily, you know, repurpose it to track a particular person or whatever. I think it's sort of for that domain application, and I don't know how expensive it is, but yeah, probably more expensive than like a twenty-dollar drone.
Right. I think about what is the first harm that comes out of the gate in a big way, because I've debated this often. Okay, so say there's a horrible self-driving car incident that occurs, right? Maybe that turns society off in general to the whole technology, and there's a big categorical outlawing of it. And I'm like, okay, that's kind of not so good, right? But at the same time, I'm kind of like, okay, well, what if hacking becomes a lot more prominent in a way that's powered by machine learning? We know that the response to huge data disclosures or huge data compromises is actually quite a limited public response, right? And that seems not so good either. Basically, people either overestimate the risk or underestimate the risk depending on what happens first.
Yeah, yeah, people are starting to get kind of desensitized to the, you know, these mega disclosures. And so maybe they won't even care if there's some adaptive malware thing that, you know, that we might be like, "Whoa, that's kind of scary," but it could be that, you know, something truly catastrophic could happen if you sort of combine the scalability of AI and digital technology in general with like the adaptability of human intelligence for like finding vulnerabilities. If you put those together, you might have like a really bad, you know, cyber incident that will actually like make people be like, "Whoa, this AI thing."
Yeah, so that's something that worries me a lot, but it's sort of like a moving goalpost on the positive and negative side, right? Like so, you know, news feed, for instance like you could call that AI to a certain extent, right? As it's feeding you information. People get mad at news feed; they don't get mad at AI, right? So like the notion that the public would generally turn on something like that seems almost unrealistic, right? Because you want to point at one thing, right?
I mean, I think it is basically that what the public thinks of as AI isn't AI, right? What we're actually talking about is this weird amalgam of popular culture, some research explanations that make it to the public, you know, all these sorts of things, and there's so much to the question of what the public actually thinks AI even is, which is really relevant to the discussion, right? Because the newsfeed assuredly is AI, right? It uses machine learning; it uses the latest machine learning to do what it does, but we don't really think about it as AI, right? Whereas like the car is like a...
I mean, I think a lot of robots kind of fall into this category where even robots that don't involve any machine learning are thought of as AI, right? And like actually impact a discussion about AI despite not actually being related to it at all in an absolute sense.
Well, then it sort of becomes a design challenge, right? It's like why these self-driving cars are shaped like little bubbly toys, right? They're so much less intimidating when you see it just like bump into like a little bollard on the street or whatever, but yeah, the robot, like the factory robot for instance, like those are terrifying to people, but they've always been terrifying. There's no difference here.
But surely there are positive things that you guys notice, you know, you're going around to these conferences. Like what questions are people asking you about AI? What is the public concerned about positively and negatively?
Mmm, so I think there's two things that are really top of mind, that I think keep coming up in the popular discussion around AI right now and also among research circles. So, the first one is the question of international competition and what it looks like in this space. It seems like China's making a lot of moves to really invest in AI in a big way. What does that mean for these research fields, right? Will the US and Canada and Europe sort of stay ahead in this game? Will they fall behind? And what does that mean if you think that governments are gonna see this as a national security thing? So, that's one issue I hear a lot about.
The second one I think is around the issues of interpretability, right? Which I think are a really big concern, which is: these systems make decisions. Can we render some kind of satisfying explanation for why they do what they do? And I use the word "satisfying" specifically there because there are lots of ways of trying to tell how they do what they do, but this question of how you communicate it is a whole nother issue. And those seem to be like two really big challenges. I'm sure Miles has other things.
Yeah, I mean there's a lot going on: the whole FAT (fairness, accountability, transparency) machine learning thing, and now there's FAT*, a conference series that's more general than just machine learning. And, you know, the broader community has been doing a ton of awesome work on those sorts of issues.
But you know, in addition to the transparency thing that Tim mentioned, I would also mention robustness. So that's a huge concern, and if you look at the offense and defense competitions on adversarial examples, it often seems like we don't really know how to make neural nets robust against deliberate or even unintentional things that could mess them up. Like, you know, they do really well according to one single number of, you know, human versus AI performance, but then if it's slightly outside the distribution, they might fail, or if, you know, someone's deliberately tampering with it. So that's a huge problem for actually applying these systems in the real world.
And I think, you know, we'll continue to see progress on that but we'll also see setbacks where people say, "Well, this proposal you had for defending, you know, no, that actually doesn't work." And then, there are all sorts of other things besides just adversarial examples. Like, you know, there was a recent paper called BadNets that talked about backdoors in neural networks. So essentially, someone can put a trained neural network on GitHub or wherever, and then, you know, it seems to work fine, but then you show it some special image and it goes wrong. So yeah, there are issues around that.
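As a rough illustration of the robustness point above (this is an editorial sketch, not something discussed in the conversation), the toy example below shows how a small, deliberate nudge to every input feature can flip a classifier's answer. It uses a hand-written linear model rather than a neural net, in the spirit of the fast gradient sign method; the weights, input values, and step size are all made-up assumptions chosen only to make the effect visible.

```python
# Toy illustration only: a hand-written linear classifier, not a real neural net.
# The weights, input, and epsilon below are hypothetical values for this sketch.

def score(weights, bias, x):
    """Linear score; a positive score means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def adversarial_step(weights, x, label, epsilon):
    """Gradient-sign-style perturbation for a linear model.

    For a linear score, the gradient with respect to the input is just the
    weight vector, so each feature is moved by epsilon in the direction that
    pushes the score away from the currently predicted label.
    """
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]

weights = [0.9, -0.6, 0.4, -0.8]
bias = 0.0
x = [0.5, 0.2, 0.3, 0.4]

original_label = predict(weights, bias, x)
x_adv = adversarial_step(weights, x, original_label, epsilon=0.1)
adversarial_label = predict(weights, bias, x_adv)

# Largest change made to any single feature
max_change = max(abs(a - b) for a, b in zip(x, x_adv))

print(original_label, adversarial_label, round(max_change, 3))
```

No feature moves by more than the chosen step size, yet the predicted class changes. Real attacks on neural networks are more involved, but the underlying idea of following the gradient sign is similar.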
In terms of positive applications, one area that is super exciting and that there's so much work on it that I've had to like sort of, you know, take a step back and like not even try to like tweet all the interesting stuff I see on it is health.
So pretty much every day on arXiv there's a new paper that's like, you know, superhuman performance on this dermatology task or this esophageal cancer task. So there's a ton of activity in that space, and a lot of it is specific to image recognition, like CT scans. I think that's kind of the low-hanging fruit. Yes, there's all this progress in image recognition, and things like adversarial examples aren't necessarily a problem in that domain. You're hoping that a patient isn't fiddling with their image or putting, you know, a little turtle on their chest when they're getting scanned so that it gives the wrong answer. So yeah, there's tons of applications there, but there's also just more general machine learning stuff like predicting people relapsing and having to come back to the hospital, when's the optimal time to send people home, or, given this huge data set of people's medical histories, what's the best diagnosis. So there's a lot of other applications.
Yeah, there was a workshop at NeurIPS a few years back (was it two years ago?) that was basically "AI in the wild"; that was the name of it. And I think that's a really good way of framing up a lot of the issues that we're seeing right now: we're moving out of the lab in some sense. The old tasks used to be just, could we optimize this algorithm to kind of do this thing better? But now there's a bunch of research trying to figure out what we do when we confront the practical problems of deploying these things in the world. And that links a lot of the interpretability stuff; it links a lot of the safety stuff; it links these questions that are specific to health. I think all these come out of the fact that the technology is really finally becoming practical, and so you have to solve some of these really practical questions.
Mmm, and so as far as like deploying this stuff in the wild in the health use case, like who is using it right now? Where are we seeing it?
A lot of it's pilot stuff. So, maybe a hospital here, a medical center there. I am not sure of, you know, any super widely deployed ones except for apps for very specific things like, you know, looking at skin lesions and stuff. But yeah, as I said, it's something that's so active that I'm not the best person to ask, because I haven't even, you know, tried to assess what's the hottest thing in this area. It's just like every day there's, "Oh, new pilot on this." But a lot of it, you know, as Tim said, is at this stage where it might get rolled out but it hasn't yet been rolled out. So, there are pilots on the one hand, but then there's also a lot of stuff that's just training on offline data and they're like, "Well, if we had implemented this it would have been good," but, you know, there are issues around interpretability and, you know, fairness and stuff like that that would have to be resolved before it was actually widely deployed.
Right. I mean, one of the interpretability debates that I'm loving right now is basically, so Zachary Lipton, this machine learning researcher, did this great paper called "The Doctor Just Won't Accept That," right? And it's basically a reference to that trope in a lot of the discussions where it's like, "Well, the doctor won't accept that," because it's not interpretable. And he's challenging, I think, what is a really big question, right? Which is, will they care in the end? Will interpretability actually matter in the end? And is the field, in some ways, over-indexing on that, or at the very least not thinking with as much nuance as it should about what kinds of interpretability are actually needed or expected in the space? And I think that's one big question: will these things become the norm for the technology, or will the market kind of adopt it even without those things?
And I think if you're worried about the safety of these technologies, it ends up being a question not just of can we develop the methods, but can they be something that's just expected that you use when you deploy the technology? Because it's possible that if you just leave it to the market, it will just kind of rush ahead without actually thinking about how to build those methods in.
Yeah, you’re totally fine using it. There’s always things in like—you probably see it with like, you know, anti-vaxxers.
Mmm-hm. They're like the old-school, homegrown version. Maybe they don't want to accept it, but the rest of the world seems totally fine with it. And just another point: I think there are likely to be differences cross-nationally in terms of who's gonna be willing to accept what. You know, countries in the European Union might be much more cautious, and at the EU level there might be a lot more regulation, these sorts of things. You know, there's this whole discussion around the right to an explanation and the General Data Protection Regulation. In China, I haven't seen as much concern about interpretability, though there are some good papers coming out of China. But in terms of governance, I haven't gotten the sense that they're gonna hold back the deployment of these technologies for those reasons.
And then in the US, maybe it’s like somewhere between the two. I mean, it's a real battle of like—I was reflecting on this because I saw a debate on interpretability recently where some researchers were like, "No one cares, let's just roll ahead with this stuff."
So just to pause and define that, just in case someone is listening who's not an AI person.
Yeah, sure. So I think the most colloquial way of talking about it is interpretability is kind of the study of the methods that let you understand like why a machine learning system makes the decisions that it does, right? Another way is like kind of like an audit to understand how you got this output.
That’s right, exactly right. And there's two sets of problems there. One of them is can you actually extract like a meaningful explanation to like technicians? And then there's the other question of just like from a user point of view, like, you know, just like a doctor or someone who's not like a domain expert on machine learning, being able to understand what's going on, right? Okay, right. And the debate, I think, focused on just like does it matter, right? Because I think there's some machine learning folks who are like, "Look, if it works, it works, you know," and that's ultimately gonna be the way we're gonna move ahead on this stuff. And some people say no, we actually want to have some level of explanation.
And I actually kind of got the feeling that in some ways this is sort of like machine learning fighting with the rest of the computer science field, right? Because when you're learning CS, it's very much about, can you figure out every step of the process, right? Whereas machine learning has always been empirical in some sense, right? In the sense that we just let what the data tells us train the system, right? And those are actually two ways of knowing the world that are in debate on this question of interpretability. It's sort of like statistical significance in bio where it's like, "I don't know, it works five out of five hundred times, like therefore it works fine."
It's not a computer, yeah. And so, what are people pushing for? Like for instance, you know, we're in the UK now; compared with the US, how are the conversations different?
Mmm, so I mean I think there are certainly very different regimes around what is sort of expected from explanation, right? This actually stems from some really interesting things about how the US thinks about privacy and how Europe thinks about privacy. But I would say in general, I think the US moves on a very case-by-case basis, so the regulatory mode is basically to say, "Look, in medical, that seems to be a situation where there are particularly high risks, and we want to create a bunch of regimes that are specific to medical," whereas in Europe, I think there are broader regimes where the frame is, for example, automated decision-making, okay? Right? And the GDPR applies to automated decision-making systems, which is very broad.
And the actual interpretation will narrow that considerably, but you start from a big kind of category and you narrow down, versus the US approach, I think, which is much more about just starting from the domain that we think is significant. So it's more patchworky, I guess, in that sense.
Yeah, you would agree?
So, I’m curious about your PhD. What are you working on? And you're almost done.
So, yeah, so I'm studying science policy, and the work on my dissertation is on what sorts of methods are useful for AI policy. And, you know, the problem that I posed is that there's so much uncertainty. There's uncertainty, as we were just talking about, about where AI will be applied. But then there's also, you know, deep expert disagreement about how long it will take to get to certain capabilities like human-level AI, or even whether that's well-defined, let alone what happens after. So I've been taking more of a scenario planning approach, like let's think about multiple possible scenarios, and I've done some, you know, workshops and I'm trying to understand, you know, is that a useful tool? And also, can we do models that sort of express this uncertainty in some sort of formal way?
Yeah, there's a lot of like history you've looked into there too?
Yeah, yeah, so I mean I think that one way to—yeah, so I mean people have been talking about AI ethics and AI governance for a long time but there hasn't been much dialogue between, you know, this world and then the other worlds of like, you know, science policy and public policy. And, you know, one way to think about it is that AI is sort of less mature in terms of its, you know, methodological rigor. You know, the best we've sort of come up with is like let's do a survey of some experts, whereas, you know, you look at something like climate change, you know, they not only like, you know, do surveys of experts but also like synthesize that expertise into like an IPCC report that's supposed to be super authoritative and has, you know, error bars for everything and like levels of confidence in different statements. They have this whole process; they have models of different possible futures given different assumptions. Everything's sort of much better spelled out in terms of, you know, the links between assumptions and policies and scenarios.
So I think, yeah, I'm trying to take one small step in that direction of like more rigor and more sort of clarity of, you know, what are the actual disagreements.
Mmm, are you guys familiar with the history of policy? Because I was driving over here with my girlfriend, and she asked me, has this policy ecosystem around AI always existed around CS? Like for instance, you know, when writing started, were people questioning the policy of, like, what does this mean? Is this a new phenomenon given that, you know, you can establish, for lack of a better word, a personal brand and disseminate it out to the world? Or have there kind of always been policy advisors, just not in as great a number as you guys, working directly with governments and companies and stuff like that?
Yeah, I don't know about writing, but definitely early on, with things like nuclear weapons and nuclear energy, and, you know, solar energy and coal and cars, there were people debating the social implications, and there were calls for regulation, and there were conflicts between, you know, the incumbent interests and the startup innovators. So I think, you know, those sorts of issues are not new. I think what's more new is, as you said, there's an ability to spread, you know, views more quickly and to have sort of global conversations about these things.
Mmm-hmm, yeah. I mean I think it’s just sort of linked to the notion of like having specialists develop policy at all. Like I think that's like kind of the history of this, right? Which is like when do certain situations become considered so complex as to require someone to be able to like be like, "Okay, I can become an expert on it and be like the person who's consulted on this topic."
And I think a little bit about what is the supply of policy and then also what is the demand for policy, right? So in the nuclear war case, right, governments have a lot of interest in trying to figure out how we avoid chucking nuclear bombs at one another. So suddenly there's a really strong demand; there's also funding. There are all these reasons for policy people to kind of enter the space. And I think AI is sort of interesting in that it kind of floats in this middle zone right now, right? You see this happen a lot where people are like, "AI, it seems like a really big deal," but then they get into the room and they're like, "So what are we doing here exactly? Like what is policy and AI?"
And I think that is part of the challenge right now is trying to figure out like what are the things that are really valuable to kind of work on if you think this is going to continue to become like a big issue because right now the technology is nascent in a way that we can argue about the relative impact of it at all, right? And then we can argue about like does it make sense to actually have kind of like policy people like you guys.
I mean obviously there are a lot of machine learning papers coming out all the time, but you're very much at the forefront. Oftentimes I feel like you're sort of ahead of the curve a little bit, anticipating the needs and demands of a company or of a government, and so planning ahead for the future. Are you just waiting for data to come? Are you getting inside companies to see what they're working on? Are you learning about the hardware? How are you spending your time to figure out what's coming next?
Yeah, I mean a lot of it is just talking to people, talking to people working on hardware and, you know, in industry and academia and like what they're working on. And sort of, you know, I mean, I find it personally helpful to have some sort of predictions or, you know, explicit model of, you know, what’s the future. And, you know, I’ve written some like blog posts about this, like my forecast for short-term—so like in 2017 I made a bunch of predictions; I found that to be a super useful exercise because then I could say okay, what was I wrong about? And was there, like, were there systematic ways in which I can sort of be better about anticipating the future next time?
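One way to make the "what was I wrong about?" exercise concrete is to score predictions once the outcomes are known. The sketch below is an editorial illustration, not the method described in the conversation: it uses the Brier score, a common measure for probabilistic forecasts, plus a simple signed average to look for systematic over- or under-confidence. The forecast probabilities and outcomes are hypothetical.

```python
# Editorial sketch: scoring a set of yes/no forecasts after the fact.
# The (probability, outcome) pairs below are made up for illustration.

def brier_score(forecasts):
    """Mean squared gap between stated probability and outcome (0 is perfect)."""
    total = sum((p - (1.0 if happened else 0.0)) ** 2 for p, happened in forecasts)
    return total / len(forecasts)

def mean_signed_error(forecasts):
    """Average of (probability - outcome); positive means events were over-predicted."""
    total = sum(p - (1.0 if happened else 0.0) for p, happened in forecasts)
    return total / len(forecasts)

# Each entry: (probability assigned to the event, whether it actually happened)
forecasts = [(0.9, True), (0.7, True), (0.6, False), (0.2, False), (0.8, False)]

score = brier_score(forecasts)
bias = mean_signed_error(forecasts)

# Baseline for comparison: answering 50% to every question
always_unsure = brier_score([(0.5, happened) for _, happened in forecasts])

print(round(score, 3), round(bias, 3), always_unsure)
```

A score below the 50% baseline suggests the forecasts carried some information, and a signed error far from zero points to a systematic lean that could be corrected next time.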
Yeah, and I think we had asked an interesting question about like what is policy expertise because it's like different in different situations.
Yeah, imagine the nuclear case. The nuclear case is pretty interesting because early on the experts from a policy perspective also were the physicists, right? And you can imagine that existing in a technical field, where society is like, "Okay, what do we do with this technology?" And the response is, "Well, the scientists working on it will tell you about it," right?
But it's sort of interesting in that there has been the development of a community of people that I feel is fairly nascent, which suggests to me that there are at least two options, right? One of them is that the technical field could be doing more policy stuff but isn't right now, so it's an arbitrage, maybe.
Yeah, I mean that's maybe one way of thinking about it. I mean, but there's also like this other question of just like what are other things that might help to inform the technical research, okay, right? Like I think a lot of my policy work really is like translation work, right? We like talk to policy people who are like, "Well, I understand like liability," and I'm like, "Well, you know, this is, it's mixed up because of AI because of ABC reasons," right? And so, like it’s bringing like the technical research to an existing policy discussion. There's also the reverse that happens, right? Which is basically like researchers being like, "What is this fairness thing?" right? And you're like, "Well, it turns out that you can't just create a score for fairness. There are these really interesting things that people have written about, you know, like how do you think about translating that into the machine learning space as well?"
Which is kind of where you can read like FAT ML doing and so I think that that translation role is like—it’s by no means certain but in the AI space seems to have been like a useful role for people to play again thinking about like what is the, like, policy supply and policy demands.
Yeah, absolutely. Yeah, I think collaboration is super important between people interested in the societal questions and the technical questions. And, you know, it's rare, not just in AI, but in other cases to like have the answer like, you know, readily available. So with like the IPCC for climate change, like they have to go back to the lab sometimes and do new studies because they're trying to answer policy-relevant questions.
So I think AI might be the sort of case where there's sort of this feedback loop between people saying, "Okay, here are the questions that AI people need to answer," like "here are the assumptions we need to flesh out," like in terms of, you know, how quickly will we have this capability, and so forth, that you can't just find that existing on arXiv. Like the answers aren't just lying out there ready to be taken by policy people. I think there needs to be this sort of collaboration.
Yeah, I'd love to actually look into the history of how it evolved in climate science, right? Because you can imagine a situation, and you hear this from some machine learning people sometimes, who are just like, "I just program the algorithms, man. Other people have to deal with, I don't know, the implications of that," right?
And presumably you could have that in the climate space as well, where researchers could be like, "All I do is really measure the climate, man. You decide if you want to change the emissions. Gross, not my deal." But clearly that field has made the choice to basically say, "In addition to our research work, we have this other obligation," right, which is to engage in this policy debate.
Right, and I think what is really interesting is what the field actually thinks its responsibilities even are, and then how other kinds of skills or talents arrange themselves around that.
So then the question ends up being like, Tim, you were at Google before.
Mmm-hmm.
Miles, you're at the Future of Humanity Institute. How do you guys deal with policy both within an institute and within a company? What are the differences, and how do those relationships work?
Yeah, definitely. So I've got kind of a weird set of experiences, I think, just because I was doing public policy for Google, so that was very much on the company side of things, and now I'm doing a little bit of work with Harvard and MIT on the ethics and governance of AI. And I have been doing work with the Oxford Internet Institute as well. It is interesting the degree to which you actually find that people in both spaces are often concerned about the same things, but the constraints that they operate under are very different, right? So, on both sides, I talked to a bunch of researchers within Google who are very concerned about fairness.
I talked to researchers outside of Google who are in civil society, right, who are very concerned about fairness. Have you found the same to be true?
Yeah, yeah. So I think there are people worried about the same issues in a bunch of different domains, but they differ in terms of, you know, how much time they're able to focus on them and what sorts of concrete issues they have to answer. So like if you're in industry, you have to sort of think about the actual applications that you're rolling out or like, you know, fairness as it relates to this product, you know, assuming that you're working on the application side.
There are also researchers who are interested in the more fundamental questions. But in terms of different institutions, if you're in government, you might have a broader mandate but you don't have the time to drill down into every single issue. You need to rely to some extent on experts outside the government who are writing reports and things like that. And then if you're in academia, you might be able to take a super broad perspective, but you're not necessarily as close to the cutting-edge research, and you have to rely on having connections with industry.
So, for example, at the Future of Humanity Institute, we have a lot of relationships with organizations like DeepMind and OpenAI and others. But we don't have a ton of GPUs or TPUs running the latest experiments outside of some specific domains like safety. So, yeah, I think having those different sectors in dialogue is super important in order to have a synthesis of what the actual pressing practical problems are, what governance issues we need to address across this whole thing, and what issues we need people to drill down on and explore in a free-rein, wide-ranging way, including those further down the road.
And so what does the population look like here of researchers? I'm curious in the sense of like who's around, like influencing your ideas? Like what are their backgrounds? What are they working on?
Yeah, so at the Future of Humanity Institute, it's a mix of people. There are some philosophers and ethicists, some political scientists, some mathematicians. Not everyone's working on AI, but AI and biotechnology are two technical areas of focus, alongside more general issues related to the future of humanity, as the name suggests. So it's pretty interdisciplinary. People aren't necessarily working just in the domain that they're coming from. The mathematicians aren't necessarily trying to prove math theorems, but rather bringing that mindset of rigor to their work and trying to break down the concepts that we're thinking about.
Yeah, I'm curious about this too, because I've never really understood this about FHI. Is the argument that, in thinking about existential risk, there are practices that apply across all these different domains, or do they operate as separate research areas?
We should pause there. Is existential risk at the crux of FHI being founded?
Yeah, so it's a major motivation for a lot of our work. The book Superintelligence, by our founder, talked a lot about existential risks associated with AI. But it's not the entirety of our focus. We also are interested in long-term issues that aren't necessarily existential, and also in making sure that we get to the upsides. I think I'm ultimately pretty optimistic about the positive applications of AI.
So I think we work on a range of issues, but yeah, there are a lot of people who come at this from a very conceptual, utility-maximizing, philosophical perspective of, "Whoa, if we were to lose all the possible value in the future of humanity, that would be one of the worst things that could possibly happen." And so reducing the probability of existential risk is super important even if AI is decades or centuries away.
And even if we can only, you know, decrease the probability, you know, of that happening by like 0.1% or whatever in expectation, that's like a huge amount of value that you're protecting.
So before we wrap things up, I'm curious about your broad thoughts: what should we be concerned about in the short term around AI, and in the long term? And then how do the two mix together?
Yeah, definitely. I think this is one of the really interesting things: at least within the community of policy people and researchers, there has been this kind of beef. Well, maybe beef is a little dramatic, but a small beef between what we might call the long-term people, like you're talking about, who are concerned about AGI and existential risk and all these sorts of things, and the short-term people saying, "Well, why do we focus on that when there are all these problems with how these systems are being implemented right now?"
And yeah, I think that is one of the enduring features of the landscape right now, but I think it's an interesting question as to whether or not that will be the case forever.
I don't know. Miles, I know you've had some thoughts on that.
Yeah, I think there are common topical issues over different time frames. Both in the near term and the long term, we would want to worry about systems being fair and accountable and transparent. Maybe the methods will be the same, or maybe they'll be different over those different time horizons. And I think there are also going to be issues around security over different time horizons. So I think there is probably more common cause between the people working on the immediate issues and the long-term issues than is often perceived. There are some people who see it as a big trade-off over who's gonna get funding, or who think one side is getting too much attention in the media. But I think the goal of most of the people working in this area is to maximize the benefits of AI and minimize the risks, and it might turn out that some of the same governance approaches are applicable.
Like it might turn out that solving some of these near-term issues will set a positive precedent for solving the longer-term ones and start building up a community of practice and links with policymakers and expertise in government. So yeah, I think there's a lot of opportunity for fusion.
Yeah, I mean, you're in this safety community. Do you hear people talking about this? I use the phrase FAT AGI, which I think is just fascinating as a term because it marries together these two concepts so well.
But I don't know, is that being talked about at all?
Yeah, so I think there's common cause. To take a step back, one term that people often throw around in the AI safety world, particularly looking at long-term AI safety, is value alignment. So how do you actually learn the values of humans and not, you know, go crazy, to put it that way. But I think you could frame a lot of current issues as value alignment problems. So things around bias and fairness.
So I think ultimately, you know, there's a question of how do you extract human preferences, and how do you deal with the fact that humans might not have consistent preferences and some of them are biased. So I think, you know, ultimately those are issues that we'll have to deal with in the near term and like might take a different form in the future if AI systems are operating, you know, with a much larger action space. They're not just like classifying data, but they're, you know, taking, you know, very long-term decisions and thinking, you know, abstractly but yeah, I think, you know, ultimately the goal is the same; it’s still like get, you know, the right behavior out of these systems.
Mmm-hmm, and that was very interesting, because the example that you just gave was saying that a lot of the fairness problems we're dealing with right now are actually value alignment problems, yes? The problem there is basically that the system doesn't behave in a way that's consistent with human values.
Yeah, so that's the F in the FAT acronym. To take accountability and transparency, I think there's also common cause. One of the issues I've been toying with recently is that transparency might be a way of avoiding certain international conflicts, or it might be part of the toolbox.
So, historically, in arms control agreements like those around nuclear weapons and chemical weapons, there have been things like on-site inspections and satellite monitoring, all these tools that are bespoke for the domain. But the general concept is that we would be better off cooperating, and we will verify that that behavior is actually happening. So if we detect defection by the Soviet Union, or the Soviet Union detects defection from us, then each side can respond appropriately. That way we can build trust: "trust but verify," in Reagan's terminology.
And I think that if you actually had the full development of the FAT methods and you had accountability and transparency for even general AI systems or superintelligence systems, I think that would open up the door for a lot more collaboration if you could sort of credibly commit to saying, "Okay, you know, we're developing this general AI system, but you know these are its goals or this is how it learns its goals, and, you know, we're sort of, you know, putting these hard constraints on the system such that it's not gonna attack your country," or whatever.
Yeah, I mean, one of the things that's so intriguing about it, and the reason why FAT AGI for me is kind of a crazy idea, is that typically in the literature around AGI, the idea that it would be accountable and that it could be transparent is usually considered impossible, right?
I mean, the move you're making is to say that actually we might be able to do it.
Yeah, well, there are differences of opinion on how interactive the development of an AGI would be, and the extent to which humans will be in the loop over the long run or at the end, right? Paul Christiano at OpenAI, for example, has a lot of really good blog posts, and some of these ideas are in the paper "Concrete Problems in AI Safety," about the idea that what he calls corrigibility might actually be a stable basin of attraction, in the sense that if a system is designed in such a way that it's able to take critical feedback and say, "Okay, yeah, what I was doing was wrong," that might stabilize in a way that it's continuously asking for human feedback.
So it's possible that accountability is an easier problem, even for very powerful systems, than we realize. Maybe Trump aside, to be more topical, there are powerful people in the world who actually seek out critical feedback, want to hear diverse inputs, and want to make sure that they're doing the right thing, right?
Right, but this is actually really interesting because it's both short term and long term again, right? If we could get the research community to have certain norms around ensuring that we are seeking to build corrigible systems, right?
Yeah, that might set the precedent that the AI that eventually arises will be one which is actually consistent with FAT, right? Versus not. We actually have control over the design of the thing.
I've always had such trouble understanding the people who thought there are these AI engineers who are trying to take over the world with their AGIs. Like, "No, they're gonna die too!" The incentives are aligned. Just imagine this apocalyptic scenario.
But do you guys have strong opinions on people working in public versus working in private? I know there's somewhat of a debate around development.
Mmm-hmm, yes, you mean like working in the US government?
No, no, sorry, I mean in trying to build an AGI.
Like releasing some amount of your data or training data publicly versus keeping it private.
Yeah, so that's a super interesting question, and we sort of broached the topic in this report on the malicious uses of AI, because I think there might be specific domains where it matters. Maybe it isn't necessarily the world we're in today, but imagine a world in which there are millions of driverless cars and they're all using the same convolutional neural net that is vulnerable to this new adversarial example that you just came up with. You might want to give those companies a heads-up before you just post it on arXiv and someone can cause tens of thousands of car crashes or whatever.
So I think we might want to think about norms around openness in those specific domains, where the idea isn't to never publish, but to have some sort of process. But as far as general AI research right now, the community is pretty open, and I think it's both in the broad interest and in the individual interests of companies to be fairly open. I mean, they want to recruit researchers, and researchers want to publish. So I think there's a pretty strong norm around openness.
But if we were in a world where there was like more widely perceived, you know, great power competition between countries or where the safety issues were a lot more salient or there were some catastrophic misuses of AI in the cyber arena, then I think people might think twice, and it might be appropriate to think twice if, you know, your concern is that you know, the first people to, you know, press the button if they're not, you know, conscious of all the safety issues could cause a huge problem.
Yeah, I'm very pro-open publishing. I think it should be the default, and there are still few situations where I'd say you shouldn't publish on this stuff, just because I think it is actually to the benefit of everybody to know what the current state of the field is, because it allows us to make realistic assessments.
Regardless of whether or not you believe in AGI, you believe in superintelligence, like, you know, like it’s useful just to know like what can be done because even if you're thinking about the more prosaic like bad actor uses, right? Like it's useful to know like what are the risks and we can’t do that in an environment where like lots of people are kind of holding back.
Yeah, that's a great point. Okay, so Miles, last year you wrote about predictions for 2017, or was it 2018?
Yeah, okay. I made the predictions early 2017 and then I reviewed them like a month ago.
Okay, this year, 2018, can you give us a prediction? Since you haven't prepared, you can have a three-year time frame.
Sure, yeah. I think there will be superhuman StarCraft and Dota 2 probably in that time horizon. I think in early 2017 I gave like a 50% chance by the end of 2018, so this gives me more runway. I'm like seventy percent confident that there'll be superhuman StarCraft.
I'm actually less familiar with Dota 2, so...
Well, I'll say just StarCraft, alright.
Okay, Tim?
I think meta-learning will improve significantly. So this is basically treating the design of machine learning architectures as if it were its own machine learning problem.
Mhm. It's something that basically is done by like machine learning specialists right now, and the question is how far will machine learning researchers go in replacing themselves essentially? And I think that will get really good in ways that we don't expect.
And your insight into why that will happen is what?
It's some of the results that we're seeing from the research right now. It just seems like these networks are able to tune their parameters in a way that at least I would not have expected. And so it's cool seeing that adapt and advance.
These are all positive things.
Alright guys, well, thanks for giving us your time.