Oliver Cameron (CEO, Voyage) - MIT Self-Driving Cars
All right, welcome back to 6.S094: Deep Learning for Self-Driving Cars! Today we have Oliver Cameron, who is the co-founder and CEO of Voyage. Before that, he was the lead of the Udacity self-driving car program, which made ideas in autonomous vehicle research and development accessible to the entire world. He has a passion for the topic and a genuinely open nature that makes him one of my favorite people in general, and one of my favorite people working in this space, and I think thousands of people agree with that. So please give Oliver a warm welcome.
Thank you very much, Lex, and thank you all for having me here today. I'm super excited to speak all about Voyage, but in reality, what I want to share today is what this title says: how to start a self-driving car startup. Rarely do you get an inside scoop on how a startup is formed. You hear all the PR, all the very lovey-dovey press releases out there. I want to share the inside story of how at least Voyage came to be, which was a little unconventional compared to your average self-driving car startup. They always tell you that the path a startup takes to the goal you want is kind of a zigzag. Ours was kind of an insane zigzag as well. We'll go through all of that stuff.
Let’s talk about my background. Also a little unconventional. I’m not very good at learning in a classroom. For me, I found learning by doing, by building, has always been the thing that’s worked best for me. So going all the way back to when I was a teenager, software just, in general, was my passion. This idea that you can make something out of absolutely nothing, and then all of a sudden millions, and in Facebook's case, billions, of people can be using that thing. After building lots of crazy stuff and perhaps not being too popular in high school because that's all I did, I started a company.
I won't bore you with all the details, but I learned a lot during the experience, and we went through Y Combinator, which I believe started right here in Cambridge, which is very cool. Then this very pivotal moment happened to me. I heard about this online class which was generating a whole bunch of buzz and lots of controversy, and it was from this guy called Sebastian Thrun. He'd taken this Stanford class he taught in artificial intelligence and just said: screw it, we're gonna put the whole thing online. Back then—and this was around 2011—this was a very controversial thing to do. Today MIT and many others do this all the time, but back then there was a hell of a lot of controversy around doing something like this. But this learning format really appealed to me.
Being able to sit in front of my laptop, learn at my own pace, build, build, build was something that really resonated with me. I took this class in 2013, Artificial Intelligence for Robotics, and this again was just this pivotal moment. My head exploded! All the enthusiasm I’d had in software kind of transferred to artificial intelligence and robotics, and I just became addicted to the format of what are now called MOOCs—massively open online courses. I loved them so much I decided, hey, I want to go do this and help others learn this stuff. So hey, let's go join Udacity and build more classes like this.
So I did that for four years, and led our machine learning, robotics, and eventually our self-driving car curriculum, which was a lot of fun. I got to learn directly from two great company builders—truly great company builders. One was Vish Makhijani, the operator extraordinaire of Udacity, who understood how to build a company, how to build a culture, how to incentivize, and how to do all those things that we don't often talk about. And Sebastian Thrun, who of course founded the Google self-driving car project in its early days, and right now I believe he's building flying cars. Just in general, I learned so much from him. But this idea that you are literally in control of your destiny, that you can build absolutely anything if you put your mind to it, was always pretty inspirational.
Today, of course, I build self-driving cars at Voyage, and we'll talk more about what makes us special compared to the other self-driving car companies you may have heard of in this class and beyond. Let's talk about Udacity—raise your hand if you've heard of Udacity. Very curious. There you go, that's most of the room. Udacity, like I said, was founded by Sebastian Thrun. He took this class online, it all just exploded, and he built a company around it. Udacity's real focus is on increasing the world's GDP: this idea that talent is everywhere, that it isn't just constrained to the best schools in the world. Because of this proliferation of content, there are talented students all over the world, and all they need is the content with which to build crazy, cool, world-changing things.
What I see as my job today is to go out into the world and find these ridiculously talented people and then put them to work on the hardest problems that exist, and Udacity to me felt like the perfect place to do this. As a kind of prelude to this, about three years into Udacity, we had had this real focus, like I said, on machine learning and robotics, but we really wanted to take it to the next step. We came up with this kind of concept internally that we called ‘Only at Udacity’. What if we taught the things that other places weren't teaching? What if people all around the world could come and learn from what may appear to be niche topics but were just being taught at the right time because that industry was about to blow up?
The first one we did of this—and we've done some after, including flying cars—was a much more in-depth curriculum on artificial intelligence with self-driving cars. This is a quick video that introduces it, and this is of course Sebastian Thrun, robotics legend. [Video plays; the audio is partially inaudible, but Sebastian describes how transformational self-driving cars will be, the enormous market, the many companies you wouldn't suspect entering the field, and the goal of teaching everyone in the world to become a self-driving car engineer.]
Why did we want to do this? Our goal was to accelerate the deployment of self-driving cars. Like Sebastian says in that video, there are a number of reasons why self-driving cars are transformational. At the time—this was around 2016—it felt like self-driving cars were just taking a little bit too long.
If we rewind to that particular spot in time, Google was really the only main effort going on, and we believed it needed to happen faster. One of the reasons it wasn't happening fast enough is that there wasn't enough talent in the space. So what we decided to do was, like I said, build something quite special: pair a world-class curriculum with an actual self-driving car, which we'll talk about more, and what we called our open-source challenges, and all of that would come together into this quite special curriculum.
So let’s start with the curriculum. One of our beliefs was that partnering with industry was the right way to go. That was because it felt—and I believe this—that the knowledge of how to build a self-driving car was not necessarily trapped in academia; it was trapped in industry. So we had to go straight to industry, work with engineers that were already challenging themselves with these problems, and get them on camera—have them teach the concepts that they know and build day-in, day-out and have that be transplanted to thousands of minds around the world.
These are just some of those partners; there were many, many more. But we had a real focus on finding these engineers wherever they may be and getting those folks on-camera. We also built an incredibly talented team. This is just a small snippet of the curriculum team, but of course Sebastian Thrun was a big part of this curriculum. When I told folks that I’d gotten the chance to work with him on specifically self-driving cars, he likened it to getting basketball lessons from Michael Jordan, which I thought was pretty fun, and they were probably just as entertaining.
But some truly great folks worked on this curriculum, and are still doing that to this day, who deserve all of the credit, frankly. Here's a quick photo of the first lecture recordings with eventual Voyage co-founders Eric and Mac. Eric, who's on the left, hates this picture, and here's why—there you go, he still isn't at Mac's height. But we built a whole 12-month curriculum to take an intermediate software engineer, who may be in consumer software or some other part of the software world, and bring them into self-driving cars. We wanted to cover perception, prediction, planning, localization, controls—the whole breadth of the industry.
The reason we want to do that is because we saw the best fit for a Udacity student not necessarily being a specialist in a niche, for example, you know, just perception, although there have been a whole bunch of folks doing that as well, but that the skills of a Udacity student tend to pair themselves well with being a generalist, someone that can contribute all across the stack.
So we tried to give these folks that breadth of knowledge. The curriculum we built included real-world projects—just like real engineers work on—that people had to complete. In term two, you'll learn about sensor fusion, localization, and control. This is fundamental robotics that every self-driving car engineer needs to know in order to actually move the vehicle through space.

In the localization module, you'll build a kidnapped vehicle project, which takes a vehicle that's lost and figures out where it is in the world with the help of sensor readings. In miniature, this is exactly what real self-driving cars have to do every time they turn on in order to know where they are in the world. In the control module, you'll build a model predictive controller, which is a really advanced type of controller that models how real cars move through the world and uses the steering wheel, throttle, and brake to follow a set of waypoints, or trajectory, to get from one point to another.
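The kidnapped vehicle project described above is classically solved with a particle filter: scatter position hypotheses everywhere, then repeatedly move, weight, and resample them against sensor readings. This is a minimal numpy sketch of that idea, not the actual project code—the landmark map, noise levels, and simple 2D motion model here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed map: three landmarks at known positions (illustrative).
LANDMARKS = np.array([[5.0, 5.0], [15.0, 5.0], [10.0, 15.0]])
SENSOR_STD = 0.5   # std-dev of range measurements (assumed)
MOTION_STD = 0.1   # std-dev of motion noise (assumed)

def measure(pos):
    """Noisy distances from `pos` to every landmark."""
    d = np.linalg.norm(LANDMARKS - pos, axis=1)
    return d + rng.normal(0.0, SENSOR_STD, size=d.shape)

def particle_filter(controls, measurements, n_particles=2000):
    # 1. Initialize particles uniformly: the vehicle is "kidnapped",
    #    so it could be anywhere in the 20x20 world.
    particles = rng.uniform(0.0, 20.0, size=(n_particles, 2))
    for u, z in zip(controls, measurements):
        # 2. Predict: move every particle by the control, plus noise.
        particles += u + rng.normal(0.0, MOTION_STD, size=particles.shape)
        # 3. Update: weight each particle by how well its predicted
        #    landmark distances match the actual measurement z.
        d = np.linalg.norm(particles[:, None, :] - LANDMARKS[None], axis=2)
        w = np.exp(-0.5 * ((d - z) / SENSOR_STD) ** 2).prod(axis=1)
        w /= w.sum()
        # 4. Resample particles in proportion to their weights.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return particles.mean(axis=0)  # position estimate

# Simulate a vehicle starting at (2, 2) and driving diagonally.
true_pos = np.array([2.0, 2.0])
controls, measurements = [], []
for _ in range(15):
    true_pos = true_pos + np.array([1.0, 1.0])
    controls.append(np.array([1.0, 1.0]))
    measurements.append(measure(true_pos))

estimate = particle_filter(controls, measurements)
print(estimate)  # should land near the true final position, (17, 17)
```

Despite starting with no idea where it is, the filter collapses onto the true position within a couple of measurement updates, which is exactly the behavior the project asks students to demonstrate.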
In term three, you'll learn about path planning and electives. You'll also learn about system integration. Path planning is really the brains of a self-driving car; it's how the vehicle navigates from one point to another, as well as around obstacles in between. I'm going to give you a sneak preview of how it works, and this is something that nobody's ever seen before, so get ready.

Path planning involves three parts. There's prediction, which is figuring out what the other vehicles around us are going to do. The video goes on for a while, so suffice it to say that the impact of this curriculum was bigger than we thought it would be. When we pitched this idea as a small team to Sebastian and Vish at Udacity, there was a lot of skepticism that something like this was going to be successful.
The reason there was skepticism is that one of the formulas Udacity looked at to determine the impact of building a certain type of content was the number of open jobs available. In, for example, web development or mobile development, all that good stuff, there were millions of jobs open, so it felt like there was a massive opportunity to impact those areas. But if you were to search in 2016 for self-driving car engineers, or the different disciplines that exist within, it was kind of just Google.
It was very interesting just to see the instantaneous reaction that we had to launching this curriculum. Today, there are over 14,000 successful students from all around the world. The most exciting thing is to see what students have done with this. For example, I learned recently that a set of our students are building a self-driving truck in India. Another set of students in South Korea are building a perception engine for self-driving cars. A whole bunch of folks are building truly amazing things, and not only that, they've gotten jobs at Cruise, Zoox, Waymo, and all the big names, and are actively impacting those companies today.
Now for the fun stuff. We also decided to make a curriculum extra special, and we decided to do that by building an actual self-driving car. Whenever I talked about this internally, Udacity people asked me why—why do we need to do this? Right? Isn’t the curriculum just enough? Why go to the, you know, length of building an actual self-driving car? And selfishly, some of it was just a personal want to, you know, build a self-driving car. But the reasoning that I use is that what better way to prove to these students that putting their faith in us that we know what we're doing than to build our own self-driving car?
And also, what better way to collaborate with these students in an area that is really in its infancy than, again, by having this platform where students could actually run code on a car? So we decided to buy a car, and we'll talk more about that in a second, but we set ourselves a milestone for our self-driving car: to drive from Mountain View to San Francisco—32 miles of driving—with zero disengagements. It wouldn't necessarily be repeatable; it wasn't going to be zero disengagements every single time, because otherwise we'd have built an actual self-driving car.
But in a short period of time, how much progress could we make towards this stated goal? Raise your hand if you've been on El Camino Real in that sort of region. Okay, so you probably understand it's got a lot of traffic lights. In fact, on our route, about 130 traffic lights; it's a multi-lane road—three lanes—with a speed limit of about 40 to 45. It's fairly complex, but it's also got some constraining factors, which is what we were looking for.
So we focused our tech efforts. This is the car we bought; if you follow self-driving cars, you're probably very familiar with the Lincoln MKZ. They're everywhere, and there's a reason for that, in terms of the drive-by-wire nature of the vehicle and other stuff. We outfitted it with a whole bunch of sensors—some cameras, some lidars, all that good stuff. We also tried to build our own mount; we affectionately call this the periscope. I don't know why it's in slow motion, but this was not our final design. We built all this from parts at Home Depot—a true MVP.
Then we got to work. The goal was to accomplish that milestone within six months. So, of course, we had to work fast and assembled a dream team of folks that I worked with on different projects at Udacity. They also wanted to come and dabble in this—folks that worked on the machine learning curriculum, robotics curriculum, etc. So this was one of our first days testing, and we did this at the Shoreline Amphitheater parking lot.
It's now a very popular place to test self-driving cars in the Bay Area, because Google used to do it there in the past. We saw a lot of weird stuff. For example, you'll see here what I believe to be a motorcycle gang. And we made progress. We kept iterating, kept building, and it started to come together. In fact, some stuff that we thought wouldn't work surprisingly just started to work. This is on El Camino Real; I'm in the backseat here. Mac discovered that we shouldn't have stopped at that traffic light, but we did; we resolved the mystery later.
Let's go to the next video. Of course, we learned a lot by going this route—the different behaviors of drivers. One of the things that we were worried about was vehicles cutting us off. And when we say cutting us off, it means a vehicle pulling out in front of us, even a few hundred feet ahead. You'll see here we drove a little slow—25 mph—and that was fine.
Pretty soon, it got quite boring; the car was doing very well driving itself. We built some cool algorithms to change lanes when necessary, similar to what you see with Tesla autopilot these days. We collaborated with some students on a traffic light classifier, which was integrated into ROS there, and yep, pretty boring stuff. So you can tell Eric was surprised that it was just fine.
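As an aside, a traffic light classifier like the one the students contributed can be prototyped without any deep learning at all, using simple color statistics on a cropped image of the light—a common baseline before training a CNN. This numpy-only sketch is my own illustration, not the students' actual code; the thresholds and synthetic test images are invented:

```python
import numpy as np

def classify_light(crop):
    """crop: HxWx3 uint8 RGB image of a traffic light. Returns a label."""
    rgb = crop.astype(float) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    bright = rgb.max(axis=-1) > 0.6          # only consider lit pixels
    # Count pixels matching each lamp color (thresholds are assumptions).
    red    = (bright & (r > 0.6) & (g < 0.4)).sum()
    yellow = (bright & (r > 0.6) & (g > 0.6) & (b < 0.4)).sum()
    green  = (bright & (g > 0.6) & (r < 0.4)).sum()
    scores = {"red": red, "yellow": yellow, "green": green}
    return max(scores, key=scores.get)

def synthetic_light(state):
    """Black 30x10 'traffic light' with one lit lamp, for demonstration."""
    img = np.zeros((30, 10, 3), dtype=np.uint8)
    rows  = {"red": slice(0, 10), "yellow": slice(10, 20), "green": slice(20, 30)}
    color = {"red": (255, 30, 30), "yellow": (255, 220, 30), "green": (30, 255, 30)}
    img[rows[state], :] = color[state]
    return img

for state in ("red", "yellow", "green"):
    print(state, "->", classify_light(synthetic_light(state)))
```

In practice a heuristic like this breaks down with glare, occlusion, and varying light housings, which is exactly why the students moved to a learned classifier wired into ROS.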
We also had a penchant for recording themed videos, like you may have seen from Elon Musk and the Tesla team with "Paint It Black." We've got our own version of that. Eventually, we became pretty confident, but we always wanted to test throughout the day just to get the most learnings out of everything. This video was made at 2:30 a.m., driving from Mountain View to San Francisco, all 32 miles, with a backing track.
You might think we toned it down to make it easier, because there's less traffic at that hour, right? This was kind of cheating and didn't count as the milestone, just to be clear. You'll see that we eventually hit the 32 miles, and Mac, who's in the driver's seat, was pretty excited about that. Of course it didn't count, because the middle of the night is not when this would be a very useful route. But it was an awesome accomplishment just to make it 32 miles with no disengagements, with traffic lights, lane changes, all that good stuff.
But after four months—this is in the daytime; this began, I believe, at like 6—sorry, 7 a.m.—we accomplished it! That small team had come together and built something pretty cool that could handle, again, multi-lane roadways, varying speed limits, traffic lights, objects, all that good stuff. The thing that really brought this home to me is that the industry was now ready, right? It felt like this feeling I had in software, where someone in their bedroom can go and build something and launch it almost overnight, could now—not quite the same, but close to the same—happen in self-driving cars.
But we'll talk more about what this led to in a little bit. Let's talk about open-source challenges. We also got the same question: why do this? It was clear to me that for something like self-driving cars, which was at such a formative stage, we had to collaborate with students to figure out the best stuff, because, you know, even the folks at Udacity were not necessarily the world's leading experts in these topics. We wanted to use this hive mind of activity from around the world to teach the best stuff.
So just through a period of a year, these are all the different challenges we launched: there were prizes, leaderboards, and all this sort of fun stuff. The one I'll focus most on today is using deep learning to predict steering angles. The challenge was clear: given a single camera frame, you have to predict the appropriate steering angle of the vehicle. If you had read NVIDIA's papers in 2016, this stuff was all the rage, and it felt like one of those areas that was just begging for more exploration.
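To make the challenge concrete: the task is supervised regression from an image to a steering angle. The winning entries were deep CNNs trained on real dashcam footage; as a self-contained toy version of the same idea, this numpy sketch trains a one-hidden-layer network on synthetic frames where a lane line's horizontal offset determines the target angle. The data, network shape, and hyperparameters are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
H, W = 16, 32  # tiny "camera" resolution (assumed)

def make_frame(offset):
    """Synthetic frame: bright vertical lane line at `offset` px from center."""
    img = np.zeros((H, W))
    img[:, W // 2 + offset] = 1.0
    return img.ravel()

# Dataset: offset in [-10, 10] pixels -> steering in [-0.5, 0.5] radians.
offsets = rng.integers(-10, 11, size=256)
X = np.stack([make_frame(o) for o in offsets])        # (256, 512)
y = offsets * 0.05                                    # (256,)

# Tiny MLP: 512 -> 32 -> 1, trained with plain gradient descent on MSE.
W1 = rng.normal(0, 0.1, (H * W, 32)); b1 = np.zeros(32)
W2 = rng.normal(0, 0.1, (32, 1));     b2 = np.zeros(1)
lr = 0.05

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, (h @ W2 + b2).ravel()

losses = []
for _ in range(300):
    h, pred = forward(X)
    err = pred - y
    losses.append(float((err ** 2).mean()))
    # Backprop through the two layers.
    g_pred = (2 * err / len(y))[:, None]              # dL/dpred
    gW2 = h.T @ g_pred; gb2 = g_pred.sum(0)
    g_h = (g_pred @ W2.T) * (1 - h ** 2)              # through tanh
    gW1 = X.T @ g_h;     gb1 = g_h.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

print("initial loss:", round(losses[0], 4), "final loss:", round(losses[-1], 4))
```

The real challenge differed in every dimension that matters—convolutional architectures, hundreds of thousands of real frames, temporal smoothing—but the training loop is the same shape: forward, measure angle error, backpropagate, repeat.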
Again, let’s use this—all these students from around the world—to do it! And we did have students from all around the world; there were over a hundred teams. People self-organized into these little groups to go and build this, and over the course of about four months, we had a whole bunch of submissions, all taking incredibly different approaches to the problem. We released two sets of validation sets, all that good stuff.
Here you'll see the winning model, and I later found out that the author of this model actually went on to lead the self-driving car team at Yandex, which, if you've been following the space, is doing some pretty cool stuff in self-driving cars today. You'll see this is on a route from the Bay Area to Half Moon Bay—a very windy road—and you'll see that the prediction matches pretty closely to the actual steering, which is nice. And if you read his description of his solution, it's a pretty cool solution.
I think the most exciting thing was just the number of different approaches to the problem, all resulting in some awesome stuff. And again, in true Voyage fashion, we recorded a video of how this model performed on our car. It wasn't perfect; like any first model, it had its faults. One of the main things we realized after trying all this stuff out is that, of course, a model steering a real car performs differently than it does, you know, on your desk in a simulator running on pre-recorded camera frames.
So adjusting for those corrections that might need to be made is something that students, after the fact, added, which was pretty cool. So after all of these things—building that curriculum, building a self-driving car, launching these challenges—it felt like it was time for something new. It was awesome to go and collaborate with all these students, and it felt like, you know, I had to go build something.
So I gathered that same team that had built this curriculum, and we said we're gonna go build a self-driving car. This is from my pitch at Khosla Ventures; you can kind of see the pitch deck there—Voyage, a new kind of taxi service. Our pitch has changed somewhat over time, but that's still pretty accurate. And we started what is now called Voyage.
Our goal, really, was that we wanted to again build a self-driving car, but we wanted to do it differently. We didn’t want to follow the same formula that we felt we’d seen from some of the other folks in the field. The reason is that those folks have real advantages, right? When you think about Google's project, of which I'm a big fan, they have this massive engineering pipeline of folks that want to go build a self-driving car at today’s Waymo. But they also have a cash bank balance of billions of dollars that is hard to match.
They also have the brand recognition of getting to work with Google, and all that good stuff. So we just knew we had to think about this problem quite differently. What motivated me is that today, as we all know, we have this incredibly broken transportation system. You step outside onto the roads today, and I don't know about you guys, but I don't feel particularly safe when I jump into my car. We all know the stats: over 1 million people die on the world's roads every year. That doesn't include the folks that suffer broken necks, broken bones, all that horrific stuff.
It's also incredibly inefficient; we all observe this as we go about our day—the number of lanes that exist on a road today to account for peak traffic, the vehicles with enough room for eight people that usually have one person in the front seat. I read a stat recently that only 7% of the average vehicle's energy usage goes towards moving the things that are actually in the car; the rest is waste.
So, an incredibly inefficient system. It's also expensive. The reason we see a lot of old cars on the road today is that that's, at least today, the most optimal and affordable way for lots of folks to get around. It's also inaccessible, and you'll see why this matters to us in particular. Our goal is to introduce a new way to explore our communities. This is a video of one of our cars at a particularly cool place, which we'll talk more about, and this is kind of our mission.
Why now? Why is it possible to build a self-driving car now? A number of factors—some that we learned during that Udacity experience, but some new as well. It feels, from everything we see, that sensors are now capable of level four self-driving cars—the resolution, the range, the reliability—all those things that are necessary for a level four self-driving car are today ready. That didn't used to be the case; if you rewind to 2007 and look at the cars that were participating in the DARPA challenges, you'll see a lot of single-channel lasers, you'll see the relic of the Velodyne age—the HDL-64, the "spinning bucket" as it's called today—and no one would have claimed those sensors were ready.
But today, you’ve got this enormous breadth of sensors that can take you that way. Compute is there; when we think about the recent rise in GPUs and whatnot, finally, you know, being able to have enough performance in the back of a car with the power constraints that you have—it's there! And talent, you know, again, this is not just Google. Today you’ve got all of these great minds from all around the world building this technology, so you’re able to recruit those folks and put them to work on the problems they've solved in many cases beforehand.
The reason I have yellow for computer vision—which is not a knock against computer vision—is that it's not quite there yet for a fully driverless self-driving car. If you again rewound, you know, three, four, five years, this would have been red, but today, with all the community and whatnot around computer vision, it's steadily getting to a green state. Pretty soon it'll be green, and of course then you'll have that perfect formula for level four driving.
What we run after is ride-sharing. We believe that the optimal way for people to move around is to be able to summon a car. But the thing that’s suboptimal today is that you have to have a human driving you whenever you want to move around. It prevents the cost from being lower, prevents some safety issues, and prevents some quality issues. We think solving that will mean these next-generation ways of moving around will come to fruition.
But what we also see is that if you, let’s say, never remove the driver from the car—that a ride-hailing network always had a human driver—you are inherently limited by the number of miles you can drive, which means that it will never replace personal car ownership. It will never fix that fatality number I talked about—all of those things we must solve! So we think by having a self-driving car that these next-generation transportation networks will come to fruition.
Our lead VC is a guy called Vinod Khosla, the founder of Khosla Ventures, an awesome guy who's done some truly world-changing things. He has this quote, which I'm a big fan of: "Your market entry strategy is often different from your market disruption strategy. Start where you find a gap in the market and push your way through." This better communicated what I mentioned at the very beginning, which is that we should build a self-driving car but do it in a different way, because if we don't, we're gonna fall into the same traps as many of the others that have died along the way.
We had to find a way to do something different that we own and that we are really, really good at. And for us, that was retirement communities. Hands up if you've ever visited a retirement community. There you go—surprised, Lex? We've got to get you out to one! These are just amazing places, and we chose retirement communities as the first place to deploy our self-driving technology for four reasons. First, they are slower! The speed limits in these communities tend to be far lower than you'd see on public roads; it's a much calmer roadway.
When you visit these locations, I liken it to listening to a podcast at 0.75x—just very constrained, very slow, and a little boring from time to time. But you've also got these dreadful transportation challenges. We hear from these residents all the time about how transportation is a pain point and that their only option is a personally owned vehicle. These folks know, in many cases, they shouldn’t be driving, but because they don’t have an alternative, they still drive.
We hear from folks that put off much-needed surgeries—hip replacements, things like that—because they don’t have a friend in town who’s gonna be able to move them around. We hear from folks with vision degeneration that they just don’t see a way that they'll be able to move around and keep that quality of life that they've been able to have—folks gripping steering wheels for extended periods of time. All these challenges felt like the best first place for a self-driving car to begin and a clear path to customers.
We see that on public city streets today, ride-sharing is a particularly brutal battle—a race to the bottom in terms of cost. If we owned every retirement community in the country, meaning the transportation networks there, that would in itself be a very valuable business.
One of my favorite passengers came to visit recently and gave this quick speech about why self-driving cars matter to her and her community. Let's talk about our first community; this is The Villages. Whenever I show this slide, people are astounded by the numbers in a community like this—over 125,000 residents and growing, over 750 miles of road. And what we have in this location is an exclusive license to operate an autonomous vehicle service.
This is one of our other beliefs: by partnering very deeply with the community, we're able to deliver a better service and grow a more reliable business. We won't have, you know, entrants and competitors from all of the other self-driving car companies in our communities. What we actually do in exchange for that exclusive license is grant these communities equity, because if we win, it's probably—in fact, highly likely—as a result of those communities.
And the addressable transportation market in these regions is massive. These residents, as a lot of seniors tend to be, are quite affluent, which means they have some disposable income when it comes to paying for ride-sharing services and other things like that. So we find that that recipe is absolutely perfect here, and we're launching—and have launched—passenger services for these residents.
We've gotten a lot of awesome feedback and learned a lot about the needs of providing ride-sharing for senior citizens. Just some quick stats; this is from my fundraising deck, about the size of the senior market. Again, this is the first place we go, but you can get a feel for just how large this transportation market is today. There are around 47 million seniors in the U.S., growing to over a hundred million by 2060.
The total addressable market for just seniors is incredibly large—2,500-plus communities, all that good stuff. And this is how we see the world—the landscape of potential deployments. You've got a lot of the big guys focusing on that bottom-left quadrant: large cities. It makes sense, because it's playing to their unique strengths—their ability to deploy thousands of cars, tens of thousands of cars.
It plays to the strengths that they have—at least some patience, or the ability to have more extended timelines when it comes to building this technology. But for startups—you know, like us—that fight for survival every single day, it means we have to do things differently. So we focus on that top-right quadrant there, what we've kind of coined "self-contained communities." These places are simpler and slower, but they also give us the ability to have that exclusivity I talked about.
And there are others, of course, that we play in, whether it's the senior market or maybe even small cities and things like that. Let's talk about autonomous technology. So, just to reiterate, why do we deploy in retirement communities? Slower speeds, simpler roadways, there is a central authority; these places tend to be run by private companies which makes for a quite unique relationship in a very positive way. It means we can deploy faster; it means we have the potential to have more impact in these regions.
It also turns out that retirement communities tend to be located in places with ideal weather for self-driving cars—think Arizona, Florida, etc. We have a world-class team building this at Voyage, from all the major programs out there, and that makes our lives infinitely easier. One thing that also makes our lives easier is the sensor configuration of our car.
We made the decision not to optimize for cost today but to optimize for performance. We want to get to truly driverless sooner than most, and one of the easiest ways you can, again, make your life easier is by opting for high-resolution sensors. At the very top of the vehicle, we have the VLS-128, a 128-channel lidar that’s capable of seeing three hundred meters in 360 degrees.
Several other lidar sensors on the vehicle cover certain blind spots, and altogether we sense 12.6 million points per second, which just looks incredibly high-resolution. You’ll see our car at the bottom there, and that's the raw point cloud output—how we see the world. We’re working towards level four, and for us what that means is: if you’re building a demo self-driving car—kind of like we did on the Udacity project—you may focus on just the top four items, that top row. You may focus on perception, prediction, planning, and controls.
And it turns out you can build a very impressive demo quite quickly by just focusing on those things. But of course, those things fall apart whenever edge cases are introduced, which happen all the time. So we’ve spent a ton of time on all the items here because, again, our goal is to build not a demo but a truly driverless vehicle.
We also have an emphasis on partnerships, because what we’ve noticed in the self-driving ecosystem is that it’s not just more self-driving car companies building the full stack; there are now folks going into simulation, mapping, middleware, remote operations, routing, sensors, and a ton more. So we make our lives easier, again, by partnering with companies like this, so that we don’t have to spin up a simulation team, or spin up an operations team to go map the world; we can just work with these very cool companies.
Let’s talk about one unsolved problem which fascinates me. It’s to do with perception, and you probably won't be able to spot this unsolved problem from just this picture, but maybe if I add some annotations, you might. Foliage—trees, bushes, whatever you want to call them. You may have seen some quotes in the media about some popular AV programs struggling with such foliage. For example: Cruise cars sometimes slow down or stop if they see a bush on the side of a street or a lane-dividing pole—that's from The Information, by the way. Another company's self-driving car software has routinely been fooled by the shadows of tree branches, which it would sometimes mistake for real objects—that's Business Insider. And even Voyage: there was only one hard stop on the ride, and the culprit was a bush, two feet high, that protruded into a lane from a street median, which Voyage's software considered a possible threat.
The suggestion was that Voyage trim the bush, and we did, but we don’t think that's scalable. Well, maybe it is, I don't know. But at the beginning of 2018, we decided to solve this problem. Of course, all of this resides in the world of perception, an area of particular fascination for me. We’re sharing these slides, and these are just some of the papers and research we see going on that intends to solve those sorts of issues.
One of the reasons you’ve seen those programs, including ours, be particularly sensitive to foliage is that, from a perception perspective, one of the most well-known ways to detect objects is to utilize the map. If you have this map and you—simplifying to a certain extent—subtract everything that is already in the map, then what's left over is a decent representation of the dynamic things in and around you: cars and pedestrians and whatnot. But foliage grows, which trees do, and then it extends out beyond the map, and that particular bush is now an object in your path.
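The map-subtraction approach Oliver describes can be sketched in a few lines. This is an illustrative toy, not Voyage's actual pipeline; the function name, grid resolution, and point coordinates are all invented for the example:

```python
import numpy as np

def subtract_map(scan_pts, map_pts, cell=0.5):
    """Keep only scan points that don't fall in a map-occupied grid cell.

    scan_pts, map_pts: (N, 3) arrays of lidar points in the map frame.
    cell: grid resolution in meters (illustrative value).
    """
    def cells(pts):
        # Quantize x/y coordinates into grid cells and hash them.
        return {tuple(c) for c in np.floor(pts[:, :2] / cell).astype(int)}

    occupied = cells(map_pts)
    keep = [p for p in scan_pts
            if tuple(np.floor(p[:2] / cell).astype(int)) not in occupied]
    return np.array(keep)

# A static wall in the map, plus one genuinely new point in the live scan:
map_pts = np.array([[0.0, 0.0, 0.0], [0.6, 0.0, 0.0]])
scan = np.array([[0.05, 0.05, 0.0],   # matches the map -> subtracted away
                 [5.0, 5.0, 0.0]])    # dynamic object -> kept
print(subtract_map(scan, map_pts))    # only the (5, 5, 0) point survives
```

The failure mode falls straight out of this: a bush that grows past its mapped footprint occupies cells the map doesn't, so it survives the subtraction and appears as a brand-new "dynamic object" in the lane.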
These networks here—these are all neural networks—don't use that same technique. They don't use the map as a prior. Instead, they take this 3D scan of the world and apply a more learned approach to the problem. You’ll have, you know, tens of thousands, hundreds of thousands of labels of cars, humans, etc., and then these networks will be able to pick those objects out.
We’re particularly fascinated by PIXOR, which came from some great researchers at Uber ATG. VoxelNet came from Apple. I’ve heard our engineers talking a lot about Fast and Furious recently, which merges perception, prediction, and tracking into a single network, which is pretty cool, and PointPillars, which I think came from the nuTonomy team recently.
I think Karl is speaking soon, right? So just in general, we see a whole bunch of work going on to solve these issues. The other thing these sorts of networks solve, which I also find particularly fascinating, is this: if you use traditional clustering algorithms, then when two people stand next to each other, the algorithm will cluster them as one object. When you’re trying to move away from those edge cases to build a truly self-driving car, that’s a non-starter, right? Because pedestrians are the most important thing you can probably detect, and detecting two things as one thing is not going to cut it.
And of course, it does that because it’s a dumb algorithm; it’s not trained on any sort of information. But these networks, again, are very good at understanding the features and perspectives of humans, even when they are in crowds and whatnot. That then helps your whole stack downstream, because if you have accurate perception information about the objects in and around you, your predictions are much better, your tracking is much better, and ultimately how you navigate the world is much safer.
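The clustering failure mode is easy to reproduce with a naive single-linkage Euclidean clusterer. This toy is not any production algorithm; the coordinates and distance threshold are made up to show two close pedestrians fusing into one cluster:

```python
import numpy as np

def euclidean_cluster(points, tol):
    """Naive single-linkage clustering: points closer than `tol` merge.

    Returns a list of clusters, each a list of point indices.
    """
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if np.linalg.norm(points[i] - points[j]) < tol]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(cluster)
    return clusters

# Two pedestrians standing 0.4 m apart, plus a car 10 m away (x, y only):
pts = np.array([[0.0, 0.0], [0.4, 0.0],   # person A, person B
                [10.0, 0.0]])              # car
print(len(euclidean_cluster(pts, tol=0.5)))  # → 2: the two people fuse
```

With the tolerance tightened below their 0.4 m gap the people separate again, but then a single pedestrian's points can fragment: the "dumb algorithm" has no notion of what a person looks like, which is exactly what the learned detectors fix.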
I’m also particularly fascinated by reinforcement learning, which I know Lex is as well. If you’ve seen Waymo’s recent work on imitation learning, I think that’s particularly cool. Another company we track quite closely, just because they do amazing stuff, is Wayve, which is trying to build an entirely self-driving car powered by reinforcement learning—think about disengagements as rewards and things like that—to improve performance as well as learn behavior planning, ultimately fusing rules of the road with more learned behaviors.
The ecosystem, I think, is thriving today—just look at how many folks are diving not only into the full stack but into building tools and other really important parts of the stack, plus the maturation of sensors: not just higher-resolution lidar but things like 3D radar. We get pitched all the time by these companies, and it’s clear to see there’s been a rise in volume from all these great efforts.
Lessons learned! Now that I’ve been building Voyage for two years, and prior to that spent four years at Udacity, what have I personally learned? So many things—and they’re not technical in nature. These may all look like clichés, but I promise you they came from lessons that were really, really painful in the moment.
Don’t be intimidated! So the thing that I feel happens a lot in self-driving cars is that because it started in this very academic sense, meaning, you know, Stanford, Carnegie Mellon, and whatnot, that it felt like to break into the industry, you had to also go through that same path. You had to get a PhD in something and really go that path that was well-trodden—but I think that only takes the industry so far.
I think it’s really important that we get folks from all different backgrounds, all different industries, to come contribute to this field because if we don’t, then there’s no driverless. It can’t happen in that isolated bubble; it needs to be extended out. So don’t be intimidated by those things.
Understand your limitations. This is perhaps more of a kind of CEO lesson for myself, but I think when you’re building out a company from, you know, one person or five people to today with forty-four folks, you cannot do everything, and it’s really important you build a team around you that is able to do what you used to do but do it ten times better.
I probably didn’t spend enough time building out that team until some challenges came our way. Be proactive versus reactive. I think it's really crucial, again, when you’re building a company, to try and predict what’s gonna happen next, because if you’re reactive, you’re constantly, you know, two steps behind what other folks are doing.
I think a lot of folks, again, perhaps overstay their welcome in certain areas of the company when they should just say: okay, I’ve got experts now, I can step aside and let those folks do what they do best. And speaking of which, hire the best!
It’s really easy, with all the pressure on when you're building a company, to make sacrifices when it comes to your culture and your hires. It’s really crucial that you find folks who are not just the best in their field but the best match for your company. And always be curious! One of the things we believe at Voyage is that knowledge should not be isolated to just one person; it should be spread throughout the company. Even though it may feel like over-sharing or over-communicating, that knowledge, in the hands of someone with a particularly unique background, may lead to something incredibly cool; they may build something that totally transforms our company.
That’s about it; I can jump to questions if that’s helpful.
“Thank you, that was great! Please give a big hand to Oliver!”
“How did you identify retired communities as the target market to prioritize?”
“Yes, so retirement communities for us—there's actually a really long story about it; I'll trim it down a little bit. So, when we were starting Voyage, Sebastian Thrun was very helpful in helping us start this company. And of course, as kind of naive founders, we were like, well, let's just take this El Camino thing, put it everywhere else that looks like El Camino, and just do that over and over again. But he cautioned against that, and very wisely so, because, again, you're nothing special compared to the other self-driving car companies out there by doing so. And in 2009, he had really advocated to Google leadership—Larry Page, etc.—that retirement communities might just be the best way for Google to go about deploying their self-driving cars.
And I can understand why he got pushback. I think the Google folks were, you know—well, it's Google, right? We're not just about retirement communities, we're about the world—level 5 or nothing, right? So he got some pushback, but he did some research in the process and met some folks. So when we were starting, he said: you've got to check out these retirement communities. So we did; we went to visit, and eventually we got there. We wouldn't have gotten to that point without Sebastian pushing for it.”
“Follow-on question on the retirement communities: Do you ever think about the other collateral issues, especially about how a retirement-community resident would get into a car? How exactly would they interface? Like, somebody wants to make a call to have a car come to wherever they are, and they have to move from point A to point B. Have you thought about all these issues that are very germane? It's not just a vehicle moving on its own.”
“Yep, these are all collateral issues—how do you plan to address them?”
“It's a good question. The way we think about this is that today, we've intentionally focused on a segment of the market called ‘active adult communities’. These folks tend to be able to, you know, get into their own cars or into a taxi, open the door, and sit down without the need for any assistance. But they may have vision issues, or other issues that prevent them from driving—for example, we hear a lot that folks feel really uncomfortable driving in the evenings; they feel comfortable driving in the daytime because their vision supports it, but when it comes to evening time, they have this mad rush to get home.
But there is that other market which you’re talking about, right? Which is folks that just need that helping hand towards getting to the car. One of our beliefs as a company is that the senior market—like I had in that slide—is surprisingly large. What that means to us is that we think we can own it. We think we can be that company that any senior citizen in that situation thinks, ‘Oh, I should call Voyage because I need to get from point A to point B,' instead of thinking, ‘I should call Waymo or Cruise or any of the folks that are gonna go after the general big market.’
They’ll think about Voyage, and the reason they’ll think of us is because we’ll deliver a product that is meant for those folks, designed for their use cases. It may be that, you know, if they’re going on a long trip—let’s say they’re traveling 50 miles—the first mile and the last mile of that trip may involve a human, you know, helping them into the car, and then dropping that human off somewhere else to go do that all over again.
It may involve crazy robots that help people to their cars. We’ve heard from, you know, folks at Toyota—they’re building these back-carrying robots and other things that may assist seniors in getting to the car and whatnot. So I think that's why that market is particularly exciting; it feels like you can deliver these tailored products that would enable us to be the market leader. Today we focus on active adults, but who knows where we go next.”
“Can you talk a little bit about how you determined your final sensor suite?”
“Mmm, yeah. The truth is, it’s never final. We think about generations of vehicles. We had a first-generation vehicle, which was a Ford Fusion. It had a single Velodyne HDL-64 and a bunch of cameras and radar, and we set some milestones based on that vehicle, and we accomplished them. And once we reached the max of what we were able to do with that vehicle, we said: oh, we need to bring on a G2 vehicle, a second-generation vehicle. So we did that, and we said, okay, we have these certain goals in mind, which are pretty lofty and pretty ambitious; we need incredible range and incredible resolution for these things.
What we’ve discovered is that in our particular communities, going at the speeds that we’re going, radar isn’t particularly useful. So we don’t have radar on our second-generation vehicle, for example. But I'm sure that when we go to that third-generation vehicle, there’ll be other driving factors—we work backwards from the milestone to say, what do we need on this vehicle? Maybe cost is a factor on the third-generation vehicle, right? We may say that, hey, we need a more affordable sensor suite than what exists on our second-generation vehicle. But the choices are driven by technical requirements, and that means we’re able to marry the two with the vehicle.”
“I was curious—when you showed the student-led content, one of the students working on your first practice car had developed a traffic light detector, and you showed later on that you were getting student input for deep learning models predicting steering. I was wondering what your system architecture looks like in terms of the kinds of perception that you take in, how modular it is, and to what extent deep learning algorithms have played a part in those different parts of the system.”
“Yeah, it's a good question! So, I really encourage folks to get familiar with ROS. ROS has always been this kind of playground for roboticists of all different types to be able to try things out on robots, and ROS 1 is particularly notorious for hobbyist-type projects; it’s not meant for production.
ROS 2, though, which is in kind of an alpha-release state, is definitely meant for more production-oriented things. The reason I mention ROS is because it has this awesome architecture which lets you plug and play what they call nodes and experiment with different approaches to the problem. So, for example, running that deep learning model predicting steering angles effectively replaced our more rules-based planner and perception engine.
We just plug the output of that steering angle straight to our controller to just actuate the vehicle. And ROS is particularly good at those sorts of architectures, and it's all open source, so you can do some cool stuff with it.”
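The plug-and-play node idea Oliver describes can be mimicked in plain Python, without rospy. This stand-in message bus, with hypothetical topic names and made-up planner logic, just illustrates how a learned planner can replace a rules-based one behind the same topic, with the controller untouched:

```python
from collections import defaultdict

class Bus:
    """Tiny in-process stand-in for a ROS-style publish/subscribe graph."""
    def __init__(self):
        self.subs = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subs[topic].append(callback)

    def publish(self, topic, msg):
        for callback in self.subs[topic]:
            callback(msg)

bus = Bus()
log = []

# "Planner" node A: rules-based steering (logic invented for the example).
def rules_planner(scan):
    bus.publish("/steering", 0.0 if scan["clear"] else 0.3)

# Drop-in replacement node B: a learned model predicting steering end-to-end.
def learned_planner(scan):
    bus.publish("/steering", scan["model_output"])

# "Controller" node: actuates whatever steering command arrives.
bus.subscribe("/steering", lambda angle: log.append(angle))

bus.subscribe("/scan", rules_planner)   # wire up version A; swap for B freely
bus.publish("/scan", {"clear": False, "model_output": 0.1})
print(log)  # → [0.3]
```

Nodes only agree on topic names and message shapes, which is why swapping `rules_planner` for `learned_planner` requires no change anywhere else in the graph.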
“Can you tell us how you handle liability insurance for the passengers in your vehicles?”
“Good question. So we have a pretty cool deal with a company called Intact Insurance. The idea is that insurance in the autonomous age is gonna be very different from insurance today for human drivers, right, because there are different risk assessments and whatnot. And one of the ways that we’re able to prove to these insurers that, you know, we’re good at what we do is actually by sending them data.
We send them data from our cars as we drive, showing that as we move through the world, we accurately detected things, planned around things, and all that good stuff, and they use that data to inform our insurance rates. I think the future of insurance will be along similar lines, but perhaps more extreme—where, for example, the rates change depending on the complexity of the environment. If we’re just driving down a road that’s completely straight and has zero vehicles around us, our insurance rate should be super low, right?
But if we enter a city center and there are thousands of people and cars and all that crazy stuff, our insurance rates should rise almost instantaneously. So we partner with someone today that insures the passengers, the cars, the sensors, all that stuff, but I think there's a lot of room for innovation there too.”
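The complexity-dependent pricing Oliver speculates about could be modeled, very roughly, like this. The base rate, feature weights, and complexity features are entirely made up; this is a sketch of the idea, not any insurer's actual formula:

```python
def dynamic_rate(base_cents_per_mile, n_vehicles, n_pedestrians, curvature):
    """Toy dynamic insurance rate: scale a base per-mile rate by a
    scene-complexity multiplier built from a few invented features."""
    complexity = (1.0
                  + 0.02 * n_vehicles      # nearby vehicles
                  + 0.05 * n_pedestrians   # nearby pedestrians
                  + 0.5 * curvature)       # road curvature, 0 = straight
    return base_cents_per_mile * complexity

# Empty straight road vs. a busy city center:
print(dynamic_rate(10, n_vehicles=0, n_pedestrians=0, curvature=0.0))    # → 10.0
print(dynamic_rate(10, n_vehicles=40, n_pedestrians=100, curvature=0.3)) # → 69.5
```

The interesting engineering question is the telemetry: the same perception outputs the car already streams to the insurer would feed the feature counts here in real time.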
“Did you have any problems onboarding the people initially when they were like, you know, skeptical, scared? Then the other question is, what are the major missing pieces in computer vision to achieve level four?”
“Gotcha. So one of the more interesting insights I think we had about retirees is that, again, in my kind of naïve state back in 2016, my general feeling was retirement communities might not be the first to adopt this technology, right? Because they may be slower to adopt new technology, might be scared of the technology, all those sorts of things.
To kind of validate that, I went to talk to some senior citizens. I talked to my own grandma first—she hates self-driving cars, so that's not a good sign—but then I went to talk to these folks in these sorts of locations, and the really interesting thing we learned is that with traditional consumer software or devices, yes, there is definitely a lag in adoption among senior citizens. That's proven in many studies and many stats: senior citizens are slower to adopt the Facebooks of the world, or Instagram, or WhatsApp, all those sorts of things—cryptocurrency, etc.
But that’s because they have these very well-defined processes that they've had for most of their lives, right? Instead of using Facebook, they call someone up and have a chat, you know, a conversation with someone about their day and all the stuff that's going on. Or they don't share a picture on Instagram; they physically mail a picture or something like that. So to change that behavior is tough, right? Because it's a behavior that is fundamentally different from what they're used to.
They have to log on to a computer, go to this weird Facebook thing, and share pictures with thousands of people. That's weird. But the difference between that and a self-driving car is that our experience is no different from the car they're used to; it just turns out it's being driven differently, right? Like, they see a car; it’s a similar form factor to what they're used to. They open it, they sit in the back seat—okay, there's a button to press to say ‘go’, but it's pretty similar to what they're used to from their past.
There's no new behavior to learn; they don't have to change something they're used to. So that was our first learning. And then also, they really don’t care too much that it’s autonomous. When I’m in the car, I'm quite curious and enthusiastic about the technology and want to tell them about, I don’t know, lidar and deep learning and perception. And they, you know, don’t want to hear any of that stuff, and it kind of dawned on me that the reason is that what senior citizens have witnessed over their lifetimes is far more dramatic than what I have, right?
Like, our oldest passenger was 93, and she told me a story about how, when she was very young, she remembers literally moving around on almost a daily basis in a horse and cart. So when you talk about self-driving cars to those folks, they couldn’t care less, because between that period and today they’ve seen the birth of flight—planes everywhere—they’ve seen cars proliferate, they’ve seen scooters, they've seen all these crazy subway systems. So a self-driving car to them is like, ‘Oh, that’s cool. Just move me!’
That’s our biggest learning! But the question was computer vision; what needs to happen between now and level four?”
“Yeah, so I think perception is the holy grail, right? If you had perfect perception, self-driving cars would be solved. If we knew every object that was on the road in and around us within a reasonable distance, self-driving cars would be solved! False positives are acceptable today, which I think is fine, but you really want to minimize false negatives—you want zero false negatives in the world. And I think that's why we still have a bit of work to do, because when you think about the reason for a test driver being in the vehicle—well, perception feeds everything downstream, right?
So if you miss an object, or misidentify an object, or any of that sort of stuff, then that effect causes the whole stack downstream to become quite chaotic. That’s why I’m excited about all those networks I talked about. One of the other things we believe helps us drive false negatives towards a non-existent state is that we band together multiple networks. We don't just rely on a single layer of perception; we say different networks have different strengths.
For example, VoxelNet is particularly good at pedestrians, but PIXOR's not so great at pedestrians, because it works from a bird's-eye view, where pedestrians are quite thin and whatnot. So let’s band those two networks together, and let’s also band together some more traditional computer vision algorithms that may not be run on the entire 360-degree scan but may be run on a small sample, maybe at the front of the vehicle, for example.
So there are lots of little bits and pieces like that to go through to minimize the worst-case scenario, which is a false negative. But it’s clear, when you watch, you know, Waymo and whatnot, that they feel very, very, very close to that.”
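The "band networks together" idea amounts to taking a union of detections, so one model's miss isn't the system's miss. A toy sketch: the detections, labels, and simplistic nearness test (standing in for proper IoU matching and score fusion) are all invented for illustration:

```python
def merge_detections(detector_outputs, same_object):
    """Union the detections from several detectors, deduplicating any
    detection that `same_object` says matches one already kept."""
    merged = []
    for dets in detector_outputs:
        for d in dets:
            if not any(same_object(d, m) for m in merged):
                merged.append(d)
    return merged

# Toy outputs: each detection is (x, y, label).
voxel_like = [(2.0, 1.0, "pedestrian")]          # strong on pedestrians
bev_like   = [(2.1, 1.0, "pedestrian"),          # same person, slightly offset
              (15.0, 3.0, "car")]                # a car the first model missed

# Crude "same object" test: same label, centers within half a meter.
close = lambda a, b: abs(a[0] - b[0]) < 0.5 and abs(a[1] - b[1]) < 0.5 and a[2] == b[2]

print(merge_detections([voxel_like, bev_like], close))
# → [(2.0, 1.0, 'pedestrian'), (15.0, 3.0, 'car')]
```

Taking the union trades extra false positives for fewer false negatives, which matches the asymmetry Oliver describes: false positives are tolerable, missed objects are not.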
“You mentioned that weather was one of the main reasons this was a great place to start. Can you talk about hurricanes?”
“Yes, it was funny—I got a question recently from Alex Roy, I mean Lex, about this. Okay, in the event of a hurricane—let's leave the technology aside for a second—we've all seen those pictures of people, you know, getting on the freeways and trying to get out of the path of the hurricane, right? How is that going to work in a world where self-driving cars are everywhere and personally driven vehicles are the smaller set? I don't quite have an answer to that yet, but I think it's an interesting thought problem.
The really important part of weather for us is remote operation. All of our vehicles have a cellular connection, right? And each of those vehicles is connected to a remote operator sat in somewhat close proximity to that vehicle, and that remote operator has a few jobs. One is to just ensure the safe operation of the vehicle—make sure the vehicle is doing what it's intended to do, all those good things. But another is to make sure that the operational domain we are currently operating in is the one the vehicle is designed for.
So all these different camera feeds are being live-streamed to this remote operator, and if there’s a sudden downpour of rain, that remote operator has the ability to bring the vehicle to a safe stop until that rain shower disappears—or a hurricane, whatever it may be. But there are companies—I was pitched recently by companies building weather forecasting on a scale that is not really seen today: microclimates. Thinking about just a small subsection of The Villages, predicting and understanding the exact weather within those regions, and then having webhooks to tell us, as Voyage, that that’s about to happen.
So there's a lot of cool stuff happening there, but remote operation is currently kind of the eyes and ears of all vehicles to prevent that sort of issue.”
“So please give Oliver a big hand. Thank you, guys!”