A Conversation with Werner Vogels
This is a real privilege for me. We are here today with Dr. Werner Vogels. He is the CTO of Amazon and, of course, has a lot of really exciting experience in that role. We're going to be talking to him today about his experience with Amazon, about his experience with startups, and about lots of technical topics that will be relevant to many of us. So thank you all, and let's give it up for Dr. Vogels.
Thank you. Okay, so we're gonna be talking, of course, about Amazon today and about your role there, but I'd like to start with a little bit of background. Would you mind telling us about your career before you started at Amazon and sort of what brought you to that point?
How much time do we have? I'm sorry. I was an academic before I joined Amazon. I had been a research scientist at Cornell for 10 years, building very large-scale distributed systems. And, as is common in American academia, you are motivated to do startups on the side. So we did two startups on the side; one of them already existed when I joined, and it was successfully sold off to a company called Stratus. I don't know if anybody remembers that—this is before your time. And then another company that actually failed. So we had both experiences. That was great, kind of.
Before that—I'm not the typical sort of computer scientist. It wasn't until I was 28 that I decided to actually go back to school. I worked in hospitals before that—radiotherapy at the Dutch Cancer Research Institute, doing radiotherapy on cancer patients. I don't know; I realized that I really hated all these people dying around me. So I decided to go do something that had no humans involved whatsoever, and computer science seemed like a really good thing to go into. This was in the mid-80s, so the computer scientists here know where the field was then. It turns out computer science had humans in it too—I should put it like that, yeah. And it turned out I had a gift for it, and I didn't know that upfront.
From there, I went into research because I was interested in the kinds of things I was really passionate about. I pitched the idea of working in a research institute and then was invited to come to Cornell. While I was at Cornell, I would also consult for large companies like Nike, HP, the Suns of this world, and I often gave talks. At one moment, Amazon invited me to come give a talk about some of the material I was working on. I thought, "Really? I have to go? What is this—a bookshop? It's a web server and a database; how hard can it be?"
Yeah, but one glimpse into that kitchen and I realized that this is a massive technology operation. It's not a retailer; it's a technology company. They were operating at a scale that I'd never seen before—definitely not at any of the companies I had consulted for. The challenges they were faced with, from a distributed systems researcher's perspective, were amazing. I didn't need to think very hard when they offered me a job.
Well, that's incredible. So that's interesting—do you feel like that was a change? Like, today, do the interesting distributed systems problems live at these huge companies rather than in academia, or was it always that way?
Yeah, I think that still is the case. I think most distributed systems researchers have become more aware of the kind of scale that these very large companies need to operate at—or not even just the very large companies. If you think about any successful internet company or digital-only company, they need to operate at a scale that is unparalleled. You know, in 2004, when I joined Amazon—many of you, if you're going to be successful, will easily be operating at the scale that Amazon was at in 2004. But there was no body of work that you could rely on. A lot of effort went into basically keeping the lights on—things that the cloud and other technologies now give you.
Amazon had gotten to that scale by 2004 purely by being practical. It's not as if there was a book you could read on how to build a scalable organization or a scalable company—there wasn't. Everything I was involved in was five to ten years ahead of the curve, both in the usage of technology, the development of technology, and in operating at scale.
Especially if you want to be a fast-moving company, it's not as if you could look at traditional enterprises, because they all sort of suffer from the innovator's dilemma, and everything there is really slow. Once they become successful—how to build a company that needs to continue to move fast is a whole different story. There are things you may make business trade-offs for, for example the creation of technical debt, or allowing duplication to happen—things that would be out of the question in a traditional enterprise, because efficiency is their main goal. At Amazon, moving fast, innovating fast, and having a very long pipeline of experiments is the most important thing. You are willing to allow duplication to happen; you allow creating technical debt as long as you know that you will have to pay it off.
So, the many trade-offs that Amazon has been willing to make over time—you can't find them in a traditional MBA book. Most of these things Amazon had to develop for itself, whether technology or business processes. It helps, of course, that with Jeff Bezos at the helm you have someone who's a true visionary and truly understands what the next modern world will look like, or should look like.
Thanks. Shortly after you joined, you were appointed the CTO of Amazon. How did that happen—did someone just ask you, "Do you want to be the CTO?"
No—so Al Vermeulen, who was the CTO at the time, really wanted to be a developer. They had been looking for a replacement for him, and, as I said earlier, sometimes you have gifts that you don't know about until you start using them. I think I joined in September, and by freaking January, I became the CTO.
Well, tell me what that role was like at Amazon then, and what it has shifted into today.
Yeah. So if you look at Amazon, one of the reasons for hiring me—and actually a few of my former students came with me—was to really instill more academic rigor into the way we wanted to approach scaling. Amazon had become really good at scale, but the challenge was that we wanted to do orders of magnitude more. In essence, if you go through an order of magnitude of growth, you have to revisit almost everything that you do—your processes as well as your technologies. So to be on a solid footing for the next one or two orders of magnitude of growth that were coming, we needed to insert much more rigor into our thinking.
For example, around performance: how do you measure? What kind of infrastructure do we need for measurements? If you want to be a truly data-driven company that makes data-driven decisions, you first of all need to have the data, but you also need a culture around how you measure and how you interpret those measurements. I mean, a median latency of, let's say, 1.2 seconds from your webpage to your customers doesn't say anything—it just says that 50% of your customers have a worse experience. You need to know how much worse. From an engineering perspective, the 99th percentile or the 99.9th is often way more important in those cases. Then, how do you create an engineering discipline that has control over that 99th percentile, so that you can actually start pulling it in if that's what you want to do? And then you want to be able to associate that with business decisions. The common finding is that if you improve the latencies of your pages, conversion goes up. So the question becomes: how much are you willing to spend to drive that conversion?
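To make that point concrete, here is a small sketch (the latency numbers are invented for illustration) of how a healthy-looking median can hide exactly the slow tail that the 99th and 99.9th percentiles expose:

```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, min(len(ordered), round(p / 100.0 * len(ordered))))
    return ordered[rank - 1]

# Hypothetical page latencies in seconds: mostly fast, with a slow tail.
latencies = [0.9] * 900 + [1.5] * 90 + [6.0] * 10

print(percentile(latencies, 50))    # median: looks perfectly healthy
print(percentile(latencies, 99))    # p99: the slower requests start to show
print(percentile(latencies, 99.9))  # p99.9: the worst customer experience
```

The median here is 0.9s, while one customer in a thousand waits 6s—exactly the kind of signal an engineering discipline built around the 99th percentile is meant to pull in.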
You need more capacity; you need to do more engineering. At some point there are diminishing returns on the investments you have to make, and you want that under control. So we spent a whole year really focused on performance and measurement—things that had already been in progress when I joined. Then a whole year focused on single points of failure. Really, I think in 2004 we were pretty good in terms of reliability. There were rules: we had multiple data centers; you had to replicate whatever you did over those data centers; you should be able to lose a data center without customers being impacted; you could lose two data centers, and customers could be impacted in terms of latency but not in functionality. All of those rules were there.
I think we were pretty good at it, until at one moment we decided, "Why don't we pull the plug on one of these data centers and see what happens?" The first time we did this, we didn't make it a surprise; we gave everybody a heads-up that this was what we were going to do. We weren't literally pulling the power plug—we pulled the network, basically. So the data center gets isolated. It turns out all these things that looked great on paper didn't look that great in practice. The first time you do it, there are lots of manual processes—manual database failovers, things like that. This is all pre-AWS. And so that was a whole year focusing on that.
By the time you do the third or fourth of these—what we called "game days"—you actually get to the point where these things become really well automated, and you can do them without human intervention. One of the biggest surprises the first time came when we brought the data center back online: suddenly that data center has to sync with the other data centers. Man, that's a nightmare—that was a scenario that nobody had thought about. Until you do these things, you don't know them, especially at scale.
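A game day like that can be sketched in miniature (everything here—the class, the replication rule, the data—is invented for illustration, not Amazon's actual tooling): isolate one replica, keep writing, and then discover that bringing it back requires a re-sync step:

```python
class DataCenter:
    """Toy replica: a named data center holding a key-value copy of the data."""
    def __init__(self, name):
        self.name = name
        self.online = True
        self.data = {}

def replicate(key, value, data_centers):
    """Write to every data center that is currently reachable."""
    for dc in data_centers:
        if dc.online:
            dc.data[key] = value

fleet = [DataCenter("dc1"), DataCenter("dc2"), DataCenter("dc3")]
replicate("order-1", "books", fleet)

fleet[0].online = False              # game day: pull the network on one DC
replicate("order-2", "shoes", fleet) # writes keep flowing to the survivors

fleet[0].online = True               # bring it back: now it is stale
missing = {k: v for dc in fleet[1:] for k, v in dc.data.items()
           if k not in fleet[0].data}
fleet[0].data.update(missing)        # the sync step nobody had planned for

print(fleet[0].data == fleet[1].data)  # True once the re-sync is done
```

The point of the exercise is the last two steps: the failover looks fine on paper, and it is the recovery and re-synchronization that only a real drill surfaces.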
And then we did a year on efficiency—a lot of things like that. So mostly the role of a CTO is to drive really big programs. I would think about what technology we need for the future to be on solid footing as a business, or, in the case of Amazon, what kinds of unique technologies we've developed that might be turned into a business. Then things changed at one moment, because in the early 2000s we opened up the catalog; we put an API on the catalog. That was popular in those days: you just put an API on something and see what innovation happens.
It turns out lots of great companies were being built on it. You would have access to the catalog, search, the shopping cart, and more if you wanted. Lots of new e-commerce was being built entirely on the Amazon catalog. They would do comparison shopping and build wonderful new UIs on top of the whole thing—really great stuff. But as soon as one of those companies became successful, they all started to stutter in their execution. They all needed to get investment—not for their business operations; they needed it to start buying hardware and to hire IT people. All these investments were, in essence, indirect funding of the American hardware industry.
And all these companies had a hard time doing that. They had a hard time getting the investment; they had a hard time getting the hardware. Because, you know, if you order 50 servers, it's not like driving up to Costco and putting them in your shopping cart. No. And then you have to hire IT people and things like that. Most of those companies that were initially successful around the Amazon catalog failed. They didn't fail because they didn't have great ideas; they failed because they couldn't get their IT ready or get the money to actually do that.
I think one of the things that has shifted with AWS and the cloud is that the investment you get no longer goes, in large part, to hardware companies. It goes into hiring more people who are relevant for your company and who actually build your products. So then we became AWS—and we can talk more about that later. But when you become a technology provider, your world as a CTO changes completely.
I think there are four different types of CTOs—in my eyes, four large categories. One, in enterprises, is often sort of the infrastructure manager: they report to the CIO and manage large amounts of infrastructure. Then there is a second type—I think this is often seen in younger businesses—where the CTO is the technical co-founder, or the technical lead with the technical vision.
I think that world is a dangerous one, because there are so many other things that fall into that bucket—VP of engineering tasks, managing teams, things like that—which that person may not necessarily be really good at; I'll talk about that later. Then there is the role of the big thinker. That's really where you're driving innovation. If you look at companies like AT&T and others, they have had an office of the CTO that is really building next-generation technologies or doing experimentation.
Then there's the role of the external-facing technologist. If you're a technology provider to other companies, there's a role for an executive to interact at a deep technical level with your customers, to understand how they are using your products and to look for the bigger patterns among your customers—what are the bigger pain points they still have, not only with our technology but in general? You know, I often jokingly said we're in the business of pain management at AWS.
So, what kind of pain do customers still have to deal with that doesn't actually contribute to the products they want to build? On one hand it's helping customers understand things—being a bit of the evangelist for the whole notion of cloud, which, definitely 10 years ago, was much more important than it is now. But most importantly, you take the feedback from your customers back and start thinking about what new features or products we have to build, or what processes we need to change, to make sure we serve our customers better. So it's a much more customer-focused position than a technology-focused one.
Great. I'm curious—you were talking earlier about how, especially in the early days, there were lots of exercises, like the game days, to make sure the infrastructure was up for scale. In your time at Amazon, as it has grown and the engineering culture has grown so much, have you seen any tips or tricks for how to grow an engineering culture and keep it moving quickly—keep it effective?
Well, given that Amazon is, in reality, my first real job in a long time, I thought that the rest of the world was just like Amazon, and it's not. There's a very unique culture at Amazon that I think works really well for fast-moving companies. That is: build your teams to be as independent as possible, and remove as much hierarchy and structure from your organization as you can. Hierarchy, in my eyes, is totally unnatural. I mean, you need some of it, maybe for reporting or for some management pieces. But, you know, if you look at nature, there's a head monkey and all the other monkeys; there are no lieutenant monkeys.
You may have multiple groups, each with their own head monkey—and that scales really well. If you look at ants and all the other patterns in nature: if you're able to build a self-organizing organization, which basically means you hire people who really want to be independent—not followers—who actually want to own a piece of the product and take ownership of it, I think that's extremely important when you're a younger business.
Yeah—you don't need followers. You don't need just coders; you need people who want to have ownership over their piece of the product. Look for that. There is something at Amazon that we call the leadership principles. There are 14 of them, and basically they're all about culture: customer obsession, ownership, dive deep, and so on. Check them out—if you type them into your favorite search engine, you'll find a deep explanation of all of them.
That's really what drives the Amazon culture. So when we hire—let's say we bring someone in for an interview—you really figure out whether he or she is the right kind of engineer through the phone interviews and things like that. The in-person interview is all about culture: is it really a good culture fit? Because there's nothing worse than a bad hire when it comes to culture, because that disrupts your small teams tremendously.
So at Amazon, we still believe in fairly small teams—10 to 12 people. Because then everybody knows what everybody else is doing. I mean, by the way, if you do scrum—if you want to stand in the corridor in the morning with 25 people—that doesn't work. So you really focus on small teams and make sure they have strong ownership over the pieces they need to build.
Building teams of independent thinkers who have—who want to have—control over their own destiny, I think, is important. One of the challenges when you grow a young business is that the role of the CTO is often to do everything that has to do with tech: think about the technology, but also manage the teams. And I think managing teams is a completely different discipline.
Now, there's a major difference between what I would call the VP of engineering and the CTO. A VP of engineering wakes up every morning thinking, "Do I have the absolute best team? Is my team in the best position to deliver what they need to deliver?" It's a people person, really thinking about making sure that your engineers are in the right position to deliver the technology and the products they need to deliver. A CTO thinks about technology: are we building the right technologies? Do we have the right tools? All of those kinds of things.
So, two very different jobs in my eyes. You may know Michael Lopp—Rands. I think he's well-known as a blogger who writes these pieces about being a manager. If you think about managing—about technology management—he's probably the most well-known writer who thinks really deeply about how you make teams effective, and that's very different. But he's not a CTO; he's a VP of engineering, so I think he really embodies that role.
Check out his stuff; his blog is called Rands in Repose. Great. So you mentioned earlier that in the early days of Amazon you had those companies integrating with you, and they were failing for infrastructure reasons, and that's where AWS came from. Tell me, how did the idea for AWS happen? How did it get traction internally?
Well, internally, Amazon the retailer is also more or less organized into such independent teams, and most of those teams look like startups as well. They have complete control over their own destiny. Believe me—the recommendation engine for books is very different from the recommendation engine for shoes. So there's a different team with different goals; they have their own innovation agenda, all developed by the team themselves.
By the way, we had gone through a number of architectural changes. At the end of the '90s, Amazon's goal was to get big fast, and we were basically violating all sorts of architectural principles to get to that point. That meant we had ended up with a monolith and a massive database infrastructure in the backend that was very brittle and basically couldn't grow anymore from an architectural point of view.
Remember that nobody had done this before, so nobody can be blamed. It's not like you were violating the textbook; there was no textbook. Amazon moved to what's called a service-oriented architecture—that wasn't a term that existed around that time either—carving off pieces of the monolith, bringing together the data they operated on, putting an API on it, and then having a network infrastructure over which you can call these services.
That took two to three years to get done, mostly because there was, again, no history around how you run services. By doing what we were doing, we were basically swinging the pendulum all the way to the other end. With the monolith, the whole set of databases was a shared resource and basically the constraint.
The reason we changed the architecture was what we saw: the effectiveness of engineers was dropping off. Deployments were getting slower; new features were coming more slowly—all these kinds of things. Why? Because the backend databases were managed by a group of DBAs, yeah?
Database administrators—the DBA cabal. To make any schema change, you had to go through them, and these guys were responsible for those databases, so they were as conservative as you can imagine; the reliability of the site depended on them. So our effectiveness in innovation was really dropping off. Moving to this new architecture, we created all these teams that were going to be independent: they owned the code, and they would have their own data stores.
There was no longer a centralized data store; no more single, centralized resources—things like that. All great—except that we made some mistakes there. One of them you only get to see years later. We carved off all these pieces; the monolith was largely gone; it was all services—I mean, that's a side story. We made a mistake with the three very large data sets: customers, items (that's the catalog), and orders.
We basically took all the code that operated on each of those data sets and made it one service. Within two years, each of those was as big as the monolith had been, yeah? We had to decompose them into much smaller functional building blocks. For example, the customer master service basically consisted of ten different smaller services, each with their own unique scale and reliability requirements.
For example, there would be a login service in there that is called on every webpage, but in that same piece of software also sat the address book service, which is really only needed at checkout. Yet that whole thing needed to scale at the scale of the login service, because that was the one that required the most scale.
So, decomposing that into different building blocks, each with unique scaling, reliability, or maybe security requirements, brought us to what is now called the microservices architecture. Okay—so moving forward, all these teams have small pieces, and then we see the effectiveness of these teams tapering off again.
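The login/address-book example can be put into numbers (all the figures here are made up for illustration): bundled into one service, the rarely-used address-book code has to be deployed and provisioned at the busiest component's scale; split out, it scales for its own load only.

```python
import math

REQS_PER_INSTANCE = 500  # assumed capacity of one instance, in requests/sec

def instances_needed(rps):
    """Instances required to serve a given request rate."""
    return math.ceil(rps / REQS_PER_INSTANCE)

login_rps = 10_000       # hit on every page view
address_book_rps = 200   # only hit at checkout

# Bundled: one service containing both, provisioned for the combined load,
# so the address-book code ships on every one of those instances.
bundled = instances_needed(login_rps + address_book_rps)

# Split: each service provisioned independently for its own load.
split_address_book = instances_needed(address_book_rps)

print(bundled)            # instances carrying address-book code when bundled: 21
print(split_address_book) # instances it actually needs on its own: 1
```

With independent services, each building block can also get its own reliability and security posture, which is the essence of the microservices decomposition described above.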
Because what happens is that each of those teams now needs to do the same things: everyone needs to scale; you need to manage hardware; you need to manage the load balancers and the databases. Each of the teams is now responsible for replicating its database over three different data centers.
Across all these teams, you see the communication between teams and the networking infrastructure increasing while all this new tech is being created. All this work is being done, but it's not work on innovation; it's not work on keeping the experimental pipeline going. So it was another red flag, and what I realized was that all these teams were doing the same thing, because of how we had carved things up: now everyone had to manage their own database.
So we dropped databases into a shared services platform, then storage, then networking, and in the end we also made it an environment where servers were no longer physical servers but became virtual, so you could actually manage them. I think we were really good in those days in terms of hardware provisioning: as an engineer, you would come to a portal, fill in what you needed—10 more Linux servers of this particular size—and an hour or two later you would have them.
But imagine the end of the year, during the holidays, when the spikes are four times as high: teams would need a lot more hardware. If you then went to talk to them in January and asked, "Why aren't you releasing any capacity?"—"Well, you know, there's this project coming up in March that we thought about, so we're just hanging on to it." Apparently, even being able to get capacity within an hour is not a good enough incentive to release it again.
So we needed to go to a model where you could take engineers out of the loop, and where business rules could decide how much capacity is applied to something. Things needed to go together: the servers needed to become virtual, and they needed to get an API. We built these things for ourselves internally, and when we looked at those companies on the outside that were failing once they required scale, we thought, "Could we solve this for them too?"
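"Business rules deciding capacity" can be sketched as a toy scaling policy (the function, thresholds, and numbers here are illustrative assumptions, not AWS's actual rules): size the fleet from a utilization measurement, with no engineer in the loop.

```python
import math

def target_capacity(current_instances, avg_cpu_pct, target_cpu_pct=60,
                    min_instances=2, max_instances=100):
    """Toy business rule: size the fleet so average CPU sits near the target."""
    desired = math.ceil(current_instances * avg_cpu_pct / target_cpu_pct)
    return max(min_instances, min(max_instances, desired))

print(target_capacity(10, 90))  # overloaded fleet: scale out to 15
print(target_capacity(10, 30))  # holiday spike over: scale back in to 5
```

Because the rule runs both ways, hoarding disappears: capacity is released automatically in January and granted automatically in November, which is exactly what required servers to be virtual and behind an API.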
Then we started to think about taking some of those technologies—not the technologies themselves, but how we had built them internally—and rebuilding them for the outside. The first one we launched was Amazon S3, the Simple Storage Service, in the spring of 2006. "Storage for the internet"—that's what we called it. In the early days, we thought this would be targeted at what I would now call internet-scale companies. I don't like to refer to them as startups, because I think there are many other types of businesses that eventually require internet scale. So storage became the first thing we launched.
EC2 followed in the fall of that year—the same kind of model and interfaces that we used internally, where suddenly compute capacity became programmable. Within a few months, enterprises figured out that this was a good deal for them as well. We know what the story has been since then.
So, I am curious—at the time, as you were developing this before it was released, did Amazon have a sense—I mean, it's a dominant product in the world now; it's changed the way developers work at small and large companies. Did you really have that belief coming in, "This is going to change the landscape"? Or was it more that you iterated, every year more people used it, and it slowly grew into what it is today?
I think the one thing we were—I won't say caught off-guard by, but surprised by—is how fast it grew. Did we know it was going to be really big? Yes; that was the bet we were making. Amazon has two types of innovation that happen. One of them is that each team by itself is in charge of its innovation roadmap for the coming year. Teams do that for themselves; as I said, the recommendation engine for shoes has a metric to reduce the number of returns.
So can you recommend: if a size nine in this brand fitted you really well, then if you want to buy these Jimmy Choos, maybe go for an eight and a half, yeah? That seems silly, but returns are a tax; they're customer-unfriendly. The more you can reduce them, the better. So those guys are in charge of building a roadmap for the coming year: how to get new data sources, how to maybe engage with customers differently.
That's something they come up with themselves; there's no top-down direction. So that's one level of innovation. The other level is the one that requires significant capital investment, and AWS clearly falls into that category. Things like the Kindle, Amazon Prime, and others are all things where we needed to make significant capital investments.
We have a rule that if we do that, and it's going to be successful, it needs to be successful in a way that has a significant impact on the Amazon balance sheet. You know, we're not interested in another, let's say, $50 million opportunity. If you make these capital investments, you need to look for really big opportunities.
So, we knew that if AWS was going to be successful, it was going to be massively successful, in a way we'd never seen before. We knew we had to develop lots of new processes, technologies, and techniques for all of this if it was going to be successful. Now, remember, when we were developing S3, we wrote a number on the board: let's build this for this number of objects in the storage engine.
Then, just for the heck of it, we put two orders of magnitude behind it. We blew through that in the first three months. Yes. Now, it turns out that some decisions we made early on from a technology point of view were really smart. We knew that with every order of magnitude of growth, you probably need to revisit the architectures that you have.
So you need to build software that can evolve over time. Take a storage engine, for example: if you go to the next internal release of your software, you can't just copy the massive amount of petabytes you have onto other storage disks to do it. You have to live with multiple architectures at the same time, multiple versions, all these kinds of things.
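One common way to live with multiple on-disk versions at once (a generic sketch—these record formats are invented, not S3's actual layout) is to tag every record with the format version it was written under and keep a decoder for each version still on disk:

```python
DECODERS = {}

def decoder(version):
    """Register a decode function for one on-disk format version."""
    def register(fn):
        DECODERS[version] = fn
        return fn
    return register

@decoder(1)
def decode_v1(payload):
    key, value = payload.split("=", 1)
    return {"key": key, "value": value}

@decoder(2)
def decode_v2(payload):
    # v2 appended a length field; v1 records on disk stay readable unchanged
    key, value, length = payload.split("=")
    assert int(length) == len(value)  # toy integrity check
    return {"key": key, "value": value}

def read_record(version, payload):
    return DECODERS[version](payload)

# A store migrated mid-flight holds both formats side by side:
print(read_record(1, "color=blue"))
print(read_record(2, "size=9=1"))
```

New releases add decoders rather than rewriting petabytes in place; old records are only rewritten opportunistically, if ever, which is what lets multiple architectures coexist.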
Fortunately, with all the lessons from Amazon.com, we actually took the right steps there. There have been some challenges over time in that sense, but, just like at any other company, it's not only the tech that you have to scale. In the early days at AWS, I believe we said something like, "We don't need any salespeople; this stuff will sell itself. It's all self-service."
Well, it turns out that's not the case. It turns out you need solution architects; you need technical account managers; you need customer support; you need all these kinds of things built around your product. They have nothing to do with the tech itself, but you cannot become a successful company without them.
Right before this interview, I was actually looking at the AWS directory page, and I think there were something like 130 services—probably that number will change by the time we finish this conversation. I'm curious: how do you, as an organization, decide to build something new? Is there a top-down process? Is it just an individual team? What does it look like to launch something new?
I think it's both. I mean, we expect all of our teams to be in very close contact with our customers. About 95% of the features and services that we deliver are in response to direct requests from our customers. That's a massive influx, of course. For the very early services that we built, you could almost deduce what they should be.
I mean, what is basic IT infrastructure? Storage, compute, databases, networking, security. You didn't need customers to tell you that; we knew those were the basic pieces we needed to build. But pretty quickly, customers came with further requests. If I look at the kinds of things AWS was really good at at the fundamental level—scale, performance, security, reliability, and managing cost—those are not products on the outside, but they are core capabilities that come back in each and every one of these services.
So customers then came to us and said, "Well, you know, can't you run analytics for us?" And this is the early days. So everything about analytics, about IoT, about mobile development, about—these days—blockchain: all these other technologies that customers actually want to use but do not want to manage themselves.
And we help them by putting up the right features or tools. This is important. We also have a very strong culture around launching new products and services with a minimal feature set. You could call it an MVP, yeah?
But this is technology that other people need to build their business on. So you can't launch things that are flaky; it needs to be rock solid. So we launch things with a minimum feature set and then work with our customers on what the other features should be.
Now, in general, we have an inkling of what the other features are going to be. When we launched DynamoDB, for example, we knew customers wanted secondary indices. We didn't launch with them, but it was obvious that that was what they wanted.
Mostly we launch with a minimum feature set because these are sometimes services nobody has used before and nobody has built before, so you need to observe how your customers are going to use your product. You don't know upfront; they will probably use it in every possible way except the one you intended.
That is good, because if you didn't launch everything and the kitchen sink, you can focus on how your customers are using your product and then slowly start to iterate and add new features and services in the way that matches how they are actually working.
When we launched Lambda, which is our serverless environment—you just write code; you drop it in S3, and you don't need to think about servers or anything else. You don't need to think about idle time; you only pay for what you use; you don't have to upgrade the environment. Nobody had done that before. So how is development going to change? What kind of support structures grow up around it?
We launched it much more as an event-driven environment: a new file arrives in S3, it triggers some code; a new message arrives, it triggers some code; API Gateway, things like that. But it turns out some of the companies that jumped on serverless first were some of the largest enterprises. Why? Because you don't need to manage anything there, and you don't pay if no execution happens.
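The event-driven pattern he describes—a new file lands in S3 and triggers some code—can be sketched as a minimal Lambda handler in Python. The event shape follows AWS's documented S3 notification format; the bucket and object names and the "processing" itself are invented for illustration.

```python
# Minimal AWS Lambda handler for an S3 "object created" notification.
# The event structure (Records -> s3 -> bucket/object) is the standard
# S3 event format; what we do with each object is just a placeholder.

def handler(event, context):
    """Runs whenever a new object lands in the watched S3 bucket."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Real code would fetch and process the object here (e.g. via boto3).
        processed.append(f"s3://{bucket}/{key}")
    return {"processed": processed}

# An abbreviated sample event, reduced to the fields the handler reads:
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "uploads"}, "object": {"key": "img/cat.jpg"}}}
    ]
}
```

With no events arriving, no invocations happen and nothing is billed—which is exactly the pay-per-execution property he contrasts with always-on EC2 instances.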
I mean, if you run a whole batch of EC2 instances, whether you use them or not, you're being billed for them. In this case, you're only billed for execution. So it changes the way development happens. And, as I said earlier, you don't launch everything and the kitchen sink; with serverless, you see how your customers are using it, and you quickly start iterating.
We delivered X-Ray as the debugging environment, we delivered Step Functions to build more complex applications, and there's a lot more coming down the pipeline, mostly because you can observe how your customers are using your product.
So, for example, with DynamoDB, we knew customers wanted secondary indices. It turned out that item-level access management was much more important to them than secondary indices. So basically, customers reordered our roadmap, and we started delivering the things that mattered most to them, which I think is an important part of this.
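For context, item-level access management in DynamoDB is done through IAM policy conditions rather than application code. A minimal sketch, assuming AWS's documented `dynamodb:LeadingKeys` condition key; the table ARN and the web-identity substitution variable are illustrative placeholders.

```python
# Sketch of DynamoDB fine-grained (item-level) access control: an IAM
# policy that only lets a caller touch items whose partition key equals
# their own identity. dynamodb:LeadingKeys is AWS's documented condition
# key for this; the table ARN below is a made-up example.

def item_level_policy(table_arn):
    """Build an IAM policy document restricting access to own items."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:Query"],
            "Resource": [table_arn],
            "Condition": {
                "ForAllValues:StringEquals": {
                    # Only items whose partition key matches the caller's ID.
                    "dynamodb:LeadingKeys": ["${www.amazon.com:user_id}"]
                }
            },
        }],
    }

policy = item_level_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/GameScores")
```

The point of the anecdote stands either way: this kind of per-item restriction was what customers actually prioritized, ahead of secondary indices.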
But again, even though it looks like an MVP, we can't treat it like an MVP, because people will be building their business on it and will be depending on it. So it comes with a very different culture around that. Last year: 1,400 new features and services. As the number of teams grows, of course, it accelerates as well. And we use the same structure within AWS—the Neptune team, our graph database, is supposed to be in contact with their most important customers, let's say the most demanding customers, and understand what their needs are.
So each of these teams has their own customer set and customer base, and they all build a roadmap. The more services you get, the more roadmaps you get. But, you know, this is a really fast-moving environment, where the way people build software is changing dramatically.
If we had decided for our customers how they should develop software, they would probably be developing software the way they did five, maybe ten years ago, because that's the structure you have at that point in time. Instead, we ask how we would like to be developing software in 2020 or 2025—that's the kind of thinking we have.
We can't get there by deciding for our customers; you need to work closely with them and allow them to drive your innovation engine. About a decade ago, you wrote a blog post called, I think, "Working Backwards." Do you remember what I'm talking about?
I’m curious; could you describe sort of the product development flow that you put in that post and then tell me what you learned and whether Amazon still uses that structure?
We use it everywhere. So the protocol is working from the customer backwards. Amazon has a very strong focus on developing only those things that really matter for your customer. Even though we're a technology company—and I think if you're a heavy technology company, with lots of engineering, there's a risk that the engineers end up in charge, yeah? If they're in charge, you do not necessarily build products; you build technology.
For us, there's a big difference between the two. I think what makes AWS successful is that we focus on products—meaning: what do we want to build for our customer?—much more than "Oh, this is this very cool storage system underneath there that nobody has built before," or "How do we do global tables in DynamoDB?" There is all this amazing tech that we're building.
But that's not what's driving it; what's driving it is what you want to build for your customer. What's the problem you want to solve for them? So whether it's in AWS, whether it's in retail, or whether it's opening a new office in Menlo Park, we use a process called working backwards. Why are we doing this?
So the first thing you do is write a press release. It's not a press release that will actually come out; it's a press release you write for yourself, in which you describe, in very clear and simple terms, exactly what you're going to build. Then you write a second document: the 20 most frequently asked questions, and you have to answer those in very clear, simple terms as well.
Sometimes, especially for more complex products, we iterate on those first two documents maybe 10 to 15 times until it's absolutely clear what you're going to build—and not more than that. Then you write the UX document: basically, how are my customers going to use this, what is the interaction going to be?
Then the fourth document is part of the user manual—glossary and some other terms, things like that. At the end of that, you have a set of four documents that describes exactly what you're going to do, and then the rule is: thou shalt not build more than that.
Because as engineers—and I'm guilty of this—you have the tendency to put anything that makes it easy to build version two into version one, no matter what. That's not an option, yeah? It's really about building that and exactly only that. It gives us a really strong structure around how to think about customers, how to think about products—much more about them than about technology.
So this is the product you want to build; now, what kind of technologies do we need to build that? There's a very strong product thinking around it. It combines with another process we have. Meetings at Amazon—we have a moratorium on slides. No PowerPoint, no Keynote, nothing like that. Why? Because in meetings, I think slides are deadly. Half the room will be on their phones, and the other half will already be complaining by slide number two that they don't understand what you're talking about—which is obvious, because they haven't seen the whole presentation yet.
So we operate at Amazon with what's called six-pagers. This is a six-page document, a narrative, that you have to write. The first 30 minutes of a meeting will always be spent reading this document in complete silence. Sometimes, halfway through the reading, you go, "Guys, go back to the drawing board. Don't waste our time here!" Why? Because it's very hard to write a clear document if you do not have clarity of mind.
Writing a narrative is extremely hard, and it's often a collaborative effort. You give it to some of your colleagues for feedback; you put it in your drawer; a week later, you pick it up again and revise it, until you get to the point where you're really clear about describing the feature, the product, the activity, or the new business you want to go into.
And after reading those six pages for 30 minutes, everybody in the room is on the same page—no pun intended. And so you get a fairly high-quality discussion after that, because everybody now knows exactly what you're talking about.
Often, as part of that process, there will be the PR and FAQ as an addendum to the six-pager, and you'll iterate for longer. So we have a very unique culture around that. Five years ago, containerization—sorry, virtualization—was still pretty new. A lot of companies were just moving to it. Today, everything is being containerized, and you have lots of big customers moving to Lambda, to totally serverless platforms.
Where do you think development is going to go five years from now? How are we going to be building our apps?
If I had that crystal ball... I do see quite a few companies skipping the container step more and more. I think part of what makes containers popular is that everyone wanted to move to more microservices-style environments, and container technology combines really well with that: you can scale these single components up and down easily.
So I think that matches the microservices thinking really well. Most of the people in that phase are coming out of the monolith phase, breaking a monolith apart into containers. The people that start from scratch, more and more, develop around serverless environments.
For AWS, we started off with containers—especially with the Elastic Container Service. It's all around Docker and container capabilities and deep integration with all the other services. That's how we started off.
The thing with containers—especially before we delivered Fargate, and we'll talk about that in a minute—is that it almost brings you back to the pre-cloud days. Suddenly, you need to manage multiple containers over multiple availability zones, and you need to map them onto virtual machines, because these things don't run by themselves.
You still need to manage the whole virtual machine environment underneath there. So even though containers are a great dev-experience abstraction, customers need to do a lot of work to run those containers. Part of that is taken care of because it's a managed service. But we also delivered something called Fargate.
So Fargate basically takes away all the management of virtual machines underneath there. You just write a container, you drop it in, and it runs. That's the state where you want to be. Every time you need to do things that have nothing to do with building your product or running your product in the most efficient way, that's a waste of effort.
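The "just drop a container in" model can be made concrete with a small sketch. The parameter names follow the shape of the ECS RunTask API; the cluster name, task definition, and subnet ID are hypothetical placeholders, and the actual SDK call is left commented out.

```python
# Sketch of launching a container on Fargate without ever touching the
# virtual machines underneath. The request shape mirrors the ECS RunTask
# API; "demo-cluster", "web-app:3", and the subnet ID are made up.

def fargate_run_params(cluster, task_def, subnets):
    """Build a RunTask-style request using the Fargate launch type."""
    return {
        "cluster": cluster,
        "taskDefinition": task_def,  # e.g. a registered task like "web-app:3"
        "launchType": "FARGATE",     # AWS provisions the compute for us
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "assignPublicIp": "ENABLED",
            }
        },
    }

params = fargate_run_params("demo-cluster", "web-app:3", ["subnet-0abc123"])
# With credentials configured, this is what you would hand to the SDK, e.g.:
#   boto3.client("ecs").run_task(**params)
```

The design point is in the `launchType`: with `"FARGATE"` there is no fleet of EC2 instances to size, patch, or bin-pack containers onto.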
So we continue to look at how we can take more and more of those pain points away, because, you know what? Nobody cares about managing containers. Nobody cares about the virtual machines underneath there. It's just a tax that you had to pay.
And I think our interaction with the Kubernetes community is really feeding many things back into the mainstream Kubernetes open-source environment, especially when it comes to things like security.
How will we be doing things differently five years from now—that was the question? I think there will be a lot more serverless development, because we already see an enormous pickup in the market. I think we'll see more and more tools, support platforms, and infrastructure for building more complex serverless environments—better integration with other services; that's definitely something we'll see.
But I'm going to shift to something else. One thing that I hope we will be doing five years from now is that everybody will be treating security as their number-one job. Whether you're the CEO, whether you're the CTO, whether you're an engineer, we all need to become security-conscious.
If you look at the past four or five years, there isn't a week that goes by without some massive data breach. It's embarrassing. As technologists, as engineers, and as digital business leaders, we should be embarrassed. Where's the outrage? We almost seem to accept this as normal.
It's not. Without protecting your customers, you have no business. This is something that, as a young business, as a startup, you maybe didn't think about five or ten years ago, but now everybody needs to start by thinking, "This is the kind of data I'm collecting from my customers. How am I protecting my customers?"
You may say, "Who is interested in when Werner rented this bike here?" Well, you know what? That data set, plus all the other data sets that you may get from other places, may be very valuable. So I think we all need to become extremely security-conscious and make sure that we can continue to protect our customers.
It would be embarrassing, I think, for young businesses to start losing their customers' data. It would impact everybody else in your environment—not just your customers; it would reflect on other young businesses as well. If you collect data from your customers—if you have consumer data—you have a great responsibility to keep that data secure.
And, you know, there are tools for this—there's encryption, and we give you dozens of different tools that can all help you do it. But you need to keep in mind that this is your first and foremost job. Of course, you're thinking, "Yeah, but we're busy building this new consumer service."
"We don't have time for that now." But if you build your new business without security baked in, it will be very hard to retrofit. You'll have a nightmare, let's say, two to three years down the road, when you become successful and you grow and you need to retrofit security. You'll be in a major mess.
So that means your development processes—because that's what you asked about—need to change. Security needs to become a default part of, for example, your continuous integration and continuous deployment pipeline, yeah? It needs to trigger events: whenever you're building something and someone adds a new open-source library, an alarm should go off so someone can inspect it. Why are we adding this? Why are we doing this? Did we check this out from a security point of view?
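The alarm he describes—flag any change that introduces a new third-party dependency so a human reviews it—can be sketched as a small check in a CI pipeline. The `requirements.txt` format and the idea of diffing two manifest versions are assumptions about how a Python project would wire this in; real pipelines would run it as a pre-merge step.

```python
# Sketch of a CI/CD security gate: compare the dependency manifest before
# and after a change, and raise an alarm for every newly added library.
# Assumes a requirements.txt-style manifest; adapt for other ecosystems.

def parse_requirements(text):
    """Return the set of package names from a requirements.txt body."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        # Strip version specifiers like '==1.2' or '>=2.0'.
        for sep in ("==", ">=", "<=", "~=", ">", "<"):
            line = line.split(sep)[0]
        names.add(line.strip().lower())
    return names

def new_dependencies(old_text, new_text):
    """Dependencies present in the new manifest but absent from the old."""
    return sorted(parse_requirements(new_text) - parse_requirements(old_text))

old = "requests==2.31\nboto3>=1.28\n"
new = "requests==2.31\nboto3>=1.28\nleft-pad==0.1\n"
alarms = new_dependencies(old, new)  # -> ["left-pad"]: block merge, ask why
```

A non-empty result would fail the pipeline stage and require an explicit security sign-off before the change proceeds.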
Your development pipeline itself needs to be secure. There need to be all sorts of alarms going off in that whole section, and security needs to become your first and foremost concern. There are lots of automation tools around this, yeah? So when you deploy, you can use Amazon Inspector to test for vulnerabilities.
And if you do continuous integration and continuous deployment—especially if you're in, for example, FinTech or healthcare—you're also subject to regulatory requirements. How do you know that the next five lines of code that you just wrote don't violate HIPAA, yeah? Or that you're still in compliance with whatever the financial regulator wants from you?
So there are all sorts of automation tools that can continuously test this for you. But you do need to do it, and it needs to be on your mind. So development, especially if you do continuous deployment, changes radically from how we've approached security in the past.
Now, I honestly believe continuous deployment is actually better from a security point of view, because in the past, you would write 50,000 lines of code; the security team would come in, review it, and bless it. They had no clue what they were blessing—50,000 lines of code? You have no idea, yeah? And then you would deploy. Changing five lines of code is something you can actually test with automated processes, with automated reasoning, and things like that, so that you actually build the right things.
So I think these are great advances for security, but we do need to keep security in mind. I hope that in five years' time we will all have become—not necessarily security engineers, but super security-conscious, and protecting our customers will be our first concern. It is at Amazon; it will forever be a number-one investment area, both in terms of intellectual capital and financial capital.
Without protecting your customers, you do not have a business. Beyond not taking security seriously enough, are there any other common mistakes you see startups making as they use AWS?
Well, first and foremost, there's a technical one. We get quite a few people who are first-time AWS users but already have experience building services in a traditional data center. If you use AWS as if it were a data center, you lose out. Yes, there are some advantages—you get elasticity and things like that—but if you don't use the higher-level services around data analytics, around mobile—all the ways we take away the heavy lifting of development—you're losing out by treating it as just a data center with virtual machines, a database, and storage. That's not where your major productivity improvements will come from.
Next to that, at more of a meta level, you have to figure out what kind of company you are. I think there are two different kinds of startups. One goes for high growth—the get-big-fast kind of thing: acquire as many customers as possible as quickly as possible, not necessarily thinking that much about revenue, taking a lot of investors' money, getting really big really fast to become successful, and then probably being acquired, because that's what people are interested in. That makes your use of the cloud very different, and we support it really well, because you no longer have to worry that much about getting capacity or services or whatever.
It's all there, and it comes with a clear cost picture as well. Or you want to become a sustainable company—basically, "I want to build this company and still be in business ten, twenty years from now"—not necessarily focused on acquisition, but just on building a business. If you follow Signal v. Noise, the guys from Basecamp—DHH and Jason Fried—they talk a lot about how they want to build a sustainable business; they want to still be doing this 10, 20 years from now.
How do you do that? That requires different architectures; it requires much more control over cost; it requires a clear association between cost and customer acquisition. We support them as well, but it requires you to build very different architectures, because you want much more control over cost as you scale. The other kind of company isn't concerned about cost; they're much more concerned about whether they can change really fast and address their customers' needs, because they need to grow really fast.
Jeff often makes the distinction between mercenaries and missionaries. Mercenaries are the startup founders that are in it for the money, and missionaries are the ones that are in it for the love of the product: "I want to build this product; I want to build this business." We support both of them. Both approaches to entrepreneurship are valid; it's just that the tech support and the tech that you build for each of these groups is very, very different.
So, figure out what you are. Alright, well, thank you so much. This has been wonderful, and we've learned so much. Everyone, a big thanks to Dr. Werner Vogels. [Applause]