
Apple Vision Pro: Startup Platform Of The Future?


18 min read · Nov 3, 2024

How much of the hard, interesting stuff Apple did is in the hardware of the Vision Pro versus the software? Well, you need to understand the real world in order to augment it: the technology of a self-driving car, but on a headset. This is maybe where founders should pay attention: is this a good opportunity for startups? There are all kinds of new interactions that I think we have not figured out yet that truly take advantage of this platform. The dream has always been to get to something like this.

[Music]

Welcome back to another episode of The Light Cone, and as you can see, it's not just any other day in tech. There are some new platforms coming up right now. You might have seen reviews elsewhere; we're not doing reviews today. We're going to talk about what these platforms might mean for founders and people who want to build things for a billion people. We actually have an expert at the table right now, don't we? We do. Diana, who's a group partner at YC. Before YC, she worked in AR and VR for 10 years, since the dawn of the Oculus, before VR was a mainstream thing. In fact, her grad school research was in computer vision, so she's been interested in this long before it was a thing.

Diana, do you want to talk about the startup you did, a really early pioneering AR/VR startup? Yeah, we went through YC with a startup called Escher Reality. We were building an augmented reality SDK for game developers, so they could build multiplayer experiences and AR games, write the code once, and have it work on any platform: not just iOS and Android mobile devices. The dream has always been to get to something like this, or that, or that, so that developers would write the code once and it would work across all devices.

What happened to your startup? Well, this took a lot longer to come to market, that's one thing. The other thing is that we ended up getting acquired by Niantic, the makers of Pokémon Go, so I headed a lot of the AR platform over there. We actually shipped a lot of this AR SDK into their games, so millions of players are running our code, which is really cool. If you've ever played Pokémon Go, you've literally used code that Diana wrote. And I'm so excited about this platform coming in, and we can dive deeper into it.

Okay, should we take the headsets off so we can talk? Yeah, let's go. So, it's been a long road. You've seen this technology evolve over the course of a decade. Why AR? That's one of the big questions here. Previous platforms were mostly focused on VR and the gaming aspect; HoloLens from Microsoft seemed to try the AR thing. What's going on with the Apple Vision Pro? Why is this important? Why are we talking about this?

Yeah, I mean, we have to go back in the history of computing. People have been attempting to build augmented reality and VR headsets since the beginning of the first computers. The very first one was built by a guy called Ivan Sutherland back in the 60s. So people have been thinking about it for a long time; it's one of the dreams, and it's one of those things that really fascinated me. I think it's so much in our consciousness that we want to make it really happen. But the reason it has not happened, unlike tablets and phones, is that it's just really, really hard to make.

So you bring up the Microsoft HoloLens: they had version one and version two, and sadly the latest version got scoped down and the team largely got let go, because they tried an optical approach. In that approach to AR, you see the actual real world, and the digital content is rendered onto your eyes, with a very small field of view. It was the same approach Magic Leap was trying. What Apple is doing instead is pass-through: a full high-res video feed of the real world.

Arguably, a lot of the technical challenges are easier that way, because the hard part of optics is that it is not a problem you can solve with Moore's Law, just brute-forcing it with more computation and more pixels. It is actually about figuring out new physics for optics and photons so that they render properly to the human eye, because the human eye is incredible. Your field of view is about 210°: if you put your hands behind your ears, you can still kind of see them. To have a display system that can really render all of that is so hard.
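To put rough numbers on that claim, here is a back-of-the-envelope sketch. The 60-pixels-per-degree figure is a common rule of thumb for human visual acuity, and the field-of-view numbers are approximations; none of these values come from the episode or from Apple.

```python
# Rough pixel budget to match human vision across the full field of
# view. Assumptions (illustrative only): the eye resolves roughly
# 60 pixels per degree, and the binocular field of view is about
# 210 degrees wide by 135 degrees tall.
PIXELS_PER_DEGREE = 60
FOV_H_DEG = 210
FOV_V_DEG = 135

width_px = FOV_H_DEG * PIXELS_PER_DEGREE   # 12,600
height_px = FOV_V_DEG * PIXELS_PER_DEGREE  # 8,100
megapixels = width_px * height_px / 1e6

print(f"{width_px} x {height_px} pixels = {megapixels:.0f} MP")
```

That works out to roughly 100 megapixels, far beyond what headset displays ship today, which is one way to see why simply brute-forcing the display does not work.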

And the other part that's really hard, which I want to touch on a bit more, is that our eyes have a near-infinite ability to focus. We can look close or very far, and in some sense you have to find a way to make that discrete for computers to handle, because computers just understand ones and zeros. Getting that to work in a display is so hard, and Apple has done some clever things there that are different from the optical approach.

Because the optical approach is what, like, it's actually looking through to the real world?

Or it's how? What's the difference?

Yeah, so if I'm looking at Jared right now, I'm actually seeing Jared, and in an optical system I would only overlay the digital information on top. The Vision Pro, like the Meta Quest 3 or Meta Quest Pro, is technically a VR headset: the full video is digital. Jared is technically pixels when I see him through the Vision Pro.

And so you're saying the Apple Vision Pro being a video feed actually reduces the technical challenge?

Yeah, because there are a couple of things you can do. You can play a lot with the video feed, and if you're the best in the world at display technology, which Apple is, you can get away with a lot. One of the cool things they've done, and a foundation of what they built, which is helpful to know if you're going to build apps here, is that so much of it is built upon eye tracking.

So they actually have variable rendering based on focus: foveated rendering. They had to get the eye tracking working really well for this to work. In the Vision Pro, wherever you look, the pixels at your focal point render at higher fidelity than everywhere else. The reason this is important is that to fit everything in such a small form factor, given the heat dissipation and battery it takes to push so many pixels, you have to make trade-offs.

So they did this thing of rendering at higher resolution where your eye focuses. You can notice it a little bit in the periphery with the Vision Pro, where things are not quite pixelated but a bit blurry. Some people do complain online about the field of view; I think that's a bit of an artifact of the lens, but that's a different discussion.
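The idea behind that foveated-rendering trade-off can be sketched in a few lines. This is purely illustrative: the function name, foveal radius, and falloff constants are made up, and this is nothing like Apple's actual pipeline.

```python
import math

def render_scale(angle_from_gaze_deg: float, fovea_deg: float = 5.0,
                 min_scale: float = 0.25) -> float:
    """Resolution multiplier for a screen region, given its angular
    distance from the user's gaze point: full resolution inside the
    foveal region, smoothly falling off toward the periphery."""
    if angle_from_gaze_deg <= fovea_deg:
        return 1.0
    # Exponential falloff outside the fovea, floored at min_scale.
    falloff = math.exp(-(angle_from_gaze_deg - fovea_deg) / 20.0)
    return max(min_scale, falloff)

# A renderer would shade peripheral tiles at a fraction of full res:
print(render_scale(0.0))    # gaze center: full resolution
print(render_scale(40.0))   # far periphery: much coarser
```

The payoff is that only a small angular window ever needs full pixel density, which is exactly why the eye tracking has to be fast and accurate: if the high-resolution window lags behind the eye, the user sees the blur.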

So how much of the hard, interesting stuff Apple did is in the hardware of the Vision Pro versus the software? I think the cool thing is that it's both, because the Vision Pro is a culmination of the expertise and ecosystem they built with the iPhone. They have custom silicon: the R1 processor, which is a co-processor to the M2. The M2 is basically the same processor that runs in the MacBook Pro, so it's very beefy, but it's built for regular CPU workloads.

But the challenge for building an AR headset, or AR in general, is that you need to understand the real world in order to augment it. For that, you need a lot of sensors. So this has over 10 cameras; it even has a LiDAR, a TrueDepth camera, and a bunch of IR cameras inside to track your eyes. That's a lot of data, a lot of high-bandwidth sensor data that it needs to process.

With a regular processor underneath, you'd get blocked on throughput. So the R1 is a custom processor that handles all of the sensor data over very high-bandwidth channels, and I suspect they are even running a real-time operating system alongside visionOS, which is interesting for what it means for developers processing all of this in real time.

And it's starting to sound a lot like the technology of a self-driving car, but on a headset.

Yeah, exactly. As you were talking about what this is, that's what springs to mind: LiDAR plus a bunch of cameras, and processing the video feed.

Yeah, can you draw the connection? Like, it's probably not obvious to people what the connection is between like VR, AR, and self-driving cars.

Yeah, actually, this was one of the jokes with my co-founder when we started Escher Reality. The core tech for localizing in the world and knowing where you are comes from a field in robotics called SLAM: simultaneous localization and mapping. You want to find where a robot is in the world based on just visual data, and that's the same thing self-driving cars use to navigate the 3D world.

So you'll notice those cars have 3D LiDARs, radars, and a bunch of cameras; it's the same thing here, to know where you are in the world. So it's the same technical challenge, but with so much more hardware complexity, because you don't want to cook people's heads. With self-driving cars, you can fit the actual processing hardware in the car itself: they put server-grade GPUs and CPUs in the trunk or underneath.
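For intuition about what SLAM is doing, here is a toy one-dimensional version of the predict/correct loop (a deliberately simplified sketch, nowhere near a real SLAM system, which tracks full 6-DoF pose against thousands of map points): predict your position from motion, then correct it when you observe a landmark at a known location.

```python
def predict(pos, var, motion, motion_var):
    """Dead-reckoning step: move, and grow uncertainty."""
    return pos + motion, var + motion_var

def correct(pos, var, measured_pos, meas_var):
    """Landmark-observation step: fuse estimate and measurement,
    weighted by their uncertainties (a 1D Kalman update)."""
    k = var / (var + meas_var)            # Kalman gain
    new_pos = pos + k * (measured_pos - pos)
    new_var = (1 - k) * var
    return new_pos, new_var

pos, var = 0.0, 1.0
pos, var = predict(pos, var, motion=1.0, motion_var=0.5)      # drift grows
pos, var = correct(pos, var, measured_pos=1.2, meas_var=0.3)  # fix drift
print(pos, var)  # estimate pulled toward the landmark, variance shrinks
```

The same pattern, run continuously at far larger scale and with camera images instead of a single landmark reading, is what both the headset and a self-driving car are doing to stay localized.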

But what they've done here is pretty cool, and they could build a lot of it because on the iPhone they learned how to build custom processors. They built the TrueDepth camera, which uses IR for 3D mapping, and they added LiDAR on the latest iPads. They've been building up the ecosystem one piece at a time.

Yeah, it's interesting how Apple can build on their previous products. So a lot of the technology here is coming out of the iPhone. It sounds like this sets them up pretty well to build their car, with the same expertise.

Let's talk about the use cases a little bit. I mean, one of the things that's pretty clear in everything about the launch of this is it's focused on productivity, and I kind of like it because when you're talking about these Oculus devices, they're much more focused on gaming, on VR, where you're sort of in a totally different place. Whereas, you know, my guess is one of the reasons why VR/AR hadn't been embraced is that it wasn't something that a busy person would use every single day.

But now, you know, it's got the M2; it's the same chip that I have in my MacBook Air. I can actually, with a keyboard, do all of my work all day if I wanted to, and that's a really big difference in how they're positioning this device, which is a big departure from Meta. Meta is so much in the gaming community, and actually, there was, I think, a bit of an uproar from the VR community that there's no controllers.

Apple has really focused full-on on productivity, which was my dream when we started in AR: that if AR was going to happen, we wouldn't notice it, because it would solve all the very mundane things and could replace all screens. If done well, this is going after the market cap of every screen that gets sold. I mean, there are still a lot of things to be done; this is still version zero. But this motion, like this, was incredibly natural, and being able to look at things and have them be something you interact with. I was just blown away at how simple and easy it was to reprogram my brain, which is cool.

I have a question for you, Gary. Do you remember when the iPhone came out, Apple had this Human Interface Guidelines document? Mhm. Yeah, it had a lot about communicating information hierarchy with touch and focus and gestures with your thumb, things like that.

Yeah, it was an incredibly comprehensive document. They basically took all of the learnings that they had gotten building the iPhone for years, and they distilled it into a really thorough document. Then they published it for everyone. I think it taught a whole generation of designers and developers how to build great mobile apps. They would just read that document.

There is a Human Interface Guidelines document for the Vision Pro, and one thing you notice is that so much of it is about eye tracking and communicating information with depth and space. Maybe this is something for founders to think about if you're building an app in this space: with the Vision Pro, Apple invested heavily in eye tracking, for many reasons. We talked about how it was a building block just to get the rendering to work, but for the UX, I think this is the moment we saw with capacitive touch, where Apple got it right for the iPhone.

The eye tracking is starting to look a lot like that. So I think there are a lot of cool UX things yet to be discovered with just eye tracking. And the funny thing is that the VR community was very skeptical of this, because eye tracking was considered bad practice: it tires the user too much. But the reason was that the hardware wasn't good enough.

I remember the same thing before the iPhone came out: a lot of the conventional wisdom from consultants and experts was that the virtual keyboard wouldn't work, that people wanted a physical keyboard, and that people would never treat it as a serious device to do their email on because it didn't have a real keyboard.

Yeah, oh yeah, yeah, that was all the reviews of the iPhone.

Yeah, but, I mean, this is maybe where founders should pay attention. There were still things that Apple had not figured out yet that third-party developers ended up figuring out. If you remember pull-to-refresh, that was something that, I think, first appeared in a Twitter client, and that founder ended up selling their Twitter client to Twitter and working there for a while.

But there are all kinds of new interactions that I think we have not figured out yet; the pinch-to-move gesture is merely the first of a whole bunch of things that, frankly, developers will actually figure out.

I think I'm curious also, Diana, what's the difference for a developer between the Meta SDK and the Apple Vision Pro SDK?

One of the big ones is that Meta comes from the DNA of gaming, so they have very good support for Unity and Unreal. Those are game engines, which are great for building games: 3D environments that are essentially constrained 3D worlds. But for spatial computing, the real world is infinite, so sometimes game engines don't quite fit.

One thing you'll notice is that building an application that opens a PDF on the Meta platform takes a lot of lines of code. Huh? Whereas building that for visionOS is actually just a few lines of code. Interesting.

I guess the other big question that probably a lot of people in the community have is, is this an iPhone moment or a Newton moment?

Well, when the iPhone first launched, there wasn't actually an app store, right? So I think that came maybe a year later, something like that. All of the initial apps that got distribution on the App Store were like frivolous apps, right? It's like the fart app. There were a bunch of things that were getting really popular—the $2,000 I Am Rich app, which is like an image of a ruby or something.

Yeah, oh my God. If you think about it from at least the YC perspective, the iPhone, or mobile, didn't start driving really big companies being started until probably 2012. 2012 is the year Instacart came through. I actually think mobile was a fairly big component of Coinbase, right? The fact that they just had an easy-to-use mobile app.

DoorDash was 2013, and of course you had the rise of Uber, not a YC company. You could say it took five years from the launch of the iPhone for the really good companies to even be founded. Right?

And so, yeah, so you haven't missed it yet.

Yeah, well, when I think about the Vision Pro, I'm not sure where we are. Is this the iPhone moment in the sense that the iPhone just got launched, and it's still going to be a few years? Or is it, hey, actually, this device has been around for a while, and this is just the iteration that was needed to unlock the Instacarts and DoorDashes and Ubers that will be built on it?

I'll give one argument for why it's probably more like the iPhone moment. We don't know, but when the iPhone came out, people forget that smartphones were already an established category, and the iPhone was the new entrant. A lot of people were skeptical that Apple could actually execute; as you mentioned, they were very skeptical of the iPhone as the right product to challenge the BlackBerry and the other incumbent smartphones of the time.

There's the famous Steve Ballmer quote, just making fun of it and saying it would never be a serious device. Right, right, right.

Why did it take five years for the good iPhone companies to come out? I think adoption had to happen, and that's why it actually maps very closely. I don't know how many Apple has actually sold, but it's probably on the order of hundreds of thousands, which probably mirrors the early iPhone; maybe the iPhone even broke a million.

When you look back at the Instacart or DoorDash or Uber moment, those mobile workforces could only happen once 70% to 80% of people in society had these devices. The reason that was such an important moment was that it was the first time normal, average people had always-on internet connectivity and an app ecosystem that was stable enough.

You know, remember back, sort of 10 years before? It was like, do we write it in J2ME or in Flash? You know, Gustaf with his Voxer and Heysan experience: the platforms were literally so broken and so fragmented that you couldn't have 80% of the population on one platform. And then suddenly all of the platforms sort of coalesced, and it opened up the market.

I guess a question with this device, and with VR in general, is that it will be different from mobile. It may not be the type of device everyone owns, perhaps; it depends on the price point, when it gets down to maybe phone cost. But it will take a lot of time before we get that level of mass adoption.

But I think what could happen is that it captures a lot of the high-end use we talked about earlier: high-information-density workflows in construction, CAD, and engineering.

So Diana and I were actually doing group office hours yesterday with a group of our companies in the current batch who are all working on hardware and hard-tech ideas, and we did this exercise we call the pre-mortem, where you give them different flavors of how companies can die, and you get them to say, this is how I think I'm most likely to die, right?

And the thing that springs to mind here is that we were talking about how Tesla's strategy was very successful: launch the Roadster, a very high-end device, and then bring out the Model S and the Model 3 and the Model Y.

But that wouldn't have worked if they had just stuck with the Roadster, right? So maybe one failure mode for the Vision Pro is that this is the Tesla Roadster: it's great, it carves out a niche of people who are really into this stuff and willing to pay for a very high-end device, but it can't be followed up with the Model 3.

And I think there's a bit of a chicken-and-egg aspect to it, because for this to become the Model 3, let's say, we need an ecosystem of applications and an incentive for developers to work on it. If I were a founder right now looking for a new idea, do I want to put all my eggs in here when there aren't enough users yet? When should I do it? Should I just take a leap of faith?

How do we advise Founders when they're in this space? Like, why should they do it?

I definitely think that's relevant to the Instacart and DoorDash comparison. If you think about it, those companies weren't making a platform bet; their apps were not specific to iOS or Apple, right? Everybody had a device, and they worked equally well on Android. Frankly, they could have just been a web view stuffed in an app, right?

And so that's a good point. They also weren't the first entrants in their categories: before DoorDash and Instacart, there were many would-be DoorDash and Instacart players that launched earlier and didn't succeed.

Yeah, even more extreme: in their case, mobile actually turned ideas that seemed very bad into good ones. I actually think it's really cool that Sequoia invested in Instacart, because they'd had the big failure with Webvan, so they had all this egg on their face over grocery delivery being a bad idea. You'd expect it would be very natural to never want to fund that again.

But mobile actually turned that into a good idea. I did a dinner talk with Max, the co-founder of Instacart, and he said that when Sequoia led the Series A for Instacart, they gave him the Webvan business plan that they had been given in the 90s. The problem was it was on a floppy disk, and he couldn't find a floppy disk reader, so he never read it.

That's hilarious.

I'm sort of taken by the path of consumer social networks, too. You know, Facebook started as the Blue app; it was a desktop experience that killed MySpace. It literally looked like bank software: if you logged into Facebook or Chase.com, they even had the same color.

And I remember being at YC when Mark Zuckerberg came to talk about why they bought Oculus. From what I could tell, it was very much about fighting the last war: Facebook had just bought Instagram, and I think it had not bought WhatsApp yet, and he felt really scared that, basically, Facebook had this monopoly.

It owned the industry of consumer social, but then they almost lost it, because Instagram easily could have outstripped it. And that was because of a platform shift, so he wanted to very clearly own the next platform.

And he's right. So should founders go build on this? Is this a good opportunity for startups?

I just sort of wonder what things could fully take advantage of this in a real professional context. Where my head goes, and maybe it's too obvious, is traders with their 20 screens. Wouldn't you rather have something that allowed you to take in the breadth of that information and dive into it very easily, just by going like that?

You can imagine that being something people are actually willing to pay not just hundreds of dollars a month for, but maybe thousands of dollars a month.

I think we're going to be in this awkward phase with spatial computing apps for quite some time at the beginning, because even with the Apple and Meta SDKs, a lot of things are still flat 2D. I don't think we know yet how to develop for full 3D in a way that truly takes advantage of this platform.

What is unique about this platform, whether it's the 360-degree view or being able to dive into more data easily? What aspects of this new technology mean it can upend even what seems like an unassailable incumbent, like Snapchat versus Facebook?

But would part of you try and talk them out of it? Like would part of you be thinking this is too early, you should work on something else?

I think if you look back at our history, YC has weirdly been pretty good at this. Every time there's a platform shift, whether it's the Facebook platform thing, which didn't go anywhere, or the iOS thing, which did go places, we were reasonably accurate at funding the right stuff.

I think the way that we did it is rather than having a strong thesis on each technology and each platform, we just kind of look at each application from first principles, and we talk to the founders, and they have some idea. We just try to figure out if the idea makes sense.

I think that's what has allowed us to have a pretty good track record at discriminating between the people who are just cargo-culting the new thing, jumping on the hype train with some idea that doesn't really make sense, and the people who are building something like DoorDash that actually totally makes sense.

Yeah. I mean, the other thing I would look at, to Jared's point, is whether there's a strong belief from the founder that they want to make a bet in this space. There's just something about founders when they go all in: they become unstoppable. And it's going to take time, so they have to have faith that this is going to be different from building, let's say, a standard SaaS application or consumer app or AI application.

If you stick with it long enough, you're going to build a lot of expertise and be world-class by the time the right moment comes.

But it should be someone who's genuinely excited about it. And the cool thing is that there's a lot of technical challenge here, which I think is going to attract the right kind of founders, because it's actually hard to build something good on this right now; it's so new.

So this will be the main thing I'll look for when I'm reading applications from people doing VR stuff, and I feel okay sharing it because it's very hard to fake.

It's basically what we’re saying is if you’re the kind of person that just is irrationally compelled to build applications for VR, we will happily fund you. And like, we need some evidence of that just like you—just like your SP in your free time, you are like building VR apps and you have been for a while. Like, yeah, we would never try and discourage Founders from building stuff they just think is cool.

Well, that's a great place to end. We're out of time, but thank you guys! Another good episode of The Light Cone. Guys, see you next time!

Catch you guys!

[Music]
