
How Benjamin Button got his face - Ed Ulbrich


13m read
·Nov 9, 2024

I'm here today representing a team of artists and technologists and filmmakers that worked together on a remarkable film project for the last four years. Along the way, they created a breakthrough in computer visualization. So I want to show you a clip of the film now; hopefully, it won't stutter. If we did our jobs well, you won't know that we were even involved.

I don't know how it's possible, but you seem to have more hair. What if I told you that I wasn't getting older, but I was getting younger than everybody else? I was born with some form of disease. What kind of disease? I was born old. I'm sorry. No need to be; there's nothing wrong with old age. Are you sick? I heard Mama and Tizzy whisper, and they said I was going to die soon. But maybe not. You're different than anybody I've ever met.

There were many changes, some you could see, some you couldn't. Hair had started growing in all sorts of places, along with other things. I felt pretty good, considering. That was a clip from The Curious Case of Benjamin Button. Maybe you've seen it or you've heard of the story, but what you might not know is that for nearly the first hour of the film, the main character, Benjamin Button, who's played by Brad Pitt, is completely computer-generated from the neck up.

Now, there's no use of prosthetic makeup or photography of Brad superimposed over another actor's body. We've created a completely digital human head. So I'd like to start with a little bit of history on the project. This is based on an F. Scott Fitzgerald short story. It's about a man who's born old and lives his life in reverse.

Now, this movie's floated around Hollywood for well over half a century. We first got involved with the project in the early '90s with Ron Howard as the director. We took a lot of meetings, and we seriously considered it. But at the time, we had to throw in the towel; it was deemed impossible. It was beyond the technology of the day to depict a man aging backwards. The human form, and particularly the human head, has been considered the Holy Grail of our industry.

The project came back to us about a decade later, and this time with a director named David Fincher. Now Fincher is an interesting guy. David is fearless of technology, and he is absolutely tenacious. David won't take no for an answer, and David believed, like we do in the visual effects industry, that anything is possible as long as you have enough time, resources, and of course, money.

And so David had an interesting take on the film, and he threw a challenge at us. He wanted the film, or the main character of the film, to be played from the cradle to the grave by one actor. It happened to be this guy. We went through a process of elimination and a process of discovery with Dave, and we ruled out, of course, swapping actors. That was one idea; we would have different actors hand off from actor to actor, and we even ruled out the idea of using makeup.

We realized that prosthetic makeup just wouldn't hold up, particularly in closeup. Makeup is an additive process; you have to build the face up, and David wanted to carve deeply into Brad's face to bring the aging to this character. He needed to be a very sympathetic character. So we decided to cast a series of little people that would play the different bodies of Benjamin at different increments of his life and that we would, in fact, create a computer-generated version of Brad's head aged to appear as Benjamin and attach that to the body of the real actor.

Sounded great! Of course, this was the Holy Grail of our industry, and the fact that this guy is a global icon didn't help either, because I'm sure if any of you ever stand in line at the grocery store, you know we see his face constantly. So there really was no tolerable margin of error.

There were two studios involved: Warner Brothers and Paramount, and they both believed this would make an amazing film, of course, but it was a very high-risk proposition. There was lots of money and reputations at stake, and we believed we had a very solid methodology that might work. But despite our verbal assurances, they wanted some proof.

So in 2004, they commissioned us to do a screen test of Benjamin, and we did it in about five weeks. But we used lots of cheats and shortcuts. We basically put something together to get through the meeting, and I'll roll that for you now. This was the first test for Benjamin Button, and in here, you can see it's a computer-generated head; it's pretty good attached to the body of another actor.

And it worked! It gave the studio great relief after many years of starts and stops on this project. And making that tough decision, they finally decided to greenlight the movie. I can remember actually when I got the phone call to congratulate us that the movie was a go; I actually threw up. You know, this is some tough stuff.

So we started to have early team meetings, and we got everybody together. It was really more like therapy in the beginning, convincing and reassuring each other that we could actually undertake this. We had to hold up an hour of a movie with a character, and it's not a special effects film; it has to be a man. We really felt like we were in kind of a 12-step program, and of course, you know the first step is admitting you've got a problem.

So we had a big problem; we didn't know how we were going to do this. But we did know one thing: being from the visual effects industry, we, with David, believed that we now had enough time, enough resources, and, God, we hoped we had enough money. And we had enough passion to will the process and technology into existence.

So when you're faced with something like that, of course, you got to break it down. You take the big problem, you break it down into smaller pieces, and you can start to attack that. So we had three main areas that we had to focus on. We needed to make Brad look a lot older. We needed to age him 45 years or so.

We also needed to make sure that we could take Brad's idiosyncrasies, his little tics, the little subtleties that make him who he is, and have that translate through our process so that it appears in Benjamin on the screen. We also needed to create a character that could hold up under really all conditions. He needed to be able to walk in broad daylight, at nighttime, under candlelight.

He had to hold up in extreme closeup; he had to deliver dialogue. He had to be able to run, he had to be able to sweat, he had to be able to take a bath, to cry. He even had to throw up, not all at the same time, but he had to do all of those things, and the work had to hold up for almost the first hour of the movie. We did about 325 shots, so we needed a system that would allow Benjamin to be able to do everything a human being can do, and we realized that there was a giant chasm between the state-of-the-art of technology in about 2004 and where we needed it to be.

So we focused on motion capture. Now I'm sure many of you have seen motion capture, and the state-of-the-art at the time was something called marker-based motion capture. I'll show you an example here. It's basically the idea of you wear a leotard, and they put some reflective markers on your body.

Instead of using cameras, there are infrared sensors around a volume, and those infrared sensors track the three-dimensional position of those markers in real time. Then animators can take the data of the motion of those markers and apply them to a computer-generated character. So you can see the computer characters on the right are having the same complex motion as the dancers.
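
To make that last step concrete, here is a minimal Python sketch of the idea: take the 3D marker positions the infrared sensors report for one frame and map them onto the joints of a computer-generated character. The marker names and the simple average-of-markers rig are illustrative assumptions, not any particular studio tool.

```python
# A minimal sketch (not production code) of marker-based motion capture data
# being applied to a CG character: per-frame 3D marker positions drive joints.
import numpy as np

# Which captured markers drive which joint of the CG character (hypothetical names).
JOINT_MARKERS = {
    "head":      ["head_front", "head_back"],
    "left_hand": ["lwrist_in", "lwrist_out"],
}

def solve_joint_positions(frame_markers):
    """frame_markers: dict marker_name -> (x, y, z) reported by the sensors."""
    joints = {}
    for joint, markers in JOINT_MARKERS.items():
        pts = np.array([frame_markers[m] for m in markers])
        joints[joint] = pts.mean(axis=0)  # place the joint between its markers
    return joints

# One captured frame (fabricated numbers, just to show the data flow).
frame = {
    "head_front": (0.0, 1.70, 0.10), "head_back": (0.0, 1.68, -0.10),
    "lwrist_in":  (-0.6, 1.00, 0.02), "lwrist_out": (-0.65, 1.01, 0.00),
}
print(solve_joint_positions(frame))
```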

So we also looked at a number of other films at the time that were using facial marker tracking, and that's the idea of putting markers on the human face and doing the same process. As you can see, it gives you a pretty crappy performance; that's not terribly compelling. And what we realized was that what we needed was the information about what was going on between the markers.

We needed the subtleties of the skin; we needed to see skin moving over muscle, over bone. We needed creases and dimples and wrinkles and all of those things. So our first big revelation was to completely abort and walk away from the technology of the day—the status quo, the state-of-the-art. And so we aborted using motion capture, and we were now well out of our comfort zone and in uncharted territory.

So we were left with this idea that we ended up kind of calling technology stew. We started to look out in other fields, and the idea was that we were going to find nuggets or gems of technology that perhaps come from other industries, like medical imaging or the video game space, and reappropriate them. We had to create kind of a sauce, and the sauce was code: software that we wrote to allow these disparate pieces of technology to come together and work as one.

So initially, we came across some remarkable research done by a gentleman named Dr. Paul Ekman in the early '70s, and he believed that he could, in fact, catalog the human face. He came up with the Facial Action Coding System. He believed that there are basically 70 basic poses or shapes of the human face, and that those basic poses or shapes can be combined to create infinite possibilities of everything the human face is capable of doing.
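
In modern terms this maps naturally onto blend shapes: a small library of basic face shapes mixed with weights can span a huge range of expressions. Here is a minimal, hypothetical Python sketch of that combination idea; the shape names, vertex count, and random geometry are placeholders, not Ekman's actual catalog or the production pipeline.

```python
# A minimal sketch of the blend-shape idea: face = neutral + sum_i(w_i * delta_i).
import numpy as np

N_VERTS = 5                       # a real face mesh has tens of thousands of vertices
rng = np.random.default_rng(0)

neutral = rng.normal(size=(N_VERTS, 3))          # resting face geometry (fabricated)
basis = {                                        # offsets from the neutral face
    "brow_raise": rng.normal(scale=0.01, size=(N_VERTS, 3)),
    "smile":      rng.normal(scale=0.01, size=(N_VERTS, 3)),
    "jaw_open":   rng.normal(scale=0.01, size=(N_VERTS, 3)),
}

def blend(weights):
    """Combine basic poses into one expression."""
    face = neutral.copy()
    for name, w in weights.items():
        face += w * basis[name]
    return face

expression = blend({"brow_raise": 0.8, "smile": 0.3})   # a raised-brow half-smile
print(expression.shape)
```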

And of course, these transcend age, race, culture, gender. And so this became kind of the foundation of our research as we went forward. Then we came across some remarkable technology called Contour, and here you can see a subject having phosphorescent makeup stippled on her face. Now what we're looking at is really creating a surface capture as opposed to a marker capture.

The subject stands in front of a computerized array of cameras, and those cameras can, frame by frame, reconstruct the geometry of exactly what the subject's doing at the moment. So effectively you get 3D data in real time of the subject. If you look at a comparison, on the left you see what volumetric data gives us, and on the right you see what markers give us.

So clearly, we were in a substantially better place for this, but these were the early days of this technology, and it wasn't really proven yet. But you know, we measure complexity and fidelity of data in terms of polygonal count. On the left, we were seeing 100,000 polygons. We could go up into the millions of polygons; it seemed to be infinite.

This is when we had our "aha!" moment. This was the breakthrough. This was when, like, okay, we're going to be okay; this is actually going to work. The "aha!" was what if we could take Brad Pitt, and we could put Brad in this device, use this contour process, and we could stipple on the phosphorescent makeup and put him under the black lights?

We could, in fact, scan him in real time performing Ekman's facial poses. Effectively, we ended up with a 3D database of everything Brad Pitt's face is capable of doing. From there, we actually carved up those faces into smaller pieces and components of his face, so we ended up with literally thousands and thousands and thousands of shapes—a complete database of all possibilities that his face is capable of doing.

Now, that's great, except we had him at age 44. We needed to put another 40 years on him. At this point, we brought in Rick Baker, and Rick's one of the great makeup and special effects gurus of our industry. We also brought in a gentleman named Kazu Tsuji, and Kazu Tsuji is one of the great photoreal sculptors of our time.

We commissioned them to make a maquette, or a bust, of Benjamin. So in the spirit of the great unveiling, I had to do this. I had to unveil something. So this is Ben at 80. Now, we created three of these: there's Ben at 80, there's Ben at 70, there's Ben at 60. This really became the template moving forward.

Now, this was made from a life cast of Brad, so in fact, anatomically, it is correct: the eyes, the jaw, the teeth, everything is in perfect alignment with what the real guy has. We had these maquettes scanned into the computer at very high resolution and enormous polygonal count. So now we had three age increments of Benjamin in the computer, but we needed to get a database of him doing more than that.

So we went through this process called retargeting. This is Brad doing one of the Ekman facial poses, and here’s the resulting data that comes from that—or the model that comes from that. Retargeting is the process of transposing that data onto another model. Because the life cast or the bust, the maquette of Benjamin, was made from Brad, we could transpose the data of Brad at 44 onto Brad at 87.
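
A rough way to picture retargeting in code, assuming (as the talk implies) that the scan of Brad at 44 and the scanned maquette share vertex correspondence: take the displacement each vertex makes away from the source's neutral face and re-apply it on top of the target's neutral face. The tiny meshes below are fabricated just to show the data flow.

```python
# A minimal sketch of delta-based retargeting between two corresponding meshes.
import numpy as np

def retarget(source_neutral, source_pose, target_neutral):
    """Transpose a captured pose onto another model with the same vertex order."""
    delta = source_pose - source_neutral   # what the expression *does* to each vertex
    return target_neutral + delta           # the aged head doing the same thing

# Tiny fabricated 3-vertex "meshes", just to show the shapes of the data.
brad_44_neutral = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
brad_44_smile   = brad_44_neutral + np.array([[0., .02, 0.], [0., .03, 0.], [0., 0., 0.]])
benjamin_87     = brad_44_neutral * 1.1      # stand-in for the scanned maquette

print(retarget(brad_44_neutral, brad_44_smile, benjamin_87))
```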

So now we had a 3D database of everything Brad Pitt's face can do at age 87, in his 70s, and then in his 60s. Next, we had to go into the shooting process. So while that's going on, we're down in New Orleans and locations around the world, and we shot our body actors. We shot them wearing blue hoods. So these are the gentlemen who played Benjamin.

The blue hoods helped us with two things: one, we could easily erase their heads, and two, we put tracking markers on their heads so we could recreate the camera motion and the lens optics from the set. But now we needed to get Brad's performance to drive our virtual Benjamin. So we edited the footage that was shot on location with the rest of the cast and the body actors.
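
For readers curious how tracking markers turn into a recovered camera, here is a minimal sketch using OpenCV's solvePnP, one standard way to solve a camera pose from known 3D marker positions and their tracked 2D positions in a frame. The marker layout and camera intrinsics are made up for illustration; this is not the production matchmove pipeline.

```python
# A minimal matchmove-style sketch: recover camera pose from 2D/3D marker correspondences.
import numpy as np
import cv2

# 3D positions of hood markers in "head space" (metres, hypothetical layout).
object_points = np.array([
    [0.00, 0.10, 0.00], [0.08, 0.05, 0.00], [-0.08, 0.05, 0.00],
    [0.06, -0.05, 0.03], [-0.06, -0.05, 0.03], [0.00, 0.00, 0.09],
], dtype=np.float64)

# A made-up pinhole camera (focal length in pixels, principal point).
K = np.array([[1500., 0., 960.], [0., 1500., 540.], [0., 0., 1.]])

# Synthesise the 2D tracks we would have gotten from the plates, by projecting
# the markers through a known camera pose (rotation + translation).
rvec_true = np.array([[0.1], [0.2], [0.0]])
tvec_true = np.array([[0.05], [0.02], [2.0]])
image_points, _ = cv2.projectPoints(object_points, rvec_true, tvec_true, K, None)

# Recover the camera pose from the correspondences, as a matchmove solver would.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
print(ok, rvec.ravel(), tvec.ravel())
```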

About six months later, we brought Brad onto a soundstage in Los Angeles. He watched on the screen, and his job then was to become Benjamin. We looped the scenes he watched again and again; we encouraged him to improvise. He took Benjamin to interesting, unusual places that we didn't think he was going to go. We shot him with four HD cameras so we could get multiple views of him, and then David would choose the take of Brad being Benjamin that he thought best matched the footage with the rest of the cast.

From there, we went into a process called image analysis. Here you can see again the chosen take, and we are seeing now that data being transposed onto Ben at 87. What's interesting about this is we used something called image analysis, which is taking timings from different components of Benjamin's face.

So we could choose, say, his left eyebrow, and the software would tell us that in frame 14, the left eyebrow begins to move from here to here; it concludes moving in frame 32. We could choose numbers of positions on the face to pull that data from. Then the sauce I talked about with our technology stew—that secret sauce—was effectively software that allowed us to match the performance footage of Brad in live action with our database of aged Benjamin, the face shapes that we had.
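
A minimal sketch of how timings like that can drive animation: convert each component's start and end frames into a per-frame weight that blends the matching shape from the aged-Benjamin database. The function and component names below are illustrative, not the actual software.

```python
# A minimal sketch: image-analysis timings become per-frame blend weights.
def component_weight(frame, start_frame, end_frame):
    """Linear ease from 0 to 1 while the facial component is moving."""
    if frame <= start_frame:
        return 0.0
    if frame >= end_frame:
        return 1.0
    return (frame - start_frame) / float(end_frame - start_frame)

# The example from the talk: the left eyebrow begins moving at frame 14
# and concludes moving at frame 32.
timings = {"left_brow_raise": (14, 32)}

for frame in (10, 14, 23, 32, 40):
    weights = {name: component_weight(frame, s, e) for name, (s, e) in timings.items()}
    print(frame, weights)
```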

On a frame-by-frame basis, we could actually reconstruct a 3D head that exactly matched the performance of Brad. So this was how the finished shot appeared in the film. Here you can see the body actor, and then this is what we called the dead head—no reference to Jerry Garcia. Then here's the reconstructed performance now with the timings of the performance, and then again the final shot.

It was a long process. So thank you.

Next section here, I'm going to just blast through this because we could do a whole TED Talk on the next, you know, several slides. We had to create a lighting system, so really a big part of our process was creating a lighting environment for every single location that Benjamin had to appear in.

That way, we could put Ben's head into any scene, and it would exactly match the lighting that's on the other actors in the real world. We also had to create an eye system. We found the old adage, you know, that the eyes are the window to the soul, is absolutely true. So the goal here was to keep everybody looking in Ben's eyes, and if you could feel the warmth and feel the humanity and feel his intent coming through the eyes, then we would succeed.
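
The lighting side can be sketched as image-based lighting: capture the light at a location as an environment map and integrate it against each surface normal so the digital head is lit the way the plate was. The toy environment map below is fabricated; real pipelines use HDR panoramas and a full renderer.

```python
# A minimal image-based-lighting sketch: diffuse irradiance from a tiny lat-long map.
import numpy as np

H, W = 4, 8
env = np.zeros((H, W, 3))
env[0, :, :] = [1.0, 0.95, 0.9]              # bright, slightly warm light from above

def diffuse_irradiance(normal, env_map):
    """Integrate incoming light over the sphere, clamped to the hemisphere around `normal`."""
    h, w, _ = env_map.shape
    total = np.zeros(3)
    for i in range(h):
        theta = (i + 0.5) / h * np.pi        # polar angle of this row (0 = straight up)
        for j in range(w):
            phi = (j + 0.5) / w * 2 * np.pi  # azimuth of this column
            l = np.array([np.sin(theta) * np.cos(phi),
                          np.cos(theta),
                          np.sin(theta) * np.sin(phi)])
            cos_term = max(0.0, float(np.dot(normal, l)))
            solid_angle = np.sin(theta) * (np.pi / h) * (2 * np.pi / w)
            total += env_map[i, j] * cos_term * solid_angle
    return total

print(diffuse_irradiance(np.array([0.0, 1.0, 0.0]), env))   # an up-facing surface point
```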

We had one person focused on the eye system for almost two full years. We also had to create a mouth system. We worked from dental molds of Brad, and we had to age the teeth over time. We also had to create an articulating tongue that allowed him to enunciate his words, so there’s a whole system in software written to articulate the tongue. We had one person focused on the tongue for about nine months. He was very popular.

Skin displacement, another big deal. The skin had to be absolutely accurate, and he's also in an old age home. He’s in a nursing home around other old people, so he had to look exactly the same as the others. So lots of work on skin deformation; you can see in some of these cases it works and in some cases it looks bad. This is a very, very, very early test in our process.

So effectively, we created a digital puppet that Brad Pitt could operate with his own face. There were no animators necessary to come in and interpret behavior or enhance his performance. There was something we encountered, though, that we ended up kind of calling the digital Botox effect.

As things kind of went through this process, it did kind of—Fincher would always say it sandblasts the edges off of the performance. One thing that our process and the technology couldn't do is it couldn't understand intent—the intent of the actor. So it sees a smile as a smile; it doesn't recognize an ironic smile or a happy smile or a frustrated smile.

So it did take humans to kind of push it one way or the other, but we ended up calling the entire process—and all the technology—emotion capture as opposed to just motion capture. So take another look. Well, I heard mama and Tizzy whisper, and they said I was going to die soon. But maybe not.


That's how to create a digital human in 18 minutes. Thank you! Just a couple of quick factoids: it really took 155 people over two years, and we didn't even talk about 60 hairstyles and an all-digital haircut. But that is Benjamin. Thank you!
