Your Favorite Youtuber Will Soon Be Replaced By AI
How do you know that the voice you're hearing right now is human? Most of you have no idea what I look like, so how can you tell I'm a real person? What if your favorite YouTuber is actually an AI?
2023 is shaping up to be the year of artificial intelligence. Between the controversy swirling around various image generators and all the hype about ChatGPT, AI has been dominating news headlines for months, and for good reason. Known as generative AI, these programs are capable of performing tasks previously reserved for humans, namely the generation of text, images, video, and other creative media.
YouTube's new CEO, Neal Mohan, has even said that the company is looking to expand AI's role in content creation. In a letter outlining YouTube's yearly goals, he stated, "The power of AI is just beginning to emerge in ways that will reinvent video and make the seemingly impossible possible." It's likely that in a few months, you may not be listening to my voice, but one created by an AI.
Of course, this technology isn't exactly new. The AI video platform Synthesia has been around since 2017 and has partnered with major brands like Nike, Reuters, BBC, and Google. Starting at just $30 a month, you can use this service to create your very own digital twin—an AI-generated avatar that both looks and sounds just like you. The process is simple. First, you record yourself reading eight pages of pre-written scripts, each one capturing a different tone, like instructional, professional, or cheerful.
Next, after a bit of hair and makeup, you stand in front of a green screen, working with a director and film crew to record various movements. The whole thing only takes three hours, and afterward you get access to a platform where you can insert text or upload audio files for the avatar to perform. You can even tweak the audio to more accurately represent your natural speaking pattern. Recently, ChatGPT has been added to the mix, further automating content creation.
This means creators can hand over every part of the production process to AI—from coming up with the idea to writing the script, recording the audio, and shooting the video. One of the scariest things about the rise of AI is that a lot of people are sadly going to lose their jobs. AI itself told me that jobs like data entry clerks, bank tellers, and assembly line workers are at risk of being taken over by automation.
In light of this, it has become more important than ever to learn skills that cannot be easily automated out of existence. If you want a high-paying career in the technology industry but don't have previous experience or a degree, Course Careers is here to help. All you do is go through an affordable online course where you learn everything required to actually do the job. Once you're done, you have the incredible opportunity to work with one of the host of companies they're partnered with.
These companies drop their degree and experience requirements to hire Course Careers graduates into entry-level positions and internships. You no longer need to spend a fortune on college to get a good-paying job. And you don't have to take my word for it. Here is Nyla, a 19-year-old who went from being a Starbucks barista to making over sixty thousand dollars in a remote technology sales career.
And Ben went from being a college dropout working as a middle school janitor to making eighty thousand dollars as a tech sales rep, working fully remote. To join Nyla and Ben, go to coursecareers.com or simply click the link in the description down below and sign up for their free introductory course, where you'll learn exactly how you can start a high-paying technology career without a degree or previous experience. When you're ready to get the full course, use code "APERTURE50" to get $50 off.
Back to our story: the advantages of this technology are obvious. On the most basic level, digital avatars don't have to worry about camera shyness—they always look presentable and never need reshoots. Simply assign the parameters, hit a button, and you've got a piece of publishable content. Not only does this allow creators to manage their workflow better, it also allows them to oversee multiple projects simultaneously.
Rather than being limited to a single production, creators can practically be in several places at the same time. Some YouTubers are already doing this, albeit in a more analog fashion. With over 130 million followers, Mr. Beast is the most popular YouTube celebrity on the planet. His videos feature expensive stunts, competitive challenges, Let's Plays, and a wide variety of other fun content—I'm sure you've seen him.
In order to maintain his demanding production schedule, Mr. Beast created a clone of himself—only instead of using Synthesia, he hired a living, breathing person. Mr. Beast 2.0 was trained seven hours a day for two years to learn how to make the exact same decisions that Mr. Beast himself would make. This allowed the YouTuber to essentially be in two places at once, effectively doubling his creative output. Since then, the Mr. Beasts have gone on to make some of the most amazing videos on the platform and start an entire fast food chain.
This cloning strategy offers us a hint of the potential of generative AI. Having multiple creators working under the same name, whether they're a look-alike or artificial intelligence, opens up completely new avenues to explore. Take social media influencers, for instance. Their name is their product, and they sell that product to prospective companies looking to market their goods and services.
Normally, influencers are limited to a single IP: themselves. But with generative AI, they can create dozens of digital avatars, each with its own talent agent, associated brands, and licenses. These clones can then be sold to corporate partners, who could use them to create advertisements without the influencer ever having to show up to work. Not only does this increase the potential output of creators, as dozens of videos could be pumped out in the time it used to take to make one, but it also lowers the cost of production.
Instead of hiring an entire team of writers, videographers, editors, makeup artists, and other industry professionals, you only need to pay for a single piece of software. The potential payoff is absolutely staggering. Imagine a world where your digital twin runs around the metaverse doing your work for you, or AI-generated celebrity avatars interact with fans through VR—all thanks to artificial intelligence. All of this will soon be possible.
Synthesia has worked with over 15,000 businesses and created more than four and a half million videos. Though to be candid, these videos tend to be fairly corporate and are limited to a single avatar standing in front of a background. While this is fine for HR training videos or marketing promotions, the platform lacks the crucial tools necessary for more creative media. You won't be making an entire short film using Synthesia—at least not yet.
Still, the technology offers us a peek into what's possible. The pieces are all there. Attempting to put them together is Snapchat, which recently announced the launch of its own chatbot dubbed "My AI," powered by ChatGPT. My AI is able to interact with users and respond with natural-sounding dialogue. However, unlike Microsoft's new Bing AI or Google's Bard, it's not meant to serve as a search engine. Rather, Snapchat's AI is presented more like a personality, even appearing in your friends list with its own profile in Bitmoji.
Snapchat's CEO Evan Spiegel has indicated that the company's goal is to humanize AI and to normalize these kinds of interactions, saying, "The big idea is that in addition to talking to our friends and family every day, we're going to talk to AI every day." It seems as though it's only a matter of time before AI-generated personas will be popping up in your feed, though for some of us, that may already be the case.
Meet Xeravega, created by the LA-based production studio Corridor Crew. Xeravega is a 100% AI-generated social media influencer. Their videos have been posted on Instagram and TikTok for a little over a year, amassing an audience of around 30,000 followers between platforms. Everything, from the dialogue and animation to the tone and the camera angles, is AI-generated, and the results have been, well, mixed.
If you scroll through Xeravega's videos, most are a bit nonsensical. The character's speech is odd, their movements are jerky, and each video ends with a random dance sequence—perhaps as an homage to early TikTok dances. Most of the videos are filled with the kind of bugs that you'd see in a video game from the early 2000s. Xeravega's avatar frequently walks through walls, jumps around the room, and makes painfully awkward facial expressions.
Despite all this, what Corridor Crew has accomplished is actually pretty remarkable. The trickiest part of generative AI is successfully combining different elements to form something new and cohesive—making sure a character's lips sync to the audio, that their interactions with locations and objects are organic, and that their decisions form a logical narrative. Quirks and all, Xeravega has been doing all of this. Their videos contain multiple ongoing stories that build off of each other, including one where they get a jet ski and another where they become trapped in their basement, only to discover that they are in fact an AI.
The biggest technological hurdle that both Xeravega and Synthesia need to overcome is what's referred to as "The Uncanny Valley." It's the psychological gap that we humans experience when seeing something that is close to us yet still an imperfect replica of ourselves. Xeravega's behavior is almost human-like but lacks coherence. The digital avatars created by Synthesia are convincing, but when you watch them, it's clear something is off—the voices are a little too Siri-like, and the avatars are somehow both moving too much and not enough.
It's like they're trying to overcompensate for the fact that they're not real. But this is just a limitation of current technology. Generative AI is still very new, and given a few years, The Uncanny Valley will inevitably be crossed. In reality, there are much bigger problems that everyone—not just content creators—should be worried about.
In a previous video, I talked about AI bias. Since the launch of generative AI programs, many of them have demonstrated clear racial prejudices, likely the result of the way these programs are trained. But more disturbingly, other programs have acted aggressively or erratically towards users who attempt to stress-test their systems.
Companies like Anthropic and Synthesia claim to have installed guardrails to prevent these sorts of behaviors. Others haven't been as diligent. Facebook's large language model, LLaMA, was leaked online in early March 2023, and since then it's been downloaded by plenty of people looking to exploit the technology for their own purposes. A group of programmers on Discord created a version of the AI made specifically to spew racial obscenities and hate speech. Groups like these claim that by exposing vulnerabilities in the programs, they're fighting back against the companies behind them—companies that are becoming increasingly secretive about their technology.
OpenAI, the company behind ChatGPT, has done a complete 180 on its original open-source principles. Instead, they've chosen to keep the latest iteration of the chatbot behind closed doors. Microsoft has also made some worrying decisions, including firing the entire ethics and society team in its AI department. This is concerning given the recent wave of lawsuits against generative AI programs like Midjourney and Stable Diffusion—both of which have been accused of training their AIs by using copyrighted works of art without obtaining consent from the artists.
Visual artists have been sounding the alarm about this for months, but it's now a problem that other creators are waking up to as well. It's bad enough when another human steals your idea, but imagine being a comedian and hearing ChatGPT rip off one of your jokes, or being a celebrity and seeing an AI impersonating you online. In fact, this has already happened.
ElevenLabs is an AI service that generates voice clips using audio uploaded by users. You upload a recording of whoever you want and type in some text, and suddenly you have the ability to make Joe Biden and Donald Trump argue about video games, or make a dead YouTuber say whatever you want. This is what happened to John Bain, otherwise known as TotalBiscuit, a YouTube commentator who passed away in 2018.
In March of 2023, an AI voice model impersonating Bain appeared online, making various inflammatory statements, including transphobic comments. While Bain will never have to endure hearing his voice used as a tool to promote bigotry, Bain's widow has. She's now faced with the choice of whether to remove Bain's 3,000+ videos from YouTube or leave them online, vulnerable to abuse.
Other celebrities have fallen victim to AI impersonation too. One video showed Emma Watson reading sections of Hitler's Mein Kampf, and another showed Mary Elizabeth Winstead using transphobic slurs and repeating 4chan memes.
Besides becoming platforms for trolls that create hate speech-spewing deepfakes, generative AI is also being used by governments as a tool for propaganda. In January, it emerged that someone had used Synthesia to generate a series of videos of a newscaster expressing support for Burkina Faso's new military dictatorship. A few weeks later, state-run television stations in Venezuela began playing a video they claimed was of an American newscaster debunking negative claims about the Venezuelan economy, when in reality, the country has been facing a terrible economic crisis.
In reality, the man featured in the video was one of Synthesia's avatars. Similarly, pro-China videos have also emerged online, also clearly produced using Synthesia. Fortunately, these videos were flagged as AI-generated, thanks to their obvious flaws. But it's only a matter of time before the technology creates avatars that are indistinguishable from real humans.
So, what happens when this technology becomes so good you can no longer tell the difference between a person and a program? The promise of generative AI is that it will give creators more opportunities to monetize their work and explore new ideas. More than that, it lowers the bar of entry. In the same way that digital audio workstations like Ableton effectively hand a single DJ the power of an entire orchestra, platforms like ChatGPT and Synthesia give everyone the opportunity to become a director without needing to get a job in Hollywood.
You don't need writers, actors, or a film crew; you just need a laptop and an idea. We might see a new wave of creative mediums as millions of people find novel ways to express themselves through these programs. That said, the potential for abuse of this technology is extraordinarily high.
In the race for technological supremacy, safety has become an afterthought for many companies. Stronger guardrails need to be implemented; legislation protecting artists' work and individuals' likenesses needs to be passed; and the companies responsible for this technology need to operate with greater transparency. OpenAI recently published a report claiming that 80% of the American workforce will be impacted by ChatGPT in some way, and that doesn't include the various image, video, and audio generators out there.
If artificial intelligence forever changes how we live and work, then we should all have a say in how it's developed and where it's used. Audiences should never have to guess whether or not the voice they're listening to is human.
Now, if you're terrified about the future of generative AI, I'm sorry to say, but you haven't even heard the worst of it. Watch the video on screen right now to find out the scariest thing about ChatGPT.