
Debunked: Making Music With Cars (Bootboxing and Techno Jeep)


Nov 8, 2024

Over the last few months I saw a couple of videos: Bootboxing, featuring Snot Scroller, and Julian Smith's Techno Jeep. Both of them feature cars being played by a group of people. The people appear to be manipulating various parts of the cars in real time to create beats. A lot of people are impressed with the videos, and some have reacted quite angrily to suggestions that there may be foul play going on. It's not my intention to make a value judgment here or to comment on the artistic merit of the videos; this is an attempt to explain why I think there's very good reason to believe that the performances are fake.

So, a bit of background about me: I've been working with audio sequencers and digital audio for about 12 years. Most of that time, my audio work has consisted of arranging very small slices of sound that have a percussive character. Lately, I've written software to measure the precision with which a human plays back a musical passage. So, when I watched these videos, it was obvious to me in both cases that the audio tracks we hear weren't being created by the people we see. Instead, the audio for most of each video's duration is the output of a computer sequencer playing back digital samples of recorded car noises.

In both car performances, there are two unmistakable fingerprints of digital audio sequencing. The first: the timing is exactly on the grid; it's metronomic. Only a computer performs music that way. Even the most technically skilled humans don't play with that level of precision. Musicians naturally push and pull the timing of notes by tiny amounts, and this is a big part of what gives a musical performance its character. The producers of the car clips could have made them considerably more lifelike by applying processing to humanize the timing of the sequence. Humanizing, in the language of music software, just means applying a random push or pull to each musical event so that it doesn't sit exactly on the timing grid, which makes the sequence sound less robotic.
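To make the idea of humanizing concrete, here is a minimal Python sketch. It is not taken from any particular music package: the function names `humanize` and `grid_deviation_ms`, the 10 ms maximum shift, and the 120 BPM grid are all assumptions chosen for illustration.

```python
import random


def humanize(onsets_ms, max_shift_ms=10.0, seed=None):
    """Return a copy of the onset times with a random push or pull on each one."""
    rng = random.Random(seed)
    return [t + rng.uniform(-max_shift_ms, max_shift_ms) for t in onsets_ms]


def grid_deviation_ms(onsets_ms, step_ms):
    """Distance of each onset from its nearest grid line, in milliseconds."""
    return [abs(t - round(t / step_ms) * step_ms) for t in onsets_ms]


# A strictly quantized run of 16th notes at 120 BPM (one every 125 ms).
grid = [i * 125.0 for i in range(8)]

# The same pattern with up to 10 ms of random offset per note (assumed amount).
loose = humanize(grid, max_shift_ms=10.0, seed=1)

print(max(grid_deviation_ms(grid, 125.0)))   # the quantized pattern sits on the grid
print(max(grid_deviation_ms(loose, 125.0)))  # the humanized one drifts off it slightly
```

The same deviation measurement, applied to onsets detected in a real recording, is one rough way to ask how metronomic a performance is.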

The second fingerprint is that the sounds repeat themselves exactly. I don't mean that I heard a car door slamming repeatedly; I mean that I heard the same recording of a car door slam over and over. If you repeatedly play back a small piece of audio, for instance a recording of a snare drum, it can quickly sound unnatural, especially in quick succession. That's what happens in the car clips, and again, the producers missed a simple way to create something more lifelike: a technique called multi-sampling. What they should have done was sample many different door slams and then have the computer select one at random each time it needed to play a slam sound.
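A rough sketch of multi-sampling in Python follows. The class name `MultiSample` and the file names are hypothetical, and the rule of never picking the same take twice in a row is an extra assumption layered on top of the plain random choice described above.

```python
import random


class MultiSample:
    """Holds several recordings of the same sound and picks one per hit."""

    def __init__(self, takes, seed=None):
        if not takes:
            raise ValueError("need at least one take")
        self.takes = list(takes)
        self.rng = random.Random(seed)
        self.last = None

    def next_take(self):
        # Skip the previous take when there is an alternative (assumed refinement).
        choices = [t for t in self.takes if t != self.last] or self.takes
        self.last = self.rng.choice(choices)
        return self.last


# Hypothetical file names standing in for four separately recorded door slams.
door = MultiSample(
    ["slam_01.wav", "slam_02.wav", "slam_03.wav", "slam_04.wav"], seed=0
)
sequence = [door.next_take() for _ in range(8)]
print(sequence)
```

With only one recording in the pool, the picker falls back to repeating it, which is exactly the situation heard in the car clips.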

So, to finish with, we'll take a look at some waveforms. I recorded myself clicking my fingers and imported the recording into Audacity, which is a free sample editor. I made a copy of the recording onto a second track. It's a stereo recording, by the way, which is why you see four horizontal lines in total. I offset the copy so that two different clicks would align with each other as closely as possible. At this level of magnification, you can see that the clicks, and their waveforms, look alike. And here's how they sound: this is the first one, and here's the second one. They sound pretty similar; not identical, but pretty close.

As we continue to zoom in, though, the similarity between the waveforms starts to break down. For instance, at this level of magnification, we can already see that the top one has several distinct islands of amplitude, whereas the other has a much more even tail. As we get closer, the differences become more apparent. There's a lot of disagreement between the peaks and troughs of the initial attack portion here. In the second one, you see this area of rapid peak-to-trough oscillation just here, and many of the peaks and troughs don't align.
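The visual comparison can also be put as a number. The sketch below is an illustration only: the two "clicks" are synthetic decaying tones with slightly different pitch plus noise (an assumption, not the finger-click recording), and `similarity` is a plain normalized correlation, where 1.0 means the two shapes are identical.

```python
import math
import random


def similarity(a, b):
    """Normalized correlation of two equal-length snippets (1.0 = same shape)."""
    if len(a) != len(b):
        raise ValueError("snippets must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0


def synthetic_click(n, freq_hz, decay, rng, noise=0.02, rate=44_100):
    """A decaying sine burst with a little noise, standing in for one click."""
    return [
        math.exp(-decay * i) * math.sin(2 * math.pi * freq_hz * i / rate)
        + rng.gauss(0.0, noise)
        for i in range(n)
    ]


rng = random.Random(0)
click_a = synthetic_click(2000, 1800.0, 0.004, rng)
click_b = synthetic_click(2000, 1850.0, 0.004, rng)  # a second, slightly different take

print(similarity(click_a, click_a))  # a clip compared with itself
print(similarity(click_a, click_b))  # two separate takes score noticeably lower
```

Two separate human takes behave like `click_a` and `click_b`: close at a glance, but the score falls short of 1.0 once the fine detail is taken into account.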

Here I've imported the audio from the video Bootboxing featuring Snot Scroller. I've done a similar thing: I made a copy of the track and offset one of them so that two different bars of the audio line up with each other as closely as possible. This is a double door slam that we're seeing, and here's how the other one sounds. There's an immediate similarity between the two waveforms, but this time, as we zoom in, the similarity holds really well.

Remember that because these are stereo tracks, you should compare the top pair with the bottom pair. Again we still have very good agreement; this distinctive 3-3 peak here is repeated in this one. As we see here, the waveform appears to be made up of individual points. Each of these points is one sample. Confusingly, sample in this context doesn't mean a short audio clip, but the smallest unit that a digital waveform is made up of. A sample in an audio file is analogous to a pixel in an image file. In this example, there are 44,100 of these samples for every second of playback, so this is an absolutely tiny slice of audio we're looking at, and the waveforms still agree.
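To give a sense of how small that slice is, here is the arithmetic as a short Python sketch. The helper name `samples_to_ms` is made up for this example; the only figure taken from the recording is the rate of 44,100 samples per second.

```python
SAMPLE_RATE = 44_100  # samples per second, as stated for this recording


def samples_to_ms(n_samples, rate=SAMPLE_RATE):
    """Duration in milliseconds of a run of n_samples at the given sample rate."""
    return 1000.0 * n_samples / rate


print(samples_to_ms(1))    # one sample lasts roughly 0.023 ms
print(samples_to_ms(100))  # a hundred samples on screen is still only about 2.3 ms
```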

So down to the sample level, we still have really high agreement between the two waveforms. Here, I've done the same with the audio from the track Techno Jeep, and I'm focusing on this section here. Here's how the first track sounds. We can see that the waveforms look very similar to start with, so let's focus on the first of these slams. The similarity is still really high; just look at this little peak here. Again, even down to the sample level, the agreement between the two tracks is very, very high. This shows beyond doubt that the slamming door sounds in these two videos are sequenced audio clips and not the recording of a human performance.
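The sample-level check can be sketched in Python as well. Everything here is a stand-in: the "slam" is random integer data rather than audio from either video, and `count_close_samples` with its `tolerance` parameter is a hypothetical helper, with the tolerance included on the assumption that real copies may differ slightly after encoding.

```python
import random


def count_close_samples(track, start_a, start_b, length, tolerance=0):
    """Count positions where two regions of one track differ by at most `tolerance`."""
    return sum(
        1
        for i in range(length)
        if abs(track[start_a + i] - track[start_b + i]) <= tolerance
    )


rng = random.Random(0)
slam = [rng.randint(-20_000, 20_000) for _ in range(500)]  # stand-in for one recorded slam
gap = [0] * 300

# A sequenced track: the very same clip triggered twice, 800 samples apart.
track = slam + gap + slam + gap

matches = count_close_samples(track, 0, 800, 500)
print(matches, "of 500 samples agree")
```

When the same clip is triggered twice, every sample lines up; two separately performed slams would agree only by chance.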
