GNU Assembly Tutorial 1
Hey guys, this is Mids And1. Today, we're going to be doing a tutorial on something called assembly.
So if you've ever wondered what your microprocessor actually runs, the answer is machine code, and assembly is the human-readable form of it: each assembly instruction maps almost directly to one machine instruction. Therefore, if you program in assembly, you'll be programming directly for your processor in a language that's as close as you can get to what your processor really understands. And that's one of the cool features about assembly that you won't get in higher-level programming languages. Maybe a bit in C, but not really.
The thing is that on the Macintosh computer, you don't need anything beyond Xcode to do this. Xcode comes with something called GCC, which can assemble code written for something called GAS, the GNU Assembler. GAS is an assembler that runs on many platforms and supports many processors. The code you write in it is still specific to your processor, and it's good for understanding how your computer's processor really works.
So before you even start to use assembly, you have to understand at least a bit about the hardware of your computer. That way, you understand what your code is actually doing when you're running instructions. Like I said, the microprocessor is the chip in your computer that runs your assembled code, and assembly code is made up of instructions.
When you do anything in assembly, you have to use an instruction, and each instruction only does one small thing, like moving a value or adding two numbers. A modern processor has hundreds of them, but you'll only need a couple dozen for most programs. So each step is very limited in functionality. That's the way the processor gets instructed. The next part of your processor is the registers, which are little slots inside the processor where it can store a bit of data, and I'll go over them again in a couple of minutes just to give you a better idea.
Next, you have to understand what RAM is. RAM, also known as memory, is a set of chips in your computer that store data. Now, RAM needs electricity running through it to keep its data, so everything in it is lost when the power goes off, and that's why you never want to rely on RAM to keep a file around. RAM is for running applications to store the data they're working with, and that data goes away when they close. So that's what RAM is.
Now, since RAM has a ton of bytes, it is addressed, and what that means is every byte in RAM has a unique number that identifies it, a number that no other byte in RAM has. On a 32-bit system, that number is 32 bits. While on that topic of bits, a bit is a one or a zero, and a byte is eight bits. So a bit and a byte are two different things.
There are 256 different values that can be in a byte. On a 32-bit system, every byte in RAM is addressed with 32 bits, which is 4 bytes put together and read as one number. The problem is that there are only so many numbers that 32 bits can hold: about 4.3 billion, which means at most 4 gigabytes of addressable RAM.
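If you want to check those numbers yourself, here's a minimal Python sketch. I'm using Python only because it makes the arithmetic easy to run, and the function name distinct_values is just made up for this example. It counts how many different values a given number of bits can hold.

```python
def distinct_values(bits: int) -> int:
    """Return how many different values fit in the given number of bits."""
    return 2 ** bits


if __name__ == "__main__":
    print(distinct_values(8))    # one byte: 256
    print(distinct_values(32))   # a 32-bit address: 4294967296
    print(distinct_values(64))   # a 64-bit address: 18446744073709551616
```

So 32 bits gives you a bit over four billion addresses, and 64 bits gives you far more than any computer has RAM for.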
That's why people started making 64-bit processors and operating systems, which address places in RAM with 64 bits instead of 32 bits. This way, you can have much more RAM. Then you get into a problem if you try to use 32-bit registers on a 64-bit system. Those registers hold 32 bits, and when you're addressing RAM, you're using 64 bits.
Normally, when you want to hold an address for a byte in RAM, you can put it into another place in RAM, but eventually, you'll have to put it in a register somewhere. A problem arises: if you want to put a 64-bit pointer into a 32-bit register, then you're out of luck.
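To see why you're out of luck, here's a rough Python sketch that models a 32-bit register as something that keeps only the low 32 bits of whatever you put in it. The function name store_in_32bit_register and the example address are made up for illustration; a real processor does this in hardware.

```python
MASK_32 = 0xFFFFFFFF  # 32 one-bits: everything a 32-bit register can hold


def store_in_32bit_register(value: int) -> int:
    """Model a 32-bit register: only the low 32 bits of value survive."""
    return value & MASK_32


if __name__ == "__main__":
    address = 0x00007FFF12345678  # hypothetical 64-bit address
    kept = store_in_32bit_register(address)
    print(hex(address))  # 0x7fff12345678
    print(hex(kept))     # 0x12345678, the upper bits are gone
```

What comes back out is a different address than the one you put in, so the pointer is useless.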
That's why GAS is also great, because it lets you write 64-bit code using the processor's 64-bit registers. That's why we're going to be using 64-bit GAS assembly in these tutorials.
At this point, I think you understand enough of the fundamentals to be ready to start programming a basic assembly application. Now, before we even get started programming, I'll have to say that making a "Hello, World!" application is harder than in most other languages, because each assembly instruction does so little. So that's not even going to be our first application. Our first application is going to be something that doesn't do anything.
Every GAS assembly application has to declare a main function and then define it. If you don't understand what functions are, you should learn another programming language before doing assembly, just so that you understand some of the terminology I am using.
In our main function, there are a few things you have to do. First, you have to run this instruction, which pushes the base pointer register onto the stack. Then, you have to move the stack pointer into the base pointer. By the way, semicolons are not necessary in GAS assembly.
This sets up something called the stack frame. I will explain this in another tutorial, but for now, just put it there. At the end of our main function, we have to run the leave instruction, which tears the stack frame back down.
So our code would normally go here. We're going to return a number. To understand this, you have to know that every command you ever run, every application, has a return value. So if I run ls, you get a directory listing. Now, let's say you want to see what ls returned; you do echo $?. This will show you zero. For a program like ours, this is really just whatever main left in the EAX register when it returned. Therefore, if an application puts zero in the EAX register and returns, you'll see zero, which by convention means it succeeded. There's a specific app called false which returns one instead.
So there we go. We're going to use the mov instruction to put zero into EAX. I will also point out you should not put any large numbers into EAX, because the shell only sees the lowest 8 bits of the return value, so anything above 255 wraps around.
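Here's a small Python sketch of return values, assuming a Unix-like system such as macOS or Linux. It doesn't use our assembly program. Instead, it starts a tiny child Python process that exits with a number we pick, then reads the exit status back, which is the same value echo $? would show. The helper name exit_status_of is made up for this example.

```python
import subprocess
import sys


def exit_status_of(code: int) -> int:
    """Run a child process that exits with `code`; return the status the parent sees."""
    child = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({code})"]
    )
    return child.returncode


if __name__ == "__main__":
    print(exit_status_of(0))    # 0, the usual "success" value
    print(exit_status_of(3))    # 3
    print(exit_status_of(259))  # 3 on Unix-like systems: only the low 8 bits survive
```

That last line is the kind of problem you run into with large numbers: 259 comes back as 3, so it's safest to stick to values from 0 to 255.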
Alright, so you'll see right here up above I do a movq, and here I do a movl. movq up here means move 64 bits, and movl means move 32. For instance, EAX is a 32-bit register. Now, on 32-bit systems, the two stack frame lines would be written with movl and 32-bit registers, but on a 64-bit system you have to use the registers that are in fact 64 bits for these. Every 64-bit register starts with an R in this case. That's why movq is up here, and movl is down here. movl moves 32 bits because EAX, like the other registers that start with an E, is 32 bits.
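As a quick reference, the letter on the end of a GAS mov instruction tells you how much data it moves. Here's a small Python lookup sketch of that. The helper name mov_width_bits is made up for this example, and the b and w suffixes (8 and 16 bits) aren't used in this tutorial but follow the same pattern.

```python
# Size suffixes used by GAS (AT&T syntax) on x86: the last letter picks the width.
SUFFIX_BITS = {"b": 8, "w": 16, "l": 32, "q": 64}


def mov_width_bits(mnemonic: str) -> int:
    """Return how many bits a suffixed mov such as 'movq' or 'movl' moves."""
    name = mnemonic.lower()
    # Only accept "mov" plus exactly one known size suffix.
    if len(name) != 4 or not name.startswith("mov") or name[3] not in SUFFIX_BITS:
        raise ValueError(f"not a suffixed mov instruction: {mnemonic!r}")
    return SUFFIX_BITS[name[3]]


if __name__ == "__main__":
    print(mov_width_bits("movq"))  # 64
    print(mov_width_bits("movl"))  # 32
```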
So if we want to run this application, we have to run the gcc command on it, then execute the a.out file. When we run this, we can just take a look at the return value with echo $?, and it's zero. Let's say we want to change it to be something else to make sure it works. How about three? Then we can run this application, and there we go, we have three.
So this is our first GAS program ever. The equivalent to this in C, just in case you're wondering, would be return 3 in the main function. So C is like assembly; it's just simpler and it does stack frames and stuff for you.
This is our first tutorial. I'll post this code in the description of the video. So thanks for watching, Mids And1. Subscribe, and goodbye!