x86 Assembly

Introduction

Assembly is a low-level programming language that serves as a “human-readable” representation of machine code. Its instructions map very closely onto machine code, which makes it difficult to understand at first. Most modern desktop and server processors use the x86-64 architecture. You will also encounter processors with ARM architectures, but for the sake of simplicity I will not be discussing them here.

The CPU

Your CPU fetches, decodes, and executes the instructions passed to it by a program loaded in memory. To do this it uses general purpose and special purpose registers, which are small, fast storage locations inside the CPU itself. On a 32-bit x86 CPU these registers hold 32 bits of data. On an x86_64 CPU they hold 64 bits of data.

General Purpose Registers - 32 Bit

EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI

Special Purpose Registers - 32 Bit

EIP - Instruction Pointer. Holds the address of the next instruction to execute.
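
To make the 32-bit register width concrete, here is a minimal sketch of what happens when a result no longer fits in EAX. It assumes 32-bit Linux and NASM. The build commands and the file name wrap.asm in the first comment are my own choices for illustration, not something the rest of this article depends on.

```nasm
; Assumed build steps: nasm -f elf32 wrap.asm -o wrap.o && ld -m elf_i386 wrap.o -o wrap
section .text
global _start

_start:
        mov     eax, 0xFFFFFFFF ; all 32 bits set: the largest value EAX can hold
        add     eax, 1          ; the result needs 33 bits, so EAX wraps around to 0
        mov     ebx, eax        ; first sys call argument: exit code (0 here)
        mov     eax, 1          ; system call number (sys_exit)
        int     0x80            ; call kernel
```

Running the program and then checking the shell's exit status (echo $? in most shells) should show 0, because the extra bit did not fit in the register.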

Syntax

Intel Syntax:

label: mnemonic operands ; comment

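Label - A name for the memory address of the instruction that follows it. Other instructions, such as jumps and calls, can refer to that address by name. Labels are optional.

Mnemonic - The name of the instruction, such as add or mov.

Operands - The registers, memory locations, or constants the instruction works on. In Intel syntax the destination comes first and the source second.

Comment - Everything after the semicolon is ignored by the assembler.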

Example
add     eax,ebx ; EAX = EAX + EBX
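
As a slightly larger sketch of the same syntax, the program below uses a label as a jump target to add the numbers 5 down to 1. It assumes 32-bit Linux and NASM. The label name add_next, the file name sum.asm, and the build commands are hypothetical choices of mine. It also uses two instructions not covered above: dec subtracts one from a register, and jnz jumps to a label while the previous result was not zero.

```nasm
; Assumed build steps: nasm -f elf32 sum.asm -o sum.o && ld -m elf_i386 sum.o -o sum
section .text
global _start

_start:
        mov     eax, 0          ; EAX = running total
        mov     ecx, 5          ; ECX = counter, counts down from 5
add_next:                       ; label: names the address of the next instruction
        add     eax, ecx        ; EAX = EAX + ECX
        dec     ecx             ; ECX = ECX - 1
        jnz     add_next        ; jump back to the label while ECX is not 0

        mov     ebx, eax        ; first sys call argument: exit code (5+4+3+2+1 = 15)
        mov     eax, 1          ; system call number (sys_exit)
        int     0x80            ; call kernel
```

The total is handed back as the exit code only because that is the easiest way to observe a register value without printing anything.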

AT&T Syntax

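label: mnemonic(suffix) (prefix)operand # comment

Label - Works the same as in Intel syntax.

Mnemonic - Works the same as in Intel syntax, but usually takes a suffix that specifies the size of the data you are handling: b (8 bits), w (16 bits), l (32 bits), or q (64 bits).

Operand - Registers are prefixed with % and constant values with $. The order is reversed compared to Intel syntax: the source comes first and the destination second.

Comment - The GNU assembler, the most common assembler for AT&T syntax, starts x86 comments with # rather than a semicolon.

Example
movl    %ebx,%eax # EAX = EBX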

The Stack

The stack is a region of memory set aside for your program. As data is added to the stack, the stack pointer is decremented. This means that as data is added, the address of the ‘top’ of the stack, held in ESP, moves to a lower memory address. The stack therefore grows ‘down’, away from EBP, the base pointer, which marks the base of the current stack frame.

push - decrements ESP and stores a value at the new top of the stack.

ret - returns control to the caller by popping the return address off the top of the stack into EIP, the instruction pointer.
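
The behaviour described above can be sketched in a small program. It assumes 32-bit Linux and NASM. The function name add_top, the file name stack.asm, and the build commands are hypothetical. It also uses call, which pushes the address of the following instruction onto the stack before jumping, so that ret has a return address to pop.

```nasm
; Assumed build steps: nasm -f elf32 stack.asm -o stack.o && ld -m elf_i386 stack.o -o stack
section .text
global _start

_start:
        mov     ebx, esp        ; EBX = ESP before anything is pushed
        push    dword 7         ; ESP moves down 4 bytes, then 7 is stored at [ESP]
        sub     ebx, esp        ; EBX = old ESP - new ESP = 4
        call    add_top         ; pushes the return address, then jumps to add_top
        add     esp, 4          ; discard the 7 by moving ESP back up
        mov     eax, 1          ; system call number (sys_exit)
        int     0x80            ; call kernel, exit code is EBX = 4 + 7 = 11

add_top:
        add     ebx, [esp + 4]  ; [ESP] holds the return address, [ESP + 4] holds the 7
        ret                     ; pops the return address into EIP
```

The exit code of 11 combines both observations: the 4 bytes that ESP moved down for one push, plus the pushed value read back from just above the return address.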

Assembly Program Sections

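_start - the program's entry point, roughly equivalent to the main() function in most languages

.text - contains the program's instructions, including any functions you define

.data - contains initialized data such as variables and constants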

section .text
global _start

_start:
        ; write our string to stdout.
        mov     edx,len                ; third argument: message length.
        mov     ecx,msg                ; second argument: pointer to message to write.
        mov     ebx,1                  ; first argument: file handle (stdout).
        mov     eax,4                  ; system call number (sys_write).
        int     0x80                   ; call kernel.

        ; and exit.
        mov     ebx,0                  ; first sys call argument: exit code.
        mov     eax,1                  ; system call number (sys_exit).
        int     0x80                   ; call kernel.

section .data
msg     db      "Hello, world!",0xa    ; the string to print.
len     equ     $ - msg                ; length of the string.

References