This is a quick primer / reference guide on the Intel instruction set architecture (ISA).


Prerequisite Knowledge

This section details a few fundamental concepts needed to get started with Intel assembly.

Terminology

Understanding binary and hexadecimal notation, as well as terms like “bit”, “byte” (8 bits) and “nibble” (4 bits), is key to understanding anything further related to Intel assembly.

Stack

The stack is a Last-In-First-Out (LIFO) data structure used for local variables, function parameters and assisting with program control flow. Learn more about the stack here.

A stack supports two primary instructions: push and pop. A push decrements the stack pointer and then places a value on the top of the stack, while a pop removes the value from the top of the stack, places it in a storage location (such as a register), and increments the stack pointer.

* The stack grows toward lower memory addresses (conventionally described as growing downward).

* A pop increments the ESP register by 4 bytes and a push decrements the ESP register by 4 bytes.
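The push/pop behavior described above can be modeled with a short Python sketch. This is only a toy model: the class name `Stack32` and the starting address are made up for illustration and do not correspond to any real tool.

```python
class Stack32:
    """Toy model of a 32-bit x86 stack (illustrative, not a real emulator)."""

    def __init__(self, esp=0x0012FF00):
        self.esp = esp        # stack pointer; the starting address is hypothetical
        self.memory = {}      # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                              # push decrements ESP by 4 first
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory[self.esp]              # read the value at the top of the stack
        self.esp += 4                              # then increment ESP by 4
        return value


stack = Stack32()
stack.push(0x11111111)
stack.push(0x22222222)
top = stack.pop()   # LIFO: the last value pushed is the first one popped
```

After the two pushes and one pop, `top` holds 0x22222222 and ESP sits 4 bytes below its starting value.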

Heap

The heap is a managed memory region which allows for dynamic allocation of memory during runtime. The heap is typically used for objects too big to be placed on the stack.

* In the classic process memory layout, the heap sits at lower addresses than the stack and grows upward (toward higher addresses), in the direction of the stack.

1 and 2’s Complement

Essentially, the 1’s complement of a binary number is calculated by flipping each bit. For example, the 1’s complement of value “0011” would be “1100”.

The 2’s complement of “0011” is calculated by flipping each bit as performed previously (to “1100”) and then adding 1 to this value, giving a final value of “1101”. As another example, the 2’s complement of “0000” is found by taking the 1’s complement “1111” and adding 1, which gives “10000”. The carry into the 5th bit does not fit in a 4-bit value and is discarded, leaving “0000”: the negation of zero is zero.

Additional Info

  • If the top bit is set (e.g., 0x80000000 for a 32-bit value) a signed value is negative.
  • A signed 32-bit number ranges from -2^31 up to 2^31 - 1 (-2,147,483,648 to 2,147,483,647).
  • Learn about 1 and 2’s complement here.
  • Resources for performing binary/hex arithmetic and conversions are included in the Tools and Resources section.
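As a rough illustration of the complement arithmetic above, the following Python sketch works on fixed-width values. The helper names are invented for this example, and the masking step models a register discarding any carry out of its top bit.

```python
BITS = 4
MASK = (1 << BITS) - 1            # 0b1111 for a 4-bit value


def ones_complement(x, mask=MASK):
    return ~x & mask              # flip every bit, keep only the low BITS bits


def twos_complement(x, mask=MASK):
    return (ones_complement(x, mask) + 1) & mask   # add 1, discard any carry out


def to_signed32(x):
    """Interpret a 32-bit pattern as a signed 2's complement integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x
```

For instance, `twos_complement(0b0011)` gives `0b1101`, and `to_signed32(0x80000000)` gives the most negative 32-bit value, -2^31.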

Intel Assembly Basics

Tools

  • A compiler takes high-level source code (like C) and generates assembly code.
  • An assembler takes assembly code and generates machine/object code.
  • A linker takes multiple relocatable object files and creates a single binary.
  • A loader loads an executable into memory at runtime.
  • A disassembler translates machine code back into assembly code.

Word Size

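  • A byte is the smallest addressable unit in the Intel architecture. (ex: 0xFF)
  • A WORD is 2 consecutive bytes (ex: 0xFFCC). This stems from the days of 16-bit systems, and in Intel terminology it holds even on 32-bit and 64-bit processors.
  • Generically, a machine “word” is the natural operand size of a processor, so the term is sometimes used loosely for 4 bytes (32 bits) on a 32-bit system or 8 bytes on a 64-bit system. Intel documentation and assemblers do not use WORD this way.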
  • A DWORD and QWORD are 4 consecutive bytes and 8 consecutive bytes respectively.

Syntax

Intel Syntax - The first operand is the destination and the second operand is the source. (ex: mov edx, ecx). This syntax is used throughout the Intel manuals and by NASM, and is the one used in this guide.

AT&T Syntax - The first operand is the source and the second operand is the destination. (ex: movl %ecx, %edx). Very recognizable by the percent signs (%) prefixing registers and the dollar signs ($) prefixing immediates, among other differences.

Endian-ness

Endianness refers to the order of bytes (usually in memory) of a binary number.

Consider a series of memory addresses 0x00, 0x01, 0x02 and 0x03 and consider a hex integer 0x41424344. To store this integer in the given memory addresses in a Little Endian format, it would be stored with the low-order bytes first - 0x44, 0x43, 0x42, 0x41 respectively in addresses 0x00, 0x01, 0x02 and 0x03. Big Endian would store the integer with the higher-order bytes first 0x41, 0x42, 0x43, 0x44 respectively in addresses 0x00, 0x01, 0x02 and 0x03.

  • Endianness comes into play when a value spans 2 or more consecutive bytes.
  • Big Endian is also known as “Network Byte Order”. (TCP/IP protocol headers carry multi-byte fields in Big Endian format)
  • No concept of endianness exists when it comes to values stored in a register.
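The byte orders described above can be checked with Python's standard `struct` module. This is a small sketch using the same 0x41424344 value; the variable names are arbitrary.

```python
import struct

value = 0x41424344

little = struct.pack("<I", value)   # "<" = little endian: low-order byte first
big = struct.pack(">I", value)      # ">" = big endian (network byte order)

# Reading the bytes back with the matching byte order recovers the integer.
restored = struct.unpack("<I", little)[0]
```

Here `little` holds the bytes 44 43 42 41 and `big` holds 41 42 43 44, matching the memory layouts described in the text.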

Prologue/Epilogue/Stack Frame

The stack frame is set up via the function prologue. (Example shown below)

push ebp
mov ebp, esp
sub esp, N

The prologue pushes the caller's base pointer onto the stack (via push ebp), copies the stack pointer into EBP (via mov ebp, esp), and then reserves N bytes for local variables (via sub esp, N). This is done so that local variables and arguments of the function can be referenced relative to EBP throughout its execution. Local variables are referenced at negative offsets from EBP (e.g., [ebp-4]), while arguments are referenced at positive offsets (e.g., [ebp+8] for the first argument, since [ebp+4] holds the return address).

The stack frame is destroyed via the function epilogue.

mov esp, ebp
pop ebp
ret

Calling Conventions

* A call instruction pushes a return address onto the top of the stack and jumps to the memory address referenced in the call instruction (by setting EIP to the call destination). The return address is the address of the instruction immediately following the call. For the common 5-byte near call (opcode E8 followed by a 4-byte relative offset), that is the address of the call plus 5 bytes.

* ret/retn (return) instruction (essentially) pops the top of the stack (the return address) into EIP and directs execution flow to it.

* retn [int] goes a step further and, after popping the return address, increments ESP by [int] bytes in order to clean up any stack parameters used during the respective function call.
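The call/ret mechanics above can be traced with a small Python model. This is a sketch under assumptions: the addresses are hypothetical, the function names are invented for illustration, and the call is assumed to be a 5-byte near call.

```python
def call(esp, memory, call_address, call_length, target):
    """Model `call target`: push the return address, then jump."""
    return_address = call_address + call_length   # address of the next instruction
    esp -= 4
    memory[esp] = return_address
    return esp, target                            # new ESP, new EIP


def ret(esp, memory, cleanup=0):
    """Model `ret` / `retn cleanup`: pop EIP, then skip `cleanup` bytes of arguments."""
    eip = memory[esp]
    esp += 4 + cleanup
    return esp, eip


memory = {}
esp = 0x1000                    # hypothetical stack pointer
esp -= 4
memory[esp] = 0x10101010        # push one 4-byte argument
esp, eip = call(esp, memory, call_address=0x401000, call_length=5, target=0x402000)
esp, eip = ret(esp, memory, cleanup=4)   # behaves like `retn 4`
```

With `cleanup=4` the callee removes the argument itself, so ESP ends where it started. With `cleanup=0` the caller would still need an `add esp, 4`.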

stdcall
With this convention, arguments of a function are pushed in reverse order (right to left) and the called function (callee) is responsible for cleaning up the stack afterward. In this convention, the retn [int] return instruction is used.

cdecl
With a cdecl call, the calling function is responsible for cleaning up the stack. This is typically done by using an add esp, int statement after the function has returned. (shown below)

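  • The cdecl advantage is that it allows for a variable number of arguments to a function.

function:
  push ebp          ; save the caller's base pointer
  pop ebp           ; restore it (the function body is empty)
  retn              ; return; the argument is left on the stack

push 10101010h      ; push the single 4-byte argument
call function
add esp, 4          ; the caller removes the 4-byte argument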

fastcall
This convention passes arguments in registers, since registers are faster to access than the stack (memory). In the Microsoft conventions, x86 fastcall passes the first two arguments in ecx and edx and the rest on the stack, while x64 passes the first four in rcx, rdx, r8 and r9. The callee cleans the stack in x86 (similar to stdcall), and in x64 the caller cleans the stack (similar to cdecl).

Assorted Assembly Knowledge

  • EAX generally contains the return value for function calls.
  • Some x86 instructions operate on 64-bit values; in these cases, the register pair EDX:EAX is typically used (EDX holds the high 32 bits).
  • In an IDIV instruction with a 32-bit operand, the 64-bit value EDX:EAX is divided by that operand (e.g., ECX in idiv ecx). The quotient is stored in EAX and the remainder is stored in EDX.
  • Jumps can be used as evidence of signed vs unsigned operations. ja, jae, jb and jbe are related to unsigned operations while jl, jle, jg and jge are related to signed operations.
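The IDIV and signed/unsigned points above can be illustrated in Python. This is a simplified sketch: the function names are invented, and the model ignores the divide error (#DE) that a real processor raises on division by zero or when the quotient does not fit in EAX.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a signed 2's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def idiv32(edx, eax, divisor):
    """Model `idiv r/m32`: signed EDX:EAX / divisor -> (EAX quotient, EDX remainder)."""
    dividend = to_signed((edx << 32) | eax, 64)
    d = to_signed(divisor, 32)
    quotient = abs(dividend) // abs(d)
    if (dividend < 0) != (d < 0):
        quotient = -quotient               # x86 truncates the quotient toward zero
    remainder = dividend - quotient * d    # remainder takes the sign of the dividend
    return quotient & 0xFFFFFFFF, remainder & 0xFFFFFFFF


# The same bit pattern compares differently depending on the interpretation.
a, b = 0xFFFFFFFF, 0x00000001
unsigned_above = a > b                               # condition tested by `ja` after `cmp a, b`
signed_less = to_signed(a, 32) < to_signed(b, 32)    # condition tested by `jl` after `cmp a, b`
```

Dividing -7 (EDX:EAX = 0xFFFFFFFF:0xFFFFFFF9) by 2 yields a quotient of -3 and a remainder of -1. The comparison shows why the choice of ja versus jl hints at whether the original source used unsigned or signed types.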

Registers

Registers are located on the CPU and are extremely fast to access.

EIP - The Extended Instruction Pointer (EIP), or program counter, is a reserved register that contains the memory address of the next instruction to execute. The 32-bit architecture does not allow this register to be read or written directly (e.g., with mov); it changes only through control-flow instructions such as jmp, call and ret.

General Purpose Registers (GPRs)

Encoding (binary) | Register | Purpose | Callee-Saved
000 | EAX | Typical return value and sometimes accumulator | No
001 | ECX | Counter register | No
010 | EDX | General purpose and sometimes extension to accumulator | No
011 | EBX | General purpose | Yes
100 | ESP | Stack pointer | Yes
101 | EBP | Base frame pointer register and used to build stack frame | Yes
110 | ESI | Source index register | Yes
111 | EDI | Destination index register | Yes


32-bit | Low-Order 16-bit | 8-bit (bits 8-15) | Low-Order 8-bit
EAX | AX | AH | AL
ECX | CX | CH | CL
EDX | DX | DH | DL
EBX | BX | BH | BL


Additional Info

  • The “E” in front of each register stands for “Extended” which is due to the carry over from older 16-bit architectures.
  • The low-order 16-bits of every general purpose register can be accessed by removing the “e” from the register name (e.g., ax, cx, dx, bx, sp, bp, si, di).
  • Only eax, ecx, ebx and edx can reference high/low 8-bits (e.g., ah/al, ch/cl, bh/bl, dh/dl respectively).
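The sub-register relationships above come down to masking and shifting, as this Python sketch shows. The value in `eax` is arbitrary and `write_al` is a helper invented for this example.

```python
eax = 0x12345678

ax = eax & 0xFFFF          # low-order 16 bits
al = eax & 0xFF            # low-order 8 bits
ah = (eax >> 8) & 0xFF     # bits 8-15


def write_al(eax, value):
    """Model a write to AL: only the low byte changes, the upper 24 bits are preserved."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)
```

For the value above, AX is 0x5678, AH is 0x56 and AL is 0x78. Writing 0xFF to AL would leave EAX as 0x123456FF.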

Segment Registers

  • CS - Code Segment Register - Maintains the Ring Level (0-3) in the Current Privilege Level (CPL) field.
  • DS - Data Segment Register
  • SS - Stack Segment Register
  • ES - Extra Data Segment Register
  • FS - Extra Segment Register
  • GS - Extra Segment Register

Other Registers

EFLAGS

The EFLAGS register is used to store status and execution states.

  • ZF/Zero flag - Set if the result of the previous arithmetic op is zero, otherwise it is cleared.
  • SF/Sign flag - Set when the result of an op is negative and cleared when positive; it is a copy of the most significant bit of the result.
  • CF/Carry flag - Set when the result of an op requires a carry out of, or a borrow into, the most significant bit (applies to unsigned numbers) because the result is too large/small for the destination.
  • OF/Overflow flag - Set if the result is too large or too small to fit in the destination as a signed value (applies to signed numbers).
  • TF/Trap flag - Used for debugging. When set, the processor raises a debug exception after each instruction, allowing a debugger to single-step.
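To make the status flags concrete, here is a Python sketch that computes ZF, SF, CF and OF for a 32-bit addition. The function name is invented for illustration, and only the `add` case is modeled.

```python
def add32_flags(a, b):
    """Return the 32-bit sum of a and b plus the resulting ZF, SF, CF and OF."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    full = a + b
    result = full & 0xFFFFFFFF
    flags = {
        "ZF": result == 0,                    # result is zero
        "SF": bool(result & 0x80000000),      # most significant bit of the result
        "CF": full > 0xFFFFFFFF,              # unsigned carry out of bit 31
        # Signed overflow: both inputs share a sign that the result does not.
        "OF": bool(~(a ^ b) & (a ^ result) & 0x80000000),
    }
    return result, flags
```

Adding 1 to 0xFFFFFFFF wraps to zero and sets ZF and CF but not OF, while adding 1 to 0x7FFFFFFF sets SF and OF but not CF. This shows that CF tracks unsigned overflow and OF tracks signed overflow.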

Control Registers

  • CR0 - Controls whether paging is on or off.
    • Bit 0 - Protected Mode Enabled
    • Bit 16 - Write-Protect (when set, CPU cannot write to read-only memory even in Ring 0)
    • Bit 31 - Enable Paging (allows CR3 to be used)
  • CR2 - Contains the linear address that caused a page fault.
  • CR3 - Also known as the Page Directory Base Register (PDBR). Contains the physical base address of the page directory. Used when paging (virtual addressing) is enabled.
  • CR4 - Contains flags that enable architectural extensions, including hardware virtualization (VMX).
    • Bit 5 - Physical Address Extensions (PAE) (extends 32-bit addressing to 36-bit)
    • Bit 20 - SMEP (Supervisor Mode Execution Prevention) which disallows Ring 0 from executing user mode memory.
    • Bit 21 - SMAP (Supervisor Mode Access Protection) disallows Ring 0 from accessing user mode memory.
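The bit positions listed above can be decoded with a few lines of Python. This is a sketch only: the register values are made-up examples rather than values read from a CPU, and the short flag names (PE, WP, PG) follow the abbreviations used in the Intel manuals.

```python
CR0_BITS = {0: "PE", 16: "WP", 31: "PG"}       # protected mode, write-protect, paging
CR4_BITS = {5: "PAE", 20: "SMEP", 21: "SMAP"}


def decode(value, names):
    """Return the names of the listed bits that are set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if (value >> bit) & 1]


cr0_flags = decode(0x80010001, CR0_BITS)              # hypothetical CR0 value
cr4_flags = decode((1 << 5) | (1 << 20), CR4_BITS)    # hypothetical CR4 value
```

For these example values, `cr0_flags` lists PE, WP and PG, and `cr4_flags` lists PAE and SMEP.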

Debug Registers

  • DR0 - DR3 - Contains linear address of memory location to be watched
  • DR4, DR5 - Aliases for DR6 and DR7
  • DR6 - Debug status register which reports the type of the last debug exception that occurred (execution/access/write). These bits must be cleared by the debugger, not the processor.
  • DR7 - Debug control register

Instructions

The Intel x86 ISA supports a wide variety of instructions. Detailed information on these instructions can be viewed via the Intel 64 and IA-32 Architectures Software Developer Manuals.

Intel instructions have a variable-length format. The general machine format is shown below. The parts of an instruction are further explained here.

PREFIX | OPCODE | MODR/M | SIB | DISPLACEMENT | IMMEDIATE

*Steps on how to code assembly instructions into their machine counterparts can be found here.

Additional Info

  • Instruction operands can be a register, an immediate (constant value) or a memory address.
  • A Label is an optional identifier followed by a colon.
  • A Mnemonic is a reserved name for the human-readable form of a machine instruction. (ex: add is the mnemonic for opcode 0x03).
  • Assembly instructions have the human-readable format: label: mnemonic operand1, operand2, operand3
  • Dereferencing memory is done in assembly using bracket [ ebx ] notation. This means memory is being accessed. In other words, when memory is dereferenced, you are reading/writing the value that is stored at a memory address rather than the memory address itself.

Instruction Classes

Simple - The mov instruction is a simple and oft-used instruction which moves data from one place to another.

Arithmetic - A multitude of arithmetic operations exist for addition, subtraction, etc… (ex: add, sub, inc, dec, mul, div, etc…).

NOP - The nop instruction does nothing; execution simply continues to the next instruction. (fun fact: a NOP, opcode 0x90, is really an xchg eax, eax.)

Stack - This includes instructions for moving data to and from the stack like push and pop.

Function - This includes instructions for calling and returning from functions (ex: call, ret, retn, etc…)

Conditionals - These instructions are for making comparisons. (ex: test, cmp, etc…)

Branching - Consisting of conditional and unconditional jumps, these instructions control flow of the program. (ex: jz, jnz, je, jg, and many many more…).

Rep - Prefixes that repeat a string instruction (such as movs, stos, cmps or scas) in order to manipulate data buffers. (ex: rep, repz, repne, etc…)

*This list of instructions is far from exhaustive. Reference the Intel manual for a complete list.

Instruction Anatomy

Prefix
More details to follow…

Opcode
1-3 byte value representing the machine code value for an instruction.

ModR/M
1-byte value which follows the opcode and identifies the addressing mode as well as the register/memory operands. Only some instructions require this byte. Instructions which require this byte will have the “ModRM” label in their respective Instruction Operand Encoding table.

MODR/M Byte Format

MOD | REG | R/M
2-bit addressing mode | 3-bit r32 operand or opcode extension | 3-bit register or memory operand
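Splitting a ModR/M byte into its three fields is a matter of shifting and masking, as in this Python sketch. The function name is invented for illustration.

```python
def decode_modrm(byte):
    """Split a ModR/M byte into its (MOD, REG, R/M) fields."""
    mod = (byte >> 6) & 0b11     # top 2 bits: addressing mode
    reg = (byte >> 3) & 0b111    # middle 3 bits: register operand or opcode extension
    rm = byte & 0b111            # low 3 bits: register or memory operand
    return mod, reg, rm
```

For example, the byte 0xD8 (11011000b) splits into MOD = 11b, REG = 011b and R/M = 000b.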


SIB
More details to follow…

Displacement
8, 16, or 32-bit number that represents a memory location or an offset from a memory location. (ex: mov dword [ecx + 0xAABBCCDD], 0x11223344)

Immediate
8, 16, or 32-bit value that is a literal number. (ex: 0xAABBCCDD in the instruction mov eax, 0xAABBCCDD)


Building an Instruction

The table below describes the various MODR/M addressing modes which are needed to build many types of instructions.

MODR/M Addressing Modes

MOD | Assembly | Explanation
00 | [r/m] | The r/m32 operand's memory address is located in the r/m register
00 | [disp32] | If MOD is 00 AND R/M is 101, the r/m32 operand is a memory location given by a 32-bit displacement only
01 | [r/m + byte] | The r/m32 operand's memory address is the value in the r/m register + a 1-byte displacement
10 | [r/m + dword] | The r/m32 operand's memory address is the value in the r/m register + a 4-byte displacement
11 | r/m | The r/m32 operand is a direct register access

Example 1

Take for example the instruction add eax, ebx.

[Figure: ADD instruction entry from the Intel manual]

[Figure: ADD Instruction Operand Encoding table]

  1. Find the ADD instruction in the Intel manual (shown above).
  2. Find in the Instruction column a form which takes an r/m32 operand and an r32 operand. In this case, the opcodes which match this description are “01 /r” (ADD r/m32, r32) and “03 /r” (ADD r32, r/m32).
  3. Checking the Instruction Operand Encoding table, we can match the Op/En for each of the opcodes found above with the entry in the table. For example, the 01 /r opcode has an Op/En of MR, which based on the Operand Encoding table makes Operand 1 the r/m and Operand 2 the reg.
  4. Since the MR row of the Operand Encoding table contains ModRM, we know that a MODR/M byte is required for this instruction.
  5. From the MODR/M addressing mode table we can see that, since the r/m32 operand is a direct register, the value for the first two bits of the MODR/M byte is 11.
  6. Since “01 /r” is ADD r/m32, r32, we know that the next three bits of the MODR/M byte are the reg field, which in this case is the second operand of the instruction, “ebx”, encoded as 011. The final three bits are the r/m field, which is the first operand, “eax”, encoded as “000”. (These encodings are found in the General Purpose Registers table.)
  7. Putting it all together, we have an opcode of 0x01 plus a MODR/M byte of 11011000b, which translates to 0x01 0xD8 and disassembles to add eax, ebx!
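The steps above can be sketched in Python for the register-to-register case. This is a minimal illustration that only handles the “01 /r” form with MOD = 11; the function and table names are made up for the example.

```python
# Register encodings from the General Purpose Registers table.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_add_reg_reg(dst, src):
    """Encode `add dst, src` for two 32-bit registers using opcode 01 /r (MR form)."""
    mod = 0b11                   # direct register access
    reg = REG32[src]             # operand 2 goes in the REG field
    rm = REG32[dst]              # operand 1 goes in the R/M field
    modrm = (mod << 6) | (reg << 3) | rm
    return bytes([0x01, modrm])


machine_code = encode_add_reg_reg("eax", "ebx")
```

Here `machine_code.hex()` evaluates to "01d8", matching the bytes derived by hand.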

Additional Info

  • r/m32 means you can use a register or memory.
  • r32 means you can only use a register.
  • An Intel instruction is of variable length and can be up to 15 bytes (120 bits).

Opcode Flags

  • NP - Indicates the use of 66/F2/F3 prefixes (beyond those already part of the instruction’s opcode) are not allowed with the instruction. Such use will either cause an invalid-opcode exception (#UD) or result in the encoding for a different instruction.
  • /digit - A digit between 0 and 7 indicates that the REG field (2nd field) of the ModR/M byte contains the 3-bit value (0-7) which provides an extension to the instruction’s opcode.
  • /r - Indicates that the REG field (2nd field) of the ModR/M byte contains the 3-bit r32 operand value.
  • cb, cw, cd, cp, co, ct - A 1-byte (cb), 2-byte (cw), 4-byte (cd), 6-byte (cp), 8-byte (co) or 10-byte (ct) value following the opcode. This value is used to specify a code offset and possibly a new value for the code segment register.
  • ib, iw, id, io - A 1-byte (ib), 2-byte (iw), 4-byte (id) or 8-byte (io) immediate operand to the instruction that follows the opcode, ModR/M bytes or scale-indexing bytes. The opcode determines if the operand is a signed value. All WORDs, DWORDs and QWORDs are given with the low-order byte first.
  • +rb, +rw, +rd, +ro - Indicates the lower 3 bits of the opcode byte are used to encode the register operand without a ModR/M byte. The instruction lists the corresponding hexadecimal value of the opcode byte with the low 3 bits as 000b. In non-64-bit mode, a register code, from 0 through 7, is added to the hexadecimal value of the opcode byte. In 64-bit mode, indicates the four-bit field of REX.b and opcode[2:0] field encodes the register operand of the instruction. “+ro” is applicable only in 64-bit mode.
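As an illustration of the +rd notation, PUSH r32 is listed in the Intel manual as “50+rd”, so the register code is simply added to 0x50. The Python sketch below assumes 32-bit mode, and the helper name is invented for this example.

```python
# Register codes 0 through 7, in encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def encode_push(reg):
    """Encode `push r32` (opcode 50+rd): register code added to the base opcode."""
    return bytes([0x50 + REG32.index(reg)])
```

For example, `encode_push("ebp")` gives the single byte 0x55 and `encode_push("edi")` gives 0x57. No ModR/M byte is involved.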

NASM Intro

NASM (Netwide Assembler) is a cross-platform assembler. Together with its companion disassembler ndisasm, it offers a quick way to assemble assembly code and disassemble machine code, respectively.

Below is an example of an assembly listing file. (Saved with a .s extension)

Assembly Listing File

[BITS 32]

push ebp
push edi
retn

my_first_label:
mov dword [eax], esp
push ebp
push edi
retn

jmp my_first_label

Running nasm file.s produces an assembled file named file. Running ndisasm -u file (-u selects 32-bit mode) then produces the disassembled assembly code shown below.

00000000  55                push ebp
00000001  57                push edi
00000002  C3                ret
00000003  8920              mov [eax],esp
00000005  55                push ebp
00000006  57                push edi
00000007  C3                ret
00000008  EBF9              jmp short 0x3
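The EBF9 bytes in the last line of the listing encode a short jump whose target is relative to the address of the following instruction. The Python sketch below shows the calculation; the function name is invented for this example.

```python
def short_jump_target(address, encoding):
    """Compute the destination of a 2-byte `jmp short` (opcode EB, signed 8-bit offset)."""
    opcode, rel8 = encoding
    if opcode != 0xEB:
        raise ValueError("not a short jmp")
    displacement = rel8 - 0x100 if rel8 & 0x80 else rel8   # sign-extend the offset
    return address + len(encoding) + displacement           # relative to the next instruction


target = short_jump_target(0x8, bytes.fromhex("EBF9"))
```

The offset 0xF9 is -7, and the next instruction would start at 0xA, so `target` is 0x3, the address of my_first_label in the listing.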

Tools and Resources