This is a quick primer / reference guide on the Intel instruction set architecture (ISA).
- Prerequisite Knowledge
- Intel Assembly Basics
- Building an Instruction
- NASM Intro
- Tools and Resources
- Calling Conventions
- Assorted Assembly Knowledge
- ModR/M Byte Format
- ModR/M Addressing Modes
- Opcode Flags
This section details a few fundamental concepts needed to get started with Intel assembly.
The stack is a Last-In-First-Out (LIFO) data structure used for local variables, function parameters and assisting with program control flow. Learn more about the stack here.
A stack supports two primary instructions: push and pop. A push decrements the stack pointer and then places a value on the top of the stack, while a pop copies the value at the top of the stack into a storage location (such as a register) and then increments the stack pointer.
* The stack grows toward lower memory addresses (conventionally described as growing downward).
* In 32-bit code, a push decrements the ESP register by 4 bytes and a pop increments the ESP register by 4 bytes.
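As a rough illustration of the push/pop behavior described above, the Python sketch below models a 32-bit stack. The `ToyStack` class, its starting ESP value and the dictionary-backed memory are hypothetical names chosen for this example; it is a simplified model, not a description of how a CPU is implemented.

```python
class ToyStack:
    """Simplified model of a 32-bit stack (illustrative only)."""

    def __init__(self, esp=0x0012FF00):
        self.esp = esp      # hypothetical starting stack pointer
        self.mem = {}       # sparse memory: address -> byte value

    def push(self, value):
        self.esp -= 4       # push decrements ESP by 4 ...
        data = (value & 0xFFFFFFFF).to_bytes(4, "little")
        for offset, byte in enumerate(data):
            self.mem[self.esp + offset] = byte   # ... then stores the value

    def pop(self):
        data = bytes(self.mem[self.esp + i] for i in range(4))
        self.esp += 4       # pop reads the value, then increments ESP by 4
        return int.from_bytes(data, "little")


stack = ToyStack()
stack.push(0x11111111)
stack.push(0x22222222)
print(hex(stack.esp))    # 0x12fef8 (two pushes moved ESP down by 8)
print(hex(stack.pop()))  # 0x22222222 (last in, first out)
print(hex(stack.pop()))  # 0x11111111
```

Note how the most recently pushed value lives at the lowest address, which is what "growing toward lower memory" means in practice.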
The heap is a managed memory region which allows for dynamic allocation of memory during runtime. The heap is typically used for objects too big to be placed on the stack.
* In the classic process layout, the heap sits at lower memory addresses than the stack and grows upward, toward the stack.
1 and 2’s Complement
Essentially, the 1’s complement of a binary number is calculated by flipping each bit. For example, the 1’s complement of value “0011” would be “1100”.
The 2’s complement of “0011” is calculated by flipping each bit as performed previously (to “1100”) and then adding 1 to this value, thus getting a final value of “1101”. As another example, if you have the number “0000” and you take the 2’s complement, you will get “1111” as the 1’s complement, then add 1, thus getting “10000”. Because the value is only 4 bits wide, the carry into the 5th bit is discarded, leaving “0000”: the 2’s complement of zero is zero.
- For a signed value, if the top bit is set (i.e. 0x80000000 for 32 bits) the value is negative.
- Given a signed 32-bit number, our range is -2^31 (-2,147,483,648) up to 2^31 - 1 (2,147,483,647).
- Learn about 1 and 2’s complement here.
- Resources for performing binary/hex arithmetic and conversions are included in the Tools and Resources section.
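The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for this example, and the bit width is passed explicitly because Python integers are not fixed-width.

```python
def ones_complement(value, bits=4):
    """Flip every bit of value within the given width."""
    mask = (1 << bits) - 1
    return ~value & mask


def twos_complement(value, bits=4):
    """Flip every bit, add 1, and discard any carry out of the width."""
    mask = (1 << bits) - 1
    return (ones_complement(value, bits) + 1) & mask


def to_signed(value, bits=32):
    """Interpret an unsigned bit pattern as a signed 2's complement number."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


print(format(ones_complement(0b0011), "04b"))  # 1100
print(format(twos_complement(0b0011), "04b"))  # 1101
print(format(twos_complement(0b0000), "04b"))  # 0000 (carry discarded)
print(to_signed(0x80000000))                   # -2147483648
print(to_signed(0x7FFFFFFF))                   # 2147483647
```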
Intel Assembly Basics
- Compiler is used to take high level source code (like C) and generate assembly code.
- Assembler takes assembly code and generates machine/object code.
- Linker takes multiple relocatable object codes and creates a single binary.
- Loader loads an executable at runtime.
- Disassembler reverses machine code back into assembly code.
- A byte is the smallest addressable unit in the Intel architecture. (ex: 0xFF)
- A WORD is 2 consecutive bytes (ex: 0xFFCC). (This stems from the days of 16-bit systems.)
- Generically, a machine "word" is the processor's natural size, so a word can be considered 4 bytes (32 bits) on a 32-bit system and 8 bytes on a 64-bit system. Intel documentation and assemblers, however, always use WORD to mean 2 bytes.
- A DWORD and QWORD are 4 consecutive bytes and 8 consecutive bytes respectively.
Intel Syntax - The first operand is the destination and second operand is the source. (ex: mov edx, ecx). This syntax is far more prevalent.
AT&T Syntax - The first operand is the source and the second operand is the destination. (ex: movl %ecx, %edx). Very recognizable by the percent signs (%) in front of register names and the dollar signs ($) in front of immediates, among other differences.
Endianness refers to the order of bytes (usually in memory) of a binary number.
Consider a series of memory addresses 0x00, 0x01, 0x02 and 0x03 and consider a hex integer 0x41424344. To store this integer in the given memory addresses in a Little Endian format, it would be stored with the low-order bytes first - 0x44, 0x43, 0x42, 0x41 respectively in addresses 0x00, 0x01, 0x02 and 0x03. Big Endian would store the integer with the higher-order bytes first 0x41, 0x42, 0x43, 0x44 respectively in addresses 0x00, 0x01, 0x02 and 0x03.
- Endianness comes into play when there are 2 or more consecutive bytes.
- Big Endian is also known as “Network Byte Order”. (Multi-byte fields in TCP/IP headers are sent in Big Endian format.)
- No concept of endianness exists when it comes to values stored in a register.
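To make the byte ordering concrete, the short Python sketch below lays out the same integer (0x41424344) in both orders. `store_dword` is a hypothetical helper for this example, built on Python's `int.to_bytes`.

```python
def store_dword(value, byteorder):
    """Return the 4 bytes of value in the order they occupy addresses 0x00..0x03."""
    return value.to_bytes(4, byteorder)


value = 0x41424344
for order in ("little", "big"):
    layout = store_dword(value, order)
    print(order, [hex(b) for b in layout])
# little ['0x44', '0x43', '0x42', '0x41']
# big ['0x41', '0x42', '0x43', '0x44']
```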
The stack frame is set up via the function prologue. (Example shown below)
    push ebp
    mov ebp, esp
    sub esp, N
The function prologue pushes the caller's base pointer onto the stack (via push ebp), copies the stack pointer into EBP (via mov ebp, esp) and then reserves N bytes for local variables (via sub esp, N). This is done so that local variables and arguments of that function can be referenced relative to EBP throughout the execution of the function. Local variables sit at negative offsets from EBP (e.g., [ebp-4]) while arguments sit at positive offsets (e.g., [ebp+8] for the first argument).
The stack frame is destroyed via the function epilogue.
    mov esp, ebp
    pop ebp
    ret
* A call instruction pushes a return address onto the top of the stack and jumps to the call destination (by setting EIP to it). The return address is the address of the instruction immediately following the call; for the common 5-byte near call (opcode E8 followed by a 32-bit offset), that is the address of the call instruction plus 5 bytes.
* ret/retn (return) instruction (essentially) pops the top of the stack (the return address) into EIP and directs execution flow to it.
* retn [int] goes a step further and increments ESP [int] bytes in order to clean up any stack parameters used during the respective function call.
With the stdcall convention, arguments of a function are pushed in reverse order and the called function (callee) is responsible for cleaning up the stack afterwards. In this convention, the retn [int] return instruction is used.
With a cdecl call, the calling function is responsible for cleaning up the stack. This is typically done by using an add esp, int statement after the function has returned. (shown below)
- The cdecl advantage is that it allows for a variable amount of arguments to a function.
    function:
        push ebp
        pop ebp
        retn

    push 10101010h      ; caller pushes the 4-byte argument
    call function
    add esp, 4          ; caller removes the argument after the call returns
The fastcall convention passes the first arguments in registers, since registers are faster to access than the stack (memory). On 32-bit x86 the first two arguments go in ecx and edx and the rest go on the stack; the Microsoft x64 convention passes the first four in rcx, rdx, r8 and r9. On x86 the callee cleans the stack (similar to stdcall), while on x64 the caller cleans the stack (similar to cdecl).
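The difference between callee cleanup (stdcall) and caller cleanup (cdecl) can be sketched by tracking ESP across a call. The Python below is a toy model under the assumption that every argument is 4 bytes and passed on the stack; `trace_call` is a hypothetical name for this example.

```python
def trace_call(esp, arg_count, convention):
    """Return (ESP right after the return instruction, ESP after all cleanup)."""
    arg_bytes = 4 * arg_count
    esp -= arg_bytes              # caller pushes the arguments
    esp -= 4                      # call pushes the return address
    if convention == "stdcall":
        esp += 4 + arg_bytes      # retn arg_bytes: pop return address, skip arguments
        after_return = esp
    elif convention == "cdecl":
        esp += 4                  # ret: pop the return address only
        after_return = esp
        esp += arg_bytes          # caller then runs: add esp, arg_bytes
    else:
        raise ValueError("unknown convention: " + convention)
    return after_return, esp


print([hex(v) for v in trace_call(0x1000, 1, "stdcall")])  # ['0x1000', '0x1000']
print([hex(v) for v in trace_call(0x1000, 1, "cdecl")])    # ['0xffc', '0x1000']
```

In the cdecl case ESP is still 4 bytes short immediately after the return, which is why the caller needs the add esp, 4 instruction.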
Assorted Assembly Knowledge
- EAX generally contains the return value for function calls.
- Some x86 instructions need to work with 64-bit values; in these cases, the register pair EDX:EAX is typically used (EDX holds the high 32 bits).
- In a 32-bit IDIV instruction, the 64-bit value in EDX:EAX is divided by the instruction's operand (e.g., ECX for idiv ecx). The quotient is stored in EAX and the remainder is stored in EDX.
- Jumps can be used as evidence of signed vs unsigned operations. ja, jae, jb and jbe are related to unsigned operations while jl, jle, jg and jge are related to signed operations.
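The EDX:EAX division and the signed/unsigned distinction noted above can be sketched in Python. This is an illustrative model only: `to_signed` and `idiv32` are hypothetical helper names, and the Python exceptions stand in for the CPU's divide-error exception.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a signed 2's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def idiv32(edx, eax, divisor):
    """Model of a 32-bit IDIV: returns (EAX, EDX) = (quotient, remainder)."""
    dividend = to_signed((edx << 32) | eax, 64)
    d = to_signed(divisor, 32)
    if d == 0:
        raise ZeroDivisionError("divide error")
    # IDIV truncates toward zero, while Python's // rounds down.
    quotient = abs(dividend) // abs(d)
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    if not -2**31 <= quotient <= 2**31 - 1:
        raise OverflowError("quotient does not fit in EAX")
    return quotient & 0xFFFFFFFF, remainder & 0xFFFFFFFF


print(idiv32(0, 7, 2))                                        # (3, 1)
print([hex(v) for v in idiv32(0xFFFFFFFF, 0xFFFFFFF9, 2)])    # ['0xfffffffd', '0xffffffff']

# The same bit pattern compares differently as unsigned (ja/jb) vs signed (jg/jl).
value = 0xFFFFFFFF
print(value > 1)                  # True  (unsigned: "above")
print(to_signed(value, 32) > 1)   # False (signed: -1 is "less")
```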
Registers are located on the CPU and are extremely fast to access.
EIP - The Extended Instruction Pointer (EIP), or program counter, is a reserved register that contains a pointer to the memory location of the next instruction to be executed. The 32-bit architecture does not allow direct access to this register.
General Purpose Registers (GPRs)
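|Encoding||Register||Description||Preserved across calls|
|000||EAX||Typical return value and sometimes accumulator||No|
|001||ECX||General purpose and sometimes counter||No|
|010||EDX||General purpose and sometimes extension to accumulator||No|
|011||EBX||General purpose||Yes|
|100||ESP||Stack pointer register||Yes|
|101||EBP||Base frame pointer register and used to build stack frame||Yes|
|110||ESI||Source index register||Yes|
|111||EDI||Destination index register||Yes|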
|32-bit||Low-order 16-bit||8-bit (bits 8-15)||Low-order 8-bit|
|EAX||AX||AH||AL|
|ECX||CX||CH||CL|
|EDX||DX||DH||DL|
|EBX||BX||BH||BL|
- The “E” in front of each register stands for “Extended” which is due to the carry over from older 16-bit architectures.
- The low-order 16-bits of every general purpose register can be accessed by removing the “e” from the register name (e.g., ax, cx, dx, bx, sp, bp, si, di).
- Only eax, ecx, ebx and edx can reference high/low 8-bits (e.g., ah/al, ch/cl, bh/bl, dh/dl respectively).
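The sub-register names above are just views onto parts of the same 32-bit register. A minimal Python sketch of that relationship, using masks and shifts (`subregisters` is a hypothetical helper name):

```python
def subregisters(eax):
    """Split a 32-bit EAX value into the AX, AH and AL views."""
    return {
        "ax": eax & 0xFFFF,        # low-order 16 bits
        "ah": (eax >> 8) & 0xFF,   # bits 8-15
        "al": eax & 0xFF,          # low-order 8 bits
    }


views = subregisters(0xAABBCCDD)
print({name: hex(v) for name, v in views.items()})
# {'ax': '0xccdd', 'ah': '0xcc', 'al': '0xdd'}
```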
- CS - Code Segment Register - Maintains the Ring Level (0-3) in the Current Privilege Level (CPL) field.
- DS - Data Segment Register
- SS - Stack Segment Register
- ES - Extra Data Segment Register.
- GS - Extra Segment Register
EFLAGS register is used to store status and execution states.
- ZF/Zero flag - Set if the result of the previous arithmetic op is zero, otherwise it is cleared.
- SF/Sign flag - Set to the most significant bit of the result, so it is set when the result of an op is negative (as a signed value) and cleared when it is non-negative.
- CF/Carry flag - Set when the result of an op requires a carry out of, or a borrow into, the most significant bit (applies to unsigned numbers) because the result is too large/small for the destination.
- OF/Overflow flag - Set if the result does not fit in the destination when interpreted as a signed number.
- TF/Trap flag - Used for debugging. When this flag is set, the CPU raises a debug exception after each instruction, which lets a debugger single-step.
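As a hedged illustration of how the status flags relate to a result, the Python sketch below computes ZF, SF, CF and OF for a 32-bit addition. `add32_flags` is a hypothetical function for this example and models only the add instruction.

```python
def add32_flags(a, b):
    """Return (32-bit result, flags) for a + b, modeling a 32-bit add."""
    mask = 0xFFFFFFFF
    a &= mask
    b &= mask
    full = a + b
    result = full & mask
    flags = {
        "ZF": result == 0,                   # result is zero
        "SF": bool(result & 0x80000000),     # copy of the most significant bit
        "CF": full > mask,                   # unsigned carry out of bit 31
        # Signed overflow: operands share a sign that the result does not.
        "OF": bool(~(a ^ b) & (a ^ result) & 0x80000000),
    }
    return result, flags


result, flags = add32_flags(0xFFFFFFFF, 1)
print(hex(result), flags)
# 0x0 {'ZF': True, 'SF': False, 'CF': True, 'OF': False}

result, flags = add32_flags(0x7FFFFFFF, 1)
print(hex(result), flags)
# 0x80000000 {'ZF': False, 'SF': True, 'CF': False, 'OF': True}
```

The first case overflows as an unsigned value (CF) but not as a signed one (-1 + 1 = 0); the second is the reverse.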
- CR0 - Controls processor operating modes, including whether paging is on or off.
  - Bit 0 - Protected Mode Enabled
  - Bit 16 - Write-Protect (when set, CPU cannot write to read-only memory even in Ring 0)
  - Bit 31 - Enable Paging (allows CR3 to be used)
- CR2 - Contains the linear address that caused a page fault.
- CR3 - Contains the physical base address of the page directory; CR3 is also known as the Page Directory Base Register (PDBR). Used when virtual addressing (paging) is enabled.
- CR4 - Enables various architectural extensions, including hardware virtualization (VMX).
  - Bit 5 - Physical Address Extensions (PAE) (extends 32-bit addressing to 36-bit)
  - Bit 20 - SMEP (Supervisor Mode Execution Prevention) which disallows Ring 0 from executing user mode memory.
  - Bit 21 - SMAP (Supervisor Mode Access Protection) which disallows Ring 0 from accessing user mode memory.
- DR0 - DR3 - Contains linear address of memory location to be watched
- DR4, DR5 - Aliases for DR6 and DR7
- DR6 - Debug status register which contains type of last exception occurred (execution/access/write). These bits must be cleared by debugger, not processor.
- DR7 - Debug control register
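When a debugger shows a raw control register value, the bit positions listed above for CR0 and CR4 can be decoded with simple masks. A small Python sketch follows; the dictionaries cover only the bits mentioned in this guide, and the short names PE, WP and PG are Intel's abbreviations for the CR0 bits described above.

```python
CR0_BITS = {0: "PE", 16: "WP", 31: "PG"}
CR4_BITS = {5: "PAE", 20: "SMEP", 21: "SMAP"}


def decode_bits(value, names):
    """List the named bits that are set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & (1 << bit)]


print(decode_bits(0x80010001, CR0_BITS))           # ['PE', 'WP', 'PG']
print(decode_bits((1 << 5) | (1 << 20), CR4_BITS)) # ['PAE', 'SMEP']
```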
The Intel x86 ISA supports a wide variety of instructions. Detailed information on these instructions can be viewed via the Intel 64 and IA-32 Architectures Software Developer Manuals.
Intel instructions have a variable-length format; the general machine format is shown below. The parts of an instruction are further explained here.
PREFIX | OPCODE | MODR/M | SIB | DISPLACEMENT | IMMEDIATE
*Steps on how to code assembly instructions into their machine counterparts can be found here.
- Instruction operands can be a register, an immediate (constant value) or a memory address.
- A Label is an optional identifier followed by a colon.
- A Mnemonic is a reserved name for the human-readable form of a machine instruction. (ex: opcode 0x03 is add).
- Assembly instructions have the human-readable format: label: mnemonic operand1, operand2, operand3
- Dereferencing memory is done in assembly using bracket [ ebx ] notation. This means memory is being accessed. In other words, when memory is dereferenced, you are reading/writing the value that is stored at a memory address rather than the memory address itself.
Simple - The mov instruction is a simple and oft-used instruction which moves data from one place to another.
Arithmetic - A multitude of arithmetic operations exist for addition, subtraction, etc… (ex: add, sub, inc, dec, mul, div, etc…).
NOP - The nop instruction does nothing; execution simply continues to the next instruction. (fun fact: the one-byte NOP, opcode 0x90, shares its encoding with xchg eax, eax.)
Stack - This includes instructions for moving data to and from the stack like push and pop.
Function - This includes instructions for calling and returning from functions (ex: call, ret, retn, etc…)
Conditionals - These instructions are for making comparisons. (ex: test, cmp, etc…)
Branching - Consisting of conditional and unconditional jumps, these instructions control flow of the program. (ex: jz, jnz, je, jg, and many many more…).
Rep - Prefixes that repeat string instructions, used for manipulating data buffers. (ex: rep, repz, repne, etc…)
*This list of instructions is far from exhaustive. Reference the Intel manual for a complete list.
Prefix - More details to follow…
Opcode - A 1-3 byte value representing the machine code value for an instruction.
ModR/M - A 1-byte value which follows the opcode and identifies the addressing mode as well as the register/memory operands. Only some instructions require this byte. Instructions which require this byte will have the “ModRM” label in their respective Instruction Operand Encoding table.
MODR/M Byte Format
|2-bit Addressing Mode||3-bit r32 operand or opcode extension||3-bit register or memory operand|
SIB - More details to follow…
Displacement - An 8, 16, or 32-bit number that represents a memory location or an offset from a memory location. (ex: 0xAABBCCDD in the instruction mov dword [ecx + 0xAABBCCDD], 0x11223344)
Immediate - An 8, 16, or 32-bit value that is a literal number. (ex: 0xAABBCCDD in the instruction mov eax, 0xAABBCCDD)
Building an Instruction
The table below describes the various MODR/M addressing modes which are needed to build many types of instructions.
MODR/M Addressing Modes
|00||[r/m]||r/m32 operand memory address is located in the r/m register|
|00||[disp32]||if MOD is 00 AND R/M is 101, this indicates the r/m32 location is a memory location that is a displacement32 only|
|01||[r/m32 + byte]||r/m32 operand memory address is located in the r/m register + a 1-byte displacement|
|10||[r/m + dword]||r/m32 operand memory address is in the r/m register + a 4-byte displacement|
|11||r/m||r/m32 operand is a direct register access|
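The table above can be turned into a small decoder. The Python sketch below splits a ModR/M byte into its three fields and describes the r/m operand for 32-bit addressing. The function names are hypothetical, and the R/M = 100 case (a SIB byte follows) is not covered by the table above; it is included here as an additional fact about the encoding.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm(byte):
    """Split a ModR/M byte into (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11    # 2-bit addressing mode
    reg = (byte >> 3) & 0b111   # 3-bit r32 operand or opcode extension
    rm = byte & 0b111           # 3-bit register or memory operand
    return mod, reg, rm


def describe_rm(mod, rm):
    """Describe the r/m operand for 32-bit addressing."""
    if mod == 0b11:
        return REG32[rm]                   # direct register access
    if rm == 0b100:
        return "SIB byte follows"          # not covered in the table above
    if mod == 0b00 and rm == 0b101:
        return "[disp32]"                  # displacement-only special case
    base = REG32[rm]
    forms = {0b00: "[%s]", 0b01: "[%s + disp8]", 0b10: "[%s + disp32]"}
    return forms[mod] % base


print(decode_modrm(0xD8), describe_rm(0b11, 0b000))   # (3, 3, 0) eax
print(decode_modrm(0x20), describe_rm(0b00, 0b000))   # (0, 4, 0) [eax]
```

The second byte, 0x20, is the ModR/M byte of mov [eax], esp: MOD 00 with R/M 000 gives [eax], and REG 100 is esp.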
Take for example the instruction add eax, ebx
- Find the ADD instruction in Intel manual (shown above.)
- Find in the Instruction column an instruction which takes an r/m32 operand and an r32 operand. In this case, the opcodes which match this description are “01 /r” (ADD r/m32, r32) and “03 /r” (ADD r32, r/m32).
- Checking the Instruction Operand Encoding table we can match the Op/En for each of the opcodes found above with the entry in the table. For example, the 01 /r opcode has an Op/En of MR which based on the Operand Encoding Table would make Operand 1 the r/m and Operand 2 the reg.
- Since from the Operand Encoding table we can see ModRM in the MR row we know that a MODR/M byte is required for this instruction.
- From the MODR/M addressing mode table we can see that since the r/m32 operand is a direct register, the value for the first two bits of the MODR/M byte is 11.
- Since “01 /r” is ADD r/m32, r32, we know that the next three bits of the MODR/M byte are the reg field, which in this case is the second operand of the instruction, “ebx”, encoded as “011”. The final three bits are the r/m field, which is the first operand, “eax”, encoded as “000”. (These encodings are found in the General Purpose Registers table)
- Putting it all together, we have an opcode of 0x01 plus a MODR/M byte of 11011000b, which translates to 0x01 0xD8, which when disassembled translates to add eax, ebx!
- r/m32 means you can use a register or memory.
- r32 means you can only use a register.
- An Intel instruction is of variable length and can be up to 15 bytes (120 bits).
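The encoding steps above can be reproduced in a few lines of Python. This is a minimal sketch limited to the register-to-register forms of ADD (MOD = 11); the function names are hypothetical.

```python
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_add_rm32_r32(dst, src):
    """ADD r/m32, r32 (opcode 01 /r): dst goes in r/m, src goes in reg."""
    modrm = (0b11 << 6) | (REG32[src] << 3) | REG32[dst]
    return bytes([0x01, modrm])


def encode_add_r32_rm32(dst, src):
    """ADD r32, r/m32 (opcode 03 /r): dst goes in reg, src goes in r/m."""
    modrm = (0b11 << 6) | (REG32[dst] << 3) | REG32[src]
    return bytes([0x03, modrm])


print(encode_add_rm32_r32("eax", "ebx").hex())  # 01d8
print(encode_add_r32_rm32("eax", "ebx").hex())  # 03c3
```

Both byte sequences perform add eax, ebx; an assembler simply picks one of the two valid encodings.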
Opcode Flags
- NP - Indicates the use of 66/F2/F3 prefixes (beyond those already part of the instruction’s opcode) are not allowed with the instruction. Such use will either cause an invalid-opcode exception (#UD) or result in the encoding for a different instruction.
- /digit - A digit between 0 and 7 indicates that the REG field (2nd field) of the ModR/M byte contains the 3-bit value (0-7) which provides an extension to the instruction’s opcode.
- /r - Indicates that the REG field (2nd field) of the ModR/M byte contains the 3-bit r32 operand value.
- cb, cw, cd, cp, co, ct - A 1-byte (cb), 2-byte (cw), 4-byte (cd), 6-byte (cp), 8-byte (co) or 10-byte (ct) value following the opcode. This value is used to specify a code offset and possibly a new value for the code segment register.
- ib, iw, id, io - A 1-byte (ib), 2-byte (iw), 4-byte (id) or 8-byte (io) immediate operand to the instruction that follows the opcode, ModR/M bytes or scale-indexing bytes. The opcode determines if the operand is a signed value. All WORDs, DWORDs and QWORDs are given with the low-order byte first.
- +rb, +rw, +rd, +ro - Indicates the lower 3 bits of the opcode byte are used to encode the register operand without a ModR/M byte. The instruction lists the corresponding hexadecimal value of the opcode byte with the low 3 bits as 000b. In non-64-bit mode, a register code, from 0 through 7, is added to the hexadecimal value of the opcode byte. In 64-bit mode, it indicates that the four-bit field formed by REX.b and opcode[2:0] encodes the register operand of the instruction. “+ro” is applicable only in 64-bit mode.
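As an illustration of the +rd and id notation, the Python sketch below encodes two instructions that the Intel manual lists as 50+rd (PUSH r32) and B8+rd id (MOV r32, imm32). The helper names are hypothetical and only 32-bit mode is modeled.

```python
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_push_r32(reg):
    """PUSH r32 is 50+rd: the register code is added to the opcode byte."""
    return bytes([0x50 + REG32[reg]])


def encode_mov_r32_imm32(reg, imm):
    """MOV r32, imm32 is B8+rd id: opcode plus a 4-byte little-endian immediate."""
    return bytes([0xB8 + REG32[reg]]) + (imm & 0xFFFFFFFF).to_bytes(4, "little")


print(encode_push_r32("ebp").hex())                    # 55
print(encode_push_r32("edi").hex())                    # 57
print(encode_mov_r32_imm32("eax", 0xAABBCCDD).hex())   # b8ddccbbaa
```

No ModR/M byte appears in any of these encodings, and the immediate is stored low-order byte first.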
NASM (Netwide Assembler) is a cross-platform assembler. Together with its companion disassembler, ndisasm, it is a quick way to assemble assembly code and disassemble machine code.
Below is an example of an assembly listing file. (Saved with a .s extension)
Assembly Listing File
    [BITS 32]
    push ebp
    push edi
    retn
    my_first_label:
    mov dword [eax], esp
    push ebp
    push edi
    retn
    jmp my_first_label
Running nasm file.s produces an assembled file. Running ndisasm -u file on that output prints the disassembled assembly code as shown below.
    00000000  55        push ebp
    00000001  57        push edi
    00000002  C3        ret
    00000003  8920      mov [eax],esp
    00000005  55        push ebp
    00000006  57        push edi
    00000007  C3        ret
    00000008  EBF9      jmp short 0x
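The bytes EB F9 in the last line of the listing are a short jump: opcode EB followed by a signed 8-bit offset that is relative to the address of the next instruction. A small Python sketch of how a disassembler could resolve such a target (`short_jump_target` is a hypothetical helper name):

```python
def short_jump_target(address, encoding):
    """Resolve the target of a 2-byte jmp short (EB rel8) located at address."""
    if len(encoding) != 2 or encoding[0] != 0xEB:
        raise ValueError("expected a 2-byte EB rel8 encoding")
    rel8 = encoding[1] - 0x100 if encoding[1] & 0x80 else encoding[1]
    return address + len(encoding) + rel8   # relative to the next instruction


print(hex(short_jump_target(0x8, bytes([0xEB, 0xF9]))))  # 0x3
```

Offset 0x3 is where the mov [eax],esp instruction sits in the listing, which is the instruction right after my_first_label in the source file.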
Tools and Resources
- Binary Two’s Complement Converter
- Binary to Hex Converter
- Binary/Hex Calculator
- Binary Floating Point Converter
- Another Binary Floating Point Converter
- Godbolt/Compiler Explorer - Online Compiler for a Variety of Source Languages into Assembly
- Defuse Assembler & Disassembler - Assemble/Disassemble Arbitrary Instructions/Machine Code
- Disasm.pro - Online Assembler/Disassembler
- JUMP Reference
- Data Types in C
- CyberChef - Analyze and Decode Data
- Terminus Project - Diff of Windows Structures Gathered from NTDLL PDBs
- NASM Download
- Intel 64 and IA-32 Architectures Software Developer Manuals