libasm
This is my second project in assembly.
It is a from-scratch reimplementation of libc-inspired utilities.
It's pedagogical in purpose and not meant for serious real-world use.
Note: All technical content in this README is specific to x86_64 Linux.
Technical description
Architecture: x86_64
Syntax: Intel
Assembler: NASM 3.01
What I learned
ASM has different syntaxes depending on the assembler and architecture.
It's a very verbose language - everything must be explicit.
Each line represents a single instruction.
Stack
The stack is a LIFO (Last In, First Out) data structure used for temporary storage during program execution.
It grows downward in memory - each time you push a value, the stack pointer (rsp) decreases.
Conversely, pop retrieves the top value and moves rsp back up.
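The System V ABI used on x86_64 Linux requires rsp to be 16-byte aligned at every call instruction - if it is not, callees that rely on that alignment (for example through aligned SSE loads and stores such as movaps) can crash.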
rbp (base pointer) is typically used to mark the start of the current stack frame, making it easy to reference local variables at fixed offsets regardless of how rsp moves.
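To make the push/pop behaviour above concrete, here is a minimal sketch, assuming NASM and ld on x86_64 Linux. The file name stack_demo.asm is made up for illustration. It measures how far rsp moves on a push and returns that distance as the exit status.

```nasm
; stack_demo.asm (hypothetical file name)
; build: nasm -f elf64 stack_demo.asm && ld stack_demo.o -o stack_demo
global _start

section .text
_start:
    mov     rbx, rsp        ; remember where rsp started
    push    qword 42        ; rsp -= 8, then [rsp] = 42
    mov     rcx, rbx
    sub     rcx, rsp        ; rcx = 8: the stack grew downward by one qword
    pop     rdi             ; rdi = 42, rsp += 8 (back to its starting value)
    sub     rdi, 42         ; rdi = 0 if pop returned the pushed value
    add     rdi, rcx        ; rdi = 8
    mov     eax, 60         ; exit(rdi)
    syscall
```

Running it and then echo $? should print 8, the size of one pushed value.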
Register
Registers are the processor’s working memory - small, ultra-fast storage slots wired directly into the CPU.
Unlike RAM, which can only be read from or written to, registers are connected to active units like the ALU,
allowing the processor to actually compute: add, subtract, shift bits, apply logical operators, and so on.
Computation happens on values held in registers - even when an instruction names a memory operand, the CPU loads the value first. RAM just holds the data until it’s needed.
Special registers
| 64-bit | 32-bit | 16-bit | Name | Purpose |
|---|---|---|---|---|
| `rsp` | `esp` | `sp` | Stack Pointer | Points to the top of the stack |
| `rbp` | `ebp` | `bp` | Base Pointer | Marks the base of the current stack frame |
| `rip` | `eip` | `ip` | Instruction Pointer | Points to the next instruction to execute |
| `rflags` | `eflags` | `flags` | Flags Register | Stores CPU state flags (zero, carry, sign, overflow...) |
General-purpose registers
| 64-bit | 32-bit | 16-bit | 8-bit high | 8-bit low | Conventional use |
|---|---|---|---|---|---|
| `rax` | `eax` | `ax` | `ah` | `al` | Return value, accumulator |
| `rbx` | `ebx` | `bx` | `bh` | `bl` | Callee-saved |
| `rcx` | `ecx` | `cx` | `ch` | `cl` | 4th argument |
| `rdx` | `edx` | `dx` | `dh` | `dl` | 3rd argument |
| `rsi` | `esi` | `si` | - | `sil` | 2nd argument |
| `rdi` | `edi` | `di` | - | `dil` | 1st argument |
| `r8` | `r8d` | `r8w` | - | `r8b` | 5th argument |
| `r9` | `r9d` | `r9w` | - | `r9b` | 6th argument |
| `r10–r11` | `r10d–r11d` | `r10w–r11w` | - | `r10b–r11b` | Caller-saved (scratch) |
| `r12–r15` | `r12d–r15d` | `r12w–r15w` | - | `r12b–r15b` | Callee-saved |
Writing to a 32-bit register (e.g. eax) zeroes the upper 32 bits of its 64-bit counterpart (rax). Writing to a 16-bit or 8-bit register leaves the upper bits unchanged.
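The sub-register rule is easy to check with a short program. This is a sketch, assuming NASM and ld on x86_64 Linux, with a made-up file name. It fills two registers with all ones, writes to a 32-bit and an 8-bit part, and exits with status 0 only if the upper bits behave as described.

```nasm
; subreg_demo.asm (hypothetical file name)
; build: nasm -f elf64 subreg_demo.asm && ld subreg_demo.o -o subreg_demo
global _start

section .text
_start:
    mov     rax, -1                 ; rax = 0xFFFFFFFFFFFFFFFF
    mov     eax, 1                  ; 32-bit write: upper half is zeroed
    mov     rbx, -1                 ; rbx = 0xFFFFFFFFFFFFFFFF
    mov     bl, 1                   ; 8-bit write: upper 56 bits are kept

    xor     edi, edi                ; exit status 0 = both checks passed
    cmp     rax, 1                  ; expect 0x0000000000000001
    jne     .fail
    mov     rcx, 0xFFFFFFFFFFFFFF01 ; expected value of rbx
    cmp     rbx, rcx
    jne     .fail
    jmp     .exit
.fail:
    mov     edi, 1
.exit:
    mov     eax, 60                 ; exit(edi)
    syscall
```

This is also why xor edi, edi and mov eax, 60 are enough to set the full 64-bit rdi and rax.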
CPU instructions
Base
| Instruction | Description |
|---|---|
| `mov dst, src` | Copy src into dst |
| `push src` | Push src onto the stack |
| `pop dst` | Pop top of stack into dst |
| `lea dst, [src]` | Load effective address of src into dst |
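The difference between mov and lea is easiest to see side by side. Below is a small sketch, assuming NASM and ld on x86_64 Linux. The label value and the file name are made up for illustration.

```nasm
; lea_demo.asm (hypothetical file name)
; build: nasm -f elf64 lea_demo.asm && ld lea_demo.o -o lea_demo
global _start

section .data
value:  dq 7                        ; illustrative 64-bit variable

section .text
_start:
    lea     rsi, [rel value]        ; rsi = address of value, memory is not read
    mov     rdi, [rsi]              ; rdi = 7, the qword stored at that address
    lea     rdi, [rdi + rdi*2]      ; lea as plain arithmetic: rdi = 7 + 7*2 = 21
    mov     eax, 60                 ; exit(rdi)
    syscall
```

The exit status should be 21. The last lea only computes an address-shaped expression and never touches memory, which is why it is often used for cheap multiplications.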
Branching
| Instruction | Description |
|---|---|
| `cmp a, b` | Compare a and b (sets flags, no result stored) |
| `test a, b` | Bitwise AND to set flags (no result stored) |
| `jmp label` | Unconditional jump |
| `je label` | Jump if equal (ZF=1) |
| `jne label` | Jump if not equal (ZF=0) |
| `jz label` | Jump if zero (ZF=1) |
| `jnz label` | Jump if not zero (ZF=0) |
| `jo label` | Jump if overflow (OF=1) |
| `jno label` | Jump if no overflow (OF=0) |
| `js label` | Jump if sign / negative (SF=1) |
| `jns label` | Jump if no sign / positive (SF=0) |
| `jg label` | Jump if greater (signed) |
| `jge label` | Jump if greater or equal (signed) |
| `jl label` | Jump if less (signed) |
| `jle label` | Jump if less or equal (signed) |
| `ja label` | Jump if above (unsigned) |
| `jae label` | Jump if above or equal (unsigned) |
| `jb label` | Jump if below (unsigned) |
| `jbe label` | Jump if below or equal (unsigned) |
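The signed and unsigned jumps read the same cmp differently. This sketch, assuming NASM and ld on x86_64 Linux with a made-up file name, compares the same bit pattern against 1 twice and records which branches are taken.

```nasm
; branch_demo.asm (hypothetical file name)
; build: nasm -f elf64 branch_demo.asm && ld branch_demo.o -o branch_demo
global _start

section .text
_start:
    xor     edi, edi            ; result bits start at 0
    mov     rax, -1             ; bit pattern 0xFFFFFFFFFFFFFFFF

    cmp     rax, 1
    jl      .signed_less        ; signed view: -1 < 1, so this is taken
    jmp     .exit
.signed_less:
    or      edi, 1              ; bit 0: the signed jump was taken

    cmp     rax, 1
    ja      .unsigned_above     ; unsigned view: 2^64 - 1 > 1, also taken
    jmp     .exit
.unsigned_above:
    or      edi, 2              ; bit 1: the unsigned jump was taken
.exit:
    mov     eax, 60             ; exit(edi)
    syscall
```

The exit status should be 3, since both jumps are taken: the same value is less than 1 when read as signed and above 1 when read as unsigned.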
Arithmetic
| Instruction | Description |
|---|---|
| `add dst, src` | dst = dst + src |
| `sub dst, src` | dst = dst - src |
| `inc dst` | dst = dst + 1 |
| `dec dst` | dst = dst - 1 |
| `imul dst, src` | dst = dst * src (signed) |
| `mul src` | rax * src → rdx:rax (unsigned) |
| `idiv src` | rdx:rax / src → rax (quotient), rdx (remainder) (signed) |
| `div src` | rdx:rax / src → rax (quotient), rdx (remainder) (unsigned) |
| `neg dst` | dst = -dst |
| `and dst, src` | dst = dst AND src |
| `or dst, src` | dst = dst OR src |
| `xor dst, src` | dst = dst XOR src (used to zero a register when dst == src) |
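Because div and idiv divide the 128-bit pair rdx:rax, rdx has to be prepared before each division. Here is a sketch of both cases, assuming NASM and ld on x86_64 Linux. The file name and the numbers are made up for illustration.

```nasm
; div_demo.asm (hypothetical file name)
; build: nasm -f elf64 div_demo.asm && ld div_demo.o -o div_demo
global _start

section .text
_start:
    ; unsigned: 47 / 5
    mov     eax, 47             ; low half of the dividend
    xor     edx, edx            ; high half must be 0 for an unsigned dividend
    mov     ecx, 5
    div     rcx                 ; rax = 9 (quotient), rdx = 2 (remainder)
    mov     r8, rax             ; keep both results
    mov     r9, rdx

    ; signed: -47 / 5
    mov     rax, -47
    cqo                         ; sign-extend rax into rdx:rax
    idiv    rcx                 ; rax = -9, rdx = -2 (truncates toward zero)

    add     rax, r8             ; -9 + 9 = 0
    add     rdx, r9             ; -2 + 2 = 0
    mov     rdi, rax
    or      rdi, rdx            ; 0 only if both sums are 0
    mov     eax, 60             ; exit(rdi)
    syscall
```

An exit status of 0 means the signed results were the exact negatives of the unsigned ones. Forgetting xor edx, edx or cqo leaves stale data in rdx, which typically produces a wrong result or a divide error.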
System call
A system call (syscall) is a controlled entry into the kernel that requests a service - file I/O, memory allocation, process control, etc.
In x86_64 Linux, syscalls are triggered with the syscall instruction.
Calling convention:
| Register | Role |
|---|---|
| `rax` | Syscall number |
| `rdi` | 1st argument |
| `rsi` | 2nd argument |
| `rdx` | 3rd argument |
| `r10` | 4th argument |
| `r8` | 5th argument |
| `r9` | 6th argument |
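The return value is stored in rax. On error, rax contains a negative errno value (between -4095 and -1). The syscall instruction itself overwrites rcx and r11, so save them first if their values are still needed.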
Common syscalls:
| Number | Name | Description |
|---|---|---|
| 0 | `read` | Read from a file descriptor |
| 1 | `write` | Write to a file descriptor |
| 2 | `open` | Open a file |
| 3 | `close` | Close a file descriptor |
| 60 | `exit` | Terminate the process |
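Putting the syscall convention and the table together, here is a minimal sketch of a program that calls write and then exit, assuming NASM and ld on x86_64 Linux. The message, labels and file name are made up for illustration.

```nasm
; write_demo.asm (hypothetical file name)
; build: nasm -f elf64 write_demo.asm && ld write_demo.o -o write_demo
global _start

section .rodata
msg:    db "hello from a syscall", 10   ; 10 = newline
len     equ $ - msg                     ; message length in bytes

section .text
_start:
    mov     eax, 1              ; syscall number: write
    mov     edi, 1              ; 1st argument: fd 1 (stdout)
    lea     rsi, [rel msg]      ; 2nd argument: buffer address
    mov     edx, len            ; 3rd argument: byte count
    syscall                     ; rax = bytes written, or a negative errno

    test    rax, rax
    js      .error              ; negative return value means failure
    xor     edi, edi            ; success: exit status 0
    jmp     .exit
.error:
    neg     rax                 ; turn -errno into errno
    mov     edi, eax            ; report it as the exit status
.exit:
    mov     eax, 60             ; syscall number: exit
    syscall
```

Checking the sign of rax after the call is the raw equivalent of what libc does before it sets errno and returns -1.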
Function
Calling convention
The first six integer or pointer arguments are passed in registers, in this order: rdi, rsi, rdx, rcx, r8, r9.
Additional arguments are pushed onto the stack, from right to left. The return value is stored in rax.
Register preservation:
| Type | Registers | Who saves | Behavior |
|---|---|---|---|
| Caller-saved | `rax`, `rcx`, `rdx`, `rsi`, `rdi`, `r8–r11` | Caller | May be overwritten by the called function - save them before call if needed |
| Callee-saved | `rbx`, `rbp`, `r12–r15` | Callee | Must be restored before returning |
Structure of a function
my_function:
    push rbp        ; save caller's base pointer
    mov rbp, rsp    ; set up new stack frame
    ; function body
    pop rbp         ; restore caller's base pointer
    ret             ; return to caller (pops the return address into rip)
call label pushes the return address onto the stack then jumps to label.
ret pops that address and jumps back to it.
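As an illustration of the calling convention and the function skeleton together, here is a sketch of a strlen-like routine and a caller, assuming NASM and ld on x86_64 Linux. The name my_strlen, the test string and the file name are hypothetical and not taken from this repository's sources.

```nasm
; call_demo.asm (hypothetical file name)
; build: nasm -f elf64 call_demo.asm && ld call_demo.o -o call_demo
global _start

section .rodata
text:   db "libasm", 0          ; illustrative NUL-terminated string

section .text
; size_t my_strlen(const char *s)   (hypothetical name)
; in:  rdi = pointer to the string
; out: rax = number of bytes before the terminating 0
my_strlen:
    push    rbp                 ; save caller's base pointer
    mov     rbp, rsp            ; set up new stack frame
    xor     eax, eax            ; length = 0
.loop:
    cmp     byte [rdi + rax], 0 ; end of string?
    je      .done
    inc     rax
    jmp     .loop
.done:
    pop     rbp                 ; restore caller's base pointer
    ret                         ; jump back to the address pushed by call

_start:
    lea     rdi, [rel text]     ; 1st argument goes in rdi
    call    my_strlen           ; return value comes back in rax
    mov     rdi, rax            ; use the length as the exit status
    mov     eax, 60             ; exit(rdi)
    syscall
```

The exit status should be 6. The function only touches rax, a caller-saved register, so apart from rbp it has nothing to save or restore.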
Macro
A macro is a named block of code that gets inlined at each call site - unlike a function, it has no call/ret overhead.
Use macros for short repeated patterns where performance or readability matters.
Syntax
%macro name nb_args
    ; body - arguments accessed via %1, %2, ...
%endmacro
Example:
%macro save_regs 0
    push rbx
    push r12
%endmacro
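Here is a sketch of how such macros might be used in a full program, assuming NASM and ld on x86_64 Linux. The restore_regs and sys_exit macros and the file name are made up for illustration. sys_exit shows a macro that takes one argument.

```nasm
; macro_demo.asm (hypothetical file name)
; build: nasm -f elf64 macro_demo.asm && ld macro_demo.o -o macro_demo
global _start

%macro save_regs 0
    push    rbx
    push    r12
%endmacro

%macro restore_regs 0           ; hypothetical counterpart: pops in reverse order
    pop     r12
    pop     rbx
%endmacro

%macro sys_exit 1               ; hypothetical: %1 is the exit status
    mov     edi, %1
    mov     eax, 60
    syscall
%endmacro

section .text
_start:
    mov     ebx, 5
    save_regs                   ; expands to the two pushes
    xor     ebx, ebx            ; clobber rbx on purpose
    restore_regs                ; rbx is 5 again
    sys_exit ebx                ; expands to mov edi, ebx / mov eax, 60 / syscall
```

The exit status should be 5. Each macro name is replaced by its body at assembly time, so the final binary contains no call or ret for them.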
%[expr]
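%[...] forces the preprocessor to expand the single-line macros inside the brackets in place, and glues the result to any adjacent token - useful in contexts where a token would not normally be expanded, such as when building an identifier or inside another %define. It does not expand anything inside a quoted string.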
%define OFFSET 8
    mov rax, [rbp + %[OFFSET]] ; expands to [rbp + 8] at preprocessing time
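A case where %[...] changes the result is token pasting. In the sketch below, WIDTH on its own would not be expanded in the middle of an identifier, but answer%[WIDTH] becomes answer8. This assumes NASM and ld on x86_64 Linux. WIDTH, answer8, answer16 and the file name are made up for illustration.

```nasm
; indirection_demo.asm (hypothetical file name)
; build: nasm -f elf64 indirection_demo.asm && ld indirection_demo.o -o indirection_demo
global _start

%define WIDTH 8                 ; illustrative single-line macro

answer8     equ 42              ; two illustrative constants whose names
answer16    equ 43              ; differ only by their numeric suffix

section .text
_start:
    mov     edi, answer%[WIDTH] ; preprocessor output: mov edi, answer8
    mov     eax, 60             ; exit(edi)
    syscall
```

The exit status should be 42. Changing the %define to 16 selects answer16 instead, without touching the instruction itself.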