
libasm

This is my second project in assembly.
It is a from-scratch reimplementation of libc-inspired utilities.
It's pedagogical in purpose and not meant for serious real-world use.

Note: All technical content in this README is specific to x86_64 Linux.

Technical description

Architecture: x86_64
Syntax: Intel
Assembler: NASM 3.01

What I learned

ASM has different syntaxes depending on the assembler and architecture.

It's a very verbose language - everything must be explicit.
Each line holds at most one instruction (or a label, directive, or comment).

Stack

The stack is a LIFO (Last In, First Out) data structure used for temporary storage during program execution.
It grows downward in memory - each time you push a value, the stack pointer (rsp) decreases.
Conversely, pop retrieves the top value and moves rsp back up.
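
The push/pop behaviour described above can be sketched as follows. The label `lifo_demo` is a hypothetical name chosen for illustration; the arguments arrive in rdi and rsi.

```nasm
section .text
global lifo_demo            ; hypothetical label, not part of the project

; long lifo_demo(long a /* rdi */, long b /* rsi */)
lifo_demo:
    push rdi        ; rsp -= 8, then [rsp] = a
    push rsi        ; rsp -= 8, then [rsp] = b (b is now on top)
    pop  rax        ; rax = b (last in, first out), then rsp += 8
    pop  rdx        ; rdx = a, rsp is back to its value on entry
    sub  rax, rdx   ; return b - a
    ret
```

Every push is matched by a pop, so rsp points at the return address again when ret executes.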

The System V AMD64 ABI requires rsp to be 16-byte aligned immediately before every call instruction - a misaligned call may crash inside the callee, for example when it uses aligned SSE loads or stores.

rbp (base pointer) is typically used to mark the start of the current stack frame, making it easy to reference local variables at fixed offsets regardless of how rsp moves.
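
A minimal sketch of a stack frame that keeps the alignment rule, assuming the object file is linked against libc (for `puts`). The name `print_twice` is hypothetical.

```nasm
extern puts                 ; assumed to be provided by libc at link time

section .text
global print_twice          ; hypothetical label

; void print_twice(const char *s /* rdi */)
print_twice:
    push rbp                ; on entry rsp = 16n + 8 (return address); now 16n
    mov  rbp, rsp           ; rbp marks the base of this frame
    sub  rsp, 16            ; 16 bytes of locals, rsp stays 16-byte aligned
    mov  [rbp - 8], rdi     ; local variable at a fixed offset from rbp

    call puts wrt ..plt     ; rsp is aligned here; rdi still holds s

    mov  rdi, [rbp - 8]     ; reload s: puts may have overwritten rdi
    call puts wrt ..plt

    leave                   ; same as: mov rsp, rbp / pop rbp
    ret
```

The local is addressed through rbp, so the offset stays the same even if rsp moves later in the function.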

Register

Registers are the processor's working memory - small, ultra-fast storage slots wired directly into the CPU.
Unlike RAM, which can only be read from or written to, registers are connected to active units like the ALU,
allowing the processor to actually compute: add, subtract, shift bits, apply logical operators, and so on.
All computation happens inside registers - RAM just holds the data until it's needed.

Special registers

| 64-bit | 32-bit | 16-bit | Name | Purpose |
| --- | --- | --- | --- | --- |
| rsp | esp | sp | Stack Pointer | Points to the top of the stack |
| rbp | ebp | bp | Base Pointer | Marks the base of the current stack frame |
| rip | eip | ip | Instruction Pointer | Points to the next instruction to execute |
| rflags | eflags | flags | Flags Register | Stores CPU state flags (zero, carry, sign, overflow...) |

General-purpose registers

| 64-bit | 32-bit | 16-bit | 8-bit high | 8-bit low | Category | Conventional use |
| --- | --- | --- | --- | --- | --- | --- |
| rax | eax | ax | ah | al | Caller-saved | Return value, accumulator |
| rbx | ebx | bx | bh | bl | Callee-saved | General purpose |
| rcx | ecx | cx | ch | cl | Caller-saved | 4th argument |
| rdx | edx | dx | dh | dl | Caller-saved | 3rd argument |
| rsi | esi | si | - | sil | Caller-saved | 2nd argument |
| rdi | edi | di | - | dil | Caller-saved | 1st argument |
| r8 | r8d | r8w | - | r8b | Caller-saved | 5th argument |
| r9 | r9d | r9w | - | r9b | Caller-saved | 6th argument |
| r10-r11 | r10d-r11d | r10w-r11w | - | r10b-r11b | Caller-saved | Scratch |
| r12-r15 | r12d-r15d | r12w-r15w | - | r12b-r15b | Callee-saved | General purpose |

Writing to a 32-bit register (e.g. eax) zeroes the upper 32 bits of its 64-bit counterpart (rax). Writing to a 16-bit or 8-bit register leaves the upper bits unchanged.
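
The rule above can be traced with a short sketch; the label `partial_write_demo` is a hypothetical name, and the comments show the value of rax after each instruction.

```nasm
section .text
global partial_write_demo   ; hypothetical label

partial_write_demo:
    mov rax, -1         ; rax = 0xFFFFFFFFFFFFFFFF
    mov eax, 1          ; 32-bit write: rax = 0x0000000000000001 (upper half zeroed)
    mov rax, -1         ; rax = 0xFFFFFFFFFFFFFFFF again
    mov ax, 1           ; 16-bit write: rax = 0xFFFFFFFFFFFF0001 (upper bits kept)
    mov al, 2           ; 8-bit low:    rax = 0xFFFFFFFFFFFF0002
    mov ah, 3           ; 8-bit high:   rax = 0xFFFFFFFFFFFF0302
    ret
```

This is also why `xor eax, eax` is enough to zero the whole of rax.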

CPU instructions

Base

| Instruction | Description |
| --- | --- |
| mov dst, src | Copy src into dst |
| push src | Push src onto the stack |
| pop dst | Pop top of stack into dst |
| lea dst, [src] | Load effective address of src into dst |
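
The difference between mov with a memory operand and lea is easiest to see side by side. This is a small sketch; `lea_demo` is a hypothetical label.

```nasm
section .text
global lea_demo             ; hypothetical label

; long lea_demo(long *p /* rdi */, long i /* rsi */)
lea_demo:
    mov  rax, [rdi + rsi*8]     ; memory read: rax = p[i]
    lea  rdx, [rdi + rsi*8]     ; no memory access: rdx = &p[i]
    lea  rax, [rax + rax*2]     ; lea used as plain arithmetic: rax = p[i] * 3
    ret
```

lea only computes the address expression, so it never faults on a bad pointer and does not modify the flags.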

Branching

Flag-setting

| Instruction | Flags set | Description |
| --- | --- | --- |
| cmp a, b | ZF, SF, OF, CF | Computes a - b, discards result |
| test a, b | ZF, SF, PF | Computes a AND b, discards result |

Conditional jumps

| Instruction | Flags | Condition | Description |
| --- | --- | --- | --- |
| jmp | - | always | Unconditional jump |
| je / jz | ZF | ZF = 1 | Equal / Zero |
| jne / jnz | ZF | ZF = 0 | Not equal / Not zero |
| jo | OF | OF = 1 | Overflow |
| jno | OF | OF = 0 | No overflow |
| js | SF | SF = 1 | Sign (negative) |
| jns | SF | SF = 0 | No sign (non-negative) |
| jg | ZF, SF, OF | ZF=0 ∧ SF=OF | Greater (signed) |
| jge | SF, OF | SF = OF | Greater or equal (signed) |
| jl | SF, OF | SF ≠ OF | Less (signed) |
| jle | ZF, SF, OF | ZF=1 ∨ SF≠OF | Less or equal (signed) |
| ja | CF, ZF | CF=0 ∧ ZF=0 | Above (unsigned) |
| jae | CF | CF = 0 | Above or equal (unsigned) |
| jb | CF | CF = 1 | Below (unsigned) |
| jbe | CF, ZF | CF=1 ∨ ZF=1 | Below or equal (unsigned) |
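
A flag-setting instruction is normally followed directly by the jump that reads its flags. The two functions below are a sketch of that pattern; `max_signed` and `is_null` are hypothetical names.

```nasm
section .text
global max_signed           ; hypothetical label
global is_null              ; hypothetical label

; long max_signed(long a /* rdi */, long b /* rsi */)
max_signed:
    mov  rax, rdi           ; assume a is the larger value
    cmp  rdi, rsi           ; set flags from a - b
    jge  .done              ; signed comparison: a >= b, keep a
    mov  rax, rsi           ; otherwise return b
.done:
    ret

; int is_null(const void *p /* rdi */)
is_null:
    xor  eax, eax           ; default result: 0
    test rdi, rdi           ; p AND p: ZF = 1 only when p == 0
    jnz  .done              ; non-zero pointer, return 0
    mov  eax, 1             ; p == 0, return 1
.done:
    ret
```

Replacing jge with jae in max_signed would compare the operands as unsigned, so -1 would be treated as larger than 1.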

Arithmetic

| Instruction | Description |
| --- | --- |
| add dst, src | dst = dst + src |
| sub dst, src | dst = dst - src |
| inc dst | dst = dst + 1 |
| dec dst | dst = dst - 1 |
| imul dst, src | dst = dst * src (signed) |
| mul src | rax * src → rdx:rax (unsigned) |
| idiv src | rdx:rax / src → rax (quotient), rdx (remainder) (signed) |
| div src | rdx:rax / src → rax (quotient), rdx (remainder) (unsigned) |
| neg dst | dst = -dst |
| and dst, src | dst = dst AND src |
| or dst, src | dst = dst OR src |
| xor dst, src | dst = dst XOR src (used to zero a register when dst == src) |
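
Because div and idiv take their dividend from rdx:rax, rdx has to be prepared before dividing a single 64-bit value. The sketch below shows both cases; the labels are hypothetical, and `cqo` (sign-extend rax into rdx:rax) is an instruction not listed in the table above.

```nasm
section .text
global udiv_demo            ; hypothetical label
global sdiv_demo            ; hypothetical label

; unsigned long udiv_demo(unsigned long a /* rdi */, unsigned long b /* rsi */)
udiv_demo:
    mov  rax, rdi
    xor  edx, edx           ; zero rdx, so the dividend is 0:rax
    div  rsi                ; rax = a / b, rdx = a % b
    ret

; long sdiv_demo(long a /* rdi */, long b /* rsi */)
sdiv_demo:
    mov  rax, rdi
    cqo                     ; sign-extend rax into rdx:rax
    idiv rsi                ; rax = a / b, rdx = a % b (signed)
    ret
```

Both functions assume b is non-zero; dividing by zero raises a CPU exception that Linux delivers as SIGFPE.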

System call

A system call (syscall) is a controlled transfer from user code into the kernel to request a service - file I/O, memory allocation, process control, etc.
In x86_64 Linux, syscalls are triggered with the syscall instruction.

Calling convention:

| Register | Role |
| --- | --- |
| rax | Syscall number |
| rdi | 1st argument |
| rsi | 2nd argument |
| rdx | 3rd argument |
| r10 | 4th argument |
| r8 | 5th argument |
| r9 | 6th argument |

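The return value is stored in rax. On error, rax contains a negative errno value (between -4095 and -1).
The syscall instruction itself overwrites rcx and r11, which is why the 4th argument goes in r10 instead of rcx.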

Common syscalls:

| Number | Name | Description |
| --- | --- | --- |
| 0 | read | Read from a file descriptor |
| 1 | write | Write to a file descriptor |
| 2 | open | Open a file |
| 3 | close | Close a file descriptor |
| 60 | exit | Terminate the process |
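
Putting the convention and the numbers above together, a standalone program that uses only write and exit could look like this sketch. The message and the names `msg` and `msg_len` are illustrative.

```nasm
section .data
msg:     db "hello", 10     ; 10 = newline
msg_len  equ $ - msg        ; length computed at assembly time

section .text
global _start

_start:
    mov  eax, 1             ; syscall number: write
    mov  edi, 1             ; 1st argument: fd 1 (stdout)
    lea  rsi, [rel msg]     ; 2nd argument: buffer address
    mov  edx, msg_len       ; 3rd argument: byte count
    syscall                 ; rax = bytes written, or a negative errno

    xor  edi, edi           ; exit status 0 by default
    test rax, rax
    jns  .exit              ; non-negative rax: write succeeded
    mov  edi, 1             ; write failed: exit status 1
.exit:
    mov  eax, 60            ; syscall number: exit
    syscall
```

Assuming the file is saved as hello.asm, it can be built without libc using `nasm -f elf64 hello.asm` followed by `ld hello.o -o hello`.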

Function

Calling convention

Integer and pointer arguments are passed in registers in this order: rdi, rsi, rdx, rcx, r8, r9 (System V AMD64 ABI).
Additional arguments are passed on the stack. An integer or pointer return value is stored in rax.

Register preservation:

| Type | Registers | Who saves | Behavior |
| --- | --- | --- | --- |
| Caller-saved | rax, rcx, rdx, rsi, rdi, r8-r11 | Caller | May be overwritten by the called function - save them before call if needed |
| Callee-saved | rbx, rbp, r12-r15 | Callee | Must be restored before returning |
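
The sketch below shows both halves of the preservation rule in one function: values that must survive a call are moved into callee-saved registers, and those registers are restored before returning. It assumes libc's `strlen` is available at link time; `total_len` is a hypothetical name.

```nasm
extern strlen               ; assumed to be provided by libc at link time

section .text
global total_len            ; hypothetical label

; size_t total_len(const char *a /* rdi */, const char *b /* rsi */)
total_len:
    push rbx                ; callee-saved: must be restored before ret
    push r12                ; callee-saved: must be restored before ret
    sub  rsp, 8             ; two pushes leave rsp at 16n + 8; realign to 16n

    mov  r12, rsi           ; b must survive the call, and rsi is caller-saved
    call strlen wrt ..plt   ; rax = strlen(a)
    mov  rbx, rax           ; keep the first length across the next call

    mov  rdi, r12
    call strlen wrt ..plt   ; rax = strlen(b)
    add  rax, rbx           ; return strlen(a) + strlen(b)

    add  rsp, 8
    pop  r12                ; restore in reverse order of the pushes
    pop  rbx
    ret
```

This version uses rbx and r12 instead of an rbp-based frame; either approach satisfies the convention as long as every callee-saved register holds its original value at ret.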

Structure of a function

my_function:
	push rbp           ; save caller's base pointer
	mov  rbp, rsp      ; set up new stack frame

	; function body

	pop  rbp           ; restore caller's base pointer
	ret                ; return to caller (pops rip)

call label pushes the return address onto the stack then jumps to label.
ret pops that address and jumps back to it.

Macro

A macro is a named block of code that the preprocessor expands inline at each place it is used - unlike a function, it has no call/ret overhead.
Use macros for short repeated patterns where performance or readability matter.

Syntax

%macro name nb_args
	; body - arguments accessed via %1, %2, ...
%endmacro

Example:

%macro save_regs 0
    push rbx
    push r12
%endmacro
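
Macros become more useful once they take arguments. The following sketch wraps two syscalls in hypothetical macros named `sys_write` and `sys_exit`; the message and label names are illustrative.

```nasm
; sys_write fd, buffer_label, length (hypothetical macro)
%macro sys_write 3
    mov  eax, 1             ; syscall number: write
    mov  edi, %1            ; 1st macro argument: file descriptor
    lea  rsi, [rel %2]      ; 2nd macro argument: label of the buffer
    mov  edx, %3            ; 3rd macro argument: byte count
    syscall
%endmacro

; sys_exit status (hypothetical macro)
%macro sys_exit 1
    mov  eax, 60            ; syscall number: exit
    mov  edi, %1            ; exit status
    syscall
%endmacro

section .data
msg:     db "hi", 10
msg_len  equ $ - msg

section .text
global _start

_start:
    sys_write 1, msg, msg_len   ; expands to the five instructions above
    sys_exit 0
```

Each use pastes the full body into the program, so a large macro used many times increases code size where a function would not.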

%[expr]

%[...] forces the preprocessor to expand the macros inside the brackets in place - useful in contexts where a token would not normally be expanded, such as in the middle of another macro's name.

%define OFFSET 8
mov rax, [rbp + %[OFFSET]]   ; expands to [rbp + 8] at preprocessing time
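
A case where %[...] changes the result is building a macro name from another macro. The sketch below uses NASM's built-in `__BITS__` macro and assumes the file is assembled with `-f elf64`, so that `__BITS__` is 64; the `WORD_BYTES*` macros and the `word_bytes` label are hypothetical.

```nasm
%define WORD_BYTES16 2
%define WORD_BYTES32 4
%define WORD_BYTES64 8

section .text
global word_bytes           ; hypothetical label

; int word_bytes(void)
word_bytes:
    ; %[__BITS__] expands first, forming the name WORD_BYTES64,
    ; which is then expanded as an ordinary macro
    mov  eax, WORD_BYTES%[__BITS__]
    ret
```

Without the brackets, `WORD_BYTES__BITS__` would be read as one unknown identifier rather than a name to assemble from two parts.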

Resources