\section{MD5}

MD5 (Message Digest Algorithm 5) was designed by Ronald Rivest in 1991 as a strengthened replacement for MD4. It produces a 128-bit digest from a message of arbitrary length, processing data in 512-bit blocks. Although MD5 is now considered cryptographically broken (collision attacks have been demonstrated since 2004), it remains widely used for non-security purposes such as checksums that detect accidental data corruption.

\vspace{1em}

MD5 maintains a state of four 32-bit words, conventionally named $A$, $B$, $C$ and $D$. Each 512-bit block is processed in four rounds of sixteen operations each, for a total of 64 operations per block. Each operation applies one of four non-linear functions to three of the state words, adds a message word and a precomputed constant derived from the sine function, and rotates the result by a fixed amount.

\vspace{1em}

The state is initialized to the following fixed constants, as specified in RFC 1321:
\begin{align*}
A &= \texttt{0x67452301} \\
B &= \texttt{0xefcdab89} \\
C &= \texttt{0x98badcfe} \\
D &= \texttt{0x10325476}
\end{align*}

\vspace{1em}

Before processing, the message is padded to a length congruent to 448 bits modulo 512. A single \texttt{1} bit is appended first, followed by as many \texttt{0} bits as needed. Padding is always applied, even when the message length is already congruent to 448 modulo 512. The original message length in bits is then appended as a 64-bit little-endian integer, bringing the total padded length to an exact multiple of 512 bits.
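
As a worked illustration of the padding rule, consider a message of $\ell$ bits and write $z$ for the number of \texttt{0} bits appended (both symbols are introduced here for convenience and do not appear in RFC 1321). The requirement $\ell + 1 + z \equiv 448 \pmod{512}$ with $0 \leq z < 512$ gives, taking the non-negative remainder,
\begin{align*}
z &= (447 - \ell) \bmod 512
\end{align*}
For the 3-byte message \texttt{abc} ($\ell = 24$), this yields $z = 423$, and the padded length is a single block:
\begin{align*}
24 + 1 + 423 + 64 &= 512
\end{align*}
For a message of exactly $\ell = 448$ bits, $z = (447 - 448) \bmod 512 = 511$, so a second block is needed:
\begin{align*}
448 + 1 + 511 + 64 &= 1024
\end{align*}
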
\vspace{1em}

Each of the four rounds uses a distinct non-linear function applied to the state words $B$, $C$ and $D$:
\begin{align*}
F(B, C, D) &= (B \land C) \lor (\lnot B \land D) \\
G(B, C, D) &= (B \land D) \lor (C \land \lnot D) \\
H(B, C, D) &= B \oplus C \oplus D \\
I(B, C, D) &= C \oplus (B \lor \lnot D)
\end{align*}

The message word index used at step $i$ is not sequential: each round $r = \lfloor i / 16 \rfloor$ applies its own selector function $k_r$:
\begin{align*}
k_0(i) &= i \bmod 16 \\
k_1(i) &= (5i + 1) \bmod 16 \\
k_2(i) &= (3i + 5) \bmod 16 \\
k_3(i) &= 7i \bmod 16
\end{align*}

At each step $i$ (with $0 \leq i < 64$), one of the four functions is selected according to the current round, and the state is updated as follows:
\begin{align*}
A &\leftarrow B + \bigl((A + \phi(B, C, D) + M[k] + T[i]) \lll s[i]\bigr)
\end{align*}
\noindent where $\phi$ is the non-linear function for the current round ($F$, $G$, $H$ or $I$), $k = k_r(i)$, $M[k]$ is the $k$-th 32-bit word of the current block (read in little-endian byte order), $T[i]$ is a precomputed constant, $s[i]$ is the rotation amount, $+$ denotes addition modulo $2^{32}$, and $\lll$ denotes a left rotation of a 32-bit word. After this operation, the state words are cycled: $(A, B, C, D) \leftarrow (D, A, B, C)$.

\vspace{1em}

The rotation amounts $s[i]$ depend only on the round and on $i \bmod 4$: each round cycles through its own four values.
\begin{align*}
\text{Round 0} &: 7,\ 12,\ 17,\ 22 \\
\text{Round 1} &: 5,\ 9,\ 14,\ 20 \\
\text{Round 2} &: 4,\ 11,\ 16,\ 23 \\
\text{Round 3} &: 6,\ 10,\ 15,\ 21
\end{align*}

\vspace{1em}

The 64 constants $T[i]$ are derived from the sine function, with the argument $i + 1$ taken in radians:
\begin{align*}
T[i] &= \left\lfloor 2^{32}\,|\sin(i+1)| \right\rfloor, \qquad 0 \leq i < 64
\end{align*}

After each block is processed, the resulting state is added word-by-word, modulo $2^{32}$, to the state the block started from:
\begin{align*}
(A, B, C, D) \leftarrow (A + A_0,\ B + B_0,\ C + C_0,\ D + D_0)
\end{align*}
\noindent where $A_0$, $B_0$, $C_0$, $D_0$ denote the state at the beginning of the block.
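
To make the selectors, rotation amounts and constants above concrete, the first step of each round can be evaluated by hand. This is a worked check, not part of the specification. The hexadecimal values of $T[i]$ are quoted from the constant table of RFC 1321, which numbers its constants from 1 rather than from 0 as done here.
\begin{align*}
i = 0 &:\quad r = 0,\ k_0(0) = 0,\ s[0] = 7,\ T[0] = \texttt{0xd76aa478} \\
i = 16 &:\quad r = 1,\ k_1(16) = 81 \bmod 16 = 1,\ s[16] = 5,\ T[16] = \texttt{0xf61e2562} \\
i = 32 &:\quad r = 2,\ k_2(32) = 101 \bmod 16 = 5,\ s[32] = 4,\ T[32] = \texttt{0xfffa3942} \\
i = 48 &:\quad r = 3,\ k_3(48) = 336 \bmod 16 = 0,\ s[48] = 6,\ T[48] = \texttt{0xf4292244}
\end{align*}
The first constant can be verified directly from the sine formula, since $\sin(1) \approx 0.8414709848$:
\begin{align*}
T[0] &= \left\lfloor 2^{32} \times 0.8414709848\ldots \right\rfloor = 3614090360 = \texttt{0xd76aa478}
\end{align*}
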
After all blocks have been processed, the four state words are concatenated in the order $A$, $B$, $C$, $D$, each serialized in little-endian byte order (least significant byte first), to produce the 128-bit digest.
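
As a worked illustration of the final serialization, consider the empty message, whose MD5 digest is the well-known value \texttt{d41d8cd98f00b204e9800998ecf8427e}. The state words below are obtained here by reversing the serialization of that digest, not by running the compression function, so they serve only to show the byte order:
\begin{align*}
A = \texttt{0xd98c1dd4} &\;\rightarrow\; \texttt{d4 1d 8c d9} \\
B = \texttt{0x04b2008f} &\;\rightarrow\; \texttt{8f 00 b2 04} \\
C = \texttt{0x980980e9} &\;\rightarrow\; \texttt{e9 80 09 98} \\
D = \texttt{0x7e42f8ec} &\;\rightarrow\; \texttt{ec f8 42 7e}
\end{align*}
Concatenating the four byte groups in the order $A$, $B$, $C$, $D$ reproduces the digest quoted above.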