
\section{SHA-256}

SHA-256 is part of the SHA-2 family of cryptographic hash functions, designed
by the NSA and first published by NIST in 2001. It produces a 256-bit digest
from a message of arbitrary length, processing data in 512-bit blocks. Unlike
MD5, SHA-256 has no known practical collision attacks and remains widely used
in security-critical applications such as TLS certificates and Bitcoin's
proof-of-work.

\vspace{1em}

SHA-256 maintains a state of eight 32-bit words, initialized to fixed constants
derived from the square roots of the first eight prime numbers. Each 512-bit
block is processed in 64 rounds. Each round applies a compression step involving
two non-linear functions, a message schedule word, and a precomputed constant
derived from the cube roots of the first 64 prime numbers.

\vspace{1em}

The padding scheme has the same structure as in MD5: a single \texttt{1} bit is
appended, followed by \texttt{0} bits until the message length is congruent to
448 modulo 512, and the original length in bits is then appended as a 64-bit
integer. The only difference is that SHA-256 encodes this length in big-endian
order, whereas MD5 uses little-endian order.
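
\vspace{1em}

As an illustration of this rule, and not a description of any particular
implementation, the number $k$ of \texttt{0} bits for a message of $\ell$ bits
is the smallest non-negative solution of $\ell + 1 + k \equiv 448 \pmod{512}$.
The symbols $\ell$ and $k$ are introduced here only for this example:

\begin{align*}
k &= (447 - \ell) \bmod 512 \\
\ell = 24: \quad k &= 423, & 24 + 1 + 423 + 64 &= 512 \\
\ell = 448: \quad k &= 511, & 448 + 1 + 511 + 64 &= 1024
\end{align*}

\noindent The 3-byte message \texttt{abc} ($\ell = 24$) therefore fits in a
single block whose last 64 bits hold the length \texttt{0x0000000000000018},
while a 56-byte message ($\ell = 448$) needs a second block.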

\vspace{1em}

Each round uses two non-linear functions applied to the state words:

\begin{align*}
\text{Ch}(E, F, G)  &= (E \land F) \oplus (\lnot E \land G) \\
\text{Maj}(A, B, C) &= (A \land B) \oplus (A \land C) \oplus (B \land C)
\end{align*}

\noindent and two rotation-based functions applied to the state words $A$ and $E$:

\begin{align*}
\Sigma_0(A) &= (A \ggg 2)  \oplus (A \ggg 13) \oplus (A \ggg 22) \\
\Sigma_1(E) &= (E \ggg 6)  \oplus (E \ggg 11) \oplus (E \ggg 25)
\end{align*}

\noindent where $\ggg$ denotes a right rotation.
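
\vspace{1em}

To make these definitions concrete, the following example uses 4-bit values
chosen purely for illustration (SHA-256 itself operates on 32-bit words).
$\text{Ch}$ selects, bit by bit, the bit of $F$ where $E$ is \texttt{1} and the
bit of $G$ where $E$ is \texttt{0}; $\text{Maj}$ returns the majority value of
the three input bits:

\begin{align*}
\text{Ch}(\texttt{1100}, \texttt{1010}, \texttt{0110})
    &= (\texttt{1100} \land \texttt{1010}) \oplus (\texttt{0011} \land \texttt{0110})
     = \texttt{1000} \oplus \texttt{0010} = \texttt{1010} \\
\text{Maj}(\texttt{1100}, \texttt{1010}, \texttt{0110})
    &= \texttt{1000} \oplus \texttt{0100} \oplus \texttt{0010} = \texttt{1110}
\end{align*}

\noindent On a 32-bit word $x$, a right rotation by $n$ bits (with $0 < n < 32$)
can be written with logical shifts as
$x \ggg n = (x \gg n) \lor (x \ll (32 - n))$, with the result truncated to 32
bits. For example, $\texttt{0x00000001} \ggg 2 = \texttt{0x40000000}$.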

\vspace{1em}

At each round $i$ (with $0 \leq i < 64$), the state is updated as follows:

\begin{align*}
T_1 &= H + \Sigma_1(E) + \text{Ch}(E, F, G) + K[i] + W[i] \\
T_2 &= \Sigma_0(A) + \text{Maj}(A, B, C) \\
H &\leftarrow G, \quad G \leftarrow F, \quad F \leftarrow E, \quad E \leftarrow D + T_1 \\
D &\leftarrow C, \quad C \leftarrow B, \quad B \leftarrow A, \quad A \leftarrow T_1 + T_2
\end{align*}

\noindent where $K[i]$ is a precomputed round constant and $W[i]$ is a word
from the message schedule. All additions, here and in the rest of this section,
are taken modulo $2^{32}$.

\vspace{1em}

The round constants are defined as:

\begin{align*}
\forall i \in \mathbb{N},\ 0 \leq i < 64,\quad K[i] = \left\lfloor 2^{32} \times \left(\sqrt[3]{p_{i+1}} \bmod 1\right) \right\rfloor
\end{align*}

\noindent where $p_{i+1}$ is the $(i+1)$-th prime number and $\bmod 1$ denotes
the fractional part.
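
\vspace{1em}

As a quick check of this formula, the first constant can be worked out by hand
from $p_1 = 2$ (decimal expansions are truncated):

\begin{align*}
\sqrt[3]{2} &= 1.2599210\ldots \\
K[0] &= \left\lfloor 2^{32} \times 0.2599210\ldots \right\rfloor
      = 1116352408 = \texttt{0x428a2f98}
\end{align*}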

\vspace{1em}

The message schedule extends the 16 words of the current block into 64 words
using two additional functions, each combining two rotations with a logical
shift:

\begin{align*}
\sigma_0(x) &= (x \ggg 7)  \oplus (x \ggg 18) \oplus (x \gg 3) \\
\sigma_1(x) &= (x \ggg 17) \oplus (x \ggg 19) \oplus (x \gg 10)
\end{align*}

\noindent where $\gg$ denotes a logical right shift. Let $M[i]$ denote the
$i$-th 32-bit word of the current 512-bit block, read in big-endian order. The
schedule is then defined as:

\begin{align*}
W[i] = \begin{cases}
    M[i] & i \in \mathbb{N},\ 0 \leq i < 16 \\
    \sigma_1(W[i-2]) + W[i-7] + \sigma_0(W[i-15]) + W[i-16] & i \in \mathbb{N},\ 16 \leq i < 64
\end{cases}
\end{align*}
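
\vspace{1em}

As a worked example, consider the single padded block of the 3-byte message
\texttt{abc} (bytes \texttt{0x61}, \texttt{0x62}, \texttt{0x63}, followed by
the padding byte \texttt{0x80}). Reading the block as big-endian words gives
$M[0] = \texttt{0x61626380}$, $M[1] = \cdots = M[14] = 0$ and
$M[15] = \texttt{0x00000018}$ (the length in bits). The first two extended
words are then:

\begin{align*}
W[16] &= \sigma_1(W[14]) + W[9] + \sigma_0(W[1]) + W[0]
       = 0 + 0 + 0 + \texttt{0x61626380} = \texttt{0x61626380} \\
\sigma_1(\texttt{0x00000018})
      &= \texttt{0x000c0000} \oplus \texttt{0x00030000} \oplus \texttt{0x00000000}
       = \texttt{0x000f0000} \\
W[17] &= \sigma_1(W[15]) + W[10] + \sigma_0(W[2]) + W[1] = \texttt{0x000f0000}
\end{align*}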

\vspace{1em}

The state is initialized to fixed constants, each holding the first 32 bits of
the fractional part of the square root of one of the first eight prime numbers:

\begin{align*}
A &= \texttt{0x6a09e667}, \quad B = \texttt{0xbb67ae85}, \quad
C = \texttt{0x3c6ef372}, \quad D = \texttt{0xa54ff53a} \\
E &= \texttt{0x510e527f}, \quad F = \texttt{0x9b05688c}, \quad
G = \texttt{0x1f83d9ab}, \quad H = \texttt{0x5be0cd19}
\end{align*}
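
\noindent These values can be reproduced with a formula analogous to the one
for $K[i]$. The name $\text{IV}[j]$ is introduced here only for illustration,
with $\text{IV}[0], \ldots, \text{IV}[7]$ standing for the initial values of
$A, \ldots, H$ (decimal expansions are truncated):

\begin{align*}
\text{IV}[j] &= \left\lfloor 2^{32} \times \left(\sqrt{p_{j+1}} \bmod 1\right) \right\rfloor,
    \quad 0 \leq j < 8 \\
\text{IV}[0] &= \left\lfloor 2^{32} \times 0.4142135\ldots \right\rfloor
    = 1779033703 = \texttt{0x6a09e667}
\end{align*}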

After each block is processed, the compressed state is added word-by-word to
the state before compression:

\begin{align*}
(A, \ldots, H) \leftarrow (A + A_0,\ B + B_0,\ C + C_0,\ D + D_0,\ E + E_0,\ F + F_0,\ G + G_0,\ H + H_0)
\end{align*}

\noindent where $A_0, \ldots, H_0$ denote the state at the beginning of the
block. After all blocks have been processed, the eight state words are
serialized in big-endian order to produce the 256-bit digest.
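
\vspace{1em}

For reference, the standard test vector for the 3-byte message \texttt{abc}
can be used to check an implementation. After its single block has been
processed, the state is:

\begin{align*}
A &= \texttt{0xba7816bf}, \quad B = \texttt{0x8f01cfea}, \quad
C = \texttt{0x414140de}, \quad D = \texttt{0x5dae2223} \\
E &= \texttt{0xb00361a3}, \quad F = \texttt{0x96177a9c}, \quad
G = \texttt{0xb410ff61}, \quad H = \texttt{0xf20015ad}
\end{align*}

\noindent and the digest, written in hexadecimal, is the concatenation of these
words from $A$ to $H$:

\begin{align*}
&\texttt{ba7816bf 8f01cfea 414140de 5dae2223} \\
&\texttt{b00361a3 96177a9c b410ff61 f20015ad}
\end{align*}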

\newpage