doc: add full algorithm reference (MD5, SHA-256, Whirlpool)
parent
3289a9191d
commit
3e36cb4906
9 changed files with 488 additions and 35 deletions
|
|
@ -1,11 +1,18 @@
|
|||
EXTRA_DIST = libft_ssl.tex
|
||||
EXTRA_DIST = \
|
||||
libft_ssl.tex \
|
||||
preliminaries.tex \
|
||||
introduction.tex \
|
||||
generic_interface.tex \
|
||||
md5.tex \
|
||||
sha256.tex \
|
||||
whirlpool.tex
|
||||
|
||||
if ENABLE_DOC
|
||||
pdf: libft_ssl.pdf
|
||||
|
||||
libft_ssl.pdf: libft_ssl.tex
|
||||
$(PDFLATEX) $<
|
||||
$(PDFLATEX) $<
|
||||
TEXINPUTS=$(srcdir): $(PDFLATEX) $<
|
||||
TEXINPUTS=$(srcdir): $(PDFLATEX) $<
|
||||
endif
|
||||
|
||||
clean-local:
|
||||
|
|
|
|||
53
doc/generic_interface.tex
Normal file
|
|
@ -0,0 +1,53 @@
|
|||
\section{Generic Digest Interface}
|
||||
|
||||
All hash algorithms in \texttt{libft\_ssl} are exposed through a single,
|
||||
uniform interface built around the \texttt{struct digest\_algo} type. This
|
||||
structure holds the algorithm's metadata (its name, digest size and block size)
|
||||
along with three function pointers: \texttt{init}, \texttt{update} and
|
||||
\texttt{final}. This design allows any algorithm to be driven through the same
|
||||
calling convention without the caller needing to know which one is in use. The
|
||||
associated context is held in a \texttt{union digest\_ctx}, which overlays the
|
||||
per-algorithm state structures so that a single allocation covers all supported
|
||||
algorithms.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
\texttt{struct digest\_algo} describes a hash algorithm as a set of metadata
|
||||
and three function pointers. The \texttt{name} field identifies the algorithm.
|
||||
The \texttt{digest\_size} and \texttt{block\_size} fields express its output
|
||||
length and internal block size in bytes. The three function pointers
|
||||
\texttt{init}, \texttt{update} and \texttt{final} define the algorithm's
|
||||
lifecycle: \texttt{init} sets the context to its initial state, \texttt{update}
|
||||
feeds an arbitrary amount of data into it, and \texttt{final} produces the
|
||||
digest and resets the context. All three operate on a \texttt{void~*} context
|
||||
pointer, which allows the interface to remain algorithm-agnostic.
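
The same three-step lifecycle can be observed in Python's standard \texttt{hashlib} module. The sketch below is only an analogy and is not the \texttt{libft\_ssl} API; the helper name \texttt{digest\_stream} is hypothetical. It drives two algorithms through one code path and reads the same kind of metadata (digest size and block size, in bytes).

```python
import hashlib


def digest_stream(name, chunks):
    """Hash an iterable of byte chunks with the algorithm called `name`."""
    ctx = hashlib.new(name)      # plays the role of init
    for chunk in chunks:
        ctx.update(chunk)        # update: any amount of data, any number of calls
    return ctx.digest()          # plays the role of final (hashlib does not reset ctx)


if __name__ == "__main__":
    for name in ("md5", "sha256"):
        meta = hashlib.new(name)
        out = digest_stream(name, [b"hello ", b"world"])
        print(name, meta.digest_size, meta.block_size, out.hex())
```

Feeding the message in several chunks yields the same digest as feeding it in one call, which is the property a streaming \texttt{update} is expected to provide.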
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The \texttt{union digest\_ctx} type provides a single allocation large enough
|
||||
to hold the context of any supported algorithm. Because only one algorithm is
|
||||
active at a time, overlaying the per-algorithm structures in a union avoids the
|
||||
overhead of a separate heap allocation while keeping the calling code uniform.
|
||||
The active member is always the one matching the \texttt{struct digest\_algo}
|
||||
being used.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
Each supported algorithm is registered in \texttt{digest\_algos.h} through an
|
||||
X-macro list. This file defines a single macro \texttt{DIGEST\_ALGOS(X)} that
|
||||
expands \texttt{X} once per algorithm, passing its name, digest size and block
|
||||
size. Consuming this list with a different definition of \texttt{X} generates
|
||||
the corresponding code or data without repetition --- the global \texttt{struct
|
||||
digest\_algo} instances in \texttt{libft\_ssl.c} are produced this way. Adding
|
||||
a new algorithm to the library reduces to adding one line to this list.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
All three algorithms follow the Merkle-Damgård construction. The message is
|
||||
split into fixed-size blocks and processed sequentially. After each block, the
|
||||
compressed output is combined with the previous state to produce the new state
|
||||
--- this chaining ensures that the final digest depends on every bit of the
|
||||
input. The exact combination operation is algorithm-specific: MD5 and SHA-256
|
||||
use an additive feedforward, while Whirlpool uses the Miyaguchi-Preneel scheme.
|
||||
|
||||
\newpage
|
||||
19
doc/introduction.tex
Normal file
|
|
@ -0,0 +1,19 @@
|
|||
\section{Introduction}
|
||||
|
||||
\texttt{libft\_ssl} is a C library implementing cryptographic hash functions
|
||||
from scratch. A cryptographic hash function maps an arbitrary-length input to a
|
||||
fixed-size digest. This operation is deterministic and one-way: it is
|
||||
computationally infeasible to recover the original input from its digest.
|
||||
|
||||
The library currently implements the following algorithms:
|
||||
|
||||
\begin{itemize}
|
||||
\item \textbf{MD5}: produces a 128-bit digest.
\item \textbf{SHA-256}: produces a 256-bit digest.
\item \textbf{Whirlpool}: produces a 512-bit digest.
|
||||
\end{itemize}
|
||||
|
||||
These functions are commonly used for data integrity verification, digital
|
||||
signatures, and \textbf{M}essage \textbf{A}uthentication \textbf{C}odes (MACs).
|
||||
|
||||
\newpage
|
||||
|
|
@ -6,7 +6,7 @@
|
|||
\usepackage{amssymb}
|
||||
\usepackage{listings}
|
||||
\usepackage{xcolor}
|
||||
\usepackage{hyperref}
|
||||
\usepackage[hidelinks]{hyperref}
|
||||
\usepackage{geometry}
|
||||
|
||||
\geometry{margin=2.5cm}
|
||||
|
|
@ -21,36 +21,11 @@
|
|||
\tableofcontents
|
||||
\newpage
|
||||
|
||||
\section{Introduction}
|
||||
\input{preliminaries}
|
||||
\input{introduction}
|
||||
\input{generic_interface}
|
||||
\input{md5}
|
||||
\input{sha256}
|
||||
\input{whirlpool}
|
||||
|
||||
\texttt{libft\_ssl} is a C library implementing cryptographic hash functions
|
||||
from scratch. A cryptographic hash function maps an arbitrary-length input to a
|
||||
fixed-size digest. This operation is deterministic and one-way: it is
|
||||
computationally infeasible to recover the original input from its digest.
|
||||
|
||||
The library currently implements the following algorithms:
|
||||
|
||||
\begin{itemize}
|
||||
\item \textbf{MD5} - produces a 128-bit digest.
|
||||
\item \textbf{SHA-256} - produces a 256-bit digest.
|
||||
\item \textbf{Whirlpool} - produces a 512-bit digest.
|
||||
\end{itemize}
|
||||
|
||||
These functions are commonly used for data integrity verification, digital
|
||||
signatures, and message authentication codes (MACs).
|
||||
\newpage
|
||||
|
||||
\section{Library core}
|
||||
|
||||
\newpage
|
||||
|
||||
\section{MD5}
|
||||
|
||||
\newpage
|
||||
|
||||
\section{SHA-256}
|
||||
|
||||
\newpage
|
||||
|
||||
\section{Whirlpool}
|
||||
\end{document}
|
||||
|
|
|
|||
102
doc/md5.tex
Normal file
|
|
@ -0,0 +1,102 @@
|
|||
\section{MD5}
|
||||
|
||||
MD5 (Message Digest Algorithm 5) was designed by Ronald Rivest in 1991 as a
|
||||
strengthened replacement for MD4. It produces a 128-bit digest from a message
|
||||
of arbitrary length, processing data in 512-bit blocks. Although MD5 is now
|
||||
considered cryptographically broken (collision attacks have been
|
||||
demonstrated since 2004), it remains widely used for non-security purposes
|
||||
such as checksums and data integrity verification.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
MD5 maintains a state of four 32-bit words, conventionally named $A$, $B$, $C$
|
||||
and $D$, initialized to fixed constants defined in RFC 1321. Each 512-bit block
|
||||
is processed in four rounds of sixteen operations each, for a total of 64
|
||||
operations per block. Each operation applies one of four non-linear functions
|
||||
to the state words, adds a message word and a precomputed constant derived from
|
||||
the sine function, and rotates the result by a fixed amount.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The state is initialized to the following fixed constants, as specified in RFC 1321:
|
||||
|
||||
\begin{align*}
|
||||
A &= \texttt{0x67452301} \\
|
||||
B &= \texttt{0xefcdab89} \\
|
||||
C &= \texttt{0x98badcfe} \\
|
||||
D &= \texttt{0x10325476}
|
||||
\end{align*}
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
Before processing, the message is padded to a length congruent to 448 bits
|
||||
modulo 512. A single \texttt{1} bit is appended first, followed by as many
|
||||
\texttt{0} bits as needed. The original message length in bits is then appended
|
||||
as a 64-bit little-endian integer, bringing the total padded length to an exact
|
||||
multiple of 512 bits.
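
The padding rule can be sketched in a few lines of Python. This is a minimal illustration, assuming the message is a whole number of bytes (so the \texttt{1} bit and the seven \texttt{0} bits after it form the byte \texttt{0x80}); the function name \texttt{md5\_pad} is hypothetical and does not come from \texttt{libft\_ssl}.

```python
import struct


def md5_pad(message: bytes) -> bytes:
    """Return `message` followed by MD5 padding (byte-aligned input assumed)."""
    bit_len = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF  # length is kept modulo 2^64
    padded = message + b"\x80"                         # a 1 bit, then seven 0 bits
    padded += b"\x00" * ((56 - len(padded)) % 64)      # reach 448 bits modulo 512
    padded += struct.pack("<Q", bit_len)               # 64-bit little-endian length
    return padded


if __name__ == "__main__":
    for n in (0, 55, 56, 64):
        print(n, len(md5_pad(b"a" * n)))
```

A 55-byte message still fits in one block, while a 56-byte message needs a second block because the \texttt{0x80} byte leaves no room for the length field.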
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
Each of the four rounds uses a distinct non-linear function applied to the
|
||||
state words $B$, $C$ and $D$:
|
||||
|
||||
\begin{align*}
|
||||
F(B, C, D) &= (B \land C) \lor (\lnot B \land D) \\
|
||||
G(B, C, D) &= (B \land D) \lor (C \land \lnot D) \\
|
||||
H(B, C, D) &= B \oplus C \oplus D \\
|
||||
I(B, C, D) &= C \oplus (B \lor \lnot D)
|
||||
\end{align*}
|
||||
|
||||
The message word index used at step $i$ is not sequential: each round applies
|
||||
a distinct selector function $k_r$ where $r = \lfloor i / 16 \rfloor$:
|
||||
|
||||
\begin{align*}
|
||||
k_0(i) &= i \bmod 16 \\
|
||||
k_1(i) &= (5i + 1) \bmod 16 \\
|
||||
k_2(i) &= (3i + 5) \bmod 16 \\
|
||||
k_3(i) &= 7i \bmod 16
|
||||
\end{align*}
|
||||
|
||||
At each step $i$ (with $0 \leq i < 64$), one of the four functions is selected
|
||||
according to the current round, and the state is updated as follows:
|
||||
|
||||
\begin{align*}
|
||||
A &\leftarrow B + \bigl((A + \phi(B, C, D) + M[k] + T[i]) \lll s[i]\bigr)
|
||||
\end{align*}
|
||||
|
||||
\noindent where $\phi$ is the auxiliary function for the current round, $M[k]$
|
||||
is a 32-bit word of the current block, $T[i]$ is a precomputed constant, $s[i]$
|
||||
is the rotation amount, and $\lll$ denotes a left rotation. After this
|
||||
operation, the state words are cycled: $(A, B, C, D) \leftarrow (D, A, B, C)$.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The rotation amounts $s[i]$ are constant per round and repeat every four steps:
|
||||
|
||||
\begin{align*}
|
||||
\text{Round 0} &: 7,\ 12,\ 17,\ 22 \\
|
||||
\text{Round 1} &: 5,\ 9,\ 14,\ 20 \\
|
||||
\text{Round 2} &: 4,\ 11,\ 16,\ 23 \\
|
||||
\text{Round 3} &: 6,\ 10,\ 15,\ 21
|
||||
\end{align*}
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The 64 constants $T[i]$ are derived from the sine function, with its argument in radians:
|
||||
|
||||
\begin{align*}
|
||||
\forall i \in \mathbb{N},\ 0 \leq i < 64,\quad T[i] = \left\lfloor 2^{32}\,|\sin(i+1)| \right\rfloor
|
||||
\end{align*}
|
||||
|
||||
After each block is processed, the compressed state is added word-by-word to
|
||||
the state before compression:
|
||||
|
||||
\begin{align*}
|
||||
(A, B, C, D) \leftarrow (A + A_0,\ B + B_0,\ C + C_0,\ D + D_0)
|
||||
\end{align*}
|
||||
|
||||
\noindent where $A_0$, $B_0$, $C_0$, $D_0$ denote the state at the beginning
|
||||
of the block. After all blocks have been processed, the four state words are
|
||||
serialized in little-endian order to produce the 128-bit digest.
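
Putting the pieces of this section together, the following Python sketch implements the round functions, the selectors $k_r$, the rotation table, the constants $T[i]$, the feedforward and the final serialization. It is an illustration written for this document, not the C code of \texttt{libft\_ssl}; all names are hypothetical, and the input is assumed to be a whole number of bytes.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
# Rotation amounts: four values per round, repeated every four steps.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# T[i] = floor(2^32 * |sin(i + 1)|), argument in radians.
T = [int(2**32 * abs(math.sin(i + 1))) & MASK for i in range(64)]


def rotl(x, n):
    """Rotate the 32-bit word x left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def compress(state, block):
    """Process one 64-byte block and return the new (A, B, C, D) state."""
    M = struct.unpack("<16I", block)          # sixteen little-endian words
    a, b, c, d = state
    for i in range(64):
        r = i // 16                           # current round
        if r == 0:
            f, k = (b & c) | (~b & d), i
        elif r == 1:
            f, k = (b & d) | (c & ~d), (5 * i + 1) % 16
        elif r == 2:
            f, k = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, k = c ^ (b | ~d), (7 * i) % 16
        a = (b + rotl((a + f + M[k] + T[i]) & MASK, S[i])) & MASK
        a, b, c, d = d, a, b, c               # cycle the state words
    # Feedforward: add the state from before the block, word by word.
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d)))


def md5(message):
    """Return the 16-byte MD5 digest of `message`."""
    bit_len = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    data = message + b"\x80"
    data += b"\x00" * ((56 - len(data)) % 64) + struct.pack("<Q", bit_len)
    state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
    for off in range(0, len(data), 64):
        state = compress(state, data[off:off + 64])
    return struct.pack("<4I", *state)         # little-endian serialization


if __name__ == "__main__":
    msg = b"The quick brown fox jumps over the lazy dog"
    print(md5(msg).hex(), md5(msg) == hashlib.md5(msg).digest())
```

Comparing the result against \texttt{hashlib.md5} is a convenient way to check each formula of this section at once.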
|
||||
|
||||
\newpage
|
||||
15
doc/md5_init_T.py
Normal file
|
|
@ -0,0 +1,15 @@
|
|||
import math


def md5_T():
    """Return the 64 MD5 constants T[i] = floor(2^32 * |sin(i + 1)|)."""
    T = []
    for i in range(64):
        val = int(math.floor((2**32) * abs(math.sin(i + 1))))
        T.append(val & 0xFFFFFFFF)
    return T


if __name__ == "__main__":
    # Print one constant per line, ready to paste into a C table.
    for v in md5_T():
        print(f"0x{v:08x}, ")
|
||||
|
||||
|
||||
66
doc/preliminaries.tex
Normal file
|
|
@ -0,0 +1,66 @@
|
|||
\section{Preliminaries}
|
||||
|
||||
This section defines the terminology used throughout the document. The concepts
|
||||
introduced here are general to cryptographic hash functions and apply to all
|
||||
algorithms described in subsequent sections.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
A \textbf{bit} is the smallest unit of information, taking a value of either 0
|
||||
or 1. A \textbf{byte} is a group of eight bits, and is the standard unit of
|
||||
data storage and transmission. A \textbf{word} is a fixed-size integer used
|
||||
internally by a hash algorithm --- MD5 and SHA-256 operate on 32-bit words,
|
||||
while Whirlpool operates on 64-bit words.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
\textbf{Endianness} refers to the byte order used when storing a multi-byte
|
||||
integer in memory. In \textbf{little-endian} order, the least significant byte
|
||||
is stored first; in \textbf{big-endian} order, the most significant byte is
|
||||
stored first. This distinction matters when serializing the internal state to
|
||||
produce the final digest --- MD5 uses little-endian, while SHA-256 and
|
||||
Whirlpool use big-endian.
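
As a small illustration of the two byte orders, the Python snippet below serializes the first MD5 initial state word both ways. The variable names are chosen for this example only.

```python
word = 0x67452301

# Least significant byte first.
little = word.to_bytes(4, "little")
# Most significant byte first.
big = word.to_bytes(4, "big")

print(little.hex(), big.hex())
```

The same 32-bit value therefore produces different byte sequences depending on the convention, which is why each algorithm must state which one it uses.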
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
A \textbf{message} is the arbitrary-length input fed to a hash function. The
|
||||
\textbf{digest} is the fixed-size output it produces. A hash function is said
|
||||
to be \textbf{one-way} if it is computationally infeasible to recover any input
|
||||
that produces a given digest. A \textbf{collision} occurs when two distinct
|
||||
messages produce the same digest; a hash function is considered broken when
|
||||
collisions can be found efficiently.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
Hash functions process their input in fixed-size chunks called \textbf{blocks}.
|
||||
Since the message length is rarely a multiple of the block size,
|
||||
\textbf{padding} is appended to the last block to bring it to the required
|
||||
length. The \textbf{state} is a set of words initialized to fixed constants and
|
||||
updated after each block; it accumulates the result of the computation and is
|
||||
serialized into the digest at the end. The \textbf{compression function} is the
|
||||
core transformation applied to each block --- it takes the current state and
|
||||
one block of data, and produces a new state.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The \textbf{Miyaguchi-Preneel} construction is a way to build a compression
|
||||
function from a block cipher $E$. Given a current state $H$ and a message
|
||||
block $M$, it produces a new state as:
|
||||
|
||||
\begin{align*}
|
||||
H \leftarrow E(H,\ M) \oplus M \oplus H
|
||||
\end{align*}
|
||||
|
||||
\noindent where $E(H, M)$ denotes the encryption of $M$ using $H$ as the key.
|
||||
A block cipher can be inverted by anyone who knows the key. The XOR with both
$M$ and $H$ removes this property, so the compression function cannot simply be
run backwards.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The \textbf{wide-pipe} construction is a variant of Merkle-Damgård where the
|
||||
internal state is wider than the final digest. This makes collision attacks
|
||||
harder: an attacker targeting the output must first find a collision in the
|
||||
larger internal state, which requires significantly more work than attacking
|
||||
the digest directly.
|
||||
|
||||
\newpage
|
||||
110
doc/sha256.tex
Normal file
|
|
@ -0,0 +1,110 @@
|
|||
\section{SHA-256}
|
||||
|
||||
SHA-256 is part of the SHA-2 family of cryptographic hash functions, designed
|
||||
by the NSA and first published by NIST in 2001. It produces a 256-bit digest
|
||||
from a message of arbitrary length, processing data in 512-bit blocks. Unlike
|
||||
MD5, SHA-256 has no known practical collision attacks and remains widely used
|
||||
in security-critical applications such as TLS certificates and Bitcoin's
|
||||
proof-of-work.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
SHA-256 maintains a state of eight 32-bit words, initialized to fixed constants
|
||||
derived from the square roots of the first eight prime numbers. Each 512-bit
|
||||
block is processed in 64 rounds. Each round applies a compression step involving
|
||||
two non-linear functions, a message schedule word, and a precomputed constant
|
||||
derived from the cube roots of the first 64 prime numbers.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The padding scheme has the same structure as in MD5: a single \texttt{1} bit is appended,
|
||||
followed by \texttt{0} bits until the message length is congruent to 448 bits
|
||||
modulo 512, and the original length in bits is appended as a 64-bit integer.
|
||||
The only difference is that SHA-256 encodes this length in big-endian order.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
Each round uses two non-linear functions applied to the state words:
|
||||
|
||||
\begin{align*}
|
||||
\text{Ch}(E, F, G) &= (E \land F) \oplus (\lnot E \land G) \\
|
||||
\text{Maj}(A, B, C) &= (A \land B) \oplus (A \land C) \oplus (B \land C)
|
||||
\end{align*}
|
||||
|
||||
\noindent and two rotation-based functions applied to the state words $A$ and $E$:
|
||||
|
||||
\begin{align*}
|
||||
\Sigma_0(A) &= (A \ggg 2) \oplus (A \ggg 13) \oplus (A \ggg 22) \\
|
||||
\Sigma_1(E) &= (E \ggg 6) \oplus (E \ggg 11) \oplus (E \ggg 25)
|
||||
\end{align*}
|
||||
|
||||
\noindent where $\ggg$ denotes a right rotation.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
At each round $i$ (with $0 \leq i < 64$), the state is updated as follows:
|
||||
|
||||
\begin{align*}
|
||||
T_1 &= H + \Sigma_1(E) + \text{Ch}(E, F, G) + K[i] + W[i] \\
|
||||
T_2 &= \Sigma_0(A) + \text{Maj}(A, B, C) \\
|
||||
H &\leftarrow G, \quad G \leftarrow F, \quad F \leftarrow E, \quad E \leftarrow D + T_1 \\
|
||||
D &\leftarrow C, \quad C \leftarrow B, \quad B \leftarrow A, \quad A \leftarrow T_1 + T_2
|
||||
\end{align*}
|
||||
|
||||
\noindent where $K[i]$ is a precomputed constant and $W[i]$ is a word from the
|
||||
message schedule.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The 64 round constants $K[i]$ are derived from the cube roots of the first 64
prime numbers:

\begin{align*}
|
||||
\forall i \in \mathbb{N},\ 0 \leq i < 64,\quad K[i] = \left\lfloor 2^{32} \times \left(\sqrt[3]{p_{i+1}} \bmod 1\right) \right\rfloor
|
||||
\end{align*}
|
||||
|
||||
\noindent where $p_{i+1}$ is the $(i+1)$-th prime number and $\bmod 1$ denotes
|
||||
the fractional part.
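
In the spirit of \texttt{md5\_init\_T.py}, the constants $K[i]$ can be regenerated with a short Python script. The sketch below uses exact integer arithmetic rather than floating point: the integer cube root of $p \cdot 2^{96}$ equals $\lfloor 2^{32} \sqrt[3]{p} \rfloor$, and keeping its low 32 bits drops the integer part. The helper names are hypothetical.

```python
def first_primes(n):
    """Return the first n prime numbers by trial division."""
    primes, candidate = [], 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def icbrt(n):
    """Exact integer cube root: the largest r such that r**3 <= n."""
    lo, hi = 0, 1
    while hi ** 3 <= n:
        hi *= 2
    while hi - lo > 1:           # invariant: lo**3 <= n < hi**3
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo


def sha256_K():
    """Return the 64 SHA-256 round constants."""
    return [icbrt(p << 96) & 0xFFFFFFFF for p in first_primes(64)]


if __name__ == "__main__":
    for v in sha256_K():
        print(f"0x{v:08x},")
```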
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The message schedule extends the 16 words of the current block into 64 words
|
||||
using two additional rotation-based functions:
|
||||
|
||||
\begin{align*}
|
||||
\sigma_0(x) &= (x \ggg 7) \oplus (x \ggg 18) \oplus (x \gg 3) \\
|
||||
\sigma_1(x) &= (x \ggg 17) \oplus (x \ggg 19) \oplus (x \gg 10)
|
||||
\end{align*}
|
||||
|
||||
\noindent where $\gg$ denotes a logical right shift, and $M[i]$ denotes the
|
||||
$i$-th 32-bit word of the current 512-bit block. The schedule is then defined
|
||||
as:
|
||||
|
||||
\begin{align*}
|
||||
W[i] = \begin{cases}
|
||||
M[i] & i \in \mathbb{N},\ 0 \leq i < 16 \\
|
||||
\sigma_1(W[i-2]) + W[i-7] + \sigma_0(W[i-15]) + W[i-16] & i \in \mathbb{N},\ 16 \leq i < 64
|
||||
\end{cases}
|
||||
\end{align*}
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The state is initialized to fixed constants derived from the square roots of
|
||||
the first eight prime numbers:
|
||||
|
||||
\begin{align*}
|
||||
A &= \texttt{0x6a09e667}, \quad B = \texttt{0xbb67ae85}, \quad
|
||||
C = \texttt{0x3c6ef372}, \quad D = \texttt{0xa54ff53a} \\
|
||||
E &= \texttt{0x510e527f}, \quad F = \texttt{0x9b05688c}, \quad
|
||||
G = \texttt{0x1f83d9ab}, \quad H = \texttt{0x5be0cd19}
|
||||
\end{align*}
|
||||
|
||||
After each block is processed, the compressed state is added word-by-word to
|
||||
the state before compression:
|
||||
|
||||
\begin{align*}
|
||||
(A, \ldots, H) \leftarrow (A + A_0,\ B + B_0,\ C + C_0,\ D + D_0,\ E + E_0,\ F + F_0,\ G + G_0,\ H + H_0)
|
||||
\end{align*}
|
||||
|
||||
\noindent where $A_0, \ldots, H_0$ denote the state at the beginning of the
|
||||
block. After all blocks have been processed, the eight state words are
|
||||
serialized in big-endian order to produce the 256-bit digest.
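
The formulas of this section can be assembled into a compact Python sketch covering the message schedule, the round update, the feedforward and the serialization. It is an illustration rather than the \texttt{libft\_ssl} implementation: the names are hypothetical, the input is assumed to be a whole number of bytes, and \texttt{math.isqrt} requires Python 3.8 or later.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def first_primes(n):
    """Return the first n prime numbers by trial division."""
    primes, candidate = [], 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def icbrt(n):
    """Exact integer cube root: the largest r such that r**3 <= n."""
    lo, hi = 0, 1
    while hi ** 3 <= n:
        hi *= 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        lo, hi = (mid, hi) if mid ** 3 <= n else (lo, mid)
    return lo


PRIMES = first_primes(64)
K = [icbrt(p << 96) & MASK for p in PRIMES]                 # cube-root constants
IV = tuple(math.isqrt(p << 64) & MASK for p in PRIMES[:8])  # square-root constants


def rotr(x, n):
    """Rotate the 32-bit word x right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def schedule(block):
    """Expand a 64-byte block into the 64-word message schedule W."""
    W = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = rotr(W[i - 15], 7) ^ rotr(W[i - 15], 18) ^ (W[i - 15] >> 3)
        s1 = rotr(W[i - 2], 17) ^ rotr(W[i - 2], 19) ^ (W[i - 2] >> 10)
        W.append((s1 + W[i - 7] + s0 + W[i - 16]) & MASK)
    return W


def compress(state, block):
    """Process one block and return the new eight-word state."""
    W = schedule(block)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        t1 = (h + big_s1 + ((e & f) ^ (~e & g)) + K[i] + W[i]) & MASK
        t2 = (big_s0 + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        h, g, f, e = g, f, e, (d + t1) & MASK
        d, c, b, a = c, b, a, (t1 + t2) & MASK
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h)))


def sha256(message):
    """Return the 32-byte SHA-256 digest of `message`."""
    bit_len = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    data = message + b"\x80"
    data += b"\x00" * ((56 - len(data)) % 64) + struct.pack(">Q", bit_len)
    state = IV
    for off in range(0, len(data), 64):
        state = compress(state, data[off:off + 64])
    return struct.pack(">8I", *state)          # big-endian serialization


if __name__ == "__main__":
    print(sha256(b"abc").hex(), sha256(b"abc") == hashlib.sha256(b"abc").digest())
```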
|
||||
|
||||
\newpage
|
||||
106
doc/whirlpool.tex
Normal file
|
|
@ -0,0 +1,106 @@
|
|||
\section{Whirlpool}
|
||||
|
||||
Whirlpool is a cryptographic hash function designed by Vincent Rijmen and Paulo
|
||||
Barreto, first published in 2000 and standardized by ISO/IEC in 2004. It
|
||||
produces a 512-bit digest from a message of arbitrary length, processing data
|
||||
in 512-bit blocks. Its compression function is built on the
|
||||
Miyaguchi-Preneel construction and shares design principles with AES, using a
|
||||
substitution-permutation network over an $8 \times 8$ matrix of bytes.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
Whirlpool maintains a state of eight 64-bit words, forming an $8 \times 8$
|
||||
matrix of bytes. Each 512-bit block is processed in 10 rounds. Each round
|
||||
applies four successive transformations to the state matrix: a byte
|
||||
substitution, a column shift, a row mixing, and a round key addition.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The padding scheme follows the same structure as MD5 and SHA-256: a single
|
||||
\texttt{1} bit is appended, followed by \texttt{0} bits until the message
|
||||
length is congruent to 256 bits modulo 512. The original message length in bits
is then appended as a 256-bit big-endian integer, so that the padded length is a
multiple of 512 bits.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
Each round applies the following four transformations in order:
|
||||
|
||||
\textbf{SubBytes} replaces each byte of the state matrix by its image under the
|
||||
Whirlpool S-box, a fixed 256-entry lookup table defined in the Whirlpool
|
||||
specification.
|
||||
|
||||
\medskip
|
||||
|
||||
\textbf{ShiftColumns} cyclically shifts each column $j$ of the state matrix
|
||||
downward by $j$ positions, producing a permutation that spreads bytes across
|
||||
rows. Formally, if $a_{i,j}$ denotes the byte at row $i$, column $j$ of the
|
||||
state matrix, ShiftColumns produces:
|
||||
|
||||
\begin{align*}
|
||||
b_{i,j} = a_{i',\ j} \quad \text{where } i' = (i - j) \bmod 8
|
||||
\end{align*}
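
The formula above translates directly into Python. In this sketch the state is represented as a list of eight rows of eight bytes, which is an assumption made for readability; \texttt{libft\_ssl} stores it as eight 64-bit words. The function name is hypothetical.

```python
def shift_columns(a):
    """Apply b[i][j] = a[(i - j) % 8][j] to an 8x8 matrix of bytes."""
    return [[a[(i - j) % 8][j] for j in range(8)] for i in range(8)]


if __name__ == "__main__":
    # Label each byte with 8 * row + column so the movement is visible.
    state = [[8 * i + j for j in range(8)] for i in range(8)]
    for row in shift_columns(state):
        print(" ".join(f"{v:2d}" for v in row))
```

Column 0 is left unchanged, and every other column is rotated by its own index, so the eight bytes of an input row end up in eight different rows.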
|
||||
|
||||
\medskip
|
||||
|
||||
\textbf{MixRows} multiplies each row of the state matrix by a fixed MDS matrix
|
||||
over $\mathrm{GF}(2^8)$ with irreducible polynomial $x^8 + x^4 + x^3 + x^2 +
|
||||
1$, providing diffusion across the eight bytes of each row. Formally, for each
|
||||
row $i$, each output byte $b_{i,j}$ is computed as:

\begin{align*}
b_{i,j} = \bigoplus_{k=0}^{7} \mathrm{MDS}[(j - k) \bmod 8] \cdot a_{i,k}
|
||||
\end{align*}
|
||||
|
||||
\noindent where $\cdot$ denotes multiplication in $\mathrm{GF}(2^8)$ and
|
||||
$\oplus$ denotes XOR.
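
The row mixing can be sketched in Python as follows. The coefficients in \texttt{MDS\_ROW} are the first row of the circulant matrix given in the final revision of the Whirlpool specification; that \texttt{libft\_ssl} targets this revision is an assumption, and the function names are hypothetical.

```python
# First row of the circulant MDS matrix (final Whirlpool revision, assumed).
MDS_ROW = [0x01, 0x01, 0x04, 0x01, 0x08, 0x05, 0x02, 0x09]


def gf_mul(x, y):
    """Multiply x and y in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11D)."""
    result = 0
    while y:
        if y & 1:
            result ^= x          # add (XOR) the current multiple of x
        x <<= 1
        if x & 0x100:
            x ^= 0x11D           # reduce when the degree reaches 8
        y >>= 1
    return result


def mix_row(row):
    """Return the row multiplied by the MDS matrix, one output byte per column j."""
    out = []
    for j in range(8):
        acc = 0
        for k in range(8):
            acc ^= gf_mul(MDS_ROW[(j - k) % 8], row[k])
        out.append(acc)
    return out


if __name__ == "__main__":
    print([hex(v) for v in mix_row([1, 0, 0, 0, 0, 0, 0, 0])])
```

A row whose only non-zero byte is a 1 in the first position is mapped to the coefficient row itself, which shows how a single input byte influences all eight output bytes.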
|
||||
|
||||
\medskip
|
||||
|
||||
\textbf{AddRoundKey} XORs the state with the current round key.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The S-box and the MDS matrix coefficients are fixed tables defined in the
|
||||
Whirlpool specification; their values are too large to reproduce here. The
|
||||
round constants $\mathrm{RC}[r]$, $r \in \mathbb{N},\ 1 \leq r \leq 10$, are
|
||||
however directly derived from the S-box. Each $\mathrm{RC}[r]$ is an 8-word
|
||||
state where only the first word is non-zero:
|
||||
|
||||
\begin{align*}
|
||||
\mathrm{RC}[r][0] &= \sum_{k=0}^{7} S[8(r-1)+k] \cdot 2^{8(7-k)} \\
|
||||
\mathrm{RC}[r][j] &= 0 \quad \forall j \in \mathbb{N},\ 1 \leq j \leq 7
|
||||
\end{align*}
|
||||
|
||||
\noindent Their role is to break symmetry in the key schedule: without them, a
|
||||
symmetric input state would produce symmetric round keys, weakening the internal
|
||||
block transformation.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The round keys $K[r]$, $r \in \mathbb{N},\ 0 \leq r \leq 10$, are derived from
|
||||
the current hash state. $K[0]$ is set to the state before processing the block.
|
||||
Each subsequent key is obtained by applying the round function to the previous
|
||||
key with a precomputed round constant, where $\text{Round}(S, K)$ denotes the
|
||||
successive application of SubBytes, ShiftColumns, MixRows, and AddRoundKey with
|
||||
key $K$ to state $S$:
|
||||
|
||||
\begin{align*}
|
||||
K[0] &= H \\
|
||||
K[r] &= \text{Round}(K[r-1],\ \mathrm{RC}[r]) \quad r \in \mathbb{N},\ 1 \leq r \leq 10
|
||||
\end{align*}
|
||||
|
||||
The block $M$ is then encrypted with these round keys: it is first XORed with
$K[0]$, and the round function is then applied ten times, using $K[1]$ through $K[10]$.
|
||||
The final state update follows the Miyaguchi-Preneel scheme:
|
||||
|
||||
\begin{align*}
|
||||
H \leftarrow E(H,\ M) \oplus M \oplus H
|
||||
\end{align*}
|
||||
|
||||
\noindent where $E(H, M)$ denotes the encryption of $M$ with key schedule
|
||||
derived from $H$.
|
||||
|
||||
\vspace{1em}
|
||||
|
||||
The state is initialized to all zeros. After all blocks have been processed,
|
||||
the eight 64-bit state words are serialized in big-endian order to produce the
|
||||
512-bit digest.
|
||||