From aa907ef4a3a8c4d7d1309002ed0089e8e2dd3c42 Mon Sep 17 00:00:00 2001 From: Andreas Tsouchlos Date: Mon, 4 May 2026 12:12:10 +0200 Subject: [PATCH] Incorporate Lia's corrections to classical fundamentals --- src/thesis/chapters/2_fundamentals.tex | 105 +++++++++++++------------ 1 file changed, 54 insertions(+), 51 deletions(-) diff --git a/src/thesis/chapters/2_fundamentals.tex b/src/thesis/chapters/2_fundamentals.tex index 373f7f1..ce196dc 100644 --- a/src/thesis/chapters/2_fundamentals.tex +++ b/src/thesis/chapters/2_fundamentals.tex @@ -15,10 +15,10 @@ these topics and subsequently introduces the fundamentals of \ac{qec}. The core concept underpinning error correcting codes is the realization that introducing a finite amount of redundancy to information before transmission can considerably reduce the error rate. -Specifically, Shannon proved in 1948 that for any channel, a block -code can be found that achieves arbitrarily small probability of -error at any communication rate up to the capacity of the channel -when the block length approaches infinity +Specifically, consider a block code of length $n$ used to communicate +over a channel with capacity $C$ at rate $R$. +Shannon proved in 1948 that for any rate $R < C$, the probability of +error can be made arbitrarily small as $n \to \infty$ \cite[Sec.~13]{shannon_mathematical_1948}. In this section, we explore the concepts of ``classical'' (as in non-quantum) @@ -54,36 +54,34 @@ We call the set of all codewords $\mathcal{C}$ the \textit{code} % d_min and the [] Notation % -During the encoding process, a mapping from $\mathbb{F}_2^k$ -onto $\mathcal{C} \subset \mathbb{F}_2^n$ takes place. -The input messages are mapped onto an expanded vector space, where -they are ``further apart'', giving rise to the error correcting -properties of the code. 
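A small numerical aside on the revised theorem statement: for a concrete channel the claim can be checked directly. The sketch below uses the standard capacity formula $C = 1 - H_2(p)$ of the binary symmetric channel; the crossover probability 0.05 and the rate $R = 4/7$ of the [7,4,3] Hamming code are illustrative values only, not taken from the thesis.

```python
import math

def bsc_capacity(p):
    """Capacity C = 1 - H2(p) of a binary symmetric channel with
    crossover probability p, in bits per channel use."""
    if p in (0.0, 1.0):
        return 1.0
    h2 = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return 1.0 - h2

# Shannon (1948): any rate R < C is achievable with vanishing error
# probability as the block length n goes to infinity.
C = bsc_capacity(0.05)  # illustrative crossover probability
R = 4 / 7               # rate of the [7,4,3] Hamming code
assert R < C            # reliable communication is possible in principle
```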
-This notion of the distance between two codewords $\bm{x}_1$ and
-$\bm{x}_2$ can be expressed using the \textit{Hamming distance} $d(\bm{x}_1,
-\bm{x}_2)$, which is defined as the number of positions in which they differ.
-We define the \textit{minimum distance} of a code $\mathcal{C}$ as
+During the encoding process, a mapping from $\mathbb{F}_2^k$ onto
+$\mathcal{C} \subset \mathbb{F}_2^n$ takes place.
+Since $n > k$, only a fraction $2^{k-n}$ of the vectors in $\mathbb{F}_2^n$
+are valid codewords, leaving room to choose $\mathcal{C}$ such that any two
+distinct codewords differ in many positions.
+This separation gives rise to the error correcting properties of the code
+and is quantified by the \textit{Hamming distance} $d(\bm{x}_1, \bm{x}_2)$,
+defined as the number of positions in which $\bm{x}_1$ and $\bm{x}_2$ differ.
+The \textit{minimum distance} of a code $\mathcal{C}$ is then defined as
 %
 \begin{align*}
     d_\text{min} := \min \left\{ d(\bm{x}_1, \bm{x}_2) : \bm{x}_1,
-    \bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\}
-    .
+    \bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\}.
 \end{align*}
 %
-We can signify that a binary linear block code has information length
-$k$, block length $n$ and minimum distance $d_\text{min}$ using the
-notation $[n,k,d_\text{min}]$ \cite[Sec.~1.3]{macwilliams_theory_1977}.
+We denote a binary linear block code with information length
+$k$, block length $n$ and minimum distance $d_\text{min}$ by
+$[n,k,d_\text{min}]$.
 %
 % Parity checks, H, and the syndrome
 %
-A particularly elegant way of describing the subspace $C$ of
-$\mathbb{F}_2^n$ that the codewords make up is the notion of
-\textit{parity checks}.
+A particularly elegant way of describing the code space $\mathcal{C}$ is the
+notion of \textit{parity checks}.
 Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n
-\rvert = 2^n$, we could introduce $n-k$ conditions to constrain the
-additional degrees of freedom.
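The Hamming-distance and minimum-distance definitions can be checked by brute force on small codes. The generator matrix below is one standard systematic form of the [7,4,3] Hamming code, used purely as an example (it is not quoted from the thesis):

```python
from itertools import combinations, product

def hamming_distance(x1, x2):
    """Number of positions in which two codewords differ."""
    return sum(a != b for a, b in zip(x1, x2))

def minimum_distance(code):
    """d_min: minimum pairwise Hamming distance over distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

# One standard generator matrix of the [7,4,3] Hamming code (G = [I | P]).
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

# Map all 2^k messages in F_2^4 onto codewords in F_2^7: c = m G over F_2.
code = [tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G))
        for msg in product([0, 1], repeat=4)]

assert len(code) == 2 ** 4           # only a fraction 2^(k-n) of F_2^7
assert minimum_distance(code) == 3   # hence the "3" in [7,4,3]
```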
+\rvert = 2^n$, $n-k$ conditions are needed to constrain the additional
+degrees of freedom.
 These conditions, called parity checks, take the form of
 equations over $\mathbb{F}_2^n$, linking the individual positions of
 each codeword. We can arrange the coefficients of these equations in a
@@ -99,7 +97,7 @@ Note that in general we may have linearly dependent parity checks,
 prompting us to define the \ac{pcm} as $\bm{H} \in
 \mathbb{F}_2^{m\times n}$ with $\hspace{2mm} m \ge n-k$ instead.
 The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes
-which parity checks a candidate codeword $\bm{v} \in \mathbb{F}_2^n$ violates.
+which parity checks a vector $\bm{v} \in \mathbb{F}_2^n$ violates.
 The representation using the \ac{pcm} has the benefit of providing a
 description of the code, the memory complexity of which does not grow
 exponentially with $n$, in contrast to keeping track of all codewords directly.
@@ -173,9 +171,10 @@ Shannon's noisy-channel coding theorem is stated for codes whose
 block length approaches infinity.
 This suggests that as the block length becomes larger, the
 performance of the considered codes should generally improve.
-However, the size of the \ac{pcm}, and thus in general the decoding complexity,
-of a linear block code grows quadratically with $n$.
-This would quickly render decoding intractable as we increase the block length.
+However, the size of the \ac{pcm} of a linear block code grows
+quadratically with $n$.
+This would quickly render decoding intractable as we increase the
+block length, due to growing memory and computational requirements.
 We can get around this problem by constructing $\bm{H}$ in such a
 manner that the number of nonzero entries grows less than
 quadratically, e.g., only linearly.
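To make the parity-check description concrete, here is a short sketch computing the syndrome $\bm{s} = \bm{H}\bm{v}^\text{T}$ over $\mathbb{F}_2$. The matrix is a standard PCM of the [7,4,3] Hamming code, chosen only for illustration:

```python
# A standard PCM of the [7,4,3] Hamming code: m = n - k = 3 parity checks.
H = [
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
]

def syndrome(H, v):
    """s = H v^T over F_2: entry i is 1 iff v violates parity check i."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

v = (1, 0, 0, 0, 0, 1, 1)                  # a codeword: satisfies all checks
assert syndrome(H, v) == (0, 0, 0)

corrupted = (1, 0, 0, 0, 0, 1, 0)          # last bit flipped
assert syndrome(H, corrupted) == (0, 0, 1) # equals the last column of H
```

Storing $\bm{H}$ costs $m \times n$ entries rather than $2^k$ codewords, which is the memory advantage described above; keeping $\bm{H}$ sparse is what the LDPC construction discussed next exploits.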
@@ -191,16 +190,16 @@ These differ from ``classical codes'' in their decoding algorithms:
 Classical codes are usually decoded using one-step hard-decision
 decoding, whereas modern codes are suitable for iterative
 soft-decision decoding \cite[Preface]{ryan_channel_2009}.
 The iterative decoding algorithms
-in question are generally defined in terms of message passing on the
-\textit{Tanner graph} of the code. The Tanner graph is a bipartite
+are generally defined in terms of message passing on the
+\textit{Tanner graph} of a code. The Tanner graph is a bipartite
 graph that constitutes an alternative representation of the \ac{pcm}.
 We define two types of nodes: \acp{vn}, corresponding to codeword
 bits, and \acp{cn}, corresponding to individual parity checks.
 We then construct the Tanner graph by connecting each \ac{cn} to the
 \acp{vn} that make up the corresponding parity check
 \cite[Sec.~5.1.2]{ryan_channel_2009}.
-\Cref{PCM and Tanner graph of the Hamming code} shows this
-construction for the [7,4,3]-Hamming code.
+\Cref{PCM and Tanner graph of the Hamming code} shows the Tanner
+graph of the [7,4,3]-Hamming code.
 %
 \begin{figure}[t]
     \centering
@@ -290,8 +289,9 @@ We typically evaluate the performance of LDPC codes using the
 transmitted block in this context).
 Considering an \ac{awgn} channel, \Cref{fig:ldpc-perf} shows a
 qualitative performance characteristic of an \ac{ldpc} code
-\cite[Fig.~1]{costello_spatially_2014}. We talk of the
-\textit{waterfall} and the \textit{error floor} regions.
+\cite[Fig.~1]{costello_spatially_2014}.
+We can observe the \textit{waterfall} and the \textit{error floor}
+regions typical of \ac{ldpc} codes under iterative decoding.
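The Tanner-graph construction described here, one CN per row of $\bm{H}$, one VN per column, and an edge wherever the entry is 1, can be sketched as follows ($\bm{H}$ is again an illustrative Hamming PCM, not material from the thesis):

```python
def tanner_graph(H):
    """Bipartite edge list: CN c_i is connected to VN v_j iff H[i][j] = 1."""
    return [(f"c{i}", f"v{j}")
            for i, row in enumerate(H)
            for j, entry in enumerate(row) if entry]

H = [
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
]
edges = tanner_graph(H)

# Each CN's degree is the weight of its row; each VN's degree the weight
# of its column.  For a sparse (LDPC) matrix the number of edges, and
# hence the message-passing work per iteration, grows only linearly in n.
assert len(edges) == sum(sum(row) for row in H)  # 12 edges here
assert ("c0", "v1") in edges and ("c0", "v0") not in edges
```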
 \begin{figure}[t]
     \centering
@@ -330,7 +330,7 @@ qualitative performance characteristic of an \ac{ldpc} code
             (1.8684E+00, 6.2247E-09)
             (1.9053E+00, 1E-09)
         };
-        \addlegendentry{Regular}
+        \addlegendentry{Regular LDPC codes}

         \addplot+[mark=none, solid, smooth, KITorange] coordinates {
             (4.5789E-01, 1.1821E-01)
@@ -351,7 +351,7 @@ qualitative performance characteristic of an \ac{ldpc} code
             (2.2947E+00, 3.1876E-09)
             % (2.8842E+00, 2.0403E-09)
         };
-        \addlegendentry{Irregular}
+        \addlegendentry{Irregular LDPC codes}

         \draw[gray, densely dashed] (axis cs:0.65, 2e-3)
             rectangle (axis cs:1.65, 5e-5);
@@ -364,9 +364,9 @@ qualitative performance characteristic of an \ac{ldpc} code
     \end{tikzpicture}

     \caption{
-        Qualitative performance characteristic of \ac{ldpc} code
-        in an \ac{awgn} channel. Adapted from
-        \cite[Fig.~1]{costello_spatially_2014}.
+        Qualitative performance characteristic of regular and
+        irregular \ac{ldpc} codes in an \ac{awgn} channel.
+        Adapted from \cite[Fig.~1]{costello_spatially_2014}.
     }
     \label{fig:ldpc-perf}
 \end{figure}

@@ -379,15 +379,16 @@ the numbers of ones, of their rows and columns are constant
 Already during their introduction, regular \ac{ldpc} codes were shown
 to have a minimum distance scaling linearly with the block length $n$
 for large values \cite[Ch.~2,~Theorem~1]{gallager_low_1960},
-which leads to them not exhibiting an error floor under \ac{ml} decoding.
+meaning that they do not exhibit an error floor under
+\ac{ml} decoding.
 Irregular codes, on the other hand, generally do exhibit an error floor,
-their redeeming quality being the ability to reach near-capacity
+while their redeeming quality is the ability to reach near-capacity
 performance in the waterfall region \cite[Intro.]{costello_spatially_2014}.

 \subsection{Spatially-Coupled LDPC Codes}

-A relatively recent development in the world of \ac{ldpc} codes is
-that of \ac{sc}-\ac{ldpc} codes.
+A recent development in the field of \ac{ldpc} codes is that of +\ac{sc}-\ac{ldpc} codes. Their key feature is that they combine the best properties of regular and irregular codes. They have a minimum distance that grows linearly with $n$, promising @@ -396,8 +397,8 @@ iterative decoding behavior, promising good performance in the waterfall region \cite[Intro.]{costello_spatially_2014}. The essential property of \ac{sc}-\ac{ldpc} codes is that codewords -from different \textit{spatial positions}, that would ordinarily be sent -one after the other independently, are coupled. +from different \textit{spatial positions}, which would ordinarily be sent +one after the other independently, are linked. This is achieved by connecting some \acp{vn} of one spatial position to \acp{cn} of another, resulting in a \ac{pcm} of the form \cite[Eq.~1]{hassan_fully_2016} @@ -498,7 +499,7 @@ This construction results in a Tanner graph as depicted in \draw[decorate, decoration={brace, amplitude=10pt}] ([xshift=-5mm,yshift=2mm]vn00.north) -- ([xshift=5mm,yshift=2mm]vn00.north -| cn20.north) - node[midway, above=4mm] {K}; + node[midway, above=4mm] {$K=2$}; \end{tikzpicture} \caption{ @@ -508,8 +509,7 @@ This construction results in a Tanner graph as depicted in \label{fig:sc-ldpc-tanner} \end{figure} -Note that at the first and last few spatial positions, some \acp{cn} -have lower degrees. +Note that at the first few spatial positions some \acp{cn} have lower degrees. This leads to more reliable information about the \acp{vn} that, as we will see, is later passed to subsequent spatial positions during decoding. @@ -526,13 +526,16 @@ algorithms, something that is possible due to their sparsity \cite[Sec.~5.3]{ryan_channel_2009}. The algorithm originally proposed alongside LDPC codes for this purpose by Gallager in 1960 is now known as the \ac{spa} -\cite[5.4.1]{ryan_channel_2009}, also called \ac{bp}. +\cite[5.4.1]{ryan_channel_2009}, also called \acf{bp}. 
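The band-diagonal structure of a coupled PCM can be illustrated with a small sketch. The split of a base matrix into two component matrices `B0` and `B1`, the coupling width of 2, and the boundary handling below are simplifying assumptions for illustration, not the exact construction of the cited reference:

```python
def coupled_pcm(B0, B1, L):
    """Band-diagonal PCM for L spatial positions with coupling width 2:
    column block t is checked by row blocks t (via B0) and t+1 (via B1).
    A sketch of the generic SC-LDPC construction, not the exact matrix
    used in the thesis."""
    mb, nb = len(B0), len(B0[0])
    rows, cols = (L + 1) * mb, L * nb
    H = [[0] * cols for _ in range(rows)]
    for t in range(L):
        for i in range(mb):
            for j in range(nb):
                H[t * mb + i][t * nb + j] = B0[i][j]
                H[(t + 1) * mb + i][t * nb + j] = B1[i][j]
    return H

# Toy split of a weight-4 base row into two component matrices.
B0 = [[1, 1, 0, 1]]
B1 = [[0, 0, 1, 0]]
H = coupled_pcm(B0, B1, L=3)

# Boundary row blocks contain fewer ones, i.e. lower-degree CNs, which
# is the source of the more reliable information at the chain ends.
assert sum(H[0]) < sum(H[1])
```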
 The optimality criterion the \ac{spa} is built around is a
 symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
-The core idea of the resulting algorithm is to view \acp{cn} as
-representing single-parity check codes and \acp{vn} as representing
-repetition codes.
+The core idea of the resulting algorithm is to view \acp{cn}
+and \acp{vn} as representing individual local codes.
+A \ac{cn} represents a single parity check on the connected \acp{vn},
+so it can be understood as a single-parity-check code.
+Similarly, a \ac{vn} represents a bit, and all connected \acp{cn}
+should agree on its value; it can therefore be understood as a repetition code.
 The algorithm alternates between consolidating soft information about
 the \acp{vn} in the \acp{cn}, and consolidating soft information
 about the \acp{cn} in the \acp{vn}.
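The alternation described here can be sketched as a flooding-schedule sum-product decoder in the log-likelihood-ratio (LLR) domain, where a positive LLR means bit 0 is more likely. The CN update is the tanh rule of a single-parity-check code; the VN update sums incoming LLRs as a repetition code would. This is an illustrative sketch under those standard conventions, not the implementation used in the thesis:

```python
import math

def sum_product_decode(H, llr, iterations=10):
    """Flooding-schedule sum-product (belief propagation) decoding.
    llr[j] > 0 means bit j is more likely 0."""
    m, n = len(H), len(H[0])
    edges = [(i, j) for i in range(m) for j in range(n) if H[i][j]]
    v2c = {e: llr[e[1]] for e in edges}  # VN -> CN messages
    c2v = {e: 0.0 for e in edges}        # CN -> VN messages
    for _ in range(iterations):
        # CN update: each CN acts as a single-parity-check code (tanh rule).
        for (i, j) in edges:
            prod = 1.0
            for (i2, j2) in edges:
                if i2 == i and j2 != j:
                    prod *= math.tanh(v2c[(i2, j2)] / 2.0)
            prod = max(min(prod, 1.0 - 1e-12), -1.0 + 1e-12)
            c2v[(i, j)] = 2.0 * math.atanh(prod)
        # VN update: each VN acts as a repetition code (sum of LLRs).
        for (i, j) in edges:
            v2c[(i, j)] = llr[j] + sum(c2v[(i2, j2)] for (i2, j2) in edges
                                       if j2 == j and i2 != i)
    # Final decision: channel LLR plus all incoming CN messages.
    total = [llr[j] + sum(c2v[(i2, j2)] for (i2, j2) in edges if j2 == j)
             for j in range(n)]
    return [0 if t >= 0 else 1 for t in total]

H = [
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
]
# All-zero codeword sent; the last bit is received unreliably flipped.
llr = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, -1.0]
assert sum_product_decode(H, llr) == [0] * 7  # the error is corrected
```

Note how the erroneous degree-1 VN is pulled back to the correct value only by the soft information consolidated in its single CN, which is exactly the alternation described above.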