Incorporate Lia's corrections to classical fundamentals

This commit is contained in:
2026-05-04 12:12:10 +02:00
parent 12036caa91
commit aa907ef4a3


@@ -15,10 +15,10 @@ these topics and subsequently introduces the fundamentals of \ac{qec}.
The core concept underpinning error correcting codes is the
realization that introducing a finite amount of redundancy to
information before transmission can considerably reduce the error rate.
Specifically, consider a block code of length $n$ used to communicate
over a channel with capacity $C$ at rate $R$.
Shannon proved in 1948 that for any rate $R < C$, the probability of
error can be made arbitrarily small as $n \to \infty$
\cite[Sec.~13]{shannon_mathematical_1948}.
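As a concrete illustration (not part of the original text), consider the binary symmetric channel with crossover probability $p$, whose capacity is $C = 1 - h(p)$ with $h$ the binary entropy function. A short Python sketch:

```python
import math

def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits; h(0) = h(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p: float) -> float:
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

# For p = 0.11 the capacity is roughly 0.5, so any rate R < 0.5 is
# achievable with vanishing error probability as n grows.
print(bsc_capacity(0.11))
```

The theorem guarantees such codes exist; it does not say how to construct or decode them efficiently, which is what the remainder of the section addresses.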
In this section, we explore the concepts of ``classical'' (as in non-quantum)
@@ -54,36 +54,34 @@ We call the set of all codewords $\mathcal{C}$ the \textit{code}
% d_min and the [] Notation
%
During the encoding process, a mapping from $\mathbb{F}_2^k$ onto
$\mathcal{C} \subset \mathbb{F}_2^n$ takes place.
Since $n > k$, only a fraction $2^{k-n}$ of the vectors in $\mathbb{F}_2^n$
are valid codewords, leaving room to choose $\mathcal{C}$ such that any two
distinct codewords differ in many positions.
This separation gives rise to the error correcting properties of the code
and is quantified by the \textit{Hamming distance} $d(\bm{x}_1, \bm{x}_2)$,
defined as the number of positions in which $\bm{x}_1$ and $\bm{x}_2$ differ.
The \textit{minimum distance} of a code $\mathcal{C}$ is then defined as
%
\begin{align*}
d_\text{min} := \min \left\{ d(\bm{x}_1, \bm{x}_2) : \bm{x}_1,
\bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\}.
\end{align*}
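To make these definitions concrete, here is a brute-force sketch in Python (illustrative only; for linear codes $d_\text{min}$ also equals the minimum weight of a nonzero codeword, but the pairwise definition above is used directly):

```python
from itertools import combinations

def hamming_distance(x1, x2):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(x1, x2))

def minimum_distance(code):
    """d_min: smallest pairwise Hamming distance over distinct codewords."""
    return min(hamming_distance(x1, x2) for x1, x2 in combinations(code, 2))

# Toy [3,1,3] repetition code: its two codewords differ in all 3 positions.
repetition_code = [(0, 0, 0), (1, 1, 1)]
print(minimum_distance(repetition_code))  # 3
```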
%
We denote a binary linear block code with information length
$k$, block length $n$ and minimum distance $d_\text{min}$ by
$[n,k,d_\text{min}]$ \cite[Sec.~1.3]{macwilliams_theory_1977}.
%
% Parity checks, H, and the syndrome
%
A particularly elegant way of describing the code space $C$ is the
notion of \textit{parity checks}.
Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n
\rvert = 2^n$, there are $n-k$ conditions that constrain the
additional degrees of freedom.
These conditions, called parity checks, take the form of equations
over $\mathbb{F}_2^n$, linking the individual positions of each codeword.
We can arrange the coefficients of these equations in a
@@ -99,7 +97,7 @@ Note that in general we may have linearly dependent parity checks,
prompting us to define the \ac{pcm} as $\bm{H} \in
\mathbb{F}_2^{m\times n}$ with $\hspace{2mm} m \ge n-k$ instead.
The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes
which parity checks a vector $\bm{v} \in \mathbb{F}_2^n$ violates.
The representation using the \ac{pcm} has the benefit of providing a
description of the code, the memory complexity of which does not grow
exponentially with $n$, in contrast to keeping track of all codewords directly.
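As an illustration (not from the original text), the syndrome computation can be sketched in plain Python for the [7,4,3] Hamming code; the particular $\bm{H}$ chosen here is one common convention in which column $j$ holds the binary representation of $j+1$:

```python
# PCM of the [7,4,3] Hamming code; column j holds the binary
# representation of j + 1 (most significant bit in the first row).
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

def syndrome(H, v):
    """s = H v^T over F_2; the all-zero syndrome means v passes every check."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]

codeword = [0] * 7       # the all-zero word is a codeword of any linear code
received = list(codeword)
received[4] = 1          # single bit error in position 5
print(syndrome(H, codeword))  # [0, 0, 0]
print(syndrome(H, received))  # [1, 0, 1]: binary 5, the error position
```

With this column ordering a single bit error yields a syndrome that reads off the error position directly, which is a convenient property of this particular Hamming-code construction.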
@@ -173,9 +171,10 @@ Shannon's noisy-channel coding theorem is stated for codes whose block
length approaches infinity. This suggests that as the block length
becomes larger, the performance of the considered codes should
generally improve.
However, the size of the \ac{pcm} of a linear block code grows
quadratically with $n$.
This would quickly render decoding intractable as we increase the
block length, due to increased memory and computational complexity.
We can get around this problem by constructing $\bm{H}$ in such a
manner that the number of nonzero entries grows less than quadratically, e.g.,
only linearly.
@@ -191,16 +190,16 @@ These differ from ``classical codes'' in their decoding algorithms:
Classical codes are usually decoded using one-step hard-decision decoding,
whereas modern codes are suitable for iterative soft-decision
decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms
are generally defined in terms of message passing on the
\textit{Tanner graph} of a code. The Tanner graph is a bipartite
graph that constitutes an alternative representation of the \ac{pcm}.
We define two types of nodes: \acp{vn}, corresponding to codeword
bits, and \acp{cn}, corresponding to individual parity checks.
We then construct the Tanner graph by connecting each \ac{cn} to
the \acp{vn} that make up the corresponding parity check
\cite[Sec.~5.1.2]{ryan_channel_2009}.
\Cref{PCM and Tanner graph of the Hamming code} shows the Tanner
graph of the [7,4,3]-Hamming code.
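To make the construction concrete, here is a small sketch (illustrative, not from the thesis) that extracts the Tanner graph edges from a \ac{pcm}; for the [7,4,3] Hamming code with the usual choice of $\bm{H}$, every check node ends up with degree four:

```python
def tanner_edges(H):
    """Tanner graph edges: one edge (CN i, VN j) per nonzero entry H[i][j]."""
    return [(i, j) for i, row in enumerate(H) for j, h in enumerate(row) if h]

# One common PCM of the [7,4,3] Hamming code.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

edges = tanner_edges(H)
cn_degrees = [sum(row) for row in H]
print(len(edges))    # 12 edges, one per nonzero entry of H
print(cn_degrees)    # [4, 4, 4]: each CN checks the parity of four VNs
```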
%
\begin{figure}[t]
\centering
@@ -290,8 +289,9 @@ We typically evaluate the performance of LDPC codes using the
transmitted block in this context).
Considering an \ac{awgn} channel, \Cref{fig:ldpc-perf} shows a
qualitative performance characteristic of an \ac{ldpc} code
\cite[Fig.~1]{costello_spatially_2014}.
We can observe the \textit{waterfall} and the \textit{error floor}
regions typical of \ac{ldpc} codes under iterative decoding.
\begin{figure}[t]
\centering
@@ -330,7 +330,7 @@ qualitative performance characteristic of an \ac{ldpc} code
(1.8684E+00, 6.2247E-09)
(1.9053E+00, 1E-09)
};
\addlegendentry{Regular LDPC codes}
\addplot+[mark=none, solid, smooth, KITorange] coordinates {
(4.5789E-01, 1.1821E-01)
@@ -351,7 +351,7 @@ qualitative performance characteristic of an \ac{ldpc} code
(2.2947E+00, 3.1876E-09)
% (2.8842E+00, 2.0403E-09)
};
\addlegendentry{Irregular LDPC codes}
\draw[gray, densely dashed]
(axis cs:0.65, 2e-3) rectangle (axis cs:1.65, 5e-5);
@@ -364,9 +364,9 @@ qualitative performance characteristic of an \ac{ldpc} code
\end{tikzpicture}
\caption{
Qualitative performance characteristic of regular and
irregular \ac{ldpc} codes in an \ac{awgn} channel.
Adapted from \cite[Fig.~1]{costello_spatially_2014}.
}
\label{fig:ldpc-perf}
\end{figure}
@@ -379,15 +379,16 @@ the numbers of ones, of their rows and columns are constant
Already during their introduction, regular \ac{ldpc} codes were shown to have
a minimum distance scaling linearly with the block length $n$ for
large values \cite[Ch.~2,~Theorem~1]{gallager_low_1960},
meaning they do not exhibit an error floor under \ac{ml} decoding.
Irregular codes, on the other hand, generally do exhibit an error floor,
while their redeeming quality is the ability to reach near-capacity
performance in the waterfall region \cite[Intro.]{costello_spatially_2014}.
\subsection{Spatially-Coupled LDPC Codes}
A recent development in the field of \ac{ldpc} codes is that of
\ac{sc}-\ac{ldpc} codes.
Their key feature is that they combine the best properties of regular
and irregular codes.
They have a minimum distance that grows linearly with $n$, promising
@@ -396,8 +397,8 @@ iterative decoding behavior, promising good performance in the
waterfall region \cite[Intro.]{costello_spatially_2014}.
The essential property of \ac{sc}-\ac{ldpc} codes is that codewords
from different \textit{spatial positions}, which would ordinarily be sent
one after the other independently, are linked.
This is achieved by connecting some \acp{vn} of one spatial position to
\acp{cn} of another, resulting in a \ac{pcm} of the form
\cite[Eq.~1]{hassan_fully_2016}
@@ -498,7 +499,7 @@ This construction results in a Tanner graph as depicted in
\draw[decorate, decoration={brace, amplitude=10pt}]
([xshift=-5mm,yshift=2mm]vn00.north) --
([xshift=5mm,yshift=2mm]vn00.north -| cn20.north)
node[midway, above=4mm] {$K=2$};
\end{tikzpicture}
\caption{
@@ -508,8 +509,7 @@ This construction results in a Tanner graph as depicted in
\label{fig:sc-ldpc-tanner}
\end{figure}
Note that at the first few spatial positions, some \acp{cn} have lower degrees.
This leads to more reliable information about the
\acp{vn} that, as we will see, is
later passed to subsequent spatial positions during decoding.
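The chain construction can be sketched in code. The following is a minimal illustration assuming coupling memory 1 and two hypothetical component matrices `H0` and `H1` (the general form with larger memory follows the cited equation); it is not the construction used in the thesis:

```python
def coupled_pcm(H0, H1, L):
    """Band-diagonal PCM of an SC-LDPC chain with coupling memory 1:
    spatial position t places H0 in block row t and H1 in block row t + 1,
    so consecutive positions share parity checks."""
    m, n = len(H0), len(H0[0])
    rows, cols = (L + 1) * m, L * n   # one extra block row terminates the chain
    H = [[0] * cols for _ in range(rows)]
    for t in range(L):
        for i in range(m):
            for j in range(n):
                H[t * m + i][t * n + j] = H0[i][j]
                H[(t + 1) * m + i][t * n + j] = H1[i][j]
    return H

# Toy components: one parity check on two bits per spatial position.
Hsc = coupled_pcm([[1, 1]], [[1, 1]], L=3)
row_degrees = [sum(row) for row in Hsc]
# Checks at the ends of the chain have lower degree (2 instead of 4),
# the source of the more reliable information described in the text.
print(row_degrees)  # [2, 4, 4, 2]
```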
@@ -526,13 +526,16 @@ algorithms, something that is possible due to their sparsity
\cite[Sec.~5.3]{ryan_channel_2009}.
The algorithm originally proposed alongside LDPC codes for this
purpose by Gallager in 1960 is now known as the \ac{spa}
\cite[5.4.1]{ryan_channel_2009}, also called \acf{bp}.
The optimality criterion the \ac{spa} is built around is a
symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
The core idea of the resulting algorithm is to view \acp{cn}
and \acp{vn} as representing individual local codes.
A \ac{cn} represents a single parity check on the connected \acp{vn},
so it can be understood as a single-parity check code.
Similarly, a \ac{vn} represents a bit, and all connected \acp{cn}
should agree on its value; it can therefore be understood as a repetition code.
The algorithm alternates between consolidating soft information about
the \acp{vn} in the \acp{cn}, and consolidating soft information about
the \acp{cn} in the \acp{vn}.
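This alternation can be sketched as one iteration of LLR-domain message passing. The following is a minimal dense-matrix sketch under the usual tanh-rule formulation, not the thesis's implementation; practical decoders exploit the sparsity of $\bm{H}$ instead of iterating over all matrix entries:

```python
import math

def spa_iteration(H, llr_ch, msg_cv):
    """One SPA iteration on the Tanner graph of H.
    msg_cv[i][j]: message from CN i to VN j (0.0 where no edge).
    Returns updated CN-to-VN messages and consolidated per-bit LLRs."""
    m, n = len(H), len(H[0])
    # VN update (repetition-code view): channel LLR plus all incoming
    # CN messages except the one on the edge being updated.
    msg_vc = [[0.0] * n for _ in range(m)]
    for j in range(n):
        total = llr_ch[j] + sum(msg_cv[i][j] for i in range(m) if H[i][j])
        for i in range(m):
            if H[i][j]:
                msg_vc[i][j] = total - msg_cv[i][j]
    # CN update (single-parity-check view): tanh rule over all incoming
    # VN messages except the edge being updated.
    new_cv = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            if H[i][j]:
                prod = 1.0
                for jj in range(n):
                    if H[i][jj] and jj != j:
                        prod *= math.tanh(msg_vc[i][jj] / 2.0)
                prod = max(min(prod, 0.999999), -0.999999)  # avoid atanh(+-1)
                new_cv[i][j] = 2.0 * math.atanh(prod)
    llr_out = [llr_ch[j] + sum(new_cv[i][j] for i in range(m) if H[i][j])
               for j in range(n)]
    return new_cv, llr_out

H = [[1, 1, 1]]                  # a single parity check on three bits
llr_ch = [2.0, 2.0, -1.0]        # channel weakly favors bit 3 = 1: parity conflict
_, llr = spa_iteration(H, llr_ch, [[0.0, 0.0, 0.0]])
# The check's extrinsic information flips the weak third bit toward 0,
# restoring even parity.
```

Hard decisions are taken from the signs of the consolidated LLRs after each iteration; decoding stops once the syndrome is zero or an iteration limit is reached.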