Incorporate Lia's corrections to classical fundamentals

2026-05-04 12:12:10 +02:00
parent 12036caa91
commit aa907ef4a3


@@ -15,10 +15,10 @@ these topics and subsequently introduces the fundamentals of \ac{qec}.
 The core concept underpinning error correcting codes is the
 realization that introducing a finite amount of redundancy to
 information before transmission can considerably reduce the error rate.
-Specifically, Shannon proved in 1948 that for any channel, a block
-code can be found that achieves arbitrarily small probability of
-error at any communication rate up to the capacity of the channel
-when the block length approaches infinity
+Specifically, consider a block code of length $n$ used to communicate
+over a channel with capacity $C$ at rate $R$.
+Shannon proved in 1948 that for any rate $R < C$, the probability of
+error can be made arbitrarily small as $n \to \infty$
 \cite[Sec.~13]{shannon_mathematical_1948}.
 In this section, we explore the concepts of ``classical'' (as in non-quantum)
@@ -54,36 +54,34 @@ We call the set of all codewords $\mathcal{C}$ the \textit{code}
 % d_min and the [] Notation
 %
-During the encoding process, a mapping from $\mathbb{F}_2^k$
-onto $\mathcal{C} \subset \mathbb{F}_2^n$ takes place.
-The input messages are mapped onto an expanded vector space, where
-they are ``further apart'', giving rise to the error correcting
-properties of the code.
-This notion of the distance between two codewords $\bm{x}_1$ and
-$\bm{x}_2$ can be expressed using the \textit{Hamming distance} $d(\bm{x}_1,
-\bm{x}_2)$, which is defined as the number of positions in which they differ.
-We define the \textit{minimum distance} of a code $\mathcal{C}$ as
+During the encoding process, a mapping from $\mathbb{F}_2^k$ onto
+$\mathcal{C} \subset \mathbb{F}_2^n$ takes place.
+Since $n > k$, only a fraction $2^{k-n}$ of the vectors in $\mathbb{F}_2^n$
+are valid codewords, leaving room to choose $\mathcal{C}$ such that any two
+distinct codewords differ in many positions.
+This separation gives rise to the error correcting properties of the code
+and is quantified by the \textit{Hamming distance} $d(\bm{x}_1, \bm{x}_2)$,
+defined as the number of positions in which $\bm{x}_1$ and $\bm{x}_2$ differ.
+The \textit{minimum distance} of a code $\mathcal{C}$ is then defined as
 %
 \begin{align*}
 d_\text{min} := \min \left\{ d(\bm{x}_1, \bm{x}_2) : \bm{x}_1,
-\bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\}
-.
+\bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\}.
 \end{align*}
 %
-We can signify that a binary linear block code has information length
-$k$, block length $n$ and minimum distance $d_\text{min}$ using the
-notation $[n,k,d_\text{min}]$ \cite[Sec.~1.3]{macwilliams_theory_1977}.
+We denote a binary linear block code with information length
+$k$, block length $n$ and minimum distance $d_\text{min}$ by
+$[n,k,d_\text{min}]$ \cite[Sec.~1.3]{macwilliams_theory_1977}.
 %
 % Parity checks, H, and the syndrome
 %
-A particularly elegant way of describing the subspace $C$ of
-$\mathbb{F}_2^n$ that the codewords make up is the notion of
-\textit{parity checks}.
+A particularly elegant way of describing the code space $\mathcal{C}$ is the
+notion of \textit{parity checks}.
 Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n
-\rvert = 2^n$, we could introduce $n-k$ conditions to constrain the
-additional degrees of freedom.
+\rvert = 2^n$, there are $n-k$ conditions that constrain the additional
+degrees of freedom.
 These conditions, called parity checks, take the form of equations
 over $\mathbb{F}_2^n$, linking the individual positions of each codeword.
 We can arrange the coefficients of these equations in a
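To make the definitions above concrete, here is a small sketch of the Hamming distance and minimum distance computations; the length-3 repetition code used as input is an illustrative choice, not taken from the text.

```python
from itertools import combinations

def hamming_distance(x1, x2):
    """Number of positions in which two codewords differ."""
    return sum(a != b for a, b in zip(x1, x2))

def minimum_distance(code):
    """d_min: smallest pairwise Hamming distance over distinct codewords."""
    return min(hamming_distance(x1, x2) for x1, x2 in combinations(code, 2))

# The [3,1,3] repetition code: 2^k = 2 codewords inside 2^n = 8 vectors.
rep_code = [(0, 0, 0), (1, 1, 1)]
assert minimum_distance(rep_code) == 3
```

Enumerating all pairs is of course only feasible for toy codes; for real codes, $d_\text{min}$ is generally hard to compute.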
@@ -99,7 +97,7 @@ Note that in general we may have linearly dependent parity checks,
 prompting us to define the \ac{pcm} as $\bm{H} \in
 \mathbb{F}_2^{m\times n}$ with $\hspace{2mm} m \ge n-k$ instead.
 The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes
-which parity checks a candidate codeword $\bm{v} \in \mathbb{F}_2^n$ violates.
+which parity checks a vector $\bm{v} \in \mathbb{F}_2^n$ violates.
 The representation using the \ac{pcm} has the benefit of providing a
 description of the code, the memory complexity of which does not grow
 exponentially with $n$, in contrast to keeping track of all codewords directly.
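As a concrete illustration of the syndrome $\bm{s} = \bm{H} \bm{v}^\text{T}$, a minimal sketch over F_2; the specific parity-check matrix is one common textbook form of the [7,4,3] Hamming code, assumed here for illustration.

```python
# One standard PCM of the [7,4,3] Hamming code: column j is the binary
# representation of j (1-based), with row 0 holding the least significant bit.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def syndrome(H, v):
    """s = H v^T over F_2: which parity checks the vector v violates."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]

codeword = [1, 1, 0, 0, 1, 1, 0]   # satisfies all three parity checks
assert syndrome(H, codeword) == [0, 0, 0]

# Flipping bit 5 (1-based) yields syndrome (1, 0, 1), i.e. 5 in binary,
# pointing directly at the flipped position -- a peculiarity of this H.
corrupted = codeword[:]
corrupted[4] ^= 1
assert syndrome(H, corrupted) == [1, 0, 1]
```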
@@ -173,9 +171,10 @@ Shannon's noisy-channel coding theorem is stated for codes whose block
 length approaches infinity. This suggests that as the block length
 becomes larger, the performance of the considered codes should
 generally improve.
-However, the size of the \ac{pcm}, and thus in general the decoding complexity,
-of a linear block code grows quadratically with $n$.
-This would quickly render decoding intractable as we increase the block length.
+However, the size of the \ac{pcm} of a linear block code grows
+quadratically with $n$.
+This would quickly render decoding intractable as we increase the
+block length, due to increased memory and computational complexity.
 We can get around this problem by constructing $\bm{H}$ in such a
 manner that the number of nonzero entries grows less than quadratically, e.g.,
 only linearly.
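A quick back-of-the-envelope sketch of why this helps: for a $(d_v, d_c)$-regular code (a hypothetical example family, not specified in the text), every column of $\bm{H}$ has $d_v$ ones and every row $d_c$ ones, so the nonzero count grows only linearly in $n$ while the total entry count grows quadratically.

```python
def pcm_stats(n, d_v=3, d_c=6):
    """Nonzero vs. total entry count of a (d_v, d_c)-regular PCM.

    With n * d_v ones spread over rows of weight d_c, there are
    m = n * d_v / d_c parity checks.
    """
    m = n * d_v // d_c
    nonzeros = n * d_v          # grows linearly in n
    total_entries = m * n       # grows quadratically in n
    return nonzeros, total_entries

# The density nonzeros / total = d_c / n shrinks as n grows:
# the matrix becomes ever sparser.
for n in (100, 1000, 10000):
    nz, total = pcm_stats(n)
    print(n, nz, total, nz / total)
```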
@@ -191,16 +190,16 @@ These differ from ``classical codes'' in their decoding algorithms:
 Classical codes are usually decoded using one-step hard-decision decoding,
 whereas modern codes are suitable for iterative soft-decision
 decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms
-in question are generally defined in terms of message passing on the
-\textit{Tanner graph} of the code. The Tanner graph is a bipartite
+are generally defined in terms of message passing on the
+\textit{Tanner graph} of a code. The Tanner graph is a bipartite
 graph that constitutes an alternative representation of the \ac{pcm}.
 We define two types of nodes: \acp{vn}, corresponding to codeword
 bits, and \acp{cn}, corresponding to individual parity checks.
 We then construct the Tanner graph by connecting each \ac{cn} to
 the \acp{vn} that make up the corresponding parity check
 \cite[Sec.~5.1.2]{ryan_channel_2009}.
-\Cref{PCM and Tanner graph of the Hamming code} shows this
-construction for the [7,4,3]-Hamming code.
+\Cref{PCM and Tanner graph of the Hamming code} shows the Tanner
+graph of the [7,4,3]-Hamming code.
 %
 \begin{figure}[t]
 \centering
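The Tanner graph construction is literally the bipartite adjacency encoded by the PCM, which a few lines can sketch; the specific $\bm{H}$ below is one standard form of the [7,4,3] Hamming PCM, assumed here for illustration.

```python
# One standard PCM of the [7,4,3] Hamming code (illustrative assumption).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def tanner_edges(H):
    """Edges (cn, vn) of the Tanner graph: one edge per nonzero entry of H.

    Check node i is connected exactly to the variable nodes in the
    support of row i of H.
    """
    return [(i, j) for i, row in enumerate(H) for j, h in enumerate(row) if h]

edges = tanner_edges(H)
# Each check node's degree equals the weight of its row of H.
cn_degrees = [sum(row) for row in H]
assert len(edges) == sum(cn_degrees) == 12
```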
@@ -290,8 +289,9 @@ We typically evaluate the performance of LDPC codes using the
 transmitted block in this context).
 Considering an \ac{awgn} channel, \Cref{fig:ldpc-perf} shows a
 qualitative performance characteristic of an \ac{ldpc} code
-\cite[Fig.~1]{costello_spatially_2014}. We talk of the
-\textit{waterfall} and the \textit{error floor} regions.
+\cite[Fig.~1]{costello_spatially_2014}.
+We can observe the \textit{waterfall} and the \textit{error floor}
+regions typical for \ac{ldpc} codes under iterative decoding.
 \begin{figure}[t]
 \centering
@@ -330,7 +330,7 @@ qualitative performance characteristic of an \ac{ldpc} code
 (1.8684E+00, 6.2247E-09)
 (1.9053E+00, 1E-09)
 };
-\addlegendentry{Regular}
+\addlegendentry{Regular LDPC codes}
 \addplot+[mark=none, solid, smooth, KITorange] coordinates {
 (4.5789E-01, 1.1821E-01)
@@ -351,7 +351,7 @@ qualitative performance characteristic of an \ac{ldpc} code
 (2.2947E+00, 3.1876E-09)
 % (2.8842E+00, 2.0403E-09)
 };
-\addlegendentry{Irregular}
+\addlegendentry{Irregular LDPC codes}
 \draw[gray, densely dashed]
 (axis cs:0.65, 2e-3) rectangle (axis cs:1.65, 5e-5);
@@ -364,9 +364,9 @@ qualitative performance characteristic of an \ac{ldpc} code
 \end{tikzpicture}
 \caption{
-Qualitative performance characteristic of \ac{ldpc} code
-in an \ac{awgn} channel. Adapted from
-\cite[Fig.~1]{costello_spatially_2014}.
+Qualitative performance characteristic of regular and
+irregular \ac{ldpc} codes in an \ac{awgn} channel.
+Adapted from \cite[Fig.~1]{costello_spatially_2014}.
 }
 \label{fig:ldpc-perf}
 \end{figure}
@@ -379,15 +379,16 @@ the numbers of ones, of their rows and columns are constant
 Already during their introduction, regular \ac{ldpc} codes were shown to have
 a minimum distance scaling linearly with the block length $n$ for
 large values \cite[Ch.~2,~Theorem~1]{gallager_low_1960},
-which leads to them not exhibiting an error floor under \ac{ml} decoding.
+which means they do not exhibit an error floor under
+\ac{ml} decoding.
 Irregular codes, on the other hand, generally do exhibit an error floor,
-their redeeming quality being the ability to reach near-capacity while
+their redeeming quality being the ability to reach near-capacity
 performance in the waterfall region \cite[Intro.]{costello_spatially_2014}.
 \subsection{Spatially-Coupled LDPC Codes}
-A relatively recent development in the world of \ac{ldpc} codes is
-that of \ac{sc}-\ac{ldpc} codes.
+A recent development in the field of \ac{ldpc} codes is that of
+\ac{sc}-\ac{ldpc} codes.
 Their key feature is that they combine the best properties of regular
 and irregular codes.
 They have a minimum distance that grows linearly with $n$, promising
@@ -396,8 +397,8 @@ iterative decoding behavior, promising good performance in the
 waterfall region \cite[Intro.]{costello_spatially_2014}.
 The essential property of \ac{sc}-\ac{ldpc} codes is that codewords
-from different \textit{spatial positions}, that would ordinarily be sent
-one after the other independently, are coupled.
+from different \textit{spatial positions}, which would ordinarily be sent
+one after the other independently, are linked.
 This is achieved by connecting some \acp{vn} of one spatial position to
 \acp{cn} of another, resulting in a \ac{pcm} of the form
 \cite[Eq.~1]{hassan_fully_2016}
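A toy sketch of how such a coupled band-diagonal PCM could be assembled from per-position component blocks; the block sizes and contents below are placeholders of mine, only the general shape follows the cited form.

```python
def coupled_pcm(blocks, L):
    """Assemble a band-diagonal SC-LDPC-style PCM for L spatial positions.

    blocks: list of w+1 component matrices H_0, ..., H_w, each m_b x n_b
    over F_2. Block H_k of spatial position t is placed k block-rows below
    t, so checks also involve variable nodes of the next w positions.
    Returns a ((L + w) * m_b) x (L * n_b) matrix as nested lists.
    """
    w = len(blocks) - 1
    m_b, n_b = len(blocks[0]), len(blocks[0][0])
    H = [[0] * (L * n_b) for _ in range((L + w) * m_b)]
    for t in range(L):                      # spatial position t
        for k, B in enumerate(blocks):      # component block H_k
            for i in range(m_b):
                for j in range(n_b):
                    H[(t + k) * m_b + i][t * n_b + j] = B[i][j]
    return H

# Toy example: w = 1 with 1x2 placeholder blocks, L = 3 positions.
H = coupled_pcm([[[1, 1]], [[1, 1]]], L=3)
```

Note that in this toy output the boundary rows touch fewer variable nodes than the interior rows, i.e. the boundary check nodes have lower degrees.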
@@ -498,7 +499,7 @@ This construction results in a Tanner graph as depicted in
 \draw[decorate, decoration={brace, amplitude=10pt}]
 ([xshift=-5mm,yshift=2mm]vn00.north) --
 ([xshift=5mm,yshift=2mm]vn00.north -| cn20.north)
-node[midway, above=4mm] {K};
+node[midway, above=4mm] {$K=2$};
 \end{tikzpicture}
 \caption{
@@ -508,8 +509,7 @@ This construction results in a Tanner graph as depicted in
 \label{fig:sc-ldpc-tanner}
 \end{figure}
-Note that at the first and last few spatial positions, some \acp{cn}
-have lower degrees.
+Note that at the first few spatial positions, some \acp{cn} have lower degrees.
 This leads to more reliable information about the
 \acp{vn} that, as we will see, is
 later passed to subsequent spatial positions during decoding.
@@ -526,13 +526,16 @@ algorithms, something that is possible due to their sparsity
 \cite[Sec.~5.3]{ryan_channel_2009}.
 The algorithm originally proposed alongside LDPC codes for this
 purpose by Gallager in 1960 is now known as the \ac{spa}
-\cite[5.4.1]{ryan_channel_2009}, also called \ac{bp}.
+\cite[5.4.1]{ryan_channel_2009}, also called \acf{bp}.
 The optimality criterion the \ac{spa} is built around is a
 symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
-The core idea of the resulting algorithm is to view \acp{cn} as
-representing single-parity check codes and \acp{vn} as representing
-repetition codes.
+The core idea of the resulting algorithm is to view \acp{cn}
+and \acp{vn} as representing individual local codes.
+A \ac{cn} represents a single parity check on the connected \acp{vn},
+so it can be understood as a single-parity check code.
+Similarly, a \ac{vn} represents a bit, and all connected \acp{cn}
+should agree on its value; it can therefore be understood as a repetition code.
 The algorithm alternates between consolidating soft information about
 the \acp{vn} in the \acp{cn}, and consolidating soft information about
 the \acp{cn} in the \acp{vn}.
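The two local codes translate directly into the two message update rules of the SPA in the log-likelihood-ratio domain; the following is a didactic sketch of those two rules only (my own fragment, not a full decoder from the cited references, using the convention that positive LLR favors bit 0).

```python
import math

def cn_update(incoming_llrs):
    """Single-parity-check rule: extrinsic LLR a check node sends to one
    neighbor, given the LLRs arriving from all *other* connected VNs."""
    prod = 1.0
    for llr in incoming_llrs:
        prod *= math.tanh(llr / 2.0)
    return 2.0 * math.atanh(prod)

def vn_update(channel_llr, incoming_llrs):
    """Repetition-code rule: a variable node sums its channel LLR and the
    LLRs arriving from all *other* connected check nodes."""
    return channel_llr + sum(incoming_llrs)

# Two confident "0" opinions (positive LLRs) push the remaining bit of
# the check towards "0" as well, though less confidently than either input.
msg = cn_update([4.0, 4.0])
assert 0.0 < msg < 4.0
```

Iterating these two rules along the edges of the Tanner graph until the tentative hard decision satisfies all parity checks (or an iteration limit is hit) yields the full algorithm.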