Incorporate Lia's corrections to classical fundamentals

This commit is contained in:
2026-05-04 12:12:10 +02:00
parent 12036caa91
commit aa907ef4a3


@@ -15,10 +15,10 @@ these topics and subsequently introduces the fundamentals of \ac{qec}.
The core concept underpinning error correcting codes is the
realization that introducing a finite amount of redundancy to
information before transmission can considerably reduce the error rate.
Specifically, consider a block code of length $n$ used to communicate
over a channel with capacity $C$ at rate $R$.
Shannon proved in 1948 that for any rate $R < C$, the probability of
error can be made arbitrarily small as $n \to \infty$
\cite[Sec.~13]{shannon_mathematical_1948}.
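As a concrete illustration (not part of the original text), consider the binary symmetric channel with crossover probability $p$, whose capacity is $C = 1 - h(p)$ with $h$ the binary entropy function. A short Python sketch:

```python
import math

def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits; h(0) = h(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p: float) -> float:
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

# For p = 0.11 the capacity is roughly 0.5, so any rate R < 0.5 is
# achievable with vanishing error probability as n grows.
print(bsc_capacity(0.11))
```

The theorem guarantees such codes exist; it does not say how to construct or decode them efficiently, which is what the remainder of the section addresses.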
In this section, we explore the concepts of ``classical'' (as in non-quantum)
@@ -54,36 +54,34 @@ We call the set of all codewords $\mathcal{C}$ the \textit{code}
% d_min and the [] Notation
%
During the encoding process, a mapping from $\mathbb{F}_2^k$ onto
$\mathcal{C} \subset \mathbb{F}_2^n$ takes place.
Since $n > k$, only a fraction $2^{k-n}$ of the vectors in $\mathbb{F}_2^n$
are valid codewords, leaving room to choose $\mathcal{C}$ such that any two
distinct codewords differ in many positions.
This separation gives rise to the error correcting properties of the code
and is quantified by the \textit{Hamming distance} $d(\bm{x}_1, \bm{x}_2)$,
defined as the number of positions in which $\bm{x}_1$ and $\bm{x}_2$ differ.
The \textit{minimum distance} of a code $\mathcal{C}$ is then defined as
%
\begin{align*}
d_\text{min} := \min \left\{ d(\bm{x}_1, \bm{x}_2) : \bm{x}_1,
\bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\}.
\end{align*}
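To make these definitions concrete, here is a brute-force sketch in Python (illustrative only; for linear codes $d_\text{min}$ also equals the minimum weight of a nonzero codeword, but the pairwise definition above is used directly):

```python
from itertools import combinations

def hamming_distance(x1, x2):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(x1, x2))

def minimum_distance(code):
    """d_min: smallest pairwise Hamming distance over distinct codewords."""
    return min(hamming_distance(x1, x2) for x1, x2 in combinations(code, 2))

# Toy [3,1,3] repetition code: its two codewords differ in all 3 positions.
repetition_code = [(0, 0, 0), (1, 1, 1)]
print(minimum_distance(repetition_code))  # 3
```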
%
We denote a binary linear block code with information length
$k$, block length $n$ and minimum distance $d_\text{min}$ by
$[n,k,d_\text{min}]$ \cite[Sec.~1.3]{macwilliams_theory_1977}.
%
% Parity checks, H, and the syndrome
%
A particularly elegant way of describing the code space $C$ is the
notion of \textit{parity checks}.
Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n
\rvert = 2^n$, there are $n-k$ conditions that constrain the
additional degrees of freedom.
These conditions, called parity checks, take the form of equations
over $\mathbb{F}_2^n$, linking the individual positions of each codeword.
We can arrange the coefficients of these equations in a
@@ -99,7 +97,7 @@ Note that in general we may have linearly dependent parity checks,
prompting us to define the \ac{pcm} as $\bm{H} \in
\mathbb{F}_2^{m\times n}$ with $\hspace{2mm} m \ge n-k$ instead.
The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes
which parity checks a vector $\bm{v} \in \mathbb{F}_2^n$ violates.
The representation using the \ac{pcm} has the benefit of providing a
description of the code, the memory complexity of which does not grow
exponentially with $n$, in contrast to keeping track of all codewords directly.
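As an illustration (not from the original text), the syndrome computation can be sketched in plain Python for the [7,4,3] Hamming code; the particular $\bm{H}$ chosen here is one common convention in which column $j$ holds the binary representation of $j+1$:

```python
# PCM of the [7,4,3] Hamming code; column j holds the binary
# representation of j + 1 (most significant bit in the first row).
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

def syndrome(H, v):
    """s = H v^T over F_2; the all-zero syndrome means v passes every check."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]

codeword = [0] * 7       # the all-zero word is a codeword of any linear code
received = list(codeword)
received[4] = 1          # single bit error in position 5
print(syndrome(H, codeword))  # [0, 0, 0]
print(syndrome(H, received))  # [1, 0, 1]: binary 5, the error position
```

With this column ordering a single bit error yields a syndrome that reads off the error position directly, which is a convenient property of this particular Hamming-code construction.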
@@ -173,9 +171,10 @@ Shannon's noisy-channel coding theorem is stated for codes whose block
length approaches infinity. This suggests that as the block length
becomes larger, the performance of the considered codes should
generally improve.
However, the size of the \ac{pcm} of a linear block code grows
quadratically with $n$.
This would quickly render decoding intractable as we increase the
block length, due to increased memory and computational complexity.
We can get around this problem by constructing $\bm{H}$ in such a
manner that the number of nonzero entries grows less than quadratically, e.g.,
only linearly.
@@ -191,16 +190,16 @@ These differ from ``classical codes'' in their decoding algorithms:
Classical codes are usually decoded using one-step hard-decision decoding,
whereas modern codes are suitable for iterative soft-decision
decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms
are generally defined in terms of message passing on the
\textit{Tanner graph} of a code. The Tanner graph is a bipartite
graph that constitutes an alternative representation of the \ac{pcm}.
We define two types of nodes: \acp{vn}, corresponding to codeword
bits, and \acp{cn}, corresponding to individual parity checks.
We then construct the Tanner graph by connecting each \ac{cn} to
the \acp{vn} that make up the corresponding parity check
\cite[Sec.~5.1.2]{ryan_channel_2009}.
\Cref{PCM and Tanner graph of the Hamming code} shows the Tanner
graph of the [7,4,3]-Hamming code.
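To make the construction concrete, here is a small sketch (illustrative, not from the thesis) that extracts the Tanner graph edges from a \ac{pcm}; for the [7,4,3] Hamming code with the usual choice of $\bm{H}$, every check node ends up with degree four:

```python
def tanner_edges(H):
    """Tanner graph edges: one edge (CN i, VN j) per nonzero entry H[i][j]."""
    return [(i, j) for i, row in enumerate(H) for j, h in enumerate(row) if h]

# One common PCM of the [7,4,3] Hamming code.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

edges = tanner_edges(H)
cn_degrees = [sum(row) for row in H]
print(len(edges))    # 12 edges, one per nonzero entry of H
print(cn_degrees)    # [4, 4, 4]: each CN checks the parity of four VNs
```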
%
\begin{figure}[t]
\centering
@@ -290,8 +289,9 @@ We typically evaluate the performance of LDPC codes using the
transmitted block in this context).
Considering an \ac{awgn} channel, \Cref{fig:ldpc-perf} shows a
qualitative performance characteristic of an \ac{ldpc} code
\cite[Fig.~1]{costello_spatially_2014}.
We can observe the \textit{waterfall} and the \textit{error floor}
regions typical of \ac{ldpc} codes under iterative decoding.
\begin{figure}[t]
\centering
@@ -330,7 +330,7 @@ qualitative performance characteristic of an \ac{ldpc} code
(1.8684E+00, 6.2247E-09)
(1.9053E+00, 1E-09)
};
\addlegendentry{Regular LDPC codes}
\addplot+[mark=none, solid, smooth, KITorange] coordinates {
(4.5789E-01, 1.1821E-01)
@@ -351,7 +351,7 @@ qualitative performance characteristic of an \ac{ldpc} code
(2.2947E+00, 3.1876E-09)
% (2.8842E+00, 2.0403E-09)
};
\addlegendentry{Irregular LDPC codes}
\draw[gray, densely dashed]
(axis cs:0.65, 2e-3) rectangle (axis cs:1.65, 5e-5);
@@ -364,9 +364,9 @@ qualitative performance characteristic of an \ac{ldpc} code
\end{tikzpicture}
\caption{
Qualitative performance characteristic of regular and
irregular \ac{ldpc} codes in an \ac{awgn} channel.
Adapted from \cite[Fig.~1]{costello_spatially_2014}.
}
\label{fig:ldpc-perf}
\end{figure}
@@ -379,15 +379,16 @@ the numbers of ones, of their rows and columns are constant
Already during their introduction, regular \ac{ldpc} codes were shown to have
a minimum distance scaling linearly with the block length $n$ for
large values \cite[Ch.~2,~Theorem~1]{gallager_low_1960},
meaning they do not exhibit an error floor under \ac{ml} decoding.
Irregular codes, on the other hand, generally do exhibit an error floor,
while their redeeming quality is the ability to reach near-capacity
performance in the waterfall region \cite[Intro.]{costello_spatially_2014}.
\subsection{Spatially-Coupled LDPC Codes}
A recent development in the field of \ac{ldpc} codes is that of
\ac{sc}-\ac{ldpc} codes.
Their key feature is that they combine the best properties of regular
and irregular codes.
They have a minimum distance that grows linearly with $n$, promising
@@ -396,8 +397,8 @@ iterative decoding behavior, promising good performance in the
waterfall region \cite[Intro.]{costello_spatially_2014}.
The essential property of \ac{sc}-\ac{ldpc} codes is that codewords
from different \textit{spatial positions}, which would ordinarily be sent
one after the other independently, are linked.
This is achieved by connecting some \acp{vn} of one spatial position to
\acp{cn} of another, resulting in a \ac{pcm} of the form
\cite[Eq.~1]{hassan_fully_2016}
@@ -498,7 +499,7 @@ This construction results in a Tanner graph as depicted in
\draw[decorate, decoration={brace, amplitude=10pt}]
([xshift=-5mm,yshift=2mm]vn00.north) --
([xshift=5mm,yshift=2mm]vn00.north -| cn20.north)
node[midway, above=4mm] {$K=2$};
\end{tikzpicture}
\caption{
@@ -508,8 +509,7 @@ This construction results in a Tanner graph as depicted in
\label{fig:sc-ldpc-tanner}
\end{figure}
Note that at the first few spatial positions, some \acp{cn} have lower degrees.
This leads to more reliable information about the
\acp{vn} that, as we will see, is
later passed to subsequent spatial positions during decoding.
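The chain construction can be sketched in code. The following is a minimal illustration assuming coupling memory 1 and two hypothetical component matrices `H0` and `H1` (the general form with larger memory follows the cited equation); it is not the construction used in the thesis:

```python
def coupled_pcm(H0, H1, L):
    """Band-diagonal PCM of an SC-LDPC chain with coupling memory 1:
    spatial position t places H0 in block row t and H1 in block row t + 1,
    so consecutive positions share parity checks."""
    m, n = len(H0), len(H0[0])
    rows, cols = (L + 1) * m, L * n   # one extra block row terminates the chain
    H = [[0] * cols for _ in range(rows)]
    for t in range(L):
        for i in range(m):
            for j in range(n):
                H[t * m + i][t * n + j] = H0[i][j]
                H[(t + 1) * m + i][t * n + j] = H1[i][j]
    return H

# Toy components: one parity check on two bits per spatial position.
Hsc = coupled_pcm([[1, 1]], [[1, 1]], L=3)
row_degrees = [sum(row) for row in Hsc]
# Checks at the ends of the chain have lower degree (2 instead of 4),
# the source of the more reliable information described in the text.
print(row_degrees)  # [2, 4, 4, 2]
```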
@@ -526,13 +526,16 @@ algorithms, something that is possible due to their sparsity
\cite[Sec.~5.3]{ryan_channel_2009}.
The algorithm originally proposed alongside LDPC codes for this
purpose by Gallager in 1960 is now known as the \ac{spa}
\cite[5.4.1]{ryan_channel_2009}, also called \acf{bp}.
The optimality criterion the \ac{spa} is built around is a
symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
The core idea of the resulting algorithm is to view \acp{cn}
and \acp{vn} as representing individual local codes.
A \ac{cn} represents a single parity check on the connected \acp{vn},
so it can be understood as a single-parity check code.
Similarly, a \ac{vn} represents a bit, and all connected \acp{cn}
should agree on its value; it can therefore be understood as a repetition code.
The algorithm alternates between consolidating soft information about
the \acp{vn} in the \acp{cn}, and consolidating soft information about
the \acp{cn} in the \acp{vn}.
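This alternation can be sketched as one iteration of LLR-domain message passing. The following is a minimal dense-matrix sketch under the usual tanh-rule formulation, not the thesis's implementation; practical decoders exploit the sparsity of $\bm{H}$ instead of iterating over all matrix entries:

```python
import math

def spa_iteration(H, llr_ch, msg_cv):
    """One SPA iteration on the Tanner graph of H.
    msg_cv[i][j]: message from CN i to VN j (0.0 where no edge).
    Returns updated CN-to-VN messages and consolidated per-bit LLRs."""
    m, n = len(H), len(H[0])
    # VN update (repetition-code view): channel LLR plus all incoming
    # CN messages except the one on the edge being updated.
    msg_vc = [[0.0] * n for _ in range(m)]
    for j in range(n):
        total = llr_ch[j] + sum(msg_cv[i][j] for i in range(m) if H[i][j])
        for i in range(m):
            if H[i][j]:
                msg_vc[i][j] = total - msg_cv[i][j]
    # CN update (single-parity-check view): tanh rule over all incoming
    # VN messages except the edge being updated.
    new_cv = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            if H[i][j]:
                prod = 1.0
                for jj in range(n):
                    if H[i][jj] and jj != j:
                        prod *= math.tanh(msg_vc[i][jj] / 2.0)
                prod = max(min(prod, 0.999999), -0.999999)  # avoid atanh(+-1)
                new_cv[i][j] = 2.0 * math.atanh(prod)
    llr_out = [llr_ch[j] + sum(new_cv[i][j] for i in range(m) if H[i][j])
               for j in range(n)]
    return new_cv, llr_out

H = [[1, 1, 1]]                  # a single parity check on three bits
llr_ch = [2.0, 2.0, -1.0]        # channel weakly favors bit 3 = 1: parity conflict
_, llr = spa_iteration(H, llr_ch, [[0.0, 0.0, 0.0]])
# The check's extrinsic information flips the weak third bit toward 0,
# restoring even parity.
```

Hard decisions are taken from the signs of the consolidated LLRs after each iteration; decoding stops once the syndrome is zero or an iteration limit is reached.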