diff --git a/src/thesis/chapters/2_fundamentals.tex b/src/thesis/chapters/2_fundamentals.tex index be5ce44..a9acd07 100644 --- a/src/thesis/chapters/2_fundamentals.tex +++ b/src/thesis/chapters/2_fundamentals.tex @@ -35,20 +35,21 @@ algorithm. % Codewords, n, k, rate % -One particularly important class of coding schemes is that of binary -linear block codes. -The information to be protected takes the form of a sequence of +Binary linear block codes form one particularly important class of +coding schemes. +The information to be protected is represented by a sequence of binary symbols, which is split into separate blocks. -Each block is encoded, transmitted, and decoded separately. +Then, each block is encoded, transmitted, and decoded separately. The encoding step introduces redundancy by mapping input messages $\bm{u} \in \mathbb{F}_2^k$ of length $k \in \mathbb{N}$ (called the \textit{information length}) onto \textit{codewords} $\bm{x} \in \mathbb{F}_2^n$ of length $n \in \mathbb{N}$ (called the \textit{block length}) with $n > k$. -A measure of the amount of introduced redundancy is the \textit{code -rate} $R = k/n$. -We call the set of all codewords $\mathcal{C}$ the \textit{code} -\cite[Sec.~3.1.1]{ryan_channel_2009}. +The \textit{code rate} $R = k/n$ is a measure of the amount of +introduced redundancy. +We call the set of all codewords +$\mathcal{C} = \{\bm{x}^{(1)}, \bm{x}^{(2)}, \ldots, \bm{x}^{(2^k)}\}$ +the \textit{code} \cite[Sec.~3.1.1]{ryan_channel_2009}. % % d_min and the [] Notation @@ -77,7 +78,7 @@ $[n,k,d_\text{min}]$. % Parity checks, H, and the syndrome % -A particularly elegant way of describing the code space $C$ is the +A particularly elegant way of describing the code space $\mathcal{C}$ is the notion of \textit{parity checks}. Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n \rvert = 2^n$, there are $n-k$ conditions that constrain the additional @@ -86,14 +87,14 @@ These conditions, called parity checks, take the form of equations over $\mathbb{F}_2^n$, linking the individual positions of each codeword. We can arrange the coefficients of these equations in a \textit{parity-check matrix} (\acs{pcm}) $\bm{H} \in -\mathbb{F}_2^{(n-k) \times n}$ and equivalently define the code as -\cite[Sec.~3.1.1]{ryan_channel_2009} +\mathbb{F}_2^{(n-k) \times n}$, $\text{rank}(\bm{H}) = n-k$, and +equivalently define the code as \cite[Sec.~3.1.1]{ryan_channel_2009} \begin{align*} - \mathcal{C} = \left\{ \bm{x} \in \mathbb{F}_2^n : + \mathcal{C} := \text{kern}(\bm{H}) = \left\{ \bm{x} \in \mathbb{F}_2^n : \bm{H}\bm{x}^\text{T} = \bm{0} \right\} .% \end{align*} -Note that in general we may have linearly dependent parity checks, +In general, we may have linearly dependent parity checks, prompting us to define the \ac{pcm} as $\bm{H} \in \mathbb{F}_2^{m\times n}$ with $\hspace{2mm} m \ge n-k$ instead. The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes @@ -118,9 +119,9 @@ $\bm{y} \in \mathbb{R}^n$, and \textit{hard-decision} decoding, where $\bm{y} \in \mathbb{F}_2^n$ \cite[Sec.~1.5.1.3]{ryan_channel_2009}. Finally, the decoder is responsible for obtaining an estimate $\hat{\bm{u}} \in \mathbb{F}_2^k$ of the original input message. -This is done by first finding an estimate $\hat{\bm{x}}$ of the sent +This can be done by first finding an estimate $\hat{\bm{x}}$ of the sent codeword and undoing the encoding. -The decoding problem that we generally attempt to solve thus consists +The decoding problem that we attempt to solve thus consists in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$.
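+As a brief illustration of these definitions, consider the $[7,4,3]$ Hamming code with its \ac{pcm} written in one common form (the specific matrix below is chosen here purely as an example) +\begin{align*} \bm{H} = \begin{pmatrix} 1 & 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \end{pmatrix} .% \end{align*} +The vector $\bm{x} = (1,0,0,0,1,1,0)$ fulfills all three parity checks, i.e., $\bm{H}\bm{x}^\text{T} = \bm{0}$, and is therefore a codeword. +Flipping its fourth bit yields $\bm{v} = (1,0,0,1,1,1,0)$ with syndrome $\bm{s} = \bm{H}\bm{v}^\text{T} = (1,1,1)^\text{T} \neq \bm{0}$, so the violated parity checks expose the error before any estimate $\hat{\bm{x}}$ is formed.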
\begin{figure}[t] @@ -168,9 +169,9 @@ in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$. % Shannon's noisy-channel coding theorem is stated for codes whose block -length approaches infinity. This suggests that as the block length -becomes larger, the performance of the considered codes should -generally improve. +length $n$ approaches infinity. +This suggests that as the block length becomes larger, the +performance of the considered codes should generally improve. However, the size of the \ac{pcm} of a linear block code grows quadratically with $n$. This would quickly render decoding intractable as we increase the @@ -189,13 +190,14 @@ This is exactly the motivation behind \ac{ldpc} codes These differ from ``classical codes'' in their decoding algorithms: Classical codes are usually decoded using one-step hard-decision decoding, whereas modern codes are suitable for iterative soft-decision -decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms +decoding \cite[Preface]{ryan_channel_2009}. +For \ac{ldpc} codes, the iterative decoding algorithms are generally defined in terms of message passing on the \textit{Tanner graph} of a code. The Tanner graph is a bipartite graph that constitutes an alternative representation of the \ac{pcm}. We define two types of nodes: \Acp{vn}, corresponding to codeword bits, and \acp{cn}, corresponding to individual parity checks. -We then construct the Tanner graph by connecting each \ac{cn} to +Then, we construct the Tanner graph by connecting each \ac{cn} to the \acp{vn} that make up the corresponding parity check \cite[Sec.~5.1.2]{ryan_channel_2009}. \Cref{PCM and Tanner graph of the Hamming code} shows the Tanner @@ -273,10 +275,10 @@ Mathematically, we represent a \ac{vn} using the index $i \in and a \ac{cn} using the index $j \in \mathcal{J} := \left[ 0 : m-1 \right]$. We can then encode the information contained in the graph by defining -the neighborhood of a variable node $i$ as +the neighborhood of a \ac{vn} $i$ as $\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : \bm{H}_{j,i} = 1 \right\}$ -and that of a check node $j$ as +and the neighborhood of a \ac{cn} $j$ as $\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : \bm{H}_{j,i} = 1 \right\}$. @@ -379,15 +381,15 @@ the numbers of ones, of their rows and columns are constant Already during their introduction, regular \ac{ldpc} codes were shown to have a minimum distance scaling linearly with the block length $n$ for large values \cite[Ch.~2,~Theorem~1]{gallager_low_1960}, -which leads to the fact that they do not exhibit an error floor under -\ac{ml} decoding. -Irregular codes, on the other hand, generally do exhibit an error floor, -while their redeeming quality is the ability to reach near-capacity -performance in the waterfall region \cite[Intro.]{costello_spatially_2014}. +which leads to a more favorable behavior of the error rate at high +signal-to-noise ratios. +Irregular codes, on the other hand, generally exhibit a more pronounced +error floor. +However, they have the ability to reach near-capacity performance +in the waterfall region \cite[Intro.]{costello_spatially_2014}.
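+To make the graph notation concrete, consider once more the illustrative Hamming-code \ac{pcm} from above; being small and dense, it is not the \ac{pcm} of an actual \ac{ldpc} code, but it suffices to illustrate the definitions. +For that matrix, $\mathcal{N}_\text{V}(3) = \{0, 1, 2\}$, since column $3$ contains a one in every row, and $\mathcal{N}_\text{C}(0) = \{0, 1, 3, 4\}$, since row $0$ contains ones in exactly these columns. +Its column weights $(2,2,2,3,1,1,1)$ are not constant, so the corresponding Tanner graph is irregular, whereas every row weight equals $4$. +For a $(3,6)$-regular \ac{ldpc} code, in contrast, $\lvert \mathcal{N}_\text{V}(i) \rvert = 3$ and $\lvert \mathcal{N}_\text{C}(j) \rvert = 6$ hold for all $i$ and $j$; counting edges then gives $m = n \cdot 3/6$, i.e., a code rate of $R = 1 - 3/6 = 1/2$ if all parity checks are linearly independent.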
\subsection{Spatially-Coupled LDPC Codes} -A recent development in the field of \ac{ldpc} codes is that of +A more recent development in the field of \ac{ldpc} codes is that of \ac{sc}-\ac{ldpc} codes. Their key feature is that they combine the best properties of regular and irregular codes. @@ -399,11 +401,12 @@ waterfall region \cite[Intro.]{costello_spatially_2014}. The essential property of \ac{sc}-\ac{ldpc} codes is that codewords from different \textit{spatial positions}, which would ordinarily be sent one after the other independently, are linked. -This is achieved by connecting some \acp{vn} of one spatial position to -\acp{cn} of another, resulting in a \ac{pcm} of the form +This is achieved by introducing edges between \acp{vn} of one spatial +position and \acp{cn} of another, resulting in a \ac{pcm} of the form \cite[Eq.~1]{hassan_fully_2016} % -\begin{align*} +\begin{align} + \label{eq:PCM} \bm{H} = \begin{pmatrix} \bm{H}_0(1) & & \\ @@ -413,10 +416,11 @@ This is achieved by connecting some \acp{vn} of one spatial position to & & \bm{H}_K(L) \\ \end{pmatrix} , -\end{align*} +\end{align} % where $K \in \mathbb{N}$ is the \textit{coupling width} and $L \in \mathbb{N}$ is the number of spatial positions. +The parts of the \ac{pcm} left empty in \Cref{eq:PCM} are filled with zeros. This construction results in a Tanner graph as depicted in \Cref{fig:sc-ldpc-tanner}. @@ -513,7 +517,7 @@ Note that at the first few spatial positions some \acp{cn} have lower degrees. This leads to more reliable information about the \acp{vn} that, as we will see, is later passed to subsequent spatial positions during decoding. -This is precisely the effect that leads to the good performance of +This is precisely the effect that leads to the improved performance of \ac{sc}-\ac{ldpc} codes in the waterfall region \cite{costello_spatially_2014}. \subsection{Iterative Decoding} @@ -521,15 +525,14 @@ This is precisely the effect that leads to the good performance of % Introduction -\ac{ldpc} codes are generally decoded using efficient iterative -algorithms, something that is possible due to their sparsity -\cite[Sec.~5.3]{ryan_channel_2009}. -The algorithm originally proposed alongside LDPC codes for this -purpose by Gallager in 1960 is now known as the \ac{spa} +Owing to their sparse graphs, \ac{ldpc} codes admit efficient +iterative decoding algorithms \cite[Sec.~5.3]{ryan_channel_2009}. +The decoding algorithm originally proposed alongside \ac{ldpc} codes by +Gallager in 1960 is now known as the \ac{spa} \cite[5.4.1]{ryan_channel_2009}, also called \acf{bp}. The optimality criterion the \ac{spa} is built around is a -symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}. +bit-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}. The core idea of the resulting algorithm is to view \acp{cn} and \acp{vn} as representing individual local codes. A \ac{cn} represents a single parity check on the connected \acp{vn}, @@ -539,11 +542,11 @@ should agree on its value; it can therefore be understood as a repetition code. The algorithm alternates between consolidating soft information about the \acp{vn} in the \acp{cn}, and consolidating soft information about the \acp{cn} in the \acp{vn}. -To this end, messages are passed back and forth along the edges of -the Tanner graph. +To this end, messages computed in the nodes are passed back and forth +along the edges of the Tanner graph. $L_{i\rightarrow j}$ represents a message passed from \ac{vn} $i$ to -\ac{cn} j, $L_{i\leftarrow j}$ represents a message passed from -\ac{cn} j to \ac{vn} i. +\ac{cn} $j$, while $L_{i\leftarrow j}$ represents a message passed from +\ac{cn} $j$ to \ac{vn} $i$.
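+As a small example, in the illustrative \ac{pcm} from above, \ac{vn} $3$ with $\mathcal{N}_\text{V}(3) = \{0, 1, 2\}$ sends the messages $L_{3\rightarrow 0}$, $L_{3\rightarrow 1}$, and $L_{3\rightarrow 2}$ and receives $L_{3\leftarrow 0}$, $L_{3\leftarrow 1}$, and $L_{3\leftarrow 2}$; one pair of oppositely directed messages thus travels along every edge of the Tanner graph.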
The \acp{vn} additionally receive messages \cite[5.4.2]{ryan_channel_2009} \begin{align*} \tilde{L}_i = \log \frac{P(X=0 \vert Y=y)}{P(X=1 \vert Y=y)}, @@ -574,7 +577,7 @@ possible cycles and are thus especially problematic. % Min-sum algorithm -A simplification of the \ac{spa} is the min-sum decoder. Here, the +A simplification of the \ac{spa} is the min-sum algorithm. Here, the \ac{cn} update is approximated as \cite[Sec.~5.5.1]{ryan_channel_2009} \begin{align*} L_{i \leftarrow j} = \prod_{i' \in \mathcal{N}_\text{C}(j)\setminus \{i\}}