\chapter{Fundamentals}
|
|
\label{ch:Fundamentals}
|
|
|
|
\Ac{qec} is a field of research combining ``classical''
|
|
communications engineering and quantum information science.
|
|
This chapter provides the relevant theoretical background on both of
|
|
these topics and subsequently introduces the fundamentals of \ac{qec}.
|
|
|
|
% TODO: Is an explanation of BP with guided decimation needed in this chapter?
|
|
% TODO: Is an explanation of OSD needed in this chapter?
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{Classical Error Correction}
|
|
\label{sec:Classical Error Correction}
|
|
|
|
The core concept underpinning error correcting codes is the
|
|
realization that introducing a finite amount of redundancy to
|
|
information before transmission can considerably reduce the error rate.
|
|
Specifically, Shannon proved in 1948 that for any channel, a block
|
|
code can be found that achieves arbitrarily small probability of
|
|
error at any communication rate up to the capacity of the channel
|
|
when the block length approaches infinity
|
|
\cite[Sec.~13]{shannon_mathematical_1948}.
|
|
|
|
In this section, we explore the concepts of ``classical'' (as in non-quantum)
|
|
error correction that are central to this work.
|
|
We start by looking at different ways of encoding information,
|
|
first considering binary linear block codes in general and then \ac{ldpc} and
|
|
\ac{sc}-\ac{ldpc} codes.
|
|
Finally, we pivot to the decoding process, specifically the \ac{bp}
|
|
algorithm.
|
|
|
|
\subsection{Binary Linear Block Codes}
|
|
|
|
%
|
|
% Codewords, n, k, rate
|
|
%
|
|
|
|
One particularly important class of coding schemes is that of binary
|
|
linear block codes.
|
|
The information to be protected takes the form of a sequence of
|
|
binary symbols, which is split into separate blocks.
|
|
Each block is encoded, transmitted, and decoded separately.
|
|
The encoding step introduces redundancy by mapping input messages
|
|
$\bm{u} \in \mathbb{F}_2^k$ of length $k \in \mathbb{N}$ (called the
|
|
\textit{information length}) onto \textit{codewords} $\bm{x} \in
|
|
\mathbb{F}_2^n$ of length $n \in \mathbb{N}$ (called the
|
|
\textit{block length}) with $n > k$.
|
|
A measure of the amount of introduced redundancy is the \textit{code
|
|
rate} $R = k/n$.
|
|
We call the set of all codewords $\mathcal{C}$ the \textit{code}
|
|
\cite[Sec.~3.1.1]{ryan_channel_2009}.
|
|
|
|
%
|
|
% d_min and the [] Notation
|
|
%
|
|
|
|
During the encoding process, a mapping from $\mathbb{F}_2^k$
|
|
onto $\mathcal{C} \subset \mathbb{F}_2^n$ takes place.
|
|
The input messages are mapped onto an expanded vector space, where
|
|
they are ``further apart'', giving rise to the error correcting
|
|
properties of the code.
|
|
This notion of the distance between two codewords $\bm{x}_1$ and
|
|
$\bm{x}_2$ can be expressed using the \textit{Hamming distance} $d(\bm{x}_1,
|
|
\bm{x}_2)$, which is defined as the number of positions in which they differ.
|
|
We define the \textit{minimum distance} of a code $\mathcal{C}$ as
|
|
%
|
|
\begin{align*}
|
|
d_\text{min} := \min \left\{ d(\bm{x}_1, \bm{x}_2) : \bm{x}_1,
|
|
\bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\}
|
|
.
|
|
\end{align*}
|
|
%
|
|
We can signify that a binary linear block code has information length
|
|
$k$, block length $n$ and minimum distance $d_\text{min}$ using the
|
|
notation $[n,k,d_\text{min}]$ \cite[Sec.~1.3]{macwilliams_theory_1977}.
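For example, the binary repetition code of length three, $\mathcal{C} = \left\{ 000, 111 \right\}$, maps the messages $0$ and $1$ onto codewords that differ in all three positions, so it is a $[3,1,3]$ code with rate $R = 1/3$.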
|
|
|
|
%
|
|
% Parity checks, H, and the syndrome
|
|
%
|
|
|
|
A particularly elegant way of describing the subspace $\mathcal{C}$ of
$\mathbb{F}_2^n$ that the codewords make up is the notion of
\textit{parity checks}.
|
|
Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n
\rvert = 2^n$, we can introduce $n-k$ linearly independent conditions
to constrain the additional degrees of freedom.
|
|
These conditions, called parity checks, take the form of equations
|
|
over $\mathbb{F}_2^n$, linking the individual positions of each codeword.
|
|
We can arrange the coefficients of these equations in a
|
|
\textit{parity-check matrix} (\acs{pcm}) $\bm{H} \in
|
|
\mathbb{F}_2^{(n-k) \times n}$ and equivalently define the code as
|
|
\cite[Sec.~3.1.1]{ryan_channel_2009}
|
|
\begin{align*}
|
|
\mathcal{C} = \left\{ \bm{x} \in \mathbb{F}_2^n :
|
|
\bm{H}\bm{x}^\text{T} = \bm{0} \right\}
|
|
.%
|
|
\end{align*}
|
|
Note that in general we may have linearly dependent parity checks,
prompting us to define the \ac{pcm} as $\bm{H} \in
\mathbb{F}_2^{m\times n}$ with $m \ge n-k$ instead.
|
|
The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes
|
|
which parity checks a candidate codeword $\bm{v} \in \mathbb{F}_2^n$ violates.
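For example, for the $[3,1,3]$ repetition code with
\begin{align*}
	\bm{H} =
	\begin{pmatrix}
		1 & 1 & 0 \\
		0 & 1 & 1 \\
	\end{pmatrix}
	,
\end{align*}
the candidate codeword $\bm{v} = (1,0,0)$ yields the syndrome $\bm{s} = \bm{H}\bm{v}^\text{T} = (1,0)^\text{T}$, indicating that only the first parity check is violated.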
|
|
The representation using the \ac{pcm} has the benefit of providing a
description of the code whose memory requirement does not grow
exponentially with $n$, in contrast to storing all $2^k$ codewords directly.
|
|
|
|
%
|
|
% The decoding problem
|
|
%
|
|
|
|
Figure \ref{fig:Diagram of a transmission system} visualizes the
|
|
communication process \cite[Sec.~1.1]{ryan_channel_2009}.
|
|
An input message $\bm{u}\in \mathbb{F}_2^k$ is mapped onto a codeword $\bm{x}
|
|
\in \mathbb{F}_2^n$. This is passed on to a modulator, which
|
|
interacts with the physical channel.
|
|
A demodulator processes the channel output and forwards the result
|
|
$\bm{y}$ to a decoder.
|
|
We differentiate between \textit{soft-decision} decoding, where
|
|
$\bm{y} \in \mathbb{R}^n$, and \textit{hard-decision} decoding, where
|
|
$\bm{y} \in \mathbb{F}_2^n$ \cite[Sec.~1.5.1.3]{ryan_channel_2009}.
|
|
Finally, the decoder is responsible for obtaining an estimate
|
|
$\hat{\bm{u}} \in \mathbb{F}_2^k$ of the original input message.
|
|
This is done by first finding an estimate $\hat{\bm{x}}$ of the sent
|
|
codeword and undoing the encoding.
|
|
The decoding problem that we generally attempt to solve thus consists
|
|
in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$.
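Under \ac{ml} decoding, for instance, this estimate is
\begin{align*}
	\hat{\bm{x}} = \arg \max_{\bm{x} \in \mathcal{C}} P(\bm{y} \vert \bm{x})
	,
\end{align*}
although evaluating this maximization directly becomes infeasible for large block lengths.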
|
|
|
|
\begin{figure}[t]
|
|
\centering
|
|
|
|
\tikzset{
|
|
box/.style={
|
|
rectangle, draw=black, minimum width=17mm, minimum height=8mm,
|
|
},
|
|
}
|
|
|
|
\begin{tikzpicture}
|
|
[
|
|
node distance = 2mm and 7mm,
|
|
]
|
|
\node (in) {};
|
|
\node[box, right=of in] (enc) {Encoder};
|
|
\node[box, minimum width=25mm, right=of enc] (mod) {Modulator};
|
|
\node[box, below right=of mod] (cha) {Channel};
|
|
\node[box, minimum width=25mm, below left=of cha] (dem) {Demodulator};
|
|
\node[box, left=of dem] (dec) {Decoder};
|
|
\node[left=of dec] (out) {};
|
|
|
|
\draw[-{latex}] (in) -- (enc) node[midway, above] {$\bm{u}$};
|
|
\draw[-{latex}] (enc) -- (mod) node[midway, above] {$\bm{x}$};
|
|
\draw[-{latex}] (mod) -| (cha);
|
|
\draw[-{latex}] (cha) |- (dem);
|
|
\draw[-{latex}] (dem) -- (dec) node[midway, above] {$\bm{y}$};
|
|
\draw[-{latex}] (dec) -- (out) node[midway, above] {$\hat{\bm{u}}$};
|
|
\end{tikzpicture}
|
|
|
|
\caption{Overview of a transmission system.}
|
|
\label{fig:Diagram of a transmission system}
|
|
\end{figure}
|
|
%
|
|
|
|
%
|
|
% Hard vs. soft information
|
|
%
|
|
|
|
\subsection{Low-Density Parity-Check Codes}
|
|
|
|
%
|
|
% Core concept
|
|
%
|
|
|
|
Shannon's noisy-channel coding theorem is stated for codes whose block
|
|
length approaches infinity. This suggests that as the block length
|
|
becomes larger, the performance of the considered codes should
|
|
generally improve.
|
|
However, the size of the \ac{pcm}, and thus in general the decoding complexity,
|
|
of a linear block code grows quadratically with $n$.
|
|
This would quickly render decoding intractable as we increase the block length.
|
|
We can get around this problem by constructing $\bm{H}$ in such a
|
|
manner that the number of nonzero entries grows less than quadratically, e.g.,
|
|
only linearly.
|
|
This is exactly the motivation behind \ac{ldpc} codes
|
|
\cite[Ch.~1]{gallager_low_1960}.
|
|
|
|
%
|
|
% Tanner Graph, VNs and CNs
|
|
%
|
|
|
|
\ac{ldpc} codes belong to a class sometimes referred to as ``modern codes''.
|
|
These differ from ``classical codes'' in their decoding algorithms:
|
|
Classical codes are usually decoded using one-step hard-decision decoding,
|
|
whereas modern codes are suitable for iterative soft-decision
|
|
decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms
|
|
in question are generally defined in terms of message passing on the
|
|
\textit{Tanner graph} of the code. The Tanner graph is a bipartite
|
|
graph that constitutes an alternative representation of the \ac{pcm}.
|
|
We define two types of nodes: \acp{vn}, corresponding to codeword
|
|
bits, and \acp{cn}, corresponding to individual parity checks.
|
|
We then construct the Tanner graph by connecting each \ac{cn} to
|
|
the \acp{vn} that make up the corresponding parity check
|
|
\cite[Sec.~5.1.2]{ryan_channel_2009}.
|
|
Figure \ref{PCM and Tanner graph of the Hamming code} shows this
|
|
construction for the [7,4,3]-Hamming code.
|
|
%
|
|
\begin{figure}[t]
|
|
\centering
|
|
|
|
\begin{align*}
|
|
\bm{H} =
|
|
\begin{pmatrix}
|
|
0 & 1 & 1 & 1 & 1 & 0 & 0 \\
|
|
1 & 0 & 1 & 1 & 0 & 1 & 0 \\
|
|
1 & 1 & 0 & 1 & 0 & 0 & 1 \\
|
|
\end{pmatrix}
|
|
\end{align*}
|
|
|
|
\vspace*{2mm}
|
|
|
|
\tikzset{
|
|
VN/.style={
|
|
circle, fill=KITgreen, minimum width=1mm, minimum height=1mm,
|
|
},
|
|
CN/.style={
|
|
rectangle, fill=KITblue, minimum width=1mm, minimum height=1mm,
|
|
},
|
|
}
|
|
|
|
\begin{tikzpicture}
|
|
\node[VN, label=above:$x_1$] (vn1) {};
|
|
\node[VN, right=12mm of vn1, label=above:$x_2$] (vn2) {};
|
|
\node[VN, right=12mm of vn2, label=above:$x_3$] (vn3) {};
|
|
\node[VN, right=12mm of vn3, label=above:$x_4$] (vn4) {};
|
|
\node[VN, right=12mm of vn4, label=above:$x_5$] (vn5) {};
|
|
\node[VN, right=12mm of vn5, label=above:$x_6$] (vn6) {};
|
|
\node[VN, right=12mm of vn6, label=above:$x_7$] (vn7) {};
|
|
|
|
\node[
|
|
CN, below=25mm of vn4,
|
|
label={below:$x_1 + x_3 + x_4 + x_6 = 0$}
|
|
] (cn2) {};
|
|
\node[
|
|
CN, left=40mm of cn2,
|
|
label={below:$x_2 + x_3 + x_4 + x_5 = 0$}
|
|
] (cn1) {};
|
|
\node[
|
|
CN, right=40mm of cn2,
|
|
label={below:$x_1 + x_2 + x_4 + x_7 = 0$}
|
|
] (cn3) {};
|
|
|
|
\foreach \n in {2,3,4,5} {
|
|
\draw (cn1) -- (vn\n);
|
|
}
|
|
|
|
\foreach \n in {1,3,4,6} {
|
|
\draw (cn2) -- (vn\n);
|
|
}
|
|
|
|
\foreach \n in {1,2,4,7} {
|
|
\draw (cn3) -- (vn\n);
|
|
}
|
|
\end{tikzpicture}
|
|
|
|
\caption{The \ac{pcm} and corresponding Tanner graph of the
|
|
[7,4,3]-Hamming code.}
|
|
\label{PCM and Tanner graph of the Hamming code}
|
|
\end{figure}
|
|
|
|
%
|
|
% N_V(j), N_C(i)
|
|
%
|
|
|
|
Mathematically, we represent a \ac{vn} using the index $i \in
|
|
\mathcal{I} := \left[
|
|
1 : n \right]$ and a \ac{cn} using the index $j \in \mathcal{J}
|
|
:= \left[ 1 : m \right]$.
|
|
We can then encode the information contained in the graph by defining
|
|
the neighborhood of a variable node $i$ as
|
|
$\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : \bm{H}_{j,i}
|
|
= 1 \right\}$
|
|
and that of a check node $j$ as
|
|
$\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : \bm{H}_{j,i}
|
|
= 1 \right\}$.
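For the \ac{pcm} of the Hamming code in Figure \ref{PCM and Tanner graph of the Hamming code}, for example, we have $\mathcal{N}_\text{C}(1) = \left\{ 2,3,4,5 \right\}$ and $\mathcal{N}_\text{V}(1) = \left\{ 2,3 \right\}$.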
|
|
|
|
%
|
|
% Error floor and waterfall regions
|
|
%
|
|
|
|
We typically evaluate the performance of \ac{ldpc} codes using the
\ac{ber} or the \ac{fer} (a \textit{frame} refers to one whole
transmitted block in this context).
|
|
Considering an \ac{awgn} channel, \autoref{fig:ldpc-perf} shows a
|
|
qualitative performance characteristic of an \ac{ldpc} code
|
|
\cite[Fig.~1]{costello_spatially_2014}. We talk of the
|
|
\textit{waterfall} and the \textit{error floor} regions.
|
|
|
|
\begin{figure}[t]
|
|
\centering
|
|
|
|
\begin{tikzpicture}
|
|
\begin{axis}[
|
|
width=12cm,
|
|
height=9cm,
|
|
xlabel={Signal-to-noise ratio},
|
|
ylabel={Error rate},
|
|
% xmin=0, xmax=6,
|
|
enlarge x limits=false,
|
|
ymin=1e-9, ymax=1,
|
|
ticks=none,
|
|
% y tick label={},
|
|
ymode=log,
|
|
grid=both,
|
|
grid style={line width=0.2pt, draw=gray!30},
|
|
major grid style={line width=0.4pt, draw=gray!50},
|
|
legend pos=north east,
|
|
legend cell align={left},
|
|
]
|
|
|
|
\addplot+[mark=none, solid, smooth, KITblue] coordinates {
|
|
(4.5789E-01, 1.1821E-01)
|
|
(6.6842E-01, 9.4575E-02)
|
|
(8.6316E-01, 5.2657E-02)
|
|
(1.0421E+00, 2.2183E-02)
|
|
(1.1789E+00, 8.3588E-03)
|
|
(1.3368E+00, 1.4835E-03)
|
|
(1.4895E+00, 1.6852E-04)
|
|
(1.5842E+00, 2.8285E-05)
|
|
(1.6737E+00, 4.2465E-06)
|
|
(1.7684E+00, 3.4519E-07)
|
|
(1.8316E+00, 3.9213E-08)
|
|
(1.8684E+00, 6.2247E-09)
|
|
(1.9053E+00, 1E-09)
|
|
};
|
|
\addlegendentry{Regular}
|
|
|
|
\addplot+[mark=none, solid, smooth, KITorange] coordinates {
|
|
(4.5789E-01, 1.1821E-01)
|
|
(6.4211E-01, 4.9800E-02)
|
|
(7.5263E-01, 1.2700E-02)
|
|
(8.1579E-01, 2.3177E-03)
|
|
(8.6842E-01, 3.5779E-04)
|
|
(9.1053E-01, 5.3716E-05)
|
|
(9.4737E-01, 4.8818E-06)
|
|
(9.8947E-01, 6.5555E-07)
|
|
(1.0421E+00, 9.5713E-08)
|
|
% (1.0684E+00, 2.9670E-08)
|
|
(1.1474E+00, 1.2499E-08)
|
|
(1.3000E+00, 7.1560E-09)
|
|
(1.4579E+00, 6.0535E-09)
|
|
% (1.6105E+00, 5E-09)
|
|
(1.9579E+00, 4E-09)
|
|
(2.2947E+00, 3.1876E-09)
|
|
% (2.8842E+00, 2.0403E-09)
|
|
};
|
|
\addlegendentry{Irregular}
|
|
|
|
\draw[gray, densely dashed]
|
|
(axis cs:0.65, 2e-3) rectangle (axis cs:1.65, 5e-5);
|
|
\node[below] at (axis cs:1.15, 6e-5) {Waterfall};
|
|
|
|
\draw[gray, densely dashed]
|
|
(axis cs:1, 6e-8) rectangle (axis cs:2, 2e-9);
|
|
\node[above] at (axis cs:1.5, 7e-8) {Error floor};
|
|
\end{axis}
|
|
\end{tikzpicture}
|
|
|
|
\caption{
|
|
Qualitative performance characteristic of an \ac{ldpc} code
|
|
in an \ac{awgn} channel. Adapted from
|
|
\cite[Fig.~1]{costello_spatially_2014}.
|
|
}
|
|
\label{fig:ldpc-perf}
|
|
\end{figure}
|
|
|
|
Broadly, there are two kinds of \ac{ldpc} codes, \textit{regular} and
|
|
\textit{irregular}.
|
|
Regular codes are characterized by the fact that the weights, i.e.,
|
|
the numbers of ones, of their rows and columns are constant
|
|
\cite[Sec.~5.1.1]{ryan_channel_2009}.
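For example, a $(3,6)$-regular \ac{ldpc} code has column weight $3$ and row weight $6$, so its \ac{pcm} contains only $3n$ nonzero entries.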
|
|
Already in their original introduction, regular \ac{ldpc} codes were shown to have
a minimum distance that scales linearly with the block length $n$ for
large $n$ \cite[Ch.~2,~Theorem~1]{gallager_low_1960},
which is why they do not exhibit an error floor under \ac{ml} decoding.
|
|
Irregular codes, on the other hand, generally do exhibit an error floor,
|
|
their redeeming quality being the ability to reach near-capacity
|
|
performance in the waterfall region \cite[Intro.]{costello_spatially_2014}.
|
|
|
|
\subsection{Spatially-Coupled LDPC Codes}
|
|
|
|
A relatively recent development in the world of \ac{ldpc} codes is
|
|
that of \ac{sc}-\ac{ldpc} codes.
|
|
Their key feature is that they combine the best properties of regular
|
|
and irregular codes.
|
|
They have a minimum distance that grows linearly with $n$, promising
good error floor behavior, and capacity-approaching
iterative decoding behavior, promising good performance in the
waterfall region \cite[Intro.]{costello_spatially_2014}.
|
|
|
|
The essential property of \ac{sc}-\ac{ldpc} codes is that codewords
from different \textit{spatial positions}, which would ordinarily be
transmitted independently one after the other, are coupled.
|
|
This is achieved by connecting some \acp{vn} of one spatial position to
|
|
\acp{cn} of another, resulting in a \ac{pcm} of the form
|
|
\cite[Eq.~1]{hassan_fully_2016}
|
|
%
|
|
\begin{align*}
|
|
\bm{H} =
|
|
\begin{pmatrix}
|
|
\bm{H}_0(1) & & \\
|
|
\vdots & \ddots & \\
|
|
\bm{H}_K(1) & & \bm{H}_0(L) \\
|
|
& \ddots & \\
|
|
& & \bm{H}_K(L) \\
|
|
\end{pmatrix}
|
|
,
|
|
\end{align*}
|
|
%
|
|
where $K \in \mathbb{N}$ is the \textit{coupling width} and $L \in
|
|
\mathbb{N}$ is the number of spatial positions.
|
|
This construction results in a Tanner graph as depicted in
|
|
\autoref{fig:sc-ldpc-tanner}.
|
|
|
|
\begin{figure}[t]
|
|
\centering
|
|
|
|
\tikzset{
|
|
VN/.style={
|
|
circle, fill=KITgreen, minimum width=1mm, minimum height=1mm,
|
|
},
|
|
CN/.style={
|
|
rectangle, fill=KITblue, minimum width=1mm, minimum height=1mm,
|
|
},
|
|
}
|
|
|
|
\begin{tikzpicture}[node distance=7mm and 1cm]
|
|
\node[VN] (vn00) {};
|
|
\node[VN, below = of vn00] (vn01) {};
|
|
\node[VN, below = of vn01] (vn02) {};
|
|
\node[VN, below = of vn02] (vn03) {};
|
|
\node[VN, below = of vn03] (vn04) {};
|
|
|
|
\coordinate (temp) at ($(vn01)!0.5!(vn02)$);
|
|
|
|
\node[CN, right = of temp] (cn00) {};
|
|
\node[CN, below = of cn00] (cn01) {};
|
|
|
|
\draw (vn00) -- (cn00);
|
|
\draw (vn01) -- (cn00);
|
|
\draw (vn03) -- (cn00);
|
|
\draw (vn01) -- (cn01);
|
|
\draw (vn02) -- (cn01);
|
|
\draw (vn04) -- (cn01);
|
|
|
|
\foreach \i in {1,2,3} {
|
|
\pgfmathtruncatemacro{\previ}{\i-1}
|
|
\node[VN, right = 25mm of vn\previ 0] (vn\i0) {};
|
|
|
|
\foreach \j in {1,...,4} {
|
|
\pgfmathtruncatemacro{\prevj}{\j-1}
|
|
\node[VN, below = of vn\i\prevj] (vn\i\j) {};
|
|
}
|
|
|
|
\coordinate (temp) at ($(vn\i1)!0.5!(vn\i2)$);
|
|
|
|
\node[CN, right = of temp] (cn\i0) {};
|
|
\node[CN, below = of cn\i0] (cn\i1) {};
|
|
|
|
\draw (vn\i0) -- (cn\i0);
|
|
\draw (vn\i1) -- (cn\i0);
|
|
\draw (vn\i3) -- (cn\i0);
|
|
\draw (vn\i1) -- (cn\i1);
|
|
\draw (vn\i2) -- (cn\i1);
|
|
\draw (vn\i4) -- (cn\i1);
|
|
}
|
|
|
|
\node[right = 25mm of vn30] (vn40) {};
|
|
\node[below = of vn40] (vn41) {};
|
|
\node[below = of vn41] (vn42) {};
|
|
\node[below = of vn42] (vn43) {};
|
|
\node[below = of vn43] (vn44) {};
|
|
|
|
\coordinate (temp) at ($(vn41)!0.5!(vn42)$);
|
|
|
|
\node[right = of temp] (cn40) {};
|
|
\node[below = of cn40] (cn41) {};
|
|
|
|
\foreach \i in {0,1,2} {
|
|
\pgfmathtruncatemacro{\next}{\i+1}
|
|
\pgfmathtruncatemacro{\nextnext}{\i+2}
|
|
|
|
\draw (vn\i 3) to[bend right] (cn\next 1);
|
|
\draw (vn\i 1) to[bend left] (cn\nextnext 0);
|
|
}
|
|
|
|
\draw (vn33) to[bend right] (cn41);
|
|
|
|
\node at ($(cn40)!0.5!(cn41)$) {\dots};
|
|
|
|
\draw[decorate, decoration={brace, amplitude=10pt}]
|
|
([xshift=-5mm,yshift=2mm]vn00.north) --
|
|
([xshift=5mm,yshift=2mm]vn00.north -| cn20.north)
|
|
node[midway, above=4mm] {K};
|
|
\end{tikzpicture}
|
|
|
|
\caption{
|
|
Visualization of the coupling between the Tanner graphs
|
|
of individual spatial positions.
|
|
}
|
|
\label{fig:sc-ldpc-tanner}
|
|
\end{figure}
|
|
|
|
Note that at the first and last few spatial positions, some \acp{cn}
|
|
have lower degrees.
|
|
This leads to more reliable information about the connected
\acp{vn}, which, as we will see, is
passed on to subsequent spatial positions during decoding.
|
|
This is precisely the effect that leads to the good performance of
|
|
\ac{sc}-\ac{ldpc} codes in the waterfall region \cite{costello_spatially_2014}.
|
|
|
|
\subsection{Iterative Decoding}
|
|
|
|
% Introduction
|
|
|
|
\ac{ldpc} codes are generally decoded using efficient iterative
|
|
algorithms, something that is possible due to their sparsity
|
|
\cite[Sec.~5.3]{ryan_channel_2009}.
|
|
The algorithm originally proposed for this purpose by Gallager in
1960 alongside \ac{ldpc} codes is now known as the \ac{spa}
\cite[Sec.~5.4.1]{ryan_channel_2009}, also called \ac{bp}.
|
|
|
|
The optimality criterion the \ac{spa} is built around is a
|
|
symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
|
|
The core idea of the resulting algorithm is to view \acp{cn} as
|
|
representing single-parity check codes and \acp{vn} as representing
|
|
repetition codes.
|
|
The algorithm alternates between consolidating soft information about
|
|
the \acp{vn} in the \acp{cn}, and consolidating soft information about
|
|
the \acp{cn} in the \acp{vn}.
|
|
To this end, messages are passed back and forth along the edges of
|
|
the Tanner graph.
|
|
$L_{i\rightarrow j}$ represents a message passed from \ac{vn} $i$ to
\ac{cn} $j$, and $L_{i\leftarrow j}$ represents a message passed from
\ac{cn} $j$ to \ac{vn} $i$.
The \acp{vn} additionally receive messages \cite[Sec.~5.4.2]{ryan_channel_2009}
|
|
\begin{align*}
|
|
\tilde{L}_i = \log \frac{P(X=0 \vert Y=y)}{P(X=1 \vert Y=y)},
|
|
\end{align*}
|
|
computed from the channel outputs.
|
|
The consolidation of the information occurs in the \ac{vn} update
|
|
\begin{align*}
|
|
	L_{i\rightarrow j} = \tilde{L}_i + \sum_{j'\in \mathcal{N}_\text{V}(i)\setminus
	j} L_{i\leftarrow j'}
|
|
\end{align*}
|
|
and the \ac{cn} update
|
|
\begin{align*}
|
|
	L_{i\leftarrow j} = 2\cdot \tanh^{-1} \left( \prod_{i'\in
	\mathcal{N}_\text{C}(j)\setminus i} \tanh \frac{L_{i'\rightarrow j}}{2} \right)
|
|
.
|
|
\end{align*}
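
To make the message-passing schedule concrete, the following minimal Python
sketch performs one flooding iteration of these two updates for a given
\ac{pcm}; it is intended purely as an illustration, all names are our own,
and numerical edge cases are not handled carefully.
\begin{verbatim}
import numpy as np

def bp_iteration(H, L_ch, L_c2v):
    # H:     (m, n) binary parity-check matrix
    # L_ch:  (n,)   messages L~_i computed from the channel output
    # L_c2v: (m, n) check-to-variable messages from the previous
    #        iteration (all zeros before the first iteration)
    mask = H == 1

    # VN update: L_{i->j} = L~_i + sum of L_{i<-j'} over j' in N_V(i)\{j}
    total_in = (L_c2v * mask).sum(axis=0)
    L_v2c = np.where(mask, L_ch + total_in - L_c2v, 0.0)

    # CN update: L_{i<-j} = 2 atanh( prod of tanh(L_{i'->j}/2)
    #                                over i' in N_C(j)\{i} )
    t = np.where(mask, np.tanh(L_v2c / 2.0), 1.0)
    row_prod = t.prod(axis=1, keepdims=True)
    extrinsic = np.divide(row_prod, t, out=np.ones_like(t), where=t != 0)
    L_c2v_new = np.where(
        mask,
        2.0 * np.arctanh(np.clip(extrinsic, -1 + 1e-12, 1 - 1e-12)),
        0.0,
    )

    # Aggregated messages used for the hard decision on each bit
    L_total = L_ch + (L_c2v_new * mask).sum(axis=0)
    return L_v2c, L_c2v_new, L_total
\end{verbatim}
In practice, this iteration is repeated until the hard decision obtained
from the aggregated messages satisfies all parity checks or a maximum
number of iterations is reached.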
|
|
|
|
A basic assumption for the derivation of the \ac{spa} is that the
|
|
messages are statistically independent.
|
|
If the Tanner graph has cycles, however, this
|
|
condition is not met.
|
|
The shorter the cycles, the sooner this condition is violated and the
|
|
worse the approximation becomes \cite[Sec.~5.4.4]{ryan_channel_2009}.
|
|
Cycles of length four (so-called \emph{$4$-cycles}) are the shortest
|
|
possible cycles and are thus especially problematic.
|
|
|
|
% Min-sum algorithm
|
|
|
|
A simplification of the \ac{spa} is the min-sum decoder. Here, the
|
|
\ac{cn} update is approximated as \cite[Sec.~5.5.1]{ryan_channel_2009}
|
|
\begin{align*}
|
|
	L_{i \leftarrow j} = \prod_{i' \in \mathcal{N}_\text{C}(j)\setminus i}
	\sign \left( L_{i' \rightarrow j} \right)
	\cdot \min_{i' \in \mathcal{N}_\text{C}(j)\setminus i} \lvert
	L_{i'\rightarrow j} \rvert
|
|
.
|
|
\end{align*}
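For instance, if the two extrinsic incoming messages at a check node are $+1.2$ and $-0.4$, the min-sum update yields $(+1)\cdot(-1)\cdot\min(1.2,\,0.4) = -0.4$, whereas the exact \ac{cn} update gives approximately $-0.21$; the approximation thus tends to overestimate the message magnitudes.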
|
|
|
|
% Sliding-window decoding
|
|
|
|
For \ac{sc}-\ac{ldpc} codes, the iterative decoding process is wrapped by a
|
|
windowing step. This is done to reduce the latency and memory requirements and
|
|
also the overall computational complexity \cite{costello_spatially_2014}.
|
|
To this end, the Tanner graph is split into several overlapping windows.
|
|
During decoding, the messages that are passed along the edges of the
|
|
graph in the overlapping regions are kept in memory and used for the
|
|
decoding of subsequent blocks \cite[Sec.~III.~C.]{hassan_fully_2016}.
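Typically, such a window covers a fixed number $W$ of consecutive spatial positions: the decoder iterates on positions $t, \dots, t+W-1$, outputs its decisions for position $t$ only, and then slides the window forward by one position.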
|
|
|
|
\section{Quantum Mechanics and Quantum Information Science}
|
|
\label{sec:Quantum Mechanics and Quantum Information Science}
|
|
|
|
Designing codes and decoders for \ac{qec} is generally performed on a
|
|
layer of abstraction far removed from the quantum mechanical
|
|
processes underlying the actual qubits.
|
|
Nevertheless, having a fundamental understanding of the related
|
|
quantum mechanical concepts is useful to understand the unique constraints
|
|
of this field.
|
|
The purpose of this section is to convey these concepts to the reader.
|
|
|
|
%%%%%%%%%%%%%%%%
|
|
\subsection{Core Concepts and Notation}
|
|
\label{subsec:Notation}
|
|
|
|
% Wave functions
|
|
|
|
In quantum mechanics, the evolution of the state of a particle over time
and space is described by a \emph{wave function} $\psi(x,t)$.
The connection between this function and the world that we can observe
is the fact that $\lvert \psi (x,t) \rvert^2$ is the \ac{pdf} for
finding the particle at position $x$ at time $t$.
|
|
|
|
% Dirac notation
|
|
|
|
A lot of the related mathematics can be very elegantly expressed
|
|
using the language of linear algebra.
|
|
The so-called bra-ket or Dirac notation is especially appropriate,
having been proposed by Paul Dirac in 1939 for the express purpose
of simplifying quantum mechanical notation \cite{dirac_new_1939}.
|
|
Two new symbols are defined, \emph{bra}s $\bra{\cdot}$ and
|
|
\emph{ket}s $\ket{\cdot}$.
|
|
Kets denote ordinary vectors, while bras denote their Hermitian conjugates.
|
|
For example, two vectors specified by the labels $a$ and $b$
|
|
respectively are written as $\ket{a}$ and $\ket{b}$.
|
|
Their inner product is $\braket{a\vert b}$.
|
|
|
|
% Expressing wave functions using linear algebra
|
|
|
|
We can model a wave function $\psi(x,t)$ as a linear combination of different
|
|
\emph{basis functions} $e_n(x,t),~n\in \mathbb{N}$ as%
|
|
\begin{align*}
|
|
\psi(x,t) = \sum_{n=1}^{\infty} c_n \cdot e_n(x,t)
|
|
.%
|
|
\end{align*}
|
|
To express this relation using linear algebra, we represent
|
|
$\psi(x,t)$ and $e_n(x,t)$ as vectors $\ket{\psi}$ and $\ket{e_n}$.
|
|
We write%
|
|
\begin{align*}
|
|
\ket{\psi} = \sum_{n=1}^{\infty} c_n \ket{e_n}
|
|
.%
|
|
\end{align*}
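If only two basis states are relevant, as for the qubits discussed in \autoref{subsec:Qubits and Multi-Qubit States}, we may for example represent them as $\ket{e_1} = (1, 0)^\text{T}$ and $\ket{e_2} = (0, 1)^\text{T}$, so that $\ket{\psi}$ becomes the two-dimensional vector $(c_1, c_2)^\text{T}$.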
|
|
|
|
% Operators
|
|
|
|
Another important notion is that of an \emph{operator}, a mapping
that takes a function as input and returns another function as output.
Operators are useful for describing the relations between different
quantities associated with a particle.
An example is the differential operator $\frac{\partial}{\partial x}$.
|
|
|
|
%%%%%%%%%%%%%%%%
|
|
\subsection{Observables}
|
|
\label{subsec:Observables}
|
|
|
|
% Observable quantities
|
|
|
|
An \emph{observable quantity} $Q$ is a physical property of a system, such as its position or momentum, that can (at least in principle) be measured.
|
|
Due to the probabilistic nature of quantum mechanics, the result of a
|
|
measurement is not deterministic.
|
|
Thus, it is useful to consider the \emph{expected value} $\braket{Q}$
|
|
of an observable quantity in addition to individual measurement results.
|
|
|
|
% General expression for expected value of observable quantity
|
|
|
|
If we know the wave function of a particle, we should be able to
|
|
compute $\braket{Q}$ for any observable quantity we wish.
|
|
It can be shown that for any $Q$, we can construct a
corresponding operator $\hat{Q}$ such that%
|
|
\begin{align}
|
|
\label{eq:gen_expr_Q_exp}
|
|
\braket{Q} = \int_{-\infty}^{\infty} \psi^*(x,t) \hat{Q} \psi(x,t) dx
|
|
.%
|
|
\end{align}%
|
|
While the derivation of this relationship is out of the scope of this
|
|
work, we can at least look at an example to illustrate it.
|
|
Considering the position $Q = x$ of a particle and setting the observable
|
|
operator to $\hat{Q} = x$, we can write%
|
|
\begin{align*}
|
|
\braket{x} = \int_{-\infty}^{\infty} \psi^*(x,t) \cdot x \cdot \psi(x,t) dx
|
|
= \int_{-\infty}^{\infty} x \lvert \psi(x,t) \rvert ^2 dx
|
|
.%
|
|
\end{align*}
|
|
Note that $\lvert \psi(x,t) \rvert^2$ represents the \ac{pdf} of
the particle's position. We immediately see that the
formula reduces to the familiar calculation of an expected value.
|
|
|
|
% Determinate states and eigenvalues
|
|
|
|
% TODO: Introduce determinate states above
|
|
% TODO: Nicer phrasing
|
|
% TODO: Use different symbol for determinate states (not psi)
|
|
% TODO: Fix equation
|
|
Let us now examine how the observable operator $\hat{Q}$ relates to
the \emph{determinate states}, i.e., the states for which a
measurement of $Q$ always yields the same value, that make up the
overall superposition state of the particle.
We begin by translating \autoref{eq:gen_expr_Q_exp} into linear algebra as%
|
|
\begin{align}
|
|
\label{eq:gen_expr_Q_exp_lin}
|
|
\braket{Q} = \braket{\psi \vert \hat{Q}\psi}
|
|
.%
|
|
\end{align}
|
|
\autoref{eq:gen_expr_Q_exp_lin} expresses an inherently probabilistic
relationship, whereas the determinate states are inherently deterministic.
To relate the two, we look at those states $\ket{\psi}$ for which the
variance of the measurements of $Q$ is zero; these are exactly the
determinate states.
Using the fact that $\hat{Q}$ is Hermitian, we obtain%
|
|
\begin{align}
	0 &\overset{!}{=} \braket{(Q - \braket{Q})^2}
	= \braket{\psi \vert (\hat{Q} - \braket{Q})^2 \psi} \nonumber\\
	&= \braket{(\hat{Q} - \braket{Q})\psi \vert (\hat{Q} - \braket{Q})
	\psi} \nonumber\\
	&= \lVert (\hat{Q} - \braket{Q}) \psi \rVert^2 \nonumber\\[3mm]
	&\hspace{-8mm}\Leftrightarrow (\hat{Q} - \braket{Q}) \ket{\psi} =
	0 \nonumber\\
	\label{eq:observable_eigenrelation}
	&\hspace{-8mm}\Leftrightarrow \hat{Q}\ket{\psi}
	= \underbrace{\braket{Q}}_{\lambda_n} \ket{\psi}
	.%
\end{align}%
|
|
%
|
|
Because we have assumed the variance to be zero, $\braket{Q}$ is now
|
|
the deterministic measurement value corresponding to the determinate
|
|
state $\ket{\psi}$.
|
|
We can see that the determinate states are the \emph{eigenstates} of
|
|
the observable operator $\hat{Q}$ and that the corresponding
|
|
(deterministic) measurement values are the corresponding
|
|
\emph{eigenvalues} $\lambda_n$.
|
|
|
|
% Determinate states as a basis
|
|
|
|
% TODO: Rephrase
|
|
% TODO: Show that |c_n|^2 is the probability of finding a particle in
|
|
% a given state
|
|
% In particular, using the determinate states $\ket{e_n}$ as a basis to
|
|
% write the superimposed state
|
|
% \begin{align*}
|
|
% \ket{\psi} = \sum_{n=1}^{\infty} c_n \ket{e_n}
|
|
% ,
|
|
% \end{align*}
|
|
|
|
% Recap
|
|
|
|
% TODO: Mention that `observable` is used to refer to the observable operator
|
|
% TODO: Mention eigenstates and eigenvalues again
|
|
To summarize, we can mathematically express any observable quantity
|
|
$Q$ using a corresponding operator $\hat{Q}$.
|
|
This operator allows us to both compute the expected value of the
|
|
observable using \autoref{eq:gen_expr_Q_exp_lin}, and describe the
|
|
individual determinate states and corresponding measurement values
|
|
using \autoref{eq:observable_eigenrelation}.
|
|
|
|
%%%%%%%%%%%%%%%%
|
|
\subsection{Projective Measurements}
|
|
\label{subsec:Projective Measurements}
|
|
|
|
% Projective measurements
|
|
|
|
% TODO: Better introduce the collapse of the superposition state
|
|
The measurements we considered in the previous section, for which
|
|
\autoref{eq:gen_expr_Q_exp_lin} holds, belong to the category of
|
|
\emph{projective measurements}.
|
|
For these, certain restrictions such as repeatability apply: after
|
|
measuring a quantum state and thus collapsing it onto one of the
|
|
determinate states, further measurements should yield the same value.
|
|
More general methods of modelling measurements exist, e.g., describing
|
|
destructive measurements, but they are not relevant to us here
|
|
\cite[Box~2.5]{nielsen_quantum_2010}.
|
|
|
|
% Projection operators
|
|
|
|
% TODO: Fix notational issues related to e_n
|
|
We can model the collapse of the original state onto one of the
|
|
superimposed basis states as a \emph{projection}.
|
|
To see this, we expand $\ket{\psi}$ in the basis of determinate states
$\ket{e_n}$ and apply \autoref{eq:observable_eigenrelation} to each term, obtaining%
|
|
\begin{align*}
|
|
\hat{Q}\ket{\psi} = \sum_{n=1}^{\infty} c_n \hat{Q} \ket{e_n}
|
|
= \sum_{n=1}^{\infty} \lambda_n c_n \ket{e_n}
|
|
.%
|
|
\end{align*}%
|
|
We see that $\hat{Q}$ has the effect of multiplying the component
|
|
along each basis vector with the corresponding eigenvalue.
|
|
We decompose $\hat{Q}$ into its constituent parts that act on each of
|
|
the separate components as
|
|
\begin{align*}
|
|
\hat{Q} = \sum_{n=1}^{\infty} \lambda_n \hat{P}_n
|
|
\end{align*}
|
|
using \emph{projection operators}
|
|
\begin{align*}
|
|
\hat{P}_n := \ket{e_n}\bra{e_n}, \hspace{3mm} n\in \mathbb{N}
|
|
.
|
|
\end{align*}%
|
|
These project a vector onto the subspace spanned by $\ket{e_n}$.
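In the two-dimensional case with $\ket{e_1} = (1, 0)^\text{T}$, for example, we obtain
\begin{align*}
	\hat{P}_1 = \ket{e_1}\bra{e_1} =
	\begin{pmatrix}
		1 & 0 \\
		0 & 0 \\
	\end{pmatrix}
	,
\end{align*}
which keeps the component of a vector along $\ket{e_1}$ and removes the component along the orthogonal basis vector.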
|
|
|
|
% Using projection operators to measure if a state has a component
|
|
% along a basis vector
|
|
|
|
A particularly interesting property of projection operators is that
|
|
\begin{align*}
|
|
\hat{P}_n (\hat{P}_n \ket{\psi}) = \hat{P}_n^2 \ket{\psi}
|
|
= \hat{P}_n \ket{\psi},
|
|
\end{align*}%
|
|
and since this holds for every $\ket{\psi}$, the operator $\hat{P}_n$
can only have the eigenvalues $0$ and $1$.
|
|
As explained in the previous section, the eigenvalues are the results
|
|
of performing a measurement.
|
|
We can thus use the projection operator as an observable and treat
|
|
the eigenvalue as an indicator of the state having a component along
|
|
the related basis vector.
|
|
|
|
%%%%%%%%%%%%%%%%
|
|
\subsection{Qubits and Multi-Qubit States}
|
|
\label{subsec:Qubits and Multi-Qubit States}
|
|
|
|
\red{
|
|
\begin{itemize}
|
|
\item Qubits and multi-qubit states
|
|
\begin{itemize}
|
|
\item The qubit
|
|
\begin{itemize}
|
|
\item Similar structure to classical
|
|
computing: bits are modified with gates
|
|
-> quantum bits are modified with quantum gates
|
|
\end{itemize}
|
|
\item The tensor product
|
|
\item Information is not stored in the individual bit
|
|
states but in the correlations / entanglement between them
|
|
\item -> The size of the vector space
|
|
\item The X,Z and Y operators
|
|
\item (?) Notation of operators on multi-qubit states
|
|
\end{itemize}
|
|
\end{itemize}
|
|
}
|
|
|
|
\red{
|
|
\begin{itemize}
|
|
\item Representing wave functions as vectors (psi as label,
|
|
building a vector space using basis functions)
|
|
\end{itemize}
|
|
}
|
|
|
|
\red{\textbf{Tensor product}}
|
|
\red{\ldots
|
|
Take for example two systems with the determinate states $\ket{0}$
|
|
and $\ket{1}$. In general, the state of each can be written as the
|
|
superposition%
|
|
%
|
|
\begin{align*}
|
|
\alpha \ket{0} + \beta \ket{1}
|
|
.%
|
|
\end{align*}
|
|
%
|
|
Combining these two systems into one, the overall state becomes%
|
|
%
|
|
\begin{align*}
|
|
&\mleft( \alpha_1 \ket{0} + \beta_1 \ket{1} \mright) \otimes
|
|
\mleft( \alpha_2 \ket{0} + \beta_2 \ket{1} \mright) \\
|
|
= &\alpha_1 \alpha_2 \ket{0} \ket{0}
|
|
	+ \alpha_1 \beta_2 \ket{0} \ket{1}
|
|
+ \beta_1 \alpha_2 \ket{1} \ket{0}
|
|
+ \beta_1 \beta_2 \ket{1} \ket{1}
|
|
% =: &\alpha_{00} \ket{00}
|
|
% + \alpha_{01} \ket{01}
|
|
% + \alpha_{10} \ket{10}
|
|
% + \alpha_{11} \ket{11}
|
|
.%
|
|
\end{align*}%
|
|
%
|
|
\ldots When not ambiguous in the context, the tensor product
|
|
symbol may be omitted, e.g.,
|
|
\begin{align*}
|
|
\ket{0} \otimes \ket{0} = \ket{0}\ket{0}
|
|
.%
|
|
\end{align*}
|
|
}
|
|
|
|
As we will see, the core concept that gives quantum computing its
|
|
power is entanglement. When two quantum mechanical systems are
|
|
entangled, measuring the state of one will collapse that of the other.
|
|
Take for example two subsystems with the overall state
|
|
%
|
|
\begin{align*}
|
|
\ket{\psi} = \frac{1}{\sqrt{2}} \mleft( \ket{0}\ket{0} +
|
|
\ket{1}\ket{1} \mright)
|
|
.%
|
|
\end{align*}
|
|
%
|
|
If we measure the first subsystem as being in $\ket{0}$, we can
|
|
be certain that a measurement of the second subsystem will also yield $\ket{0}$.
|
|
Introducing the common shorthand $\ket{a}\ket{b} =: \ket{ab}$ for multi-qubit states, we can write%
|
|
%
|
|
\begin{align*}
|
|
\ket{\psi} = \frac{1}{\sqrt{2}} \left( \ket{00} + \ket{11} \right)
|
|
.%
|
|
\end{align*}
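Measuring the first subsystem of this state yields $0$ or $1$ with probability $\lvert 1/\sqrt{2} \rvert^2 = 1/2$ each, after which the joint state has collapsed onto $\ket{00}$ or $\ket{11}$, respectively.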
|
|
%
|
|
|
|
%%%%%%%%%%%%%%%%
|
|
\subsection{Quantum Gates}
|
|
\label{subsec:Quantum Gates}
|
|
|
|
\red{
|
|
\textbf{Content:}
|
|
\begin{itemize}
|
|
\item Bra-ket notation
|
|
\item The tensor product
|
|
\item Projective measurements (the related operators,
|
|
eigenvalues/eigenspaces, etc.)
|
|
\begin{itemize}
|
|
\item First explain what an operator is
|
|
\end{itemize}
|
|
\item Abstract intro to QC: Use gates to process qubit
|
|
states, similar to classical case
|
|
\item X, Z, Y operators/gates
|
|
\item Hadamard gate (+ X and Z are the same thing in different bases)
|
|
\item Notation of operators on multi-qubit states
|
|
\item The Pauli, Clifford and Magic groups
|
|
\end{itemize}
|
|
}
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{Quantum Error Correction}
|
|
\label{sec:Quantum Error Correction}
|
|
|
|
\red{
|
|
\textbf{Content:}
|
|
\begin{itemize}
|
|
\item General context
|
|
\begin{itemize}
|
|
\item Why we want QC
|
|
\item Why we need QEC (correcting errors due to noisy gates)
|
|
\item Main challenges of QEC compared to classical
|
|
error correction
|
|
\end{itemize}
|
|
\item Stabilizer codes
|
|
\begin{itemize}
|
|
\item Definition of a stabilizer code
|
|
\item The stabilizer and its generators (note somewhere
|
|
that the generators have to commute to be able to
|
|
be measured without disturbing each other)
|
|
\item syndrome extraction circuit
|
|
\item Stabilizer codes are effectively the QM
|
|
% TODO: Actually binary linear codes or just linear codes?
|
|
equivalent of binary linear codes (e.g.,
|
|
expressible via check matrix)
|
|
\end{itemize}
|
|
\item Digitization of errors
|
|
\item CSS codes
|
|
\item Color codes?
|
|
\item Surface codes?
|
|
\item Fault tolerant error correction (gates with which we do
|
|
error correction are also noisy)
|
|
\begin{itemize}
|
|
\item Transversal operations
|
|
\item \dots
|
|
\end{itemize}
|
|
\item Circuit level noise
|
|
\item Detector error model
|
|
\begin{itemize}
|
|
\item Columns of the check matrix represent different
|
|
possible error patterns $\rightarrow$ Check matrix
|
|
doesn't quite correspond to the codewords we used
|
|
initially anymore, but some similar structure is
|
|
still there (compare with syndrome)
|
|
\end{itemize}
|
|
\end{itemize}
|
|
\textbf{General Notes:}
|
|
\begin{itemize}
|
|
\item Give a brief overview of the history of QEC
|
|
\item Note (and research if this is actually correct) that QC
|
|
was developed on an abstract level before thinking of
|
|
what hardware to use
|
|
\item Note that there are other codes than stabilizer codes
|
|
(and research and give some examples), but only
|
|
stabilizer codes are considered in this work
|
|
\item Degeneracy
|
|
\item The QEC decoding problem (considering degeneracy)
|
|
\end{itemize}
|
|
}
|
|
|
|
\subsection{Stabilizer Codes}
|
|
\subsection{CSS Codes}
|
|
\subsection{Quantum Low-Density Parity-Check Codes}
|
|
|