\chapter{Fundamentals}
\label{ch:Fundamentals}

\Ac{qec} is a field of research combining ``classical''
communications engineering and quantum information science.
This chapter provides the relevant theoretical background on both of
these topics and subsequently introduces the fundamentals of \ac{qec}.

% TODO: Is an explanation of BP with guided decimation needed in this chapter?
% TODO: Is an explanation of OSD needed in this chapter?

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Classical Error Correction}
\label{sec:Classical Error Correction}

The core concept underpinning error-correcting codes is the
realization that introducing a finite amount of redundancy to
information before transmission can considerably reduce the error rate.
Specifically, Shannon proved in 1948 that for any channel, a block
code can be found that achieves an arbitrarily small probability of
error at any communication rate up to the capacity of the channel
as the block length approaches infinity
\cite[Sec.~13]{shannon_mathematical_1948}.

In this section, we explore the concepts of ``classical'' (as in non-quantum)
error correction that are central to this work.
We start by looking at different ways of encoding information,
first considering binary linear block codes in general and then \ac{ldpc} and
\ac{sc}-\ac{ldpc} codes.
Finally, we pivot to the decoding process, specifically the \ac{bp}
algorithm.

\subsection{Binary Linear Block Codes}

%
% Codewords, n, k, rate
%

One particularly important class of coding schemes is that of binary
linear block codes.
The information to be protected takes the form of a sequence of
binary symbols, which is split into separate blocks.
Each block is encoded, transmitted, and decoded separately.
The encoding step introduces redundancy by mapping input messages
$\bm{u} \in \mathbb{F}_2^k$ of length $k \in \mathbb{N}$ (called the
\textit{information length}) onto \textit{codewords} $\bm{x} \in
\mathbb{F}_2^n$ of length $n \in \mathbb{N}$ (called the
\textit{block length}) with $n > k$.
A measure of the amount of introduced redundancy is the \textit{code
rate} $R = k/n$.
We call the set of all codewords $\mathcal{C}$ the \textit{code}
\cite[Sec.~3.1.1]{ryan_channel_2009}.

%
% d_min and the [] Notation
%

During the encoding process, a mapping from $\mathbb{F}_2^k$
onto $\mathcal{C} \subset \mathbb{F}_2^n$ takes place.
The input messages are mapped onto an expanded vector space, where
they are ``further apart'', giving rise to the error-correcting
properties of the code.
This notion of the distance between two codewords $\bm{x}_1$ and
$\bm{x}_2$ can be expressed using the \textit{Hamming distance} $d(\bm{x}_1,
\bm{x}_2)$, which is defined as the number of positions in which they differ.
We define the \textit{minimum distance} of a code $\mathcal{C}$ as
%
\begin{align*}
d_\text{min} := \min \left\{ d(\bm{x}_1, \bm{x}_2) : \bm{x}_1,
\bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\}
.
\end{align*}
%
We can signify that a binary linear block code has information length
$k$, block length $n$, and minimum distance $d_\text{min}$ using the
notation $[n,k,d_\text{min}]$ \cite[Sec.~1.3]{macwilliams_theory_1977}.
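For small codes, these definitions can be checked directly by brute force. The following Python sketch (an illustrative aside with made-up variable names, not part of the cited literature) enumerates all of $\mathbb{F}_2^7$ for the [7,4,3]-Hamming code and uses the fact that, by linearity, $d_\text{min}$ equals the minimum weight of a nonzero codeword:

```python
import itertools

import numpy as np

# PCM of the [7,4,3]-Hamming code (the same code is used as an example below).
H = np.array([
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
])

# Enumerate all vectors in F_2^7 and keep those satisfying every parity check.
codewords = [
    np.array(v) for v in itertools.product([0, 1], repeat=7)
    if not np.any(H @ np.array(v) % 2)
]

# For a linear code, d_min equals the minimum nonzero codeword weight.
d_min = min(int(c.sum()) for c in codewords if c.any())
print(len(codewords), d_min)  # 16 codewords (2^k with k = 4), d_min = 3
```

Enumerating $2^n$ vectors is of course only feasible for toy block lengths; it merely illustrates the definitions.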

%
% Parity checks, H, and the syndrome
%

A particularly elegant way of describing the subspace $\mathcal{C}$ of
$\mathbb{F}_2^n$ that the codewords make up is the notion of
\textit{parity checks}.
Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n
\rvert = 2^n$, we can introduce $n-k$ conditions to constrain the
additional degrees of freedom.
These conditions, called parity checks, take the form of equations
over $\mathbb{F}_2$, linking the individual positions of each codeword.
We can arrange the coefficients of these equations in a
\textit{parity-check matrix} (\acs{pcm}) $\bm{H} \in
\mathbb{F}_2^{(n-k) \times n}$ and equivalently define the code as
\cite[Sec.~3.1.1]{ryan_channel_2009}
\begin{align*}
\mathcal{C} = \left\{ \bm{x} \in \mathbb{F}_2^n :
\bm{H}\bm{x}^\text{T} = \bm{0} \right\}
.%
\end{align*}
Note that in general we may have linearly dependent parity checks,
prompting us to define the \ac{pcm} as $\bm{H} \in
\mathbb{F}_2^{m\times n}$ with $m \ge n-k$ instead.
The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes
which parity checks a candidate codeword $\bm{v} \in \mathbb{F}_2^n$ violates.
The representation using the \ac{pcm} has the benefit of providing a
description of the code whose memory complexity does not grow
exponentially with $n$, in contrast to keeping track of all codewords directly.
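As a short sketch of the syndrome computation $\bm{s} = \bm{H}\bm{v}^\text{T}$, again using the [7,4,3]-Hamming code as a toy example (the chosen codeword and error position are arbitrary): flipping a single bit makes the syndrome equal the corresponding column of $\bm{H}$.

```python
import numpy as np

# PCM of the [7,4,3]-Hamming code.
H = np.array([[0, 1, 1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])

x = np.array([1, 0, 0, 0, 0, 1, 1])  # a valid codeword: H x^T = 0 (mod 2)
assert not np.any(H @ x % 2)

# Flip one bit to emulate a channel error.
v = x.copy()
v[3] ^= 1                            # error at position 4 (index 3)
s = H @ v % 2                        # syndrome: which checks are violated
print(s)                             # [1 1 1] -> column 4 of H
```

For this particular code, the syndrome uniquely identifies any single-bit error, since all columns of $\bm{H}$ are distinct and nonzero.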

%
% The decoding problem
%

Figure \ref{fig:Diagram of a transmission system} visualizes the
communication process \cite[Sec.~1.1]{ryan_channel_2009}.
An input message $\bm{u}\in \mathbb{F}_2^k$ is mapped onto a codeword $\bm{x}
\in \mathbb{F}_2^n$. This is passed on to a modulator, which
interacts with the physical channel.
A demodulator processes the channel output and forwards the result
$\bm{y}$ to a decoder.
We differentiate between \textit{soft-decision} decoding, where
$\bm{y} \in \mathbb{R}^n$, and \textit{hard-decision} decoding, where
$\bm{y} \in \mathbb{F}_2^n$ \cite[Sec.~1.5.1.3]{ryan_channel_2009}.
Finally, the decoder is responsible for obtaining an estimate
$\hat{\bm{u}} \in \mathbb{F}_2^k$ of the original input message.
This is done by first finding an estimate $\hat{\bm{x}}$ of the sent
codeword and then undoing the encoding.
The decoding problem that we generally attempt to solve thus consists
of finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$.

\begin{figure}[t]
\centering

\tikzset{
box/.style={
rectangle, draw=black, minimum width=17mm, minimum height=8mm,
},
}

\begin{tikzpicture}
[
node distance = 2mm and 7mm,
]
\node (in) {};
\node[box, right=of in] (enc) {Encoder};
\node[box, minimum width=25mm, right=of enc] (mod) {Modulator};
\node[box, below right=of mod] (cha) {Channel};
\node[box, minimum width=25mm, below left=of cha] (dem) {Demodulator};
\node[box, left=of dem] (dec) {Decoder};
\node[left=of dec] (out) {};

\draw[-{latex}] (in) -- (enc) node[midway, above] {$\bm{u}$};
\draw[-{latex}] (enc) -- (mod) node[midway, above] {$\bm{x}$};
\draw[-{latex}] (mod) -| (cha);
\draw[-{latex}] (cha) |- (dem);
\draw[-{latex}] (dem) -- (dec) node[midway, above] {$\bm{y}$};
\draw[-{latex}] (dec) -- (out) node[midway, above] {$\hat{\bm{u}}$};
\end{tikzpicture}

\caption{Overview of a transmission system.}
\label{fig:Diagram of a transmission system}
\end{figure}
%

%
% Hard vs. soft information
%

\subsection{Low-Density Parity-Check Codes}

%
% Core concept
%

Shannon's noisy-channel coding theorem is stated for codes whose block
length approaches infinity. This suggests that as the block length
becomes larger, the performance of the considered codes should
generally improve.
However, the size of the \ac{pcm}, and thus in general the decoding complexity,
of a linear block code grows quadratically with $n$.
This would quickly render decoding intractable as we increase the block length.
We can get around this problem by constructing $\bm{H}$ in such a
manner that the number of nonzero entries grows less than quadratically, e.g.,
only linearly.
This is exactly the motivation behind \ac{ldpc} codes
\cite[Ch.~1]{gallager_low_1960}.

%
% Tanner Graph, VNs and CNs
%

\ac{ldpc} codes belong to a class sometimes referred to as ``modern codes''.
These differ from ``classical codes'' in their decoding algorithms:
classical codes are usually decoded using one-step hard-decision decoding,
whereas modern codes are suitable for iterative soft-decision
decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms
in question are generally defined in terms of message passing on the
\textit{Tanner graph} of the code. The Tanner graph is a bipartite
graph that constitutes an alternative representation of the \ac{pcm}.
We define two types of nodes: \acp{vn}, corresponding to codeword
bits, and \acp{cn}, corresponding to individual parity checks.
We then construct the Tanner graph by connecting each \ac{cn} to
the \acp{vn} that make up the corresponding parity check
\cite[Sec.~5.1.2]{ryan_channel_2009}.
Figure \ref{PCM and Tanner graph of the Hamming code} shows this
construction for the [7,4,3]-Hamming code.
%
\begin{figure}[t]
\centering

\begin{align*}
\bm{H} =
\begin{pmatrix}
0 & 1 & 1 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1 \\
\end{pmatrix}
\end{align*}

\vspace*{2mm}

\tikzset{
VN/.style={
circle, fill=KITgreen, minimum width=1mm, minimum height=1mm,
},
CN/.style={
rectangle, fill=KITblue, minimum width=1mm, minimum height=1mm,
},
}

\begin{tikzpicture}
\node[VN, label=above:$x_1$] (vn1) {};
\node[VN, right=12mm of vn1, label=above:$x_2$] (vn2) {};
\node[VN, right=12mm of vn2, label=above:$x_3$] (vn3) {};
\node[VN, right=12mm of vn3, label=above:$x_4$] (vn4) {};
\node[VN, right=12mm of vn4, label=above:$x_5$] (vn5) {};
\node[VN, right=12mm of vn5, label=above:$x_6$] (vn6) {};
\node[VN, right=12mm of vn6, label=above:$x_7$] (vn7) {};

\node[
CN, below=25mm of vn4,
label={below:$x_1 + x_3 + x_4 + x_6 = 0$}
] (cn2) {};
\node[
CN, left=40mm of cn2,
label={below:$x_2 + x_3 + x_4 + x_5 = 0$}
] (cn1) {};
\node[
CN, right=40mm of cn2,
label={below:$x_1 + x_2 + x_4 + x_7 = 0$}
] (cn3) {};

\foreach \n in {2,3,4,5} {
\draw (cn1) -- (vn\n);
}

\foreach \n in {1,3,4,6} {
\draw (cn2) -- (vn\n);
}

\foreach \n in {1,2,4,7} {
\draw (cn3) -- (vn\n);
}
\end{tikzpicture}

\caption{The \ac{pcm} and corresponding Tanner graph of the
[7,4,3]-Hamming code.}
\label{PCM and Tanner graph of the Hamming code}
\end{figure}

%
% N_V(j), N_C(i)
%

Mathematically, we represent a \ac{vn} using the index $i \in
\mathcal{I} := \left[ 1 : n \right]$ and a \ac{cn} using the index
$j \in \mathcal{J} := \left[ 1 : m \right]$.
We can then encode the information contained in the graph by defining
the neighborhood of a variable node $i$ as
$\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : \bm{H}_{j,i}
= 1 \right\}$
and that of a check node $j$ as
$\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : \bm{H}_{j,i}
= 1 \right\}$.
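These neighborhoods are simply the supports of the columns and rows of $\bm{H}$. A minimal sketch (1-based indices chosen to match the notation in the text; `N_V`/`N_C` are our own names), using the Hamming-code \ac{pcm} from above:

```python
import numpy as np

# PCM of the [7,4,3]-Hamming code used above.
H = np.array([[0, 1, 1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])
m, n = H.shape

# Neighborhood of VN i: the checks it participates in (support of column i).
N_V = {i: {j for j in range(1, m + 1) if H[j - 1, i - 1]}
       for i in range(1, n + 1)}
# Neighborhood of CN j: the bits in its parity check (support of row j).
N_C = {j: {i for i in range(1, n + 1) if H[j - 1, i - 1]}
       for j in range(1, m + 1)}

print(N_C[2])  # {1, 3, 4, 6}: the parity check x1 + x3 + x4 + x6 = 0
print(N_V[4])  # {1, 2, 3}: x4 participates in all three checks
```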

%
% Error floor and waterfall regions
%

We typically evaluate the performance of \ac{ldpc} codes using the
\ac{ber} or the \ac{fer} (a \textit{frame} refers to one whole
transmitted block in this context).
Considering an \ac{awgn} channel, \autoref{fig:ldpc-perf} shows a
qualitative performance characteristic of an \ac{ldpc} code
\cite[Fig.~1]{costello_spatially_2014}. We talk of the
\textit{waterfall} and the \textit{error floor} regions.

\begin{figure}[t]
\centering

\begin{tikzpicture}
\begin{axis}[
width=12cm,
height=9cm,
xlabel={Signal-to-noise ratio},
ylabel={Error rate},
% xmin=0, xmax=6,
enlarge x limits=false,
ymin=1e-9, ymax=1,
ticks=none,
% y tick label={},
ymode=log,
grid=both,
grid style={line width=0.2pt, draw=gray!30},
major grid style={line width=0.4pt, draw=gray!50},
legend pos=north east,
legend cell align={left},
]

\addplot+[mark=none, solid, smooth, KITblue] coordinates {
(4.5789E-01, 1.1821E-01)
(6.6842E-01, 9.4575E-02)
(8.6316E-01, 5.2657E-02)
(1.0421E+00, 2.2183E-02)
(1.1789E+00, 8.3588E-03)
(1.3368E+00, 1.4835E-03)
(1.4895E+00, 1.6852E-04)
(1.5842E+00, 2.8285E-05)
(1.6737E+00, 4.2465E-06)
(1.7684E+00, 3.4519E-07)
(1.8316E+00, 3.9213E-08)
(1.8684E+00, 6.2247E-09)
(1.9053E+00, 1E-09)
};
\addlegendentry{Regular}

\addplot+[mark=none, solid, smooth, KITorange] coordinates {
(4.5789E-01, 1.1821E-01)
(6.4211E-01, 4.9800E-02)
(7.5263E-01, 1.2700E-02)
(8.1579E-01, 2.3177E-03)
(8.6842E-01, 3.5779E-04)
(9.1053E-01, 5.3716E-05)
(9.4737E-01, 4.8818E-06)
(9.8947E-01, 6.5555E-07)
(1.0421E+00, 9.5713E-08)
% (1.0684E+00, 2.9670E-08)
(1.1474E+00, 1.2499E-08)
(1.3000E+00, 7.1560E-09)
(1.4579E+00, 6.0535E-09)
% (1.6105E+00, 5E-09)
(1.9579E+00, 4E-09)
(2.2947E+00, 3.1876E-09)
% (2.8842E+00, 2.0403E-09)
};
\addlegendentry{Irregular}

\draw[gray, densely dashed]
(axis cs:0.65, 2e-3) rectangle (axis cs:1.65, 5e-5);
\node[below] at (axis cs:1.15, 6e-5) {Waterfall};

\draw[gray, densely dashed]
(axis cs:1, 6e-8) rectangle (axis cs:2, 2e-9);
\node[above] at (axis cs:1.5, 7e-8) {Error floor};
\end{axis}
\end{tikzpicture}

\caption{
Qualitative performance characteristic of an \ac{ldpc} code
in an \ac{awgn} channel. Adapted from
\cite[Fig.~1]{costello_spatially_2014}.
}
\label{fig:ldpc-perf}
\end{figure}

Broadly, there are two kinds of \ac{ldpc} codes, \textit{regular} and
\textit{irregular}.
Regular codes are characterized by the fact that the weights, i.e.,
the numbers of ones, of their rows and columns are constant
\cite[Sec.~5.1.1]{ryan_channel_2009}.
Already at their introduction, regular \ac{ldpc} codes were shown to have
a minimum distance that scales linearly with the block length $n$ for
large $n$ \cite[Ch.~2,~Theorem~1]{gallager_low_1960},
which means they do not exhibit an error floor under \ac{ml} decoding.
Irregular codes, on the other hand, generally do exhibit an error floor,
their redeeming quality being the ability to reach near-capacity
performance in the waterfall region \cite[Intro.]{costello_spatially_2014}.

\subsection{Spatially-Coupled LDPC Codes}

A relatively recent development in the world of \ac{ldpc} codes is
that of \ac{sc}-\ac{ldpc} codes.
Their key feature is that they combine the best properties of regular
and irregular codes:
they have a minimum distance that grows linearly with $n$, promising
good error floor behavior, and capacity-approaching
iterative decoding behavior, promising good performance in the
waterfall region \cite[Intro.]{costello_spatially_2014}.

The essential property of \ac{sc}-\ac{ldpc} codes is that codewords
from different \textit{spatial positions}, which would ordinarily be sent
one after the other independently, are coupled.
This is achieved by connecting some \acp{vn} of one spatial position to
\acp{cn} of another, resulting in a \ac{pcm} of the form
\cite[Eq.~1]{hassan_fully_2016}
%
\begin{align*}
\bm{H} =
\begin{pmatrix}
\bm{H}_0(1) & & \\
\vdots & \ddots & \\
\bm{H}_K(1) & & \bm{H}_0(L) \\
& \ddots & \\
& & \bm{H}_K(L) \\
\end{pmatrix}
,
\end{align*}
%
where $K \in \mathbb{N}$ is the \textit{coupling width} and $L \in
\mathbb{N}$ is the number of spatial positions.
This construction results in a Tanner graph as depicted in
\autoref{fig:sc-ldpc-tanner}.
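The banded structure of this \ac{pcm} can be assembled programmatically. The sketch below assumes, for simplicity, time-invariant component matrices, i.e., $\bm{H}_k(t) = \bm{H}_k$ for all spatial positions; the function name `couple` and the toy components are our own illustrative choices:

```python
import numpy as np

def couple(H_components, L):
    """Assemble an SC-LDPC PCM from component matrices H_0, ..., H_K.

    H_components[k] is H_k; the components are assumed identical at
    every spatial position (time-invariant coupling).
    """
    K = len(H_components) - 1
    b_r, b_c = H_components[0].shape
    H = np.zeros(((L + K) * b_r, L * b_c), dtype=int)
    for pos in range(L):                 # spatial position (column block)
        for k, H_k in enumerate(H_components):
            H[(pos + k) * b_r:(pos + k + 1) * b_r,
              pos * b_c:(pos + 1) * b_c] = H_k
    return H

# Toy components with coupling width K = 1.
H0 = np.array([[1, 1]])
H1 = np.array([[1, 0]])
H_sc = couple([H0, H1], L=3)
print(H_sc)  # a diagonal band of H0/H1 blocks, as in the matrix above
```

Note the $(L+K) \cdot b_r$ rows: the first and last block rows contain fewer component blocks, which produces the lower-degree \acp{cn} at the boundaries discussed below.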

\begin{figure}[t]
\centering

\tikzset{
VN/.style={
circle, fill=KITgreen, minimum width=1mm, minimum height=1mm,
},
CN/.style={
rectangle, fill=KITblue, minimum width=1mm, minimum height=1mm,
},
}

\begin{tikzpicture}[node distance=7mm and 1cm]
\node[VN] (vn00) {};
\node[VN, below = of vn00] (vn01) {};
\node[VN, below = of vn01] (vn02) {};
\node[VN, below = of vn02] (vn03) {};
\node[VN, below = of vn03] (vn04) {};

\coordinate (temp) at ($(vn01)!0.5!(vn02)$);

\node[CN, right = of temp] (cn00) {};
\node[CN, below = of cn00] (cn01) {};

\draw (vn00) -- (cn00);
\draw (vn01) -- (cn00);
\draw (vn03) -- (cn00);
\draw (vn01) -- (cn01);
\draw (vn02) -- (cn01);
\draw (vn04) -- (cn01);

\foreach \i in {1,2,3} {
\pgfmathtruncatemacro{\previ}{\i-1}
\node[VN, right = 25mm of vn\previ 0] (vn\i0) {};

\foreach \j in {1,...,4} {
\pgfmathtruncatemacro{\prevj}{\j-1}
\node[VN, below = of vn\i\prevj] (vn\i\j) {};
}

\coordinate (temp) at ($(vn\i1)!0.5!(vn\i2)$);

\node[CN, right = of temp] (cn\i0) {};
\node[CN, below = of cn\i0] (cn\i1) {};

\draw (vn\i0) -- (cn\i0);
\draw (vn\i1) -- (cn\i0);
\draw (vn\i3) -- (cn\i0);
\draw (vn\i1) -- (cn\i1);
\draw (vn\i2) -- (cn\i1);
\draw (vn\i4) -- (cn\i1);
}

\node[right = 25mm of vn30] (vn40) {};
\node[below = of vn40] (vn41) {};
\node[below = of vn41] (vn42) {};
\node[below = of vn42] (vn43) {};
\node[below = of vn43] (vn44) {};

\coordinate (temp) at ($(vn41)!0.5!(vn42)$);

\node[right = of temp] (cn40) {};
\node[below = of cn40] (cn41) {};

\foreach \i in {0,1,2} {
\pgfmathtruncatemacro{\next}{\i+1}
\pgfmathtruncatemacro{\nextnext}{\i+2}

\draw (vn\i 3) to[bend right] (cn\next 1);
\draw (vn\i 1) to[bend left] (cn\nextnext 0);
}

\draw (vn33) to[bend right] (cn41);

\node at ($(cn40)!0.5!(cn41)$) {\dots};

\draw[decorate, decoration={brace, amplitude=10pt}]
([xshift=-5mm,yshift=2mm]vn00.north) --
([xshift=5mm,yshift=2mm]vn00.north -| cn20.north)
node[midway, above=4mm] {$K$};
\end{tikzpicture}

\caption{
Visualization of the coupling between the Tanner graphs
of individual spatial positions.
}
\label{fig:sc-ldpc-tanner}
\end{figure}

Note that at the first and last few spatial positions, some \acp{cn}
have lower degrees.
This leads to more reliable information about the corresponding
\acp{vn}, which, as we will see, is
passed on to subsequent spatial positions during decoding.
This is precisely the effect that leads to the good performance of
\ac{sc}-\ac{ldpc} codes in the waterfall region \cite{costello_spatially_2014}.

\subsection{Iterative Decoding}

% Introduction

\ac{ldpc} codes are generally decoded using efficient iterative
algorithms, something that is possible due to their sparsity
\cite[Sec.~5.3]{ryan_channel_2009}.
The algorithm originally proposed for this purpose alongside
\ac{ldpc} codes by Gallager in 1960 is now known as the \ac{spa}
\cite[Sec.~5.4.1]{ryan_channel_2009}, also called \ac{bp}.

The optimality criterion the \ac{spa} is built around is a
symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
The core idea of the resulting algorithm is to view \acp{cn} as
representing single-parity-check codes and \acp{vn} as representing
repetition codes.
The algorithm alternates between consolidating soft information about
the \acp{vn} in the \acp{cn}, and consolidating soft information about
the \acp{cn} in the \acp{vn}.
To this end, messages are passed back and forth along the edges of
the Tanner graph:
$L_{i\rightarrow j}$ represents a message passed from \ac{vn} $i$ to
\ac{cn} $j$, and $L_{i\leftarrow j}$ a message passed from
\ac{cn} $j$ to \ac{vn} $i$.
The \acp{vn} additionally receive messages \cite[Sec.~5.4.2]{ryan_channel_2009}
\begin{align*}
\tilde{L}_i = \log \frac{P(X=0 \vert Y=y)}{P(X=1 \vert Y=y)},
\end{align*}
computed from the channel outputs.
The consolidation of the information occurs in the \ac{vn} update
\begin{align*}
L_{i\rightarrow j} = \tilde{L}_i + \sum_{j'\in
\mathcal{N}_\text{V}(i)\setminus j} L_{i\leftarrow j'}
\end{align*}
and the \ac{cn} update
\begin{align*}
L_{i\leftarrow j} = 2\cdot \tanh^{-1} \left( \prod_{i'\in
\mathcal{N}_\text{C}(j)\setminus i} \tanh \frac{L_{i'\rightarrow j}}{2} \right)
.
\end{align*}
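The two update rules can be sketched compactly in Python. The following is an unoptimized, illustrative implementation on a made-up toy \ac{pcm}; the channel LLRs and the fixed iteration count are arbitrary choices for demonstration, not a production decoder (which would check the syndrome for early termination):

```python
import numpy as np

# Toy PCM (not a real LDPC code): checks x1+x2+x4 and x2+x3+x4.
H = np.array([[1, 1, 0, 1],
              [0, 1, 1, 1]])
m, n = H.shape

L_ch = np.array([2.0, -0.5, 1.5, 3.0])  # channel LLRs (tilde L_i), made up
L_v2c = np.tile(L_ch, (m, 1)) * H       # initial VN-to-CN messages
L_c2v = np.zeros((m, n))

for _ in range(10):
    # CN update: tanh rule over all neighboring VNs except the recipient.
    for j in range(m):
        for i in np.flatnonzero(H[j]):
            others = [i2 for i2 in np.flatnonzero(H[j]) if i2 != i]
            L_c2v[j, i] = 2 * np.arctanh(np.prod(np.tanh(L_v2c[j, others] / 2)))
    # VN update: channel LLR plus all incoming CN messages except the recipient.
    for i in range(n):
        for j in np.flatnonzero(H[:, i]):
            others = [j2 for j2 in np.flatnonzero(H[:, i]) if j2 != j]
            L_v2c[j, i] = L_ch[i] + L_c2v[others, i].sum()

# Hard decision on the total LLRs; the weak -0.5 bit is corrected to 0 here.
L_total = L_ch + (L_c2v * H).sum(axis=0)
x_hat = (L_total < 0).astype(int)
print(x_hat)  # [0 0 0 0], a valid codeword
```

Note the extrinsic principle in both loops: the message sent along an edge never includes the information that arrived along that same edge.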

A basic assumption for the derivation of the \ac{spa} is that the
messages are statistically independent.
If the Tanner graph has cycles, however, this
condition is not met.
The shorter the cycles, the sooner this condition is violated and the
worse the approximation becomes \cite[Sec.~5.4.4]{ryan_channel_2009}.
Cycles of length four (so-called \emph{$4$-cycles}) are the shortest
possible cycles and are thus especially problematic.

% Min-sum algorithm

A simplification of the \ac{spa} is the min-sum decoder. Here, the
\ac{cn} update is approximated as \cite[Sec.~5.5.1]{ryan_channel_2009}
\begin{align*}
L_{i \leftarrow j} = \prod_{i' \in \mathcal{N}_\text{C}(j)\setminus i}
\sign \left( L_{i' \rightarrow j} \right)
\cdot \min_{i' \in \mathcal{N}_\text{C}(j)\setminus i} \lvert
L_{i'\rightarrow j} \rvert
.
\end{align*}
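A quick numerical comparison of the two \ac{cn} update rules (the message values are arbitrary) illustrates the nature of the approximation:

```python
import numpy as np

def cn_spa(incoming):
    """Exact SPA CN update (tanh rule) for one extrinsic output."""
    return 2 * np.arctanh(np.prod(np.tanh(np.asarray(incoming) / 2)))

def cn_min_sum(incoming):
    """Min-sum approximation of the same update."""
    incoming = np.asarray(incoming, dtype=float)
    return np.prod(np.sign(incoming)) * np.min(np.abs(incoming))

# The other incoming messages on one edge; values chosen arbitrarily.
msgs = [2.0, -0.5, 1.5]
print(cn_spa(msgs), cn_min_sum(msgs))  # approx. -0.24 vs. -0.5
```

The min-sum output always has the same sign as the SPA output but overestimates its magnitude, which is why practical implementations often apply a normalization or offset correction.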

% Sliding-window decoding

For \ac{sc}-\ac{ldpc} codes, the iterative decoding process is wrapped in a
windowing step. This is done to reduce the latency and memory requirements, as
well as the overall computational complexity \cite{costello_spatially_2014}.
To this end, the Tanner graph is split into several overlapping windows.
During decoding, the messages that are passed along the edges of the
graph in the overlapping regions are kept in memory and used for the
decoding of subsequent blocks \cite[Sec.~III.~C.]{hassan_fully_2016}.

\section{Quantum Mechanics and Quantum Information Science}
\label{sec:Quantum Mechanics and Quantum Information Science}

Designing codes and decoders for \ac{qec} is generally performed on a
layer of abstraction far removed from the quantum mechanical
processes underlying the actual physics.
Nevertheless, a fundamental understanding of the related
quantum mechanical concepts is useful for grasping the unique constraints
of this field.
The purpose of this section is to convey these concepts to the reader.

%%%%%%%%%%%%%%%%
\subsection{Core Concepts and Notation}
\label{subsec:Notation}

% Wave functions

In quantum mechanics, the state of a particle is described by a
\emph{wave function} $\psi(x,t)$.
The connection between this function and the observable world
is Born's statistical interpretation:
$\lvert \psi (x,t) \rvert^2$ is the \ac{pdf} of finding a particle at
position $x$ and time $t$ \cite[Sec.~1.2]{griffiths_introduction_1995}.

% Dirac notation

Much of the related mathematics can be expressed very elegantly
using the language of linear algebra.
The so-called bra-ket or Dirac notation is especially appropriate,
having been proposed by Paul Dirac in 1939 for the express purpose
of simplifying quantum mechanical notation \cite{dirac_new_1939}.
Two new symbols are defined: \emph{bra}s $\bra{\cdot}$ and
\emph{ket}s $\ket{\cdot}$.
Kets denote ordinary vectors, while bras denote their Hermitian conjugates.
For example, two vectors specified by the labels $a$ and $b$,
respectively, are written as $\ket{a}$ and $\ket{b}$.
Their inner product is $\braket{a\vert b}$.
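In finite dimensions, this notation maps directly onto matrix algebra, which is how we will work with it numerically. A minimal sketch with made-up example vectors: a ket is a column vector, its bra the conjugate transpose, $\braket{a \vert b}$ a scalar, and $\ket{a}\bra{b}$ a matrix (an operator).

```python
import numpy as np

# Kets as column vectors; the matching bra is the conjugate transpose.
ket_a = np.array([[1], [1j]]) / np.sqrt(2)
ket_b = np.array([[1], [0]])

bra_a = ket_a.conj().T
inner = (bra_a @ ket_b)[0, 0]    # <a|b>, a scalar
outer = ket_a @ ket_b.conj().T   # |a><b|, a 2x2 matrix (an operator)
print(abs(inner))                # 1/sqrt(2) for these example vectors
```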

% Expressing wave functions using linear algebra

The connection we will make between quantum mechanics and linear
algebra is that we will model the state space of a system as a
\emph{function space}.
We will represent the state of a particle with wave function
$\psi(x,t)$ using the vector $\ket{\psi}$
\cite[Sec.~3.3]{griffiths_introduction_1995}.

% Operators

Another important notion is that of an \emph{operator}, a transformation
that takes a function as an input and returns another function as an
output \cite[Sec.~3.2.2]{griffiths_introduction_1995}.
Operators are useful for describing the relations between different
quantities relating to a particle.
An example is the differential operator $\frac{\partial}{\partial x}$.
Two operators $P_1$ and $P_2$ are said to \emph{commute} if $P_1P_2
= P_2P_1$ and to \emph{anti-commute} if $P_1P_2 = -P_2P_1$.

%%%%%%%%%%%%%%%%
\subsection{Observables}
\label{subsec:Observables}

% Observable quantities

An \emph{observable quantity} $Q(x,p,t)$ is a quantity of a quantum
mechanical system that we can measure, such as the position $x$ or
momentum $p$ of a particle.
In general, such measurements are not deterministic, i.e.,
measurements on identically prepared states can yield different results.
There are some states, however, that are \emph{determinate} for a
specific observable: measuring those will always yield identical
observations \cite[Sec.~3.3]{griffiths_introduction_1995}.

% General expression for expected value of observable quantity

If we know the wave function of a particle, we should be able to
compute the expected value $\braket{Q}$ of any observable quantity we wish.
It can be shown that for any $Q$, we can find a
corresponding operator $\hat{Q}$ such that
\cite[Sec.~3.3]{griffiths_introduction_1995}
\begin{align}
\label{eq:gen_expr_Q_exp}
\braket{Q} = \int_{-\infty}^{\infty} \psi^*(x,t) \hat{Q} \psi(x,t) dx
.%
\end{align}%
While the derivation of this relationship is beyond the scope of this
work, we can at least look at an example to illustrate it.
Considering the position $Q = x$ of a particle and setting the observable
operator to $\hat{Q} = x$, we can write
\cite[Sec.~1.5]{griffiths_introduction_1995}
\begin{align*}
\braket{x} = \int_{-\infty}^{\infty} \psi^*(x,t) \cdot x \cdot \psi(x,t) dx
= \int_{-\infty}^{\infty} x \lvert \psi(x,t) \rvert ^2 dx
.%
\end{align*}
Since $\lvert \psi(x,t) \rvert^2 $ is the \ac{pdf} of finding the
particle at a specific position, we immediately see that the
formula simplifies to the direct calculation of the expected value.

% Determinate states and eigenvalues

Let us now examine how the observable operator $\hat{Q}$ relates to
the determinate states of the observable quantity.
We begin by translating \autoref{eq:gen_expr_Q_exp} into linear algebra as
\cite[Eq.~3.114]{griffiths_introduction_1995}
\begin{align}
\label{eq:gen_expr_Q_exp_lin}
\braket{Q} = \braket{\psi \vert \hat{Q}\psi}
.%
\end{align}
\autoref{eq:gen_expr_Q_exp_lin} expresses an inherently probabilistic
relationship.
The determinate states, on the other hand, are inherently deterministic.
To relate the two, we note that since determinate states should
always yield the same measurement results, the variance of the
observable should be zero.
We thus compute \cite[Eq.~3.116]{griffiths_introduction_1995}
\begin{align}
0 &\overset{!}{=} \braket{(Q - \braket{Q})^2}
= \braket{e_n \vert (\hat{Q} - \braket{Q})^2 e_n} \nonumber\\
&= \braket{(\hat{Q} - \braket{Q})e_n \vert (\hat{Q} - \braket{Q})
e_n} \nonumber\\
&= \lVert (\hat{Q} - \braket{Q}) e_n \rVert^2 \nonumber\\[3mm]
&\hspace{-8mm}\Leftrightarrow (\hat{Q} - \braket{Q}) \ket{e_n} =
0 \nonumber\\
\label{eq:observable_eigenrelation}
&\hspace{-8mm}\Leftrightarrow \hat{Q}\ket{e_n}
= \underbrace{\braket{Q}}_{\lambda_n} \ket{e_n}
.%
\end{align}%
%
Because we have assumed the variance to be zero, the expected value
$\braket{Q}$ is now the deterministic measurement result
corresponding to the determinate state
$\ket{e_n},~n\in \mathbb{N}$.
We can see that the determinate states are the \emph{eigenstates} of
the observable operator $\hat{Q}$ and that the measurement values are
the corresponding \emph{eigenvalues} $\lambda_n$
\cite[Sec.~3.3]{griffiths_introduction_1995}.

% Determinate states as a basis

As we are modelling the wave function $\psi(x,t)$ as a vector
$\ket{\psi}$, we can find a set of basis vectors to decompose it into.
We can use the determinate states for this purpose, expressing the state as%
\footnote{
We are only considering the case of having a \emph{discrete
spectrum} here, i.e., having a discrete set of eigenvalues and eigenvectors.
For continuous spectra, the procedure is analogous.
}
\begin{align}
\label{eq:determinate_basis}
\ket{\psi} = \sum_{n=1}^{\infty} c_n \ket{e_n}, \hspace{3mm}
c_n := \braket{e_n \vert \psi}
.%
\end{align}
Inserting \autoref{eq:determinate_basis} into
\autoref{eq:gen_expr_Q_exp_lin}, we obtain
% tex-fmt: off
\cite[Prob.~3.35c)]{griffiths_introduction_1995}
% tex-fmt: on
\begin{align*}
\braket{Q} = \left( \sum_{n=1}^{\infty} c_n^* \bra{e_n} \right)
\left( \sum_{m=1}^{\infty} c_m\hat{Q}\ket{e_m} \right)
= \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} c_n^* c_m
\lambda_m\braket{e_n \vert e_m}
= \sum_{n=1}^{\infty} \lambda_n \lvert c_n \rvert ^2
.%
\end{align*}
We can thus interpret $\lvert c_n \rvert ^2$ as the probability of
obtaining value $\lambda_n$ from the measurement.
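These relations are easy to verify numerically in finite dimensions. In the sketch below, the Pauli-X matrix serves as an example Hermitian observable and the state is chosen arbitrarily; both are our own illustrative choices. The direct expectation $\braket{\psi \vert \hat{Q}\psi}$ agrees with the spectral form $\sum_n \lambda_n \lvert c_n \rvert^2$:

```python
import numpy as np

# The Pauli-X matrix, used here as an example Hermitian observable.
Q = np.array([[0, 1],
              [1, 0]], dtype=complex)
lam, E = np.linalg.eigh(Q)        # eigenvalues lambda_n, eigenvectors e_n

psi = np.array([1, 1j]) / np.sqrt(2)

# <Q> = <psi|Q psi>, computed directly ...
exp_direct = np.vdot(psi, Q @ psi).real

# ... and via the decomposition |psi> = sum_n c_n |e_n>.
c = E.conj().T @ psi              # c_n = <e_n|psi>
probs = np.abs(c) ** 2            # outcome probabilities |c_n|^2
exp_spectral = float(np.sum(lam * probs))

print(np.isclose(exp_direct, exp_spectral))  # True
```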
|
|
|
|
% Recap
|
|
|
|
To summarize, we mathematically model an observable quantity
|
|
$Q(x,t,p)$ using a corresponding operator $\hat{Q}$, which allows us
|
|
to compute the expected value as $\braket{Q} = \braket{\psi
|
|
\vert \hat{Q} \psi}$.
|
|
The eigenvectors of $\hat{Q}$ are the determinate states
|
|
$\ket{e_n},~n\in \mathbb{N}$ and the eigenvalues are the respective
|
|
measurement outcomes.
|
|
We can decompose an arbitrary state as $\ket{\psi} = \sum_{n=1}^{\infty} c_n
|
|
\ket{e_n}$, where $\lvert c_n \rvert ^2$ represents the probability
|
|
of obtaining a certain measurement value.
|
|
Note that when we speak of an \emph{observable}, we are usually
|
|
refering to the operator $\hat{Q}$.

%%%%%%%%%%%%%%%%
\subsection{Projective Measurements}
\label{subsec:Projective Measurements}

% Projective measurements

The measurements we considered in the previous section, for which
\autoref{eq:gen_expr_Q_exp_lin} holds, belong to the category of
\emph{projective measurements}.
For these, certain restrictions such as repeatability apply: the act
of measuring a quantum state should \emph{collapse} it onto one of
the determinate states.
Further measurements should then yield the same value.
More general methods of modelling measurements exist, e.g., describing
destructive measurements \cite[Box~2.5]{nielsen_quantum_2010}, but
they are not relevant to this work.

% Projection operators

We can model the collapse of the original state onto one of the
superimposed basis states as a \emph{projection}.
To see this, we use Equations \ref{eq:determinate_basis} and
\ref{eq:observable_eigenrelation} to compute
\begin{align*}
\hat{Q}\ket{\psi} = \sum_{n=1}^{\infty} c_n \hat{Q} \ket{e_n}
= \sum_{n=1}^{\infty} \lambda_n c_n \ket{e_n}
.%
\end{align*}%
We see that $\hat{Q}$ has the effect of multiplying the component
along each basis vector with the corresponding eigenvalue.
We decompose $\hat{Q}$ into its constituent parts that act on each of
the separate components as
\begin{align*}
\hat{Q} = \sum_{n=1}^{\infty} \lambda_n \hat{P}_n
\end{align*}
using \emph{projection operators} \cite[Eq.~3.160]{griffiths_introduction_1995}
\begin{align*}
\hat{P}_n := \ket{e_n}\bra{e_n}, \hspace{3mm} n\in \mathbb{N}
.
\end{align*}%
These project a vector onto the subspace spanned by $\ket{e_n}$.
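For a finite-dimensional observable, the spectral decomposition into projection operators can be checked numerically. A minimal NumPy sketch (the Hermitian $2\times 2$ matrix used here is an arbitrary illustrative example, not taken from the text):

```python
import numpy as np

# An arbitrary Hermitian "observable" on a 2-dimensional state space.
Q = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigendecomposition: columns of V are the determinate states |e_n>,
# lams holds the corresponding eigenvalues (measurement outcomes).
lams, V = np.linalg.eigh(Q)

# Rebuild Q from its projection operators P_n = |e_n><e_n|.
Q_rebuilt = sum(lam * np.outer(V[:, n], V[:, n].conj())
                for n, lam in enumerate(lams))
assert np.allclose(Q, Q_rebuilt)

# Each projector is idempotent: P_n^2 = P_n.
P0 = np.outer(V[:, 0], V[:, 0].conj())
assert np.allclose(P0 @ P0, P0)
```

The idempotence check mirrors the commented-out remark below: applying a projector twice has the same effect as applying it once.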

% % Using projection operators to measure if a state has a component
% % along a basis vector
%
% A particularly interesting property of projection operators is that
% \begin{align*}
% \hat{P}_n (\hat{P}_n \ket{\psi}) = \hat{P}_n^2 \ket{\psi}
% = \hat{P}_n \ket{\psi},
% \end{align*}%
% and the only way this can hold for any $\ket{\psi}$ is if $\hat{P}_n$
% only has the eigenvalues $0$ or $1$
% % tex-fmt: off
% \cite[Prob.~3.57a)]{griffiths_introduction_1995}.
% % tex-fmt: on
% The eigenvalues can again be interpreted as possible measurement results.
% We can thus use $\hat{P}$ as an observable and treat
% the eigenvalue as an indicator of the state having a component along
% the related basis vector.

%%%%%%%%%%%%%%%%
\subsection{Qubits and Multi-Qubit States}
\label{subsec:Qubits and Multi-Qubit States}

% The qubit

% TODO: Make sure `quantum gate` is proper terminology
A central concept for quantum computing is that of the \emph{qubit}.
We employ it analogously to the classical \emph{bit}.
For classical computers, we alter bits' states using \emph{gates}.
We can chain multiple of these gates together to build up more complex logic,
such as half-adders or eventually a full processor.
In principle, quantum computers work in a similar fashion, only that
instead of bits we use qubits and instead of, e.g., AND, OR, and XOR
operations we use \emph{quantum gates} \cite[Sec.~1.3]{nielsen_quantum_2010}.
We define a qubit to be a component with determinate
states $\ket{0}$ and $\ket{1}$.
The general description of the state $\ket{\psi}$ of a qubit is thus
\begin{align}
\label{eq:gen_qubit_state}
\ket{\psi} = \alpha\ket{0} + \beta\ket{1}, \hspace{5mm} \alpha,
\beta \in \mathbb{C}
.%
\end{align}

% The tensor product and multi-qubit states

The overall state of a composite quantum system is described using
the \emph{tensor product}, denoted as $\otimes$
\cite[Sec.~2.2.8]{nielsen_quantum_2010}.
Take for example the two qubits
\begin{align*}
\ket{\psi_1} = \alpha_1 \ket{0} + \beta_1 \ket{1},\hspace*{10mm}
\ket{\psi_2} = \alpha_2 \ket{0} + \beta_2 \ket{1}
.%
\end{align*}
% TODO: Fix the fact that \psi is used above for the single-qubit
% case and below for the multi-qubit case
We examine the state $\ket{\psi}$ of the composite system.
Assuming the qubits are independent, this is a \emph{product state}
$\ket{\psi} = \ket{\psi_1}\otimes\ket{\psi_2}$.
When not ambiguous, we may omit the tensor product symbol or even write
the entire product state as a single ket
\cite[Sec.~6.2]{griffiths_consistent_2001}.
We have
\begin{align}
\label{eq:product_state}
\begin{split}
\ket{\psi} = \ket{\psi_1} \ket{\psi_2}
&= \left( \alpha_1 \ket{0} + \beta_1 \ket{1} \right)
\left( \alpha_2 \ket{0} + \beta_2 \ket{1} \right) \\
&= \alpha_1\alpha_2\ket{00}
+ \alpha_1\beta_2\ket{01}
+ \beta_1\alpha_2\ket{10}
+ \beta_1\beta_2\ket{11}
.%
\end{split}
\end{align}
We call $\ket{x_1, \ldots, x_n},~x_i \in \{0,1\}$, the
\emph{computational basis states} \cite[Sec.~4.6]{nielsen_quantum_2010}.
To simplify notation for repeated tensor products, we define
\begin{align*}
\mathcal{M}^{\otimes n} := \underbrace{\mathcal{M}\otimes \ldots
\otimes \mathcal{M}}_{n \text{ times}}
.%
\end{align*}
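The tensor product of state vectors corresponds to the Kronecker product of their amplitude vectors. A small NumPy sketch verifying the product-state expansion above (the amplitudes are arbitrary normalized examples):

```python
import numpy as np

# Single-qubit states in the {|0>, |1>} basis (amplitudes chosen arbitrarily).
psi1 = np.array([3/5, 4/5])                     # alpha_1|0> + beta_1|1>
psi2 = np.array([1/np.sqrt(2), 1/np.sqrt(2)])   # alpha_2|0> + beta_2|1>

# The composite product state |psi1> (x) |psi2> via the Kronecker product;
# entries are ordered as (|00>, |01>, |10>, |11>).
psi = np.kron(psi1, psi2)

# Matches the coefficients alpha1*alpha2, alpha1*beta2, beta1*alpha2, beta1*beta2.
expected = np.array([psi1[0]*psi2[0], psi1[0]*psi2[1],
                     psi1[1]*psi2[0], psi1[1]*psi2[1]])
assert np.allclose(psi, expected)

# Product states of normalized qubits remain normalized.
assert np.isclose(np.linalg.norm(psi), 1.0)
```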

% Entanglement

States that cannot be decomposed into such products
are called \emph{entangled} \cite[Sec.~2.2.8]{nielsen_quantum_2010}.
An example of such states are the \emph{Bell states}
\begin{align*}
\begin{split}
\ket{\psi_{00}} &= \frac{\ket{00} + \ket{11}}{\sqrt{2}} \hspace{15mm}
\ket{\psi_{01}} = \frac{\ket{01} + \ket{10}}{\sqrt{2}} \\
\ket{\psi_{10}} &= \frac{\ket{00} - \ket{11}}{\sqrt{2}} \hspace{15mm}
\ket{\psi_{11}} = \frac{\ket{01} - \ket{10}}{\sqrt{2}}
\end{split}
\hspace{4mm}.%
\end{align*}
Quantum entanglement plays a major role in the way information
is encoded on quantum systems compared to classical ones.
Instead of employing only the individual qubit states, the
information is stored in the correlations between the qubits
\cite[Sec.~2]{preskill_quantum_2018}.
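Whether a two-qubit state factors into a product can be tested numerically. One simple criterion (the Schmidt-rank test, not introduced in the text above) is to reshape the amplitude vector into a $2\times 2$ matrix: product states give rank 1, entangled states give rank 2. A sketch under that assumption:

```python
import numpy as np

def is_product_state(psi):
    # Reshape the 4-entry amplitude vector (|00>,|01>,|10>,|11>) into a
    # 2x2 matrix; a product state has numerical rank 1 (Schmidt rank 1),
    # while rank 2 indicates entanglement.
    return np.linalg.matrix_rank(psi.reshape(2, 2), tol=1e-10) == 1

# Bell state (|00> + |11>)/sqrt(2): entangled.
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)

# A genuine product state |0> (x) (|0> + |1>)/sqrt(2): separable.
product = np.kron([1, 0], [1/np.sqrt(2), 1/np.sqrt(2)])

assert not is_product_state(bell)
assert is_product_state(product)
```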

% The size of the vector space

As we can see in \autoref{eq:product_state}, the number of
computational basis states needed to express the full composite state
is $2^n$.
This is in contrast to classical systems, where the dimensionality of
the state space only grows linearly with $n$.
This exponential growth of the state space is what makes it difficult
to simulate quantum systems on classical hardware.
It is also what motivated the research into performing computations
using quantum hardware in the first place
\cite[Sec.~3]{feynman_simulating_1982}.

% Basic types of gates

After examining the modelling of single- and multi-qubit systems,
we now shift our focus to describing the evolution of their states.
We model state changes as operators.
Unlike classical systems, where there are only two possible states and
thus the only possible state change is a bit-flip, a general qubit
state as shown in \autoref{eq:gen_qubit_state} lives on a continuum of values.
We thus technically also have an infinite number of possible state changes.
Luckily, we can express any operator as a linear combination of the
\emph{Pauli operators} \cite[Sec.~2.2]{gottesman_stabilizer_1997}
\cite[Sec.~2.2]{roffe_quantum_2019}
\begin{align*}
\begin{array}{c}
I\text{ Operator} \\
\hline\\
\ket{0} \mapsto \ket{0} \\
\ket{1} \mapsto \ket{1}
\end{array}%
\hspace{10mm}%
\begin{array}{c}
X\text{ Operator} \\
\hline\\
\ket{0} \mapsto \ket{1} \\
\ket{1} \mapsto \ket{0}
\end{array}%
\hspace{10mm}%
\begin{array}{c}
Z\text{ Operator} \\
\hline\\
\ket{0} \mapsto \ket{0} \\
\ket{1} \mapsto -\ket{1}
\end{array}%
\hspace{10mm}%
\begin{array}{c}
Y\text{ Operator} \\
\hline\\
\ket{0} \mapsto j\ket{1} \\
\hspace{2.75mm}\ket{1} \mapsto -j\ket{0} \hspace*{1mm}.
\end{array}
\end{align*}
In fact, if we allow for complex coefficients, products and linear
combinations of the $X$ and $Z$ operators are sufficient to express
any other operator \cite[Sec.~2.2]{roffe_quantum_2019}.
$I$ is the identity operator, and $X$ and $Z$ are referred to as
\emph{bit-flips} and \emph{phase-flips}, respectively.
We call the set $\mathcal{G}_n = \left\{ \pm I,\pm jI, \pm X,\pm jX,
\pm Y,\pm jY, \pm Z, \pm jZ \right\}^{\otimes n}$ the \emph{Pauli
group} over $n$ qubits.
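The defining mappings of the Pauli operators, the relation $Y = jXZ$, and the decomposition of an arbitrary operator over the Pauli basis can all be verified numerically. A NumPy sketch (the test matrix $M$ is an arbitrary example):

```python
import numpy as np

# Pauli matrices in the {|0>, |1>} basis (1j is the imaginary unit j).
I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
Y = np.array([[0, -1j], [1j, 0]])

ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

# X is a bit-flip, Z a phase-flip.
assert np.allclose(X @ ket0, ket1) and np.allclose(X @ ket1, ket0)
assert np.allclose(Z @ ket0, ket0) and np.allclose(Z @ ket1, -ket1)

# Y = jXZ, so products of X and Z with complex coefficients generate Y.
assert np.allclose(Y, 1j * X @ Z)

# Any 2x2 operator decomposes over {I, X, Y, Z}; coefficients via the trace.
M = np.array([[1, 2], [3, 4]], dtype=complex)
coeffs = [np.trace(P.conj().T @ M) / 2 for P in (I, X, Y, Z)]
M_rebuilt = sum(c * P for c, P in zip(coeffs, (I, X, Y, Z)))
assert np.allclose(M, M_rebuilt)
```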

In the context of modifying qubit states, we also call operators \emph{gates}.
When working with multi-qubit systems, we can also apply Pauli gates
to individual qubits independently, which we write as, e.g., $I_1 X_2
I_3 Z_4 Y_5$.
We often omit the identity operators, instead writing, e.g., $X_2 Z_4 Y_5$.
Other important operators include the \emph{Hadamard} and
\emph{controlled-NOT (CNOT)} gates \cite[Sec.~1.3]{nielsen_quantum_2010}
\vspace*{-7mm}
\begin{figure}[H]
\centering
\begin{minipage}[t]{0.4\textwidth}
\centering
\begin{align*}
\begin{array}{c}
H\text{ Operator} \\
\hline\\
\ket{0} \mapsto \frac{1}{\sqrt{2}} \left( \ket{0} +
\ket{1} \right) \\[2mm]
\ket{1} \mapsto \frac{1}{\sqrt{2}} \left( \ket{0} -
\ket{1} \right)
\end{array}
\end{align*}
\end{minipage}%
\begin{minipage}[t]{0.4\textwidth}
\centering
\begin{align*}
\begin{array}{c}
CNOT\text{ Operator} \\
\hline\\
\ket{00} \mapsto \ket{00} \\
\ket{01} \mapsto \ket{01} \\
\ket{10} \mapsto \ket{11} \\
\hspace{2.75mm}\ket{11} \mapsto \ket{10} \hspace*{1mm}.
\end{array}
\end{align*}
\end{minipage}%
\end{figure}
\vspace{-4mm}
\noindent Many more operators relevant to quantum computing exist, but they are
not covered here as they are not central to this work.
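The Hadamard and CNOT mappings can be combined to prepare an entangled state from a product state, which ties these gates to the Bell states introduced earlier. A NumPy sketch:

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard
I = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],                 # basis order |00>,|01>,|10>,|11>
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

# Start from |00>, apply H to the first qubit, then CNOT.
ket00 = np.kron([1, 0], [1, 0])
state = CNOT @ np.kron(H, I) @ ket00

# The result is the Bell state (|00> + |11>)/sqrt(2).
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
assert np.allclose(state, bell)
```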

%%%%%%%%%%%%%%%%
\subsection{Quantum Circuits}
\label{Quantum Circuits}

% Intro

Using these quantum gates, we can construct \emph{circuits} to manipulate
the states of qubits \cite[Sec.~1.3.4]{nielsen_quantum_2010}.
Circuits are read from left to right, and each horizontal wire
represents a qubit whose state evolves as it passes through
successive gates.

% General notation

A single line carries a quantum state, while a double line
denotes a classical bit, typically used to carry the result of a measurement.
A measurement is represented by a meter symbol.
In general, gates are represented as labeled boxes placed on one or more wires.
An exception is the CNOT gate, whose operation is represented by
the symbol $\oplus$.

% Controlled gates & example

We can additionally add a control input to a gate.
This conditions its application on the state of another qubit
\cite[Sec.~4.3]{nielsen_quantum_2010}.
The control connection is represented by a vertical line connecting
the gate to the corresponding qubit, where a filled dot is placed.
A controlled gate applies the respective operation only if the
control qubit is in state $\ket{1}$.
An example of this is the CNOT gate introduced in
\autoref{subsec:Qubits and Multi-Qubit States}, which is depicted in
\autoref{fig:cnot_circuit}.

\begin{figure}[t]
\centering

\begin{quantikz}
\lstick{$\ket{\psi}_1$} & \ctrl{1} & \\
\lstick{$\ket{\psi}_2$} & \targ{} & \\
\end{quantikz}

\caption{CNOT gate circuit.}
\label{fig:cnot_circuit}
\end{figure}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Quantum Error Correction}
\label{sec:Quantum Error Correction}

% TODO: Use this for the introduction as well

% General motivation behind QEC

One of the major barriers on the road to building a functioning
quantum computer is the inevitability of errors during quantum
computation due to the difficulty in sufficiently isolating the
qubits from external noise \cite[Intro.]{roffe_quantum_2019}.
This isolation is critical for quantum systems, as the constant interactions
with the environment act as small measurements, leading to the
eventual \emph{decoherence} of the quantum state
\cite[Intro.]{gottesman_stabilizer_1997}.
\ac{qec} is one approach to dealing with this problem, protecting
the quantum state in a similar fashion to information in classical error
correction.

% The unique challenges of QEC

The problem setting of \ac{qec} differs slightly from the classical case, as
three main restrictions apply \cite[Sec.~2.4]{roffe_quantum_2019}:
\begin{itemize}
\item The no-cloning theorem states that it is
impossible to exactly copy the state of one qubit into another.
\item Qubits are susceptible to more types of errors than
just bit-flips, as we saw in
\autoref{subsec:Qubits and Multi-Qubit States}.
\item Directly measuring the state of a qubit collapses it onto
one of the determinate states, thereby potentially destroying
information.
\end{itemize}

% General idea (logical vs. physical gates) + notation

Much like in classical error correction, in \ac{qec} information
is protected by mapping it onto codewords in a higher-dimensional space,
thereby introducing redundancy.
To this end, $k \in \mathbb{N}$ \emph{logical qubits} are mapped onto
$n \in \mathbb{N}$ \emph{physical qubits}, $n>k$.
We circumvent the no-cloning restriction by not copying the state of any of
the $k$ logical qubits, instead spreading the total state out over all $n$
physical ones \cite[Intro.]{calderbank_good_1996}.
To differentiate quantum codes from classical ones, we denote a
code with parameters $k,n$ and minimum distance $d_\text{min}$ using
double brackets, as $\llbracket n,k,d_\text{min} \rrbracket$
\cite[Sec.~4]{roffe_quantum_2019}.

%%%%%%%%%%%%%%%%
\subsection{Stabilizer Measurements}
\label{subsec:Stabilizer Measurements}

% Setting the stage

Before we move on to the description of entire codes, we introduce
the notion of the \emph{stabilizer measurement}.
Consider the two-qubit repetition code
\cite[Sec.~2.4]{roffe_quantum_2019}, where we map
\begin{align*}
\ket{\psi} = \alpha \ket{0} + \beta \ket{1}
\hspace*{3mm} \mapsto \hspace*{3mm}
\ket{\psi}_\text{L}
= \alpha \underbrace{\ket{00}}_{=:\ket{0}_\text{L}} + \beta
\underbrace{\ket{11}}_{=:\ket{1}_\text{L}}
.%
\end{align*}
We call $\ket{\psi}_\text{L}$ the logical state.
We define the \emph{codespace} as $\mathcal{C} := \text{span}\mleft\{
\ket{00}, \ket{11} \mright\}$ and the \emph{error subspace} as
$\mathcal{F} := \text{span} \mleft\{\ket{01}, \ket{10} \mright\}$.
Note that this code is only able to detect single $X$-type errors.

% Measuring stabilizers

To determine if an error occurred, we want to measure
whether a state belongs
% TODO: Remove footnote?
% \footnote{
% It is possible for a state to not completely lie in either subspace.
% In this case, we can interpret it as being in
% $\mathcal{C}$ or $\mathcal{F}$ with a certain probability.
% }
to $\mathcal{C}$ or $\mathcal{F}$.
As explained in \autoref{subsec:Observables}, physical measurements
can be mathematically described using operators whose eigenvalues
are the possible measurement results.
Here, we need an operator with two eigenvalues, whose corresponding
eigenspaces are $\mathcal{C}$ and $\mathcal{F}$, respectively.
For the two-qubit code, $Z_1Z_2$ is such an operator:
\begin{align}
Z_1Z_2 E \ket{\psi}_\text{L} &= (+1) E \ket{\psi}_\text{L}
\hspace*{3mm} \forall
E \ket{\psi}_\text{L} \in \mathcal{C} \\
Z_1Z_2 E \ket{\psi}_\text{L} &= (-1) E \ket{\psi}_\text{L}
\hspace*{3mm} \forall
E \ket{\psi}_\text{L} \in \mathcal{F}
.%
\end{align}
$E \in \left\{ X,I \right\}$ is an operator describing a possible
error and $E \ket{\psi}_\text{L}$ is the resulting state after that error.
By measuring the corresponding eigenvalue, we can determine whether
$E\ket{\psi}_\text{L}$ lies in $\mathcal{C}$ or $\mathcal{F}$.
% TODO: If necessary, cite \cite[Sec.~3]{roffe_quantum_2019} for the
% non-compromising measurement of the information
To do this without directly observing (and thus potentially
collapsing) the logical state $\ket{\psi}_\text{L}$, we prepare an
ancilla qubit in state $\ket{0}_\text{A}$ and entangle it with
$\ket{\psi}_\text{L}$ in such a way that the eigenvalue is indicated
by measuring the ancilla instead.
More specifically, using a stabilizer measurement circuit as shown in
\autoref{fig:stabilizer_measurement}, we transform the state of the
three-qubit system as
\begin{align}
\label{eq:error_projection}
E\ket{\psi}_\text{L} \ket{0}_\text{A} \hspace*{3mm} \rightarrow
\hspace*{3mm}
\underbrace{\frac{1}{2} \mleft( I_1I_2 + Z_1Z_2 \mright)}_{=:
P_\mathcal{C}} E\ket{\psi}_\text{L}
\ket{0}_\text{A}
+ \underbrace{\frac{1}{2} \mleft( I_1I_2 - Z_1Z_2 \mright)}_{=:
P_\mathcal{F}}
E\ket{\psi}_\text{L} \ket{1}_\text{A}
.%
\end{align}
If $E \ket{\psi}_\text{L} \in \mathcal{C}$, the second term will
cancel and we will deterministically measure $\ket{0}_\text{A}$ for
the ancilla qubit. Similarly, if $E \ket{\psi}_\text{L} \in
\mathcal{F}$, we will deterministically measure $\ket{1}_\text{A}$.
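The eigenvalue relations for $Z_1Z_2$ and the projectors $P_\mathcal{C}$, $P_\mathcal{F}$ can be verified numerically for the two-qubit repetition code. A NumPy sketch (the amplitudes $\alpha = 0.6$, $\beta = 0.8$ are arbitrary):

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

Z1Z2 = np.kron(Z, Z)   # stabilizer of the two-qubit repetition code

# Logical state alpha|00> + beta|11> (basis order |00>,|01>,|10>,|11>).
alpha, beta = 0.6, 0.8
psi_L = alpha * np.array([1, 0, 0, 0]) + beta * np.array([0, 0, 0, 1])

# No error: eigenvalue +1, i.e. the state lies in the codespace C.
assert np.allclose(Z1Z2 @ psi_L, psi_L)

# Bit-flip on qubit 1: eigenvalue -1, i.e. the error subspace F.
err = np.kron(X, I2) @ psi_L
assert np.allclose(Z1Z2 @ err, -err)

# The two projectors partition the space: P_C + P_F = I.
P_C = (np.eye(4) + Z1Z2) / 2
P_F = (np.eye(4) - Z1Z2) / 2
assert np.allclose(P_C + P_F, np.eye(4))
assert np.allclose(P_C @ psi_L, psi_L) and np.allclose(P_F @ psi_L, 0)
```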

\begin{figure}[t]
\centering

\tikzset{
meter/.append style={
draw, rectangle,
font=\vphantom{A}, minimum width=8mm, minimum height=8mm,
path picture={
\draw[black]
([shift={(.1,.3)}]path picture bounding box.south west)
to[bend left=50]
([shift={(-.1,.3)}]path picture bounding box.south east);
\draw[black,-latex]
([shift={(0,.1)}]path picture bounding box.south)
-- ([shift={(.3,-.1)}]path picture bounding box.north);
}
}
}

\begin{tikzpicture}
\node[rectangle, minimum width=2cm, minimum height=3cm, draw]
(ZZ) {$Z_1Z_2$};

\coordinate (qi1) at (-3, 1);
\coordinate (qi2) at (-3, -1);
\coordinate (qi3) at (-3, -3);

\coordinate (qo1) at (4, 1);
\coordinate (qo2) at (4, -1);
\coordinate (qo3) at (4, -3);

\node[rectangle, minimum width=8mm, minimum height=8mm, draw]
(H1) at ($(qo3 -| ZZ) + (-2, 0)$) {H};
\node[rectangle, minimum width=8mm, minimum height=8mm, draw]
(H2) at ($(qo3 -| ZZ) + (2, 0)$) {H};
\node[circle, fill] (not) at (H1 -| ZZ) {};
\node[meter, right=5mm of H2] (mes) {};

\draw (qi1) -- (ZZ.west |- qi1);
\draw (qi2) -- (ZZ.west |- qi2);

\draw (qo1) -- (ZZ.east |- qo1);
\draw (qo2) -- (ZZ.east |- qo2);

\draw (qi3) -- (H1) -- (not) -- (H2) -- (mes);

\draw (not) -- (ZZ);

\coordinate (qo3u) at ($(qo3) + (0, .5mm)$);
\coordinate (qo3d) at ($(qo3) + (0, -.5mm)$);

\draw (mes.east |- qo3u) -- (qo3u);
\draw (mes.east |- qo3d) -- (qo3d);

\node[left] at (qi3) {$\ket{0}_\text{A}$};
\node[left] at ($(qi1)!.5!(qi2)$) {$E\ket{\psi}_\text{L}$};
\node[right] at ($(qo1)!.5!(qo2)$) {$E\ket{\psi}_\text{L}$};

\end{tikzpicture}

\caption{Stabilizer measurement circuit for the two-qubit repetition code.}
\label{fig:stabilizer_measurement}
\end{figure}

% Digitization of errors

Note that it is possible for a vector $E\ket{\psi}$ to not completely
lie in either subspace.
In this case, we can interpret it as being in $\mathcal{C}$ or
$\mathcal{F}$ with a certain probability.
However, when we measure the stabilizer, we will find that the vector
lies in one or the other.
This is because the act of measuring the error partly collapses the
state, eliminating the uncertainty about the type of the error
\cite[Sec.~10.2]{nielsen_quantum_2010}.
This can be seen in \autoref{eq:error_projection}, as the expressions
$P_\mathcal{C}$ and $P_\mathcal{F}$ constitute projection operators onto
$\mathcal{C}$ and $\mathcal{F}$.
E.g., $P_\mathcal{C}$ will eliminate all components of $E
\ket{\psi}_\text{L}$ that lie in $\mathcal{F}$.
This process, together with the fact that any coherent error can be
decomposed into a linear combination of $X$ and $Z$ errors, means
that it is enough for a \ac{qec} code to be able to correct only $X$
and $Z$ errors.
This effect is referred to as error \emph{digitization}
\cite[Sec.~2.2]{roffe_quantum_2019}.

% The stabilizer group

Operators such as $Z_1Z_2$ above are called \emph{stabilizers}.
An operator $P_i \in \mathcal{G}_n$ is called a stabilizer of an
$\llbracket n, k, d_\text{min} \rrbracket$ code $\mathcal{C}$ if
\begin{itemize}
\item It stabilizes all logical states, i.e.,
$P_i\ket{\psi}_\text{L} = (+1)\ket{\psi}_\text{L} ~\forall~
\ket{\psi}_\text{L} \in \mathcal{C}$.
\item It commutes with all other stabilizers of the code. This
property is important to be able to measure the eigenvalue of
a stabilizer without disturbing the eigenvectors of the
others \cite[Sec.~1.2]{gottesman_stabilizer_1997}.
\end{itemize}
Formally, we define the \emph{stabilizer group} $\mathcal{S}$ as
\cite[Sec.~4.1]{roffe_quantum_2019}
\begin{align*}
\mathcal{S} = \left\{P_i \in \mathcal{G}_n ~:~ P_i \ket{\psi}_\text{L} =
(+1)\ket{\psi}_\text{L} ~\forall~ \ket{\psi}_\text{L} \in \mathcal{C}
~\text{ and }~ [P_i,P_j] = 0 ~\forall~ i,j\right\}
,%
\end{align*}
where $[P_i,P_j] := P_iP_j - P_j P_i$ is called the \emph{commutator}
of $P_i$ and $P_j$.
We care in particular about the commuting properties of stabilizers
with respect to possible errors.
The measurement circuit for an arbitrary stabilizer $P_i$ modifies
the state as \cite[Eq.~29]{roffe_quantum_2019}
\begin{align*}
E\ket{\psi}_\text{L}\ket{0}_\text{A}
\hspace{3mm}\mapsto\hspace{3mm}
\frac{1}{2} \left( I + P_i
\right)E\ket{\psi}_\text{L}\ket{0}_\text{A} + \frac{1}{2}
\left( I - P_i \right)E\ket{\psi}_\text{L} \ket{1}_\text{A}
.%
\end{align*}
If a given error $E$ anticommutes with $P_i$, we have
\begin{align*}
EP_i \ket{\psi}_\text{L} &= -P_i E \ket{\psi}_\text{L} \\
\Rightarrow E \ket{\psi}_\text{L} &= -P_i E \ket{\psi}_\text{L} \\
\Rightarrow \left( I + P_i \right)E\ket{\psi}_\text{L} &= 0
\end{align*}
and the measurement of the ancilla qubit deterministically yields 1.
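The anticommutation argument can be checked numerically for the two-qubit repetition code: the error $X_1$ anticommutes with the stabilizer $Z_1Z_2$, so the $\ket{0}_\text{A}$ branch of the measurement vanishes. A NumPy sketch (amplitudes arbitrary):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
I2 = np.eye(2)

P = np.kron(Z, Z)    # stabilizer Z1Z2
E = np.kron(X, I2)   # error X1

# E anticommutes with the stabilizer: EP = -PE.
assert np.allclose(E @ P, -P @ E)

# Hence (I + P) E |psi_L> = 0: the ancilla branch paired with |0>_A
# vanishes and the measurement deterministically yields 1.
psi_L = np.array([0.6, 0, 0, 0.8])   # alpha|00> + beta|11>
assert np.allclose((np.eye(4) + P) @ (E @ psi_L), 0)
```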

%%%%%%%%%%%%%%%%
\subsection{Stabilizer Codes}
\label{subsec:Stabilizer Codes}

% Structure of a stabilizer code

For classical binary linear block codes, we use $n-k$ parity-checks
to reduce the degrees of freedom introduced by the encoding operation.
Effectively, each parity-check defines a local code, splitting the
vector space in half, with only one half containing valid codewords.
The global code is the intersection of all local codes.
We can do the same in the quantum case.
Each split is represented using a stabilizer, whose eigenvalues signify
whether a candidate vector lies in the local codespace or local error subspace.
It is only a valid codeword if it lies in the codespace of all local codes.
We call codes constructed this way \emph{stabilizer codes}.

% Syndrome extraction circuitry

Similar to the classical case, we can use a syndrome vector to
describe which local codes are violated.
To obtain the syndrome, we simply measure the corresponding
operators, each using a circuit as explained in
\autoref{subsec:Stabilizer Measurements}.
A full \emph{syndrome extraction circuit} is depicted in \autoref{fig:sec}.

% TODO: Move this further up to the commutativity of operators?
\indent\red{[Fixing the error after finding it
\cite[Sec.~10.5.5]{nielsen_quantum_2010}]} \\
\indent\red{[Logical operators \cite[Sec.~4.2]{roffe_quantum_2019}]} \\
\indent\red{[Measuring logical operators yields the outcomes of
the encoded computations \cite[Sec.~2.6]{derks_designing_2025}]} \\
\indent\red{[X and Z measurements can be performed with only CNOT and
Hadamard gates \cite[Sec.~10.5.8]{nielsen_quantum_2010}]} \\
\indent\red{[(?) Stabilizer generators]} \\
\indent\red{[Parity-check matrix \cite[Sec.~10.5.1]{nielsen_quantum_2010}]}

\begin{figure}[t]
\centering

\tikzset{
meter/.append style={
draw, rectangle,
font=\vphantom{A}, minimum width=8mm, minimum height=8mm,
path picture={
\draw[black]
([shift={(.1,.3)}]path picture bounding box.south west)
to[bend left=50]
([shift={(-.1,.3)}]path picture bounding box.south east);
\draw[black,-latex]
([shift={(0,.1)}]path picture bounding box.south)
-- ([shift={(.3,-.1)}]path picture bounding box.north);
}
}
}

\red{Your advertisement could go here.}

\caption{
\red{Illustration of a general syndrome extraction circuit.
Adapted from \cite[Figure~4]{roffe_quantum_2019}.}
}
\label{fig:sec}
\end{figure}

%%%%%%%%%%%%%%%%
\subsection{Calderbank-Shor-Steane Codes}
\label{subsec:Calderbank-Shor-Steane Codes}

Stabilizer codes are especially practical to work with when they can
handle $X$- and $Z$-type errors independently.
We can then separate the stabilizer generators into some with only
$Z$ operators and some with only $X$ operators.

\indent\red{[Z-type operators for X-type errors and vice versa]} \\
\indent\red{[Construction from two binary linear codes
\cite[p.~452,469]{nielsen_quantum_2010}]}

\subsection{Quantum Low-Density Parity-Check Codes}

\noindent\red{[Constant overhead scaling]} \\
\noindent\red{[Scaling of minimum distance with code length]} \\
\noindent\red{[Bivariate Bicycle codes]} \\
\noindent\red{[Decoding QLDPC codes (syndrome-based BP)]} \\
\noindent\red{[Degeneracy -> BP+OSD, BPGD]} \\
\noindent\red{[``The task of decoding is therefore to infer, from a
measured syndrome, the most likely error coset rather than the exact
physical error.''
% tex-fmt: off
\cite[Sec.~II~B)]{koutsioumpas_colour_2025}]}%
% tex-fmt: on
\\

\red{
\textbf{General Notes:}
\begin{itemize}
\item Note that there are other codes than stabilizer codes
(and research and give some examples), but only
stabilizer codes are considered in this work
\item Degeneracy
\item The QEC decoding problem (considering degeneracy)
\cite[Sec.~2.3]{yao_belief_2024}
\end{itemize}
\textbf{Content:}
\begin{itemize}
\item Stabilizer codes
\begin{itemize}
\item syndrome extraction circuit
\item Stabilizer codes are effectively the QM
% TODO: Actually binary linear codes or
% just linear codes?
equivalent of binary linear codes (e.g.,
expressible via check matrix)
\item Similar to parity checks, quantum states can be
more conveniently described using stabilizers
rather than working with the states directly
\cite[Sec.~10.5.1]{nielsen_quantum_2010}
\end{itemize}
\item Color codes?
\item Surface codes?
\end{itemize}
}