\chapter{Fundamentals}
\label{ch:Fundamentals}

\Ac{qec} is a field of research combining ``classical''
communications engineering and quantum information science.
This chapter provides the relevant theoretical background on both of
these topics and subsequently introduces the fundamentals of \ac{qec}.

% TODO: Is an explanation of BP with guided decimation needed in this chapter?
% TODO: Is an explanation of OSD needed in this chapter?

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Classical Error Correction}
\label{sec:Classical Error Correction}

The core concept underpinning error-correcting codes is that
introducing a finite amount of redundancy into information before
its transmission can lead to a considerably reduced error rate.
Specifically, Shannon proved in 1948 that for any channel, a block
code can be found that achieves an arbitrarily small probability of
error at any communication rate up to the capacity of the channel
as the block length approaches infinity
\cite[Sec.~13]{shannon_mathematical_1948}.

In this section, we explore the concepts of ``classical'' (as in non-quantum)
error correction that are central to this work.
We start by looking at different ways of encoding information,
first considering binary linear block codes in general and then \ac{ldpc} and
\ac{sc}-\ac{ldpc} codes.
Finally, we pivot to the decoding process, specifically the \ac{bp}
algorithm.

% TODO: Use subsubsections?
\subsection{Binary Linear Block Codes}

%
% Codewords, n, k, rate
%

% TODO: Do I need a specific reference for the expanded Hilbert space thing?
One particularly important class of coding schemes is that of binary
linear block codes.
The information to be protected takes the form of a sequence of
binary symbols, which is split into separate blocks.
Each block is encoded, transmitted, and decoded separately.
The encoding step introduces redundancy by mapping input messages
$\bm{u} \in \mathbb{F}_2^k$ of length $k \in \mathbb{N}$ (called the
\textit{information length}) onto \textit{codewords} $\bm{x} \in
\mathbb{F}_2^n$ of length $n \in \mathbb{N}$ (called the
\textit{block length}) with $n > k$.
A measure of the amount of introduced redundancy is the \textit{code
rate} $R = k/n$.
We call the set of all codewords $\mathcal{C}$ the \textit{code}
\cite[Sec.~3.1.1]{ryan_channel_2009}.

%
% d_min and the [] Notation
%

During the encoding process, a mapping from $\mathbb{F}_2^k$
onto $\mathcal{C} \subset \mathbb{F}_2^n$ takes place.
The input messages are mapped onto an expanded vector space, where
they are ``further apart'', giving rise to the error-correcting
properties of the code.
This notion of the distance between two codewords $\bm{x}_1$ and
$\bm{x}_2$ can be expressed using the \textit{Hamming distance} $d(\bm{x}_1,
\bm{x}_2)$, which is defined as the number of positions in which they differ.
We define the \textit{minimum distance} of a code $\mathcal{C}$ as
%
\begin{align*}
d_\text{min} := \min \left\{ d(\bm{x}_1, \bm{x}_2) : \bm{x}_1,
\bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\}
.
\end{align*}
%
We can signify that a binary linear block code has information length
$k$, block length $n$ and minimum distance $d_\text{min}$ using the
notation $[n,k,d_\text{min}]$ \cite[Sec.~1.3]{macwilliams_theory_1977}.
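For intuition, the minimum distance of a small code can be checked by brute force. The following Python sketch (purely illustrative, assuming \texttt{numpy} is available) enumerates all vectors satisfying the parity checks of the [7,4,3] Hamming code that serves as an example later in this chapter; for a linear code, $d_\text{min}$ equals the minimum weight of a nonzero codeword.

```python
import itertools
import numpy as np

# Parity-check matrix of the [7,4,3] Hamming code (illustrative example)
H = np.array([
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
])

# A vector x is a codeword iff H x^T = 0 over F_2
codewords = []
for bits in itertools.product([0, 1], repeat=7):
    x = np.array(bits)
    if not (H @ x % 2).any():
        codewords.append(x)
assert len(codewords) == 2 ** 4  # |C| = 2^k

# For a linear code, d_min is the smallest weight of a nonzero codeword
d_min = min(int(c.sum()) for c in codewords if c.any())
print(d_min)  # -> 3
```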

%
% Parity checks, H, and the syndrome
%

A particularly elegant way of describing the subspace $\mathcal{C}$ of
$\mathbb{F}_2^n$ that the codewords make up is the notion of
\textit{parity checks}.
Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n
\rvert = 2^n$, we can introduce $n-k$ conditions to constrain the
additional degrees of freedom.
These conditions, called parity checks, take the form of equations
over $\mathbb{F}_2$, linking the individual positions of each codeword.
We can arrange the coefficients of these equations in a
\textit{parity-check matrix} (\acs{pcm}) $\bm{H} \in
\mathbb{F}_2^{(n-k) \times n}$ and equivalently define the code as
\cite[Sec.~3.1.1]{ryan_channel_2009}
%
\begin{align*}
\mathcal{C} = \left\{ \bm{x} \in \mathbb{F}_2^n :
\bm{H}\bm{x}^\text{T} = \bm{0} \right\}
.%
\end{align*}
Note that in general we may have linearly dependent parity checks,
prompting us to define the \ac{pcm} as $\bm{H} \in
\mathbb{F}_2^{m\times n}$ instead, where the number of parity checks
$m$ satisfies $m \ge n-k$.
%
The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes
which parity checks a candidate codeword $\bm{v} \in \mathbb{F}_2^n$ violates.
The representation using the \ac{pcm} has the benefit of providing a
description of the code whose memory complexity doesn't grow
exponentially with $n$, in contrast to keeping track of all codewords directly.
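A small numerical sketch of the syndrome (Python with \texttt{numpy}; the \ac{pcm} is that of the [7,4,3] Hamming code used as an example in this chapter): flipping a single bit of a codeword yields a syndrome equal to the corresponding column of $\bm{H}$.

```python
import numpy as np

# PCM of the [7,4,3] Hamming code (same example as the Tanner graph figure)
H = np.array([
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
])

x = np.array([1, 0, 0, 0, 0, 1, 1])  # a codeword: H x^T = 0 over F_2
assert not (H @ x % 2).any()

# A single bit flip in position 4 violates every check containing x_4
v = x.copy()
v[3] ^= 1
s = H @ v % 2                         # syndrome
print(s)  # -> [1 1 1], the 4th column of H
```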

%
% The decoding problem
%

Figure \ref{fig:Diagram of a transmission system} visualizes the
communication process \cite[Sec.~1.1]{ryan_channel_2009}.
An input message $\bm{u}\in \mathbb{F}_2^k$ is mapped onto a codeword $\bm{x}
\in \mathbb{F}_2^n$. This is passed on to a modulator, which
interacts with the physical channel.
A demodulator processes the channel output and forwards the result
$\bm{y} \in \mathbb{R}^n$ to a decoder.
Finally, the decoder is responsible for obtaining an estimate
$\hat{\bm{u}} \in \mathbb{F}_2^k$ of the original input message.
This is done by first finding an estimate $\hat{\bm{x}}$ of the sent
codeword and then undoing the encoding.
The decoding problem that we generally attempt to solve thus consists
in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$.
One approach is to use the \ac{ml} criterion \cite[Sec.
1.4]{ryan_channel_2009}
\begin{align*}
\hat{\bm{x}}_\text{ML} = \arg\max_{\bm{x} \in \mathcal{C}}
P(\bm{Y} = \bm{y} \vert \bm{X} = \bm{x})
.
\end{align*}
Finally, we differentiate between \textit{soft-decision} decoding, where
$\bm{y} \in \mathbb{R}^n$, and \textit{hard-decision} decoding, where
$\bm{y} \in \mathbb{F}_2^n$ \cite[Sec.~1.5.1.3]{ryan_channel_2009}.
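For hard-decision decoding over a binary symmetric channel with crossover probability below $1/2$, the \ac{ml} criterion reduces to picking the codeword closest to $\bm{y}$ in Hamming distance. A brute-force Python sketch (feasible only for tiny codes; the Hamming-code \ac{pcm} is again used for illustration):

```python
import itertools
import numpy as np

# PCM of the [7,4,3] Hamming code (illustrative example)
H = np.array([
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
])

# Enumerate the code (only practical for very small n)
C = [np.array(b) for b in itertools.product([0, 1], repeat=7)
     if not (H @ np.array(b) % 2).any()]

def ml_decode_bsc(y):
    """Hard-decision ML decoding over a BSC with crossover p < 1/2:
    ML is equivalent to minimum Hamming distance to y."""
    return min(C, key=lambda x: int((x != y).sum()))

x = np.array([1, 0, 0, 0, 0, 1, 1])  # transmitted codeword
y = x.copy()
y[2] ^= 1                             # channel flips one bit
decoded = ml_decode_bsc(y)
print(decoded)  # -> [1 0 0 0 0 1 1], the single error is corrected
```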

%
\begin{figure}[h]
\centering

\tikzset{
box/.style={
rectangle, draw=black, minimum width=17mm, minimum height=8mm,
},
}

\begin{tikzpicture}
[
node distance = 2mm and 7mm,
]
\node (in) {};
\node[box, right=of in] (enc) {Encoder};
\node[box, minimum width=25mm, right=of enc] (mod) {Modulator};
\node[box, below right=of mod] (cha) {Channel};
\node[box, minimum width=25mm, below left=of cha] (dem) {Demodulator};
\node[box, left=of dem] (dec) {Decoder};
\node[left=of dec] (out) {};

\draw[-{latex}] (in) -- (enc) node[midway, above] {$\bm{u}$};
\draw[-{latex}] (enc) -- (mod) node[midway, above] {$\bm{x}$};
\draw[-{latex}] (mod) -| (cha);
\draw[-{latex}] (cha) |- (dem);
\draw[-{latex}] (dem) -- (dec) node[midway, above] {$\bm{y}$};
\draw[-{latex}] (dec) -- (out) node[midway, above] {$\hat{\bm{u}}$};
\end{tikzpicture}

\caption{Overview of a transmission system.}
\label{fig:Diagram of a transmission system}
\end{figure}
%

%
% Hard vs. soft information
%
\subsection{Low-Density Parity-Check Codes}

%
% Core concept
%

Shannon's noisy-channel coding theorem is stated for codes whose block
length approaches infinity. This suggests that as the block length
becomes larger, the performance of the considered codes should
generally improve.
However, the size of the \ac{pcm}, and thus in general the decoding complexity,
of a linear block code grows quadratically with $n$.
This would quickly render decoding intractable as we increase the block length.
We can get around this problem by constructing $\bm{H}$ in such a
manner that the number of nonzero entries grows less than quadratically, e.g.,
only linearly.
This is exactly the motivation behind \ac{ldpc} codes \cite[Ch.
1]{gallager_low_1960}.

%
% Tanner Graph, VNs and CNs
%

\ac{ldpc} codes belong to a class sometimes referred to as ``modern codes''.
These differ from ``classical codes'' in their decoding algorithms:
Classical codes are usually decoded using one-step hard-decision decoding,
whereas modern codes are suitable for iterative soft-decision
decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms
in question are generally defined in terms of message passing on the
\textit{Tanner graph} of the code. The Tanner graph is a bipartite
graph that constitutes an alternative representation of the \ac{pcm}.
We define two types of nodes: \acp{vn}, corresponding to codeword
bits, and \acp{cn}, corresponding to individual parity checks.
We then construct the Tanner graph by connecting each \ac{cn} to
the \acp{vn} that make up the corresponding parity check
\cite[Sec.~5.1.2]{ryan_channel_2009}.
Figure \ref{PCM and Tanner graph of the Hamming code} shows this
construction for the [7,4,3]-Hamming code.
%
\begin{figure}[H]
\centering

\begin{align*}
\bm{H} =
\begin{pmatrix}
0 & 1 & 1 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1 \\
\end{pmatrix}
\end{align*}

\vspace*{2mm}

\tikzset{
VN/.style={
circle, fill=KITgreen, minimum width=1mm, minimum height=1mm,
},
CN/.style={
rectangle, fill=KITblue, minimum width=1mm, minimum height=1mm,
},
}

\begin{tikzpicture}
\node[VN, label=above:$x_1$] (vn1) {};
\node[VN, right=12mm of vn1, label=above:$x_2$] (vn2) {};
\node[VN, right=12mm of vn2, label=above:$x_3$] (vn3) {};
\node[VN, right=12mm of vn3, label=above:$x_4$] (vn4) {};
\node[VN, right=12mm of vn4, label=above:$x_5$] (vn5) {};
\node[VN, right=12mm of vn5, label=above:$x_6$] (vn6) {};
\node[VN, right=12mm of vn6, label=above:$x_7$] (vn7) {};

\node[
CN, below=25mm of vn4,
label={below:$x_1 + x_3 + x_4 + x_6 = 0$}
] (cn2) {};
\node[
CN, left=40mm of cn2,
label={below:$x_2 + x_3 + x_4 + x_5 = 0$}
] (cn1) {};
\node[
CN, right=40mm of cn2,
label={below:$x_1 + x_2 + x_4 + x_7 = 0$}
] (cn3) {};

\foreach \n in {2,3,4,5} {
\draw (cn1) -- (vn\n);
}

\foreach \n in {1,3,4,6} {
\draw (cn2) -- (vn\n);
}

\foreach \n in {1,2,4,7} {
\draw (cn3) -- (vn\n);
}
\end{tikzpicture}

\caption{The \ac{pcm} and corresponding Tanner graph of the
[7,4,3]-Hamming code.}
\label{PCM and Tanner graph of the Hamming code}
\end{figure}

%
% N_V(i), N_C(j)
%

Mathematically, we represent a \ac{vn} using the index $i \in
\mathcal{I} := \left[
1 : n \right]$ and a \ac{cn} using the index $j \in \mathcal{J}
:= \left[ 1 : m \right]$.
We can then encode the information contained in the graph by defining
the neighborhood of a variable node $i$ as
$\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : \bm{H}_{j,i}
= 1 \right\}$
and that of a check node $j$ as
$\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : \bm{H}_{j,i}
= 1 \right\}$.
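A minimal Python sketch of these neighborhood sets (1-based indices as in the text; the \ac{pcm} is that of the [7,4,3] Hamming code from the figure):

```python
import numpy as np

# PCM of the [7,4,3] Hamming code from the Tanner graph figure
H = np.array([
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
])

def N_V(i):
    """Check nodes adjacent to variable node i (1-based indices)."""
    return {j + 1 for j in np.flatnonzero(H[:, i - 1])}

def N_C(j):
    """Variable nodes adjacent to check node j (1-based indices)."""
    return {i + 1 for i in np.flatnonzero(H[j - 1])}

print(N_V(4))  # -> {1, 2, 3}: x_4 appears in every parity check
print(N_C(1))  # -> {2, 3, 4, 5}
```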

% TODO: Do we need any of these?
% \red{
% \begin{itemize}
% \item Cycles (? - Only if needed later)
% \item Regular vs irregular (? - only if needed later)
% \end{itemize}
% }

\subsection{Spatially-Coupled LDPC Codes}

A relatively recent development in the world of \ac{ldpc} codes is
that of \ac{sc}-\ac{ldpc} codes.\\
\red{[a bit more history (developed by \ldots, developed from \ldots,
\ldots)]}\\
\red{[core concept]}

\red{
\begin{itemize}
\item Tanner graph + PCM
\item Key benefits and reasoning behind them
\item Cite \cite{costello_spatially_2014} \cite{hassan_fully_2016}
\end{itemize}
}

\subsection{Belief Propagation}

\red{[short intro]} \\
\red{[key points (sub-optimal but good enough, low complexity, \ldots)]} \\
\red{[top-level overview (iterative algorithm that approximates \ldots)]}

\red{
\begin{itemize}
\item SPA and NMS algorithms
% TODO: Would it be better to split this into a separate section?
\item Sliding-window decoding of SC-LDPC codes
\item Cite \cite{ryan_channel_2009} \cite{hassan_fully_2016}
\cite{costello_spatially_2014}
\end{itemize}
}
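As a minimal illustration of the message-passing principle, the following Python sketch implements the min-sum approximation (the NMS algorithm with unit scaling factor) on the Tanner graph of the [7,4,3] Hamming code. The update rules are the standard extrinsic ones; the LLR values and iteration count are illustrative choices, not tuned parameters.

```python
import numpy as np

# PCM of the [7,4,3] Hamming code used as a running example
H = np.array([
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
])

def min_sum_decode(H, llr, iterations=5):
    """Iterative min-sum message passing on the Tanner graph of H.
    llr[i] > 0 means bit i is believed to be 0."""
    m, n = H.shape
    v2c = np.tile(llr, (m, 1)) * H   # variable-to-check messages
    c2v = np.zeros((m, n))           # check-to-variable messages
    for _ in range(iterations):
        # Check-node update: product of signs times minimum magnitude
        # over all *other* incoming messages (extrinsic principle)
        for j in range(m):
            nbrs = np.flatnonzero(H[j])
            for i in nbrs:
                others = v2c[j, nbrs[nbrs != i]]
                c2v[j, i] = np.prod(np.sign(others)) * np.min(np.abs(others))
        # Variable-node update: channel LLR plus extrinsic check messages
        for i in range(n):
            nbrs = np.flatnonzero(H[:, i])
            for j in nbrs:
                v2c[j, i] = llr[i] + c2v[nbrs[nbrs != j], i].sum()
    total = llr + c2v.sum(axis=0)    # final a posteriori LLRs
    return (total < 0).astype(int)   # negative LLR -> bit estimated as 1

# All-zero codeword sent; the demodulator is confidently wrong about bit 1
llr = np.array([-2.0, 2, 2, 2, 2, 2, 2])
x_hat = min_sum_decode(H, llr)
print(x_hat)  # -> [0 0 0 0 0 0 0], the error is corrected
```

The sum-product algorithm replaces the sign/minimum rule at the check nodes with the exact tanh rule; min-sum trades a small performance loss for much cheaper check-node updates.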

\section{Quantum Mechanics and Quantum Information Science}
\label{sec:Quantum Mechanics and Quantum Information Science}

% TODO: Should the brief intro to QC be made later on or here?
%%%%%%%%%%%%%%%%
\subsection{Core Concepts and Notation}
\label{subsec:Notation}

\ldots can be very elegantly expressed using the language of
linear algebra.
\todo{Mention that we model the state of a quantum mechanical system
as a vector}
The so-called bra-ket or Dirac notation is especially appropriate,
having been proposed by Paul Dirac in 1939 for the express purpose
of simplifying quantum mechanical notation \cite{dirac_new_1939}.
Two new symbols are defined, \emph{bra}s $\bra{\cdot}$ and
\emph{ket}s $\ket{\cdot}$.
Kets denote ordinary vectors, while bras denote their Hermitian conjugates.
For example, two vectors specified by the labels $a$ and $b$
respectively are written as $\ket{a}$ and $\ket{b}$.
Their inner product is $\braket{a\vert b}$.

\red{\textbf{Tensor product}}
\red{\ldots
\todo{Introduce determinate state or use a different word?}
Take for example two systems with the determinate states $\ket{0}$
and $\ket{1}$. In general, the state of each can be written as the
superposition%
%
\begin{align*}
\alpha \ket{0} + \beta \ket{1}
.%
\end{align*}
%
Combining these two systems into one, the overall state becomes%
%
\begin{align*}
&\mleft( \alpha_1 \ket{0} + \beta_1 \ket{1} \mright) \otimes
\mleft( \alpha_2 \ket{0} + \beta_2 \ket{1} \mright) \\
= &\alpha_1 \alpha_2 \ket{0} \ket{0}
+ \alpha_1 \beta_2 \ket{0} \ket{1}
+ \beta_1 \alpha_2 \ket{1} \ket{0}
+ \beta_1 \beta_2 \ket{1} \ket{1}
% =: &\alpha_{00} \ket{00}
% + \alpha_{01} \ket{01}
% + \alpha_{10} \ket{10}
% + \alpha_{11} \ket{11}
.%
\end{align*}%
%
\ldots When not ambiguous in the context, the tensor product
symbol may be omitted, e.g.,
\begin{align*}
\ket{0} \otimes \ket{0} = \ket{0}\ket{0}
.%
\end{align*}
}
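The expansion above can be checked numerically: the Kronecker product of the coefficient vectors reproduces the four joint amplitudes. A Python sketch with illustrative (normalized) amplitudes:

```python
import numpy as np

# Qubit basis states as vectors in C^2: |0> = (1, 0)^T, |1> = (0, 1)^T
ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# Two single-qubit superpositions (arbitrary illustrative amplitudes)
a1, b1 = 0.6, 0.8
a2, b2 = 0.8, 0.6
psi1 = a1 * ket0 + b1 * ket1
psi2 = a2 * ket0 + b2 * ket1

# The tensor product (Kronecker product) yields the coefficients of
# |0>|0>, |0>|1>, |1>|0>, |1>|1>, in that order
joint = np.kron(psi1, psi2)
expected = np.array([a1 * a2, a1 * b2, b1 * a2, b1 * b2])
assert np.allclose(joint, expected)
print(joint)  # -> [0.48 0.36 0.64 0.48]
```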

As we will see, the core concept that gives quantum computing its
power is entanglement. When two quantum mechanical systems are
entangled, measuring the state of one will collapse that of the other.
Take for example two subsystems with the overall state
%
\begin{align*}
\ket{\psi} = \frac{1}{\sqrt{2}} \mleft( \ket{0}\ket{0} +
\ket{1}\ket{1} \mright)
.%
\end{align*}
%
If we measure the first subsystem as being in $\ket{0}$, we can
be certain that a measurement of the second subsystem will also yield $\ket{0}$.
Introducing a more compact notation for multi-qubit states, we can write%
%
\begin{align*}
\ket{\psi} = \frac{1}{\sqrt{2}} \left( \ket{00} + \ket{11} \right)
.%
\end{align*}
%
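A quick numerical check of this state (Python sketch; squared amplitude magnitudes give the measurement probabilities in the computational basis):

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# |psi> = (|00> + |11>) / sqrt(2)
psi = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)

# Probabilities of the outcomes |00>, |01>, |10>, |11>
probs = np.abs(psi) ** 2
print(probs)  # -> [0.5 0.  0.  0.5]: the two measurement results always agree
```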

\subsection{Projective Measurements}
\label{subsec:Projective Measurements}

% TODO: Write

%%%%%%%%%%%%%%%%
\subsection{Quantum Gates}
\label{subsec:Quantum Gates}

\red{
\textbf{Content:}
\begin{itemize}
\item Bra-ket notation
\item The tensor product
\item Projective measurements (the related operators,
eigenvalues/eigenspaces, etc.)
\begin{itemize}
\item First explain what an operator is
\end{itemize}
\item Abstract intro to QC: Use gates to process qubit
states, similar to classical case
\item X, Z, Y operators/gates
\item Hadamard gate (+ X and Z are the same thing in different bases)
\item Notation of operators on multi-qubit states
\item The Pauli, Clifford and Magic groups
\end{itemize}
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Quantum Error Correction}
\label{sec:Quantum Error Correction}

\red{
\textbf{Content:}
\begin{itemize}
\item General context
\begin{itemize}
\item Why we want QC
\item Why we need QEC (correcting errors due to noisy gates)
\item Main challenges of QEC compared to classical
error correction
\end{itemize}
\item Stabilizer codes
\begin{itemize}
\item Definition of a stabilizer code
\item The stabilizer and its generators (note somewhere
that the generators have to commute to be able to
be measured without disturbing each other)
\item Syndrome extraction circuit
\item Stabilizer codes are effectively the QM
% TODO: Actually binary linear codes or just linear codes?
equivalent of binary linear codes (e.g.,
expressible via check matrix)
\end{itemize}
\item Digitization of errors
\item CSS codes
\item Color codes?
\item Surface codes?
\item Fault-tolerant error correction (gates with which we do
error correction are also noisy)
\begin{itemize}
\item Transversal operations
\item \dots
\end{itemize}
\item Circuit-level noise
\item Detector error model
\begin{itemize}
\item Columns of the check matrix represent different
possible error patterns $\rightarrow$ Check matrix
doesn't quite correspond to the codewords we used
initially anymore, but some similar structure is
still there (compare with syndrome)
\end{itemize}
\end{itemize}
\textbf{General Notes:}
\begin{itemize}
\item Give a brief overview of the history of QEC
\item Note (and research if this is actually correct) that QC
was developed on an abstract level before thinking of
what hardware to use
\item Note that there are other codes than stabilizer codes
(and research and give some examples), but only
stabilizer codes are considered in this work
\item Degeneracy
\item The QEC decoding problem (considering degeneracy)
\end{itemize}
}

\subsection{Stabilizer Codes}
\subsection{CSS Codes}
\subsection{Quantum Low-Density Parity-Check Codes}
|