\chapter{Fundamentals} \label{ch:Fundamentals}
\Ac{qec} is a field of research combining ``classical'' communications engineering and quantum information science. This chapter provides the relevant theoretical background on both of these topics and subsequently introduces the fundamentals of \ac{qec}.
% TODO: Is an explanation of BP with guided decimation needed in this chapter?
% TODO: Is an explanation of OSD needed in this chapter?
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Classical Error Correction} \label{sec:Classical Error Correction}
The core concept underpinning error-correcting codes is that introducing a finite amount of redundancy into information before its transmission can considerably reduce the error rate. Specifically, Shannon proved in 1948 that for any channel, a block code can be found that achieves an arbitrarily small probability of error at any communication rate up to the capacity of the channel as the block length approaches infinity \cite[Sec.~13]{shannon_mathematical_1948}. In this section, we explore the concepts of ``classical'' (as in non-quantum) error correction that are central to this work. We start by looking at different ways of encoding information, first considering binary linear block codes in general and then \ac{ldpc} and \ac{sc}-\ac{ldpc} codes. Finally, we pivot to the decoding process, specifically the \ac{bp} algorithm.
% TODO: Use subsubsections?
\subsection{Binary Linear Block Codes}
%
% Codewords, n, k, rate
%
% TODO: Do I need a specific reference for the expanded Hilbert space thing?
One particularly important class of coding schemes is that of binary linear block codes. The information to be protected takes the form of a sequence of binary symbols, which is split into separate blocks. Each block is encoded, transmitted, and decoded separately.
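As a minimal illustration of this block-wise processing (an illustrative sketch, not taken from the cited references; the helper names \texttt{encode} and \texttt{decode} are hypothetical), the following Python snippet realizes a $[3,1]$ repetition code, in which each information bit is repeated three times and decoded by majority vote:

```python
# Illustrative sketch: a [3,1] repetition code with k = 1, n = 3.
# Each information bit is repeated three times; the decoder takes a
# majority vote over each received block, correcting any single bit flip.

def encode(u):
    """Map an information sequence to a codeword sequence, block by block."""
    return [b for bit in u for b in (bit, bit, bit)]

def decode(y):
    """Hard-decision majority-vote decoding of each length-3 block."""
    blocks = [y[i:i + 3] for i in range(0, len(y), 3)]
    return [1 if sum(block) >= 2 else 0 for block in blocks]

u = [1, 0, 1]
x = encode(u)            # [1, 1, 1, 0, 0, 0, 1, 1, 1]
x[4] ^= 1                # a single bit flip during transmission
assert decode(x) == u    # the flip is corrected by the majority vote
```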
The encoding step introduces redundancy by mapping input messages $\bm{u} \in \mathbb{F}_2^k$ of length $k \in \mathbb{N}$ (called the \textit{information length}) onto \textit{codewords} $\bm{x} \in \mathbb{F}_2^n$ of length $n \in \mathbb{N}$ (called the \textit{block length}) with $n > k$. A measure of the amount of introduced redundancy is the \textit{code rate} $R = k/n$. We call the set of all codewords $\mathcal{C}$ the \textit{code} \cite[Sec.~3.1.1]{ryan_channel_2009}.
%
% d_min and the [] Notation
%
During the encoding process, a mapping from $\mathbb{F}_2^k$ onto $\mathcal{C} \subset \mathbb{F}_2^n$ takes place. The input messages are mapped onto an expanded vector space, where they are ``farther apart'', giving rise to the error-correcting properties of the code. This notion of the distance between two codewords $\bm{x}_1$ and $\bm{x}_2$ can be expressed using the \textit{Hamming distance} $d(\bm{x}_1, \bm{x}_2)$, which is defined as the number of positions in which they differ. We define the \textit{minimum distance} of a code $\mathcal{C}$ as
%
\begin{align*}
d_\text{min} := \min \left\{ d(\bm{x}_1, \bm{x}_2) : \bm{x}_1, \bm{x}_2 \in \mathcal{C}, \bm{x}_1 \neq \bm{x}_2 \right\} .
\end{align*}
%
We can signify that a binary linear block code has information length $k$, block length $n$ and minimum distance $d_\text{min}$ using the notation $[n,k,d_\text{min}]$ \cite[Sec.~1.3]{macwilliams_theory_1977}.
%
% Parity checks, H, and the syndrome
%
A particularly elegant way of describing the subspace $\mathcal{C}$ of $\mathbb{F}_2^n$ that the codewords make up is the notion of \textit{parity checks}. Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n \rvert = 2^n$, we can introduce $n-k$ conditions to constrain the additional degrees of freedom. These conditions, called parity checks, take the form of linear equations over $\mathbb{F}_2$, linking the individual positions of each codeword.
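The distance definitions above can be sketched directly in code (purely illustrative; the helpers \texttt{hamming\_distance} and \texttt{minimum\_distance} are hypothetical names). Note that this brute-force enumeration is only feasible for very small codes, since $\lvert \mathcal{C} \rvert = 2^k$:

```python
# Illustrative sketch: Hamming distance and minimum distance of a toy code,
# computed by brute force over all pairs of distinct codewords.
from itertools import combinations

def hamming_distance(x1, x2):
    """Number of positions in which two words differ."""
    return sum(a != b for a, b in zip(x1, x2))

def minimum_distance(code):
    """d_min: smallest pairwise Hamming distance over distinct codewords."""
    return min(hamming_distance(x1, x2) for x1, x2 in combinations(code, 2))

# The [3,1,3] repetition code: its two codewords differ in all 3 positions.
code = [(0, 0, 0), (1, 1, 1)]
assert minimum_distance(code) == 3
```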
We can arrange the coefficients of these equations in a \textit{parity-check matrix} (\acs{pcm}) $\bm{H} \in \mathbb{F}_2^{(n-k) \times n}$ and equivalently define the code as \cite[Sec.~3.1.1]{ryan_channel_2009}
%
\begin{align*}
\mathcal{C} = \left\{ \bm{x} \in \mathbb{F}_2^n : \bm{H}\bm{x}^\text{T} = \bm{0} \right\} .%
\end{align*}
Note that in general we may have linearly dependent parity checks, prompting us to define the \ac{pcm} as $\bm{H} \in \mathbb{F}_2^{m \times n}$ with $m \ge n-k$ instead, where $m$ denotes the number of parity checks.
%
The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes which parity checks a candidate codeword $\bm{v} \in \mathbb{F}_2^n$ violates. The representation using the \ac{pcm} has the benefit of providing a description of the code whose memory complexity does not grow exponentially with $n$, in contrast to keeping track of all codewords directly.
%
% The decoding problem
%
Figure \ref{fig:Diagram of a transmission system} visualizes the communication process \cite[Sec.~1.1]{ryan_channel_2009}. An input message $\bm{u} \in \mathbb{F}_2^k$ is mapped onto a codeword $\bm{x} \in \mathbb{F}_2^n$. This is passed on to a modulator, which interacts with the physical channel. A demodulator processes the channel output and forwards the result $\bm{y} \in \mathbb{R}^n$ to a decoder. Finally, the decoder is responsible for obtaining an estimate $\hat{\bm{u}} \in \mathbb{F}_2^k$ of the original input message. This is done by first finding an estimate $\hat{\bm{x}}$ of the sent codeword and then undoing the encoding. The decoding problem that we generally attempt to solve thus consists of finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$. One approach is to use the \ac{ml} criterion \cite[Sec.~1.4]{ryan_channel_2009}
\begin{align*}
\hat{\bm{x}}_\text{ML} = \arg\max_{\bm{x} \in \mathcal{C}} P(\bm{Y} = \bm{y} \vert \bm{X} = \bm{x}) .
\end{align*}
Finally, we differentiate between \textit{soft-decision} decoding, where $\bm{y} \in \mathbb{R}^n$, and \textit{hard-decision} decoding, where $\bm{y} \in \mathbb{F}_2^n$ \cite[Sec.~1.5.1.3]{ryan_channel_2009}.
%
\begin{figure}[h]
\centering
\tikzset{
  box/.style={
    rectangle,
    draw=black,
    minimum width=17mm,
    minimum height=8mm,
  },
}
\begin{tikzpicture}[node distance = 2mm and 7mm]
\node (in) {};
\node[box, right=of in] (enc) {Encoder};
\node[box, minimum width=25mm, right=of enc] (mod) {Modulator};
\node[box, below right=of mod] (cha) {Channel};
\node[box, minimum width=25mm, below left=of cha] (dem) {Demodulator};
\node[box, left=of dem] (dec) {Decoder};
\node[left=of dec] (out) {};
\draw[-{latex}] (in) -- (enc) node[midway, above] {$\bm{u}$};
\draw[-{latex}] (enc) -- (mod) node[midway, above] {$\bm{x}$};
\draw[-{latex}] (mod) -| (cha);
\draw[-{latex}] (cha) |- (dem);
\draw[-{latex}] (dem) -- (dec) node[midway, above] {$\bm{y}$};
\draw[-{latex}] (dec) -- (out) node[midway, above] {$\hat{\bm{u}}$};
\end{tikzpicture}
\caption{Overview of a transmission system.}
\label{fig:Diagram of a transmission system}
\end{figure}
%
%
% Hard vs. soft information
%
\subsection{Low-Density Parity-Check Codes}
%
% Core concept
%
Shannon's noisy-channel coding theorem is stated for codes whose block length approaches infinity. This suggests that as the block length becomes larger, the performance of the considered codes should generally improve. However, for a fixed rate, the size of the \ac{pcm} of a linear block code, and thus in general its decoding complexity, grows quadratically with $n$. This would quickly render decoding intractable as we increase the block length. We can get around this problem by constructing $\bm{H}$ in such a manner that the number of nonzero entries grows less than quadratically, e.g., only linearly. This is exactly the motivation behind \ac{ldpc} codes \cite[Ch.~1]{gallager_low_1960}.
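The syndrome computation $\bm{s} = \bm{H}\bm{v}^\text{T}$ introduced above can be sketched in a few lines of Python (illustrative only; the helper \texttt{syndrome} is a hypothetical name), here using a \ac{pcm} of the [7,4,3]-Hamming code. The sparsity count in the comment hints at the structure \ac{ldpc} codes exploit:

```python
# Illustrative sketch: syndrome computation s = H v^T over F_2.
# H is a PCM of the [7,4,3] Hamming code; only 12 of its 21 entries are
# nonzero, a (tiny) preview of the sparsity that LDPC codes rely on.

H = [
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
]

def syndrome(H, v):
    """Return s = H v^T with all arithmetic in F_2."""
    return [sum(h_ji * v_i for h_ji, v_i in zip(row, v)) % 2 for row in H]

x = [1, 0, 0, 0, 0, 1, 1]        # a codeword: every parity check is satisfied
assert syndrome(H, x) == [0, 0, 0]

v = x.copy()
v[3] ^= 1                        # flip bit x_4, which appears in all checks
assert syndrome(H, v) == [1, 1, 1]
```

The nonzero syndrome entries mark exactly the violated parity checks, which is the information an iterative decoder works with.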
%
% Tanner Graph, VNs and CNs
%
\ac{ldpc} codes belong to a class sometimes referred to as ``modern codes''. These differ from ``classical codes'' in their decoding algorithms: Classical codes are usually decoded using one-step hard-decision decoding, whereas modern codes are suitable for iterative soft-decision decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms in question are generally defined in terms of message passing on the \textit{Tanner graph} of the code. The Tanner graph is a bipartite graph that constitutes an alternative representation of the \ac{pcm}. We define two types of nodes: \acp{vn}, corresponding to codeword bits, and \acp{cn}, corresponding to individual parity checks. We then construct the Tanner graph by connecting each \ac{cn} to the \acp{vn} that make up the corresponding parity check \cite[Sec.~5.1.2]{ryan_channel_2009}. Figure \ref{PCM and Tanner graph of the Hamming code} shows this construction for the [7,4,3]-Hamming code.
%
\begin{figure}[H]
\centering
\begin{align*}
\bm{H} = \begin{pmatrix}
0 & 1 & 1 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1 \\
\end{pmatrix}
\end{align*}
\vspace*{2mm}
\tikzset{
  VN/.style={ circle, fill=KITgreen, minimum width=1mm, minimum height=1mm, },
  CN/.style={ rectangle, fill=KITblue, minimum width=1mm, minimum height=1mm, },
}
\begin{tikzpicture}
\node[VN, label=above:$x_1$] (vn1) {};
\node[VN, right=12mm of vn1, label=above:$x_2$] (vn2) {};
\node[VN, right=12mm of vn2, label=above:$x_3$] (vn3) {};
\node[VN, right=12mm of vn3, label=above:$x_4$] (vn4) {};
\node[VN, right=12mm of vn4, label=above:$x_5$] (vn5) {};
\node[VN, right=12mm of vn5, label=above:$x_6$] (vn6) {};
\node[VN, right=12mm of vn6, label=above:$x_7$] (vn7) {};
\node[CN, below=25mm of vn4, label={below:$x_1 + x_3 + x_4 + x_6 = 0$}] (cn2) {};
\node[CN, left=40mm of cn2, label={below:$x_2 + x_3 + x_4 + x_5 = 0$}] (cn1) {};
\node[CN, right=40mm of cn2, label={below:$x_1 + x_2 + x_4 + x_7 = 0$}] (cn3) {};
\foreach \n in {2,3,4,5} { \draw (cn1) -- (vn\n); }
\foreach \n in {1,3,4,6} { \draw (cn2) -- (vn\n); }
\foreach \n in {1,2,4,7} { \draw (cn3) -- (vn\n); }
\end{tikzpicture}
\caption{The \ac{pcm} and corresponding Tanner graph of the [7,4,3]-Hamming code.}
\label{PCM and Tanner graph of the Hamming code}
\end{figure}
%
% N_V(i), N_C(j)
%
Mathematically, we represent a \ac{vn} using the index $i \in \mathcal{I} := \left[ 1 : n \right]$ and a \ac{cn} using the index $j \in \mathcal{J} := \left[ 1 : m \right]$. We can then encode the information contained in the graph by defining the neighborhood of a variable node $i$ as $\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : \bm{H}_{j,i} = 1 \right\}$ and that of a check node $j$ as $\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : \bm{H}_{j,i} = 1 \right\}$.
% TODO: Do we need any of these?
% \red{
% \begin{itemize}
% \item Cycles (? - Only if needed later)
% \item Regular vs irregular (? - only if needed later)
% \end{itemize}
% }
\subsection{Spatially-Coupled LDPC Codes}
A relatively recent development in the world of \ac{ldpc} codes is that of \ac{sc}-\ac{ldpc} codes.\\
\red{[a bit more history (developed by \ldots, developed from \ldots, \ldots)]}\\
\red{[core concept]}
\red{
\begin{itemize}
\item Tanner graph + PCM
\item Key benefits and reasoning behind them
\item Cite \cite{costello_spatially_2014} \cite{hassan_fully_2016}
\end{itemize}
}
\subsection{Belief Propagation}
\red{[short intro]} \\
\red{[key points (sub-optimal but good enough, low complexity, \ldots)]} \\
\red{[top-level overview (iterative algorithm that approximates \ldots)]}
\red{
\begin{itemize}
\item SPA and NMS algorithms % TODO: Would it be better to split this into a separate section?
\item Sliding-window decoding of SC-LDPC codes
\item Cite \cite{ryan_channel_2009} \cite{hassan_fully_2016} \cite{costello_spatially_2014}
\end{itemize}
}
\section{Quantum Mechanics and Quantum Information Science} \label{sec:Quantum Mechanics and Quantum Information Science}
% TODO: Should the brief intro to QC be made later on or here?
%%%%%%%%%%%%%%%%
\subsection{Core Concepts and Notation} \label{subsec:Notation}
\ldots can be very elegantly expressed using the language of linear algebra. \todo{Mention that we model the state of a quantum mechanical system as a vector} The so-called bra-ket or Dirac notation is especially appropriate, having been proposed by Paul Dirac in 1939 for the express purpose of simplifying quantum mechanical notation \cite{dirac_new_1939}. Two new symbols are defined: \emph{bra}s $\bra{\cdot}$ and \emph{ket}s $\ket{\cdot}$. Kets denote ordinary vectors, while bras denote their Hermitian conjugates. For example, two vectors specified by the labels $a$ and $b$ are written as $\ket{a}$ and $\ket{b}$, respectively. Their inner product is $\braket{a \vert b}$.
\red{\textbf{Tensor product}}
\red{\ldots \todo{Introduce determinate state or use a different word?} Take, for example, two systems with the determinate states $\ket{0}$ and $\ket{1}$.
In general, the state of each can be written as the superposition
%
\begin{align*}
\alpha \ket{0} + \beta \ket{1} .%
\end{align*}
%
Combining these two systems into one, the overall state becomes
%
\begin{align*}
&\mleft( \alpha_1 \ket{0} + \beta_1 \ket{1} \mright) \otimes \mleft( \alpha_2 \ket{0} + \beta_2 \ket{1} \mright) \\
= &\alpha_1 \alpha_2 \ket{0} \ket{0} + \alpha_1 \beta_2 \ket{0} \ket{1} + \beta_1 \alpha_2 \ket{1} \ket{0} + \beta_1 \beta_2 \ket{1} \ket{1}
% =: &\alpha_{00} \ket{00}
% + \alpha_{01} \ket{01}
% + \alpha_{10} \ket{10}
% + \alpha_{11} \ket{11}
.%
\end{align*}%
%
\ldots When not ambiguous in context, the tensor product symbol may be omitted, e.g.,
\begin{align*}
\ket{0} \otimes \ket{0} = \ket{0}\ket{0} .%
\end{align*}
}
As we will see, the core concept that gives quantum computing its power is entanglement. When two quantum mechanical systems are entangled, measuring the state of one collapses that of the other. Take, for example, two subsystems with the overall state
%
\begin{align*}
\ket{\psi} = \frac{1}{\sqrt{2}} \mleft( \ket{0}\ket{0} + \ket{1}\ket{1} \mright) .%
\end{align*}
%
If we measure the first subsystem as being in $\ket{0}$, we can be certain that a measurement of the second subsystem will also yield $\ket{0}$. Using the shorthand $\ket{00} := \ket{0}\ket{0}$ for such multi-system states, we can write%
%
\begin{align*}
\ket{\psi} = \frac{1}{\sqrt{2}} \left( \ket{00} + \ket{11} \right) .%
\end{align*}
%
\subsection{Projective Measurements} \label{subsec:Projective Measurements}
% TODO: Write
%%%%%%%%%%%%%%%%
\subsection{Quantum Gates} \label{subsec:Quantum Gates}
\red{
\textbf{Content:}
\begin{itemize}
\item Bra-ket notation
\item The tensor product
\item Projective measurements (the related operators, eigenvalues/eigenspaces, etc.)
\begin{itemize}
\item First explain what an operator is
\end{itemize}
\item Abstract intro to QC: Use gates to process qubit states, similar to the classical case
\item X, Z, Y operators/gates
\item Hadamard gate (+ X and Z are the same thing in different bases)
\item Notation of operators on multi-qubit states
\item The Pauli, Clifford and Magic groups
\end{itemize}
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Quantum Error Correction} \label{sec:Quantum Error Correction}
\red{
\textbf{Content:}
\begin{itemize}
\item General context
\begin{itemize}
\item Why we want QC
\item Why we need QEC (correcting errors due to noisy gates)
\item Main challenges of QEC compared to classical error correction
\end{itemize}
\item Stabilizer codes
\begin{itemize}
\item Definition of a stabilizer code
\item The stabilizer and its generators (note somewhere that the generators have to commute to be able to be measured without disturbing each other)
\item Syndrome extraction circuit
\item Stabilizer codes are effectively the QM equivalent of binary linear codes (e.g., expressible via check matrix) % TODO: Actually binary linear codes or just linear codes?
\end{itemize}
\item Digitization of errors
\item CSS codes
\item Color codes?
\item Surface codes?
\item Fault-tolerant error correction (the gates with which we perform error correction are also noisy)
\begin{itemize}
\item Transversal operations
\item \dots
\end{itemize}
\item Circuit-level noise
\item Detector error model
\begin{itemize}
\item Columns of the check matrix represent different possible error patterns $\rightarrow$ The check matrix no longer quite corresponds to the codewords we used initially, but some similar structure is still there (compare with the syndrome)
\end{itemize}
\end{itemize}
\textbf{General Notes:}
\begin{itemize}
\item Give a brief overview of the history of QEC
\item Note (and research whether this is actually correct) that QC was developed on an abstract level before thinking of what hardware to use
\item Note that there are codes other than stabilizer codes (research and give some examples), but only stabilizer codes are considered in this work
\item Degeneracy
\item The QEC decoding problem (considering degeneracy)
\end{itemize}
}
\subsection{Stabilizer Codes}
\subsection{CSS Codes}
\subsection{Quantum Low-Density Parity-Check Codes}