Compare commits
9 commits: 1edc3f301a...thesis-v1

| Author | SHA1 | Date |
|---|---|---|
| | 4dfb3a7c35 | |
| | 10d791fe04 | |
| | 06852b8e62 | |
| | 400dc47df0 | |
| | ece8fc1715 | |
| | 56e3a0e5ca | |
| | 8d6df8a79d | |
| | c41ac9f61f | |
| | a41e0b05fe | |
@@ -35,9 +35,9 @@ algorithm.
 % Codewords, n, k, rate
 %
 
-One particularly important class of coding schemes is that of binary
-linear block codes.
-The information to be protected takes the form of a sequence of
+Binary linear block codes form one particularly important class of
+coding schemes.
+The information to be protected is represented by a sequence of
 binary symbols, which is split into separate blocks.
 Each block is encoded, transmitted, and decoded separately.
 The encoding step introduces redundancy by mapping input messages
@@ -45,10 +45,11 @@ $\bm{u} \in \mathbb{F}_2^k$ of length $k \in \mathbb{N}$ (called the
 \textit{information length}) onto \textit{codewords} $\bm{x} \in
 \mathbb{F}_2^n$ of length $n \in \mathbb{N}$ (called the
 \textit{block length}) with $n > k$.
-A measure of the amount of introduced redundancy is the \textit{code
-rate} $R = k/n$.
-We call the set of all codewords $\mathcal{C}$ the \textit{code}
-\cite[Sec.~3.1.1]{ryan_channel_2009}.
+The \textit{code rate} $R = k/n$ is a measure of the amount of
+introduced redundancy.
+We call the set of all codewords
+$\mathcal{C} = \{\bm{x}^{(1)}, \bm{x}^{(2)}, \ldots, \bm{x}^{(2^k)}\}$
+the \textit{code} \cite[Sec.~3.1.1]{ryan_channel_2009}.
 
 %
 % d_min and the [] Notation
@@ -77,7 +78,7 @@ $[n,k,d_\text{min}]$.
 % Parity checks, H, and the syndrome
 %
 
-A particularly elegant way of describing the code space $C$ is the
+A particularly elegant way of describing the code space $\mathcal{C}$ is the
 notion of \textit{parity checks}.
 Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n
 \rvert = 2^n$, there are $n-k$ conditions that constrain the additional
@@ -86,17 +87,17 @@ These conditions, called parity checks, take the form of equations
 over $\mathbb{F}_2^n$, linking the individual positions of each codeword.
 We can arrange the coefficients of these equations in a
 \textit{parity-check matrix} (\acs{pcm}) $\bm{H} \in
-\mathbb{F}_2^{(n-k) \times n}$ and equivalently define the code as
-\cite[Sec.~3.1.1]{ryan_channel_2009}
+\mathbb{F}_2^{(n-k) \times n}$, $\text{rank}(\bm{H}) = n-k$, and
+equivalently define the code as \cite[Sec.~3.1.1]{ryan_channel_2009}
 \begin{align*}
-\mathcal{C} = \left\{ \bm{x} \in \mathbb{F}_2^n :
-\bm{H}\bm{x}^\text{T} = \bm{0} \right\}
+\mathcal{C} := \text{kern}(\bm{H}) = \left\{ \bm{x} \in \mathbb{F}_2^n :
+\bm{H}\bm{x}^\mathsf{T} = \bm{0} \right\}
 .%
 \end{align*}
-Note that in general we may have linearly dependent parity checks,
+In general, we may have linearly dependent parity checks,
 prompting us to define the \ac{pcm} as $\bm{H} \in
 \mathbb{F}_2^{m\times n}$ with $m \ge n-k$ instead.
-The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes
+The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\mathsf{T}$ describes
 which parity checks a vector $\bm{v} \in \mathbb{F}_2^n$ violates.
 The representation using the \ac{pcm} has the benefit of providing a
 description of the code, the memory complexity of which does not grow
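The syndrome defined in this hunk is a plain matrix-vector product over $\mathbb{F}_2$. A minimal sketch in Python, using the standard $[7,4]$ Hamming parity-check matrix as a stand-in example (it is not a matrix taken from the thesis):

```python
import numpy as np

# Parity-check matrix of the [7,4] Hamming code (a standard textbook example).
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

def syndrome(H, v):
    """Syndrome s = H v^T over F_2; s = 0 iff v is a codeword."""
    return H @ v % 2

codeword = np.zeros(7, dtype=int)   # the all-zero word is a codeword of any linear code
assert not syndrome(H, codeword).any()

error = codeword.copy()
error[4] = 1                        # flip a single bit
s = syndrome(H, error)              # non-zero syndrome flags the violated checks
```

A non-zero syndrome lists exactly the parity checks the vector violates; for a single flipped bit it equals the corresponding column of H.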
@@ -118,9 +119,9 @@ $\bm{y} \in \mathbb{R}^n$, and \textit{hard-decision} decoding, where
 $\bm{y} \in \mathbb{F}_2^n$ \cite[Sec.~1.5.1.3]{ryan_channel_2009}.
 Finally, the decoder is responsible for obtaining an estimate
 $\hat{\bm{u}} \in \mathbb{F}_2^k$ of the original input message.
-This is done by first finding an estimate $\hat{\bm{x}}$ of the sent
+This can be done by first finding an estimate $\hat{\bm{x}}$ of the sent
 codeword and undoing the encoding.
-The decoding problem that we generally attempt to solve thus consists
+The decoding problem that we attempt to solve thus consists
 in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$.
 
 \begin{figure}[t]
@@ -168,9 +169,9 @@ in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$.
 %
 
 Shannon's noisy-channel coding theorem is stated for codes whose block
-length approaches infinity. This suggests that as the block length
-becomes larger, the performance of the considered codes should
-generally improve.
+length $n$ approaches infinity.
+This suggests that as the block length becomes larger, the
+performance of the considered codes should generally improve.
 However, the size of the \ac{pcm} of a linear block code grows
 quadratically with $n$.
 This would quickly render decoding intractable as we increase the
@@ -189,13 +190,14 @@ This is exactly the motivation behind \ac{ldpc} codes
 These differ from ``classical codes'' in their decoding algorithms:
 Classical codes are usually decoded using one-step hard-decision decoding,
 whereas modern codes are suitable for iterative soft-decision
-decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms
+decoding \cite[Preface]{ryan_channel_2009}.
+For \ac{ldpc} codes, the iterative decoding algorithms
 are generally defined in terms of message passing on the
 \textit{Tanner graph} of a code. The Tanner graph is a bipartite
 graph that constitutes an alternative representation of the \ac{pcm}.
-We define two types of nodes: \acp{vn}, corresponding to codeword
+We define two types of nodes: \Acp{vn}, corresponding to codeword
 bits, and \acp{cn}, corresponding to individual parity checks.
-We then construct the Tanner graph by connecting each \ac{cn} to
+Then, we construct the Tanner graph by connecting each \ac{cn} to
 the \acp{vn} that make up the corresponding parity check
 \cite[Sec.~5.1.2]{ryan_channel_2009}.
 \Cref{PCM and Tanner graph of the Hamming code} shows the Tanner
@@ -273,11 +275,11 @@ Mathematically, we represent a \ac{vn} using the index $i \in
 and a \ac{cn} using the index $j \in \mathcal{J}
 := \left[ 0 : m-1 \right]$.
 We can then encode the information contained in the graph by defining
-the neighborhood of a variable node $i$ as
-$\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : \bm{H}_{j,i}
+the neighborhood of a \ac{vn} $i$ as
+$\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : H_{j,i}
 = 1 \right\}$
-and that of a check node $j$ as
-$\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : \bm{H}_{j,i}
+and the neighborhood of a \ac{cn} $j$ as
+$\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : H_{j,i}
 = 1 \right\}$.
 
 %
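The neighborhood definitions $\mathcal{N}_\text{V}(i)$ and $\mathcal{N}_\text{C}(j)$ in this hunk translate directly into code. A sketch with an arbitrary small parity-check matrix chosen purely for illustration:

```python
import numpy as np

# Small illustrative PCM: 3 check nodes (rows), 5 variable nodes (columns).
H = np.array([[1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [1, 0, 0, 1, 1]])

def N_V(H, i):
    """Check nodes adjacent to variable node i: rows j with H[j, i] = 1."""
    return {j for j in range(H.shape[0]) if H[j, i] == 1}

def N_C(H, j):
    """Variable nodes adjacent to check node j: columns i with H[j, i] = 1."""
    return {i for i in range(H.shape[1]) if H[j, i] == 1}
```

Together, these two maps are exactly the edge set of the Tanner graph: variable node $i$ and check node $j$ are connected iff $H_{j,i} = 1$.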
@@ -379,15 +381,15 @@ the numbers of ones, of their rows and columns are constant
 Already during their introduction, regular \ac{ldpc} codes were shown to have
 a minimum distance scaling linearly with the block length $n$ for
 large values \cite[Ch.~2,~Theorem~1]{gallager_low_1960},
-which leads to the fact that they do not exhibit an error floor under
-\ac{ml} decoding.
-Irregular codes, on the other hand, generally do exhibit an error floor,
-while their redeeming quality is the ability to reach near-capacity
-performance in the waterfall region \cite[Intro.]{costello_spatially_2014}.
+which leads to a more favorable behavior of the error rate for high
+signal-to-noise ratios.
+Irregular codes, on the other hand, have more severe error floor behavior.
+However, they have the ability to reach near-capacity performance
+in the waterfall region \cite[Intro.]{costello_spatially_2014}.
 
 \subsection{Spatially-Coupled LDPC Codes}
 
-A recent development in the field of \ac{ldpc} codes is that of
+A more recent development in the field of \ac{ldpc} codes is that of
 \ac{sc}-\ac{ldpc} codes.
 Their key feature is that they combine the best properties of regular
 and irregular codes.
@@ -399,11 +401,12 @@ waterfall region \cite[Intro.]{costello_spatially_2014}.
 The essential property of \ac{sc}-\ac{ldpc} codes is that codewords
 from different \textit{spatial positions}, which would ordinarily be sent
 one after the other independently, are linked.
-This is achieved by connecting some \acp{vn} of one spatial position to
-\acp{cn} of another, resulting in a \ac{pcm} of the form
+This is achieved by introducing edges between \acp{vn} of one spatial
+position and \acp{cn} of another, resulting in a \ac{pcm} of the form
 \cite[Eq.~1]{hassan_fully_2016}
 %
-\begin{align*}
+\begin{align}
+\label{eq:PCM}
 \bm{H} =
 \begin{pmatrix}
 \bm{H}_0(1) & & \\
@@ -413,10 +416,11 @@ This is achieved by connecting some \acp{vn} of one spatial position to
 & & \bm{H}_K(L) \\
 \end{pmatrix}
 ,
-\end{align*}
+\end{align}
 %
 where $K \in \mathbb{N}$ is the \textit{coupling width} and $L \in
 \mathbb{N}$ is the number of spatial positions.
+The parts of the \ac{pcm} left empty in \Cref{eq:PCM} are filled with zeros.
 This construction results in a Tanner graph as depicted in
 \Cref{fig:sc-ldpc-tanner}.
 
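The banded structure of the \ac{pcm} in this hunk can be sketched as follows. For simplicity the component matrices $\bm{H}_k(t)$ are assumed time-invariant here (an assumption for illustration only; the referenced construction lets them vary with the spatial position), and the toy $\bm{H}_k$ are arbitrary:

```python
import numpy as np

def sc_ldpc_pcm(blocks, L):
    """Assemble the banded SC-LDPC PCM: column block t (spatial position t)
    contains H_0, ..., H_K stacked with a row-block offset of t; everything
    outside the diagonal band stays zero."""
    K = len(blocks) - 1
    r, c = blocks[0].shape
    H = np.zeros(((L + K) * r, L * c), dtype=int)
    for t in range(L):                  # spatial position
        for k, Hk in enumerate(blocks): # coupling index
            H[(t + k) * r:(t + k + 1) * r, t * c:(t + 1) * c] = Hk
    return H

# Toy component matrices, coupling width K = 1, L = 3 spatial positions:
H0 = np.array([[1, 1]])
H1 = np.array([[1, 1]])
H = sc_ldpc_pcm([H0, H1], L=3)
```

The resulting matrix has $(L+K)$ row blocks and $L$ column blocks; the first and last check rows touch fewer spatial positions, which is the lower-degree boundary effect discussed later in the section.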
@@ -513,7 +517,7 @@ Note that at the first few spatial positions some \acp{cn} have lower degrees.
 This leads to more reliable information about the
 \acp{vn} that, as we will see, is
 later passed to subsequent spatial positions during decoding.
-This is precisely the effect that leads to the good performance of
+This is precisely the effect that leads to the improved performance of
 \ac{sc}-\ac{ldpc} codes in the waterfall region \cite{costello_spatially_2014}.
 
 \subsection{Iterative Decoding}
@@ -521,15 +525,14 @@ This is precisely the effect that leads to the good performance of
 
 % Introduction
 
-\ac{ldpc} codes are generally decoded using efficient iterative
-algorithms, something that is possible due to their sparsity
-\cite[Sec.~5.3]{ryan_channel_2009}.
-The algorithm originally proposed alongside LDPC codes for this
-purpose by Gallager in 1960 is now known as the \ac{spa}
+Due to their sparse graphs, efficient iterative decoders exist for
+\ac{ldpc} codes \cite[Sec.~5.3]{ryan_channel_2009}.
+The decoding algorithm originally proposed alongside LDPC codes by
+Gallager in 1960 is now known as the \ac{spa}
 \cite[5.4.1]{ryan_channel_2009}, also called \acf{bp}.
 
 The optimality criterion the \ac{spa} is built around is a
-symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
+bit-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
 The core idea of the resulting algorithm is to view \acp{cn}
 and \acp{vn} as representing individual local codes.
 A \ac{cn} represents a single parity check on the connected \acp{vn},
@@ -539,11 +542,11 @@ should agree on its value; it can therefore be understood as a repetition code.
 The algorithm alternates between consolidating soft information about
 the \acp{vn} in the \acp{cn}, and consolidating soft information about
 the \acp{cn} in the \acp{vn}.
-To this end, messages are passed back and forth along the edges of
-the Tanner graph.
+To this end, messages computed in the nodes are passed back and forth
+along the edges of the Tanner graph.
 $L_{i\rightarrow j}$ represents a message passed from \ac{vn} $i$ to
-\ac{cn} j, $L_{i\leftarrow j}$ represents a message passed from
-\ac{cn} j to \ac{vn} i.
+\ac{cn} $j$, $L_{i\leftarrow j}$ represents a message passed from
+\ac{cn} $j$ to \ac{vn} $i$.
 The \acp{vn} additionally receive messages \cite[5.4.2]{ryan_channel_2009}
 \begin{align*}
 \tilde{L}_i = \log \frac{P(X=0 \vert Y=y)}{P(X=1 \vert Y=y)},
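The channel messages $\tilde{L}_i$ above depend on the channel model, which this hunk does not fix. Under the common assumption of BPSK ($0 \mapsto +1$, $1 \mapsto -1$) over an AWGN channel with equiprobable inputs, the log-likelihood ratio reduces to $2y/\sigma^2$ — a sketch under exactly that assumption:

```python
def channel_llr(y, sigma2):
    """Channel LLR  log P(X=0|Y=y) / P(X=1|Y=y)  for BPSK (0 -> +1, 1 -> -1)
    over AWGN with noise variance sigma2, assuming equiprobable inputs.
    The Gaussian quadratic terms cancel, leaving 2*y/sigma2."""
    return 2.0 * y / sigma2

# A positive observation favors X = 0 (sent symbol +1):
assert channel_llr(1.0, 0.5) > 0
# The sign of the LLR follows the sign of the observation:
assert channel_llr(-1.0, 0.5) < 0
```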
@@ -574,7 +577,7 @@ possible cycles and are thus especially problematic.
 
 % Min-sum algorithm
 
-A simplification of the \ac{spa} is the min-sum decoder. Here, the
+A simplification of the \ac{spa} is the min-sum algorithm. Here, the
 \ac{cn} update is approximated as \cite[Sec.~5.5.1]{ryan_channel_2009}
 \begin{align*}
 L_{i \leftarrow j} = \prod_{i' \in \mathcal{N}_\text{C}(j)\setminus \{i\}}
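The min-sum check-node update begun in this hunk — the product of the signs times the minimum magnitude of all other incoming messages — can be sketched for a single check node as:

```python
import numpy as np

def minsum_cn_update(incoming):
    """Min-sum CN update: outgoing message on edge i is the product of the
    signs times the minimum magnitude of the other incoming messages
    L_{i' -> j}, i' != i."""
    incoming = np.asarray(incoming, dtype=float)
    out = np.empty_like(incoming)
    for i in range(len(incoming)):
        others = np.delete(incoming, i)
        out[i] = np.prod(np.sign(others)) * np.min(np.abs(others))
    return out

# Three incoming VN messages; each outgoing message excludes its own edge:
msgs_out = minsum_cn_update([2.0, -1.0, 4.0])
```

Excluding the edge's own incoming message (extrinsic information) is what makes the iteration converge on cycle-free graphs.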
@@ -598,7 +601,7 @@ decoding of subsequent blocks \cite[Sec.~III.~C.]{hassan_fully_2016}.
 \label{sec:Quantum Mechanics and Quantum Information Science}
 
 Designing codes and decoders for \ac{qec} is generally performed on a
-layer of abstraction far removed from the quantum mechanical
+layer of mathematical abstraction far removed from the quantum mechanical
 processes underlying the actual physics.
 Nevertheless, having a fundamental understanding of the related
 quantum mechanical concepts is useful to grasp the unique constraints
@@ -618,39 +621,41 @@ function and the observable world:
 $\lvert \psi (x,t) \rvert^2$ is the \ac{pdf} of finding a particle at
 position $x$ and time $t$ \cite[Sec.~1.2]{griffiths_introduction_1995}.
 Note that this presupposes a normalization of $\psi$ such that
-$\int_{-\infty}^{\infty} \lvert \psi(x,t) \rvert^2 dx = 1$.
+\begin{align*}
+\int_{-\infty}^{\infty} \lvert \psi(x,t) \rvert^2 dx = 1
+.%
+\end{align*}
 
 % Dirac notation
 
-Much of the related mathematics can be very elegantly expressed
-using the language of linear algebra.
-The so-called Bra-ket or Dirac notation is especially appropriate,
-having been proposed by Paul Dirac in 1939 for the express purpose
-of simplifying quantum mechanical notation \cite{dirac_new_1939}.
-Two new symbols are defined, \emph{bra}s $\bra{\cdot}$ and
-\emph{ket}s $\ket{\cdot}$.
+The language of linear algebra allows one to express the related
+mathematics particularly elegantly.
+The so-called Bra-ket or Dirac notation, introduced
+by Paul Dirac in 1939 for the express purpose of simplifying quantum
+mechanical notation \cite{dirac_new_1939}, is especially appropriate.
+Two new symbols are defined, \emph{bra} $\bra{\cdot}$ and
+\emph{ket} $\ket{\cdot}$.
 Kets denote column vectors, while bras denote their Hermitian conjugates.
-For example, two vectors specified by the labels $a$ and $b$
-respectively are written as $\ket{a}$ and $\ket{b}$.
+For example, two vectors specified by the labels $a$ and $b$,
+respectively, are written as $\ket{a}$ and $\ket{b}$.
 Their inner product is $\braket{a\vert b}$.
 
 % Expressing wave functions using linear algebra
 
 The connection we will make between quantum mechanics and linear
 algebra is that we will model the state space of a system as a
-\emph{function space}, the Hilbert space $L_2$.
-We will represent the state of a particle with wave function
-$\psi(x,t)$ using the vector $\ket{\psi}$
-\cite[Sec.~3.3]{griffiths_introduction_1995}.
+\emph{function space}, namely the Hilbert space $L_2$.
+The state of a particle with wave function $\psi(x,t)$ is represented
+by the vector $\ket{\psi}$ \cite[Sec.~3.3]{griffiths_introduction_1995}.
 
 % Operators
 
 Another important notion is that of an \emph{operator}, a transformation
-that takes a function as an input and returns another function as an
-output \cite[Sec.~3.2.2]{griffiths_introduction_1995}.
+that maps a function onto another function
+\cite[Sec.~3.2.2]{griffiths_introduction_1995}.
-A prominent example of this is the differential operator $\partial x$.
 Operators are useful to describe the relations between different
 quantities relating to a particle.
+An example of this is the differential operator $\partial x$.
 We define the \emph{commutator} of two operators $P_1$ and $P_2$ as
 \begin{align*}
 [P_1,P_2] = P_1P_2 - P_2P_1
@@ -669,22 +674,21 @@ We say the two operators \emph{commute} iff $[P_1,P_2] = 0$, and they
 
 % Observable quantities
 
-An \emph{observable quantity} $Q(x,p,t)$ is a quantity of a quantum
-mechanical system that we can measure, such as the position $x$ or
+An \emph{observable} $Q(x,p,t)$ is a quantity of a quantum
+mechanical system that we can measure, e.g., the position $x$ or
 momentum $p$ of a particle.
 In general, such measurements are not deterministic, i.e.,
 measurements on identically prepared states can yield different results.
-There are some states, however, that are \emph{determinate} for a
-specific observable: measuring those will always yield identical
-observations \cite[Sec.~3.3]{griffiths_introduction_1995}.
+However, some states are \emph{determinate} for a
+specific observable: Measuring those will always yield identical
+outcomes \cite[Sec.~3.3]{griffiths_introduction_1995}.
 
 % General expression for expected value of observable quantity
 
-If we know the wave function of a particle, we should be able to
-compute the expected value $\braket{Q}$ of any observable quantity we wish.
-It can be shown that for any $Q$, we can find a
-corresponding Hermitian operator $\hat{Q}$ such that
-\cite[Sec.~3.3]{griffiths_introduction_1995}
+If the wave function of a particle is known, the expected value
+$\braket{Q}$ of any observable quantity can be computed.
+Indeed, for any $Q$, there exists a corresponding Hermitian operator
+$\hat{Q}$ such that \cite[Sec.~3.3]{griffiths_introduction_1995}
 \begin{align}
 \label{eq:gen_expr_Q_exp}
 \braket{Q} = \int_{-\infty}^{\infty} \psi^*(x,t) \hat{Q} \psi(x,t) dx
@@ -700,9 +704,9 @@ operator to $\hat{Q} = x$, we can write
 = \int_{-\infty}^{\infty} x \lvert \psi(x,t) \rvert ^2 dx
 .%
 \end{align*}
-Note that $\lvert \psi(x,t) \rvert^2 $ represents the \ac{pdf} of
-finding a particle in a specific state. We immediately see that the
-formula simplifies to the direct calculation of the expected value.
+Note that $\lvert \psi(x,t) \rvert^2 $ is the \ac{pdf} of
+finding a particle at position $x$. Hence, we immediately see that
+the formula simplifies to the direct calculation of the expected value.
 
 % Determinate states and eigenvalues
 
@@ -716,40 +720,40 @@ We begin by translating \Cref{eq:gen_expr_Q_exp} into linear algebra as
 .%
 \end{align}
 \Cref{eq:gen_expr_Q_exp_lin} expresses an inherently probabilistic
-relationship.
-The determinate states are inherently deterministic.
+relationship, whereas the determinate states are inherently deterministic.
 To relate the two, we note that since determinate states should
 always yield the same measurement results, the variance of the
-observable should be zero.
+observable must be zero.
 We thus compute \cite[Eq.~3.116]{griffiths_introduction_1995}
 \begin{align}
 0 &\overset{!}{=} \braket{(Q - \braket{Q})^2}
 = \braket{e_n \vert (\hat{Q} - \braket{Q})^2 e_n} \nonumber\\
-&= \braket{(Q - \braket{Q})e_n \vert (\hat{Q} - \braket{Q})
+&= \braket{(\hat{Q} - \braket{Q})e_n \vert (\hat{Q} - \braket{Q})
 e_n} \nonumber\\
-&= \lVert (Q - \braket{Q}) e_n \rVert^2 \nonumber\\[3mm]
-&\hspace{-8mm}\Leftrightarrow (\hat{Q} - \braket{Q}) \ket{e_n} =
-0 \nonumber\\
+&= \lVert (\hat{Q} - \braket{Q}) e_n \rVert^2 \nonumber\\[3mm]
+&\hspace{-14mm}\iff (\hat{Q} - \braket{Q}) \ket{e_n}
+= 0 \nonumber\\
 \label{eq:observable_eigenrelation}
-&\hspace{-8mm}\Leftrightarrow \hat{Q}\ket{e_n}
+&\hspace{-14mm}\iff \hat{Q}\ket{e_n}
 = \underbrace{\braket{Q}}_{\lambda_n} \ket{e_n}
 .%
 \end{align}%
 %
-Because we have assumed the variance to be zero, the expected value
-$\braket{Q}$ is now the deterministic measurement result
+By setting the variance to zero, the expected value
+$\braket{Q}$ becomes a deterministic measurement result
 corresponding to the determinate state
 $\ket{e_n},~n\in \mathbb{N}$.
-We can see that the determinate states are the \emph{eigenstates} of
-the observable operator $\hat{Q}$ and that the measurement values are
-the corresponding \emph{eigenvalues} $\lambda_n$
+The determinate states are precisely the \emph{eigenstates} of
+the observable operator $\hat{Q}$, and the associated measurement
+values are the corresponding \emph{eigenvalues} $\lambda_n$
 \cite[Sec.~3.3]{griffiths_introduction_1995}.
 
 % Determinate states as a basis
 
-As we are modelling the wave function $\psi(x,t)$ as a vector
+As we model the wave function $\psi(x,t)$ as a vector
 $\ket{\psi}$, we can find a set of basis vectors to decompose it into.
-We can use the determinate states for this purpose, expressing the state as%
+In particular, we can use the determinate states for this purpose,
+expressing the state as%
 \footnote{
 We only consider the case of having a \emph{discrete
 spectrum} here, i.e., having a discrete set of eigenvalues and vectors.
@@ -787,7 +791,7 @@ $Q(x,t,p)$ using a corresponding operator $\hat{Q}$, which allows us
 to compute the expected value as $\braket{Q} = \braket{\psi
 \vert \hat{Q} \psi}$.
 The eigenvectors of $\hat{Q}$ are the determinate states
-$\ket{e_n},~n\in \mathbb{N}$ and the eigenvalues are the respective
+$\ket{e_n},~n\in \mathbb{N}$, and the eigenvalues are the respective
 measurement outcomes.
 We can decompose an arbitrary state as $\ket{\psi} = \sum_{n=1}^{\infty} c_n
 \ket{e_n}$, where $\lvert c_n \rvert ^2$ represents the probability
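The summary in this hunk — $\braket{Q} = \braket{\psi \vert \hat{Q} \psi}$, with eigenstates as determinate states and $\lvert c_n \rvert^2$ as outcome probabilities — can be checked numerically on a toy two-dimensional observable (the matrix below is an arbitrary Hermitian example, not from the thesis):

```python
import numpy as np

# A Hermitian "observable" on a 2-dimensional state space (toy example).
Q = np.array([[1.0, 1.0],
              [1.0, -1.0]])

lam, E = np.linalg.eigh(Q)    # eigenvalues lambda_n, eigenstates |e_n> as columns

psi = np.array([1.0, 0.0])    # a normalized state |psi>
c = E.conj().T @ psi          # coefficients c_n = <e_n|psi>

# <Q> = <psi|Q|psi> must equal sum_n |c_n|^2 * lambda_n:
direct = psi.conj() @ Q @ psi
via_eig = np.sum(np.abs(c) ** 2 * lam)
assert np.isclose(direct, via_eig)
```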
@@ -805,16 +809,16 @@ The measurements we considered in the previous section, for which
 \Cref{eq:gen_expr_Q_exp_lin} holds, belong to the category of
 \emph{projective measurements}.
 For these, certain restrictions such as repeatability apply: the act
-of measuring a quantum state should \emph{collapse} it onto one of
+of measuring a quantum state \emph{collapses} it onto one of
 the determinate states.
-Further measurements should then yield the same value.
-More general methods of modelling measurements exist, e.g., describing
+Further measurements then yield the same value.
+More general methods of modelling measurements exist, e.g.,
 destructive measurements \cite[Box~2.5]{nielsen_quantum_2010}, but
 they are not relevant to this work.
 
 % Projection operators
 
-We can model the collapse of the original state onto one of the
+We model the collapse of the original state onto one of the
 superimposed basis states as a \emph{projection}.
 To see this, we use
 \Cref{eq:determinate_basis,eq:observable_eigenrelation} to compute
@@ -833,9 +837,9 @@ the separate components as
 using \emph{projection operators} \cite[Eq.~3.160]{griffiths_introduction_1995}
 \begin{align*}
 \hat{P}_n := \ket{e_n}\bra{e_n}, \hspace{3mm} n\in \mathbb{N}
-.
+,
 \end{align*}%
-These project a vector onto the subspace spanned by $\ket{e_n}$.
+which project a vector onto the subspace spanned by $\ket{e_n}$.
 
 % % Using projection operators to measure if a state has a component
 % % along a basis vector
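The projection operators $\hat{P}_n = \ket{e_n}\bra{e_n}$ from this hunk are easy to verify numerically; a sketch using the computational basis as the determinate states (chosen for illustration):

```python
import numpy as np

# Orthonormal basis of determinate states (computational basis here).
e0 = np.array([1.0, 0.0])
e1 = np.array([0.0, 1.0])

# Projection operators P_n = |e_n><e_n| as outer products:
P0 = np.outer(e0, e0.conj())
P1 = np.outer(e1, e1.conj())

psi = np.array([3.0, 4.0]) / 5.0   # a normalized superposition

# Projectors are idempotent and resolve the identity:
assert np.allclose(P0 @ P0, P0)
assert np.allclose(P0 + P1, np.eye(2))

# |P_n psi|^2 = |<e_n|psi>|^2 gives the outcome probabilities:
p0 = np.linalg.norm(P0 @ psi) ** 2
p1 = np.linalg.norm(P1 @ psi) ** 2
```

After a measurement with outcome $n$, the collapsed state is $\hat{P}_n \ket{\psi}$ renormalized, so repeating the measurement reproduces the same outcome.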
@@ -861,10 +865,9 @@ These project a vector onto the subspace spanned by $\ket{e_n}$.
 
 % Intro
 
-% TODO: Make sure `quantum gate` is proper terminology
 A central concept for quantum computing is that of the \emph{qubit}.
-We employ it analogously to the classical \emph{bit}.
-For classical computers, we alter bits' states using \emph{gates}.
+It takes the place of the classical \emph{bit}.
+For classical computers, we alter the state of a bit using \emph{gates}.
 We can chain multiple of these gates together to build up more complex logic,
 such as half-adders or eventually a full processor.
 In principle, quantum computers work in a similar fashion, only that
@@ -895,10 +898,10 @@ A qubit is defined to be a system with quantum state
 \alpha \\
 \beta
 \end{pmatrix}
-= \alpha \ket{0} + \beta \ket{1}
+= \alpha \ket{0} + \beta \ket{1}, \hspace{5mm} \alpha,\beta \in \mathbb{C}
 .%
 \end{align}
-The overall state of a composite quantum system is described using
+The overall state of a multi-qubit quantum system is described using
 the \emph{tensor product}, denoted as $\otimes$
 \cite[Sec.~2.2.8]{nielsen_quantum_2010}.
 Take for example the two qubits
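For coordinate vectors, the tensor product used in this hunk is exactly the Kronecker product; a minimal sketch:

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

# The joint state of two qubits is their tensor (Kronecker) product:
ket01 = np.kron(ket0, ket1)   # |0> tensor |1> = |01>

# |01> is the second of the four computational basis vectors:
assert np.allclose(ket01, [0.0, 1.0, 0.0, 0.0])
```

The ordering matters: `np.kron(a, b)` treats `a` as the first (most significant) qubit, matching the convention $\ket{x_0 x_1}$.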
@@ -927,9 +930,9 @@ i.e.,
 .%
 \end{split}
 \end{align}
-We call $\ket{x_0, \ldots, x_n}~, x_i \in \{0,1\}$ the
+We call $\ket{x_0, \ldots, x_n},~x_i \in \{0,1\}$ the
 \emph{computational basis states} \cite[Sec.~4.6]{nielsen_quantum_2010}.
-To additionally simplify set notation, we define
+To simplify set notation, we define
 \begin{align*}
 \mathcal{M}^{\otimes n} := \underbrace{\mathcal{M}\otimes \ldots
 \otimes \mathcal{M}}_{n \text{ times}}
@@ -938,7 +941,7 @@ To additionally simplify set notation, we define
 
 % Entanglement
 
-States that are not able to be decomposed into such products
+States that are not able to be decomposed into products of single-qubit states
 are called \emph{entangled} \cite[Sec.~2.2.8]{nielsen_quantum_2010}.
 An example of such states are the \emph{Bell states}
 \begin{align*}
@@ -976,7 +979,7 @@ we now shift our focus to describing the evolution of their states.
 We model state changes as operators.
 Unlike classical systems, where there are only two possible states and
 thus the only possible state change is a bit-flip, a general qubit
-state as shown in \Cref{eq:gen_qubit_state} lives on a continuum of values.
+state as shown in \Cref{eq:gen_qubit_state} lies on a continuum of values.
 We thus technically also have an infinite number of possible state changes.
 Fortunately, we can express any operator as a linear combination of the
 \emph{Pauli operators} \cite[Sec.~2.2]{gottesman_stabilizer_1997}
@@ -1013,13 +1016,15 @@ Fortunately, we can express any operator as a linear combination of the
 In fact, if we allow for complex coefficients, the $X$ and $Z$
 operators are sufficient to express any other operator as a linear
 combination \cite[Sec.~2.2]{roffe_quantum_2019}.
-$I$ is the identity operator and $X$ and $Z$ are referred to as
+Hereby, $I$ is the identity operator and $X$ and $Z$ are referred to as
 \emph{bit-flips} and \emph{phase-flips} respectively.
-We call the set $\mathcal{G}_n = \left\{ \pm I,\pm \mathrm{i}I, \pm
-X,\pm \mathrm{i}X,
-\pm Y,\pm \mathrm{i}Y, \pm Z, \pm \mathrm{i}Z \right\}^{\otimes n}$
-the \emph{Pauli
-group} over $n$ qubits.
+We call the set
+\begin{align}
+\mathcal{G}_n = \left\{ \pm I,\pm \mathrm{i}I, \pm
+X,\pm \mathrm{i}X,
+\pm Y,\pm \mathrm{i}Y, \pm Z, \pm \mathrm{i}Z \right\}^{\otimes n}
+\end{align}
+the \emph{Pauli group} over $n$ qubits.
 
 In the context of modifying qubit states, we also call operators \emph{gates}.
 When working with multi-qubit systems, we can also apply Pauli gates
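The claims in this hunk — $X$ and $Z$ suffice given complex coefficients, and $X$/$Z$ act as bit- and phase-flips — can be checked directly on the matrices; in particular $Y$ arises as $\mathrm{i}XZ$:

```python
import numpy as np

I = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
Y = 1j * X @ Z                 # Y as a complex combination of X and Z

# X flips a bit, Z flips the phase of |1>:
assert np.allclose(X @ [1, 0], [0, 1])
assert np.allclose(Z @ [0, 1], [0, -1])

# X and Z anticommute: XZ = -ZX.
assert np.allclose(X @ Z, -(Z @ X))
```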
@@ -1049,7 +1054,7 @@ Other important operators include the \emph{Hadamard} and
 \centering
 \begin{align*}
 \begin{array}{c}
-CNOT\text{ Operator} \\
+\text{CNOT Operator} \\
 \hline\\
 \ket{00} \mapsto \ket{00} \\
 \ket{01} \mapsto \ket{01} \\
@@ -1060,7 +1065,9 @@ Other important operators include the \emph{Hadamard} and
 \end{minipage}%
 \end{figure}
 \vspace{-4mm}
-\noindent Many more operators relevant to quantum computing exist, but they are
+\noindent The CNOT operator is a 2-qubit gate that applies a bit-flip to the
+second qubit conditioned on the state of the first one.
+Many more operators relevant to quantum computing exist, but they are
 not covered here as they are not central to this work.
 
 %%%%%%%%%%%%%%%%
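The CNOT truth table in the figure corresponds to a $4 \times 4$ permutation matrix on the computational basis; a sketch:

```python
import numpy as np

# CNOT in the ordered basis |00>, |01>, |10>, |11> (first qubit = control):
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

ket = lambda b: np.eye(4)[b]   # basis states as unit vectors

# The target (second) qubit flips only when the control (first) qubit is 1:
assert np.allclose(CNOT @ ket(0b10), ket(0b11))
assert np.allclose(CNOT @ ket(0b11), ket(0b10))
assert np.allclose(CNOT @ ket(0b01), ket(0b01))
```

Like all quantum gates, CNOT is unitary; here it is even its own inverse.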
@@ -1093,9 +1100,8 @@ The control connection is represented by a vertical line connecting
 the gate to the corresponding qubit, where a filled dot is placed.
 A controlled gate applies the respective operation only if the
 control qubit is in state $\ket{1}$.
-An example of this is the CNOT gate introduced in
-\Cref{subsec:Qubits and Multi-Qubit States}, which is depicted in
-\Cref{fig:cnot_circuit}.
+\Cref{fig:cnot_circuit} depicts an example of this: The CNOT gate
+introduced in \Cref{subsec:Qubits and Multi-Qubit States}.
 
 \begin{figure}[t]
 \centering
@@ -1117,7 +1123,7 @@ An example of this is the CNOT gate introduced in
 
 % General motivation behind QEC
 
-One of the major barriers on the road to building a functioning
+One of the major barriers on the road to building a functioning and scalable
 quantum computer is the inevitability of errors during quantum
 computation. These arise due to the difficulty in sufficiently isolating the
 qubits from external noise \cite[Sec.~1]{roffe_quantum_2019}.
@@ -1126,7 +1132,7 @@ with the environment act as small measurements, an effect called
 \emph{decoherence} of the quantum state
 \cite[Sec.~1]{gottesman_stabilizer_1997}.
 \ac{qec} is one approach to dealing with this problem, by protecting
-the quantum state in a similar fashion to information in classical error
+a quantum state in a similar fashion to information in classical error
 correction.
 
 % The unique challenges of QEC
@@ -1146,9 +1152,10 @@ Three main restrictions apply \cite[Sec.~2.4]{roffe_quantum_2019}:
 
 % General idea (logical vs. physical gates) + notation
 
-Much like in classical error correction, in \ac{qec} information
-is protected by mapping it onto codewords in a higher-dimensional space,
-thereby introducing redundancy.
+Much like in classical error correction, \ac{qec} protects information by
+introducing redundancy.
+The information, represented by a state in a low-dimensional space,
+is mapped onto an encoded state in a higher-dimensional space.
 To this end, $k \in \mathbb{N}$ \emph{logical qubits} are mapped onto
 $n \in \mathbb{N}$ \emph{physical qubits}, $n>k$.
 We circumvent the no-cloning restriction by not copying the state of any of
@@ -1169,8 +1176,9 @@ This is due to the \emph{backlog problem}
 \cite[Sec.~II.G.3.]{terhal_quantum_2015}: There are certain gates
 at which the effect of existing errors on single qubits may be
 exacerbated by transforming them to multi-qubit errors.
-We wish to correct the errors before passing qubits through such gates.
-If the \ac{qec} system is not fast enough, there will be an increasing
+If we ensure decoding with sufficiently low latency, we can correct
+the errors before passing qubits through such gates.
+However, if the \ac{qec} system is not fast enough, there will be an increasing
 backlog of information at this point in the circuit, leading to an
 exponential slowdown in computation.
 
@@ -1200,8 +1208,8 @@ Note that this code is only able to detect single $X$-type errors.

% Measuring stabilizers

To determine if an error occurred, we aim to measure whether a
state belongs
% TODO: Remove footnote?
% \footnote{
%  It is possible for a state to not completely lie in either subspace.
@@ -1210,11 +1218,12 @@ whether a state belongs
% }
to $\mathcal{C}$ or $\mathcal{F}$.
As explained in \Cref{subsec:Observables}, physical measurements
can be mathematically described using operators whose eigenvalues
are the possible measurement results.
Here, we need an operator with two eigenvalues, whose corresponding
eigenspaces are $\mathcal{C}$ and $\mathcal{F}$, respectively.
For the two-qubit repetition code, $Z_1Z_2 \in \mathcal{G}_2$ is such
an operator:
\begin{align}
Z_1Z_2 E \ket{\psi}_\text{L} &= (+1) E \ket{\psi}_\text{L}
\hspace*{3mm} \forall
@@ -1225,13 +1234,14 @@ For the two-qubit code, $Z_1Z_2$ is such an operator:
.%
\end{align}
$E \in \left\{ X,I \right\}$ is an operator describing a possible
single-qubit error and $E \ket{\psi}_\text{L}$ is the resulting state
after that error.
By measuring the corresponding eigenvalue, we can determine if
$E\ket{\psi}_\text{L}$ lies in $\mathcal{C}$ or $\mathcal{F}$.
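This eigenvalue argument can be checked mechanically: two Pauli operators commute or anti-commute depending on the parity of the overlap between the $X$-support of one and the $Z$-support of the other. A minimal sketch in Python (the representation and function names are our own, not from the thesis):

```python
# Represent an n-qubit Pauli by its X- and Z-support bit vectors.
# Two Paulis anticommute iff x1.z2 + z1.x2 is odd (symplectic form).

def symplectic_product(p1, p2):
    (x1, z1), (x2, z2) = p1, p2
    return (sum(a * b for a, b in zip(x1, z2))
            + sum(a * b for a, b in zip(z1, x2))) % 2

# Stabilizer Z1Z2 of the two-qubit repetition code: Z on both qubits.
Z1Z2 = ([0, 0], [1, 1])
# Possible single-qubit errors E: identity, X on qubit 1, X on qubit 2.
I_err = ([0, 0], [0, 0])
X1 = ([1, 0], [0, 0])
X2 = ([0, 1], [0, 0])

# Eigenvalue of the corrupted state under Z1Z2: +1 if E commutes
# with the stabilizer, -1 if it anticommutes.
eigenvalues = {name: (-1) ** symplectic_product(Z1Z2, e)
               for name, e in [("I", I_err), ("X1", X1), ("X2", X2)]}
print(eigenvalues)  # X errors flip the measured eigenvalue to -1
```

The same symplectic test underlies all commutation checks in the stabilizer formalism used in the remainder of this chapter.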
% TODO: If necessary, cite \cite[Sec.~3]{roffe_quantum_2019} for the
% non-compromising measurement of the information
To do this without directly observing, and thus potentially
collapsing, the logical state $\ket{\psi}_\text{L}$, we prepare an
ancilla qubit with state $\ket{0}_\text{A}$ and entangle it with
$\ket{\psi}_\text{L}$ in such a way that the eigenvalue is indicated
by measuring the ancilla qubit instead.
@@ -1296,11 +1306,11 @@ This effect is referred to as error \emph{digitization}
% The stabilizer group

Operators such as $Z_1Z_2$ above are called \emph{stabilizers}.
More generally, an operator $P_i \in \mathcal{G}_n$ is a stabilizer of an
$\llbracket n, k, d_\text{min} \rrbracket$ code $\mathcal{C}$, if
\begin{itemize}
\item It stabilizes all logical states, i.e.,
$P_i\ket{\psi}_\text{L} = (+1)\ket{\psi}_\text{L}, ~\forall~
\ket{\psi}_\text{L} \in \mathcal{C}$.
\item It commutes with all other stabilizers $P_j$ of the code,
i.e., $[P_i, P_j] = 0$.
@@ -1316,8 +1326,8 @@ Formally, we define the \emph{stabilizer group} $\mathcal{S}$ as
[P_i,P_j] = 0 ~\forall~ i,j \right\}
.%
\end{align*}
In particular, we care about the commuting properties of stabilizers
with respect to possible errors.
The measurement circuit for an arbitrary stabilizer $P_i$ modifies
the state as \cite[Eq.~29]{roffe_quantum_2019}
\begin{align*}
@@ -1350,6 +1360,7 @@ If a given error $E$ anticommutes with $P_i$, we have
\end{align*}
and measuring the ancilla $\text{A}_i$ corresponding to stabilizer
$P_i$ returns 1.
Similarly, if it commutes, the ancilla measurement returns 0.

%%%%%%%%%%%%%%%%
\subsection{Stabilizer Codes}
@@ -1357,9 +1368,10 @@ $P_i$ returns 1.

% Structure of a stabilizer code

Stabilizer codes are the quantum analogue of classical binary linear
block codes, for which we use $n-k$ parity checks
to reduce the degrees of freedom introduced by the encoding operation.
Effectively, each parity check defines a local code splitting the
vector space in half, with only one part containing valid codewords.
The global code is the intersection of all local codes.
We can do the same in the quantum case.
@@ -1377,19 +1389,23 @@ operators $P_i$, each using a circuit as explained in
\Cref{subsec:Stabilizer Measurements}.
Note that this is an abstract representation of the syndrome extraction.
For the actual implementation in hardware, we can transform this into
a circuit that requires only CNOT and $H$-gates
\cite[Sec.~10.5.8]{nielsen_quantum_2010}.

% Logical operators

In order to modify the logical state encoded using the physical
qubits, we can use \emph{logical operators} \cite[Sec.~4.2]{roffe_quantum_2019}.
For a $\llbracket n,k \rrbracket$ stabilizer code, there exist
logical operators generated by $2k$ representatives $\overline{X}_i,
\overline{Z}_j,~i,j\in[1:k]$ such that
\begin{itemize}
\item They commute with all stabilizers in $\mathcal{S}$.
\item For $i=j$, they anti-commute with one another, i.e., $[
\overline{X}_i, \overline{Z}_i ]_{+} = \overline{X}_i
\overline{Z}_i + \overline{Z}_i \overline{X}_i = 0$.
\item For $i\neq j$, they commute with one another, i.e., $[ \overline{X}_i,
\overline{Z}_j ] = \overline{X}_i \overline{Z}_j -
\overline{Z}_j \overline{X}_i = 0$.
\end{itemize}
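These conditions are easy to verify with the symplectic anti-commutation test. For the two-qubit repetition code ($k=1$), the choice $\overline{X} = X_1X_2$, $\overline{Z} = Z_1$ is one valid pair of representatives (an illustrative sketch, not code from the thesis):

```python
# Pauli as (x_support, z_support); two Paulis anticommute iff the
# symplectic form x1.z2 + z1.x2 is odd.
def anticommute(p1, p2):
    (x1, z1), (x2, z2) = p1, p2
    return (sum(a * b for a, b in zip(x1, z2))
            + sum(a * b for a, b in zip(z1, x2))) % 2 == 1

stabilizer = ([0, 0], [1, 1])   # Z1Z2
X_bar = ([1, 1], [0, 0])        # logical X: X1X2
Z_bar = ([0, 0], [1, 0])        # logical Z: Z1

# Logical operators commute with every stabilizer ...
assert not anticommute(X_bar, stabilizer)
assert not anticommute(Z_bar, stabilizer)
# ... but anti-commute with each other (the i = j case).
assert anticommute(X_bar, Z_bar)
```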
We can also measure these operators to find out the logical state a
@@ -1399,22 +1415,22 @@ physical state corresponds to \cite[Sec.~2.6]{derks_designing_2025}.

% TODO: Do I have to introduce before that stabilizers only need X
% and Z operators?
We can represent stabilizer codes using a binary \emph{check matrix}
$\bm{H} \in \mathbb{F}_2^{(n-k)\times(2n)}$
\cite[Sec.~10.5.1]{nielsen_quantum_2010} with
\begin{align*}
\bm{H} = \left[
\begin{array}{c|c}
\bm{H}_X & \bm{H}_Z
\end{array}
\right]
.%
\end{align*}
This is similar to a classical \ac{pcm} in that it contains $n-k$
rows, each describing one constraint.
Each constraint restricts an additional degree of freedom of the
higher-dimensional space we use to introduce redundancy.
In contrast to the classical case, this matrix has $2n$ columns,
as we have to consider both the $X$ and $Z$ type operators that make up
the stabilizers.
Take for example the Steane code \cite[Eq.~10.83]{nielsen_quantum_2010}.
@@ -1433,8 +1449,8 @@ We can describe it using the check matrix
\right]
.%
\end{align}
The first $n$ columns correspond to $X$ stabilizers acting on the
corresponding physical qubit, the rest to the $Z$ stabilizers.
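This structure can be checked numerically: both $\bm{H}_X$ and $\bm{H}_Z$ of the Steane code are the parity-check matrix of the $[7,4]$ Hamming code, and the product $\bm{H}_X \bm{H}_Z^\mathsf{T}$ vanishes over $\mathbb{F}_2$, which is exactly what makes the stabilizers mutually commuting. A small sketch in plain Python:

```python
# Parity-check matrix of the [7,4] Hamming code; the Steane code uses
# it for both the X and the Z stabilizers (cf. the check matrix above).
H_hamming = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]
H_X = H_hamming
H_Z = H_hamming

# H_X @ H_Z^T over F_2: entry (i, j) is the parity of the overlap
# between X stabilizer i and Z stabilizer j.
product = [[sum(a * b for a, b in zip(row_x, row_z)) % 2
            for row_z in H_Z] for row_x in H_X]
print(product)  # all-zero: every X stabilizer commutes with every Z one
```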

\begin{figure}[t]
\centering
@@ -1463,27 +1479,27 @@ corresponding physical qubit, the rest to the $Z$ operators.

% Intro

Stabilizer codes are especially practical to work with when the
stabilizers can be split into one subset consisting only of
$Z$ stabilizers and one consisting only of $X$ stabilizers.
As $Z$ errors anti-commute with $X$ operators in the stabilizers and
vice versa, this property translates into being able to correct $X$
or $Z$ errors independently.
We call such codes \ac{css} codes.
We can see this property in \Cref{eq:steane} in the check matrix
of the Steane code.

% Construction

We can exploit this separate consideration of $X$ and $Z$ stabilizers in
the construction of \ac{css} codes.
We combine two binary linear codes $\mathcal{C}_1$ and
$\mathcal{C}_2$, each responsible for correcting either $Z$ or $X$ errors
\cite[Sec.~10.5.6]{nielsen_quantum_2010}.
Using the dual code of $\mathcal{C}_2$ \cite[Eq.~3.4]{ryan_channel_2009}
\begin{align*}
\mathcal{C}_2^\perp := \left\{ \bm{x}' \in \mathbb{F}_2^n :
\bm{x}' \bm{x}^\mathsf{T} = 0 ~\forall~ \bm{x} \in \mathcal{C}_2 \right\}
,%
\end{align*}
we define $\bm{H}_X$ as the \ac{pcm} of $\mathcal{C}_2^\perp$ and $\bm{H}_Z$
@@ -1501,7 +1517,7 @@ In order to yield a valid stabilizer code, $\mathcal{C}_1$ and
$\mathcal{C}_2$ must satisfy the commutativity condition
\begin{align}
\label{eq:css_condition}
\bm{H}_X \bm{H}_Z^\mathsf{T} = \bm{0}
.%
\end{align}
We can ensure this by choosing $\mathcal{C}_1$ and $\mathcal{C}_2$
@@ -1516,15 +1532,15 @@ such that $\mathcal{C}_2 \subset \mathcal{C}_1$.
Various methods of constructing \ac{qec} codes exist
\cite{swierkowska_eccentric_2025}.
Topological codes, for example, encode information in the features of
a lattice in a way that allows for local interactions between qubits.
Among these, the \emph{surface code} is the most widely studied.
Another example is that of concatenated codes, which nest one code within
another, allowing for especially simple and flexible constructions
\cite[Sec.~3.2]{swierkowska_eccentric_2025}.
An area of research that has recently seen more attention is that of
quantum \ac{ldpc} (\acs{qldpc}) codes.
They have a much higher rate than, e.g., surface codes, the scaling
up of which would be prohibitively expensive
\cite[Sec.~I]{bravyi_high-threshold_2024}.

% Bivariate Bicycle codes
@@ -1536,7 +1552,7 @@ $\bm{H}_Z$ are constructed from two matrices $\bm{A}$ and $\bm{B}$ as
\begin{align*}
\bm{H}_X = [\bm{A} \vert \bm{B}]
\hspace*{5mm} \text{and} \hspace*{5mm}
\bm{H}_Z = [\bm{B}^\mathsf{T} \vert \bm{A}^\mathsf{T}]
.%
\end{align*}
This way, we can guarantee the satisfaction of the commutativity
@@ -1576,16 +1592,17 @@ This necessitates a modification of the standard \ac{bp} algorithm
introduced in \Cref{subsec:Iterative Decoding}
\cite[Sec.~3.1]{yao_belief_2024}.
Instead of attempting to find the most likely codeword directly, the
syndrome-based decoding algorithm tries to find an error pattern
$\hat{\bm{e}} \in \mathbb{F}_2^n$ that satisfies
\begin{align*}
\bm{H} \hat{\bm{e}}^\mathsf{T} = \bm{s}
.%
\end{align*}
To this end, we initialize the channel \acp{llr} as
\begin{align*}
\tilde{L}_i = \log{\frac{P(X_i = 0)}{P(X_i = 1)}} = \log{
\left( \frac{1 - p_i}{p_i} \right)
}
,%
\end{align*}
where $p_i$ is the prior probability of error of \ac{vn} $i$.
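The initialization and the stopping criterion of the resulting decoder can be sketched in a few lines (plain Python with our own variable names; the BP message-passing iterations themselves are omitted):

```python
import math

# Prior error probabilities of the VNs (illustrative values).
p = [0.01, 0.05, 0.01]

# Channel LLR initialization: L_i = log((1 - p_i) / p_i).
llr = [math.log((1 - pi) / pi) for pi in p]

# Stopping criterion of syndrome-based BP: H e_hat^T == s over F_2.
def syndrome_satisfied(H, e_hat, s):
    return all(sum(h * e for h, e in zip(row, e_hat)) % 2 == si
               for row, si in zip(H, s))

H = [[1, 1, 0],
     [0, 1, 1]]
s = [0, 1]         # measured syndrome
e_hat = [0, 0, 1]  # candidate error pattern
print(syndrome_satisfied(H, e_hat, s))  # True: e_hat explains s
```

Note that a less reliable \ac{vn} (larger $p_i$) starts with a smaller LLR magnitude, as expected.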
@@ -1642,7 +1659,7 @@ The resulting syndrome-based \ac{bp} algorithm is shown in
\right\}$
\EndFor

\If{$\bm{H}\hat{\bm{e}}^\mathsf{T} = \bm{s}$}
\State \textbf{break}
\EndIf

@@ -1721,7 +1738,7 @@ This way, we obtain the \ac{ler}.
\mathbbm{1}\left\{ L^\text{total}_i \right\}$
\EndFor

\If{$\bm{H}\hat{\bm{e}}^\mathsf{T} = \bm{s}$}
\State \textbf{break}
\Else
\State $i_\text{max} \leftarrow \argmax_{i \in \mathcal{I}'} \lvert L^\text{total}_i \rvert $

@@ -16,17 +16,19 @@ using qubits.
While the use of error correcting codes may facilitate this, it also
introduces two new challenges \cite[Sec.~4]{gottesman_introduction_2009}:
\begin{itemize}
\item To realize a quantum algorithm, we must be able to
perform operations on the encoded state in such a way that we
do not lose the protection against errors.
\item \ac{qec} systems, in particular the syndrome extraction
circuit, are themselves partially implemented in
quantum hardware.
In addition to the errors we have originally introduced them
for, these systems must therefore be able to account for the
fact that they are implemented on noisy hardware themselves.
\end{itemize}
In the literature, both of these points are viewed under the umbrella
of \emph{fault-tolerant} quantum computing.
In this thesis, we focus on the second aspect.

It was recognized early on as a challenge of \ac{qec} that the correction
machinery itself may introduce new faults \cite[Sec.~III]{shor_scheme_1995}.
@@ -43,16 +45,16 @@ address both.
We model the possible occurrence of errors during any processing
stage as different \emph{error locations} $E_i,~i\in [1:N]$
in the circuit.
The parameter $N \in \mathbb{N}$ is the total number of considered
error locations.
The \emph{circuit error vector} $\bm{e} \in \{0,1\}^N$ is a vector
indicating which errors occurred, with
\begin{align*}
e_i :=
\begin{cases}
1, & \text{error $E_i$ occurred}, \\
0, & \text{otherwise}.
\end{cases}
\end{align*}
\Cref{fig:fault_tolerance_overview} illustrates the flow of errors.
Specifically for \ac{css} codes, a \ac{qec} procedure is deemed
@@ -72,12 +74,14 @@ fault-tolerant, if \cite[Def.~4.2]{derks_designing_2025}
where $t = \lfloor (d_\text{min} -1)/2 \rfloor$ is the number of
errors the code is able to correct.
The vectors $\bm{e}_{\text{output},X}$ and $\bm{e}_{\text{output},Z}$
denote only $X$ and $Z$ errors, respectively.

% TODO: Properly introduce d_min for QEC, specifically for CSS codes
In order to deal with internal errors that flip syndrome bits,
multiple rounds of syndrome measurements are performed.
Typically, the number of syndrome extraction rounds is chosen as
$d_\text{min}$, e.g., \cite{gong_toward_2024}%
\cite{koutsioumpas_automorphism_2025}.

% % This is the definition of a fault-tolerant QEC gadget
% A \ac{qec} procedure is deemed fault tolerant if
@@ -150,7 +154,7 @@ Typically, the number of syndrome extraction rounds is chosen as $d_\text{min}$.
% Intro

We collect the probabilities of error at each location in the
\emph{noise model}, represented by a vector $\bm{p} \in [0,1]^N$.
There are different types of noise models, each allowing for
different error locations in the circuit.
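In simulations, each entry of $\bm{p}$ drives an independent Bernoulli draw for the corresponding error location. A sketch with our own names and illustrative values (a real simulator would additionally propagate the sampled errors through the circuit):

```python
import random

random.seed(0)

# Noise model: probability of error at each of the N locations.
p = [0.001, 0.01, 0.05, 0.001, 0.02]

def sample_circuit_error_vector(p, rng=random):
    # e_i = 1 with probability p_i (error E_i occurred), else 0.
    return [1 if rng.random() < pi else 0 for pi in p]

# Monte Carlo estimate of the expected number of errors per shot.
shots = 20000
total = sum(sum(sample_circuit_error_vector(p)) for _ in range(shots))
print(total / shots)  # close to sum(p) = 0.082
```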
@@ -178,8 +182,7 @@ $\ket{\psi}_\text{L}$ as \emph{data qubits}.
Note that this is a concrete implementation using CNOT gates, as
opposed to the system-level view introduced in
\Cref{subsec:Stabilizer Codes}.
\Cref{fig:noise_model_types} visualizes the different types of noise models.

%%%%%%%%%%%%%%%%
\subsection{Bit-Flip Noise}
@@ -190,7 +193,7 @@ This corresponds to the classical \ac{bsc}, i.e., only $X$ errors on the
data qubits are possible \cite[Appendix~A]{gidney_new_2023}.
The occurrence of bit-flip errors is modeled as a Bernoulli process
$\text{Bern}(p)$.
\Cref{subfig:bit_flip} shows this type of noise model.

Note that bit-flip noise is not suitable for developing fault-tolerant
systems, as it does not account for errors during the syndrome extraction.
@@ -223,7 +226,7 @@ Here, we consider multiple rounds of syndrome measurements with a
depolarizing channel before each round.
Additionally, we allow for measurement errors by having $X$ error
locations right before each measurement \cite[Appendix~A]{gidney_new_2023}.
Note that it is enough to only consider $X$ errors before measuring,
since that is the only type of error directly affecting the
measurement outcomes.
This model is depicted in \Cref{subfig:phenomenological}.
@@ -253,7 +256,7 @@ While phenomenological noise is useful for some design aspects of
fault-tolerant circuitry, for simulations, circuit-level noise should
always be used \cite[Sec.~4.2]{derks_designing_2025}.
Note that this introduces new challenges during the decoding process,
as the decoding complexity is considerably increased due to the many
error locations.

\begin{figure}[t]
@@ -284,11 +287,11 @@ error locations.
framework for
passing information about a circuit used for \ac{qec} to a decoder.
They are also useful as a theoretical tool to aid in the design of
fault-tolerant \ac{qec} schemes; e.g., they can be used to easily
determine whether a measurement schedule is fault-tolerant
\cite[Example~12]{derks_designing_2025}.

Other approaches to implementing fault tolerance exist, e.g.,
flag error correction, which uses additional ancilla qubits to detect
potentially damaging high-weight errors \cite[Sec.~1]{chamberland_flag_2018}.
However, \acp{dem} offer some unique advantages
@@ -310,7 +313,7 @@ To achieve fault tolerance, the goal we strive towards is to
consider the internal errors in addition to the input errors during
the decoding process.
The core idea behind detector error models is to do this by defining
a new \emph{circuit code} describing the whole circuit.
Each \ac{vn} of this new code corresponds to an error location in the
circuit and each \ac{cn} corresponds to a syndrome measurement.
% This circuit code, combined with the prior probabilities of error
@@ -446,12 +449,11 @@ matrix} $\bm{\Omega} \in \mathbb{F}_2^{M\times N}$, with
\begin{align*}
\Omega_{\ell,i} =
\begin{cases}
1, & \text{error $i$ flips measurement $\ell$}, \\
0, & \text{otherwise},
\end{cases}
\end{align*}
where $M \in \mathbb{N}$ is the number of performed syndrome measurements.
To obtain $\bm{\Omega}$, we must propagate Pauli errors through the
circuit, tracking which measurements they affect
\cite[Sec.~2.4]{derks_designing_2025}.
@@ -466,8 +468,8 @@ Each round yields an additional set of syndrome bits,
and we combine them by stacking them in a new vector
$\bm{s} \in \mathbb{F}_2^{R(n-k)}$, where $R \in \mathbb{N}$ is the
number of syndrome measurement rounds.
Thus, we have to replicate the rows of $\bm{H}_Z$, once for each
additional syndrome measurement, and obtain
\begin{align*}
\bm{\Omega}_0 =
\begin{pmatrix}
@@ -493,11 +495,11 @@ extraction circuitry, so we still consider only bit flip noise at this stage.
Recall that $\bm{\Omega}_0$ describes which \ac{vn} is connected to
which parity check and the syndrome indicates which parity checks
are violated.
Therefore, if an error occurs that corresponds to a single \ac{vn},
the measured syndrome is the corresponding column.
If errors occur at multiple locations, the resulting syndrome will be
the linear combination of the respective columns.
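This column-combination view is easy to verify for a single round of the $[3,1]$ repetition code, where $\bm{\Omega}_0 = \bm{H}_Z$ (a plain-Python sketch with illustrative matrices):

```python
# One round of syndrome extraction for the [3,1] repetition code.
omega0 = [[1, 1, 0],
          [0, 1, 1]]

def syndrome(omega, e):
    # s = Omega e^T over F_2.
    return [sum(h * ei for h, ei in zip(row, e)) % 2 for row in omega]

column = lambda m, j: [row[j] for row in m]

# A single error at VN 0 reproduces column 0 of Omega_0 ...
assert syndrome(omega0, [1, 0, 0]) == column(omega0, 0)
# ... and two errors give the XOR (linear combination) of columns.
s = syndrome(omega0, [1, 0, 1])
assert s == [(a + b) % 2 for a, b in zip(column(omega0, 0),
                                         column(omega0, 2))]
print(s)  # [1, 1]
```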
Thus, we have
\begin{align*}
\bm{s} \in \text{span} \{\bm{\Omega}_0\}
.%
@@ -505,13 +507,13 @@ We thus have

% Expand to phenomenological

Next, we expand the error model to phenomenological noise, though
only considering $X$ errors in this case.
We introduce new error locations at the appropriate positions,
resulting in the circuit depicted in
\Cref{fig:rep_code_multiple_rounds_phenomenological}.
For each additional error location, we extend $\bm{\Omega}_0$ by
appending the corresponding syndrome vector as a column, yielding
\begin{gather}
\label{eq:syndrome_matrix_ex}
\bm{\Omega}_1 =
@@ -668,7 +670,7 @@ extraction round.

\begin{figure}[t]
\begin{gather*}
\hspace*{-31.8mm}%
\begin{array}{c}
E_6 \\
\downarrow
@@ -790,15 +792,14 @@ to a detector.
We should note at this point that the combination of measurements
into detectors has no bearing on the actual construction of the
syndrome extraction circuitry.
It is something that happens ``virtually'' and only affects the decoder.

Note that we can use the detector matrix $\bm{D}$ to describe the set
of possible measurement outcomes under the absence of noise.
Similar to the way we use a \ac{pcm} to describe the code space as
\begin{equation*}
\mathcal{C}
= \{ \bm{x} \in \mathbb{F}_2^{n} : \bm{H}\bm{x}^\mathsf{T} = \bm{0} \}
,%
\end{equation*}
the set of possible measurement outcomes is simply $\text{kern}\{\bm{D}\}$
@@ -815,7 +816,7 @@ affect the measurements (through $\bm{\Omega}$), and we know how the
measurements relate to the detectors (through $\bm{D}$).
For decoding, we are interested in the effect of the errors on the
detectors directly.
Thus, we construct the \emph{detector error matrix} $\bm{H} \in
\mathbb{F}_2^{D\times N}$ \cite[Def.~2.9]{derks_designing_2025} as
\begin{align*}
\bm{H} := \bm{D}\bm{\Omega}
@@ -843,10 +844,10 @@ violate the same set of detectors, i.e.,
\begin{align*}
\hspace{-15mm}
% tex-fmt: off
&& \bm{H} \bm{e}_1^\mathsf{T} & \neq \bm{H} \bm{e}_2^\mathsf{T} \\
\iff \hspace{-33mm} && \bm{H} \left( \bm{e}_1 - \bm{e}_2 \right)^\mathsf{T} & \neq 0 \\
\iff \hspace{-33mm} && \bm{D} \bm{\Omega} \left( \bm{e}_1 - \bm{e}_2 \right)^\mathsf{T} & \neq 0 \\
\iff \hspace{-33mm} && \bm{\Omega} \left( \bm{e}_1 - \bm{e}_2 \right)^\mathsf{T} & \notin \text{kern} \{\bm{D}\}
% tex-fmt: on
.%
\end{align*}
@@ -859,7 +860,7 @@ It may, however, change the decoding performance when using a practical decoder.

What constitutes a good set of detectors is difficult to assess
without performing explicit decoding simulations, since it ultimately
depends on the employed decoder.
For iterative decoders, high sparsity is generally beneficial, but
finding detectors that maximize sparsity is an NP-complete problem
\cite[Sec.~2.6]{derks_designing_2025}.
@@ -868,7 +869,7 @@ at a later stage.
To the measurement results from each syndrome extraction round we
can add the results from the previous round, as illustrated in
\Cref{fig:detectors_from_measurements_general}.
Thus, we have $D=n-k$.
Concretely, we denote the outcome of
measurement $\ell \in [1:n-k]$ in round $r \in [1:R]$ by
$m_\ell^{(r)} \in \mathbb{F}_2$
@@ -935,9 +936,10 @@ note that the error $E_6$ in
\Cref{fig:rep_code_multiple_rounds_phenomenological} has not only
triggered the measurements in the syndrome extraction round immediately
afterwards, but all subsequent ones as well.
To only see the effect of errors in the syndrome measurement round
immediately following them, we consider our newly defined detectors
instead of the measurements.
These effectively compute the difference between the measurements.

Each error can only trigger syndrome bits that follow it.
This is reflected in the triangular structure of $\bm{\Omega}$ in
@@ -945,7 +947,7 @@ This is reflected in the triangular structure of $\bm{\Omega}$ in
Combining the measurements into detectors according to
\Cref{eq:measurement_combination}, we are effectively performing
row additions in such a way as to clear the bottom left of the matrix.
The resulting detector error matrix
\begin{align*}
\bm{H} =
\left(
@@ -959,7 +961,7 @@ The detector error matrix
\end{array}
\right)
\end{align*}
has a block-diagonal structure.
Note that we exploit the fact that each syndrome measurement round is
identical to obtain this structure.
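The cancellation of the bottom-left block can be reproduced on a toy example: two rounds of the $[3,1]$ repetition code with data errors only (our own illustrative matrices, not the thesis' full construction):

```python
# Two rounds of the [3,1] repetition code, phenomenological data
# errors only: three error locations before each round.
Hz = [[1, 1, 0],
      [0, 1, 1]]

def matmul_f2(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % 2
             for col in zip(*B)] for row in A]

# Omega: errors before round 1 trigger both rounds' measurements,
# errors before round 2 only the second round (triangular structure).
omega = [row + [0, 0, 0] for row in Hz] + [row + row for row in Hz]

# Detector matrix: round-1 detectors are the raw measurements,
# round-2 detectors add (XOR) the previous round's measurements.
D = [[1, 0, 0, 0],
     [0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1]]

H = matmul_f2(D, omega)
for row in H:
    print(row)
# The bottom-left block cancels: H = diag(Hz, Hz), block-diagonal.
```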
@@ -1008,9 +1010,8 @@ error matrix $\bm{H}$ and the noise model $\bm{p}$.
\cite[Sec.~6]{derks_designing_2025}.
It serves as an abstract representation of a circuit and can be used
both to transfer information to a decoder and to aid in the
design of fault-tolerant systems; e.g., it can be used to investigate
the properties of a circuit with respect to fault tolerance.
It contains all information necessary for the decoding process.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -1052,7 +1053,7 @@ value, the physical error rate $p_\text{phys}$.

% Per-round LER

Another important aspect to consider is the meaning of the
\ac{ler} in the context of a \ac{qec} system with multiple
rounds of syndrome measurements.
In order to facilitate the comparability of results obtained from
@@ -1063,7 +1064,7 @@ The simplest way of calculating the per-round \ac{ler} is by modeling
each round as an independent experiment.
For each experiment, an error might occur with a certain probability
$p_\text{e,round}$.
Then the overall probability of error is
\begin{align}
\hspace{-12mm}
p_\text{e,total} &= 1 - (1 - p_\text{e,round})^{R} \nonumber\\
@@ -1073,13 +1074,14 @@ The overall probability of error is then
|
||||
.%
|
||||
\hspace{12mm}
|
||||
\end{align}
|
||||
We approximate $p_\text{e,total}$ using a Monte Carlo simulation and
|
||||
compute the per-round-\ac{ler} using \Cref{eq:per_round_ler}.
|
||||
To this end, we approximate $p_\text{e,total}$ using a Monte Carlo
|
||||
simulation and
|
||||
compute the per-round-\ac{ler} according to \Cref{eq:per_round_ler}.
|
||||
This is the approach taken in \cite{gong_toward_2024}\cite{wang_fully_2025}.
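The independence model above can be sketched in a few lines; this is an illustrative helper (not from the cited works), which simply inverts $p_\text{e,total} = 1 - (1 - p_\text{e,round})^{R}$ for the per-round rate.

```python
# Illustrative sketch: per-round LER from a Monte Carlo estimate of the
# total error probability, assuming R independent rounds, i.e. the
# rearrangement p_round = 1 - (1 - p_total)**(1/R).

def per_round_ler(p_total: float, num_rounds: int) -> float:
    """Invert p_total = 1 - (1 - p_round)**num_rounds for p_round."""
    return 1.0 - (1.0 - p_total) ** (1.0 / num_rounds)

# Example: a Monte Carlo estimate of 10% total failures over 12 rounds.
p_round = per_round_ler(p_total=0.10, num_rounds=12)
```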

Another approach \cite{chen_exponential_2021}%
\cite{bausch_learning_2024}\cite{beni_tesseract_2025} is to assume an
exponential decay for the decoder's \emph{logical fidelity}
exponential decay for the \emph{logical fidelity} of the decoder
\cite[Eq.~(2)]{bausch_learning_2024}
\begin{align*}
F_\text{total} = (F_\text{round})^{R}
@@ -1104,10 +1106,10 @@ topic to our own work.
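The fidelity-decay alternative can be sketched analogously. The mapping between fidelity and error probability below, $F = 1 - 2 p_\text{e}$, is a common convention but an assumption here; the exact definitions used in the cited works may differ.

```python
# Illustrative sketch of the fidelity-decay approach, assuming the
# convention F = 1 - 2 * p_e (an assumption; the cited works' exact
# definitions may differ).  F_total = (F_round)**R is inverted to
# obtain a per-round error probability.

def per_round_ler_from_fidelity(p_total: float, num_rounds: int) -> float:
    f_total = 1.0 - 2.0 * p_total            # total logical fidelity
    f_round = f_total ** (1.0 / num_rounds)  # F_total = (F_round)**R
    return (1.0 - f_round) / 2.0
```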
\subsection{Stim}
\label{subsec:Stim}

It is not immediately apparent how the \ac{dem} will look from looking
at a code's \ac{pcm}, because it heavily depends on the exact circuit
construction and choice of noise model.
As we noted in \Cref{subsec:Measurement Syndrome Matrix}, we can
It is not immediately apparent how the \ac{dem} will look from
considering the \ac{pcm} of a code, because it heavily depends on the
exact circuit construction and choice of noise model.
As we noted in \Cref{subsec:Measurement Syndrome Matrix}, we
obtain a measurement syndrome matrix by propagating Pauli frames
through the circuit.
The standard choice of simulation tool used for this purpose is
@@ -1118,16 +1120,16 @@ pypi package.
In fact, it was in this tool that the concept of the \ac{dem} was
first introduced.

One capability of stim, and \acp{dem} in general, that we didn't go
into detail about in this chapter is the merging of error mechanisms.
One capability of stim, and \acp{dem} in general, that we did not
explain in detail in this chapter, is the merging of error mechanisms.
Since \acp{dem} differentiate errors based on their effect on the
measurements and not on their Pauli type and location
\cite[Sec.~1.4.3]{higgott_practical_2024}, it is natural to group
errors that have the same effect.
errors that have the same effect, i.e., syndrome.
This slightly lowers the computational complexity of decoding, as the
number of resulting \acp{vn} is reduced.
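The probability rule behind such a merge can be sketched for two mechanisms: when two independent faults flip the same set of detectors, the merged mechanism is observed when an odd number of them occurs. This is an illustrative sketch of that XOR-type combination, not the thesis implementation.

```python
# Illustrative sketch: merging two independent error mechanisms that
# flip the same set of detectors.  The merged mechanism triggers when
# exactly one of the two originals occurs (their effects cancel over
# GF(2) when both occur).

def merged_probability(p1: float, p2: float) -> float:
    """Probability that an odd number of two independent faults occurs."""
    return p1 * (1.0 - p2) + (1.0 - p1) * p2
```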

While stim is a useful tool for circuit simulation, it doesn't
While stim is a useful tool for circuit simulation, it does not
include many utilities for building syndrome extraction circuitry automatically.
The user has to define most, if not all, of the circuit manually,
depending on the code in question.

@@ -470,7 +470,7 @@ model and is difficult to predict beforehand.
The block-diagonal structure reflects the time-like locality
of the syndrome extraction circuit, with each block
corresponding to one syndrome measurement round.
Two consecutive windows are highlighted: the window size $W
Two consecutive windows are highlighted: The window size $W
\in \mathbb{N}$ controls the number of syndrome rounds
included in each window, while the step size $F \in
\mathbb{N}$ controls how many rounds separate the start of
@@ -701,7 +701,7 @@ estimates committed after decoding window $\ell$, we have to set
\begin{align*}
\left(\bm{s}\right)_{\mathcal{J}_\text{overlap}^{(\ell)}} =
\bm{H}_\text{overlap}^{(\ell)}
\left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\text{T}
\left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\mathsf{T}
.%
\end{align*}
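The assignment above can be sketched over GF(2); the small matrix, syndrome, and index set below are toy values for illustration, not taken from the thesis.

```python
# Illustrative GF(2) sketch of the syndrome update above: the checks in
# J_overlap are set to the effect of the committed window estimate,
# H_overlap @ e_commit (mod 2).  All concrete values are toy examples.

def gf2_matvec(H, e):
    """Matrix-vector product over GF(2); H is a list of 0/1 rows."""
    return [sum(h * x for h, x in zip(row, e)) % 2 for row in H]

def update_overlap_syndrome(s, J_overlap, H_overlap, e_commit):
    """Assign the committed estimate's effect to the shared checks."""
    effect = gf2_matvec(H_overlap, e_commit)
    for j, bit in zip(J_overlap, effect):
        s[j] = bit
    return s

s = update_overlap_syndrome(
    s=[1, 1, 0, 1],
    J_overlap=[2, 3],
    H_overlap=[[1, 0, 1], [0, 1, 1]],
    e_commit=[1, 1, 0],
)
```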

@@ -986,7 +986,7 @@ Note that the decoding procedure performed on the individual windows
\State $\displaystyle\left(\hat{\bm{e}}^\text{total}\right)_{\mathcal{I}^{(\ell)}_\text{commit}} \leftarrow \hat{\bm{e}}^{(\ell)}_\text{commit}$
\State $\displaystyle\left(\bm{s}\right)_{\mathcal{J}_\text{overlap}^{(\ell)}}
\leftarrow \bm{H}_\text{overlap}^{(\ell)}
\left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\text{T}$
\left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\mathsf{T}$
\If{$\ell < n_\text{win} - 1$}
\State $L^{(\ell+1)}_{i\leftarrow j} \leftarrow
L^{(\ell)}_{i\leftarrow j}
@@ -1013,8 +1013,8 @@ the most reliable \ac{vn}, meaning we perform a hard decision and
remove it from the following decoding process.

This means that when moving from one window to the next, we now have
more information available: not just the \ac{bp} messages but also the
information about what \acp{vn} were decimated and to what values.
more information available: Not just the \ac{bp} messages but also the
information about what \acp{vn} were decimated and to what values.
We call this \emph{decimation information} in the following.
We can extend \Cref{alg:warm_start_bp} by additionally passing the
decimation information after initializing the \ac{cn} to \ac{vn} messages.
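The state such an extended warm start could carry between windows can be sketched as a pair of containers: the \ac{bp} messages plus the decimation information. The field names below are illustrative, not taken from the thesis implementation.

```python
# Illustrative sketch of warm-start state carried between windows:
# the BP messages plus the "decimation information", i.e. which
# variable nodes were hard-decided and to what value.  Field names
# are hypothetical, not from the thesis implementation.

from dataclasses import dataclass, field

@dataclass
class WarmStartState:
    messages: dict = field(default_factory=dict)   # (cn, vn) -> LLR message
    decimated: dict = field(default_factory=dict)  # vn index -> 0 or 1

state = WarmStartState()
state.messages[(0, 3)] = 1.7   # CN 0 -> VN 3 message from the last window
state.decimated[3] = 1         # VN 3 was decimated to value 1
```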

@@ -1184,7 +1184,7 @@ decimation information after initializing the \ac{cn} to \ac{vn} messages.
% \State $\displaystyle\left(\hat{\bm{e}}^\text{total}\right)_{\mathcal{I}^{(\ell)}_\text{commit}} \leftarrow \hat{\bm{e}}^{(\ell)}_\text{commit}$
% \State $\displaystyle\left(\bm{s}\right)_{\mathcal{J}_\text{overlap}^{(\ell)}}
% \leftarrow \bm{H}_\text{overlap}^{(\ell)}
% \left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\text{T}$
% \left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\mathsf{T}$
% \If{$\ell < n_\text{win} - 1$}
% \State $L^{(\ell+1)}_{i\leftarrow j} \leftarrow
% L^{(\ell)}_{i\leftarrow j}
@@ -1404,7 +1404,7 @@ The fact that the $W = 5$ curve is already very close to the
whole-block decoder indicates that the marginal benefit of enlarging
the window saturates after a certain point.
Thus, from a practical standpoint, the choice of $W$ represents a
trade-off between decoding latency and accuracy: larger windows
trade-off between decoding latency and accuracy: Larger windows
delay the start of decoding by requiring more syndrome extraction
rounds to be collected upfront, while the diminishing returns above
$W = 4$ suggest that growing the window much further yields little
@@ -1511,7 +1511,7 @@ The dashed colored curves reproduce the cold-start results from
corresponding warm-start runs for the same window sizes
$W \in \{3, 4, 5\}$.
The remaining experimental parameters are unchanged:
the step size is fixed to $F = 1$,
The step size is fixed to $F = 1$,
the inner \ac{bp} decoder is allowed up to $200$ iterations per
window invocation, the black curve again gives the whole-block
reference, and the physical error rate is swept from $p = 0.001$ to
@@ -1707,7 +1707,7 @@ $n_\text{iter} \in [32, 512]$.

All curves decrease monotonically with the iteration budget, but
contrary to our expectation, none of them appears to fully saturate
within the swept range: even at $n_\text{iter} = 4096$, every curve
within the swept range: Even at $n_\text{iter} = 4096$, every curve
still exhibits a noticeable downward slope.
At $n_\text{iter} = 32$, the whole-block curve lies below both the
$W=4$ and $W=5$ sliding-window curves.
@@ -1729,7 +1729,7 @@ mirroring the behavior already observed in \Cref{fig:whole_vs_cold_vs_warm}.
These observations are largely consistent with the effective-iterations
hypothesis put forward above.
The whole-block decoder eventually overtaking every windowed scheme
matches the prediction made there: with a sufficiently large
matches the prediction made there: With a sufficiently large
iteration budget, the whole-block decoder reaches an error rate
that none of the windowed schemes can beat, because of the more global
nature of the considered constraints.
@@ -1767,7 +1767,7 @@ sliding-window approach is still at an advantage.
Having examined the effect of the window size $W$, we next turn to
the second windowing parameter, the step size $F$.
We carry out an investigation analogous to the one above:
we first compare warm- and cold-start decoding across the full range
We first compare warm- and cold-start decoding across the full range
of physical error rates at a fixed iteration budget, and then we
examine the dependence on the iteration budget at a fixed physical
error rate.

@@ -1994,7 +1994,7 @@ At fixed $F$, the warm-start approach lies below
cold-start across the entire sweep, and at fixed
warm or cold start, smaller $F$ produces a lower \ac{ler}.
Both gaps grow as the physical error rate decreases:
the curves at $F = 1$ separate further from those at $F = 2$ and $F = 3$,
The curves at $F = 1$ separate further from those at $F = 2$ and $F = 3$,
and the warm-start curves separate further from the cold-start ones.
In \Cref{fig:bp_f_over_iter}, all six curves again decrease
monotonically with the iteration budget, with no clear saturation
@@ -2016,7 +2016,7 @@ With $W$ held fixed, decreasing $F$ enlarges the overlap between
consecutive windows from $W - F$ to $W - F + 1$ syndrome measurement rounds, so
a smaller step size is beneficial for the same reason that a larger
window size is:
each \ac{vn} in an overlap region participates in more window
Each \ac{vn} in an overlap region participates in more window
invocations, and the warm-start modification effectively accumulates
iterations on it across these invocations.
The widening of the warm/cold gap towards low iteration counts and
@@ -2281,7 +2281,7 @@ This is the opposite of what we observed for plain \ac{bp}, where
warm-start improved upon cold-start at every parameter setting.
The gap between the warm- and cold-start curves additionally widens
as the physical error rate decreases:
at the lowest sampled rate $p = 0.001$, the per-round \ac{ler} of the
At the lowest sampled rate $p = 0.001$, the per-round \ac{ler} of the
warm-start runs is more than two orders of magnitude above that of
the corresponding cold-start runs.
In \Cref{fig:bpgd_w}, larger window sizes yield lower per-round
@@ -2300,13 +2300,13 @@ than its cold-start counterpart is surprising in light of the results
for plain \ac{bp}, where the warm-start modification was uniformly beneficial.
The dependence on the window size in \Cref{fig:bpgd_w} is, on its own,
consistent with the same explanation that we gave for
\Cref{fig:whole_vs_cold}: larger windows expose the inner decoder to
\Cref{fig:whole_vs_cold}: Larger windows expose the inner decoder to
a larger fraction of the constraints encoded in the detector error
matrix at the time of decoding, and this benefits both warm- and
cold-start decoding.
The dependence on the step size in \Cref{fig:bpgd_f}, however, is the
opposite of the corresponding dependence under plain \ac{bp}
(\Cref{fig:bp_f_over_p}): for warm-start, smaller $F$ now degrades performance
(\Cref{fig:bp_f_over_p}): For warm-start, smaller $F$ now degrades performance
rather than helps, even though smaller $F$ implies a larger overlap
in both cases.

@@ -2564,7 +2564,7 @@ the warm-start curves now show a clear reordering as $n_\text{iter}$
grows.
At low iteration budgets the warm-start ordering matches the
cold-start ordering, with $F = 1$ best and $F = 3$ worst, but at the
largest iteration budget this ordering is fully inverted: warm-start
largest iteration budget this ordering is fully inverted: Warm-start
$F = 1$ is now the worst and $F = 3$ the best.

% [Interpretation] Figure 4.11
@@ -2596,7 +2596,7 @@ decoding performance.
The same mechanism explains the inversion of the step-size ordering
in \Cref{fig:bpgd_iter_F}.
At low iteration budgets, the ordering is set by the same overlap
argument as for plain \ac{bp}: smaller $F$ implies a larger overlap
argument as for plain \ac{bp}: Smaller $F$ implies a larger overlap
between consecutive windows, more shared messages, and therefore
better warm-start performance.
At large iteration budgets, the ordering is set by the premature hard
@@ -2777,7 +2777,7 @@ since the decimation decisions were made based on the messages themselves.
\Cref{fig:bpgd_msg} repeats the experiment of \Cref{fig:bpgd_wf}
with the modified warm-start procedure that carries over only the
\ac{bp} messages.
All other experimental parameters are unchanged: the maximum number
All other experimental parameters are unchanged: The maximum number
of inner \ac{bp} iterations is $n_\text{iter} = 5000$, and the
physical error rate is swept from $p = 0.001$ to $p = 0.004$ in steps
of $0.0005$.
@@ -2810,7 +2810,7 @@ the warm-start regression observed in \Cref{fig:bpgd_wf},
and warm-start now consistently outperforms cold-start.
The dependence on the window size and the step size also recovers
the qualitative behavior we observed for plain \ac{bp} in
\Cref{fig:whole_vs_cold_vs_warm,fig:bp_f_over_p}: a larger overlap
\Cref{fig:whole_vs_cold_vs_warm,fig:bp_f_over_p}: A larger overlap
between consecutive windows, achieved either by enlarging $W$ or by
decreasing $F$, both improves the absolute decoding performance and
increases the warm-start advantage over cold-start.
@@ -2994,7 +2994,7 @@ cold-start curves across the entire range of $n_\text{iter}$ available to us.
\Cref{fig:bpgd_msg_iter} repeats the experiment of
\Cref{fig:bpgd_iter} with the modified warm-start procedure that
carries over only the \ac{bp} messages.
All other experimental parameters are unchanged: the physical error
All other experimental parameters are unchanged: The physical error
rate is fixed at $p = 0.0025$ and the iteration budget is swept over
$n_\text{iter} \in \{32, 128, 256, 512, 1024, 1536, 2048, 2560,
3072, 3584, 4096\}$.
@@ -3026,7 +3026,7 @@ initialization no longer freezes any \acp{vn} in the next window.
The dependence of this benefit on $W$ and $F$ also recovers the
pattern observed for plain \ac{bp} in
\Cref{fig:whole_vs_cold_vs_warm,fig:bp_f_over_p}:
larger overlap, achieved by larger $W$ or smaller $F$, yields more
Larger overlap, achieved by larger $W$ or smaller $F$, yields more
effective extra iterations and therefore a larger warm-start gain.

% BPGD conclusion
@@ -3048,7 +3048,7 @@ cold-start that follows the same behavior as for plain \ac{bp} with
regard to overlap.
A second observation specific to \ac{bpgd} is that its iteration
requirements are substantially larger than those of plain \ac{bp}:
the per-round \ac{ler} drops sharply only once the iteration budget
The per-round \ac{ler} drops sharply only once the iteration budget
is on the order of the number of \acp{vn} in each window.

Future work could include a softer treatment of the decimation state

@@ -7,7 +7,7 @@ This thesis investigates decoding under \acp{dem} for fault-tolerant
\ac{qec}, with a focus on low-latency decoding methods for \ac{qldpc} codes.
The repetition of the syndrome measurements, especially under
consideration of circuit-level noise, leads to a significant increase
in decoding complexity: in our experiments on the $\llbracket
in decoding complexity: In our experiments on the $\llbracket
144,12,12 \rrbracket$ \ac{bb} code with $12$ syndrome extraction
rounds, the check matrix grows from 144 \acp{vn} and 72
\acp{cn} to 9504 \acp{vn} and 1008 \acp{cn}.
@@ -46,18 +46,18 @@ min-sum algorithm.
For standard min-sum \ac{bp}, the warm start is consistently
beneficial to the cold start, across the considered parameter ranges.
The size of the gain depends on the overlap between consecutive
windows: enlarging $W$ or shrinking $F$, both of which enlarge the
windows: Enlarging $W$ or shrinking $F$, both of which enlarge the
overlap, result in larger gains of the warm-start.
We observe that the underlying mechanism is an effective increase in
the number of \ac{bp} iterations spent on the \acp{vn} in the overlap
region: each such \ac{vn} is processed by multiple consecutive window
region: Each such \ac{vn} is processed by multiple consecutive window
invocations, and the warm start lets these invocations accumulate
iterations on the same \acp{vn} rather than restarting from scratch.
The gain was most pronounced at low numbers of maximum iterations, where
every additional iteration carries proportionally more information.
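The effective-iterations argument can be made concrete with a small counting sketch: a round $r$ lies in window $\ell$ iff $\ell F \le r < \ell F + W$, so an interior round is visited by roughly $W/F$ windows. The helper below is illustrative, not from the thesis.

```python
# Illustrative sketch: how many sliding windows process a given
# syndrome round, for window size W and step size F (round r lies in
# window l iff l*F <= r < l*F + W).  Not from the thesis code.

def windows_covering(r: int, W: int, F: int, n_win: int) -> int:
    return sum(1 for l in range(n_win) if l * F <= r < l * F + W)

# With W = 4 and F = 1, an interior round is visited ~W/F = 4 times,
# so warm-starting can accumulate iterations across 4 invocations.
count = windows_covering(r=10, W=4, F=1, n_win=20)
```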

For \ac{bpgd}, we note that more information is available in the
overlap region of a window: in addition to the \ac{bp} messages,
overlap region of a window: In addition to the \ac{bp} messages,
there is information about which \acp{vn} were decimated and to what value.
Passing this decimation information to the next window in addition to
the messages turned out to worsen the performance considerably, which
@@ -66,7 +66,7 @@ overlap region.
Restricting the warm start to the \ac{bp} messages alone removed this effect.
The resulting message-only warm start recovered a consistent
improvement over cold-start that followed the same qualitative
behaviour as for standard \ac{bp}: larger overlap, achieved by larger
behaviour as for standard \ac{bp}: Larger overlap, achieved by larger
$W$ or smaller $F$, yielded a larger gain, and the
performance difference is most pronounced at low numbers of maximum iterations.


@@ -49,7 +49,7 @@ For both standard \ac{bp} and \ac{bpgd} decoding, the warm-start
initialization provides a consistent improvement across all examined
parameter settings.
We attribute this to an effective increase in \ac{bp} iterations on
variable nodes in the overlap regions: each such VN is processed by
variable nodes in the overlap regions: Each such VN is processed by
multiple consecutive windows, and warm-starting lets these
invocations accumulate iterations rather than restart from scratch.
Crucially, the warm-start modification incurs no additional

@@ -90,10 +90,10 @@
% \thesisHeadOfInstitute{Prof. Dr.-Ing. Peter Rost}
%\thesisHeadOfInstitute{Prof. Dr.-Ing. Peter Rost\\Prof. Dr.-Ing.
% Laurent Schmalen}
\thesisSupervisor{M.Sc. Jonathan Mandelbaum}
\thesisStartDate{01.11.2025}
\thesisEndDate{04.05.2026}
\thesisSignatureDate{04.05.2026}
\thesisSupervisor{Dr.-Ing. Hedongliang Liu\\ && M.Sc. Jonathan Mandelbaum}
\thesisStartDate{Nov. 1st, 2025}
\thesisEndDate{May 4th, 2026}
\thesisSignatureDate{May 4th, 2026}
\thesisSignature{res/Unterschrift_AT_blue.png}
\thesisSignatureHeight{2.4cm}
\thesisLanguage{english}

@@ -109,9 +109,11 @@
\cleardoublepage
\pagenumbering{arabic}

\newgeometry{a4paper,left=3cm,right=3cm,top=2cm,bottom=2.5cm}
\addtocontents{toc}{\protect\vspace*{-9mm}}
\tableofcontents
\cleardoublepage
\restoregeometry

\input{chapters/1_introduction.tex}
\input{chapters/2_fundamentals.tex}
@@ -124,10 +126,10 @@
% \listoftables
% \include{abbreviations}

% \cleardoublepage
% \phantomsection
% \addcontentsline{toc}{chapter}{List of Abbreviations}
% \printacronyms
\cleardoublepage
\phantomsection
\addcontentsline{toc}{chapter}{List of Abbreviations}
\printacronyms

\bibliography{lib/cel-thesis/IEEEabrv,src/thesis/bibliography}