17 Commits

SHA1 Message Date
4dfb3a7c35 Last changes 2026-05-06 01:58:49 +02:00
10d791fe04 Final readthrough corrections of quantum fundamentals 2026-05-04 23:04:28 +02:00
06852b8e62 Final readthrough corrections of classical fundamentals 2026-05-04 21:07:25 +02:00
400dc47df0 Incorporate Jonathan's corrections to classical fundamentals 2026-05-04 20:56:35 +02:00
ece8fc1715 Center error marker 2026-05-04 20:24:27 +02:00
56e3a0e5ca Consistently capitalize character after semicolon 2026-05-04 20:21:21 +02:00
8d6df8a79d Final readthrough corrections for fault tolerance chapter 2026-05-04 20:06:18 +02:00
c41ac9f61f Incorporate Jonathan's corrections to Fault Tolerance Chapter 2026-05-04 19:45:15 +02:00
a41e0b05fe Add Lia as supervisor 2026-05-04 19:20:08 +02:00
1edc3f301a Final readthrough corrections for decoding chapter 2026-05-04 18:42:39 +02:00
a977860ddb Incorporate Jonathan's correction to sliding-window decoding sections 2026-05-04 17:35:33 +02:00
7bf1b2f8d7 Incorporate Jonathan's corrections to numerical results section 2026-05-04 17:07:41 +02:00
72acea0321 Incorporate Jonathan's corrections to the introduction 2026-05-04 16:31:31 +02:00
f1a5aaf3f8 Make ToC be on one page 2026-05-04 16:20:37 +02:00
23828b671a Minor changes to conclusion 2026-05-04 16:08:56 +02:00
09893d527e Incorporate Jonathan's corrections to Abstract 2026-05-04 15:36:15 +02:00
25789a6bd3 Incorporate Jonathan's corrections to Conclusion 2026-05-04 15:28:11 +02:00
7 changed files with 521 additions and 489 deletions

View File

@@ -17,7 +17,7 @@ factorization \cite{shor_algorithms_1994}.
 Similar to the way classical computers are built from bits and gates,
 quantum computers are built from \emph{qubits} and \emph{quantum gates}.
-Because of quantum entanglement, it is not enough to consider the
+Because of quantum entanglement, it does not suffice to consider the
 qubits individually, we also have to consider correlations between them.
 For a system of $n$ qubits, this makes the state space grow with
 $2^n$ instead of linearly with $n$, as would be the case for a classical system
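The exponential state-space growth discussed in this hunk can be sketched numerically; the following snippet is an editorial illustration (assuming numpy), not code from the thesis:

```python
# Illustrative sketch: the joint state of n qubits is a vector of
# dimension 2^n, built here via Kronecker (tensor) products.
import numpy as np

plus = np.array([1.0, 1.0]) / np.sqrt(2)   # single-qubit state (|0> + |1>)/sqrt(2)

def joint_state(qubit_states):
    """Tensor single-qubit states together; the dimension doubles per qubit."""
    state = np.array([1.0])
    for s in qubit_states:
        state = np.kron(state, s)
    return state

n = 5
psi = joint_state([plus] * n)
print(len(psi))  # 2**5 = 32: exponential in n, not linear
```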
@@ -30,12 +30,11 @@ what provides them with their power \cite[Sec.~2.1]{roffe_decoding_2020}.
 Realizing algorithms that leverage these quantum-mechanical effects
 requires hardware that can execute long quantum computations reliably.
 This poses a problem, because the qubits making up current devices
-are difficult to sufficiently isolate from their environment
-\cite[Sec.~1]{roffe_quantum_2019}.
-Their interaction with the environment acts as a continuous small-scale
-measurement, an effect we call \emph{decoherence} of the stored quantum
-state.
-Decoherence is the reason large systems don't exhibit visible quantum
+consistently interact with their environment \cite[Sec.~1]{roffe_quantum_2019}.
+This interaction acts as a continuous small-scale measurement, an
+effect we call \emph{decoherence} of the stored quantum state, which
+results in errors on the qubits.
+Decoherence is the reason large systems do not exhibit visible quantum
 properties at human scales \cite[Sec.~1]{gottesman_stabilizer_1997}.

 % Intro to QEC
@@ -45,8 +44,8 @@ It addresses the issue by encoding the information of $k$
 \emph{logical qubits} into a larger number $n>k$ of \emph{physical
 qubits}, in close analogy to classical channel coding
 \cite[Sec.~1]{roffe_quantum_2019}.
-The redundancy introduced this way can then be used to restore
-the quantum state, should it be disturbed.
+The redundancy introduced this way can then be used to detect and
+correct a corrupted quantum state.
 The quantum setting imposes some important constraints that do not exist in the
 classical case, however \cite[Sec.~2.4]{roffe_quantum_2019}:
 \begin{itemize}
@@ -54,7 +53,7 @@ classical case, however \cite[Sec.~2.4]{roffe_quantum_2019}:
 \item In addition to the bit-flip errors we know from the
 classical setting, qubits are subject to \emph{phase-flips}.
 \item We are not allowed to directly measure the encoded qubits,
-as that would disturb their quantum states.
+as that would collapse their quantum states.
 \end{itemize}
 We can deal with the first constraint by not duplicating information, instead
 spreading the quantum state across the physical qubits
@@ -74,8 +73,8 @@ subsequent decoding process on the measured syndrome.
 Another difference between \ac{qec} and classical channel coding is
 the resource constraints.
-For \ac{qec}, low latency matters more than low overall computational
-complexity, due to the backlog problem
+For \ac{qec}, achieving low latency matters more than having a low
+overall computational complexity, due to the backlog problem
 \cite[Sec.~II.G.3.]{terhal_quantum_2015}: Certain gates turn
 single-qubit errors into multi-qubit ones, so errors must be
 corrected beforehand.
@@ -83,7 +82,7 @@ A \ac{qec} system that is too slow accumulates a backlog at these points,
 causing exponential slowdown.
 Several code constructions have been proposed for \ac{qec} codes over the years.
-Topological codes such as surface codes have been the industry
+Topological codes, such as surface codes, have been the industry
 standard for experimental applications for a long time
 \cite[Sec.~I]{koutsioumpas_colour_2025}, due to their
 reliance on only local connections between qubits
@@ -116,15 +115,15 @@ focusing only on the relationship between possible errors
 and their effects on the syndrome \cite[Sec.~1.4.3]{higgott_practical_2024}.
 A \emph{detector error matrix} is generated from the circuit, which is
 used for decoding instead of the original check matrix.
-Decoding under a \ac{dem} poses a challenge with respect to the
-latency constraint.
-This is because the detector error matrix is much larger than the
+The detector error matrix is much larger than the
 check matrix of the underlying code, since it needs to represent many
 more error locations.
 For example, in our experiments using the $\llbracket 144,12,12
 \rrbracket$ \ac{bb} code with $12$ syndrome measurement rounds, the
 number of \acp{vn} grew from $144$ to $9504$ and the number of
 \acp{cn} grew from $72$ to $1008$.
+Therefore, decoding under a \ac{dem} poses a challenge with respect to the
+latency constraint.

 To keep the latency of \ac{dem} decoding manageable, one approach is
 \emph{sliding-window decoding}.
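For intuition on the sliding-window idea mentioned at the end of this hunk, here is a minimal editorial sketch of how overlapping windows could partition syndrome rounds; the `sliding_windows` helper and its parameters are hypothetical, not the thesis's actual scheme:

```python
# Hypothetical sketch: partition syndrome-measurement rounds into
# overlapping decoding windows (window size and step are illustrative).
def sliding_windows(num_rounds, window, step):
    """Yield (start, end) round indices; consecutive windows overlap by
    window - step rounds."""
    start = 0
    while start + window < num_rounds:
        yield (start, start + window)
        start += step
    yield (start, num_rounds)  # final window covers the remainder

print(list(sliding_windows(12, 4, 2)))
# [(0, 4), (2, 6), (4, 8), (6, 10), (8, 12)]
```

Each window is decoded with low latency on its own; the overlap region is what the warm-start initialization proposed later reuses between consecutive windows.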
@@ -154,7 +153,7 @@ We propose \emph{warm-start sliding-window decoding}, in which the
 \ac{bp} messages from the overlap region of the previous window are
 reused to initialize \ac{bp} in the current window in place of the
 standard cold-start initialization.
-We formulate the warm start first for plain \ac{bp} and then for
+We formulate the warm start for standard \ac{bp} and for
 \ac{bpgd}, a variant of \ac{bp} with better convergence properties
 for \ac{qec} codes.
 The decoders are evaluated by Monte Carlo simulation on the
@@ -166,6 +165,7 @@ low-latency operation.
 % Outline of the Thesis
+This thesis is structured as follows:
 \Cref{ch:Fundamentals} reviews the fundamentals of classical and
 quantum error correction.
 On the classical side, it covers binary linear block codes,

View File

@@ -35,9 +35,9 @@ algorithm.
 % Codewords, n, k, rate
 %
-One particularly important class of coding schemes is that of binary
-linear block codes.
-The information to be protected takes the form of a sequence of
+Binary linear block codes form one particularly important class of
+coding schemes.
+The information to be protected is represented by a sequence of
 binary symbols, which is split into separate blocks.
 Each block is encoded, transmitted, and decoded separately.
 The encoding step introduces redundancy by mapping input messages
@@ -45,10 +45,11 @@ $\bm{u} \in \mathbb{F}_2^k$ of length $k \in \mathbb{N}$ (called the
 \textit{information length}) onto \textit{codewords} $\bm{x} \in
 \mathbb{F}_2^n$ of length $n \in \mathbb{N}$ (called the
 \textit{block length}) with $n > k$.
-A measure of the amount of introduced redundancy is the \textit{code
-rate} $R = k/n$.
-We call the set of all codewords $\mathcal{C}$ the \textit{code}
-\cite[Sec.~3.1.1]{ryan_channel_2009}.
+The \textit{code rate} $R = k/n$ is a measure of the amount of
+introduced redundancy.
+We call the set of all codewords
+$\mathcal{C} = \{\bm{x}^{(1)}, \bm{x}^{(2)}, \ldots, \bm{x}^{(2^k)}\}$
+the \textit{code} \cite[Sec.~3.1.1]{ryan_channel_2009}.
 %
 % d_min and the [] Notation
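As a worked instance of $k$, $n$, $R$, and the codeword set discussed in this hunk, the following editorial sketch enumerates all $2^k$ codewords of the standard $[7,4]$ Hamming code (an assumed example, not a code used in the thesis):

```python
# Illustrative sketch: enumerate the 2^k codewords generated by a
# systematic generator matrix G of the [7,4] Hamming code.
import itertools
import numpy as np

G = np.array([[1, 0, 0, 0, 0, 1, 1],   # G = [I | P], systematic form
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1, 0],
              [0, 0, 0, 1, 1, 1, 1]])
k, n = G.shape
codewords = {tuple(np.mod(u @ G, 2))
             for u in itertools.product([0, 1], repeat=k)}
print(len(codewords), k / n)  # 16 codewords, code rate R = 4/7
```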
@@ -77,7 +78,7 @@ $[n,k,d_\text{min}]$.
 % Parity checks, H, and the syndrome
 %
-A particularly elegant way of describing the code space $C$ is the
+A particularly elegant way of describing the code space $\mathcal{C}$ is the
 notion of \textit{parity checks}.
 Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n
 \rvert = 2^n$, there are $n-k$ conditions that constrain the additional
@@ -86,17 +87,17 @@ These conditions, called parity checks, take the form of equations
 over $\mathbb{F}_2^n$, linking the individual positions of each codeword.
 We can arrange the coefficients of these equations in a
 \textit{parity-check matrix} (\acs{pcm}) $\bm{H} \in
-\mathbb{F}_2^{(n-k) \times n}$ and equivalently define the code as
-\cite[Sec.~3.1.1]{ryan_channel_2009}
+\mathbb{F}_2^{(n-k) \times n}$, $\text{rank}(\bm{H}) = n-k$, and
+equivalently define the code as \cite[Sec.~3.1.1]{ryan_channel_2009}
 \begin{align*}
-\mathcal{C} = \left\{ \bm{x} \in \mathbb{F}_2^n :
-\bm{H}\bm{x}^\text{T} = \bm{0} \right\}
+\mathcal{C} := \text{kern}(\bm{H}) = \left\{ \bm{x} \in \mathbb{F}_2^n :
+\bm{H}\bm{x}^\mathsf{T} = \bm{0} \right\}
 .%
 \end{align*}
-Note that in general we may have linearly dependent parity checks,
+In general, we have linearly dependent parity checks,
 prompting us to define the \ac{pcm} as $\bm{H} \in
 \mathbb{F}_2^{m\times n}$ with $\hspace{2mm} m \ge n-k$ instead.
-The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes
+The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\mathsf{T}$ describes
 which parity checks a vector $\bm{v} \in \mathbb{F}_2^n$ violates.
 The representation using the \ac{pcm} has the benefit of providing a
 description of the code, the memory complexity of which does not grow
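The kernel and syndrome definitions in this hunk can be sketched in a few lines; the matrix $\bm{H}$ below is the PCM of the $[7,4]$ Hamming code, assumed for illustration (not taken from the thesis):

```python
# Illustrative sketch: syndrome s = H v^T over F_2, which reads off the
# violated parity checks; a single bit-flip yields the matching column of H.
import numpy as np

H = np.array([[0, 1, 1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [1, 1, 0, 1, 0, 0, 1]])

def syndrome(H, v):
    """Compute s = H v^T modulo 2."""
    return np.mod(H @ v, 2)

x = np.array([1, 0, 0, 0, 0, 1, 1])       # a codeword: H x^T = 0
print(syndrome(H, x))                      # [0 0 0]
e = np.zeros(7, dtype=int); e[2] = 1       # single bit-flip on position 2
print(syndrome(H, np.mod(x + e, 2)))       # column 2 of H: [1 1 0]
```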
@@ -118,9 +119,9 @@ $\bm{y} \in \mathbb{R}^n$, and \textit{hard-decision} decoding, where
 $\bm{y} \in \mathbb{F}_2^n$ \cite[Sec.~1.5.1.3]{ryan_channel_2009}.
 Finally, the decoder is responsible for obtaining an estimate
 $\hat{\bm{u}} \in \mathbb{F}_2^k$ of the original input message.
-This is done by first finding an estimate $\hat{\bm{x}}$ of the sent
+This can be done by first finding an estimate $\hat{\bm{x}}$ of the sent
 codeword and undoing the encoding.
-The decoding problem that we generally attempt to solve thus consists
+The decoding problem that we attempt to solve thus consists
 in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$.
 \begin{figure}[t]
@@ -168,9 +169,9 @@ in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$.
 %
 Shannon's noisy-channel coding theorem is stated for codes whose block
-length approaches infinity. This suggests that as the block length
-becomes larger, the performance of the considered codes should
-generally improve.
+length $n$ approaches infinity.
+This suggests that as the block length becomes larger, the
+performance of the considered codes should generally improve.
 However, the size of the \ac{pcm} of a linear block code grows
 quadratically with $n$.
 This would quickly render decoding intractable as we increase the
@@ -189,13 +190,14 @@ This is exactly the motivation behind \ac{ldpc} codes
 These differ from ``classical codes'' in their decoding algorithms:
 Classical codes are usually decoded using one-step hard-decision decoding,
 whereas modern codes are suitable for iterative soft-decision
-decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms
+decoding \cite[Preface]{ryan_channel_2009}.
+For \ac{ldpc} codes, the iterative decoding algorithms
 are generally defined in terms of message passing on the
 \textit{Tanner graph} of a code. The Tanner graph is a bipartite
 graph that constitutes an alternative representation of the \ac{pcm}.
-We define two types of nodes: \acp{vn}, corresponding to codeword
+We define two types of nodes: \Acp{vn}, corresponding to codeword
 bits, and \acp{cn}, corresponding to individual parity checks.
-We then construct the Tanner graph by connecting each \ac{cn} to
+Then, we construct the Tanner graph by connecting each \ac{cn} to
 the \acp{vn} that make up the corresponding parity check
 \cite[Sec.~5.1.2]{ryan_channel_2009}.
 \Cref{PCM and Tanner graph of the Hamming code} shows the Tanner
@@ -273,11 +275,11 @@ Mathematically, we represent a \ac{vn} using the index $i \in
 and a \ac{cn} using the index $j \in \mathcal{J}
 := \left[ 0 : m-1 \right]$.
 We can then encode the information contained in the graph by defining
-the neighborhood of a variable node $i$ as
-$\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : \bm{H}_{j,i}
+the neighborhood of a \ac{vn} $i$ as
+$\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : H_{j,i}
 = 1 \right\}$
-and that of a check node $j$ as
-$\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : \bm{H}_{j,i}
+and the neighborhood of a \ac{cn} $j$ as
+$\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : H_{j,i}
 = 1 \right\}$.
 %
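The neighborhood definitions $\mathcal{N}_\text{V}(i)$ and $\mathcal{N}_\text{C}(j)$ in this hunk can be read off a PCM directly; a small editorial sketch with an example matrix of my choosing (not from the thesis):

```python
# Illustrative sketch: Tanner-graph neighborhoods from a parity-check matrix.
import numpy as np

H = np.array([[1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [1, 0, 1, 1, 0]])

def vn_neighborhood(H, i):
    """N_V(i): the check nodes j with H[j, i] = 1."""
    return {j for j in range(H.shape[0]) if H[j, i] == 1}

def cn_neighborhood(H, j):
    """N_C(j): the variable nodes i with H[j, i] = 1."""
    return {i for i in range(H.shape[1]) if H[j, i] == 1}

print(vn_neighborhood(H, 1))  # {0, 1}
print(cn_neighborhood(H, 0))  # {0, 1, 3}
```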
@@ -379,15 +381,15 @@ the numbers of ones, of their rows and columns are constant
 Already during their introduction, regular \ac{ldpc} codes were shown to have
 a minimum distance scaling linearly with the block length $n$ for
 large values \cite[Ch.~2,~Theorem~1]{gallager_low_1960},
-which leads to the fact that they do not exhibit an error floor under
-\ac{ml} decoding.
-Irregular codes, on the other hand, generally do exhibit an error floor,
-while their redeeming quality is the ability to reach near-capacity
-performance in the waterfall region \cite[Intro.]{costello_spatially_2014}.
+which leads to a more favorable behavior of the error rate for high
+signal-to-noise ratios.
+Irregular codes, on the other hand, have more severe error floor behavior.
+However, they have the ability to reach near-capacity performance
+in the waterfall region \cite[Intro.]{costello_spatially_2014}.

 \subsection{Spatially-Coupled LDPC Codes}

-A recent development in the field of \ac{ldpc} codes is that of
+A more recent development in the field of \ac{ldpc} codes is that of
 \ac{sc}-\ac{ldpc} codes.
 Their key feature is that they combine the best properties of regular
 and irregular codes.
@@ -399,11 +401,12 @@ waterfall region \cite[Intro.]{costello_spatially_2014}.
 The essential property of \ac{sc}-\ac{ldpc} codes is that codewords
 from different \textit{spatial positions}, which would ordinarily be sent
 one after the other independently, are linked.
-This is achieved by connecting some \acp{vn} of one spatial position to
-\acp{cn} of another, resulting in a \ac{pcm} of the form
+This is achieved by introducing edges between \acp{vn} of one spatial
+position and \acp{cn} of another, resulting in a \ac{pcm} of the form
 \cite[Eq.~1]{hassan_fully_2016}
 %
-\begin{align*}
+\begin{align}
+\label{eq:PCM}
 \bm{H} =
 \begin{pmatrix}
 \bm{H}_0(1) & & \\
@@ -413,10 +416,11 @@ This is achieved by connecting some \acp{vn} of one spatial position to
 & & \bm{H}_K(L) \\
 \end{pmatrix}
 ,
-\end{align*}
+\end{align}
 %
 where $K \in \mathbb{N}$ is the \textit{coupling width} and $L \in
 \mathbb{N}$ is the number of spatial positions.
+The parts of the \ac{pcm} left empty in \Cref{eq:PCM} are filled with zeros.
 This construction results in a Tanner graph as depicted in
 \Cref{fig:sc-ldpc-tanner}.
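The banded block structure of the coupled PCM above can be sketched constructively. This is an editorial illustration under assumptions (time-invariant component matrices $\bm{H}_0,\ldots,\bm{H}_K$, random $2\times 4$ blocks), not the thesis's actual construction:

```python
# Illustrative sketch: assemble a spatially-coupled PCM by stacking
# component matrices H_0..H_K diagonally over L spatial positions;
# the parts left empty are filled with zeros.
import numpy as np

rng = np.random.default_rng(0)
K, L = 2, 5                       # coupling width, number of spatial positions
mb, nb = 2, 4                     # size of each component matrix H_k
H_k = [rng.integers(0, 2, size=(mb, nb)) for _ in range(K + 1)]

H = np.zeros(((L + K) * mb, L * nb), dtype=int)
for pos in range(L):              # column block = spatial position
    for k in range(K + 1):        # each position couples into K+1 row blocks
        r, c = (pos + k) * mb, pos * nb
        H[r:r + mb, c:c + nb] = H_k[k]
print(H.shape)  # (14, 20)
```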
@@ -513,7 +517,7 @@ Note that at the first few spatial positions some \acp{cn} have lower degrees.
 This leads to more reliable information about the
 \acp{vn} that, as we will see, is
 later passed to subsequent spatial positions during decoding.
-This is precisely the effect that leads to the good performance of
+This is precisely the effect that leads to the improved performance of
 \ac{sc}-\ac{ldpc} codes in the waterfall region \cite{costello_spatially_2014}.

 \subsection{Iterative Decoding}
@@ -521,15 +525,14 @@ This is precisely the effect that leads to the good performance of
 % Introduction
-\ac{ldpc} codes are generally decoded using efficient iterative
-algorithms, something that is possible due to their sparsity
-\cite[Sec.~5.3]{ryan_channel_2009}.
-The algorithm originally proposed alongside LDPC codes for this
-purpose by Gallager in 1960 is now known as the \ac{spa}
+Due to their sparse graphs, efficient iterative decoders exist for
+\ac{ldpc} codes \cite[Sec.~5.3]{ryan_channel_2009}.
+The decoding algorithm originally proposed alongside LDPC codes by
+Gallager in 1960 is now known as the \ac{spa}
 \cite[5.4.1]{ryan_channel_2009}, also called \acf{bp}.
 The optimality criterion the \ac{spa} is built around is a
-symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
+bit-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}.
 The core idea of the resulting algorithm is to view \acp{cn}
 and \acp{vn} as representing individual local codes.
 A \ac{cn} represents a single parity check on the connected \acp{vn},
@@ -539,11 +542,11 @@ should agree on its value; it can therefore be understood as a repetition code.
 The algorithm alternates between consolidating soft information about
 the \acp{vn} in the \acp{cn}, and consolidating soft information about
 the \acp{cn} in the \acp{vn}.
-To this end, messages are passed back and forth along the edges of
-the Tanner graph.
+To this end, messages computed in the nodes are passed back and forth
+along the edges of the Tanner graph.
 $L_{i\rightarrow j}$ represents a message passed from \ac{vn} $i$ to
-\ac{cn} j, $L_{i\leftarrow j}$ represents a message passed from
-\ac{cn} j to \ac{vn} i.
+\ac{cn} $j$, $L_{i\leftarrow j}$ represents a message passed from
+\ac{cn} $j$ to \ac{vn} $i$.
 The \acp{vn} additionally receive messages \cite[5.4.2]{ryan_channel_2009}
 \begin{align*}
 \tilde{L}_i = \log \frac{P(X=0 \vert Y=y)}{P(X=1 \vert Y=y)},
@@ -574,7 +577,7 @@ possible cycles and are thus especially problematic.
 % Min-sum algorithm
-A simplification of the \ac{spa} is the min-sum decoder. Here, the
+A simplification of the \ac{spa} is the min-sum algorithm. Here, the
 \ac{cn} update is approximated as \cite[Sec.~5.5.1]{ryan_channel_2009}
 \begin{align*}
 L_{i \leftarrow j} = \prod_{i' \in \mathcal{N}_\text{C}(j)\setminus \{i\}}
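The min-sum CN update is truncated by the hunk boundary above; assuming it takes the standard form (product of extrinsic signs times the minimum extrinsic magnitude), an editorial sketch:

```python
# Illustrative sketch of the standard min-sum check-node update:
# L_{i<-j} = prod of signs * min of magnitudes over N_C(j) \ {i}.
import math

def min_sum_cn_update(incoming):
    """incoming: {i: L_{i->j}} VN-to-CN messages at check node j.
    Returns {i: L_{i<-j}}, computed extrinsically (excluding i itself)."""
    out = {}
    for i in incoming:
        others = [L for i2, L in incoming.items() if i2 != i]
        sign = math.prod(1 if L >= 0 else -1 for L in others)
        out[i] = sign * min(abs(L) for L in others)
    return out

print(min_sum_cn_update({0: 2.5, 1: -0.8, 2: 1.1}))
# {0: -0.8, 1: 1.1, 2: -0.8}
```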
@@ -598,7 +601,7 @@ decoding of subsequent blocks \cite[Sec.~III.~C.]{hassan_fully_2016}.
 \label{sec:Quantum Mechanics and Quantum Information Science}
 Designing codes and decoders for \ac{qec} is generally performed on a
-layer of abstraction far removed from the quantum mechanical
+layer of mathematical abstraction far removed from the quantum mechanical
 processes underlying the actual physics.
 Nevertheless, having a fundamental understanding of the related
 quantum mechanical concepts is useful to grasp the unique constraints
@@ -618,39 +621,41 @@ function and the observable world:
 $\lvert \psi (x,t) \rvert^2$ is the \ac{pdf} of finding a particle at
 position $x$ and time $t$ \cite[Sec.~1.2]{griffiths_introduction_1995}.
 Note that this presupposes a normalization of $\psi$ such that
-$\int_{-\infty}^{\infty} \lvert \psi(x,t) \rvert^2 dx = 1$.
+\begin{align*}
+\int_{-\infty}^{\infty} \lvert \psi(x,t) \rvert^2 dx = 1
+.%
+\end{align*}

 % Dirac notation
-Much of the related mathematics can be very elegantly expressed
-using the language of linear algebra.
-The so-called Bra-ket or Dirac notation is especially appropriate,
-having been proposed by Paul Dirac in 1939 for the express purpose
-of simplifying quantum mechanical notation \cite{dirac_new_1939}.
-Two new symbols are defined, \emph{bra}s $\bra{\cdot}$ and
-\emph{ket}s $\ket{\cdot}$.
+The language of linear algebra allows one to express the related
+mathematics particularly elegantly.
+The so-called Bra-ket or Dirac notation, introduced
+by Paul Dirac in 1939 for the express purpose of simplifying quantum
+mechanical notation \cite{dirac_new_1939}, is especially appropriate.
+Two new symbols are defined, \emph{bra} $\bra{\cdot}$ and
+\emph{ket} $\ket{\cdot}$.
 Kets denote column vectors, while bras denote their Hermitian conjugates.
-For example, two vectors specified by the labels $a$ and $b$
-respectively are written as $\ket{a}$ and $\ket{b}$.
+For example, two vectors specified by the labels $a$ and $b$,
+respectively, are written as $\ket{a}$ and $\ket{b}$.
 Their inner product is $\braket{a\vert b}$.

 % Expressing wave functions using linear algebra
 The connection we will make between quantum mechanics and linear
 algebra is that we will model the state space of a system as a
-\emph{function space}, the Hilbert space $L_2$.
-We will represent the state of a particle with wave function
-$\psi(x,t)$ using the vector $\ket{\psi}$
-\cite[Sec.~3.3]{griffiths_introduction_1995}.
+\emph{function space}, namely the Hilbert space $L_2$.
+The state of a particle with wave function $\psi(x,t)$ is represented
+by the vector $\ket{\psi}$ \cite[Sec.~3.3]{griffiths_introduction_1995}.

 % Operators
 Another important notion is that of an \emph{operator}, a transformation
-that takes a function as an input and returns another function as an
-output \cite[Sec.~3.2.2]{griffiths_introduction_1995}.
+that maps a function onto another function
+\cite[Sec.~3.2.2]{griffiths_introduction_1995}.
+A prominent example of this is the differential operator $\partial x$.
 Operators are useful to describe the relations between different
 quantities relating to a particle.
-An example of this is the differential operator $\partial x$.
 We define the \emph{commutator} of two operators $P_1$ and $P_2$ as
 \begin{align*}
 [P_1,P_2] = P_1P_2 - P_2P_1
@@ -669,22 +674,21 @@ We say the two operators \emph{commute} iff $[P_1,P_2] = 0$, and they
 % Observable quantities
-An \emph{observable quantity} $Q(x,p,t)$ is a quantity of a quantum
-mechanical system that we can measure, such as the position $x$ or
+An \emph{observable} $Q(x,p,t)$ is a quantity of a quantum
+mechanical system that we can measure, e.g., the position $x$ or
 momentum $p$ of a particle.
 In general, such measurements are not deterministic, i.e.,
 measurements on identically prepared states can yield different results.
-There are some states, however, that are \emph{determinate} for a
-specific observable: measuring those will always yield identical
-observations \cite[Sec.~3.3]{griffiths_introduction_1995}.
+However, some states are \emph{determinate} for a
+specific observable: Measuring those will always yield identical
+outcomes \cite[Sec.~3.3]{griffiths_introduction_1995}.

 % General expression for expected value of observable quantity
-If we know the wave function of a particle, we should be able to
-compute the expected value $\braket{Q}$ of any observable quantity we wish.
-It can be shown that for any $Q$, we can find a
-corresponding Hermitian operator $\hat{Q}$ such that
-\cite[Sec.~3.3]{griffiths_introduction_1995}
+If the wave function of a particle is known, the expected value
+$\braket{Q}$ of any observable quantity can be computed.
+Indeed, for any $Q$, there exists a corresponding Hermitian operator
+$\hat{Q}$ such that \cite[Sec.~3.3]{griffiths_introduction_1995}
 \begin{align}
 \label{eq:gen_expr_Q_exp}
 \braket{Q} = \int_{-\infty}^{\infty} \psi^*(x,t) \hat{Q} \psi(x,t) dx
@@ -700,9 +704,9 @@ operator to $\hat{Q} = x$, we can write
= \int_{-\infty}^{\infty} x \lvert \psi(x,t) \rvert ^2 dx
.%
\end{align*}
Note that $\lvert \psi(x,t) \rvert^2 $ is the \ac{pdf} of
finding a particle at position $x$. Hence, we immediately see that
the formula simplifies to the direct calculation of the expected value.
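As a quick numerical sanity check (our own sketch, not part of the cited texts, assuming a Gaussian wave packet), one can discretize $\lvert \psi \rvert^2$ and evaluate the integral for $\hat{Q} = x$ directly:

```python
import numpy as np

# Hypothetical sketch: discretize a Gaussian wave packet centered at x0 and
# check that <x> = ∫ x |psi(x)|^2 dx recovers the center of the packet.
x = np.linspace(-10.0, 10.0, 100_001)
dx = x[1] - x[0]
x0, sigma = 1.5, 0.7
psi = (2 * np.pi * sigma**2) ** -0.25 * np.exp(-((x - x0) ** 2) / (4 * sigma**2))

pdf = np.abs(psi) ** 2        # |psi|^2 acts as the pdf of the position
norm = pdf.sum() * dx         # ∫ |psi|^2 dx, should be ~1
x_exp = (x * pdf).sum() * dx  # <x> = ∫ x |psi|^2 dx

print(round(norm, 6), round(x_exp, 6))  # → 1.0 1.5
```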
% Determinate states and eigenvalues
@@ -716,40 +720,40 @@ We begin by translating \Cref{eq:gen_expr_Q_exp} into linear algebra as
.%
\end{align}
\Cref{eq:gen_expr_Q_exp_lin} expresses an inherently probabilistic
relationship, whereas the determinate states are inherently deterministic.
To relate the two, we note that since determinate states should
always yield the same measurement results, the variance of the
observable must be zero.
We thus compute \cite[Eq.~3.116]{griffiths_introduction_1995}
\begin{align}
0 &\overset{!}{=} \braket{(Q - \braket{Q})^2}
= \braket{e_n \vert (\hat{Q} - \braket{Q})^2 e_n} \nonumber\\
&= \braket{(\hat{Q} - \braket{Q})e_n \vert (\hat{Q} - \braket{Q})
e_n} \nonumber\\
&= \lVert (\hat{Q} - \braket{Q}) e_n \rVert^2 \nonumber\\[3mm]
&\hspace{-14mm}\iff (\hat{Q} - \braket{Q}) \ket{e_n}
= 0 \nonumber\\
\label{eq:observable_eigenrelation}
&\hspace{-14mm}\iff \hat{Q}\ket{e_n}
= \underbrace{\braket{Q}}_{\lambda_n} \ket{e_n}
.%
\end{align}%
%
By setting the variance to zero, the expected value
$\braket{Q}$ becomes a deterministic measurement result
corresponding to the determinate state
$\ket{e_n},~n\in \mathbb{N}$.
The determinate states are precisely the \emph{eigenstates} of
the observable operator $\hat{Q}$, and the associated measurement
values are the corresponding \emph{eigenvalues} $\lambda_n$
\cite[Sec.~3.3]{griffiths_introduction_1995}.
% Determinate states as a basis
As we model the wave function $\psi(x,t)$ as a vector
$\ket{\psi}$, we can find a set of basis vectors to decompose it into.
In particular, we can use the determinate states for this purpose,
expressing the state as%
\footnote{
We only consider the case of having a \emph{discrete
spectrum} here, i.e., having a discrete set of eigenvalues and vectors.
@@ -787,7 +791,7 @@ $Q(x,t,p)$ using a corresponding operator $\hat{Q}$, which allows us
to compute the expected value as $\braket{Q} = \braket{\psi
\vert \hat{Q} \psi}$.
The eigenvectors of $\hat{Q}$ are the determinate states
$\ket{e_n},~n\in \mathbb{N}$, and the eigenvalues are the respective
measurement outcomes.
We can decompose an arbitrary state as $\ket{\psi} = \sum_{n=1}^{\infty} c_n
\ket{e_n}$, where $\lvert c_n \rvert ^2$ represents the probability
@@ -805,16 +809,16 @@ The measurements we considered in the previous section, for which
\Cref{eq:gen_expr_Q_exp_lin} holds, belong to the category of
\emph{projective measurements}.
For these, certain restrictions such as repeatability apply: the act
of measuring a quantum state \emph{collapses} it onto one of
the determinate states.
Further measurements then yield the same value.
More general methods of modelling measurements exist, e.g.,
destructive measurements \cite[Box~2.5]{nielsen_quantum_2010}, but
they are not relevant to this work.
% Projection operators
We model the collapse of the original state onto one of the
superimposed basis states as a \emph{projection}.
To see this, we use
\Cref{eq:determinate_basis,eq:observable_eigenrelation} to compute
@@ -833,9 +837,9 @@ the separate components as
using \emph{projection operators} \cite[Eq.~3.160]{griffiths_introduction_1995}
\begin{align*}
\hat{P}_n := \ket{e_n}\bra{e_n}, \hspace{3mm} n\in \mathbb{N}
,
\end{align*}%
which project a vector onto the subspace spanned by $\ket{e_n}$.
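As an illustrative sketch (our own example, using the $Z$ eigenbasis $\ket{0}, \ket{1}$), the projectors exhibit their two defining properties, idempotence and completeness, and reproduce the outcome probability $\lvert c_n \rvert^2$:

```python
import numpy as np

# Sketch (assumption, not thesis code): projectors P_n = |e_n><e_n| for the
# basis e_0 = |0>, e_1 = |1>, applied to a normalized superposition.
e0 = np.array([[1], [0]], dtype=complex)
e1 = np.array([[0], [1]], dtype=complex)
P0 = e0 @ e0.conj().T  # |0><0|
P1 = e1 @ e1.conj().T  # |1><1|

psi = (3 * e0 + 4j * e1) / 5  # normalized: |3/5|^2 + |4/5|^2 = 1

assert np.allclose(P0 @ P0, P0)         # projectors are idempotent
assert np.allclose(P0 + P1, np.eye(2))  # completeness: they sum to I
prob0 = np.linalg.norm(P0 @ psi) ** 2   # probability of collapsing onto e_0
print(round(prob0, 2))                  # → 0.36
```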
% % Using projection operators to measure if a state has a component
% % along a basis vector
@@ -861,10 +865,9 @@ These project a vector onto the subspace spanned by $\ket{e_n}$.
% Intro
A central concept for quantum computing is that of the \emph{qubit}.
It takes the place of the classical \emph{bit}.
For classical computers, we alter the state of a bit using \emph{gates}.
We can chain multiple of these gates together to build up more complex logic,
such as half-adders or eventually a full processor.
In principle, quantum computers work in a similar fashion, only that
@@ -895,10 +898,10 @@ A qubit is defined to be a system with quantum state
\alpha \\
\beta
\end{pmatrix}
= \alpha \ket{0} + \beta \ket{1}, \hspace{5mm} \alpha,\beta \in \mathbb{C}
.%
\end{align}
The overall state of a multi-qubit quantum system is described using
the \emph{tensor product}, denoted as $\otimes$
\cite[Sec.~2.2.8]{nielsen_quantum_2010}.
Take for example the two qubits
@@ -927,9 +930,9 @@ i.e.,
.%
\end{split}
\end{align}
We call $\ket{x_0, \ldots, x_n},~x_i \in \{0,1\}$ the
\emph{computational basis states} \cite[Sec.~4.6]{nielsen_quantum_2010}.
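A short sketch (our own example, using the standard vector representations) of how computational basis states arise from the Kronecker product:

```python
import numpy as np

# Sketch: building computational basis states with the tensor product.
ket0 = np.array([1, 0])
ket1 = np.array([0, 1])

ket01 = np.kron(ket0, ket1)  # |0> ⊗ |1> = |01>
print(ket01)                 # → [0 1 0 0]

# Chaining kron gives |x0 x1 x2>; the single 1 sits at the binary index.
ket011 = np.kron(np.kron(ket0, ket1), ket1)
print(int(np.argmax(ket011)))  # → 3, i.e., binary 011
```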
To simplify set notation, we define
\begin{align*}
\mathcal{M}^{\otimes n} := \underbrace{\mathcal{M}\otimes \ldots
\otimes \mathcal{M}}_{n \text{ times}}
@@ -938,7 +941,7 @@ To additionally simplify set notation, we define
% Entanglement
States that cannot be decomposed into products of single-qubit states
are called \emph{entangled} \cite[Sec.~2.2.8]{nielsen_quantum_2010}.
Examples of such states are the \emph{Bell states}
\begin{align*}
@@ -976,7 +979,7 @@ we now shift our focus to describing the evolution of their states.
We model state changes as operators.
Unlike classical systems, where there are only two possible states and
thus the only possible state change is a bit-flip, a general qubit
state as shown in \Cref{eq:gen_qubit_state} lies on a continuum of values.
We thus technically also have an infinite number of possible state changes.
Fortunately, we can express any operator as a linear combination of the
\emph{Pauli operators} \cite[Sec.~2.2]{gottesman_stabilizer_1997}
@@ -1013,13 +1016,15 @@ Fortunately, we can express any operator as a linear combination of the
In fact, if we allow for complex coefficients, the $X$ and $Z$
operators are sufficient to express any other operator as a linear
combination \cite[Sec.~2.2]{roffe_quantum_2019}.
Here, $I$ is the identity operator, and $X$ and $Z$ are referred to as
\emph{bit-flips} and \emph{phase-flips}, respectively.
We call the set
\begin{align}
\mathcal{G}_n = \left\{ \pm I,\pm \mathrm{i}I, \pm
X,\pm \mathrm{i}X,
\pm Y,\pm \mathrm{i}Y, \pm Z, \pm \mathrm{i}Z \right\}^{\otimes n}
\end{align}
the \emph{Pauli group} over $n$ qubits.
In the context of modifying qubit states, we also call operators \emph{gates}.
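The Pauli relations used above can be verified directly; a small sketch (our own, with the standard matrix representations):

```python
import numpy as np

# Sketch: the single-qubit Pauli matrices and two identities from the text.
I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
Y = np.array([[0, -1j], [1j, 0]])

assert np.allclose(Y, 1j * (X @ Z))  # Y = iXZ: X and Z generate the rest
assert np.allclose(X @ Z, -(Z @ X))  # X and Z anti-commute

# X acts as a bit-flip, Z as a phase-flip:
ket0 = np.array([1, 0]); ket1 = np.array([0, 1])
assert np.allclose(X @ ket0, ket1)
assert np.allclose(Z @ ket1, -ket1)
```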
When working with multi-qubit systems, we can also apply Pauli gates
@@ -1049,7 +1054,7 @@ Other important operators include the \emph{Hadamard} and
\centering
\begin{align*}
\begin{array}{c}
\text{CNOT Operator} \\
\hline\\
\ket{00} \mapsto \ket{00} \\
\ket{01} \mapsto \ket{01} \\
@@ -1060,7 +1065,9 @@ Other important operators include the \emph{Hadamard} and
\end{minipage}%
\end{figure}
\vspace{-4mm}
\noindent The CNOT operator is a 2-qubit gate that applies a bit-flip to the
second qubit conditioned on the state of the first one.
Many more operators relevant to quantum computing exist, but they are
not covered here as they are not central to this work.
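A minimal sketch (our own, using the standard matrix representation in the computational basis) confirming the CNOT truth table above:

```python
import numpy as np

# Sketch: CNOT as a 4x4 matrix acting on the 2-qubit basis states.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

def basis(bits):
    """|b0 b1> as a length-4 vector with a 1 at the binary index."""
    v = np.zeros(4)
    v[int(bits, 2)] = 1.0
    return v

for inp, out in [("00", "00"), ("01", "01"), ("10", "11"), ("11", "10")]:
    assert np.allclose(CNOT @ basis(inp), basis(out))
print("second qubit flips iff the first is 1")
```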
%%%%%%%%%%%%%%%%
@@ -1093,9 +1100,8 @@ The control connection is represented by a vertical line connecting
the gate to the corresponding qubit, where a filled dot is placed.
A controlled gate applies the respective operation only if the
control qubit is in state $\ket{1}$.
\Cref{fig:cnot_circuit} depicts an example of this: The CNOT gate
introduced in \Cref{subsec:Qubits and Multi-Qubit States}.
\begin{figure}[t]
\centering
@@ -1117,7 +1123,7 @@ An example of this is the CNOT gate introduced in
% General motivation behind QEC
One of the major barriers on the road to building a functioning and scalable
quantum computer is the inevitability of errors during quantum
computation. These arise due to the difficulty in sufficiently isolating the
qubits from external noise \cite[Sec.~1]{roffe_quantum_2019}.
@@ -1126,7 +1132,7 @@ with the environment act as small measurements, an effect called
\emph{decoherence} of the quantum state
\cite[Sec.~1]{gottesman_stabilizer_1997}.
\ac{qec} is one approach to dealing with this problem, by protecting
a quantum state in a similar fashion to information in classical error
correction.
% The unique challenges of QEC
@@ -1146,9 +1152,10 @@ Three main restrictions apply \cite[Sec.~2.4]{roffe_quantum_2019}:
% General idea (logical vs. physical gates) + notation
Much like in classical error correction, \ac{qec} protects information by
introducing redundancy.
The information, represented by a state in a low-dimensional space,
is mapped onto an encoded state in a higher-dimensional space.
To this end, $k \in \mathbb{N}$ \emph{logical qubits} are mapped onto
$n \in \mathbb{N}$ \emph{physical qubits}, $n>k$.
We circumvent the no-cloning restriction by not copying the state of any of
@@ -1169,8 +1176,9 @@ This is due to the \emph{backlog problem}
\cite[Sec.~II.G.3.]{terhal_quantum_2015}: There are certain gates
at which the effect of existing errors on single qubits may be
exacerbated by transforming them to multi-qubit errors.
If we ensure decoding with sufficiently low latency, we can correct
the errors before passing qubits through such gates.
However, if the \ac{qec} system is not fast enough, there will be an increasing
backlog of information at this point in the circuit, leading to an
exponential slowdown in computation.
@@ -1200,8 +1208,8 @@ Note that this code is only able to detect single $X$-type errors.
% Measuring stabilizers
To determine if an error occurred, we aim to measure whether a
state belongs
% TODO: Remove footnote?
% \footnote{
% It is possible for a state to not completely lie in either subspace.
@@ -1210,11 +1218,12 @@ whether a state belongs
% }
to $\mathcal{C}$ or $\mathcal{F}$.
As explained in \Cref{subsec:Observables}, physical measurements
can be mathematically described using operators, whose eigenvalues
are the possible measurement results.
Here, we need an operator with two eigenvalues, and the corresponding
eigenspaces should be $\mathcal{C}$ and $\mathcal{F}$, respectively.
For the two-qubit repetition code, $Z_1Z_2 \in \mathcal{G}_2$ is such
an operator:
\begin{align}
Z_1Z_2 E \ket{\psi}_\text{L} &= (+1) E \ket{\psi}_\text{L}
\hspace*{3mm} \forall
@@ -1225,13 +1234,14 @@ For the two-qubit code, $Z_1Z_2$ is such an operator:
.%
\end{align}
$E \in \left\{ X,I \right\}$ is an operator describing a possible
single-qubit error and $E \ket{\psi}_\text{L}$ is the resulting state
after that error.
By measuring the corresponding eigenvalue, we can determine if
$E\ket{\psi}_\text{L}$ lies in $\mathcal{C}$ or $\mathcal{F}$.
% TODO: If necessary, cite \cite[Sec.~3]{roffe_quantum_2019} for the
% non-compromising measurement of the information
To do this without directly observing, and thus potentially
collapsing, the logical state $\ket{\psi}_\text{L}$, we prepare an
ancilla qubit with state $\ket{0}_\text{A}$ and entangle it with
$\ket{\psi}_\text{L}$ in such a way that the eigenvalue is indicated
by measuring the ancilla qubit instead.
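To make the eigenvalue argument concrete, a sketch (our own numerical check, not thesis code) of $Z_1Z_2$ acting on a logical state of the two-qubit code and on a corrupted version of it:

```python
import numpy as np

# Sketch: Z1Z2 distinguishes the code space C = span{|00>, |11>} from the
# error space F = span{|01>, |10>} via its eigenvalue, without revealing
# the amplitudes alpha and beta.
Z = np.array([[1, 0], [0, -1]])
X = np.array([[0, 1], [1, 0]])
I = np.eye(2)
ZZ = np.kron(Z, Z)

alpha, beta = 0.6, 0.8
psi_L = alpha * np.kron([1, 0], [1, 0]) + beta * np.kron([0, 1], [0, 1])

assert np.allclose(ZZ @ psi_L, +psi_L)  # no error: eigenvalue +1

err = np.kron(X, I) @ psi_L             # bit-flip on the first qubit
assert np.allclose(ZZ @ err, -err)      # eigenvalue -1 flags the error space
```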
@@ -1296,11 +1306,11 @@ This effect is referred to as error \emph{digitization}
% The stabilizer group
Operators such as $Z_1Z_2$ above are called \emph{stabilizers}.
More generally, an operator $P_i \in \mathcal{G}_n$ is a stabilizer of an
$\llbracket n, k, d_\text{min} \rrbracket$ code $\mathcal{C}$, if
\begin{itemize}
\item It stabilizes all logical states, i.e.,
$P_i\ket{\psi}_\text{L} = (+1)\ket{\psi}_\text{L}, ~\forall~
\ket{\psi}_\text{L} \in \mathcal{C}$.
\item It commutes with all other stabilizers $P_j$ of the code,
i.e., $[P_i, P_j] = 0$.
@@ -1316,8 +1326,8 @@ Formally, we define the \emph{stabilizer group} $\mathcal{S}$ as
[P_i,P_j] = 0 ~\forall~ i,j\right\}
.%
\end{align*}
In particular, we care about the commutation properties of
stabilizers with respect to possible errors.
The measurement circuit for an arbitrary stabilizer $P_i$ modifies
the state as \cite[Eq.~29]{roffe_quantum_2019}
\begin{align*}
@@ -1350,6 +1360,7 @@ If a given error $E$ anticommutes with $P_i$, we have
\end{align*}
and measuring the ancilla $\text{A}_i$ corresponding to stabilizer
$P_i$ returns 1.
Similarly, if it commutes, the ancilla measurement returns 0.
%%%%%%%%%%%%%%%%
\subsection{Stabilizer Codes}
@@ -1357,9 +1368,10 @@ $P_i$ returns 1.
% Structure of a stabilizer code
Stabilizer codes are the quantum analogue of classical binary linear
block codes, for which we use $n-k$ parity checks
to reduce the degrees of freedom introduced by the encoding operation.
Effectively, each parity check defines a local code splitting the
vector space in half, with only one part containing valid codewords.
The global code is the intersection of all local codes.
We can do the same in the quantum case.
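The halving picture can be illustrated classically (a sketch of our own, using the $[7,4]$ Hamming code as an assumed example): starting from all $2^7$ words, each added parity check cuts the count of valid words in half, leaving the $2^4$ codewords.

```python
import numpy as np
from itertools import product

# Sketch: each parity check of the [7,4] Hamming code halves the set of
# valid words; the global code is the intersection of the local codes.
H = np.array([[0, 0, 0, 1, 1, 1, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [1, 0, 1, 0, 1, 0, 1]])

words = np.array(list(product([0, 1], repeat=7)))
counts = [len(words)]
for r in range(3):  # apply the checks one at a time
    ok = ((words @ H[: r + 1].T) % 2 == 0).all(axis=1)
    words = words[ok]
    counts.append(len(words))
print(counts)  # → [128, 64, 32, 16]
```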
@@ -1377,19 +1389,23 @@ operators $P_i$, each using a circuit as explained in
\Cref{subsec:Stabilizer Measurements}.
Note that this is an abstract representation of the syndrome extraction.
For the actual implementation in hardware, we can transform this into
a circuit that requires only CNOT and $H$-gates
\cite[Sec.~10.5.8]{nielsen_quantum_2010}.
% Logical operators
In order to modify the logical state encoded using the physical
qubits, we can use \emph{logical operators} \cite[Sec.~4.2]{roffe_quantum_2019}.
For an $\llbracket n,k \rrbracket$ stabilizer code, there exist
logical operators generated by $2k$ representatives $\overline{X}_i,
\overline{Z}_j,~i,j\in[1:k]$ such that
\begin{itemize}
\item They commute with all stabilizers in $\mathcal{S}$.
\item For $i=j$, they anti-commute with one another, i.e., $[
\overline{X}_i, \overline{Z}_i ]_{+} = \overline{X}_i
\overline{Z}_i + \overline{Z}_i \overline{X}_i = 0$.
\item For $i\neq j$, they commute with one another, i.e., $[ \overline{X}_i,
\overline{Z}_j ] = \overline{X}_i \overline{Z}_j -
\overline{Z}_j \overline{X}_i = 0$.
\end{itemize}
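These commutation properties can be checked numerically; as a sketch (our own choice of representatives, assuming the two-qubit code with stabilizer $Z_1Z_2$, $\overline{X} = X_1X_2$, and $\overline{Z} = Z_1$):

```python
import numpy as np

# Sketch (assumption): logical-operator representatives for the two-qubit
# code with the single stabilizer Z1Z2.
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
I = np.eye(2)

S = np.kron(Z, Z)      # stabilizer Z1Z2
Xbar = np.kron(X, X)   # logical bit-flip X1X2
Zbar = np.kron(Z, I)   # logical phase-flip Z1

assert np.allclose(S @ Xbar, Xbar @ S)           # commute with the stabilizer
assert np.allclose(S @ Zbar, Zbar @ S)
assert np.allclose(Xbar @ Zbar, -(Zbar @ Xbar))  # [Xbar, Zbar]_+ = 0
```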
We can also measure these operators to find out the logical state a
@@ -1399,22 +1415,22 @@ physical state corresponds to \cite[Sec.~2.6]{derks_designing_2025}.
% TODO: Do I have to introduce before that stabilizers only need X
% and Z operators?
We can represent stabilizer codes using a binary \emph{check matrix}
$\bm{H} \in \mathbb{F}_2^{(n-k)\times(2n)}$
\cite[Sec.~10.5.1]{nielsen_quantum_2010} with
\begin{align*}
\bm{H} = \left[
\begin{array}{c|c}
\bm{H}_X & \bm{H}_Z
\end{array}
\right]
.%
\end{align*}
This is similar to a classical \ac{pcm} in that it contains $n-k$
rows, each describing one constraint. Each constraint restricts an additional
degree of freedom of the higher-dimensional space we use to introduce
redundancy.
In contrast to the classical case, this matrix has $2n$ columns,
as we have to consider both the $X$ and $Z$ type operators that make up
the stabilizers.
Take for example the Steane code \cite[Eq.~10.83]{nielsen_quantum_2010}.
@@ -1433,8 +1449,8 @@ We can describe it using the check matrix
\right]
.%
\end{align}
The first $n$ columns correspond to $X$ stabilizers acting on the
corresponding physical qubit, the rest to the $Z$ stabilizers.
\begin{figure}[t]
\centering
@@ -1463,27 +1479,27 @@ corresponding physical qubit, the rest to the $Z$ operators.
% Intro
Stabilizer codes are especially practical to work with when the
stabilizers can be split into one subset consisting only of
$Z$ stabilizers and one consisting only of $X$ stabilizers.
As $Z$ errors anti-commute with $X$ operators in the stabilizers and
vice versa, this property translates into being able to correct $X$
and $Z$ errors independently.
We call such codes \ac{css} codes.
We can see this property in \Cref{eq:steane} in the check matrix
of the Steane code.
% Construction
We can exploit this separate consideration of $X$ and $Z$ stabilizers in
the construction of \ac{css} codes.
We combine two binary linear codes $\mathcal{C}_1$ and
$\mathcal{C}_2$, each responsible for correcting either $Z$ or $X$ errors
\cite[Sec.~10.5.6]{nielsen_quantum_2010}.
Using the dual code of $\mathcal{C}_2$ \cite[Eq.~3.4]{ryan_channel_2009}
\begin{align*}
\mathcal{C}_2^\perp := \left\{ \bm{x}' \in \mathbb{F}_2^n :
\bm{x}' \bm{x}^\mathsf{T} = 0 ~\forall \bm{x} \in \mathcal{C}_2 \right\}
,%
\end{align*}
we define $\bm{H}_X$ as the \ac{pcm} of $\mathcal{C}_2^\perp$ and $\bm{H}_Z$
@@ -1501,7 +1517,7 @@ In order to yield a valid stabilizer code, $\mathcal{C}_1$ and
$\mathcal{C}_2$ must satisfy the commutativity condition
\begin{align}
\label{eq:css_condition}
\bm{H}_X \bm{H}_Z^\mathsf{T} = \bm{0}
.%
\end{align}
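For the Steane code, which uses the $[7,4]$ Hamming \ac{pcm} for both $\bm{H}_X$ and $\bm{H}_Z$, the commutativity condition can be verified directly (a sketch of our own; the matrix is the standard Hamming \ac{pcm}):

```python
import numpy as np

# Sketch: the Steane code takes H_X = H_Z = the [7,4] Hamming parity-check
# matrix; since the Hamming code contains its dual, H_X H_Z^T = 0 mod 2.
H_hamming = np.array([[0, 0, 0, 1, 1, 1, 1],
                      [0, 1, 1, 0, 0, 1, 1],
                      [1, 0, 1, 0, 1, 0, 1]])
H_X = H_Z = H_hamming

assert np.all((H_X @ H_Z.T) % 2 == 0)  # commutativity condition over F_2
print("H_X H_Z^T = 0 over F_2")
```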
We can ensure this by choosing $\mathcal{C}_1$ and $\mathcal{C}_2$
@@ -1516,15 +1532,15 @@ such that $\mathcal{C}_2 \subset \mathcal{C}_1$.
Various methods of constructing \ac{qec} codes exist
\cite{swierkowska_eccentric_2025}.
Topological codes, for example, encode information in the features of
a lattice in a way that allows for local interactions between qubits.
Among these, the \emph{surface code} is the most widely studied.
Another example is that of concatenated codes, which nest one code within
another, allowing for especially simple and flexible constructions
\cite[Sec.~3.2]{swierkowska_eccentric_2025}.
An area of research that has recently seen more attention is that of
quantum \ac{ldpc} (\acs{qldpc}) codes.
They have a much higher rate than, e.g., surface codes, scaling up of
which would be prohibitively expensive
\cite[Sec.~I]{bravyi_high-threshold_2024}.
% Bivariate Bicycle codes
@@ -1536,7 +1552,7 @@ $\bm{H}_Z$ are constructed from two matrices $\bm{A}$ and $\bm{B}$ as
\begin{align*} \begin{align*}
\bm{H}_X = [\bm{A} \vert \bm{B}] \bm{H}_X = [\bm{A} \vert \bm{B}]
\hspace*{5mm} \text{and} \hspace*{5mm} \hspace*{5mm} \text{and} \hspace*{5mm}
\bm{H}_Z = [\bm{B}^\text{T} \vert \bm{A}^\text{T}] \bm{H}_Z = [\bm{B}^\mathsf{T} \vert \bm{A}^\mathsf{T}]
.% .%
\end{align*} \end{align*}
This way, we can guarantee the satisfaction of the commutativity This way, we can guarantee the satisfaction of the commutativity
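As a quick numerical sanity check (not part of the thesis text), the commutativity condition $\bm{H}_X\bm{H}_Z^\mathsf{T} = \bm{A}\bm{B} + \bm{B}\bm{A} = \bm{0}$ over $\mathbb{F}_2$ can be verified for a concrete choice of $\bm{A}$ and $\bm{B}$. The sketch below assumes the parameters reported for the $\llbracket 144,12,12 \rrbracket$ \acs{bb} code ($\ell = 12$, $m = 6$, $\bm{A} = x^3 + y + y^2$, $\bm{B} = y^3 + x + x^2$); treat those concrete polynomials as an assumption here.

```python
import numpy as np

def cyclic_shift(n):
    """n x n cyclic shift (circulant permutation) matrix S with S^n = I."""
    return np.roll(np.eye(n, dtype=int), 1, axis=1)

# Assumed parameters of the [[144,12,12]] bivariate bicycle code:
# l = 12, m = 6, A = x^3 + y + y^2, B = y^3 + x + x^2.
l, m = 12, 6
x = np.kron(cyclic_shift(l), np.eye(m, dtype=int))  # shift on first factor
y = np.kron(np.eye(l, dtype=int), cyclic_shift(m))  # shift on second factor
A = (np.linalg.matrix_power(x, 3) + y + y @ y) % 2
B = (np.linalg.matrix_power(y, 3) + x + x @ x) % 2

H_X = np.hstack([A, B])       # H_X = [A | B]
H_Z = np.hstack([B.T, A.T])   # H_Z = [B^T | A^T]

# H_X H_Z^T = A B + B A = 0 over F_2, since x and y (and hence A and B) commute.
assert np.all((H_X @ H_Z.T) % 2 == 0)
print(H_X.shape)              # (72, 144): 72 X-checks on n = 144 qubits
```

Because $\bm{A}$ and $\bm{B}$ are polynomials in the commuting shifts $x$ and $y$, the check succeeds by construction, which is exactly the guarantee the text refers to.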
This necessitates a modification of the standard \ac{bp} algorithm
introduced in \Cref{subsec:Iterative Decoding}
\cite[Sec.~3.1]{yao_belief_2024}.
Instead of attempting to find the most likely codeword directly, the
syndrome-based decoding algorithm tries to find an error pattern
$\hat{\bm{e}} \in \mathbb{F}_2^n$ that satisfies
\begin{align*}
\bm{H} \hat{\bm{e}}^\mathsf{T} = \bm{s}
.%
\end{align*}
To this end, we initialize the channel \acp{llr} as
\begin{align*}
\tilde{L}_i = \log{\frac{P(X_i = 0)}{P(X_i = 1)}} = \log{
\left( \frac{1 - p_i}{p_i} \right)
}
,%
\end{align*}
where $p_i$ is the prior probability of error of \ac{vn} $i$.
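A minimal sketch of this initialization and of the target condition $\bm{H}\hat{\bm{e}}^\mathsf{T} = \bm{s}$ (the helper names are ours, purely illustrative):

```python
import numpy as np

def channel_llrs(p):
    """Channel LLRs L_i = log((1 - p_i) / p_i) from prior error probabilities."""
    p = np.asarray(p, dtype=float)
    return np.log((1.0 - p) / p)

def syndrome_satisfied(H, e_hat, s):
    """Stopping criterion of syndrome-based decoding: H e^T = s over GF(2)."""
    return np.array_equal((H @ e_hat) % 2, np.asarray(s) % 2)

# Toy check with a length-3 repetition code and a single error on bit 1.
H = np.array([[1, 1, 0],
              [0, 1, 1]])
e = np.array([0, 1, 0])
s = (H @ e) % 2                       # observed syndrome
assert syndrome_satisfied(H, e, s)
print(channel_llrs([0.1, 0.1, 0.1]))  # positive LLRs: "no error" is favored
```

For $p_i < 0.5$ the LLRs are positive, i.e., the decoder initially believes each \ac{vn} is error-free, which matches the definition above.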
The resulting syndrome-based \ac{bp} algorithm is shown in
\right\}$
\EndFor
\If{$\bm{H}\hat{\bm{e}}^\mathsf{T} = \bm{s}$}
\State \textbf{break}
\EndIf
This way, we obtain the \ac{ler}.
\mathbbm{1}\left\{ L^\text{total}_i \right\}$
\EndFor
\If{$\bm{H}\hat{\bm{e}}^\mathsf{T} = \bm{s}$}
\State \textbf{break}
\Else
\State $i_\text{max} \leftarrow \argmax_{i \in \mathcal{I}'} \lvert L^\text{total}_i \rvert $
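The greedy selection step in the fragment above, $i_\text{max} = \argmax_{i \in \mathcal{I}'} \lvert L^\text{total}_i \rvert$, picks the most reliable still-undecided \ac{vn} as the next decimation candidate. A small illustrative sketch (function and variable names are our own):

```python
import numpy as np

def most_reliable_vn(L_total, undecided):
    """Return the undecided variable node with the largest |L^total|,
    i.e., the next candidate to fix in BP guided decimation."""
    idx = np.array(sorted(undecided))
    return int(idx[np.argmax(np.abs(L_total[idx]))])

# VN 2 carries the most reliable (largest-magnitude) total LLR.
L = np.array([0.3, -1.2, 4.5, -0.1])
print(most_reliable_vn(L, {0, 2, 3}))  # → 2
```

Note that the magnitude of the LLR, not its sign, encodes reliability; the sign only determines the value the decimated \ac{vn} is fixed to.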


using qubits.
While the use of error correcting codes may facilitate this, it also
introduces two new challenges \cite[Sec.~4]{gottesman_introduction_2009}:
\begin{itemize}
\item To realize a quantum algorithm, we must be able to
perform operations on the encoded state in such a way that we
do not lose the protection against errors.
\item \ac{qec} systems, in particular the syndrome extraction
circuit, are themselves partially implemented in
noisy quantum hardware.
In addition to the errors we have originally introduced them
for, these systems must therefore be able to account for the
fact they are implemented on noisy hardware themselves.
\end{itemize}
In the literature, both of these points are viewed under the umbrella
of \emph{fault-tolerant} quantum computing.
In this thesis, we focus on the second aspect.
It was recognized early on as a challenge of \ac{qec} that the correction
machinery itself may introduce new faults \cite[Sec.~III]{shor_scheme_1995}.
address both.
We model the possible occurrence of errors during any processing
stage as different \emph{error locations} $E_i,~i\in [1:N]$
in the circuit.
The parameter $N \in \mathbb{N}$ is the total number of considered
error locations.
The \emph{circuit error vector} $\bm{e} \in \{0,1\}^N$ is a vector
indicating which errors occurred, with
\begin{align*}
e_i :=
\begin{cases}
1, & \text{error $E_i$ occurred}, \\
0, & \text{otherwise}.
\end{cases}
\end{align*}
\Cref{fig:fault_tolerance_overview} illustrates the flow of errors.
Specifically for \ac{css} codes, a \ac{qec} procedure is deemed
fault-tolerant, if \cite[Def.~4.2]{derks_designing_2025}
where $t = \lfloor (d_\text{min} -1)/2 \rfloor$ is the number of
errors the code is able to correct.
The vectors $\bm{e}_{\text{output},X}$ and $\bm{e}_{\text{output},Z}$
denote only $X$ and $Z$ errors, respectively.
% TODO: Properly introduce d_min for QEC, specifically for CSS codes
In order to deal with internal errors that flip syndrome bits,
multiple rounds of syndrome measurements are performed.
Typically, the number of syndrome extraction rounds is chosen as
$d_\text{min}$, e.g., \cite{gong_toward_2024}%
\cite{koutsioumpas_automorphism_2025}.
% % This is the definition of a fault-tolerant QEC gadget
% A \ac{qec} procedure is deemed fault tolerant if
% Intro
We collect the probabilities of error at each location in the
\emph{noise model}, represented by a vector $\bm{p} \in [0,1]^N$.
There are different types of noise models, each allowing for
different error locations in the circuit.
$\ket{\psi}_\text{L}$ as \emph{data qubits}.
Note that this is a concrete implementation using CNOT gates, as
opposed to the system-level view introduced in
\Cref{subsec:Stabilizer Codes}.
\Cref{fig:noise_model_types} visualizes the different types of noise models.
%%%%%%%%%%%%%%%%
\subsection{Bit-Flip Noise}
This corresponds to the classical \ac{bsc}, i.e., only $X$ errors on the
data qubits are possible \cite[Appendix~A]{gidney_new_2023}.
The occurrence of bit-flip errors is modeled as a Bernoulli process
$\text{Bern}(p)$.
\Cref{subfig:bit_flip} shows this type of noise model.
Note that bit-flip noise is not suitable for developing fault-tolerant
systems, as it does not account for errors during the syndrome extraction.
Here, we consider multiple rounds of syndrome measurements with a
depolarizing channel before each round.
Additionally, we allow for measurement errors by having $X$ error
locations right before each measurement \cite[Appendix~A]{gidney_new_2023}.
Note that it is enough to only consider $X$ errors before measuring,
since that is the only type of error directly affecting the
measurement outcomes.
This model is depicted in \Cref{subfig:phenomenological}.
While phenomenological noise is useful for some design aspects of
fault-tolerant circuitry, for simulations, circuit-level noise should
always be used \cite[Sec.~4.2]{derks_designing_2025}.
Note that this introduces new challenges during the decoding process,
as the decoding complexity is considerably increased due to the many
error locations.
\begin{figure}[t]
framework for
passing information about a circuit used for \ac{qec} to a decoder.
They are also useful as a theoretical tool to aid in the design of
fault-tolerant \ac{qec} schemes, e.g., they can be used to easily
determine whether a measurement schedule is fault-tolerant
\cite[Example~12]{derks_designing_2025}.
Other approaches to implementing fault-tolerant circuits exist, e.g.,
flag error correction, which uses additional ancilla qubits to detect
potentially damaging high-weight errors \cite[Sec.~1]{chamberland_flag_2018}.
However, \acp{dem} offer some unique advantages
To achieve fault tolerance, the goal we strive towards is to
consider the internal errors in addition to the input errors during
the decoding process.
The core idea behind detector error models is to do this by defining
a new \emph{circuit code} describing the whole circuit.
Each \ac{vn} of this new code corresponds to an error location in the
circuit and each \ac{cn} corresponds to a syndrome measurement.
% This circuit code, combined with the prior probabilities of error
matrix} $\bm{\Omega} \in \mathbb{F}_2^{M\times N}$, with
\begin{align*}
\Omega_{\ell,i} =
\begin{cases}
1, & \text{error $i$ flips measurement $\ell$},\\
0, & \text{otherwise},
\end{cases}
\end{align*}
where $M \in \mathbb{N}$ is the number of performed syndrome measurements.
To obtain $\bm{\Omega}$, we must propagate Pauli errors through the
circuit, tracking which measurements they affect
\cite[Sec.~2.4]{derks_designing_2025}.
Each round yields an additional set of syndrome bits,
and we combine them by stacking them in a new vector
$\bm{s} \in \mathbb{F}_2^{R(n-k)}$, where $R \in \mathbb{N}$ is the
number of syndrome measurement rounds.
Thus, we have to replicate the rows of $\bm{H}_Z$, once for each
additional syndrome measurement, and obtain
\begin{align*}
\bm{\Omega}_0 =
\begin{pmatrix}
extraction circuitry, so we still consider only bit flip noise at this stage.
Recall that $\bm{\Omega}_0$ describes which \ac{vn} is connected to
which parity check and the syndrome indicates which parity checks
are violated.
Therefore, if an error occurs that corresponds to a single \ac{vn},
the measured syndrome is the corresponding column.
If errors occur at multiple locations, the resulting syndrome will be
the linear combination of the respective columns.
Thus, we have
\begin{align*}
\bm{s} \in \text{span} \{\bm{\Omega}_0\}
.%
% Expand to phenomenological
Next, we expand the error model to phenomenological noise, though
only considering $X$ errors in this case.
We introduce new error locations at the appropriate positions,
resulting in the circuit depicted in
\Cref{fig:rep_code_multiple_rounds_phenomenological}.
For each additional error location, we extend $\bm{\Omega}_0$ by
appending the corresponding syndrome vector as a column, yielding
\begin{gather}
\label{eq:syndrome_matrix_ex}
\bm{\Omega}_1 =
extraction round.
\begin{figure}[t]
\begin{gather*}
\hspace*{-31.8mm}%
\begin{array}{c}
E_6 \\
\downarrow
to a detector.
We should note at this point that the combination of measurements
into detectors has no bearing on the actual construction of the
syndrome extraction circuitry.
It is something that happens ``virtually'' and only affects the decoder.
Note that we can use the detector matrix $\bm{D}$ to describe the set
of possible measurement outcomes under the absence of noise.
Similar to the way we use a \ac{pcm} to describe the code space as
\begin{equation*}
\mathcal{C}
= \{ \bm{x} \in \mathbb{F}_2^{n} : \bm{H}\bm{x}^\mathsf{T} = \bm{0} \}
,%
\end{equation*}
the set of possible measurement outcomes is simply $\text{kern}\{\bm{D}\}$
affect the measurements (through $\bm{\Omega}$), and we know how the
measurements relate to the detectors (through $\bm{D}$).
For decoding, we are interested in the effect of the errors on the
detectors directly.
Thus, we construct the \emph{detector error matrix} $\bm{H} \in
\mathbb{F}_2^{D\times N}$ \cite[Def.~2.9]{derks_designing_2025} as
\begin{align*}
\bm{H} := \bm{D}\bm{\Omega}
violate the same set of detectors, i.e.,
\begin{align*}
\hspace{-15mm}
% tex-fmt: off
&& \bm{H} \bm{e}_1^\mathsf{T} & \neq \bm{H} \bm{e}_2^\mathsf{T} \\
\iff \hspace{-33mm} && \bm{H} \left( \bm{e}_1 - \bm{e}_2 \right)^\mathsf{T} & \neq 0 \\
\iff \hspace{-33mm} && \bm{D} \bm{\Omega} \left( \bm{e}_1 - \bm{e}_2 \right)^\mathsf{T} & \neq 0 \\
\iff \hspace{-33mm} && \bm{\Omega} \left( \bm{e}_1 - \bm{e}_2 \right)^\mathsf{T} & \notin \text{kern} \{\bm{D}\}
% tex-fmt: on
.%
\end{align*}
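The chain of equivalences above can be illustrated on a toy example. The matrices below are hypothetical, chosen so that $\bm{\Omega}$ has the triangular structure of a persistent error and $\bm{D}$ differences consecutive rounds of a single repeated measurement:

```python
import numpy as np

# Omega: N = 4 error locations, M = 4 rounds of one repeated measurement.
# Each error flips its own round and all subsequent ones (triangular form).
Omega = np.tril(np.ones((4, 4), dtype=int))
# D: detector r compares measurement r with measurement r - 1.
D = (np.eye(4, dtype=int) + np.eye(4, k=-1, dtype=int)) % 2
H = (D @ Omega) % 2                  # detector error matrix H = D Omega

e1 = np.array([1, 0, 0, 0])          # error before round 1
e2 = np.array([0, 1, 0, 0])          # error before round 2
# Distinguishable: H e1 != H e2, equivalently Omega (e1 - e2) not in kern{D}.
assert not np.array_equal((H @ e1) % 2, (H @ e2) % 2)
print(H)                             # identity: each error hits one detector
```

Here $\bm{H} = \bm{D}\bm{\Omega}$ collapses to the identity, so each error location triggers exactly one detector, the discrete analogue of the block-diagonal structure discussed later for the repetition-code example.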
It may, however, change the decoding performance when using a practical decoder.
What constitutes a good set of detectors is difficult to assess
without performing explicit decoding simulations, since it ultimately
depends on the employed decoder.
For iterative decoders, high sparsity is generally beneficial, but
finding detectors that maximize sparsity is an NP-complete problem
\cite[Sec.~2.6]{derks_designing_2025}.
at a later stage.
To the measurement results from each syndrome extraction round we
can add the results from the previous round, as illustrated in
\Cref{fig:detectors_from_measurements_general}.
Thus, we have $D=n-k$.
Concretely, we denote the outcome of
measurement $\ell \in [1:n-k]$ in round $r \in [1:R]$ by
$m_\ell^{(r)} \in \mathbb{F}_2$
note that the error $E_6$ in
\Cref{fig:rep_code_multiple_rounds_phenomenological} has not only
triggered the measurements in the syndrome extraction round immediately
afterwards, but all subsequent ones as well.
To only see the effect of errors in the syndrome measurement round
immediately following them, we consider our newly defined detectors
instead of the measurements.
These effectively compute the difference between the measurements.
Each error can only trigger syndrome bits that follow it.
This is reflected in the triangular structure of $\bm{\Omega}$ in
Combining the measurements into detectors according to
\Cref{eq:measurement_combination}, we are effectively performing
row additions in such a way as to clear the bottom left of the matrix.
The resulting detector error matrix
\begin{align*}
\bm{H} =
\left(
\end{array}
\right)
\end{align*}
has a block-diagonal structure.
Note that we exploit the fact that each syndrome measurement round is
identical to obtain this structure.
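The detector construction itself, XOR-ing each round of measurements with the previous one, can be sketched as follows (function name is illustrative; the first round is compared against an all-zero reference):

```python
import numpy as np

def detectors_from_measurements(m):
    """XOR each syndrome measurement round with the previous round.

    m: (R, n-k) array of outcomes m_l^(r) over F_2; round 1 is
    compared against an all-zero reference round."""
    prev = np.vstack([np.zeros((1, m.shape[1]), dtype=int), m[:-1]])
    return (m + prev) % 2

# A persistent error first seen in round 2 fires a detector only in round 2,
# even though it flips the raw measurement in every subsequent round.
m = np.array([[0, 0, 0],
              [0, 1, 0],
              [0, 1, 0],
              [0, 1, 0]])
print(detectors_from_measurements(m)[:, 1])  # → [0 1 0 0]
```

This is exactly the localization property discussed above: the detectors expose an error only in the round immediately following it.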
error matrix $\bm{H}$ and the noise model $\bm{p}$.
\cite[Sec.~6]{derks_designing_2025}.
It serves as an abstract representation of a circuit and can be used
both to transfer information to a decoder and to aid in the
design of fault-tolerant systems, e.g., it can be used to investigate
the properties of a circuit with respect to fault tolerance.
It contains all information necessary for the decoding process.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
value, the physical error rate $p_\text{phys}$.
% Per-round LER
Another important aspect to consider is the meaning of the
\ac{ler} in the context of a \ac{qec} system with multiple
rounds of syndrome measurements.
In order to facilitate the comparability of results obtained from
The simplest way of calculating the per-round \ac{ler} is by modeling
each round as an independent experiment.
For each experiment, an error might occur with a certain probability
$p_\text{e,round}$.
Then the overall probability of error is
\begin{align}
\hspace{-12mm}
p_\text{e,total} &= 1 - (1 - p_\text{e,round})^{R} \nonumber\\
\Rightarrow \quad
p_\text{e,round} &= 1 - (1 - p_\text{e,total})^{1/R}
\label{eq:per_round_ler}
.%
\hspace{12mm}
\end{align}
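Inverting the compounding relation $p_\text{e,total} = 1 - (1 - p_\text{e,round})^{R}$ for the per-round \ac{ler} is straightforward to compute; a minimal sketch:

```python
def per_round_ler(p_total, R):
    """Per-round LER from the total LER, assuming R independent,
    identically distributed syndrome measurement rounds:
    p_round = 1 - (1 - p_total)^(1/R)."""
    return 1.0 - (1.0 - p_total) ** (1.0 / R)

# Round-trip check: compounding the per-round LER over R rounds
# recovers the total LER estimated by the Monte Carlo simulation.
p_total, R = 0.1, 12
p_round = per_round_ler(p_total, R)
assert abs(1.0 - (1.0 - p_round) ** R - p_total) < 1e-12
```

Note that for small error rates, $p_\text{e,round} \approx p_\text{e,total}/R$, which is a common first-order approximation.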
To this end, we approximate $p_\text{e,total}$ using a Monte Carlo
simulation and compute the per-round \ac{ler} according to
\Cref{eq:per_round_ler}.
This is the approach taken in \cite{gong_toward_2024}\cite{wang_fully_2025}.
Another approach \cite{chen_exponential_2021}%
\cite{bausch_learning_2024}\cite{beni_tesseract_2025} is to assume an
exponential decay for the \emph{logical fidelity} of the decoder
\cite[Eq.~(2)]{bausch_learning_2024}
\begin{align*}
F_\text{total} = (F_\text{round})^{R}
topic to our own work.
\subsection{Stim}
\label{subsec:Stim}
It is not immediately apparent how the \ac{dem} will look from
considering the \ac{pcm} of a code, because it heavily depends on the
exact circuit construction and choice of noise model.
As we noted in \Cref{subsec:Measurement Syndrome Matrix}, we
obtain a measurement syndrome matrix by propagating Pauli frames
through the circuit.
The standard choice of simulation tool used for this purpose is
pypi package.
In fact, it was in this tool that the concept of the \ac{dem} was
first introduced.
One capability of stim, and \acp{dem} in general, that we did not
explain in detail in this chapter is the merging of error mechanisms.
Since \acp{dem} differentiate errors based on their effect on the
measurements and not on their Pauli type and location
\cite[Sec.~1.4.3]{higgott_practical_2024}, it is natural to group
errors that have the same effect, i.e., syndrome.
This slightly lowers the computational complexity of decoding, as the
number of resulting \acp{vn} is reduced.
While stim is a useful tool for circuit simulation, it does not
include many utilities for building syndrome extraction circuitry automatically.
The user has to define most, if not all, of the circuit manually,
depending on the code in question.


\chapter{Decoding under Detector Error Models}
\label{ch:Decoding}
In \Cref{ch:Fundamentals}, we introduced the fundamentals of classical
error correction, before turning to quantum information science and
finally combining the two in \acf{qec}.
In \Cref{ch:Fault tolerance}, we then considered fault-tolerance, with
a focus on a specific way of implementing it, called \acfp{dem}.
In this chapter, we move on from the fundamental concepts and examine
how to apply them in practice.
Specifically, we consider the practical aspects of decoding under \acp{dem}.
In particular, we investigate decoding \acf{qldpc} codes under \acp{dem}.
We focus on \ac{qldpc} codes, as they have emerged as leading
candidates for practical quantum error correction, offering
good thresholds with substantially improved encoding rates
\cite[Sec.~1]{bravyi_high-threshold_2024}.
Because of this, the decoding algorithms we consider will all be
based on \acf{bp}.
Our aim is to build a fault-tolerant \ac{qec} system that works well
even in the presence of circuit-level noise.
We must overcome two main challenges to achieve this.
First, recall the problems related to degeneracy, which is inherent
to quantum codes.
Because multiple minimum-weight solutions to the decoding problem may
exist, the \ac{bp} algorithm becomes uncertain of the direction to proceed in.
Additionally, the commutativity conditions of the stabilizers
necessitate the existence of short cycles.
Together, these two aspects lead to substantial convergence problems
of \ac{bp} for quantum codes, when employed on its own.
Second, the consideration of circuit-level noise introduces many more
error locations into the circuit.
We also perform multiple rounds of syndrome measurements,
exacerbating the problem.
This leads to a massively increased computational complexity and
latency of the decoding process.
For example, in our experiments using the $\llbracket 144,12,12
\rrbracket$ \acf{bb} code with $12$ syndrome measurement rounds, the
number of \acp{vn} grew from $144$ to $9504$, and the number of
\acfp{cn} grew from $72$ to $1008$.
The first problem is not inherent to \acp{dem} or fault-tolerance,
but rather quantum codes in general.
Many different approaches to solving it exist, usually centered
around modifying \ac{bp}.
The most popular approach is combining a few initial iterations of
\ac{bp} with a second decoding algorithm, \ac{osd}
\cite{roffe_decoding_2020}.
Other approaches exist, such as \ac{aed}
\cite{koutsioumpas_automorphism_2025}, where multiple variations of
the syndrome, based on graph and code symmetries, are decoded
simultaneously to increase the chances of convergence.
Here, we will focus on the \acf{bpgd} algorithm
\cite{yao_belief_2024} introduced in \Cref{ch:Fundamentals}.
The second problem is inherent to decoding using \acp{dem}.
This is an area that has so far received less attention in the literature.
As discussed in \Cref{sec:Quantum Error Correction}, for \ac{qec},
latency is the main constraint, not raw computational complexity.
The main way this is addressed in the literature is \emph{sliding
window decoding}, which attempts to divide the overall decoding
problem into many smaller ones that can be solved more efficiently.
% TODO: This could potentially be a bit more text (e.g., go into
% SC-LDPC like structure that serves as the inspiration for the
% warm-start decoding. Or just go into warm-start decoding)
In this thesis, we will focus mostly on the solution of the second
problem using sliding-window decoding.
We will start by briefly reviewing the existing work related to
sliding-window decoding,
Each of these windows is then decoded separately.

% Some general notes
\Cref{fig:literature} gives an overview of the existing works
related to sliding-window decoding.
The papers \cite{huang_improved_2023} and \cite{huang_increasing_2024} are
lumped together, as they share the same content;
software freely available online%
\footnote{
\url{https://github.com/gongaa/SlidingWindowDecoder}
}.
Finally, note that \cite{dennis_topological_2002} never explicitly
mentions sliding windows; the authors call their scheme ``overlapping
recovery''.

% Topological vs QLDPC
Finally, \cite{gong_toward_2024} explores \ac{bb} codes.

% Sequential vs parallel
After having divided the whole circuit into separate windows, the question
arises of how to make use of the window-like structure for decoding.
There are two main approaches, with differing mechanisms of reducing
the latency.
Some papers decode the sliding windows in a parallel fashion.
The benefit in this case is that classical hardware can be
utilized more effectively.
Others choose a sequential approach.
Here, decoding can start earlier, as there is no need to wait for the
syndrome measurements of subsequent windows before beginning with the
decoding of earlier windows.
With the exception of \cite{dennis_topological_2002}, literature
treating topological codes has mostly focused on parallel decoding
while literature treating \ac{qldpc} codes has wholly considered
sequential decoding.

% Deep-dive into QLDPC methods
For this work, the publications treating \ac{qldpc} codes are
particularly interesting.
The experimental conditions for these are summarized in
\Cref{table:experimental_conditions}.
As we noted above, \ac{hgp} and \ac{lp} codes are considered in
The employed noise models also differ;
Finally, in \cite{gong_toward_2024} the authors introduce their own variation of
\ac{bpgd}, \ac{bp} with \ac{gdg}, while \cite{huang_increasing_2024}
and \cite{kang_quits_2025} use \ac{bp} + \ac{osd}.
We would additionally like to note that only
\cite{gong_toward_2024} and \cite{kang_quits_2025}
explicitly work with the \ac{dem} formalism.
sliding-window decoding for \ac{qldpc} codes.}
\vspace*{3mm}
\label{table:experimental_conditions}
\begin{tabular}{lccc}\toprule
% tex-fmt: off
Publication & Code & Noise Model & Decoder \\ \midrule
\hspace{-2.5mm}\cite{huang_improved_2023},\cite{huang_increasing_2024} & \acs{hgp}, \acs{lp} & Phenomenological noise & \acs{bp} + \acs{osd} \\
\hspace{-2.5mm}\cite{gong_toward_2024} & \acs{bb} & Circuit-level noise & \acs{bp} + \acs{gdg} \\
\hspace{-2.5mm}\cite{kang_quits_2025} & \acs{hgp}, \acs{lp}, \acs{bpc} & Circuit-level noise & \acs{bp} + \acs{osd} \\ \bottomrule
% tex-fmt: on
\end{tabular}
\end{table}
\subsection{Window Splitting and Sequential Sliding-Window Decoding}
\label{subsec:Window Splitting and Sequential Sliding-Window Decoding}

In this section, we examine the methodology by which a detector
error matrix is divided into overlapping windows.
The algorithm detailed here follows \cite{kang_quits_2025}, which
is in turn based on \cite{huang_increasing_2024}.
Sliding-window decoding is made possible by the time-like structure
of the syndrome extraction circuitry.
This is especially clearly visible under the \ac{dem} formalism, where
it manifests as a block-diagonal structure of the detector
error matrix $\bm{H}$.
Note that this presupposes a choice of detectors as seen in
\Cref{subsec:Detector Error Matrix}.

After decoding a window, there is a subset of \acp{cn} that
no longer contribute to decoding, since none of their
neighboring \acp{vn} appear in subsequent windows.
We call the set of \acp{vn} connected to those \acp{cn} the
\emph{commit region} and we commit them before moving to the
next window, i.e., we fix the values we estimate for the corresponding bits.
The benefit of this sequential sliding-window decoding approach is
that the decoding process can begin as soon as the syndrome
measurements for the first window are complete.
% W and F and why we look at rows, not columns
The \emph{window size} $W \in \mathbb{N}$ represents the number of
syndrome extraction rounds lumped into one window, while
the \emph{step size} $F \in \mathbb{N}$ represents the number of
syndrome extraction rounds skipped before starting the next window.
The parameter $W$ controls the size of the windows while $F$ controls
the overlap between them.
As illustrated in \Cref{fig:windowing_pcm}, $W$ and $F$ control the
window dimensions and locations by defining the related \acp{cn},
not the \acp{vn}.
This is because the number of overall \acp{cn} is only affected
by the choice of the underlying code and the number of syndrome
measurement rounds, while the number of \acp{vn} depends on the noise
model and is difficult to predict beforehand.
\begin{figure}[t]
\centering
matrix generated from the $\llbracket 72, 6, 6 \rrbracket$
BB code under circuit-level noise.
The block-diagonal structure reflects the time-like locality
of the syndrome extraction circuit, with each block
corresponding to one syndrome measurement round.
Two consecutive windows are highlighted: The window size $W
\in \mathbb{N}$ controls the number of syndrome rounds
included in each window, while the step size $F \in
\mathbb{N}$ controls how many rounds separate the start of
one window from the next.
The bracketed region indicates the commit
region of the first window, i.e., the \acp{vn} that are committed
before moving to the decoding of the second window.
}
\label{fig:windowing_pcm}
\end{figure}
We use the variables $n,m \in \mathbb{N}$ to describe the number of
\acp{vn} and \acp{cn}, respectively.
We index the \acp{vn} using the variable $i \in \mathcal{I} :=
[0:n-1]$ and the \acp{cn} using the variable $j \in \mathcal{J} := [0:m-1]$.
Finally, we call $\mathcal{N}_\text{V}(i) = \left\{ j\in \mathcal{J}:
H_{j,i} = 1 \right\}$ and $\mathcal{N}_\text{C}(j) := \left\{ i
\in \mathcal{I} : H_{j,i} = 1 \right\}$ the neighborhoods of the
respective nodes.
In this case, we take $\bm{H} \in \mathbb{F}_2^{m\times n}$ to be the
check matrix of the underlying code, from which the \ac{dem} was generated.
We use $m_\text{DEM}, \mathcal{I}_\text{DEM}$, and $\mathcal{J}_\text{DEM}$
to refer to the respective values defined for the detector error matrix.
% How we get the corresponding rows
First, we describe the sets of \acp{cn} relevant to each window.
For indexing, we use the variable $\ell \in [0:n_\text{win} - 1]$,
where $n_\text{win} \in \mathbb{N}$ is the number of windows.
Since we define the step size $F$ as the number of syndrome
extraction rounds to skip, the first \ac{cn} of window $\ell$ has index
$\ell F m$.
Similarly, due to the definition of the window size $W$, the
number of \acp{cn} per window is $Wm$ for all but the last window.
The number of \acp{cn} in the last window may differ if there are
not enough \acp{cn} left to completely fill it.
Thus, we define
\begin{align*}
\mathcal{J}_\text{win}^{(\ell)} &:= \left\{ j\in \mathcal{J}_\text{DEM}:~
\ell F m \le j < \min \left\{m_\text{DEM}, (\ell F + W) m \right\}
\right\} \\[2mm]
& \hspace{37mm} \text{and} \\[2mm]
\mathcal{J}_\text{commit}^{(\ell)} &:= \left\{ j\in \mathcal{J}_\text{DEM}:~
\ell F m \le j < \min \left\{m_\text{DEM}, (\ell + 1) F m \right\}
\right\}
,%
\end{align*}
where $\mathcal{J}_\text{win}^{(\ell)}$ is the set of all \acp{cn} in the
window and $\mathcal{J}_\text{commit}^{(\ell)}$ is the set of \acp{cn}
that do not contribute to the next window and whose neighboring
\acp{vn} will thus be committed.
Additionally, we can define the set of \acp{cn} that are shared between windows
$\ell$ and $\ell + 1$ as $\mathcal{J}_\text{overlap}^{(\ell)} :=
\mathcal{J}_\text{win}^{(\ell)}\setminus \mathcal{J}_\text{commit}^{(\ell)}$.
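For concreteness, these three index sets can be computed directly from $W$, $F$, the number $m$ of \acp{cn} per syndrome round, and $m_\text{DEM}$; the following Python sketch uses our own (hypothetical) naming rather than that of any referenced implementation.

```python
def window_check_sets(ell, W, F, m, m_dem):
    """Check-node (row) index sets for window ell.

    W: window size and F: step size, both in syndrome extraction rounds;
    m: number of check nodes per round; m_dem: total rows of the DEM.
    """
    j_win = set(range(ell * F * m, min(m_dem, (ell * F + W) * m)))
    j_commit = set(range(ell * F * m, min(m_dem, (ell + 1) * F * m)))
    j_overlap = j_win - j_commit  # rows shared with window ell + 1
    return j_win, j_commit, j_overlap
```

For the $\llbracket 72, 6, 6 \rrbracket$ \ac{bb} code with $m = 72$, $12$ syndrome rounds, $W = 3$, and $F = 1$, window $0$ then spans \ac{cn} rows $0$ through $215$ and commits rows $0$ through $71$; in the last window, the $\min$ clips both sets to the available rows.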
% How we get the corresponding columns
We now turn our attention to defining the sets of \acp{vn} relevant
to each window.
We first introduce a helper function $i_\text{max} :
\mathcal{P}(\mathbb{N}) \to \mathbb{N}$, which takes a set
$\mathcal{S} \in \mathcal{P}(\mathbb{N})$ of \ac{cn} indices and
returns the largest neighboring \ac{vn} index.
We define
\begin{align*}
i_\text{max}\left( \mathcal{S} \right) := \max \left\{ i\in
where we set $i_\text{max} (\emptyset) = -1$ by convention%
and $\mathcal{I}_\text{win}^{(\ell)}$ appropriately.
}%
.
The commit region of window $\ell$ includes all of the \acp{vn}
neighboring any of the \acp{cn} in $\mathcal{J}_\text{commit}^{(\ell)}$.
Consequently, the maximum index of the \acp{vn} we consider is
$i_\text{max}(\mathcal{J}_\text{commit}^{(\ell)})$.
Additionally, the set of \acp{vn} committed in the next window should
contain the next largest index.
Thus we define
\begin{align*}
\mathcal{I}_\text{commit}^{(\ell)}
&:= \left\{i \in \mathcal{I}_\text{DEM} :~
and after decoding all windows we will therefore have committed all \acp{vn}.
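As an illustration, $i_\text{max}$ and the resulting commit regions can be sketched in Python as follows; the function names and the dictionary representation of the neighborhoods $\mathcal{N}_\text{C}(j)$ are our own assumptions, and we rely on the block-diagonal ordering from above so that each commit region starts directly after the previous commit boundary.

```python
def i_max(cn_indices, neighbors_c):
    """Largest VN index adjacent to any CN in cn_indices; -1 if empty.

    neighbors_c[j] is the set N_C(j) of VN indices neighboring CN j.
    """
    vns = [i for j in cn_indices for i in neighbors_c[j]]
    return max(vns) if vns else -1


def commit_vn_set(ell, j_commit, neighbors_c):
    """VN indices committed after decoding window ell.

    j_commit[ell] is the set of committed CN indices of window ell;
    the commit region starts just above the previous commit boundary.
    """
    lo = i_max(j_commit[ell - 1], neighbors_c) if ell > 0 else -1
    hi = i_max(j_commit[ell], neighbors_c)
    return set(range(lo + 1, hi + 1))
```

On a toy Tanner graph, the union of the per-window commit sets covers every \ac{vn} exactly once, matching the statement above.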
% Syndrome update
\Cref{fig:vis_rep} illustrates the various sets of nodes.
We can also see a subtlety we must handle carefully when
moving on to decode the next window.
While the \acp{vn} in $\mathcal{I}_\text{commit}^{(\ell)}$ have no
This is the case because these \acp{vn} have neighboring \acp{cn} in
the next window.
The part of the detector error matrix $\bm{H}_\text{DEM}$ describing
these connections is
\begin{align*}
\bm{H}_\text{overlap}^{(\ell)} :=
\left(\bm{H}_\text{DEM}\right)_{\mathcal{J}_\text{overlap}^{(\ell)},
\mathcal{I}_\text{commit}^{(\ell)}}
.%
\end{align*}
We have to account for this fact by updating the syndrome $\bm{s}$
based on the committed bit values.
Specifically, if $\hat{\bm{e}}_\text{commit}^{(\ell)}$ describes the error
estimates committed after decoding window $\ell$, we have to set
\begin{align*}
\left(\bm{s}\right)_{\mathcal{J}_\text{overlap}^{(\ell)}} =
\bm{H}_\text{overlap}^{(\ell)}
\left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\mathsf{T}
.%
\end{align*}
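In code, this update is a small matrix-vector product over $\mathbb{F}_2$ restricted to the overlap rows; the following NumPy sketch (naming is our own) folds the committed contribution into the stored syndrome via XOR, which accounts for the part of the overlap syndrome already explained by the committed errors.

```python
import numpy as np


def update_overlap_syndrome(s, H_dem, j_overlap, i_commit, e_commit):
    """Update the syndrome bits shared with the next window.

    s: full syndrome (mod-2 integer vector); H_dem: detector error
    matrix; j_overlap, i_commit: sorted index lists; e_commit: the
    committed error estimate restricted to i_commit.
    """
    # Submatrix H_overlap connecting committed VNs to overlap CNs.
    H_overlap = H_dem[np.ix_(j_overlap, i_commit)]
    s = s.copy()
    # XOR the committed errors' syndrome contribution into the
    # overlap region before decoding the next window.
    s[j_overlap] ^= (H_overlap @ e_commit) % 2
    return s
```

The original syndrome vector is left untouched; the decoder of window $\ell + 1$ then works on the returned copy.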
% Intro: Problem with above procedure
The sliding-window structure visible in \Cref{fig:windowing_pcm} is
reminiscent of windowed decoding for \ac{sc}-\ac{ldpc} codes.
Switching our viewpoint to the Tanner graph depicted in
\Cref{fig:messages_decimation_tanner}, however, we can see an important
difference between \ac{sc}-\ac{ldpc} decoding and the
sliding-window decoding procedure detailed above.
While the windowing process is similar, the algorithm above
reinitializes the decoder to start from a clean state when moving to
the next window.
Therefore, it does not make use of the integral property of
windowed \ac{sc}-\ac{ldpc} decoding of exploiting the spatially coupled
structure by passing soft information from earlier to later spatial positions.
still relevant to the decoding of the next.
This may somewhat limit the variety of \emph{inner decoders}, i.e.,
the decoders decoding the individual windows, the warm-start
initialization can be used with.
For instance, \ac{bp}+\ac{osd} does not immediately seem suitable, as
it performs a hard decision on the \acp{vn}, though this remains to
be investigated.
We chose to investigate first plain \ac{bp} due to its simplicity and
then \ac{bpgd} because of the availability of recently computed messages.

To see how we realize this in practice, we reiterate the steps of the
\right) \\[3mm]
\text{\ac{cn} Update (Min-Sum): }&
\displaystyle L_{i \leftarrow j} = (-1)^{s_j}\cdot \prod_{i'
\in \mathcal{N}_\text{C}(j)\setminus \{i\}} \sign \left( L_{i'
\rightarrow j}
\right) \cdot \min_{i' \in \mathcal{N}_\text{C}(j)\setminus \{i\}} \lvert
L_{i'\rightarrow j} \rvert \\[3mm]
\label{eq:vn_update}
We can then continue decoding the next window as usual.
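The min-sum \ac{cn} update above translates directly into code; in the following sketch (our own naming), `incoming` maps each neighboring \ac{vn} index $i'$ to the message $L_{i' \rightarrow j}$, and zero-valued messages are counted as positive.

```python
import math


def cn_update_min_sum(s_j, incoming):
    """Min-sum CN update: compute L_{i<-j} for every neighboring VN i.

    s_j: syndrome bit of CN j; incoming: dict i' -> L_{i'->j}.
    Each outgoing message is extrinsic, i.e., excludes the message
    coming from the target VN itself.
    """
    out = {}
    for i in incoming:
        others = [L for ip, L in incoming.items() if ip != i]
        sign = (-1) ** s_j * math.prod(1 if L >= 0 else -1 for L in others)
        out[i] = sign * min(abs(L) for L in others)
    return out
```

A flipped syndrome bit ($s_j = 1$) negates every outgoing message, exactly as the prefactor $(-1)^{s_j}$ in the equation prescribes.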
We can further simplify the algorithm.
Looking carefully at \Cref{eq:vn_update} we notice that when the
\ac{cn} to \ac{vn} messages $L_{i\leftarrow j}$ have been initialized to zero,
the \ac{vn} update degenerates to
\begin{align*}
\displaystyle L_{i \rightarrow j} =
Note that the decoding procedure performed on the individual windows
\label{alg:warm_start_bp}
\begin{algorithmic}[1]
\State \textbf{Initialize:} $\hat{\bm{e}}^\text{total} \leftarrow \bm{0}$
\State \textbf{Initialize:} $L_{i\leftarrow j} = 0,
~\forall~ i\in \mathcal{I}, j\in \mathcal{J}$
\For{$\ell = 0, \ldots, n_\text{win}-1$}
\For{$\nu = 0, \ldots, n_\text{iter}-1$}
\State $\displaystyle\left(\hat{\bm{e}}^\text{total}\right)_{\mathcal{I}^{(\ell)}_\text{commit}} \leftarrow \hat{\bm{e}}^{(\ell)}_\text{commit}$
\State $\displaystyle\left(\bm{s}\right)_{\mathcal{J}_\text{overlap}^{(\ell)}}
\leftarrow \bm{H}_\text{overlap}^{(\ell)}
\left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\mathsf{T}$
\If{$\ell < n_\text{win} - 1$}
\State $L^{(\ell+1)}_{i\leftarrow j} \leftarrow
L^{(\ell)}_{i\leftarrow j}
the most reliable \ac{vn}, meaning we perform a hard decision and
remove it from the following decoding process.
This means that when moving from one window to the next, we now have
more information available: Not just the \ac{bp} messages but also the
information about which \acp{vn} were decimated and to what values.
We call this \emph{decimation information} in the following.
We can extend \Cref{alg:warm_start_bp} by additionally passing the
decimation information after initializing the \ac{cn} to \ac{vn} messages.
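One possible way to pass the decimation information, sketched here under our own (hypothetical) data layout, is to retain exactly those decimation decisions whose \acp{vn} still appear in the next window and to re-apply them after the window is initialized; decisions on all other \acp{vn} are already committed.

```python
def carry_decimation(decimated, i_win_next):
    """Keep only the decimation decisions still relevant.

    decimated: dict mapping a VN index to the hard value BPGD fixed
    it to; i_win_next: set of VN indices present in the next window.
    VNs outside the next window are already committed and need no
    further treatment.
    """
    return {i: v for i, v in decimated.items() if i in i_win_next}
```

After re-initializing the inner decoder for window $\ell + 1$, the carried \acp{vn} would be decimated again to the carried values before message passing resumes.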
% \State $\displaystyle\left(\hat{\bm{e}}^\text{total}\right)_{\mathcal{I}^{(\ell)}_\text{commit}} \leftarrow \hat{\bm{e}}^{(\ell)}_\text{commit}$
% \State $\displaystyle\left(\bm{s}\right)_{\mathcal{J}_\text{overlap}^{(\ell)}}
% \leftarrow \bm{H}_\text{overlap}^{(\ell)}
% \left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\mathsf{T}$
% \If{$\ell < n_\text{win} - 1$}
% \State $L^{(\ell+1)}_{i\leftarrow j} \leftarrow
% L^{(\ell)}_{i\leftarrow j}
model, both of which depend on the code and noise model in question.

% Software stack: Layer 3
Even further up, given an already constructed syndrome extraction
circuit and the resulting \acf{dem}, we split the detector error
matrix into separate windows and manage the interplay between the
inner decoders acting on those individual windows.
For the circuit generation, we employed utilities from QUITS
\cite{kang_quits_2025}, which provides syndrome extraction circuitry
generation for a number of different \ac{qldpc} codes.
We initially created a Python implementation, which used QUITS for the window
splitting and subsequent sliding-window decoding as well, before
reimplementing in Rust.
The \ac{bp} and \ac{bpgd} decoders are implemented in Rust to achieve
higher simulation speeds, leveraging the compiled nature of the
language.
% Global experimental setup % Global experimental setup
@@ -1267,7 +1269,7 @@ For the generation of the \ac{dem} we set the number of syndrome
extraction rounds to $12$, similarly to \cite{gong_toward_2024}, and
we defined our detectors as in the example in
\Cref{subsec:Detector Error Matrix}.
We employed circuit-level noise as described in
\Cref{subsec:Choice of Noise Model} as our noise model, specifically standard
circuit-based depolarizing noise \cite[Sec.~VIII]{fowler_high-threshold_2009},
i.e., all error locations in the circuit get assigned the same
@@ -1282,21 +1284,22 @@ generated by simulating at least $200$ logical error events.
% Local experimental setup
We begin our investigation by using \ac{bp} with no further
modifications as the inner decoder.
We choose the min-sum variant of \ac{bp} due to its low computational
complexity.
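For reference, the min-sum check-node update has this well-known form; the sketch below is a generic textbook illustration (without normalization or offset corrections), not the thesis's Rust implementation, and the function name is an assumption.

```python
def min_sum_check_update(incoming):
    """Min-sum check-node update (sketch).

    For each edge, the outgoing message has magnitude equal to the
    minimum |LLR| among the *other* incoming messages, and sign equal
    to the product of their signs.  This replaces the tanh-based
    sum-product rule with cheap min and sign operations.
    """
    out = []
    for i, _ in enumerate(incoming):
        others = incoming[:i] + incoming[i + 1:]
        sign = 1.0
        for m in others:
            if m < 0:
                sign = -sign
        out.append(sign * min(abs(m) for m in others))
    return out

# Three incoming variable-to-check messages on one check node:
msgs_out = min_sum_check_update([2.0, -0.5, 1.0])
```

The low complexity mentioned above comes precisely from replacing the hyperbolic-tangent computations of sum-product BP with comparisons and sign flips.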
% [Thread] Get impression for max gain
We initially want to gain an impression of the performance gain we could
expect from a modification to the sliding-window decoding procedure.
To this end, we begin by analyzing the decoding performance of the
original process, without our warm-start modification.
We will call this \emph{cold-start} decoding in the following.
Because we expect more global decoding to work better (the inner
decoder has access to a larger portion of the long-range
correlations encoded in the detector error matrix before any commit
is made), we initially decide to use decoding on the whole detector
error matrix as a proxy for the attainable decoding performance.
\begin{figure}[t]
@@ -1400,8 +1403,8 @@ this trend and, as expected, achieves the strongest performance.
The fact that the $W = 5$ curve is already very close to the
whole-block decoder indicates that the marginal benefit of enlarging
the window saturates after a certain point.
Thus, from a practical standpoint, the choice of $W$ represents a
trade-off between decoding latency and accuracy: Larger windows
delay the start of decoding by requiring more syndrome extraction
rounds to be collected upfront, while the diminishing returns above
$W = 4$ suggest that growing the window much further yields little
@@ -1409,7 +1412,7 @@ additional accuracy in return.
% [Thread] First comparison with warm start
Next, we additionally simulate error rate curves for warm-start
sliding-window decoding to assess how much of the gap between
cold-start and whole-block decoding can be recovered by our modification.
We choose the same window sizes as before, so that the warm- and
@@ -1508,7 +1511,7 @@ The dashed colored curves reproduce the cold-start results from
corresponding warm-start runs for the same window sizes
$W \in \{3, 4, 5\}$.
The remaining experimental parameters are unchanged:
The step size is fixed to $F = 1$,
the inner \ac{bp} decoder is allowed up to $200$ iterations per
window invocation, the black curve again gives the whole-block
reference, and the physical error rate is swept from $p = 0.001$ to
@@ -1537,16 +1540,15 @@ consecutive windows spans $W - F = W - 1$ syndrome rounds, so larger
$W$ implies that more messages are carried over and a larger fraction
of the next window starts in a warm state.
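The carry-over of messages into the overlap region can be sketched as below. This is an illustrative simplification under assumed data structures (edges as `(check, var)` pairs, messages in a dict), not the actual Rust implementation.

```python
def warm_start_messages(prev_msgs, prev_edges, new_edges, channel_llrs):
    """Initialize BP messages for a new window (hypothetical sketch).

    prev_msgs maps a Tanner-graph edge (check, var) to its final BP
    message from the previous window.  Edges in the overlap region are
    seeded with those messages (warm start); all remaining edges start
    from the channel LLR of their variable node, as in a cold start.
    """
    msgs = {}
    for edge in new_edges:
        if edge in prev_edges:          # overlap region: carry the message over
            msgs[edge] = prev_msgs[edge]
        else:                           # fresh region: cold-start initialization
            msgs[edge] = channel_llrs[edge[1]]
    return msgs

# Tiny example: edge (0, 1) lies in the overlap, edge (1, 2) is new.
prev_edges = {(0, 0), (0, 1)}
prev_msgs = {(0, 0): 1.5, (0, 1): -0.5}
channel_llrs = {0: 2.0, 1: 2.0, 2: -1.0}
init = warm_start_messages(prev_msgs, prev_edges, [(0, 1), (1, 2)], channel_llrs)
```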
% TODO: Possibly insert explanation for higher gain at lower error rates
A perhaps surprising observation is that warm-start decoding for
$W = 5$ outperforms the whole-block reference across the
entire range of physical error rates, even though warm-start
sliding-window decoding is, by construction, more local than
whole-block decoding.
% [Thread] Warm start is better than whole due to more effective iterations
A possible explanation for this behavior lies in the
number of \ac{bp} iterations effectively spent on the \acp{vn}
inside the overlap region.
Each \ac{vn} in such an overlap is processed by multiple consecutive
@@ -1705,7 +1707,7 @@ $n_\text{iter} \in [32, 512]$.
All curves decrease monotonically with the iteration budget, but
contrary to our expectation, none of them appears to fully saturate
within the swept range: Even at $n_\text{iter} = 4096$, every curve
still exhibits a noticeable downward slope.
At $n_\text{iter} = 32$, the whole-block curve lies below both the
$W=4$ and $W=5$ sliding-window curves.
@@ -1727,9 +1729,9 @@ mirroring the behavior already observed in \Cref{fig:whole_vs_cold_vs_warm}.
These observations are largely consistent with the effective-iterations
hypothesis put forward above.
The whole-block decoder eventually overtaking every windowed scheme
matches the prediction made there: With a sufficiently large
iteration budget, the whole-block decoder reaches an error rate
that none of the windowed schemes can beat, because of the more global
nature of the considered constraints.
Furthermore, the pronounced advantage of warm- over cold-start decoding at low
numbers of iterations makes sense if we consider the overall trend of the plots.
@@ -1742,15 +1744,15 @@ initialization diminishes, and the curves approach each other.
The fact that no curve clearly saturates within the swept range is
itself worth noting.
We know that \ac{bp} on \ac{qldpc} codes suffers from poor
convergence due to degeneracy and short cycles in the underlying
Tanner graph, so even after several thousand iterations the decoder
may continue to slowly refine its message estimates rather than
settle into a stable fixed point.
This is one of the core motivations for moving from plain \ac{bp} to
the guided-decimation variant studied in
\Cref{subsec:Belief Propagation with Guided Decimation}.
Furthermore, note that setting the per-invocation iteration
budget of the inner decoder equal to the iteration budget of the
whole-block decoder is not a fair comparison in terms of total
computational effort.
@@ -1762,14 +1764,14 @@ sliding-window approach is still at an advantage.
% [Thread] Exploration of the effect of the step size
Having examined the effect of the window size $W$, we next turn to
the second windowing parameter, the step size $F$.
We carry out an investigation analogous to the one above:
We first compare warm- and cold-start decoding across the full range
of physical error rates at a fixed iteration budget, and then we
examine the dependence on the iteration budget at a fixed physical
error rate.
The window size is fixed at $W = 5$ throughout, the value at
which the warm-start variant produced the strongest performance in the
previous experiments.
@@ -1968,9 +1970,9 @@ previous experiments.
% [Experimental parameters] Figure 4.9
\Cref{fig:bp_f} summarizes the results of this investigation.
In both panels, the dashed curves correspond to cold-start
sliding-window decoding for $F \in \{1, 2, 3\}$ and the solid
curves to warm-start decoding.
The window size is fixed to $W = 5$ throughout.
\Cref{fig:bp_f_over_p} sweeps the physical error rate over
$p \in [0.001, 0.004]$ in steps of $0.0005$ at a fixed maximum of
@@ -1990,9 +1992,9 @@ monotonic increase of the per-round \ac{ler} with the physical
error rate.
At fixed $F$, the warm-start approach lies below
cold-start across the entire sweep, and at fixed
warm or cold start, smaller $F$ produces a lower \ac{ler}.
Both gaps grow as the physical error rate decreases:
The curves at $F = 1$ separate further from those at $F = 2$ and $F = 3$,
and the warm-start curves separate further from the cold-start ones.
In \Cref{fig:bp_f_over_iter}, all six curves again decrease
monotonically with the iteration budget, with no clear saturation
@@ -2014,7 +2016,7 @@ With $W$ held fixed, decreasing $F$ enlarges the overlap between
consecutive windows from $W - F$ to $W - F + 1$ syndrome measurement rounds, so
a smaller step size is beneficial for the same reason that a larger
window size is:
Each \ac{vn} in an overlap region participates in more window
invocations, and the warm-start modification effectively accumulates
iterations on it across these invocations.
The widening of the warm/cold gap towards low iteration counts and
@@ -2032,7 +2034,7 @@ Similarly, assuming the decoder is fast enough to keep up with the
incoming syndrome measurements corresponding to the \acp{cn} of
subsequent windows, the time at which decoding is complete depends only
on the amount of time spent on decoding the very last window.
Thus, smaller $F$ only costs additional total compute and not
additional latency.
This is especially favorable for our warm-start modification, as it
@@ -2062,8 +2064,8 @@ both schemes process the same windows for the same number of
iterations and differ only in the initialization of the \ac{bp}
messages of each new window.
We also observed that plain \ac{bp} did not saturate even at $4096$
iterations, which we attribute to the degeneracy and short cycles in
the underlying Tanner graph.
This motivates the next subsection, in which we replace the inner
\ac{bp} decoder by its guided-decimation variant.
@@ -2261,7 +2263,7 @@ that can occur before every \ac{vn} in the window has been decimated.
A preliminary investigation showed that \ac{bpgd} only delivers its
intended performance gain once most \acp{vn} have actually been decimated,
which motivated this choice.
The physical error rate is swept from $p = 0.001$ to $p = 0.004$
in steps of $0.0005$.
\Cref{fig:bpgd_w} sweeps over the window size with
$W \in \{3, 4, 5\}$ at fixed step size $F = 1$, and
@@ -2279,7 +2281,7 @@ This is the opposite of what we observed for plain \ac{bp}, where
warm-start improved upon cold-start at every parameter setting.
The gap between the warm- and cold-start curves additionally widens
as the physical error rate decreases:
At the lowest sampled rate $p = 0.001$, the per-round \ac{ler} of the
warm-start runs is more than two orders of magnitude above that of
the corresponding cold-start runs.
In \Cref{fig:bpgd_w}, larger window sizes yield lower per-round
@@ -2298,13 +2300,13 @@ than its cold-start counterpart is surprising in light of the results
for plain \ac{bp}, where the warm-start modification was uniformly beneficial.
The dependence on the window size in \Cref{fig:bpgd_w} is, on its own,
consistent with the same explanation that we gave for
\Cref{fig:whole_vs_cold}: Larger windows expose the inner decoder to
a larger fraction of the constraints encoded in the detector error
matrix at the time of decoding, and this benefits both warm- and
cold-start decoding.
The dependence on the step size in \Cref{fig:bpgd_f}, however, is the
opposite of the corresponding dependence under plain \ac{bp}
(\Cref{fig:bp_f_over_p}): For warm-start, smaller $F$ now degrades
performance rather than improving it, even though smaller $F$ implies
a larger overlap in both cases.
@@ -2314,18 +2316,18 @@ Recall from
that the warm start for \ac{bpgd} carries over not only the \ac{bp}
messages on the edges of the overlap region but also the decimation
information.
Because we decode with an iteration budget large enough to decimate
every \ac{vn} in a window, by the time window $\ell$ ends, all
of its \acp{vn} have already been hard-decided.
For the \acp{vn} that lie in the overlap region with window $\ell + 1$,
this hard decision is then carried into the next window through the
warm-start initialization, and the next window begins decoding
with a substantial fraction of its \acp{vn} already fixed, before
its own parity checks have had any chance to influence the
corresponding bit estimates.
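The freezing mechanism described above can be made concrete with a small sketch: decimation is commonly modeled as clamping a variable node's channel LLR to a huge magnitude, which no finite check-node message can override. The clamp value and function name are illustrative assumptions, not the thesis's implementation.

```python
BIG = 1e9  # effectively infinite LLR magnitude: a hard-decided variable

def decimate(channel_llrs, var, value):
    """Freeze a variable node to a hard decision (sketch of decimation).

    Clamping the channel LLR to +BIG pins the bit to 0, to -BIG pins it
    to 1.  If these clamped LLRs are carried into the next window, that
    window inherits the decision before its own checks can weigh in.
    """
    channel_llrs[var] = BIG if value == 0 else -BIG

llrs = {0: 0.3, 1: -0.1}
decimate(llrs, 1, 1)  # variable 1 is now frozen to the value 1
```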
This identifies one of two competing effects on the warm-start performance.
The larger the overlap, the more such prematurely fixed \acp{vn} the
next window inherits, which degrades performance.
On the other hand, a larger window still exposes the inner decoder to
a larger set of constraints, which helps performance.
The two effects together are consistent with what we observe in
@@ -2346,7 +2348,7 @@ $n_\text{iter}$ should reduce the maximum number of \acp{vn} that can
be decimated before window $\ell$ commits, and the warm-start
performance should approach that of warm-start under plain \ac{bp} as
$n_\text{iter}$ is lowered.
Therefore, we vary $n_\text{iter}$ at fixed window parameters and
fixed physical error rate.
\begin{figure}[t]
@@ -2515,10 +2517,10 @@ fixed physical error rate.
\Cref{fig:bpgd_iter} shows the per-round \ac{ler} of \ac{bpgd}
sliding-window decoding as a function of the maximum number of inner
\ac{bp} iterations $n_\text{iter}$.
The dashed curves correspond to cold-start sliding-window
decoding and the solid curves to warm-start, which again
retains both the \ac{bp} messages and the decimation information on
the overlap region.
The physical error rate is fixed at $p = 0.0025$ and the iteration
budget is swept over $n_\text{iter} \in \{32, 128, 256, 512, 1024,
1536, 2048, 2560, 3072, 3584, 4096\}$.
@@ -2533,7 +2535,7 @@ For low iteration budgets, all curves in both panels behave similarly
to the plain-\ac{bp} curves in
\Cref{fig:bp_w_over_iter,fig:bp_f_over_iter}.
The per-round \ac{ler} decreases gradually with $n_\text{iter}$, and
the warm-start configurations now outperform their cold-start counterparts at
matching window parameters.
As $n_\text{iter}$ continues to grow, however, the cold-start curves
undergo a sharp drop, after which they lie roughly an order of
@@ -2562,7 +2564,7 @@ the warm-start curves now show a clear reordering as $n_\text{iter}$
grows.
At low iteration budgets the warm-start ordering matches the
cold-start ordering, with $F = 1$ best and $F = 3$ worst, but at the
largest iteration budget this ordering is fully inverted: Warm-start
$F = 1$ is now the worst and $F = 3$ the best.
% [Interpretation] Figure 4.11
@@ -2594,7 +2596,7 @@ decoding performance.
The same mechanism explains the inversion of the step-size ordering
in \Cref{fig:bpgd_iter_F}.
At low iteration budgets, the ordering is set by the same overlap
argument as for plain \ac{bp}: Smaller $F$ implies a larger overlap
between consecutive windows, more shared messages, and therefore
better warm-start performance.
At large iteration budgets, the ordering is set by the premature hard
@@ -2607,8 +2609,8 @@ of the warm-start curves and limit ourselves to noting it.
The natural consequence of the previous diagnosis is to drop the
problematic part of the warm-start initialization for \ac{bpgd} and
to carry over only the \ac{bp} messages on the edges of the overlap
region, as in \Cref{fig:messages_tanner}, while leaving the
decimation information of the next window in its original cold-start state.
Note that some information about the previous window's decimation
state is still implicitly carried over through the \ac{bp} messages,
since the decimation decisions were made based on the messages themselves.
@@ -2775,7 +2777,7 @@ since the decimation decisions were made based on the messages themselves.
\Cref{fig:bpgd_msg} repeats the experiment of \Cref{fig:bpgd_wf}
with the modified warm-start procedure that carries over only the
\ac{bp} messages.
All other experimental parameters are unchanged: The maximum number
of inner \ac{bp} iterations is $n_\text{iter} = 5000$, and the
physical error rate is swept from $p = 0.001$ to $p = 0.004$ in steps
of $0.0005$.
@@ -2803,12 +2805,12 @@ as $F$ grows.
% [Description] Interpretation 4.12
Removing the decimation information from the warm-start initialization lifts
the warm-start regression observed in \Cref{fig:bpgd_wf},
and warm-start now consistently outperforms cold-start.
The dependence on the window size and the step size also recovers
the qualitative behavior we observed for plain \ac{bp} in
\Cref{fig:whole_vs_cold_vs_warm,fig:bp_f_over_p}: A larger overlap
between consecutive windows, achieved either by enlarging $W$ or by
decreasing $F$, both improves the absolute decoding performance and
increases the warm-start advantage over cold-start.
@@ -2992,7 +2994,7 @@ cold-start curves across the entire range of $n_\text{iter}$ available to us.
\Cref{fig:bpgd_msg_iter} repeats the experiment of
\Cref{fig:bpgd_iter} with the modified warm-start procedure that
carries over only the \ac{bp} messages.
All other experimental parameters are unchanged: The physical error
rate is fixed at $p = 0.0025$ and the iteration budget is swept over
$n_\text{iter} \in \{32, 128, 256, 512, 1024, 1536, 2048, 2560,
3072, 3584, 4096\}$.
@@ -3020,11 +3022,11 @@ and at $F = 1$, respectively.
These observations match our expectations.
With only the \ac{bp} messages carried over, the warm-start
initialization no longer freezes any \acp{vn} in the next window.
The dependence of this benefit on $W$ and $F$ also recovers the
pattern observed for plain \ac{bp} in
\Cref{fig:whole_vs_cold_vs_warm,fig:bp_f_over_p}:
Larger overlap, achieved by larger $W$ or smaller $F$, yields more
effective extra iterations and therefore a larger warm-start gain.
% BPGD conclusion
@@ -3034,9 +3036,9 @@ sliding-window decoding under \ac{bpgd} by summarizing our findings.
Warm-starting the inner decoder still provides a consistent
performance gain when the inner decoder is upgraded from plain
\ac{bp} to its guided-decimation variant, but only if some care is
taken in choosing what information to carry over.
Passing the channel \acp{llr} along with the \ac{bp} messages,
as suggested by naively transferring the warm-start idea to \ac{bpgd},
leads to premature hard decisions on \acp{vn} in the overlap region.
As a result, warm-start initialization actually worsens the
performance compared to cold-start initialization.
@@ -3046,6 +3048,20 @@ cold-start that follows the same behavior as for plain \ac{bp} with
 regard to overlap.
 A second observation specific to \ac{bpgd} is that its iteration
 requirements are substantially larger than those of plain \ac{bp}:
-the per-round \ac{ler} drops sharply only once the iteration budget
+The per-round \ac{ler} drops sharply only once the iteration budget
 is on the order of the number of \acp{vn} in each window.
+Future work could include a softer treatment of the decimation state
+in \ac{bpgd}.
+Rather than discarding the decimation information of the previous
+window entirely, as in the message-only warm start used here, one
+could encode the decimation decisions as strong but finite biases on
+the channel \acp{llr} of the next window, allowing the new window's parity
+checks to override them if the syndrome calls for it.
+This would interpolate between the two warm-start variants studied here and
+might combine the benefits of both.
+A related question is whether the decimation schedule itself should
+be aware of the window structure, for instance by deferring
+decimation of \acp{vn} in the overlap region until they have been
+visited by the next window.
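The decimation-as-bias idea added in the hunk above can be illustrated in a few lines. The following is a hypothetical sketch of the proposed future-work scheme, not code from the thesis; the function name and the default bias magnitude are assumptions.

```python
def biased_channel_llrs(channel_llrs, decimations, bias=25.0):
    """Fold the previous window's decimation decisions into the next
    window's channel LLRs as strong but finite biases.

    `decimations` maps a decimated VN index to its hard decision
    (0 = no error, 1 = error). Keeping `bias` finite lets the next
    window's parity checks override a decision if the syndrome calls
    for it; an infinite bias would reproduce hard decimation carry-over.
    """
    llrs = list(channel_llrs)
    for vn, value in decimations.items():
        # Positive LLR favours "no error", negative LLR favours "error".
        llrs[vn] = bias if value == 0 else -bias
    return llrs
```

Intermediate bias magnitudes would interpolate between the full decimation carry-over and the message-only warm start studied in the thesis.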


@@ -3,24 +3,24 @@
 % Recap of motivation
-This thesis investigated decoding under \acp{dem} for fault-tolerant
+This thesis investigates decoding under \acp{dem} for fault-tolerant
 \ac{qec}, with a focus on low-latency decoding methods for \ac{qldpc} codes.
 The repetition of the syndrome measurements, especially under
 consideration of circuit-level noise, leads to a significant increase
-in decoding complexity: in our experiments on the $\llbracket
+in decoding complexity: In our experiments on the $\llbracket
 144,12,12 \rrbracket$ \ac{bb} code with $12$ syndrome extraction
-rounds, the check matrix grew from 144 \acp{vn} and 72
+rounds, the check matrix grows from 144 \acp{vn} and 72
 \acp{cn} to 9504 \acp{vn} and 1008 \acp{cn}.
 % Recap of research gap and own work
 Sliding-window decoding addresses the latency constraint by
-exploiting the time-like locality of the syndrome extraction circuit,
-which manifests as a block-diagonal structure in the detector error
+exploiting the time-like locality of the syndrome extraction circuit.
+This manifests as a block-diagonal structure in the detector error
 matrix when detectors are defined as the difference of consecutive
 syndrome measurement rounds.
-We drew a comparison to windowed decoding for \ac{sc}-\ac{ldpc}
-codes, but noted that the existing realizations of sliding-window
+We draw a comparison to windowed decoding for \ac{sc}-\ac{ldpc}
+codes, but note that the existing realizations of sliding-window
 decoding discard the soft information produced inside one window
 before moving to the next.
 Building on this observation, we proposed warm-start sliding-window
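The window partitioning recapped in this hunk can be sketched as follows. This is a minimal illustration under assumed conventions (each window spans $W$ rounds, the window start advances by $F$ rounds, and the final window absorbs the remainder), not the implementation used in the thesis.

```python
def window_rounds(total_rounds, W, F):
    """Round indices covered by each sliding window.

    Every window spans W rounds and the start advances by F rounds,
    so consecutive windows overlap in W - F rounds. The treatment of
    the final window (absorbing all remaining rounds) is an assumption.
    """
    windows = []
    start = 0
    while start + W < total_rounds:
        windows.append(range(start, start + W))
        start += F
    windows.append(range(start, total_rounds))
    return windows
```

Under these assumptions, `window_rounds(12, 3, 1)` for the 12-round experiments yields ten windows, each sharing two rounds with its successor.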
@@ -29,34 +29,35 @@ the overlap region of the previous window are reused to initialise
 the corresponding messages of the next window in place of the
 standard cold-start initialisation.
-We formulated the warm start first for plain \ac{bp} and then for
-\ac{bpgd}, the latter being attractive as an inner decoder because it
+We formulate the warm start for standard \ac{bp} and for
+\ac{bpgd}.
+The latter is particularly attractive as an inner decoder because it
 addresses the convergence problems caused by short cycles and
 degeneracy in \ac{qldpc} Tanner graphs.
-The decoders were evaluated by Monte Carlo simulation on the
+The decoders are evaluated by conducting Monte Carlo simulations on the
 $\llbracket 144,12,12 \rrbracket$ \ac{bb} code over $12$ syndrome
 extraction rounds under standard circuit-based depolarizing noise.
-We focused on a qualitative analysis, refraining from further
+We focus on a qualitative analysis, refraining from further
 optimizations such as introducing a normalization parameter for the
 min-sum algorithm.
 % Recap of experimental conclusions
-For plain min-sum \ac{bp}, the warm start was consistently beneficial
-across the parameter ranges we examined. The size of the gain depended
-on the overlap between consecutive windows: enlarging $W$ or
-shrinking $F$, both of which enlarge the overlap, raised the
-warm-start performance increase.
+For standard min-sum \ac{bp}, the warm start is consistently
+beneficial compared to the cold start, across the considered parameter ranges.
+The size of the gain depends on the overlap between consecutive
+windows: Enlarging $W$ or shrinking $F$, both of which enlarge the
+overlap, result in larger warm-start gains.
-We argued that the underlying mechanism is an effective increase in
+We observe that the underlying mechanism is an effective increase in
 the number of \ac{bp} iterations spent on the \acp{vn} in the overlap
-region: each such \ac{vn} is processed by multiple consecutive window
+region: Each such \ac{vn} is processed by multiple consecutive window
 invocations, and the warm start lets these invocations accumulate
 iterations on the same \acp{vn} rather than restarting from scratch.
 The gain was most pronounced at low numbers of maximum iterations, where
 every additional iteration carries proportionally more information.
-For \ac{bpgd}, we noted that more information is available in the
-overlap region of a window: in addition to the \ac{bp} messages,
+For \ac{bpgd}, we note that more information is available in the
+overlap region of a window: In addition to the \ac{bp} messages,
 there is information about which \acp{vn} were decimated and to what value.
 Passing this decimation information to the next window in addition to
 the messages turned out to worsen the performance considerably, which
@@ -65,14 +66,14 @@ overlap region.
 Restricting the warm start to the \ac{bp} messages alone, removed this effect.
 The resulting message-only warm start recovered a consistent
 improvement over cold-start that followed the same qualitative
-behaviour as for plain \ac{bp}: larger overlap, achieved by larger
+behaviour as for standard \ac{bp}: Larger overlap, achieved by larger
 $W$ or smaller $F$, yielded a larger gain, and the
-performance difference was most pronounced at low numbers of maximum iterations.
+performance difference is most pronounced at low numbers of maximum iterations.
 % Implications from experimental results
 These observations imply that the warm-start modification to
-sliding-window decoding provides a consistent improvement, as long as
+sliding-window decoding can provide a consistent improvement, as long as
 some care is taken with specifying the information to be passed to
 the subsequent window.
 Note that this comes at no additional cost to the decoding complexity,
@@ -94,25 +95,10 @@ underlying mechanism is structural rather than code-specific, but
 quantifying the gain across code families and noise models is left to
 future work.
-A second direction is a systematic study of inner decoders under the
-warm-start framework.
-We considered plain min-sum \ac{bp} and \ac{bpgd}, but other
-algorithms used for \ac{qldpc} decoding, such as automorphism
-ensemble decoding \cite{koutsioumpas_automorphism_2025} or neural
-\ac{bp} \cite{miao_quaternary_2025} may admit warm-start variants of their own.
+A second direction is a systematic study of other inner decoders under the
+warm-start framework, such as automorphism ensemble decoding
+\cite{koutsioumpas_automorphism_2025} or neural \ac{bp}
+\cite{miao_quaternary_2025}.
-A third direction is a softer treatment of the decimation state in \ac{bpgd}.
-Rather than discarding the decimation information of the previous
-window entirely, as in the message-only warm start used here, one
-could encode the decimation decisions as strong but finite biases on
-the channel \acp{llr} of the next window, allowing the new window's parity
-checks to override them if the syndrome calls for it.
-This would interpolate between the two warm-start variants studied here and
-might combine the benefits of both.
-A related question is whether the decimation schedule itself should
-be aware of the window structure, for instance by deferring
-decimation of \acp{vn} in the overlap region until they have been
-visited by the next window.
 A final direction is suggested by the structural similarity between
 sliding-window decoding for \acp{dem} and windowed decoding for


@@ -4,6 +4,8 @@
 \Ac{qec} protects fragile quantum states against decoherence by
 encoding logical information into a larger number of physical qubits.
+To obtain parity information on an encoded state without disturbing it,
+syndrome extraction is performed.
 Because the syndrome extraction circuitry is itself implemented on
 noisy quantum hardware, practical \ac{qec} must be fault-tolerant,
 accounting for errors introduced by the correction procedure itself.
@@ -19,35 +21,35 @@ can be decoded.
 Together, these factors pose a serious challenge for practical decoders.
 Sliding-window decoding addresses this challenge by exploiting the
 repeated structure of the syndrome extraction circuitry, partitioning
-the \ac{dem}'s check matrix into overlapping windows that can be
+the check matrix of the \ac{dem} into overlapping windows that can be
 decoded sequentially.
-This allows for an earlier start to the decoding process, before all
-syndrome measurements have been completed, thereby lowering the latency.
+Therefore, decoding can begin as soon as the syndrome components
+associated with the first window have been measured.
 % Our work: Identify research gap
 In this thesis, we perform a review of the existing literature on
 sliding-window decoding and draw an analogy to windowed
-decoding for classical spatially-coupled low-density parity-check
+decoding of classical spatially-coupled low-density parity-check
 (\acs{sc}-\acs{ldpc}) codes.
 We recognize that in contrast to the latter, existing realizations
 of sliding-window decoding for \ac{qec} discard the soft information
-produced inside one window before moving to the next.
+produced inside one window before moving to the subsequent window.
 % Our work: Warm-start
 % TODO: Quantify improvement. Also for conclusion
-We propose warm-start sliding-window decoding, in which the
-\ac{bp} messages on the edges crossing into the overlap region of the previous
-window are reused to initialize the corresponding messages of the
-next window.
+To take this information into account, we propose warm-start
+sliding-window decoding, in which the \ac{bp} messages on the edges
+crossing into the overlap region of the previous window are reused to
+initialize the corresponding messages of the next window.
-The warm start is formulated first for plain \ac{bp} and then extended to
+The warm start is formulated first for standard \ac{bp} and then extended to
 \ac{bp} with guided decimation (\acs{bpgd}).
-For both plain min-sum \ac{bp} and \ac{bpgd} decoding, the warm-start
+For both standard \ac{bp} and \ac{bpgd} decoding, the warm-start
 initialization provides a consistent improvement across all examined
 parameter settings.
 We attribute this to an effective increase in \ac{bp} iterations on
-variable nodes in the overlap regions: each such VN is processed by
+variable nodes in the overlap regions: Each such VN is processed by
 multiple consecutive windows, and warm-starting lets these
 invocations accumulate iterations rather than restart from scratch.
 Crucially, the warm-start modification incurs no additional
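The warm-start initialization summarized in this abstract can be sketched as follows. This is a minimal illustration with assumed data structures (messages keyed by (VN, CN) edges), not the decoder implemented in the thesis.

```python
def init_window_messages(edges, channel_llrs, prev_msgs=None):
    """Initialize VN-to-CN messages for one decoding window.

    Cold start: every message begins at its VN's channel LLR.
    Warm start: the previous window's final messages are reused on
    edges shared by both windows, i.e. edges in the overlap region.
    """
    msgs = {(vn, cn): channel_llrs[vn] for (vn, cn) in edges}
    if prev_msgs is not None:
        for edge, message in prev_msgs.items():
            if edge in msgs:  # Edge lies in the overlap region.
                msgs[edge] = message
    return msgs
```

Edges outside the overlap fall back to the cold-start value, which reflects why the modification adds no decoding complexity.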


@@ -29,6 +29,7 @@
 \usepackage{colortbl}
 \usepackage{cleveref}
 \usepackage{lipsum}
+\usepackage{booktabs}
 \usetikzlibrary{calc, positioning, arrows, fit}
 \usetikzlibrary{external}
@@ -89,10 +90,10 @@
 % \thesisHeadOfInstitute{Prof. Dr.-Ing. Peter Rost}
 %\thesisHeadOfInstitute{Prof. Dr.-Ing. Peter Rost\\Prof. Dr.-Ing.
 % Laurent Schmalen}
-\thesisSupervisor{M.Sc. Jonathan Mandelbaum}
+\thesisSupervisor{Dr.-Ing. Hedongliang Liu\\ && M.Sc. Jonathan Mandelbaum}
-\thesisStartDate{01.11.2025}
+\thesisStartDate{Nov. 1st, 2025}
-\thesisEndDate{04.05.2026}
+\thesisEndDate{May 4th, 2026}
-\thesisSignatureDate{04.05.2026}
+\thesisSignatureDate{May 4th, 2026}
 \thesisSignature{res/Unterschrift_AT_blue.png}
 \thesisSignatureHeight{2.4cm}
 \thesisLanguage{english}
@@ -108,8 +109,11 @@
 \cleardoublepage
 \pagenumbering{arabic}
+\newgeometry{a4paper,left=3cm,right=3cm,top=2cm,bottom=2.5cm}
+\addtocontents{toc}{\protect\vspace*{-9mm}}
 \tableofcontents
 \cleardoublepage
+\restoregeometry
 \input{chapters/1_introduction.tex}
 \input{chapters/2_fundamentals.tex}
@@ -122,6 +126,11 @@
 % \listoftables
 % \include{abbreviations}
+\cleardoublepage
+\phantomsection
+\addcontentsline{toc}{chapter}{List of Abbreviations}
+\printacronyms
 \bibliography{lib/cel-thesis/IEEEabrv,src/thesis/bibliography}
 \end{document}