diff --git a/src/thesis/chapters/1_introduction.tex b/src/thesis/chapters/1_introduction.tex index f4ec3a1..c7eaf33 100644 --- a/src/thesis/chapters/1_introduction.tex +++ b/src/thesis/chapters/1_introduction.tex @@ -1 +1,297 @@ \chapter{Introduction} +\label{ch:Introduction} + +% Intro to quantum computing + +In 1982, Richard Feynman, motivated by the difficulty of simulating +quantum-mechanical systems on classical hardware, proposed building +computers from quantum-mechanical hardware itself +\cite{feynman_simulating_1982}. +Such quantum computers have since been shown to hold promise not only +for simulating quantum systems but also for solving certain kinds of +problems that are classically intractable. +The most prominent example is Shor's algorithm for integer +factorization \cite{shor_algorithms_1994}. + +Similar to the way classical computers are built from bits and gates, +quantum computers are built from \emph{qubits} and \emph{quantum gates}. +Because of quantum entanglement, it is not enough to consider the +qubits individually; we must also consider the correlations between them. +For a system of $n$ qubits, this makes the state space grow as +$2^n$ instead of linearly with $n$, as would be the case for a classical system +\cite[Sec.~1]{gottesman_stabilizer_1997}. +This is both the reason quantum systems are difficult to simulate and +what provides them with their power \cite[Sec.~2.1]{roffe_decoding_2020}. + +% The need for QEC + +Realizing algorithms that leverage these quantum-mechanical effects +requires hardware that can execute long quantum computations reliably. +This poses a problem, because the qubits making up current devices +are difficult to sufficiently isolate from their environment +\cite[Sec.~1]{roffe_quantum_2019}. +Their interaction with the environment acts as a continuous small-scale +measurement, an effect we call \emph{decoherence} of the stored quantum +state. 
+Decoherence is the reason large systems do not exhibit visible quantum +properties at human scales \cite[Sec.~1]{gottesman_stabilizer_1997}. + +% Intro to QEC + +\Ac{qec} has emerged as a leading candidate for solving this problem. +It addresses the issue by encoding the information of $k$ +\emph{logical qubits} into a larger number $n>k$ of \emph{physical +qubits}, in close analogy to classical channel coding +\cite[Sec.~1]{roffe_quantum_2019}. +The redundancy introduced this way can then be used to restore +the quantum state, should it be disturbed. +The quantum setting imposes some important constraints that do not exist in the +classical case, however \cite[Sec.~2.4]{roffe_quantum_2019}: +\begin{itemize} + \item The no-cloning theorem prohibits the duplication of quantum states. + \item In addition to the bit-flip errors we know from the + classical setting, qubits are subject to \emph{phase-flips}. + \item We are not allowed to directly measure the encoded qubits, + as that would disturb their quantum states. +\end{itemize} +We can deal with the first constraint by not duplicating information, instead +spreading the quantum state across the physical qubits +\cite[Sec.~I]{calderbank_good_1996}. +To deal with phase-flip errors, we must take special care when +constructing \ac{qec} codes. +With \ac{css} codes, for example, we can use two separate classical +binary linear codes to protect against the two kinds of errors +\cite[Sec.~10.5.6]{nielsen_quantum_2010}. +Finally, we can get around the last issue by using \emph{stabilizer +measurements}. +These are parity measurements that give us information about +potential errors without revealing the underlying qubit states +\cite[Sec.~II.C.]{babar_fifteen_2015}. +This way, we perform a \emph{syndrome extraction} and base the +subsequent decoding process on the measured syndrome. + +Another difference between \ac{qec} and classical channel coding lies in +the resource constraints. 
+For \ac{qec}, low latency matters more than low overall computational +complexity, due to the backlog problem +\cite[Sec.~II.G.3.]{terhal_quantum_2015}: Some gates may turn +single-qubit errors into multi-qubit ones, so errors must be +corrected beforehand. +A \ac{qec} system that is too slow accumulates a backlog at these points, +causing an exponential slowdown of the computation. + +Several constructions of \ac{qec} codes have been proposed over the years. +Topological codes such as surface codes have been the industry +standard for experimental applications for a long time +\cite[Sec.~I]{koutsioumpas_colour_2025}, due to their +reliance on only local connections between qubits +\cite[Sec.~5]{roffe_decoding_2020}. +Recently, \ac{qldpc} codes have been attracting increasing +attention, as they have been shown to offer comparable thresholds with +substantially improved encoding rates \cite[Sec.~1]{bravyi_high-threshold_2024}. +\ac{qldpc} codes are generally decoded using a syndrome-based variant +of the \ac{bp} algorithm \cite[Sec.~1]{roffe_decoding_2020}. 
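To make the syndrome-based variant concrete, the following is a minimal Python sketch of how such a decoder can be organized: the check-node update is the classical sum-product rule, multiplied by a sign factor $(-1)^{s_j}$ so that \ac{bp} explains the measured syndrome rather than an all-zero one. The function name, the dense matrix representation, and the parameter choices are illustrative assumptions, not the implementation evaluated in this thesis.

```python
import math

def syndrome_bp(H, s, prior, max_iter=30):
    """Toy syndrome-based sum-product BP (illustrative sketch only).

    H: dense binary parity-check matrix (list of rows),
    s: measured syndrome bits, prior: channel LLRs per error bit.
    Returns a hard-decision error estimate.
    """
    m, n = len(H), len(H[0])
    # variable-to-check messages, initialized to the channel LLRs
    v2c = {(i, j): prior[i] for j in range(m) for i in range(n) if H[j][i]}
    e_hat = [0] * n
    for _ in range(max_iter):
        # check-node update: L_{i<-j} = 2(-1)^{s_j} atanh(prod tanh(L/2))
        c2v = {}
        for j in range(m):
            nbrs = [i for i in range(n) if H[j][i]]
            for i in nbrs:
                p = 1.0
                for i2 in nbrs:
                    if i2 != i:  # extrinsic product, excluding i
                        p *= math.tanh(v2c[(i2, j)] / 2)
                p = max(min(p, 0.999999), -0.999999)  # numerical guard
                c2v[(i, j)] = 2 * (-1) ** s[j] * math.atanh(p)
        # total LLRs and hard decision
        total = [prior[i] + sum(c2v[i, j] for j in range(m) if H[j][i])
                 for i in range(n)]
        e_hat = [int(t < 0) for t in total]
        if all(sum(H[j][i] * e_hat[i] for i in range(n)) % 2 == s[j]
               for j in range(m)):
            break  # estimated error reproduces the measured syndrome
        # variable-node update: subtract the incoming message (extrinsic)
        v2c = {k: total[k[0]] - c2v[k] for k in v2c}
    return e_hat
```

On a toy two-check example with syndrome $(1,0)$ and uniform priors, the sketch converges to the single flip on the first bit, the minimum-weight explanation.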
+ +% DEMs and fault tolerance + +\content{Syndrome extraction can also be faulty -> Need for fault tolerance} +\content{Have to repeat syndrome measurements} +\content{DEMs one way of implementing fault tolerance: Model more +error locations -> Larger resulting codes} +\content{Literature deals with latency problem for fault tolerance by +sliding-window decoding} + +% Research gap + our work + +\content{Use BP for decoding, but has convergence issues -> Modify BP} + +\content{We note a striking similarity between sliding-window +decoding for DEMs and the way SC-LDPC codes are decoded} +\content{Extend QEC sliding-window decoding by warm start, inspired +by SC-LDPC decoders} +The existing realizations of sliding-window decoding for \ac{qec} +discard the soft information produced inside one window before moving +on to the next, in contrast to the analogous \ac{sc}-\ac{ldpc} +decoders, which carry messages between windows +\cite[Sec.~III.~C.]{hassan_fully_2016}. +This thesis investigates whether the same idea can be carried over to +the \ac{qec} setting. +We propose \emph{warm-start sliding-window decoding}, in which the +\ac{bp} messages from the overlap region of the previous window are +reused to initialize \ac{bp} in the current window in place of the +standard cold-start initialization. +We formulate the warm start first for plain \ac{bp} and then for +\ac{bpgd}, where some care is needed in deciding which information to +carry over. +The decoders are evaluated by Monte Carlo simulation on the +$\llbracket 144,12,12 \rrbracket$ \ac{bb} code under standard +circuit-level depolarizing noise over $12$ syndrome extraction rounds. +The main finding is that warm-starting yields a consistent +improvement at low iteration budgets, which is the regime relevant for +low-latency operation. + +% The need for fault tolerance + +% A naive picture of \ac{qec} treats the syndrome extraction circuit as +% ideal and only considers errors on the data qubits. 
+% In reality, every gate, every ancilla, and every measurement involved +% in extracting the syndrome can itself fail, introducing new faults +% into the procedure that is supposed to correct them +% \cite[Sec.~III]{shor_scheme_1995}. +% A \ac{qec} procedure is called \emph{fault-tolerant} if it remains +% effective in the presence of these internal faults +% \cite[Sec.~4]{gottesman_introduction_2009}. + +% Fault tolerance + +% The standard formal definition requires the number of output errors +% to remain bounded as long as the combined number of input and +% internal errors does not exceed the correction capability of the code +% \cite[Def.~4.2]{derks_designing_2025}. +% To deal with internal errors that flip syndrome bits, multiple rounds +% of syndrome measurements are performed, and the resulting space-time +% history of detector outcomes is decoded jointly. +% The probabilities of errors at each location in the circuit are +% collected in a \emph{noise model}. +% The most general such model, in which an arbitrary Pauli error is +% allowed after each gate, is referred to as \emph{circuit-level noise} +% \cite[Def.~2.5]{derks_designing_2025} and is the noise model that +% should be used for fault-tolerance simulations +% \cite[Sec.~4.2]{derks_designing_2025}. + +% DEMs + +% The combination of circuit-level noise and multiple syndrome +% measurement rounds yields a complicated, code- and circuit-specific +% decoding problem. +% A recent line of work argues that this problem is most cleanly +% expressed through a \acf{dem} \cite[Sec.~6]{derks_designing_2025}. +% A \ac{dem} abstracts away the underlying circuit and lists the +% independent error mechanisms together with the detectors they flip +% and the logical observables they affect. 
+% From the decoder's perspective, decoding under a \ac{dem} is again a +% classical decoding problem on a parity-check matrix, with the +% detectors playing the role of \acfp{cn} and the error mechanisms +% playing the role of \acfp{vn}. +% The standard tool for generating \acp{dem} from arbitrary stabilizer +% circuits is Stim \cite{gidney_stim_2021}, in which the \ac{dem} +% formalism was originally introduced. + +% The issues with decoding under DEMs + +% For \ac{qec}, the binding constraint on the decoder is latency, not +% raw computational complexity. +% This is the \emph{backlog problem}: certain gates can transform +% existing single-qubit errors into multi-qubit errors, and any +% correction must be applied before such gates are reached. +% A decoder that fails to keep up with the rate at which the hardware +% produces syndromes leads to an exponential slowdown of the computation +% \cite[Sec.~II.G.3.]{terhal_quantum_2015}. + +% Decoding under a \ac{dem} aggravates this constraint, because the +% matrix that results from unrolling several rounds of syndrome +% extraction is much larger than the parity-check matrix of the +% underlying code. +% Each error mechanism in the circuit becomes a separate \ac{vn} and +% each detector becomes a separate \ac{cn}. +% For the $\llbracket 144,12,12 \rrbracket$ \acf{bb} code +% \cite[Sec.~3]{bravyi_high-threshold_2024} with $12$ syndrome +% measurement rounds, the number of \acp{vn} grows from $144$ to $9504$ +% and the number of \acp{cn} grows from $72$ to $1008$. + +% Existing solutions to these issues (sliding-window decoding + BP modifications) + +% The dominant strategy for keeping the latency of \ac{dem} decoding +% manageable is \emph{sliding-window decoding}. +% Instead of decoding the entire space-time history at once, the +% decoder operates on a window that spans only a few syndrome +% measurement rounds. 
+% After each round, the window slides forward, and the corrections in +% the part of the previous window that is no longer needed are committed. +% The idea originates with the \emph{overlapping recovery} scheme +% proposed for the surface code in \cite[Sec.~IV.B]{dennis_topological_2002} +% and has since been studied for surface and toric codes +% \cite{kuo_fault-tolerant_2024} as well as for \ac{qldpc} codes under +% both phenomenological and circuit-level noise +% \cite{huang_increasing_2024,gong_toward_2024,kang_quits_2025}. +% The structure of the decoding problem inside each window is +% reminiscent of \acf{sc}-\acf{ldpc} decoding from classical +% communications \cite[Intro.]{costello_spatially_2014}, where similar +% windowing techniques are used and where soft information is passed +% between consecutive windows +% \cite[Sec.~III.~C.]{hassan_fully_2016}. + +% We focus on QLDPC codes + +% In this work we focus on \acf{qldpc} codes, of which the \ac{bb} code +% mentioned above is one example. +% \ac{qldpc} codes have emerged as leading candidates for practical +% \ac{qec} due to their high encoding rates and large minimum distances +% at short syndrome-extraction-circuit depths +% \cite[Sec.~1]{bravyi_high-threshold_2024}. +% The natural decoder for them is \acf{bp}, which is well suited to +% sparse parity-check matrices and admits an efficient and parallel +% implementation, but is known to converge poorly on quantum codes due +% to quantum degeneracy and the unavoidable short cycles in the Tanner +% graph \cite[Sec.~II.C.]{babar_fifteen_2015}\cite[Sec.~V]{roffe_decoding_2020}. +% Several modifications of \ac{bp} have been proposed to address this: +% combining \ac{bp} with \acf{osd} \cite{roffe_decoding_2020}, decoding +% multiple variations of the code in parallel as in \acf{aed} +% \cite{koutsioumpas_automorphism_2025}, or extending \ac{bp} with +% guided decimation as in \acf{bpgd} \cite{yao_belief_2024}. 
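To make the windowing terminology concrete, the following Python sketch shows one plausible version of the index bookkeeping behind sliding-window decoding: each window spans several syndrome rounds, the oldest rounds are committed, and the remainder overlaps with the next window, where a warm-start decoder could reuse the corresponding \ac{bp} messages. The function name and parameters are illustrative assumptions, not the scheme implemented in this thesis.

```python
def sliding_windows(num_rounds, window, commit):
    """Sketch of sliding-window index bookkeeping (illustrative only).

    Each window spans `window` consecutive syndrome rounds.  After
    decoding a window, its first `commit` rounds are committed and the
    window advances; the remaining `window - commit` rounds overlap
    with the next window.  A warm-start decoder would carry the BP
    messages of this overlap region over to the next window.
    Returns a list of (committed_rounds, overlap_rounds) pairs.
    """
    schedule = []
    start = 0
    while start < num_rounds:
        stop = min(start + window, num_rounds)
        # the final window commits every round it covers
        commit_stop = stop if stop == num_rounds else start + commit
        schedule.append((list(range(start, commit_stop)),
                         list(range(commit_stop, stop))))
        start = commit_stop
    return schedule
```

For $12$ rounds with three-round windows and a one-round commit, this yields ten windows whose committed regions partition the rounds, with a two-round overlap available for message reuse between consecutive windows.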
+ +% Contributions of this Thesis + +% The existing realizations of sliding-window decoding for \ac{qec} +% discard the soft information produced inside one window before moving +% on to the next, in contrast to the analogous \ac{sc}-\ac{ldpc} +% decoders, which carry messages between windows +% \cite[Sec.~III.~C.]{hassan_fully_2016}. +% This thesis investigates whether the same idea can be carried over to +% the \ac{qec} setting. +% +% We propose \emph{warm-start sliding-window decoding}, in which the +% \ac{bp} messages from the overlap region of the previous window are +% reused to initialize \ac{bp} in the current window in place of the +% standard cold-start initialization. +% We formulate the warm start first for plain \ac{bp} and then for +% \ac{bpgd}, where some care is needed in deciding which information to +% carry over. +% The decoders are evaluated by Monte Carlo simulation on the +% $\llbracket 144,12,12 \rrbracket$ \ac{bb} code under standard +% circuit-based depolarizing noise over $12$ syndrome extraction rounds. +% The main finding is that warm-starting yields a consistent +% improvement at low iteration budgets, which is the regime relevant for +% fault-tolerant operation. + +% Outline of the Thesis + +\Cref{ch:Fundamentals} reviews the fundamentals of classical and +quantum error correction. +On the classical side, it covers binary linear block codes, +\ac{ldpc} and \ac{sc}-\ac{ldpc} codes, and the \ac{bp} decoding +algorithm. +On the quantum side, it introduces the relevant quantum mechanical +notation, stabilizer measurements, stabilizer codes, \acf{css} codes, +\ac{qldpc} codes, and the \ac{bpgd} algorithm. + +\Cref{ch:Fault tolerance} introduces fault-tolerant \ac{qec}. +It formalizes the notion of fault tolerance, presents the noise +models considered in this work, and develops the \ac{dem} formalism +through the measurement syndrome matrix, the detector matrix, and the +detector error matrix. 
+The chapter closes with a discussion of practical considerations +including the choice of noise model, the per-round \acf{ler}, and the +Stim toolchain. + +\Cref{ch:Decoding} considers practical aspects of decoding under \acp{dem}. +It reviews the existing literature on sliding-window decoding for +\ac{qec}, develops the formal windowing construction we build upon, +introduces the proposed warm-start sliding-window decoder for +plain \ac{bp} and for \ac{bpgd}, and reports numerical results on the +$\llbracket 144,12,12 \rrbracket$ \ac{bb} code. + +\Cref{ch:Conclusion} concludes the thesis and outlines directions for +further research. + diff --git a/src/thesis/chapters/2_fundamentals.tex b/src/thesis/chapters/2_fundamentals.tex index f442235..373f7f1 100644 --- a/src/thesis/chapters/2_fundamentals.tex +++ b/src/thesis/chapters/2_fundamentals.tex @@ -1,6 +1,8 @@ \chapter{Fundamentals of Classical and Quantum Error Correction} \label{ch:Fundamentals} +\acresetall + \Ac{qec} is a field of research combining ``classical'' communications engineering and quantum information science. This chapter provides the relevant theoretical background on both of @@ -1112,11 +1114,11 @@ An example of this is the CNOT gate introduced in One of the major barriers on the road to building a functioning quantum computer is the inevitability of errors during quantum computation. These arise due to the difficulty in sufficiently isolating the -qubits from external noise \cite[Intro.]{roffe_quantum_2019}. +qubits from external noise \cite[Sec.~1]{roffe_quantum_2019}. This isolation is critical for quantum systems, as the constant interactions with the environment act as small measurements, an effect called \emph{decoherence} of the quantum state -\cite[Intro.]{gottesman_stabilizer_1997}. +\cite[Sec.~1]{gottesman_stabilizer_1997}. \ac{qec} is one approach of dealing with this problem, by protecting the quantum state in a similar fashion to information in classical error correction. 
@@ -1145,7 +1147,7 @@ To this end, $k \in \mathbb{N}$ \emph{logical qubits} are mapped onto $n \in \mathbb{N}$ \emph{physical qubits}, $n>k$. We circumvent the no-cloning restriction by not copying the state of any of the $k$ logical qubits, instead spreading the total state out over all $n$ -physical qubits \cite[Intro.]{calderbank_good_1996}. +physical qubits \cite[Sec.~I]{calderbank_good_1996}. To differentiate quantum codes from classical ones, we denote a code with parameters $k,n$ and minimum distance $d_\text{min}$ using double brackets, as $\llbracket n,k,d_\text{min} \rrbracket$ @@ -1570,7 +1572,8 @@ Additionally, we amend the \ac{cn} update to consider the parity indicated by the syndrome, calculating \begin{align*} L_{i\leftarrow j} = 2\cdot (-1)^{s_j} \cdot \tanh^{-1} \left( \prod_{i'\in - \mathcal{N}_\text{C}(j)\setminus \{i\}} \tanh \frac{L_{i'\rightarrow j}}{2} \right) + \mathcal{N}_\text{C}(j)\setminus \{i\}} \tanh + \frac{L_{i'\rightarrow j}}{2} \right) . \end{align*} The resulting syndrome-based \ac{bp} algorithm is shown in diff --git a/src/thesis/chapters/4_decoding_under_dems.tex b/src/thesis/chapters/4_decoding_under_dems.tex index 4eb42b7..3b38ec3 100644 --- a/src/thesis/chapters/4_decoding_under_dems.tex +++ b/src/thesis/chapters/4_decoding_under_dems.tex @@ -1,5 +1,6 @@ % TODO: Make all [H] -> [t] \chapter{Decoding under Detector Error Models} +\label{ch:Decoding} In \Cref{ch:Fundamentals} we introduced the fundamentals of classical error correction, before moving on to quantum information science and diff --git a/src/thesis/chapters/5_conclusion_and_outlook.tex b/src/thesis/chapters/5_conclusion_and_outlook.tex index f932458..e7375a7 100644 --- a/src/thesis/chapters/5_conclusion_and_outlook.tex +++ b/src/thesis/chapters/5_conclusion_and_outlook.tex @@ -1,4 +1,5 @@ \chapter{Conclusion and Outlook} +\label{ch:Conclusion} \content{Takeaway: Warm-start more effective for lower numbers of max iterations (plays into our hands because lower 
number of iterations