From 1edc3f301aedda33e880a10a2970dbfae4c192f0 Mon Sep 17 00:00:00 2001 From: Andreas Tsouchlos Date: Mon, 4 May 2026 18:42:39 +0200 Subject: [PATCH] Final readthrough corrections for decoding chapter --- src/thesis/chapters/4_decoding_under_dems.tex | 66 ++++++++++--------- 1 file changed, 34 insertions(+), 32 deletions(-) diff --git a/src/thesis/chapters/4_decoding_under_dems.tex b/src/thesis/chapters/4_decoding_under_dems.tex index 8e05d3f..ec1f568 100644 --- a/src/thesis/chapters/4_decoding_under_dems.tex +++ b/src/thesis/chapters/4_decoding_under_dems.tex @@ -3,9 +3,9 @@ \label{ch:Decoding} In \Cref{ch:Fundamentals}, we introduced the fundamentals of classical -error correction, before moving on to quantum information science and +error correction, before turning to quantum information science and finally combining the two in \acf{qec}. -In \Cref{ch:Fault tolerance}, we then turned to fault-tolerance, with +In \Cref{ch:Fault tolerance}, we then considered fault-tolerance, with a focus on a specific way of implementing it, called \acfp{dem}. In this chapter, we move on from the fundamental concepts and examine how to apply them in practice. @@ -14,7 +14,7 @@ Specifically, we consider the practical aspects of decoding under \acp{dem}. In particular, we investigate decoding \acf{qldpc} codes under \acp{dem}. We focus on \ac{qldpc} codes, as they have emerged as leading candidates for practical quantum error correction, offering -comparable thresholds with substantially improved encoding rates +good thresholds with substantially improved encoding rates \cite[Sec.~1]{bravyi_high-threshold_2024}. Because of this, the decoding algorithms we consider will all be based on \acf{bp}. @@ -29,7 +29,7 @@ exist, the \ac{bp} algorithm becomes uncertain of the direction to proceed in. Additionally, the commutativity conditions of the stabilizers necessitate the existence of short cycles. 
Together, these two aspects lead to substantial convergence problems -of \ac{bp} for quantum codes, when it is used on its own. +of \ac{bp} for quantum codes, when employed on its own. Second, the consideration of circuit-level noise introduces many more error locations into the circuit. @@ -49,7 +49,7 @@ but rather quantum codes in general. Many different approaches to solving it exist, usually centered around modifying \ac{bp}. The most popular approach is combining a few initial iterations of -\ac{bp} with a second decoding algorithm, namely \ac{osd} +\ac{bp} with a second decoding algorithm, \ac{osd} \cite{roffe_decoding_2020}. Other approaches exist, such as \ac{aed} \cite{koutsioumpas_automorphism_2025}, where multiple variations of @@ -274,7 +274,7 @@ The employed noise models also differ; Finally, in \cite{gong_toward_2024} the authors introduce their own variation of \ac{bpgd}, \ac{bp} with \ac{gdg}, while \cite{huang_increasing_2024} and \cite{kang_quits_2025} use \ac{bp} + \ac{osd}. -We would additionally like to note that only in +We would additionally like to note that only \cite{gong_toward_2024} and \cite{kang_quits_2025} explicitly work with the \ac{dem} formalism. @@ -392,7 +392,7 @@ is in turn based on \cite{huang_increasing_2024}. Sliding-window decoding is made possible by the time-like structure of the syndrome extraction circuitry. This is especially clearly visible under the \ac{dem} formalism, where -this manifests as a block-diagonal structure of the detector +it manifests as a block-diagonal structure of the detector error matrix $\bm{H}$. Note that this presupposes a choice of detectors as seen in \Cref{subsec:Detector Error Matrix}. @@ -412,7 +412,7 @@ no longer contribute to decoding, since none of their neighboring \acp{vn} appear in subsequent windows. 
We call the set of \acp{vn} connected to those \acp{cn} the \emph{commit region} and we commit them before moving to the -next window, i.e., fix the values we estimate for the corresponding bits. +next window, i.e., we fix the values we estimate for the corresponding bits. The benefit of this sequential sliding-window decoding approach is that the decoding process can begin as soon as the syndrome measurements for the first window are complete. @@ -496,7 +496,7 @@ respective nodes. In this case, we take $\bm{H} \in \mathbb{F}_2^{m\times n}$ to be the check matrix of the underlying code, from which the \ac{dem} was generated. We use $m_\text{DEM}, \mathcal{I}_\text{DEM}$, and $\mathcal{J}_\text{DEM}$ -to refer to the respective values defined from the detector error matrix. +to refer to the respective values defined for the detector error matrix. % How we get the corresponding rows @@ -515,7 +515,7 @@ Thus, we define \mathcal{J}_\text{win}^{(\ell)} &:= \left\{ j\in \mathcal{J}_\text{DEM}:~ \ell F m \le j < \min \left\{m_\text{DEM}, (\ell F + W) m \right\} \right\} \\[2mm] - & \hspace{30mm} \text{and} \\[2mm] + & \hspace{37mm} \text{and} \\[2mm] \mathcal{J}_\text{commit}^{(\ell)} &:= \left\{ j\in \mathcal{J}_\text{DEM}:~ \ell F m \le j < \min \left\{m_\text{DEM}, (\ell + 1) F m \right\} \right\} @@ -531,7 +531,7 @@ $\ell$ and $\ell + 1$ as $\mathcal{J}_\text{overlap}^{(\ell)} := % How we get the corresponding columns -We can now turn our attention to defining the sets of \acp{vn} relevant +We now turn our attention to defining the sets of \acp{vn} relevant to each window. We first introduce a helper function $i_\text{max} : \mathcal{P}(\mathbb{N}) \to \mathbb{N}$, which takes a set @@ -735,7 +735,7 @@ initialization can be used with. For instance, \ac{bp}+\ac{osd} does not immediately seem suitable, as it performs a hard decision on the \acp{vn}, though this remains to be investigated. 
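[Editorial aside, not part of the patch: the row-index sets $\mathcal{J}_\text{win}^{(\ell)}$ and $\mathcal{J}_\text{commit}^{(\ell)}$ touched by the hunk above can be sketched as a small Python helper. The function and variable names are hypothetical; only the set definitions are taken from the thesis text: each syndrome-extraction round contributes $m$ detector rows, a window spans $W$ rounds, and the window advances by $F$ rounds per step.]

```python
def window_rows(ell, m, m_dem, W, F):
    """Row (detector) index sets for sliding-window step `ell`.

    Mirrors the definitions in the text:
      J_win    = { j : ell*F*m <= j < min(m_dem, (ell*F + W)*m) }
      J_commit = { j : ell*F*m <= j < min(m_dem, (ell + 1)*F*m) }
    Hypothetical helper; not part of the thesis codebase.
    """
    start = ell * F * m
    j_win = set(range(start, min(m_dem, (ell * F + W) * m)))
    j_commit = set(range(start, min(m_dem, (ell + 1) * F * m)))
    return j_win, j_commit
```

The overlap region of consecutive windows then falls out as the intersection $\mathcal{J}_\text{win}^{(\ell)} \cap \mathcal{J}_\text{win}^{(\ell+1)}$.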
-We chose to investigate first standard \ac{bp} due to its simplicity and
+We chose to investigate plain \ac{bp} first due to its simplicity and
then \ac{bpgd} because of the availability of recently computed messages.

% TODO: Include this?

@@ -1249,10 +1249,11 @@
For the circuit generation, we employed utilities from QUITS
\cite{kang_quits_2025}, which provides syndrome extraction circuitry
generation for a number of different \ac{qldpc} codes.
We initially created a Python implementation, which used QUITS for the window
-splitting and subsequent sliding-window decoding as well.
-The \ac{bp} and \ac{bpgd} are implementation in Rust to achieve
-higher simulation speeds leveraging the compiled nature of the language.
-We reimplemented both the window splitting and the decoders.
+splitting and subsequent sliding-window decoding as well, before
+reimplementing it in Rust.
+The \ac{bp} and \ac{bpgd} decoders are implemented in Rust to
+achieve higher simulation speeds, leveraging the compiled nature
+of the language.

% Global experimental setup

@@ -1268,7 +1269,7 @@
For the generation of the \ac{dem} we set the number of syndrome
extraction rounds to $12$, similarly to \cite{gong_toward_2024}, and
we defined our detectors as in the example in
\Cref{subsec:Detector Error Matrix}.
-We employed circuit-lose noise as described in
+We employed circuit-level noise as described in
\Cref{subsec:Choice of Noise Model} as our noise model, specifically
standard circuit-based depolarizing noise
\cite[Sec.~VIII]{fowler_high-threshold_2009},
i.e., all error locations in the circuit get assigned the same

@@ -1285,7 +1286,7 @@
generated by simulating at least $200$ logical error events.

We begin our investigation by using \ac{bp} with no further
modifications as the inner decoder.
-We chose the min-sum variant of \ac{bp} due to its low computational complexity.
+We choose the min-sum variant of \ac{bp} due to its low computational
+complexity.
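[Editorial aside, not part of the patch: the min-sum check-node update chosen in the hunk above can be sketched in a few lines of Python. This is the plain (unscaled) min-sum rule; the function name is hypothetical and this is not the thesis' Rust implementation.]

```python
def min_sum_check_update(incoming):
    """Plain min-sum check-node update on LLR messages.

    For each edge, the outgoing message carries the product of the
    signs and the minimum magnitude of all *other* incoming messages.
    Hypothetical illustrative helper, unscaled (no normalization).
    """
    out = []
    for i in range(len(incoming)):
        others = incoming[:i] + incoming[i + 1:]
        sign = 1.0
        for msg in others:
            if msg < 0:
                sign = -sign
        magnitude = min(abs(msg) for msg in others)
        out.append(sign * magnitude)
    return out
```

Its low complexity comes from replacing the sum-product's hyperbolic-tangent computations with sign and minimum operations.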
% [Thread] Get impression for max gain @@ -1295,7 +1297,7 @@ To this end, we begin by analyzing the decoding performance of the original process, without our warm-start modification. We will call this \emph{cold-start} decoding in the following. Because we expect more global decoding to work better (the inner - decoder then has access to a larger portion of the long-range + decoder has access to a larger portion of the long-range correlations encoded in the detector error matrix before any commit is made) we initially decide to use decoding on the whole detector error matrix as a proxy for the attainable decoding performance. @@ -1729,7 +1731,7 @@ hypothesis put forward above. The whole-block decoder eventually overtaking every windowed scheme matches the prediction made there: with a sufficiently large iteration budget, the whole-block decoder reaches an error rate -that nonone of the windowed schemes can beat, because of the more global +that none of the windowed schemes can beat, because of the more global nature of the considered constraints. Furthermore, the pronounced advantage of warm- over cold-start decoding at low numbers of iterations makes sense if we consider the overall trend of the plots. @@ -1742,7 +1744,7 @@ initialization diminishes, and the curves approach each other. The fact that no curve clearly saturates within the swept range is itself worth noting. We know that \ac{bp} on \ac{qldpc} codes suffers from poor -convergence due to degeneracy and the short cycles in the underlying +convergence due to degeneracy and short cycles in the underlying Tanner graph, so even after several thousand iterations the decoder may continue to slowly refine its message estimates rather than settle into a stable fixed point. @@ -1968,9 +1970,9 @@ previous experiments. % [Experimental parameters] Figure 4.9 \Cref{fig:bp_f} summarizes the results of this investigation. 
-In both panels the dashed colored curves correspond to cold-start
-sliding-window decoding for $F \in \{1, 2, 3\}$ and the solid colored
-curves to the corresponding warm-start runs.
+In both panels, the dashed curves correspond to cold-start
+sliding-window decoding for $F \in \{1, 2, 3\}$ and the solid
+curves to warm-start decoding.
The window size is fixed to $W = 5$ throughout.
\Cref{fig:bp_f_over_p} sweeps the physical error rate over
$p \in [0.001, 0.004]$ in steps of $0.0005$ at a fixed maximum of

@@ -1990,7 +1992,7 @@ monotonic increase of the per-round \ac{ler}
with the physical error rate.
At fixed $F$, the warm-start approach
lies below cold-start across the entire sweep, and at fixed
-warm- or cold-start, smaller $F$ produces a lower \ac{ler}.
+warm or cold start, smaller $F$ produces a lower \ac{ler}.
Both gaps grow as the physical error rate decreases:
the curves at $F = 1$ separate further from those at $F = 2$ and
$F = 3$, and the warm-start curves separate further from the cold-start ones.

@@ -2314,7 +2316,7 @@ Recall that the warm start for \ac{bpgd} carries over
not only the \ac{bp} messages on the edges of the overlap region but
also the decimation information.
-Because we run with an iteration budget large enough to decimate
+Because we decode with an iteration budget large enough to decimate
every \ac{vn} in a window, by the time window $\ell$ ends, all of its
\acp{vn} have already been hard-decided.
For the \acp{vn} that lie in the overlap region with window $\ell + 1$

@@ -2515,8 +2517,8 @@ fixed physical error rate.
\Cref{fig:bpgd_iter} shows the per-round \ac{ler} of \ac{bpgd}
sliding-window decoding as a function of the maximum number of inner
\ac{bp} iterations $n_\text{iter}$.
-The dashed colored curves correspond to cold-start sliding-window
-decoding and the solid colored curves to warm-start, which again
+The dashed curves correspond to cold-start sliding-window
+decoding and the solid curves to warm-start, which again
retains both the \ac{bp} messages and the decimation information
on the overlap region.
The physical error rate is fixed at $p = 0.0025$ and the iteration

@@ -2607,8 +2609,8 @@ of the warm-start curves and limit ourselves to noting it.
The natural consequence of the previous diagnosis is to drop the
problematic part of the warm-start initialization for \ac{bpgd} and
to carry over only the \ac{bp} messages on the edges of the overlap
-region, as in \Cref{fig:messages_tanner}, while leaving the channel
-\acp{llr} of the next window in their original cold-start state.
+region, as in \Cref{fig:messages_tanner}, while leaving the
+decimation information of the next window in its original cold-start state.
Note that some information about the previous window's decimation
state is still implicitly carried over through the \ac{bp} messages,
since the decimation decisions were made based on the messages themselves.

@@ -2803,7 +2805,7 @@ as $F$ grows.

% [Description] Interpretation 4.12

-Removing the channel \acp{llr} from the warm-start initialization lifts
+Removing the decimation information from the warm-start initialization lifts
the warm-start regression observed in \Cref{fig:bpgd_wf},
and warm-start now consistently outperforms cold-start.
The dependence on the window size and the step size also recovers

@@ -3036,7 +3038,7 @@ performance gain when the inner decoder is
upgraded from plain \ac{bp} to its guided-decimation variant, but
only if some care is taken in choosing what information to carry over.
Passing the channel \acp{llr} along with the \ac{bp} messages,
-as suggested by naively carrying over the warm-start idea to \ac{bpgd},
+as suggested by naively transferring the warm-start idea to \ac{bpgd},
leads to premature hard decisions on \acp{vn} in the overlap region.
This results in warm-start initialization actually worsening
performance compared to cold-start initialization.