diff --git a/src/thesis/chapters/3_fault_tolerant_qec.tex b/src/thesis/chapters/3_fault_tolerant_qec.tex
index 17de3df..eeed27f 100644
--- a/src/thesis/chapters/3_fault_tolerant_qec.tex
+++ b/src/thesis/chapters/3_fault_tolerant_qec.tex
@@ -16,17 +16,19 @@ using qubits.
While the use of error correcting codes may facilitate this, it
also introduces two new challenges
\cite[Sec.~4]{gottesman_introduction_2009}:
\begin{itemize}
-    \item We must be able to perform operations on the encoded state
-    in such a way that we do not lose the protection against errors.
-    \item \ac{qec} systems are themselves partially implemented in
-    quantum hardware. In addition to the errors we have
-    originally introduced them for, these systems must
-    be able to account for the fact they are implemented on noisy
-    hardware themselves.
+    \item To realize a quantum algorithm, we must be able to
+    perform operations on the encoded state in such a way that we
+    do not lose the protection against errors.
+    \item \ac{qec} systems, in particular the syndrome extraction
+    circuit, are themselves partially implemented in
+    quantum hardware.
+    In addition to the errors we have originally introduced them
+    for, these systems must be able to account for the fact that
+    they are implemented on noisy hardware themselves.
\end{itemize}
In the literature, both of these points are viewed under the
umbrella of \emph{fault-tolerant} quantum computing.
-We focus only on the second aspect in this work.
+In this thesis, we focus only on the second aspect.
It was recognized early on as a challenge of \ac{qec} that the
correction machinery itself may introduce new faults
\cite[Sec.~III]{shor_scheme_1995}.
@@ -43,16 +45,16 @@ address both.
We model the possible occurrence of errors during any processing
stage as different \emph{error locations} $E_i,~i\in [1:N]$ in the
circuit.
-$N \in \mathbb{N}$ is the total number of considered error locations.
+The parameter $N \in \mathbb{N}$ is the total number of considered
+error locations.
The \emph{circuit error vector} $\bm{e} \in \{0,1\}^N$ is a vector
indicating which errors occurred, with
\begin{align*}
e_i :=
\begin{cases}
-        1, & \text{Error $E_i$ occurred} \\
-        0, & \text{otherwise}
+        1, & \text{error $E_i$ occurred}, \\
+        0, & \text{otherwise}.
\end{cases}
-    .%
\end{align*}
\Cref{fig:fault_tolerance_overview} illustrates the flow of errors.
Specifically for \ac{css} codes, a \ac{qec} procedure is deemed
@@ -72,12 +74,14 @@ fault-tolerant, if \cite[Def.~4.2]{derks_designing_2025}
where $t = \lfloor (d_\text{min} -1)/2 \rfloor$ is the number of
errors the code is able to correct.
The vectors $\bm{e}_{\text{output},X}$ and $\bm{e}_{\text{output},Z}$
-denote only $X$ and $Z$ errors respectively.
+denote only $X$ and $Z$ errors, respectively.
% TODO: Properly introduce d_min for QEC, specifically for CSS codes
In order to deal with internal errors that flip syndrome bits,
-multiple rounds of syndrome measurements must be performed.
-Typically, the number of syndrome extraction rounds is chosen as $d_\text{min}$.
+multiple rounds of syndrome measurements are performed.
+Typically, the number of syndrome extraction rounds is chosen as
+$d_\text{min}$; see, e.g.,
+\cite{gong_toward_2024,koutsioumpas_automorphism_2025}.
%
% This is the definition of a fault-tolerant QEC gadget
% A \ac{qec} procedure is deemed fault tolerant if
@@ -150,7 +154,7 @@ Typically, the number of syndrome extraction rounds is chosen as $d_\text{min}$.
% Intro
We collect the probabilities of error at each location in the
-\emph{noise model}, a vector $\bm{p} \in [0,1]^N$.
+\emph{noise model}, represented by a vector $\bm{p} \in [0,1]^N$.
There are different types of noise models, each allowing for
different error locations in the circuit.
@@ -178,8 +182,7 @@ $\ket{\psi}_\text{L}$ as \emph{data qubits}.
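The circuit error vector and noise model introduced above can be illustrated with a short sketch. This is a hypothetical example, not part of the thesis: the number of error locations $N$ and the entries of $\bm{p}$ are placeholder values, and each $e_i$ is drawn independently as $\text{Bern}(p_i)$.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical noise model p in [0,1]^N with N = 8 error locations
# (the values are placeholders chosen purely for illustration).
N = 8
p = np.full(N, 0.05)

# Circuit error vector e in {0,1}^N: e_i ~ Bern(p_i) indicates
# whether error E_i occurred at location i.
e = (rng.random(N) < p).astype(np.uint8)
print(e.shape)
```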
Note that this is a concrete implementation using CNOT gates, as opposed to the system-level view introduced in \Cref{subsec:Stabilizer Codes}. -We visualize the different types of noise models in -\Cref{fig:noise_model_types}. +\Cref{fig:noise_model_types} visualizes the different types of noise models. %%%%%%%%%%%%%%%% \subsection{Bit-Flip Noise} @@ -190,7 +193,7 @@ This corresponds to the classical \ac{bsc}, i.e., only $X$ errors on the data qubits are possible \cite[Appendix~A]{gidney_new_2023}. The occurrence of bit-flip errors is modeled as a Bernoulli process $\text{Bern}(p)$. -This type of noise model is shown in \Cref{subfig:bit_flip}. +\Cref{subfig:bit_flip} shows this type of noise model. Note that bit-flip noise is not suitable for developing fault-tolerant systems, as it does not account for errors during the syndrome extraction. @@ -223,7 +226,7 @@ Here, we consider multiple rounds of syndrome measurements with a depolarizing channel before each round. Additionally, we allow for measurement errors by having $X$ error locations right before each measurement \cite[Appendix~A]{gidney_new_2023}. -Note that it is enough to only consider $X$ errors at these points, +Note that it is enough to only consider $X$ errors before measuring, since that is the only type of error directly affecting the measurement outcomes. This model is depicted in \Cref{subfig:phenomenological}. @@ -253,7 +256,7 @@ While phenomenological noise is useful for some design aspects of fault-tolerant circuitry, for simulations, circuit-level noise should always be used \cite[Sec.~4.2]{derks_designing_2025}. Note that this introduces new challenges during the decoding process, -as the decoding complexity is increased considerably due to the many +as the decoding complexity is considerably increased due to the many error locations. \begin{figure}[t] @@ -284,11 +287,11 @@ error locations. framework for passing information about a circuit used for \ac{qec} to a decoder. 
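The noise model types just discussed can be contrasted in code. The following sketch samples one shot of phenomenological noise; all parameters ($n$, $n-k$, $R$, $p$) are illustrative assumptions, not values from the thesis.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Illustrative parameters (not from the thesis): n = 3 data qubits,
# m = n - k = 2 syndrome measurements per round, R = 3 rounds, p = 0.01.
n, m, R, p = 3, 2, 3, 0.01

# Phenomenological noise: X errors on each data qubit before every
# round, plus an X error location right before each measurement
# (bit-flip noise would keep only the first of these two arrays).
e_data = rng.random((R, n)) < p  # data-qubit flips, one set per round
e_meas = rng.random((R, m)) < p  # measurement flips, one per syndrome bit

# Total number of error locations N in this noise model.
N = e_data.size + e_meas.size
print(N)
```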
They are also useful as a theoretical tool to aid in the design of
-fault-tolerant \ac{qec} schemes.
-E.g., they can be used to easily determine whether a measurement
-schedule is fault-tolerant \cite[Example~12]{derks_designing_2025}.
+fault-tolerant \ac{qec} schemes; e.g., they can be used to easily
+determine whether a measurement schedule is fault-tolerant
+\cite[Example~12]{derks_designing_2025}.
-Other approaches of implementing fault-tolerance circuits exist, such as
+Other approaches to implementing fault-tolerant circuits exist, e.g.,
flag error correction, which uses additional ancilla qubits to detect
potentially damaging high-weight errors \cite[Sec.~1]{chamberland_flag_2018}.
However, \acp{dem} offer some unique advantages
@@ -310,7 +313,7 @@ To achieve fault tolerance, the goal we strive towards is to
consider the internal errors in addition to the input errors during
the decoding process.
The core idea behind detector error models is to do this by defining
-a new \emph{circuit code} that describes the circuit.
+a new \emph{circuit code} describing the whole circuit.
Each \ac{vn} of this new code corresponds to an error location in
the circuit and each \ac{cn} corresponds to a syndrome measurement.
% This circuit code, combined with the prior probabilities of error
@@ -446,12 +449,11 @@ matrix} $\bm{\Omega} \in \mathbb{F}_2^{M\times N}$, with
\begin{align*}
\Omega_{\ell,i} =
\begin{cases}
-        1, & \text{Error $i$ flips measurement $\ell$}\\
-        0, & \text{otherwise}
+        1, & \text{error $i$ flips measurement $\ell$},\\
+        0, & \text{otherwise},
\end{cases}
-    ,%
\end{align*}
-where $M \in \mathbb{N}$ is the number of measurements.
+where $M \in \mathbb{N}$ is the number of syndrome measurements performed.
To obtain $\bm{\Omega}$, we must propagate Pauli errors through the
circuit, tracking which measurements they affect
\cite[Sec.~2.4]{derks_designing_2025}.
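As a concrete (hypothetical) instance of this propagation, the sketch below builds $\bm{\Omega}$ for the $[3,1]$ repetition code with $R=2$ rounds of phenomenological noise. It assumes that a data-qubit $X$ error persists and therefore flips the corresponding checks in its own round and all later rounds, while a measurement flip affects only a single outcome; the column ordering (round by round, data errors before measurement errors) is an arbitrary choice for illustration, not the thesis convention.

```python
import numpy as np

# Z-type parity-check matrix of the [3,1] repetition code.
H_Z = np.array([[1, 1, 0],
                [0, 1, 1]], dtype=np.uint8)
n_minus_k, n = H_Z.shape
R = 2                    # syndrome extraction rounds
M = R * n_minus_k        # total number of measurements

columns = []
for r in range(R):
    # Data-qubit X error before round r: it persists, so it flips
    # the corresponding checks in round r and every later round.
    for q in range(n):
        col = np.zeros(M, dtype=np.uint8)
        for rr in range(r, R):
            col[rr * n_minus_k:(rr + 1) * n_minus_k] = H_Z[:, q]
        columns.append(col)
    # Measurement flip before measurement ell of round r: it affects
    # only that single measurement outcome.
    for ell in range(n_minus_k):
        col = np.zeros(M, dtype=np.uint8)
        col[r * n_minus_k + ell] = 1
        columns.append(col)

Omega = np.stack(columns, axis=1)  # shape (M, N)
print(Omega.shape)
```

Because later errors cannot affect earlier measurements, the columns belonging to round $r$ are zero in all rows of rounds before $r$, giving the (block-)triangular structure discussed in the text.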
@@ -466,8 +468,8 @@ Each round yields an additional set of syndrome bits, and we combine them by stacking them in a new vector $\bm{s} \in \mathbb{F}_2^{R(n-k)}$, where $R \in \mathbb{N}$ is the number of syndrome measurement rounds. -We thus have to replicate the rows of $\bm{H}_Z$, once for each -additional syndrome measurement, to obtain +Thus, we have to replicate the rows of $\bm{H}_Z$, once for each +additional syndrome measurement, and obtain \begin{align*} \bm{\Omega}_0 = \begin{pmatrix} @@ -493,11 +495,11 @@ extraction circuitry, so we still consider only bit flip noise at this stage. Recall that $\bm{\Omega}_0$ describes which \ac{vn} is connected to which parity check and the syndrome indicates which parity checks are violated. -This means that if an error exists at only a single \ac{vn}, we can -read off the syndrome in the corresponding column. +Therefore, if an error occurs that corresponds to a single \ac{vn}, +the measured syndrome is the corresponding column. If errors occur at multiple locations, the resulting syndrome will be the linear combination of the respective columns. -We thus have +Thus, we have \begin{align*} \bm{s} \in \text{span} \{\bm{\Omega}_0\} .% @@ -505,13 +507,13 @@ We thus have % Expand to phenomenological -We now wish to expand the error model to phenomenological noise, though +Next, we expand the error model to phenomenological noise, though only considering $X$ errors in this case. We introduce new error locations at the appropriate positions, -arriving at the circuit depicted in +resulting in the circuit depicted in \Cref{fig:rep_code_multiple_rounds_phenomenological}. For each additional error location, we extend $\bm{\Omega}_0$ by -appending the corresponding syndrome vector as a column. +appending the corresponding syndrome vector as a column, yielding \begin{gather} \label{eq:syndrome_matrix_ex} \bm{\Omega}_1 = @@ -790,15 +792,14 @@ to a detector. 
We should note at this point that the combination of measurements
into detectors has no bearing on the actual construction of the
syndrome extraction circuitry.
-It is something that happens ``virtually'' after the fact and only
-affects the decoder.
+It is something that happens ``virtually'' and only affects the decoder.
Note that we can use the detector matrix $\bm{D}$ to describe the
-set of possible measurement outcomes under the absence of noise.
+set of possible measurement outcomes in the absence of noise.
-Similar to the we use a \ac{pcm} to describe the code space as
+Similar to how we use a \ac{pcm} to describe the code space as
\begin{equation*}
\mathcal{C}
-    = \{ \bm{x} \in \mathbb{F}_2^{n} : \bm{H}\bm{x}^\text{T} = \bm{0} \}
+    = \{ \bm{x} \in \mathbb{F}_2^{n} : \bm{H}\bm{x}^\mathsf{T} = \bm{0} \}
,%
\end{equation*}
the set of possible measurement outcomes is simply $\text{kern}\{\bm{D}\}$
@@ -815,7 +816,7 @@ affect the measurements (through $\bm{\Omega}$), and
we know how the measurements relate to the detectors (through $\bm{D}$).
For decoding, we are interested in the effect of the errors on the
detectors directly.
-We thus construct the \emph{detector error matrix} $\bm{H} \in
+Thus, we construct the \emph{detector error matrix} $\bm{H} \in
\mathbb{F}_2^{D\times N}$ \cite[Def.~2.9]{derks_designing_2025} as
\begin{align*}
\bm{H} := \bm{D}\bm{\Omega}
@@ -859,7 +860,7 @@ It may, however, change the decoding performance when
using a practical decoder.
What constitutes a good set of detectors is difficult to assess
without performing explicit decoding simulations, since it ultimately
-depends on the decoder employed.
+depends on the employed decoder.
For iterative decoders, high sparsity is generally beneficial, but
finding detectors that maximize sparsity is an NP-complete problem
\cite[Sec.~2.6]{derks_designing_2025}.
@@ -868,7 +869,7 @@ at a later stage.
To the measurement results from each syndrome extraction round we
can add the results from the previous round, as illustrated in
\Cref{fig:detectors_from_measurements_general}.
-We thus have $D=n-k$.
+Thus, we have $D=n-k$.
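The construction $\bm{H} = \bm{D}\bm{\Omega}$ is a plain matrix product over $\mathbb{F}_2$. The matrices in the sketch below are small placeholders (not taken from the thesis): $\bm{D}$ encodes detectors that XOR each round's measurement with the previous round's, for $M=4$ measurements.

```python
import numpy as np

# Placeholder detector matrix D: 4 detectors over M = 4 measurements,
# each later detector XOR-ing a measurement with its predecessor.
D = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1]], dtype=np.uint8)

# Placeholder measurement syndrome matrix Omega (M = 4 rows,
# N = 5 error locations); entries are illustrative only.
Omega = np.array([[1, 0, 1, 0, 0],
                  [1, 1, 1, 0, 0],
                  [1, 0, 0, 1, 1],
                  [1, 1, 0, 0, 1]], dtype=np.uint8)

# Detector error matrix H = D * Omega, computed modulo 2.
H = (D @ Omega) % 2
print(H.shape)
```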
Concretely, we denote the outcome of measurement $\ell \in [1:n-k]$ in round $r \in [1:R]$ by $m_\ell^{(r)} \in \mathbb{F}_2$ @@ -935,17 +936,18 @@ note that the error $E_6$ in \Cref{fig:rep_code_multiple_rounds_phenomenological} has not only triggered the measurements in the syndrome extraction round immediately afterwards, but all subsequent ones as well. -To only see errors in the rounds immediately following them, we -consider our newly defined detectors instead of the measurements, -that effectively compute the difference between the measurements. +To only see the effect of errors in the syndrome measurement round +immediately following them, we consider our newly defined detectors +instead of the measurements, that effectively compute the difference +between the measurements. -Each error can only trigger syndrome bits that follow it. +Hereby, each error can only trigger syndrome bits that follow it. This is reflected in the triangular structure of $\bm{\Omega}$ in \Cref{eq:syndrome_matrix_ex}. Combining the measurements into detectors according to \Cref{eq:measurement_combination}, we are effectively performing row additions in such a way as to clear the bottom left of the matrix. -The detector error matrix +The resulting detector error matrix \begin{align*} \bm{H} = \left( @@ -959,7 +961,7 @@ The detector error matrix \end{array} \right) \end{align*} -obtained this way has a block-diagonal structure. +has a block-diagonal structure. Note that we exploit the fact that each syndrome measurement round is identical to obtain this structure. @@ -1008,9 +1010,8 @@ error matrix $\bm{H}$ and the noise model $\bm{p}$. \cite[Sec.~6]{derks_designing_2025}. It serves as an abstract representation of a circuit and can be used both to transfer information to a decoder but also to aid in the -design of fault-tolerant systems. -E.g., it can be used to investigate the properties of a circuit with -respect to fault tolerance. 
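The row additions described above can be verified numerically. The following sketch uses illustrative parameters (not the thesis example): $n-k=2$ checks, $R=3$ rounds, and one persistent data-error location per round whose per-round check signature is $(1,1)^\mathsf{T}$. Combining consecutive measurement rounds into detectors turns the triangular $\bm{\Omega}$ into a block-diagonal $\bm{H}$.

```python
import numpy as np

n_minus_k, R = 2, 3      # checks per round, number of rounds
M = R * n_minus_k        # total measurements

# Persistent error before round r flips the same checks in all
# rounds r..R-1, producing a (block-)triangular Omega.
sig = np.array([1, 1], dtype=np.uint8)  # per-round check signature
Omega = np.zeros((M, R), dtype=np.uint8)
for r in range(R):
    for rr in range(r, R):
        Omega[rr * n_minus_k:(rr + 1) * n_minus_k, r] = sig

# Detectors: XOR each round's measurements with the previous round's.
D = np.eye(M, dtype=np.uint8)
for r in range(1, R):
    for ell in range(n_minus_k):
        D[r * n_minus_k + ell, (r - 1) * n_minus_k + ell] = 1

H = (D @ Omega) % 2

# Each error now triggers only the detectors of the round immediately
# following it, i.e., H is block-diagonal.
for r in range(R):
    block = H[r * n_minus_k:(r + 1) * n_minus_k, :]
    assert not block[:, :r].any() and not block[:, r + 1:].any()
print(H.shape)
```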
+design of fault-tolerant systems; e.g., it can be used to investigate
+the properties of a circuit with respect to fault tolerance.
It contains all information necessary for the decoding process.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -1052,7 +1053,7 @@ value, the physical error rate $p_\text{phys}$.
% Per-round LER
-Another aspect that is important to consider is the meaning of the
+Another important aspect to consider is the meaning of the
\ac{ler} in the context of a \ac{qec} system with multiple rounds
of syndrome measurements.
In order to facilitate the comparability of results obtained from
@@ -1063,7 +1064,7 @@ The simplest way of calculating the per-round \ac{ler} is by
modeling each round as an independent experiment.
For each experiment, an error might occur with a certain
probability $p_\text{e,round}$.
-The overall probability of error is then
+Then the overall probability of error is
\begin{align}
\hspace{-12mm}
p_\text{e,total} &= 1 - (1 - p_\text{e,round})^{R} \nonumber\\
@@ -1073,13 +1074,14 @@ The overall probability of error is then
.%
\hspace{12mm}
\end{align}
-We approximate $p_\text{e,total}$ using a Monte Carlo simulation and
-compute the per-round-\ac{ler} using \Cref{eq:per_round_ler}.
+To this end, we approximate $p_\text{e,total}$ using a Monte Carlo
+simulation and compute the per-round \ac{ler} according to
+\Cref{eq:per_round_ler}.
This is the approach taken in
\cite{gong_toward_2024}\cite{wang_fully_2025}.
Another approach \cite{chen_exponential_2021}%
\cite{bausch_learning_2024}\cite{beni_tesseract_2025} is to assume an
-exponential decay for the decoder's \emph{logical fidelity}
+exponential decay for the \emph{logical fidelity} of the decoder
\cite[Eq.~(2)]{bausch_learning_2024}
\begin{align*}
F_\text{total} = (F_\text{round})^{R}
@@ -1104,10 +1106,10 @@ topic to our own work.
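Both per-round conversions described above amount to one-line formulas; inverting $p_\text{e,total} = 1 - (1 - p_\text{e,round})^R$ gives $p_\text{e,round} = 1 - (1 - p_\text{e,total})^{1/R}$, and the fidelity ansatz gives $F_\text{round} = F_\text{total}^{1/R}$. The function names below are illustrative, and the example $p_\text{e,total}$ is a made-up Monte Carlo estimate.

```python
def per_round_ler(p_total: float, R: int) -> float:
    """Per-round LER p_round such that 1 - (1 - p_round)**R == p_total,
    assuming R independent, identically distributed rounds."""
    return 1.0 - (1.0 - p_total) ** (1.0 / R)


def per_round_fidelity(F_total: float, R: int) -> float:
    """Per-round fidelity under the exponential-decay ansatz
    F_total = F_round**R."""
    return F_total ** (1.0 / R)


# Example: a (made-up) Monte Carlo estimate over R = 10 rounds.
p_total = 0.095
print(per_round_ler(p_total, 10))
```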
\subsection{Stim}
\label{subsec:Stim}
-It is not immediately apparent how the \ac{dem} will look from looking
-at a code's \ac{pcm}, because it heavily depends on the exact circuit
-construction and choice of noise model.
-As we noted in \Cref{subsec:Measurement Syndrome Matrix}, we can
+It is not immediately apparent how the \ac{dem} will look from
+considering the \ac{pcm} of a code, because it heavily depends on the
+exact circuit construction and choice of noise model.
+As we noted in \Cref{subsec:Measurement Syndrome Matrix}, we
obtain a measurement syndrome matrix by propagating Pauli frames
through the circuit.
The standard choice of simulation tool used for this purpose is
@@ -1118,16 +1120,16 @@
pypi package.
In fact, it was in this tool that the concept of the \ac{dem} was
first introduced.
-One capability of stim, and \acp{dem} in general, that we didn't go
-into detail about in this chapter is the merging of error mechanisms.
+One capability of stim, and \acp{dem} in general, that we did not
+explain in detail in this chapter is the merging of error mechanisms.
Since \acp{dem} differentiate errors based on their effect on the
measurements and not on their Pauli type and location
\cite[Sec.~1.4.3]{higgott_practical_2024}, it is natural to group
-errors that have the same effect.
+errors that have the same effect, i.e., the same syndrome.
This slightly lowers the computational complexity of decoding, as
the number of resulting \acp{vn} is reduced.
-While stim is a useful tool for circuit simulation, it doesn't
+While stim is a useful tool for circuit simulation, it does not
include many utilities for building syndrome extraction circuitry
automatically.
The user has to define most, if not all, of the circuit manually,
depending on the code in question.