diff --git a/src/thesis/chapters/3_fault_tolerant_qec.tex b/src/thesis/chapters/3_fault_tolerant_qec.tex
index 111fa64..2161761 100644
--- a/src/thesis/chapters/3_fault_tolerant_qec.tex
+++ b/src/thesis/chapters/3_fault_tolerant_qec.tex
@@ -312,7 +312,7 @@ the decoding process.
 The core idea behind detector error models is to do this by defining
 a new \emph{circuit code} that describes the circuit.
 Each \ac{vn} of this new code corresponds to an error location in the
-circuit and each \ac{cn} corresponds to a to a syndrome measurement.
+circuit and each \ac{cn} corresponds to a syndrome measurement.
 % This circuit code, combined with the prior probabilities of error
 % given by the noise model, incorporates all information necessary for decoding.
 
@@ -508,7 +508,8 @@ arriving at the circuit depicted in
 \autoref{fig:rep_code_multiple_rounds_phenomenological}.
 For each additional error location, we extend $\bm{\Omega}$ by
 appending the corresponding syndrome vector as a column.
-\begin{gather*}
+\begin{gather}
+    \label{eq:syndrome_matrix_ex}
     \bm{\Omega} =
     \left(
         \begin{array}{ccccccccccccccc}
@@ -534,11 +535,21 @@ appending the corresponding syndrome vector as a column.
             \end{array}
         }
     }_\text{Original matrix}
-\end{gather*}
+\end{gather}
 Notice that the first three columns correspond to the original
 measurement syndrome matrix, as these columns correspond to the error
 locations on the data qubits.
 
+In this example, all measurements we considered were syndrome measurements.
+Assuming no errors, the results of those measurements are
+deterministic, irrespective of the actual logical state
+$\ket{\psi}_\text{L}$, as they only depend on whether
+$\ket{\psi}_\text{L} \in \mathcal{C}$, not on the concrete state.
+It is, in general, possible to also consider non-deterministic measurements.
+As an example, it is common to consider a round of noiseless
+measurements of the actual data qubit states after the last syndrome
+extraction round.
+
 \begin{figure}[t]
     \centering
 
@@ -745,22 +756,21 @@ locations on the data qubits.
 \end{figure}
 
 %%%%%%%%%%%%%%%%
-\subsection{Detector Error Matrix}
-\label{subsec:Detector Error Matrix}
+\subsection{Detector Matrix}
+\label{subsec:Detector Matrix}
 
 % Core idea
 
-% TODO: Make this a proper definition?
-Instead of using the measurements as parity indicators directly, we
-may wish to combine them in some way.
-We call such combinations \emph{detectors}.
-Formally, a detector is a parity constraint on a set of measurement
-outcomes \cite[Def.~2.1]{derks_designing_2025}.
-Changing the perspective in this way does not alter the theoretical
-error correcting capabilities of the circuit, but it may change the
-decoding performance when using a practical decoder.
-\red{[Possibly a few more words on this (maybe a mathematical
-proof/intuition?)]}
+Instead of using stabilizer measurement results directly, we
+generalize the notion of what constitutes a parity check slightly.
+We formally define a \emph{detector} as a deterministic parity constraint on
+a set of measurement outcomes \cite[Def.~2.1]{derks_designing_2025}.
+In the most straightforward case, we may simply use the stabilizer
+measurements as detectors.
+We immediately recognize that we will have as many linearly
+independent detectors as there are separate deterministic measurements.
+We generally aim to utilize the maximum number of linearly
+independent detectors \cite[Sec.~2.2]{derks_designing_2025}.
 
 % The detector matrix
 
@@ -768,21 +778,36 @@ proof/intuition?)]}
 We describe the relationship between measurements and detectors using
 the \emph{detector matrix} $\bm{D} \in \mathbb{F}_2^{d\times m}$
 \cite[Def.~2.2]{derks_designing_2025}.
-Similar to the way a \ac{pcm} connects bits with parity checks, the
+Similar to the way a \ac{pcm} associates bits with parity checks, the
 detector matrix links measurements and detectors.
-Each column corresponds to a measurement, while the rows correspond
-to the detectors.
+Each column corresponds to a measurement, while each row corresponds
+to a detector.
 We should note at this point that the combination of measurements
 into detectors has no bearing on the actual construction of the
 syndrome extraction circuitry.
 It is something that happens ``virtually'' after the fact and only
 affects the decoder.
 
+Note that we can use the detector matrix $\bm{D}$ to describe the set
+of possible measurement outcomes in the absence of noise.
+In the same way that we use a \ac{pcm} to describe the code space as
+\begin{align*}
+    \mathcal{C}
+    = \{ \bm{x} \in \mathbb{F}_2^{n} : \bm{H}\bm{x}^\text{T} = \bm{0} \}
+    ,%
+\end{align*}
+the set of possible measurement outcomes is simply $\text{kern}\{\bm{D}\}$
+\cite[Sec.~2.2]{derks_designing_2025}.
+
+%%%%%%%%%%%%%%%%
+\subsection{Detector Error Matrix}
+\label{subsec:Detector Error Matrix}
+
 % The detector error matrix
 
 We now know how the errors at different locations in the circuit
-affect the measurements ($\bm{\Omega}$), and we know how the
-measurements relate to the detectors ($\bm{D}$).
+affect the measurements (through $\bm{\Omega}$), and we know how the
+measurements relate to the detectors (through $\bm{D}$).
 For decoding, we are interested in the effect of the errors on the
 detectors directly.
 We thus construct the \emph{detector error matrix} $\bm{H} \in
@@ -791,25 +816,58 @@ We thus construct the \emph{detector error matrix} $\bm{H} \in
     \bm{H} := \bm{D}\bm{\Omega}
     .%
 \end{align*}
-Note that, in particular when $d=m$, this is equivalent to performing row
-additions on the matrix $\bm{\Omega}$.
+
+% There are multiple ways of choosing the detectors
+
+There is a degree of freedom in how we choose the detectors, which is
+reflected in the fact that we can construct multiple different
+detector matrices $\bm{D}$ from the same circuit.
+For two detector matrices $\bm{D}_1$ and $\bm{D}_2$, as long as
+\begin{gather}
+    \label{eq:kern_condition}
+    \text{kern}\{\bm{D}_1\} = \text{kern}\{\bm{D}_2\},%
+\end{gather}
+they describe the same set of possible measurement outcomes (in
+the absence of noise) and thus the same circuit.
+In fact, as long as \autoref{eq:kern_condition} holds, the detector
+error matrices we construct from them can distinguish between the
+same pairs of error sets \cite[Lemma~6]{derks_designing_2025}.
+To see this, we note that we can distinguish between two circuit
+error vectors $\bm{e}_1$ and $\bm{e}_2$ as long as they do not
+violate the same set of detectors, i.e.,
+\begin{align*}
+    \hspace{-15mm}
+    % tex-fmt: off
+                       && \bm{H} \bm{e}_1^\text{T} & \neq \bm{H} \bm{e}_2^\text{T} \\
+   \iff \hspace{-33mm} && \bm{H} \left( \bm{e}_1 - \bm{e}_2 \right)^\text{T} & \neq \bm{0} \\
+   \iff \hspace{-33mm} && \bm{D} \bm{\Omega} \left( \bm{e}_1 - \bm{e}_2 \right)^\text{T} & \neq \bm{0} \\
+   \iff \hspace{-33mm} && \bm{\Omega} \left( \bm{e}_1 - \bm{e}_2 \right)^\text{T} & \notin \text{kern} \{\bm{D}\}
+    % tex-fmt: on
+    .%
+\end{align*}
+Since this condition depends on $\bm{D}$ only through its kernel, choosing a
+different detector matrix that satisfies \autoref{eq:kern_condition}
+does not modify the error-correcting capabilities of the code.
+It may, however, change the decoding performance when using a practical decoder.
 
 % How to choose the detectors
 
 % TODO: Fix notation (n used for both number of measurements and
 % measurements themselves)
 % TODO: Properly define the ranges i and r belong to
-We still have a degree of freedom in how we choose the detectors.
-\red{[No way to automate this, NP-complete problem to find detectors
-which yield highest sparity]}
+What constitutes a good set of detectors is difficult to assess
+without performing explicit decoding simulations, since it ultimately
+depends on the decoder employed.
+For iterative decoders, high sparsity is generally beneficial, but
+finding detectors that maximize sparsity is an NP-complete problem
+\cite[Sec.~2.6]{derks_designing_2025}.
 There is, however, one way of defining the detectors that will prove useful
 at a later stage.
-To the measurement results from each syndrome extraction round, we
+To the measurement results from each syndrome extraction round we
 can add the results from the previous round, as illustrated in
 \autoref{fig:detectors_from_measurements_general}.
 Concretely, we denote the outcome of the
-$i$-th syndrome measurement in round $r$ by $m_i^{(r)} \in \mathbb{F}_2$
-and define
+$i$-th measurement in round $r$ by $m_i^{(r)} \in \mathbb{F}_2$ and define
 \begin{gather*}
     \bm{m}^{(r)} :=
     \begin{pmatrix}
@@ -832,50 +890,8 @@ $d_i^{(r)} \in \mathbb{F}_2$ and define
     := \bm{m}^{(r)} + \bm{m}^{(r-1)}
     ,%
 \end{gather}
-where $\bm{m}^{(0)} = \bm{0}$.
+with $\bm{m}^{(0)} = \bm{0}$.
 
-We again turn our attention to the three-qubit repetition code.
-In \autoref{fig:rep_code_multiple_rounds_phenomenological} we can see
-that $E_6$ has occurred and has subsequently tripped the last four measurements.
-We now take those measurements and combine them according to
-\autoref{eq:measurement_combination}.
-We can see this process graphically in
-\autoref{fig:detectors_from_measurements_rep_code}.
-To understand why this way of defining the detectors is useful, we
-note that the error $E_6$ in
-\autoref{fig:rep_code_multiple_rounds_phenomenological} has not only
-tripped the measurements in the syndrome extraction round immediately
-afterwards, but all subsequent ones as well.
-To only see errors in the rounds immediately following them, we
-consider our newly defined detectors instead of the measurements,
-that effectively compute the difference between the measurements.
-
-Each error can only trip syndrome bits that follow it.
-We can see this in the triangular structure of $\bm{\Omega}$ in
-\autoref{fig:rep_code_multiple_rounds_phenomenological}.
-Combining the measurements into detectors according to
-\autoref{eq:measurement_combination}, we are performing row additions
-in such a way as to clear the bottom left of the matrix.
-This yields a block-diagonal structure for the detector error matrix
-$\bm{H}$, as in the example in
-\autoref{fig:detectors_from_measurements_rep_code}.
-\begin{align*}
-    \bm{H} =
-    \left(
-        \begin{array}{ccccccccccccccc}
-            1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0  \\
-            0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0  \\
-            0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0  \\
-            0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0  \\
-            0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 0  \\
-            0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1
-        \end{array}
-    \right)
-\end{align*}
-Note that we exploit the fact that each syndrome measurement round is
-identical to obtain this structure.
-
-% TODO: Change notation (\bm{D})
 \begin{figure}[t]
     \centering
 
@@ -901,6 +917,46 @@ identical to obtain this structure.
     \label{fig:detectors_from_measurements_general}
 \end{figure}
 
+We again turn our attention to the three-qubit repetition code.
+In \autoref{fig:rep_code_multiple_rounds_phenomenological} we can see
+that $E_6$ has occurred and has subsequently tripped the last four measurements.
+We now take those measurements and combine them according to
+\autoref{eq:measurement_combination}.
+We can see this process graphically in
+\autoref{fig:detectors_from_measurements_rep_code}.
+To understand why this way of defining the detectors is useful, we
+note that the error $E_6$ in
+\autoref{fig:rep_code_multiple_rounds_phenomenological} has not only
+tripped the measurements in the syndrome extraction round immediately
+afterwards, but all subsequent ones as well.
+To only see errors in the rounds immediately following them, we
+consider our newly defined detectors instead of the measurements,
+which effectively compute the difference between the measurements of consecutive rounds.
+
+Each error can only trip syndrome bits that follow it.
+This is reflected in the triangular structure of $\bm{\Omega}$ in
+\autoref{eq:syndrome_matrix_ex}.
+Combining the measurements into detectors according to
+\autoref{eq:measurement_combination}, we are effectively performing
+row additions in such a way as to clear the bottom left of the matrix.
+The detector error matrix
+\begin{align*}
+    \bm{H} =
+    \left(
+        \begin{array}{ccccccccccccccc}
+            1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0  \\
+            0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0  \\
+            0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0  \\
+            0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0  \\
+            0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 0  \\
+            0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1
+        \end{array}
+    \right)
+\end{align*}
+we obtain this way has a block-diagonal structure.
+Note that we exploit the fact that each syndrome measurement round is
+identical to obtain this structure.
+
 \begin{figure}[t]
     \centering