diff --git a/src/thesis/chapters/4_decoding_under_dems.tex b/src/thesis/chapters/4_decoding_under_dems.tex index 4c026c3..0503cdb 100644 --- a/src/thesis/chapters/4_decoding_under_dems.tex +++ b/src/thesis/chapters/4_decoding_under_dems.tex @@ -1659,34 +1659,16 @@ sliding-window approach is still at an advantage. % [Thread] Exploration of the effect of the step size -% TODO: Write - -% [Experimental parameters] Figure 4.10 - -% tex-fmt: off -\red{\textbf{overall:}[warm, cold $F\in\{1,2,3\}$][$W=5$]} -\red{\textbf{a)}[$p \in \{\ldots\}$][$n_\text{iter} = 200$]} -\red{\textbf{b)}[$p = 0.0025$][$n_\text{iter}\in\{...\}$]} - -% [Description] Figure 4.10 - -\red{\textbf{a)}[lower F -> better performance, lower p -> larger -gain of warm vs soft, \textbf{TODO}: find more]} -\red{\textbf{b)}[lower F -> better performance, lower $n_\text{iter}$ --> larger gain of warm vs soft, no real saturation, \textbf{TODO}: find more]} -% tex-fmt: on - -% [Interpretation] Figure 4.10 - -\red{[lower $n_\text{iter}$ -> larger gain is same behavior as seen -in plot before]} -\red{[lower F -> better performance makes sense for the same reason -larger W -> better performance: greater overlap]} - -% At some later point -\content{When looking at max iterations: Callback to diminishing - returns with growing window size: More iterations more beneficial -than larger window (+1 for warm-start)} +Having examined the effect of the window size $W$, we next turned to +the second windowing parameter, the step size $F$. +We carried out an investigation analogous to the one above: +we first compared warm- and cold-start decoding across the full range +of physical error rates at a fixed iteration budget, and then we +examined the dependence on the iteration budget at a fixed physical +error rate. +The window size was held fixed at $W = 5$ throughout, the value at +which the warm-start variant produced the strongest performance in the +previous experiments. 
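To make the windowing geometry concrete, the following minimal sketch (hypothetical helper name, not the decoder implementation used in this chapter) enumerates the syndrome-measurement-round windows that a sliding-window decoder with window size $W$ and step size $F$ visits:

```python
# Minimal sketch, not the thesis implementation: enumerate the
# syndrome-measurement-round windows visited by a sliding-window
# decoder with window size W and step size F.  Consecutive windows
# overlap in W - F rounds.


def window_rounds(n_rounds, W, F):
    """Return one list of covered round indices per window invocation."""
    windows = []
    start = 0
    while start + W <= n_rounds:
        windows.append(list(range(start, start + W)))
        start += F
    return windows


# Example: W = 5, F = 2 over 11 rounds yields windows starting at
# rounds 0, 2, 4, 6; each window shares W - F = 3 rounds with its
# predecessor.
```

Decreasing $F$ at fixed $W$ increases both the number of window invocations and the overlap between consecutive windows, which is the trade-off examined below.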
 \begin{figure}[t]
   \centering
@@ -1757,6 +1739,7 @@ than larger window (+1 for warm-start)}
     \end{tikzpicture}
-    \caption{Comparison of window sizes for $F=1$.}
+    \caption{Comparison of step sizes for $W=5$ over the physical
+      error rate.}
+    \label{fig:bp_f_over_p}
   \end{subfigure}%
   \hfill%
   \begin{subfigure}{0.48\textwidth}
@@ -1864,13 +1847,105 @@ than larger window (+1 for warm-start)}
     \vspace{-3.2mm}
-    \caption{Comparison of step sizes for $W=5$.}
+    \caption{Comparison of step sizes for $W=5$ over the iteration
+      budget.}
+    \label{fig:bp_f_over_iter}
   \end{subfigure}
   \caption{
     \red{\lipsum[2]}
   }
+  \label{fig:bp_f}
 \end{figure}
+% [Experimental parameters] Figure 4.10
+
+\Cref{fig:bp_f} summarizes the results of this investigation.
+
+In both panels the dashed colored curves correspond to cold-start
+sliding-window decoding for $F \in \{1, 2, 3\}$ and the solid colored
+curves to the corresponding warm-start runs.
+The window size is fixed to $W = 5$ throughout.
+\Cref{fig:bp_f_over_p} sweeps the physical error rate over
+$p \in [0.001, 0.004]$ in steps of $0.0005$ at a fixed maximum of
+$n_\text{iter} = 200$ \ac{bp} iterations per window invocation,
+mirroring the experimental setup of \Cref{fig:whole_vs_cold_vs_warm}.
+\Cref{fig:bp_f_over_iter} fixes the physical error rate at
+$p = 0.0025$ and sweeps the iteration budget over
+$n_\text{iter} \in \{32, 128, 256, 512, 1024, 2048, 4096\}$,
+mirroring the setup of \Cref{fig:bp_w_over_iter} and again including
+an inset that magnifies the low-iteration regime
+$n_\text{iter} \in [32, 512]$.
+
+% [Description] Figure 4.10
+
+In \Cref{fig:bp_f_over_p}, every curve exhibits the expected
+monotonic increase of the per-round \ac{ler} with the physical
+error rate.
+At fixed $F$, the warm-start curve lies below its cold-start
+counterpart across the entire sweep, and within each start variant a
+smaller $F$ produces a lower \ac{ler}.
+Both gaps grow as the physical error rate decreases:
+the curves at $F = 1$ separate further from those at $F = 2$ and $F = 3$,
+and the warm-start curves separate further from the cold-start ones.
+In \Cref{fig:bp_f_over_iter}, all six curves again decrease
+monotonically with the iteration budget, with no clear saturation
+even at $n_\text{iter} = 4096$.
+Lower $F$ yields a lower \ac{ler} throughout, and warm-start
+consistently outperforms cold-start at matching $F$.
+At $n_\text{iter} = 32$, all three cold-start curves coincide at
+roughly the same per-round \ac{ler}, while the warm-start curves are
+visibly spread out.
+Furthermore, the magnified plot confirms that the gap between warm-
+and cold-start curves at fixed $F$ shrinks as $n_\text{iter}$ grows,
+and that at fixed $n_\text{iter}$ this gap is largest for $F = 1$.
+
+% [Interpretation] Figure 4.10
+
+The observed dependence on the step size mirrors the dependence on
+the window size studied earlier, and the same explanation applies.
+With $W$ held fixed, consecutive windows overlap in $W - F$ syndrome
+measurement rounds, so decreasing $F$ by one enlarges this overlap
+by one round.
+A smaller step size is therefore beneficial for the same reason that
+a larger window size is:
+each \ac{vn} in an overlap region participates in more window
+invocations, and the warm-start modification effectively accumulates
+iterations on it across these invocations.
+The widening of the warm/cold gap towards low iteration counts and
+low physical error rates similarly mirrors the patterns already
+observed in
+\Cref{fig:whole_vs_cold_vs_warm,fig:bp_w_over_iter}.
+
+That all three cold-start curves coincide at $n_\text{iter} = 32$
+follows directly from the cold-start initialization:
+since every window starts from a uniform prior regardless of $F$,
+the per-window decoding problem is essentially identical for all
+step sizes, and the corresponding \acp{ler} agree as long as the
+inner decoder has too few iterations to propagate information
+beyond the local syndrome structure within a single window.
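The overlap argument can be checked with a small counting sketch (hypothetical code, not the decoder used here): it tallies, for window size $W$ and step size $F$, how many window invocations cover each syndrome round, i.e. over how many invocations a warm start can accumulate \ac{bp} iterations on a \ac{vn}.

```python
# Minimal counting sketch (hypothetical, not the thesis code): how many
# window invocations cover each syndrome round for window size W and
# step size F?  A warm start accumulates BP iterations on a variable
# node across all invocations that cover it.


def invocations_per_round(n_rounds, W, F):
    counts = [0] * n_rounds
    start = 0
    while start + W <= n_rounds:
        for r in range(start, start + W):
            counts[r] += 1
        start += F
    return counts


# For W = 5: with F = 1 a round in the bulk is covered by 5
# invocations, while with F = 5 (no overlap) every round is covered
# exactly once.
```

In the low-iteration regime, these repeated invocations are the main way a \ac{vn} receives a substantial total number of iterations under warm-starting.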
+This is also the regime in which the warm-start advantage is most
+valuable, and indeed it is where the warm-start curves spread out
+most strongly with $F$.
+
+A noteworthy methodological point is that, in contrast to the window
+size $W$, the step size $F$ has no effect on decoding latency:
+the inner decoder for a given window can begin running as soon as
+the syndromes for the rounds covered by that window have been
+collected, and this moment is independent of how much the window
+overlaps with its predecessor.
+A smaller $F$ therefore costs only additional total compute, never
+additional latency.
+This is favorable for a warm-start sliding-window implementation:
+the regime in which the warm-start modification helps most, namely
+large overlap and hence small $F$, is precisely the regime in which
+the cost of that overlap shows up only in the compute budget and not
+in the latency budget.
+
+% At some later point
+\content{When looking at max iterations: Callback to diminishing
+  returns with growing window size: More iterations more beneficial
+than larger window (+1 for warm-start)}
+
 %%%%%%%%%%%%%%%%
 \subsection{Belief Propagation with Guided Decimation}
 \label{subsec:Belief Propagation with Guided Decimation}