Finish first draft of BP warm start subsection

This commit is contained in:
2026-05-02 23:40:29 +02:00
parent a90458dd8a
commit 5fabe2e146
2 changed files with 176 additions and 160 deletions

View File

@@ -574,6 +574,8 @@ We thus define
\right\}
.%
\end{align*}
Again, we set $\mathcal{I}_\text{overlap}^{(\ell)} =
\mathcal{I}_\text{win}^{(\ell)}\setminus \mathcal{I}_\text{commit}^{(\ell)}$.
Note that we have
\begin{align*}
\bigcup_{\ell=0}^{n_\text{win}-1}
@@ -640,6 +642,15 @@ and after decoding all windows we will therefore have committed all \acp{vn}.
:= \mathcal{J}_\text{win}^{(\ell)} \setminus
\mathcal{J}_\text{commit}^{(\ell)}$};
\draw [
decorate,
decoration={brace,amplitude=3mm,raise=1mm}
]
(a10) -- (a00 -| b00)
node[midway,yshift=-8.25mm,xshift=-8mm,right]{$\mathcal{I}_\text{overlap}^{(\ell)}
:= \mathcal{I}_\text{win}^{(\ell)} \setminus
\mathcal{I}_\text{commit}^{(\ell)}$};
\node[align=center] at ($(a00)!0.5!(b01)$)
{%
$\bm{H}_\text{overlap}^{(\ell)}$ \\[3mm]
@@ -657,11 +668,9 @@ and after decoding all windows we will therefore have committed all \acp{vn}.
dashed box shows the analogous region for window $\ell + 1$.
The shaded region marks the submatrix
$\bm{H}_\text{overlap}^{(\ell)}$, whose rows correspond to the
overlap CNs $\mathcal{J}_\text{overlap}^{(\ell)} =
\mathcal{J}_\text{win}^{(\ell)} \setminus
\mathcal{J}_\text{commit}^{(\ell)}$ shared with the next window,
and whose columns correspond to the committed VNs
$\mathcal{I}_\text{commit}^{(\ell)}$.
overlap CNs $\mathcal{J}_\text{overlap}^{(\ell)}$ shared with
the next window, and whose columns correspond to the
committed VNs $\mathcal{I}_\text{commit}^{(\ell)}$.
After decoding window $\ell$, this submatrix is used to update
the syndrome of the overlap CNs based on the committed bit estimates.
}
@@ -685,12 +694,12 @@ $\bm{H}_\text{overlap}^{(\ell)} =
\mathcal{I}_\text{commit}^{(\ell)}}$.
We have to account for this fact by updating the syndrome $\bm{s}$
based on the committed bit values.
Specifically, if $\bm{e}_\text{commit}^{(\ell)}$ describes the error
Specifically, if $\hat{\bm{e}}_\text{commit}^{(\ell)}$ describes the error
estimates committed after decoding window $\ell$, we have to set
\begin{align*}
\bm{s}_{\mathcal{J}_\text{overlap}^{(\ell)}} =
\left(\bm{s}\right)_{\mathcal{J}_\text{overlap}^{(\ell)}} =
\bm{H}_\text{overlap}^{(\ell)}
\left( \bm{e}_\text{commit}^{(\ell)} \right)^\text{T}
\left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\text{T}
.%
\end{align*}
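As a concrete illustration, this overlap-syndrome update can be sketched in a few lines of Python. This is a minimal sketch over GF(2); `update_overlap_syndrome` and all variable names are hypothetical and not part of the thesis implementation.

```python
# Sketch: update the syndrome entries of the overlap CNs after
# committing the error estimates of window l (GF(2) arithmetic).
# All names are illustrative, not from the thesis implementation.

def update_overlap_syndrome(s, H_overlap, e_commit, overlap_cns):
    """Set s[j] for each overlap CN j to (H_overlap @ e_commit) mod 2."""
    for row, j in enumerate(overlap_cns):
        # Binary inner product of row `row` of H_overlap with e_commit.
        s[j] = sum(h * e for h, e in zip(H_overlap[row], e_commit)) % 2
    return s

# Toy example: two overlap CNs (global indices 2 and 3), three committed VNs.
H_overlap = [[1, 0, 1],
             [0, 1, 1]]
e_commit = [1, 1, 0]
s = [0, 0, 0, 0]
s = update_overlap_syndrome(s, H_overlap, e_commit, overlap_cns=[2, 3])
print(s)  # -> [0, 0, 1, 1]
```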
@@ -698,152 +707,43 @@ estimates committed after decoding window $\ell$, we have to set
\section{Warm-Start Sliding-Window Decoding}
\label{sec:warm_start_bp}
% Intro
% Intro: Problem with above procedure
The sliding-window structure visible in \Cref{fig:windowing_pcm} is
highly reminiscent of windowed decoding for \ac{sc}-\ac{ldpc} codes.
Switching our viewpoint to the Tanner graph depicted in
\Cref{fig:windowing_tanner}, however, we can see an important
\Cref{fig:messages_decimation_tanner}, however, we can see an important
difference between \ac{sc}-\ac{ldpc} decoding and the
sliding-window decoding procedure detailed above.
While the windowing process is similar, the algorithm above
reinitializes the decoder to start from a clean state when moving to
the next window.
It therefore does not make use of the integral property of
\ac{sc}-\ac{ldpc} decoding of exploiting the spatially coupled
windowed \ac{sc}-\ac{ldpc} decoding, namely exploiting the spatially coupled
structure by passing soft information from earlier to later spatial positions.
We propose a modification to the procedure detailed in
\Cref{subsec:Window Splitting and Sequential Sliding-Window Decoding}:
Instead of zero-initializing the \ac{bp} messages of the next window,
we perform a \emph{warm start} by initializing the messages in the
overlapping region to the values last held during the decoding of the
previous window.
% Passing messages requires messages
% TODO: Move this to the intro?
Passing messages from one window to the next requires that, at the
end of decoding a window, there are messages that are still relevant
to the decoding process.
This somewhat limits the variety of inner decoders the warm-start
initialization can be used with.
For example, \ac{bp}+\ac{osd} does not immediately seem suitable, though
this remains to be investigated.
\red{[Something general about the available decoders]}
\red{[Define inner decoder]}
% Proposed modification: Mathematical definition
\content{callback to intro: explain why we consider BPGD instead of,
e.g., BP+OSD}
\content{Explain why we expect a warm start to be beneficial}
\content{Mention that our own work ties into the bottom category in
\Cref{fig:literature}}
\content{Explicitly state that $\mathcal{I}_\text{win}^{(\ell)}$
overlaps with $\mathcal{I}_\text{win}^{(\ell + 1)}$, and that this is
where the warm start applies}
\begin{figure}[t]
\centering
\tikzset{
VN/.style={
circle, fill=KITgreen, minimum width=1mm, minimum height=1mm,
},
CN/.style={
rectangle, fill=KITblue, minimum width=1mm, minimum height=1mm,
},
}
\begin{tikzpicture}[node distance = 5mm]
\node[VN] (vn00) {};
\node[VN, below = of vn00] (vn01) {};
\node[VN, below = of vn01] (vn02) {};
\node[VN, below = of vn02] (vn03) {};
\node[VN, below = of vn03] (vn04) {};
\coordinate (temp) at ($(vn01)!0.5!(vn02)$);
\node[CN, left =10mm of temp] (cn00) {};
\node[CN, below = of cn00] (cn01) {};
\draw (vn00) -- (cn00);
\draw (vn01) -- (cn00);
\draw (vn03) -- (cn00);
\draw (vn01) -- (cn01);
\draw (vn02) -- (cn01);
\draw (vn04) -- (cn01);
\foreach \i in {1,2,3,4} {
\pgfmathtruncatemacro{\prev}{\i-1}
\node[VN, right = 25mm of vn\prev 0] (vn\i0) {};
\node[VN, below = of vn\i0] (vn\i1) {};
\node[VN, below = of vn\i1] (vn\i2) {};
\node[VN, below = of vn\i2] (vn\i3) {};
\node[VN, below = of vn\i3] (vn\i4) {};
\coordinate (temp) at ($(vn\i1)!0.5!(vn\i2)$);
\node[CN, left = 10mm of temp] (cn\i0) {};
\node[CN, below = of cn\i0] (cn\i1) {};
\draw (vn\i0) -- (cn\i0);
\draw (vn\i1) -- (cn\i0);
\draw (vn\i3) -- (cn\i0);
\draw (vn\i1) -- (cn\i1);
\draw (vn\i2) -- (cn\i1);
\draw (vn\i4) -- (cn\i1);
}
\foreach \i in {1,2,3,4} {
\pgfmathtruncatemacro{\prev}{\i-1}
\draw (vn\prev 3) -- (cn\i 0);
\draw (vn\prev 4) -- (cn\i 1);
}
\node[
draw, inner sep=5mm,line width=1pt,
fit=(vn00)(vn04)(cn00)(cn01)(vn20)(vn24)(cn20)(cn21)
]
(box1) {};
\node[
draw, dashed, inner sep=5mm, inner ysep=8mm,line width=1pt,
fit=(vn10)(vn14)(cn10)(cn11)(vn30)(vn34)(cn30)(cn31)
]
(box2) {};
% Marker for W on the bottom
\draw[line width=1pt]
([yshift=-5mm]box1.south west) -- ++(0,-4mm)
coordinate (dim1l);
\draw[line width=1pt]
([yshift=-5mm]box1.south east) -- ++(0,-4mm)
coordinate (dim1r);
\draw[{Latex}-{Latex}, line width=1pt]
([yshift=1mm]dim1l) -- ([yshift=1mm]dim1r)
node[midway, below=2pt] {$W$};
% Marker for F on top
\draw[line width=1pt]
([yshift=3mm]box2.north west) -- ++(0,4mm)
coordinate (dim3l);
\draw[line width=1pt]
([yshift=3mm]box2.north west -| box1.north west) -- ++(0,4mm)
coordinate (dim3r);
\draw[{Latex}-{Latex}, line width=1pt]
([yshift=-1mm]dim3l) -- ([yshift=-1mm]dim3r)
node[midway, above=2pt] {$F$};
% Arrow on the top right
\draw[-{Latex}, line width=1pt]
([yshift=8mm] box1.north east) -- ++(28mm,0);
\end{tikzpicture}
\caption{Visualization of the windowing process on the Tanner graph.}
\label{fig:windowing_tanner}
\end{figure}
%%%%%%%%%%%%%%%%
\subsection{Belief Propagation}
\subsection{Warm Start for Belief Propagation Decoding}
\label{subsec:Warm-Start Belief Propagation}
% Warm-Start decoding for BP
\content{Explicitly name messages passed (${L_{j\leftarrow i} : i \in
\ldots, j\in \ldots}$)}
\content{Pass messages to next window}
\content{(?) Explicitly mention initialization using only CN->VN
messages + swapping of CN and VN update?}
\content{(?) Algorithm}
\begin{figure}[t]
\centering
@@ -960,18 +860,138 @@ messages + swapping of CN and VN update?}
\end{tikzpicture}
\caption{
Visualization of the messages used for the
initialization of the next window.
\red{Visualization of the messages used for the
initialization of the next window.}
\Acfp{vn} are represented using green circles while \acfp{cn}
are represented using blue squares.
}
\label{fig:messages_tanner}
\end{figure}
% Proposed modification: Overview
We propose a modification to the procedure detailed in
\Cref{subsec:Window Splitting and Sequential Sliding-Window Decoding}:
Instead of zero-initializing the \ac{bp} messages of the next window,
we perform a \emph{warm start} by initializing the messages in the
overlapping region to the values last held during the decoding of the
previous window.
To see how we realize this in practice, we reiterate the steps of the
\ac{bp} algorithm
\begin{align}
\label{eq:init}
\text{Initialization: } & L_{i \rightarrow j} = \tilde{L}_i \\[3mm]
\text{\ac{cn} Update (Sum-Product): }&
\displaystyle L_{i \leftarrow j} =
2\cdot(-1)^{s_j}\cdot\tanh^{-1}
\!\left(
\prod_{i' \in \mathcal{N}_\text{C}(j)\setminus\{i\}}
\tanh\frac{L_{i'\rightarrow j}}{2}
\right) \\[3mm]
\text{\ac{cn} Update (Min-Sum): }&
\displaystyle L_{i \leftarrow j} = (-1)^{s_j}\cdot \prod_{i'
\in \mathcal{N}_\text{C}(j)\setminus\{i\}} \sign \left( L_{i' \rightarrow j}
\right) \cdot \min_{i' \in \mathcal{N}_\text{C}(j)\setminus\{i\}} \lvert
L_{i'\rightarrow j} \rvert \\[3mm]
\label{eq:vn_update}
\text{\ac{vn} Update: } & \displaystyle L_{i \rightarrow j} =
\tilde{L}_i +
\sum_{j' \in \mathcal{N}_\text{V}(i)\setminus\{j\}}
L_{i \leftarrow j'}
\end{align}
and turn our attention to \Cref{fig:messages_tanner}.
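To make the min-sum \ac{cn} update above concrete, here is a minimal Python sketch. The dictionary-of-edges message layout and all names are illustrative assumptions, not the thesis implementation.

```python
# Min-sum CN update sketch: for each neighbor VN i of CN j, combine
# the signs and the minimum magnitude of the other incoming VN->CN
# messages, with an overall sign flip if the syndrome bit s_j is 1.
# Edge messages are stored in a dict keyed by (VN index, CN index);
# this layout is illustrative only.

def cn_update_min_sum(L_vc, s_j, neighbors, j):
    L_cv = {}
    for i in neighbors:
        others = [L_vc[(ip, j)] for ip in neighbors if ip != i]
        sign = -1 if s_j else 1
        for m in others:
            sign *= 1 if m >= 0 else -1
        L_cv[(i, j)] = sign * min(abs(m) for m in others)
    return L_cv

# Toy CN j=0 with three neighbor VNs and syndrome bit s_0 = 1.
L_vc = {(0, 0): 1.5, (1, 0): -0.5, (2, 0): 2.0}
print(cn_update_min_sum(L_vc, s_j=1, neighbors=[0, 1, 2], j=0))
# -> {(0, 0): 0.5, (1, 0): -1.5, (2, 0): 0.5}
```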
We consider the right-most boundary of the first window, drawn with a
solid line.
The fact that we partition the overall Tanner graph at this location,
i.e., with the last nodes of one window being \acp{vn} and the
first nodes of the next window being \acp{cn}, is due to
the windowing construction detailed in
\Cref{subsec:Window Splitting and Sequential Sliding-Window Decoding}.
We consider the edges connecting the last set of \acp{vn}
still in the first window to the next set of \acp{cn}.
These edges are the routes along which information is transferred
to subsequent spatial positions, in the form of the \ac{vn} to \ac{cn}
messages $L_{i\rightarrow j}$.
Note that these edges are not considered during the decoding of the
first window, since they leave its bounds.
Consequently, no messages have been computed for these edges when the
decoding of the first window completes.
This means that simply initializing the edges in the overlap region
with the existing $L_{i\rightarrow j}$ messages and starting the
decoding of the next window with a \ac{cn} update is not enough.
We can resolve this issue by initializing the edges using the existing
\ac{cn} to \ac{vn} messages $L_{i\leftarrow j}$ and beginning the
decoding of the next window with a \ac{vn} update instead.
This way, we recompute the existing $L_{i\rightarrow j}$ messages and
additionally compute the messages crossing the window boundary.
We can then continue decoding the next window as usual.
We can further simplify the algorithm.
Looking carefully at \Cref{eq:vn_update}, we notice that when the
\ac{cn} to \ac{vn} messages $L_{i\leftarrow j}$ have been zero-initialized,
the \ac{vn} update degenerates to
\begin{align*}
\displaystyle L_{i \rightarrow j} =
\tilde{L}_i +
\sum_{j' \in \mathcal{N}_\text{V}(i)\setminus\{j\}}
L_{i \leftarrow j'} = \tilde{L}_i
,%
\end{align*}
i.e., the \ac{vn} update \Cref{eq:vn_update} becomes the same as the
initialization step \Cref{eq:init}.
We conclude that as long as we zero-initialize the
$L_{i\leftarrow j}$ messages, there is no need for a separate
initialization step.
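The hand-over just described can be sketched as follows, again assuming a hypothetical dictionary-of-edges message layout; all names are illustrative.

```python
# Sketch of the VN update and the warm-start hand-over: CN->VN
# messages on the overlap edges are carried over from the previous
# window, all other messages start at zero, and decoding of the next
# window begins with a VN update. All names are illustrative.

def vn_update(L_ch, L_cv, vn_neighbors):
    """L_vc[(i, j)] = L_ch[i] + sum of incoming CN->VN messages at
    VN i, excluding the message arriving from CN j itself."""
    L_vc = {}
    for i, cns in vn_neighbors.items():
        total = L_ch[i] + sum(L_cv.get((i, jp), 0.0) for jp in cns)
        for j in cns:
            L_vc[(i, j)] = total - L_cv.get((i, j), 0.0)
    return L_vc

vn_neighbors = {0: [0, 1], 1: [1]}   # VN -> list of neighboring CNs
L_ch = {0: 0.5, 1: -1.0}             # channel LLRs

# With all-zero CN->VN messages, the VN update reproduces the
# initialization step, i.e., L_vc[(i, j)] = L_ch[i].
cold = vn_update(L_ch, {}, vn_neighbors)
print(cold)  # -> {(0, 0): 0.5, (0, 1): 0.5, (1, 1): -1.0}

# Warm start: carry over a CN->VN message on an overlap edge.
warm = vn_update(L_ch, {(0, 0): 2.0}, vn_neighbors)
print(warm)  # -> {(0, 0): 0.5, (0, 1): 2.5, (1, 1): -1.0}
```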
\Cref{alg:warm_start_bp} shows the full warm-start sliding-window
decoding algorithm using \ac{bp} as the inner decoder for the
windows.
Note that the decoding procedure performed on the individual windows
(lines 4--8 in \Cref{alg:warm_start_bp}) is functionally equivalent to
\Cref{alg:syndome_bp} when using the \acf{spa} variant of \ac{bp}.
% tex-fmt: off
\tikzexternaldisable
\begin{algorithm}[t]
\caption{Sliding-window belief propagation (BP) decoding algorithm with warm start.}
\label{alg:warm_start_bp}
\begin{algorithmic}[1]
\State \textbf{Initialize:} $\hat{\bm{e}}^\text{total} \leftarrow \bm{0}$
\State \textbf{Initialize:} $L_{i\leftarrow j} = 0
~\forall~ i\in \mathcal{I}, j\in \mathcal{J}$
\For{$\ell = 0, \ldots, n_\text{win}-1$}
\For{$\nu = 0, \ldots, n_\text{iter}-1$}
\State Perform \ac{vn} update for window $\ell$
\State Perform \ac{cn} update for window $\ell$
\State Compute $\hat{\bm{e}}^{(\ell)}$ and check early
termination condition
\EndFor
\State $\displaystyle\left(\hat{\bm{e}}^\text{total}\right)_{\mathcal{I}^{(\ell)}_\text{commit}} \leftarrow \hat{\bm{e}}^{(\ell)}_\text{commit}$
\State $\displaystyle\left(\bm{s}\right)_{\mathcal{J}_\text{overlap}^{(\ell)}}
\leftarrow \bm{H}_\text{overlap}^{(\ell)}
\left( \hat{\bm{e}}_\text{commit}^{(\ell)} \right)^\text{T}$
\If{$\ell < n_\text{win} - 1$}
\State $L^{(\ell+1)}_{i\leftarrow j} \leftarrow
L^{(\ell)}_{i\leftarrow j}
~\forall~ i \in \mathcal{I}_\text{overlap}^{(\ell)},
j \in \mathcal{J}_\text{overlap}^{(\ell)}$
\EndIf
\EndFor
\State \textbf{return} $\hat{\bm{e}}^\text{total}$
\end{algorithmic}
\end{algorithm}
\tikzexternalenable
% tex-fmt: on
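The control flow of \Cref{alg:warm_start_bp} can be mirrored by the following Python skeleton. The inner decoder is stubbed out and the syndrome update is omitted; every name here is hypothetical.

```python
# Skeleton of the warm-start sliding-window loop: decode each window,
# commit the estimates of its commit region, and keep only the CN->VN
# messages on the overlap edges as the warm start of the next window.
# The inner BP decoder is a stub and the syndrome update is omitted.

def sliding_window_decode(windows, decode_window):
    e_total = {}   # committed error estimates, keyed by VN index
    L_cv = {}      # CN->VN messages carried across window boundaries
    for win in windows:
        e_hat, L_cv = decode_window(win, L_cv)
        for i in win["commit_vns"]:
            e_total[i] = e_hat[i]
        L_cv = {edge: m for edge, m in L_cv.items()
                if edge in win["overlap_edges"]}
    return e_total

# Toy stub: "decode" by declaring every VN of the window error-free.
def dummy_decoder(win, L_cv_in):
    return {i: 0 for i in win["vns"]}, L_cv_in

windows = [
    {"vns": [0, 1, 2], "commit_vns": [0, 1], "overlap_edges": set()},
    {"vns": [2, 3], "commit_vns": [2, 3], "overlap_edges": set()},
]
print(sliding_window_decode(windows, dummy_decoder))
# -> {0: 0, 1: 0, 2: 0, 3: 0}
```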
%%%%%%%%%%%%%%%%
\subsection{Belief Propagation with Guided Decimation}
\subsection{Warm Start for Belief Propagation with Guided Decimation Decoding}
\label{subsec:Warm-Start Belief Propagation with Guided Decimation Decoding}
% Warm-Start decoding for BPGD
% Intro
\content{Call back to previous paragraph: BPGD is one option where
relevant messages are available}
\content{Modified structure of BPGD $\rightarrow$ In addition to
messages, pass decimation info}
\content{(?) Explicitly mention decimation info = channel llrs?}
@@ -1163,17 +1183,11 @@ We initially created a Python implementation, which used QUITS for the window
splitting and subsequent sliding-window decoding as well.
The \ac{bp} and \ac{bpgd} decoders were also initially implemented in Python.
After a preliminary investigation, we opted for a complete
reimplementation in Rust to achieve higher simulation speeds due to
reimplementation in Rust to achieve higher simulation speeds, leveraging
the compiled nature of the language.
We reimplemented both the window splitting and the decoders themselves.
We reimplemented both the window splitting and the decoders.
% Global experimental setup
% - Code
% - # SE rounds
% - Noise model
% - Per-round LER as figure of merit
% - Detector definition
% - # simulated error frames
We chose to carry out our simulations on \ac{bb} codes, as they have
recently emerged as particularly promising candidates for practical
@@ -1275,7 +1289,7 @@ error matrix as a proxy for the attainable decoding performance.
\label{fig:whole_vs_cold}
\end{figure}
% [Experimental parameters] Figure 4.7
% [Experimental parameters] Figure 4.6
\Cref{fig:whole_vs_cold} shows the simulation results for this initial
investigation.
@@ -1287,7 +1301,7 @@ In all cases, the inner \ac{bp} decoder was allowed a maximum of
$200$ iterations, and the physical error rate was swept from
$p = 0.001$ to $p = 0.004$ in steps of $0.0005$.
% [Description] Figure 4.7
% [Description] Figure 4.6
Across the entire range of physical error rates, all curves exhibit
the expected monotonic increase in logical error rate with increasing
@@ -1299,7 +1313,7 @@ Increasing the window size to $W = 4$ substantially closes this gap,
and the $W = 5$ curve nearly coincides with the whole-block decoder
across the full range of physical error rates.
% [Interpretation] Figure 4.7
% [Interpretation] Figure 4.6
This behavior is consistent with the intuition behind sliding-window decoding.
The detector error matrix encodes correlations between detection
@@ -1406,7 +1420,7 @@ cold-start curves can be compared directly at matching values of $W$.
\label{fig:whole_vs_cold_vs_warm}
\end{figure}
% [Experimental parameters] Figure 4.8
% [Experimental parameters] Figure 4.7
\Cref{fig:whole_vs_cold_vs_warm} extends the previous comparison by
additionally including the warm-start variant of sliding-window decoding.
@@ -1421,7 +1435,7 @@ window invocation, the black curve again gives the whole-block
reference, and the physical error rate is swept from $p = 0.001$ to
$p = 0.004$ in steps of $0.0005$.
% [Description] Figure 4.8
% [Description] Figure 4.7
For each window size, the warm-start variant consistently outperforms
its cold-start counterpart, with the dashed curves lying above the
@@ -1431,7 +1445,7 @@ the largest window ($W = 5$) and gradually narrows as the window size decreases.
Additionally, the gap between the cold- and warm-start curves
generally widens as the physical error rate decreases.
% [Interpretation] Figure 4.8
% [Interpretation] Figure 4.7
The improvement of warm-start over cold-start decoding matches the
motivation for the modification:
@@ -1584,7 +1598,7 @@ available to any window.
\label{fig:bp_w_over_iter}
\end{figure}
% [Experimental parameters] Figure 4.9
% [Experimental parameters] Figure 4.8
\Cref{fig:bp_w_over_iter} shows the per-round \ac{ler} as a function
of the maximum number of \ac{bp} iterations granted to the inner decoders.
@@ -1598,7 +1612,7 @@ $n_\text{iter} \in \{32, 128, 256, 512, 1024, 2048, 4096\}$.
The enlarged plot magnifies the low-iteration regime
$n_\text{iter} \in [32, 512]$.
% [Description] Figure 4.9
% [Description] Figure 4.8
All curves decrease monotonically with the iteration budget, but
contrary to our expectation, none of them appears to fully saturate
@@ -1619,7 +1633,7 @@ counts and shrinks rapidly as $n_\text{iter}$ grows, and at fixed
$n_\text{iter}$ the size of this gap grows with the window size,
mirroring the behavior already observed in \Cref{fig:whole_vs_cold_vs_warm}.
% [Interpretation] Figure 4.9
% [Interpretation] Figure 4.8
These observations are largely consistent with the effective-iterations
hypothesis put forward above.
@@ -1856,10 +1870,9 @@ previous experiments.
\label{fig:bp_f}
\end{figure}
% [Experimental parameters] Figure 4.10
% [Experimental parameters] Figure 4.9
\Cref{fig:bp_f} summarizes the results of this investigation.
In both panels the dashed colored curves correspond to cold-start
sliding-window decoding for $F \in \{1, 2, 3\}$ and the solid colored
curves to the corresponding warm-start runs.
@@ -1875,7 +1888,7 @@ mirroring the setup of \Cref{fig:bp_w_over_iter} and again including
an inset that magnifies the low-iteration regime
$n_\text{iter} \in [32, 512]$.
% [Description] Figure 4.10
% [Description] Figure 4.9
In \Cref{fig:bp_f_over_p}, every curve exhibits the expected
monotonic increase of the per-round \ac{ler} with the physical
@@ -1898,7 +1911,7 @@ Furthermore, the magnified plot confirms that the gap between warm-
and cold-start curves at fixed $F$ shrinks as $n_\text{iter}$ grows,
and that at fixed $n_\text{iter}$ this gap is largest for $F = 1$.
% [Interpretation] Figure 4.10
% [Interpretation] Figure 4.9
The observed dependence on the step size mirrors the dependence on
the window size studied earlier and the same explanation applies.

View File

@@ -3,9 +3,12 @@
\content{Takeaway: Warm-start more effective for lower numbers of max
iterations (plays into our hands because lower number of iterations
means lower latency)}
\content{Warm-start initialization limited to decoding algorithms
providing relevant soft information}
\content{\textbf{Ideas for further research}}
\content{Softer way of decimating VNs}
\content{Systematic study on using different inner decoders (AED,
SED, BPGD, ...)}
\content{Investigate SC-LDPC window decoding wave-like effects}