Final readthrough corrections for decoding chapter

This commit is contained in:
2026-05-04 18:42:39 +02:00
parent a977860ddb
commit 1edc3f301a


@@ -3,9 +3,9 @@
\label{ch:Decoding} \label{ch:Decoding}
In \Cref{ch:Fundamentals}, we introduced the fundamentals of classical In \Cref{ch:Fundamentals}, we introduced the fundamentals of classical
error correction, before moving on to quantum information science and error correction, before turning to quantum information science and
finally combining the two in \acf{qec}. finally combining the two in \acf{qec}.
In \Cref{ch:Fault tolerance}, we then turned to fault-tolerance, with In \Cref{ch:Fault tolerance}, we then considered fault-tolerance, with
a focus on a specific way of implementing it, called \acfp{dem}. a focus on a specific way of implementing it, called \acfp{dem}.
In this chapter, we move on from the fundamental concepts and examine In this chapter, we move on from the fundamental concepts and examine
how to apply them in practice. how to apply them in practice.
@@ -14,7 +14,7 @@ Specifically, we consider the practical aspects of decoding under \acp{dem}.
In particular, we investigate decoding \acf{qldpc} codes under \acp{dem}. In particular, we investigate decoding \acf{qldpc} codes under \acp{dem}.
We focus on \ac{qldpc} codes, as they have emerged as leading We focus on \ac{qldpc} codes, as they have emerged as leading
candidates for practical quantum error correction, offering candidates for practical quantum error correction, offering
comparable thresholds with substantially improved encoding rates good thresholds with substantially improved encoding rates
\cite[Sec.~1]{bravyi_high-threshold_2024}. \cite[Sec.~1]{bravyi_high-threshold_2024}.
Because of this, the decoding algorithms we consider will all be Because of this, the decoding algorithms we consider will all be
based on \acf{bp}. based on \acf{bp}.
@@ -29,7 +29,7 @@ exist, the \ac{bp} algorithm becomes uncertain of the direction to proceed in.
Additionally, the commutativity conditions of the stabilizers Additionally, the commutativity conditions of the stabilizers
necessitate the existence of short cycles. necessitate the existence of short cycles.
Together, these two aspects lead to substantial convergence problems Together, these two aspects lead to substantial convergence problems
of \ac{bp} for quantum codes, when it is used on its own. of \ac{bp} for quantum codes, when employed on its own.
Second, the consideration of circuit-level noise introduces many more Second, the consideration of circuit-level noise introduces many more
error locations into the circuit. error locations into the circuit.
@@ -49,7 +49,7 @@ but rather quantum codes in general.
Many different approaches to solving it exist, usually centered Many different approaches to solving it exist, usually centered
around modifying \ac{bp}. around modifying \ac{bp}.
The most popular approach is combining a few initial iterations of The most popular approach is combining a few initial iterations of
\ac{bp} with a second decoding algorithm, namely \ac{osd} \ac{bp} with a second decoding algorithm, \ac{osd}
\cite{roffe_decoding_2020}. \cite{roffe_decoding_2020}.
Other approaches exist, such as \ac{aed} Other approaches exist, such as \ac{aed}
\cite{koutsioumpas_automorphism_2025}, where multiple variations of \cite{koutsioumpas_automorphism_2025}, where multiple variations of
@@ -274,7 +274,7 @@ The employed noise models also differ;
Finally, in \cite{gong_toward_2024} the authors introduce their own variation of Finally, in \cite{gong_toward_2024} the authors introduce their own variation of
\ac{bpgd}, \ac{bp} with \ac{gdg}, while \cite{huang_increasing_2024} \ac{bpgd}, \ac{bp} with \ac{gdg}, while \cite{huang_increasing_2024}
and \cite{kang_quits_2025} use \ac{bp} + \ac{osd}. and \cite{kang_quits_2025} use \ac{bp} + \ac{osd}.
We would additionally like to note that only in We would additionally like to note that only
\cite{gong_toward_2024} and \cite{kang_quits_2025} \cite{gong_toward_2024} and \cite{kang_quits_2025}
explicitly work with the \ac{dem} formalism. explicitly work with the \ac{dem} formalism.
@@ -392,7 +392,7 @@ is in turn based on \cite{huang_increasing_2024}.
Sliding-window decoding is made possible by the time-like structure Sliding-window decoding is made possible by the time-like structure
of the syndrome extraction circuitry. of the syndrome extraction circuitry.
This is especially clearly visible under the \ac{dem} formalism, where This is especially clearly visible under the \ac{dem} formalism, where
this manifests as a block-diagonal structure of the detector it manifests as a block-diagonal structure of the detector
error matrix $\bm{H}$. error matrix $\bm{H}$.
Note that this presupposes a choice of detectors as seen in Note that this presupposes a choice of detectors as seen in
\Cref{subsec:Detector Error Matrix}. \Cref{subsec:Detector Error Matrix}.
@@ -412,7 +412,7 @@ no longer contribute to decoding, since none of their
neighboring \acp{vn} appear in subsequent windows. neighboring \acp{vn} appear in subsequent windows.
We call the set of \acp{vn} connected to those \acp{cn} the We call the set of \acp{vn} connected to those \acp{cn} the
\emph{commit region} and we commit them before moving to the \emph{commit region} and we commit them before moving to the
next window, i.e., fix the values we estimate for the corresponding bits. next window, i.e., we fix the values we estimate for the corresponding bits.
The benefit of this sequential sliding-window decoding approach is The benefit of this sequential sliding-window decoding approach is
that the decoding process can begin as soon as the syndrome that the decoding process can begin as soon as the syndrome
measurements for the first window are complete. measurements for the first window are complete.
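The sequential commit-then-advance scheme described above can be sketched in Python. This is an illustrative sketch only, not the thesis's implementation; the interface (`windows` as `(rows, cols, commit_cols)` index triples, `inner_decode` as a pluggable inner decoder) is a hypothetical simplification.

```python
import numpy as np

def sliding_window_decode(H, syndrome, windows, inner_decode):
    """Sequential sliding-window decoding sketch (hypothetical interface).

    For each window we decode the residual syndrome restricted to that
    window's check rows and variable columns, then commit (fix) the
    estimates on the commit region before moving to the next window.
    """
    n = H.shape[1]
    e_hat = np.zeros(n, dtype=np.uint8)
    for rows, cols, commit_cols in windows:
        # Residual syndrome: remove contributions of already-committed bits.
        residual = (syndrome + H @ e_hat) % 2
        # Decode the sub-problem restricted to this window.
        e_win = inner_decode(H[np.ix_(rows, cols)], residual[rows])
        # Fix only the bits in the commit region.
        for j_local, j in enumerate(cols):
            if j in commit_cols:
                e_hat[j] = e_win[j_local]
    return e_hat
```

Because each window only needs its own syndrome rows, decoding of window zero can start as soon as those measurements arrive, which is the latency benefit the text describes.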
@@ -496,7 +496,7 @@ respective nodes.
In this case, we take $\bm{H} \in \mathbb{F}_2^{m\times n}$ to be the In this case, we take $\bm{H} \in \mathbb{F}_2^{m\times n}$ to be the
check matrix of the underlying code, from which the \ac{dem} was generated. check matrix of the underlying code, from which the \ac{dem} was generated.
We use $m_\text{DEM}, \mathcal{I}_\text{DEM}$, and $\mathcal{J}_\text{DEM}$ We use $m_\text{DEM}, \mathcal{I}_\text{DEM}$, and $\mathcal{J}_\text{DEM}$
to refer to the respective values defined from the detector error matrix. to refer to the respective values defined for the detector error matrix.
% How we get the corresponding rows % How we get the corresponding rows
@@ -515,7 +515,7 @@ Thus, we define
\mathcal{J}_\text{win}^{(\ell)} &:= \left\{ j\in \mathcal{J}_\text{DEM}:~ \mathcal{J}_\text{win}^{(\ell)} &:= \left\{ j\in \mathcal{J}_\text{DEM}:~
\ell F m \le j < \min \left\{m_\text{DEM}, (\ell F + W) m \right\} \ell F m \le j < \min \left\{m_\text{DEM}, (\ell F + W) m \right\}
\right\} \\[2mm] \right\} \\[2mm]
& \hspace{30mm} \text{and} \\[2mm] & \hspace{37mm} \text{and} \\[2mm]
\mathcal{J}_\text{commit}^{(\ell)} &:= \left\{ j\in \mathcal{J}_\text{DEM}:~ \mathcal{J}_\text{commit}^{(\ell)} &:= \left\{ j\in \mathcal{J}_\text{DEM}:~
\ell F m \le j < \min \left\{m_\text{DEM}, (\ell + 1) F m \right\} \ell F m \le j < \min \left\{m_\text{DEM}, (\ell + 1) F m \right\}
\right\} \right\}
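The two index-set definitions above translate directly into code. The following sketch uses hypothetical parameter names (`m` detectors per round, `m_dem` detectors in total, window width `W` and step size `F` in rounds) and assumes, as in the text, that the detector ordering is round-by-round.

```python
def window_index_sets(m, m_dem, W, F, ell):
    """Detector index sets for sliding window `ell` (illustrative sketch).

    Mirrors the definitions in the text: the window spans W rounds of m
    detectors each starting at round ell*F; the commit region is the
    first F rounds of that window. Both are clipped at m_dem.
    """
    j_win = list(range(ell * F * m, min(m_dem, (ell * F + W) * m)))
    j_commit = list(range(ell * F * m, min(m_dem, (ell + 1) * F * m)))
    return j_win, j_commit
```

Consecutive windows then overlap on the last `(W - F) * m` indices of `j_win`, which is exactly the overlap region used for the warm-start initialization later in the chapter.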
@@ -531,7 +531,7 @@ $\ell$ and $\ell + 1$ as $\mathcal{J}_\text{overlap}^{(\ell)} :=
% How we get the corresponding columns % How we get the corresponding columns
We can now turn our attention to defining the sets of \acp{vn} relevant We now turn our attention to defining the sets of \acp{vn} relevant
to each window. to each window.
We first introduce a helper function $i_\text{max} : We first introduce a helper function $i_\text{max} :
\mathcal{P}(\mathbb{N}) \to \mathbb{N}$, which takes a set \mathcal{P}(\mathbb{N}) \to \mathbb{N}$, which takes a set
@@ -735,7 +735,7 @@ initialization can be used with.
For instance, \ac{bp}+\ac{osd} does not immediately seem suitable, as For instance, \ac{bp}+\ac{osd} does not immediately seem suitable, as
it performs a hard decision on the \acp{vn}, though this remains to it performs a hard decision on the \acp{vn}, though this remains to
be investigated. be investigated.
We chose to investigate first standard \ac{bp} due to its simplicity and We chose to investigate first plain \ac{bp} due to its simplicity and
then \ac{bpgd} because of the availability of recently computed messages. then \ac{bpgd} because of the availability of recently computed messages.
% TODO: Include this? % TODO: Include this?
@@ -1249,10 +1249,11 @@ For the circuit generation, we employed utilities from QUITS
\cite{kang_quits_2025}, which provides syndrome extraction circuitry \cite{kang_quits_2025}, which provides syndrome extraction circuitry
generation for a number of different \ac{qldpc} codes. generation for a number of different \ac{qldpc} codes.
We initially created a Python implementation, which used QUITS for the window We initially created a Python implementation, which used QUITS for the window
splitting and subsequent sliding-window decoding as well. splitting and subsequent sliding-window decoding as well, before
The \ac{bp} and \ac{bpgd} are implementation in Rust to achieve reimplementing in Rust.
higher simulation speeds leveraging the compiled nature of the language. The \ac{bp} and \ac{bpgd} are implemented in Rust to achieve
We reimplemented both the window splitting and the decoders. higher simulation speeds leveraging the compiled nature of the
language.
% Global experimental setup % Global experimental setup
@@ -1268,7 +1269,7 @@ For the generation of the \ac{dem} we set the number of syndrome
extraction rounds to $12$, similarly to \cite{gong_toward_2024}, and extraction rounds to $12$, similarly to \cite{gong_toward_2024}, and
we defined our detectors as in the example in we defined our detectors as in the example in
\Cref{subsec:Detector Error Matrix}. \Cref{subsec:Detector Error Matrix}.
We employed circuit-lose noise as described in We employed circuit-level noise as described in
\Cref{subsec:Choice of Noise Model} as our noise model, specifically standard \Cref{subsec:Choice of Noise Model} as our noise model, specifically standard
circuit-based depolarizing noise \cite[Sec.~VIII]{fowler_high-threshold_2009}, circuit-based depolarizing noise \cite[Sec.~VIII]{fowler_high-threshold_2009},
i.e., all error locations in the circuit get assigned the same i.e., all error locations in the circuit get assigned the same
@@ -1285,7 +1286,8 @@ generated by simulating at least $200$ logical error events.
We begin our investigation by using \ac{bp} with no further We begin our investigation by using \ac{bp} with no further
modifications as the inner decoder. modifications as the inner decoder.
We chose the min-sum variant of \ac{bp} due to its low computational complexity. We choose the min-sum variant of \ac{bp} due to its low computational
complexity.
% [Thread] Get impression for max gain % [Thread] Get impression for max gain
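The min-sum variant chosen above replaces the hyperbolic-tangent check-node rule of sum-product BP with sign and minimum operations. A minimal sketch of that update (illustrative only; the thesis's decoder is implemented in Rust):

```python
import numpy as np

def minsum_check_update(incoming):
    """Min-sum check-node update: each outgoing message is the product
    of the signs of the other incoming messages, times the minimum of
    their magnitudes. Avoids the transcendental functions of
    sum-product BP, giving the low complexity cited in the text."""
    incoming = np.asarray(incoming, dtype=float)
    out = np.empty_like(incoming)
    for k in range(incoming.size):
        others = np.delete(incoming, k)
        out[k] = np.prod(np.sign(others)) * np.min(np.abs(others))
    return out
```

In practice, implementations often add a normalization or offset factor to compensate for min-sum's overestimation of message magnitudes; the plain form is shown here for clarity.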
@@ -1295,7 +1297,7 @@ To this end, we begin by analyzing the decoding performance of the
original process, without our warm-start modification. original process, without our warm-start modification.
We will call this \emph{cold-start} decoding in the following. We will call this \emph{cold-start} decoding in the following.
Because we expect more global decoding to work better (the inner Because we expect more global decoding to work better (the inner
decoder then has access to a larger portion of the long-range decoder has access to a larger portion of the long-range
correlations encoded in the detector error matrix before any commit correlations encoded in the detector error matrix before any commit
is made) we initially decide to use decoding on the whole detector is made) we initially decide to use decoding on the whole detector
error matrix as a proxy for the attainable decoding performance. error matrix as a proxy for the attainable decoding performance.
@@ -1729,7 +1731,7 @@ hypothesis put forward above.
The whole-block decoder eventually overtaking every windowed scheme The whole-block decoder eventually overtaking every windowed scheme
matches the prediction made there: with a sufficiently large matches the prediction made there: with a sufficiently large
iteration budget, the whole-block decoder reaches an error rate iteration budget, the whole-block decoder reaches an error rate
that nonone of the windowed schemes can beat, because of the more global that none of the windowed schemes can beat, because of the more global
nature of the considered constraints. nature of the considered constraints.
Furthermore, the pronounced advantage of warm- over cold-start decoding at low Furthermore, the pronounced advantage of warm- over cold-start decoding at low
numbers of iterations makes sense if we consider the overall trend of the plots. numbers of iterations makes sense if we consider the overall trend of the plots.
@@ -1742,7 +1744,7 @@ initialization diminishes, and the curves approach each other.
The fact that no curve clearly saturates within the swept range is The fact that no curve clearly saturates within the swept range is
itself worth noting. itself worth noting.
We know that \ac{bp} on \ac{qldpc} codes suffers from poor We know that \ac{bp} on \ac{qldpc} codes suffers from poor
convergence due to degeneracy and the short cycles in the underlying convergence due to degeneracy and short cycles in the underlying
Tanner graph, so even after several thousand iterations the decoder Tanner graph, so even after several thousand iterations the decoder
may continue to slowly refine its message estimates rather than may continue to slowly refine its message estimates rather than
settle into a stable fixed point. settle into a stable fixed point.
@@ -1968,9 +1970,9 @@ previous experiments.
% [Experimental parameters] Figure 4.9 % [Experimental parameters] Figure 4.9
\Cref{fig:bp_f} summarizes the results of this investigation. \Cref{fig:bp_f} summarizes the results of this investigation.
In both panels the dashed colored curves correspond to cold-start In both panels, the dashed curves correspond to cold-start
sliding-window decoding for $F \in \{1, 2, 3\}$ and the solid colored sliding-window decoding for $F \in \{1, 2, 3\}$ and the solid
curves to the corresponding warm-start runs. curves to warm-start decoding.
The window size is fixed to $W = 5$ throughout. The window size is fixed to $W = 5$ throughout.
\Cref{fig:bp_f_over_p} sweeps the physical error rate over \Cref{fig:bp_f_over_p} sweeps the physical error rate over
$p \in [0.001, 0.004]$ in steps of $0.0005$ at a fixed maximum of $p \in [0.001, 0.004]$ in steps of $0.0005$ at a fixed maximum of
@@ -1990,7 +1992,7 @@ monotonic increase of the per-round \ac{ler} with the physical
error rate. error rate.
At fixed $F$, the warm-start approach lies below At fixed $F$, the warm-start approach lies below
cold-start across the entire sweep, and at fixed cold-start across the entire sweep, and at fixed
warm- or cold-start, smaller $F$ produces a lower \ac{ler}. warm or cold start, smaller $F$ produces a lower \ac{ler}.
Both gaps grow as the physical error rate decreases: Both gaps grow as the physical error rate decreases:
the curves at $F = 1$ separate further from those at $F = 2$ and $F = 3$, the curves at $F = 1$ separate further from those at $F = 2$ and $F = 3$,
and the warm-start curves separate further from the cold-start ones. and the warm-start curves separate further from the cold-start ones.
@@ -2314,7 +2316,7 @@ Recall from
that the warm start for \ac{bpgd} carries over not only the \ac{bp} that the warm start for \ac{bpgd} carries over not only the \ac{bp}
messages on the edges of the overlap region but also the decimation messages on the edges of the overlap region but also the decimation
information. information.
Because we run with an iteration budget large enough to decimate Because we decode with an iteration budget large enough to decimate
every \ac{vn} in a window, by the time window $\ell$ ends, all every \ac{vn} in a window, by the time window $\ell$ ends, all
of its \acp{vn} have already been hard-decided. of its \acp{vn} have already been hard-decided.
For the \acp{vn} that lie in the overlap region with window $\ell + 1$ For the \acp{vn} that lie in the overlap region with window $\ell + 1$
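A single guided-decimation step, as in the BPGD variants discussed here, can be sketched as follows (an illustrative sketch with hypothetical names, not the specific GDG or BPGD schedule of the cited works):

```python
def decimate_most_reliable(llrs, decided):
    """One guided-decimation step: among the not-yet-decided variable
    nodes, fix the one whose posterior LLR has the largest magnitude to
    its hard decision (0 for non-negative LLR, 1 otherwise).
    `decided` maps already-fixed VN indices to their decided values."""
    undecided = [i for i in range(len(llrs)) if i not in decided]
    j = max(undecided, key=lambda i: abs(llrs[i]))
    decided[j] = 0 if llrs[j] >= 0 else 1
    return j
```

Repeating this step between BP rounds until every VN of the window is in `decided` is what makes the warm-start question delicate: carrying `decided` into the overlap region of the next window freezes those bits before the new window's checks have been seen.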
@@ -2515,8 +2517,8 @@ fixed physical error rate.
\Cref{fig:bpgd_iter} shows the per-round \ac{ler} of \ac{bpgd} \Cref{fig:bpgd_iter} shows the per-round \ac{ler} of \ac{bpgd}
sliding-window decoding as a function of the maximum number of inner sliding-window decoding as a function of the maximum number of inner
\ac{bp} iterations $n_\text{iter}$. \ac{bp} iterations $n_\text{iter}$.
The dashed colored curves correspond to cold-start sliding-window The dashed curves correspond to cold-start sliding-window
decoding and the solid colored curves to warm-start, which again decoding and the solid curves to warm-start, which again
retains both the \ac{bp} messages and the decimation information on retains both the \ac{bp} messages and the decimation information on
the overlap region. the overlap region.
The physical error rate is fixed at $p = 0.0025$ and the iteration The physical error rate is fixed at $p = 0.0025$ and the iteration
@@ -2607,8 +2609,8 @@ of the warm-start curves and limit ourselves to noting it.
The natural consequence of the previous diagnosis is to drop the The natural consequence of the previous diagnosis is to drop the
problematic part of the warm-start initialization for \ac{bpgd} and problematic part of the warm-start initialization for \ac{bpgd} and
to carry over only the \ac{bp} messages on the edges of the overlap to carry over only the \ac{bp} messages on the edges of the overlap
region, as in \Cref{fig:messages_tanner}, while leaving the channel region, as in \Cref{fig:messages_tanner}, while leaving the
\acp{llr} of the next window in their original cold-start state. decimation information of the next window in its original cold-start state.
Note that some information about the previous window's decimation Note that some information about the previous window's decimation
state is still implicitly carried over through the \ac{bp} messages, state is still implicitly carried over through the \ac{bp} messages,
since the decimation decisions were made based on the messages themselves. since the decimation decisions were made based on the messages themselves.
@@ -2803,7 +2805,7 @@ as $F$ grows.
% [Description] Interpretation 4.12 % [Description] Interpretation 4.12
Removing the channel \acp{llr} from the warm-start initialization lifts Removing the decimation information from the warm-start initialization lifts
the warm-start regression observed in \Cref{fig:bpgd_wf}, the warm-start regression observed in \Cref{fig:bpgd_wf},
and warm-start now consistently outperforms cold-start. and warm-start now consistently outperforms cold-start.
The dependence on the window size and the step size also recovers The dependence on the window size and the step size also recovers
@@ -3036,7 +3038,7 @@ performance gain when the inner decoder is upgraded from plain
\ac{bp} to its guided-decimation variant, but only if some care is \ac{bp} to its guided-decimation variant, but only if some care is
taken in choosing what information to carry over. taken in choosing what information to carry over.
Passing the channel \acp{llr} along with the \ac{bp} messages, Passing the channel \acp{llr} along with the \ac{bp} messages,
as suggested by naively carrying over the warm-start idea to \ac{bpgd}, as suggested by naively transferring the warm-start idea to \ac{bpgd},
leads to premature hard decisions on \acp{vn} in the overlap region. leads to premature hard decisions on \acp{vn} in the overlap region.
This leads to warm-start initialization actually worsening the This leads to warm-start initialization actually worsening the
performance compared to cold-start initialization. performance compared to cold-start initialization.