Add first review responses

Add changes made before submission for review
Add matlab BP simulation
2024-06-13 17:42:42 +02:00 · 2024-06-13 17:41:59 +02:00 · 2024-06-13 17:35:48 +02:00
2 changed files with 404 additions and 78 deletions
--- a/letter.tex
+++ b/letter.tex
@@ -6,6 +6,7 @@
 \usepackage{algorithmic}
 \usepackage{algorithm}
 \usepackage{siunitx}
+\usepackage[normalem]{ulem}
 \usepackage{dsfont}
 \usepackage{mleftright}
 \usepackage{bbm}
@@ -26,6 +27,18 @@
 \hyphenation{op-tical net-works semi-conduc-tor IEEE-Xplore}


+%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Custom commands
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%
+
+
+\newcommand{\reviewone}[1]{{\textcolor{KITblue}{#1}}}
+\newcommand{\reviewtwo}[1]{{\textcolor{KITpalegreen}{#1}}}
+\newcommand{\reviewthree}[1]{{\textcolor{KITred}{#1}}}
+
+
 %
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % Inputs & Global Options
@@ -56,7 +69,7 @@
 \pgfplotsset{colorscheme/cel}

 \newcommand{\figwidth}{\columnwidth}
-\newcommand{\figheight}{0.75\columnwidth}
+\newcommand{\figheight}{0.7\columnwidth}

 \pgfplotsset{
 	FERPlot/.style={
@@ -107,17 +120,17 @@


 \begin{abstract}
-In this paper, the proximal decoding algorithm is considered within the
+In this paper, the proximal decoding algorithm described in, e.g., \cite{proximal_paper}, is considered within the
 context of \textit{additive white Gaussian noise} (AWGN) channels.
 An analysis of the convergence behavior of the algorithm shows that
 proximal decoding inherently enters an oscillating behavior of the estimate
 after a certain number of iterations.
 Due to this oscillation, frame errors arising during decoding can often
-be attributed to only a few remaining wrongly decoded bits.
+be attributed to only a few remaining wrongly decoded bit positions.
 In this letter, an improvement of the proximal decoding algorithm is proposed
-by appending an additional step, in which these erroneous components are
+by establishing an additional step, in which these erroneous positions are
 attempted to be corrected.
-We suggesst an empirical rule with which the components most likely needing
+We suggest an empirical rule with which the components most likely needing
 correction can be determined.
 Using this insight and performing a subsequent ``ML-in-the-list'' decoding,
 a gain of up to 1 dB is achieved compared to conventional
@@ -139,6 +152,11 @@ Optimization-based decoding, Proximal decoding, ML-in-the-list.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{Introduction}

+\reviewone{Test1}
+\reviewtwo{Test2}
+\reviewthree{Test3}
+
+
 \IEEEPARstart{C}{hannel} coding using binary linear codes is a way of enhancing
 the reliability of data by detecting and correcting any errors that may occur
 during its transmission or storage.
@@ -151,10 +169,12 @@ While the established decoders for LDPC codes, such as belief propagation (BP)
 and the min-sum algorithm, offer good decoding performance, they are generally
 not optimal and exhibit an error floor for high
 \textit{signal-to-noise ratios} (SNRs) \cite{channel_codes_book}, making them
-unsuitable for applications with extreme reliability requirements.
+inadequate for applications with extreme reliability requirements.

 Optimization based decoding algorithms are an entirely different way of
-approaching the decoding problem.
+approaching the decoding problem;
+they map the decoding problem onto an optimization problem in order to
+leverage the vast knowledge from the field of optimization theory.
 A number of different such algorithms have been introduced.
 The field of \textit{linear programming} (LP) decoding \cite{feldman_paper},
 for example, represents one class of such algorithms, based on a relaxation
@@ -167,26 +187,26 @@ Proximal decoding relies on a non-convex optimization formulation
 of the \textit{maximum a posteriori} (MAP) decoding problem.

 The aim of this work is to improve upon the performance of proximal decoding by
-first presenting an examination of the algorithm's behavior and then suggesting
+first presenting an analysis of the algorithm's behavior and then suggesting
 an approach to mitigate some of its flaws.
 This analysis is performed for
 \textit{additive white Gaussian noise} (AWGN) channels.
 We first observe that the algorithm initially moves the estimate in
-the right direction, however, in the final steps of the decoding process,
+the right direction; however, in the final steps of the decoding process,
 convergence to the correct codeword is often not achieved.
-Furthermore, we suggest that the reason for this behavior is the nature
+Subsequently, we attribute this behavior to the nature
 of the decoding algorithm itself, comprising two separate gradient descent
 steps working adversarially.

-We propose a method mitigate this effect by appending an
-additional step to the decoding process.
+We, thus, propose a method to mitigate this effect by appending an
+additional step to the iterative decoding process.
 In this additional step, the components of the estimate with the highest
 probability of being erroneous are identified.
 New codewords are then generated, over which an ``ML-in-the-list''
 \cite{ml_in_the_list} decoding is performed.
 A process to conduct this identification is proposed in this paper.
 Using the improved algorithm, a gain of up to
-1 dB can be achieved compared to conventional proximal decoding,
+$\SI{1}{dB}$ can be achieved compared to conventional proximal decoding,
 depending on the decoder parameters and the code.


@@ -200,7 +220,7 @@ When considering binary linear codes, data words are mapped onto
 codewords, the lengths of which are denoted by $k \in \mathbb{N}$
 and $n \in \mathbb{N}$, respectively, with $k \le n$.
 The set of codewords $\mathcal{C} \subset \mathbb{F}_2^n$ of a binary linear
-code can be represented using the parity-check matrix
+code can be characterized using the parity-check matrix
 $\boldsymbol{H} \in \mathbb{F}_2^{m \times n} $, where $m$ represents the
 number of parity-checks:
 %
@@ -230,7 +250,7 @@ estimate of the transmitted codeword, denoted as
 $\hat{\boldsymbol{c}} \in \mathbb{F}_2^n$.
 A distinction is made between $\boldsymbol{x} \in \left\{\pm 1\right\}^n$
 and $\tilde{\boldsymbol{x}} \in \mathbb{R}^n$,
-the former denoting the BPSK symbol physically transmitted over the channel and
+the former denoting the BPSK symbols transmitted over the channel and
 the latter being used as a variable during the optimization process.
 The posterior probability of having transmitted $\boldsymbol{x}$ when receiving
 $\boldsymbol{y}$ is expressed as a \textit{probability mass function} (PMF)
@@ -267,8 +287,8 @@ One such expression, formulated under the assumption of BPSK, is the
 .\end{align*}%
 %
 Its intent is to penalize vectors far from a codeword.
-It comprises two terms: one representing the bipolar constraint
-and one representing the parity constraint, incorporating all of the
+It comprises two terms: one representing the bipolar constraint due to transmitting BPSK
+and one representing the parity constraint, incorporating all 
 information regarding the code.

 The channel model can be considered using the negative log-likelihood
@@ -279,7 +299,7 @@ The channel model can be considered using the negative log-likelihood
 	    \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \mright) \mright)
 .\end{align*}
 %
-The information about the channel and the code are consolidated in the objective
+Then, the information about the channel and the code is consolidated in the objective
 function \cite{proximal_paper}
 %
 \begin{align*}
@@ -305,17 +325,17 @@ introduced, describing the result of each of the two steps:
 .\end{alignat}
 %
 An equation for determining $\nabla h(\boldsymbol{r})$ is given in
-\cite{proximal_paper}.
+\cite{proximal_paper}, where it is also proposed to initialize $\boldsymbol{s}=\boldsymbol{0}$.
 It should be noted that the variables $\boldsymbol{r}$ and $\boldsymbol{s}$
 represent $\tilde{\boldsymbol{x}}$ during different
 stages of the decoding process.

 As the gradient of the code-constraint polynomial can attain very large values
-in some cases, an additional step is introduced to ensure numerical stability:
-every current estimate $\boldsymbol{s}$ is projected onto
+in some cases, an additional step is introduced in \cite{proximal_paper} to ensure numerical stability:
+every estimate $\boldsymbol{s}$ is projected onto
 $\left[-\eta, \eta\right]^n$ by a projection
 $\Pi_\eta : \mathbb{R}^n \rightarrow \left[-\eta, \eta\right]^n$, where $\eta$
-is a positive constant slightly larger than one, e.g., $\eta = 1.5$.
+is a positive constant larger than one, e.g., $\eta = 1.5$.
 The resulting decoding process as described in \cite{proximal_paper} is
 presented in Algorithm \ref{alg:proximal_decoding}.

@@ -346,13 +366,12 @@ presented in Algorithm \ref{alg:proximal_decoding}.
 \subsection{Analysis of the Convergence Behavior}

 In Fig. \ref{fig:fer vs ber}, the \textit{frame error rate} (FER),
-\textit{bit error rate} (BER) and \textit{decoding failure rate} (DFR) of
+\textit{bit error rate} (BER), and \textit{decoding failure rate} (DFR) of
 proximal decoding are shown for an LDPC code with $n=204$ and $k=102$
 \cite[204.33.484]{mackay}.
-A decoding failure is defined as a decoding operation returning an invalid
-codeword, i.e., as non-convergence of the algorithm.
+Here, a \emph{decoding failure} is defined as returning an \emph{invalid codeword}, i.e., as non-convergence of the algorithm.
 The parameters chosen for this simulation are $\gamma=0.05, \omega=0.05,
-\eta=1.5$ and $K=200$.
+\eta=1.5$ and $K=200$ ($K$ describing the maximum number of iterations).
 They were determined to offer the best performance in a preliminary examination,
 where the effect of changing multiple parameters was simulated over a wide
 range of values.
@@ -367,7 +386,7 @@ the right direction.
 This would suggest that most frame errors occur due to only a few incorrectly
 decoded bits.%
 %
-\begin{figure}
+\begin{figure}[t]
    \centering


@@ -417,14 +436,13 @@ decoded bits.%
 \end{figure}%
 %

-An approach for lowering the FER might then be to append an ``ML-in-the-list''
+An approach for lowering the FER might then be to add an ``ML-in-the-list''
 \cite{ml_in_the_list} step to the decoding process shown in Algorithm
 \ref{alg:proximal_decoding}.
-This step consists in determining the $N \in \mathbb{N}$ most probable
-erroneous bits, finding all variations of the current estimate with those bits
-modified, and performing ML decoding on this list.
+This step consists in determining the $N \in \mathbb{N}$ most probably
+erroneous bit positions $\mathcal{I}'$, generating a list of $2^N$ codeword candidates out of the current estimate $\hat{\boldsymbol{c}}$ with bits in $\mathcal{I}'$ adopting all possible values, i.e., $\mathcal{L}'=\left\{ \hat{\boldsymbol{c}}'\in\mathbb{F}_2^n: \hat{c}'_i=\hat{c}_i, i\notin \mathcal{I}'\text{ and } \hat{c}'_i\in\mathbb{F}_2, i\in \mathcal{I}'  \right\}$, and performing ML decoding on this list.

-This approach crucially relies on identifying the most probable erroneous bits.
+This approach crucially relies on identifying the most probably erroneous bits.
 Therefore, the convergence properties of proximal decoding are investigated.
 Considering (\ref{eq:s_update}) and (\ref{eq:r_update}), Fig.
 \ref{fig:grad} shows the two gradients along which the minimization is
@@ -437,7 +455,7 @@ This behavior supports the conjecture that the reason for the high DFR is a
 failure to converge to the correct codeword in the final steps of the
 optimization process.%
 %
-\begin{figure}
+\begin{figure}[t]
    \centering

 	\ifoverleaf
@@ -538,7 +556,7 @@ optimization process.%
        $\nabla L\left(\boldsymbol{y} \mid \tilde{\boldsymbol{x}}\right)$
        and $\nabla h \left( \tilde{\boldsymbol{x}} \right)$ for a repetition
        code with $n=2$.
-        Shown for $\boldsymbol{y} = \begin{bmatrix} -0.5 & 0.8 \end{bmatrix}$.
+        Shown for $\boldsymbol{y} = \begin{pmatrix} -0.5 & 0.8 \end{pmatrix}$.
    }
    \label{fig:grad}
 \end{figure}%
@@ -548,7 +566,7 @@ In Fig. \ref{fig:prox:convergence_large_n}, we consider only component
 $\left(\tilde{\boldsymbol{x}}\right)_1$ of the estimate during a
 decoding operation for the LDPC code used also for Fig. 1.
 Two qualities may be observed.
-First, we observe the average absolute values of the two gradients are equal,
+First, we observe that the average absolute values of the two gradients are equal,
 however, they have opposing signs,
 leading to the aforementioned oscillation.
 Second, the gradient of the code constraint polynomial itself starts to
@@ -605,16 +623,16 @@ oscillate after a certain number of iterations.%
 \subsection{Improvement Using ``ML-in-the-List'' Step}

 Considering the magnitude of the oscillation of the gradient of the code constraint
-polynomial, some interesting behavior may be observed.
-Fig. \ref{fig:p_error} shows the probability that a component of the estimate
-is wrong, determined through a Monte Carlo simulation, when the components of
-$\boldsymbol{c}$ are ordered from smallest to largest oscillation of
-$\left(\nabla h\right)_i$.
+polynomial, some interesting behavior may be observed. Let $\boldsymbol{i}'=(i'_1, \ldots, i'_n)$ be a permutation of $\{1,\ldots, n\}$ such that the components $\left(\nabla h\right)_{i'}$ are arranged according to increasing variance of the oscillation of their magnitude, i.e., $\text{Var}_\text{iter}(|\left(\nabla h\right)_{i'_1}|)\leq \cdots \leq \text{Var}_\text{iter}(|\left(\nabla h\right)_{i'_n}|)$ with $\text{Var}_\text{iter}(\cdot)$ denoting the empirical variance along the iterations.

-The lower the magnitude of the oscillation, the higher the probability that the
-corresponding bit was not decoded correctly.
-This means that this magnitude is a suitable figure of merit for determining
-the probability that a given component was decoded incorrectly.%
+Fig. \ref{fig:p_error} shows Monte Carlo estimates of the probability that the decoded bit $\hat{c}_{i'}$ at position $i'$ of the estimated codeword
+is wrong. %, when the components of
+%$\boldsymbol{c}$ are ordered from smallest to largest oscillation of
+%$\left(\nabla h\right)_i$.
+It can be observed that lower magnitudes of oscillation correlate with a higher probability that the corresponding bit was not decoded correctly.
+Thus, this magnitude might be used as a feasible indicator
+%for determining the probability that a given component was decoded incorrectly and, thus, 
+for identifying erroneously decoded bit positions as $\mathcal{I}'=\{i_1', \ldots, i_N'\}$.%
 %
 \begin{figure}[H]
    \centering
@@ -640,10 +658,9 @@ the probability that a given component was decoded incorrectly.%
 	\fi

    \caption{Probability that a component of the estimated codeword
-        $\hat{\boldsymbol{c}}\in \mathbb{F}_2^n$ is erroneous for a (3,6) regular
+        $\boldsymbol{\hat{c}}\in \mathbb{F}_2^n$ is erroneous for a (3,6) regular
        LDPC code with $n=204, k=102$ \cite[\text{204.33.484}]{mackay}.
-        The indices $i'$ are ordered such that the amplitude of oscillation of
-        $\left(\nabla h\right)_{i'}$ increases with $i'$.
+        Indices $i'$ are ordered such that $\text{Var}_\text{iter}(|\left(\nabla h\right)_{i'_1}|)\leq \cdots \leq \text{Var}_\text{iter}(|\left(\nabla h\right)_{i'_n}|)$.
        Parameters used for the simulation: $\gamma = 0.05, \omega = 0.05,
        \eta = 1.5, E_b/N_0 = \SI{4}{dB}$.
        Simulated with $\SI{100000000}{}$ iterations using the all-zeros codeword.}
@@ -656,25 +673,10 @@ If a valid codeword has been reached, i.e., if the algorithm has converged,
 we return this solution.
 Otherwise, $N \in \mathbb{N}$ components are selected based on the criterion
 presented above.
-Beginning with the recent estimate $\hat{\boldsymbol{c}} \in \mathbb{F}_2^n$,
-all variations of words with the selected components modified are then
+Starting from the estimate $\boldsymbol{\hat{c}} \in \mathbb{F}_2^n$ resulting from proximal decoding,
+the list $\mathcal{L}'$ of codeword candidates with bits in $\mathcal{I}'$ modified is
 generated and an ``ML-in-the-list'' step is performed.

-\begin{algorithm}
-    \caption{ML-in-the-List algorithm.}
-    \label{alg:ml-in-the-list}
-
-    \begin{algorithmic}
-        \STATE Find valid codewords under $\left(\hat{\boldsymbol{c}}_{l}\right)_{1=1}^{2^N}$
-        \STATE \textbf{if} no valid codewords exist
-        \STATE \hspace{5mm} Compute $\langle \hat{\boldsymbol{c}}_l, \hat{\boldsymbol{c}} \rangle$ for all variations $\boldsymbol{c}_l$
-        \STATE \textbf{else}
-        \STATE \hspace{5mm} Compute $\langle \hat{\boldsymbol{c}}_l, \hat{\boldsymbol{c}} \rangle$ for valid codewords
-        \STATE \textbf{end if}
-        \STATE \textbf{return} $\hat{\boldsymbol{c}}_l$ with highest $\langle \hat{\boldsymbol{c}}_l, \hat{\boldsymbol{c}} \rangle$
-    \end{algorithmic}
-\end{algorithm}%
-%
 \begin{algorithm}
    \caption{Improved proximal decoding algorithm.
        }
@@ -685,24 +687,62 @@ generated and an ``ML-in-the-list'' step is performed.
        \STATE \textbf{for} $K$ iterations \textbf{do}
        \STATE \hspace{5mm} $\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \left( \boldsymbol{s} - \boldsymbol{y} \right) $
        \STATE \hspace{5mm} $\boldsymbol{s} \leftarrow \Pi_\eta \left(\boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right) \right)$
-        \STATE \hspace{5mm} $\boldsymbol{\hat{c}} \leftarrow \mathds{1} \left\{ \text{sign}\left( \boldsymbol{s} \right) = -1 \right\}$
-        \STATE \hspace{10mm} \textbf{if} $\boldsymbol{H}\boldsymbol{\hat{c}} = \boldsymbol{0}$ \textbf{do}
+        \STATE \hspace{5mm} $\boldsymbol{\hat{c}} \leftarrow \mathds{1}_{ \left\{ \boldsymbol{s} \leq 0 \right\}}$
+        \STATE \hspace{5mm} \textbf{if} $\boldsymbol{H}\boldsymbol{\hat{c}} = \boldsymbol{0}$ \textbf{do}
        \STATE \hspace{10mm} \textbf{return} $\boldsymbol{\hat{c}}$
        \STATE \hspace{5mm} \textbf{end if}
        \STATE \textbf{end for}
-        \STATE $\textcolor{KITblue}{\text{Estimate $N$ wrong bit indices $\mathcal{I} = \{i_1,\ldots,i_N\}$}}$
-        \STATE $\textcolor{KITblue}{\text{Generate candidate list $\left(\hat{\boldsymbol{c}}_{l}\right)_{l=1}^{2^N}$ by varying bits in $\mathcal{I}$}}$\vspace{1mm}
-        \STATE $\textcolor{KITblue}{\textbf{return  ml\textunderscore in\textunderscore the\textunderscore list}\left(\left(\hat{\boldsymbol{c}}_l\right)_{1=1}^{2^N}\right)}$
+        \STATE $\textcolor{KITblue}{\text{$\mathcal{I}'\leftarrow \{i_1',\ldots, i_N'\}$ (indices of $N$ probably wrong bits) 
+        %$\mathcal{I} = \{i_1,\ldots,i_N\}$
+        }
+        }$
+
+        \STATE $\textcolor{KITblue}{\text{%Generate candidates 
+        $\mathcal{L}'\leftarrow\left\{ \boldsymbol{\hat{c}}'\in\mathbb{F}_2^n: \hat{c}'_i=\hat{c}_i, i\notin \mathcal{I}' \text{ and } \hat{c}'_i\in\mathbb{F}_2, i\in \mathcal{I}'  \right\}
+        %\left(\boldsymbol{\hat{c}}_{l}\right)_{l=1}^{2^N}
+        $ 
+        %by varying bits in $\mathcal{I}$
+        }}
+        $\vspace{1mm}
+        %\STATE \hspace{20mm} \textcolor{KITblue}{(list of codeword candidates)}
+        \STATE $\textcolor{KITblue}{\textbf{return  ML\textunderscore in\textunderscore the\textunderscore list}\left(
+        %\left(\boldsymbol{\hat{c}}_l\right)_{1=1}^{2^N}
+        \mathcal{L}'
+        \right)}$
    \end{algorithmic}
 \end{algorithm}

+\begin{algorithm}
+    \caption{ML-in-the-List algorithm.}
+    \label{alg:ml-in-the-list}
+
+    \begin{algorithmic}
+        \STATE $\mathcal{L}'_\text{valid} \leftarrow \{ \boldsymbol{\hat{c}}'\in\mathcal{L}': \boldsymbol{H}\boldsymbol{\hat{c}}'=\boldsymbol{0}\}$ (select valid codewords) 
+        % Find valid codewords within $\mathcal{L}'$
+        %under $\left(\boldsymbol{\hat{c}}_{l}\right)_{1=1}^{2^N}$
+        \STATE \textbf{if} $\mathcal{L}'_\text{valid}\neq\emptyset$ \textbf{do}
+        %no valid codewords exist
+        \STATE \hspace{5mm} 
+        \textbf{return} $\arg\max \{ \langle 1-2\boldsymbol{\hat{c}}'_l, \boldsymbol{y} \rangle : \boldsymbol{\hat{c}}'_l\in\mathcal{L}'_\text{valid}\}$
+        %Compute $\langle \boldsymbol{\hat{c}}'_l, \boldsymbol{\hat{c}} \rangle$ for all variations $\boldsymbol{\hat{c}}'_l\in\mathcal{L}$
+        \STATE \textbf{else}
+        \STATE \hspace{5mm} 
+        \textbf{return} $\arg\max \{ \langle 1-2 \boldsymbol{\hat{c}}'_l, \boldsymbol{y} \rangle : \boldsymbol{\hat{c}}'_l\in\mathcal{L}'\}$
+        %Compute $\langle \boldsymbol{\hat{c}}'_l, \boldsymbol{\hat{c}} \rangle$ for valid codewords $\boldsymbol{\hat{c}}'_l\in\mathcal{L}$
+        \STATE \textbf{end if}
+        %\STATE \textbf{return} $\boldsymbol{\hat{c}}_l$ with highest $\langle \boldsymbol{\hat{c}}_l, \boldsymbol{\hat{c}} \rangle$
+    \end{algorithmic}
+\end{algorithm}%
+%
+
+

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{Simulation Results \& Discussion}

 Fig. \ref{fig:results} shows the FER and BER resulting from applying
 proximal decoding as presented in \cite{proximal_paper} and the improved
-algorithm presented here when applied to a $\left( 3,6 \right)$-regular LDPC
+algorithm presented in this work, when both are applied to a $\left( 3,6 \right)$-regular LDPC
 code with $n=204$ and $k=102$ \cite[204.33.484]{mackay}.
 The parameters chosen for the simulation are
 $\gamma = 0.05, \omega=0.05, \eta=1.5, K=200$.
@@ -711,7 +751,7 @@ as a preliminary examination
 showed that they provide the best results for proximal decoding as well as
 the improved algorithm.
 All points were generated by simulating at least 100 frame errors.
-The number $N$ of possibly wrong components selected was selected as $8$,
+The number of possibly wrong components was chosen as $N=8$,
 since this provides reasonable gain without requiring an unreasonable amount
 of memory and computational resources.
 %
@@ -740,8 +780,18 @@ of memory and computational resources.
 				width=\figwidth,
 				height=\figheight,
 				legend pos=north east,
-				ylabel={BER (\lineintext{}) / FER (\lineintext{dashed})},
+				ylabel={BER (\lineintext{}), FER (\lineintext{dashed})},
 			]
+				\addplot+[FERPlot, mark=o, mark options={solid}, scol0, forget plot]
+					table [x=SNR, y=FER, col sep=comma,
+						   discard if gt={SNR}{9}]
+						{res/bp_20433484.csv};
+
+				\addplot+[BERPlot, mark=*, scol0]
+					table [x=SNR, y=BER, col sep=comma,
+						   discard if gt={SNR}{7.5}]
+						{res/bp_20433484.csv};
+				\addlegendentry{BP};

 				\addplot+[FERPlot, mark=o, mark options={solid}, scol1, forget plot]
 					table [x=SNR, y=FER, col sep=comma,
@@ -785,12 +835,12 @@ of memory and computational resources.

 A noticeable improvement can be observed both in the FER as well as the BER.
 The gain varies significantly
-with the SNR (which is to be expected, since with higher SNR values the number
-of bit errors decreases, making the correction of those errors in the
-``ML-in-the-list'' step more likely).
+with the SNR, which is to be expected since higher SNR values result in a decreased number
+of bit errors, making the correction of those errors in the
+``ML-in-the-list'' step more likely.
 For an FER of $10^{-6}$, the gain is approximately $\SI{1}{dB}$.
-Similar behavior can be observed with various other codes.
-No immediate relationship between the code length and the gain was observed
+Similar behavior was observed with a number of different codes, e.g., \cite[\text{PEGReg252x504, 204.55.187, 96.3.965}]{mackay}.
+Furthermore, no immediate relationship between the code length and the gain was observed
 during our examinations.

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -825,5 +875,226 @@ Ministry of Education and Research (BMBF) within the project Open6GHub

 \printbibliography

-\end{document}

+%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+% Response to the reviews
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%
+
+\newpage
+\onecolumn
+
+\section{Authors' Response to the Editor and the Reviewers}
+
+\subsection{Review 1}
+
+
+\begin{itemize}
+    \item \textbf{Comment 1:} This paper proposes a combination of proximal decoding and ML-in-the-list decoding. There are several issues with the paper in its current form that need to be addressed.
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. \reviewone{The corresponding changes are highlighted in color in the paper and listed below.
+    }
+    \vspace{0.75cm}    
+
+
+    \item \textbf{Comment 2:} The definition of code-constraint polynomial is baseless. The authors should explain why we use the code-constraint polynomial. Also, I think the code-constraint polynomial cannot be used to replace the prior PDF of $\boldsymbol{x}$, since $h(\boldsymbol{0})$ is the minimum value of $h(\boldsymbol{x})$.
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. The definition of the code-constraint polynomial follows \cite{proximal_paper} directly, where the authors state:
+    
+    \vspace{.1cm}
+        "[...] The first term on the right-hand side of this equation represents the bipolar constraint [...] and the second term corresponds to the parity constraint induced by $\boldsymbol{H}$ [...]. Since the polynomial $h(x)$ has a sum-of-squares (SOS) form, it can be regarded
+        as a penalty function that gives positive penalty values for non-codeword vectors in $\mathbb{R}^n$. The code-constraint polynomial $h(x)$ is inspired by the non-convex parity constraint function used in the GDBF objective function [4]. [...]"
+    \vspace{.1cm}
+    
+    Please note that $\boldsymbol{0}$ is not a global minimum for the code-constraint polynomial, but every codeword constitutes a local minimum. Therefore, an iterative algorithm can converge to one of those local minima and, thus, approximate the nearest neighbor decision.
+    
+    \vspace{0.75cm} 
+
+
+    \item \textbf{Comment 3:} The definition of the projection $\prod_\eta$ should be provided.
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. We added the following description:
+
+    \vspace{.1cm}
+    "[...] every estimate $\boldsymbol{s}$ is projected onto $\left[-\eta, \eta\right]^n$ by a projection $\Pi_\eta : \mathbb{R}^n \rightarrow \left[-\eta, \eta\right]^n$ 
+    \reviewone{
+    defined as component-wise clipping, i.e., $\Pi_\eta(x_i)=x_i$ if $-\eta\leq x_i\leq \eta$, $\Pi_\eta(x_i)=\eta$ if $x_i>\eta$, and $\Pi_\eta(x_i)=-\eta$ if $x_i<-\eta$,
+    }
+    where $\eta$ is a positive constant larger than one, e.g., $\eta = 1.5$. [...]"
+    \vspace{0.75cm} 
+
+
+    \item \textbf{Comment 4:} The proposed improved proximal decoding algorithm is just a combination of proximal decoding and ML-in-the-list decoding. Then, the process of the ML-in-the-list decoding used in this paper is similar to that of chase decoding, which is commonly used in decoding. ML-in-the-list decoding.
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. Yes, this is correct. The idea is very similar to Chase decoding. This paper does not claim to introduce ML-in-the-list or Chase decoding, but rather provides a way to generate the list within proximal decoding. We tried to clarify this by adding the following statement:
+
+    \vspace{.1cm}
+    \reviewone{asdf}
+    \vspace{0.75cm} 
+    
+
+    \item \textbf{Comment 5:}
+    The criterion to construct the index set $\mathcal{I}'$ with $N$ elements should be explained clearly.
+    
+    \vspace{0.25cm} 
+    \textbf{Authors:} 
+    Thank you for your feedback. We added the following parts to the corresponding paragraph for more clarity:
+
+    \vspace{.1cm}
+    "[...] \reviewone{Tagging the $N\in\mathbb{N}$ most likely erroneous bits can be based on} considering the \reviewone{oscillation of the gradient magnitudes  $|\left(\nabla h\right)_{i}|$, $i=1,\ldots, n$ \sout{of the magnitude of the gradient oscillation}} of the code-constraint polynomial \reviewone{by determining the empirical variances along the iterations $\text{Var}_\text{iter}(|\left(\nabla h\right)_{i}|)$, $i=1,\ldots, n$}. 
+    \reviewone{\sout{some interesting behavior may be observed}}. 
+    \reviewone{Now,} let \reviewone{$\boldsymbol{i}'=(i'_1, \ldots, i_n')=(\tau(1),\ldots, \tau(n))$ with $\tau: \{1,\ldots, n\}\to\{1,\ldots,n\}$} be a permutation of $\{1,\ldots, n\}$ such that $\left| \left(\nabla h\right)\right|_{i'}$ is arranged according to increasing \reviewone{empirical} variances \reviewone{\sout{gradient's magnitude oscillation of its magnitude}}, i.e., 
+    \begin{equation}\label{eq:def:i_prime}
+    \text{Var}_\text{iter}(|\left(\nabla h\right)_{i'_1}|)\leq \cdots \leq \text{Var}_\text{iter}(|\left(\nabla h\right)_{i'_n}|).
+    \end{equation}
+    \reviewone{\sout{with $\text{Var}_\text{iter}(\cdot)$ denoting the empirical variance along the iterations.}}
+    
+    \reviewone{To motivate the approach in (\ref{eq:def:i_prime}) \sout{Hereafter}}, Fig. \ref{fig:p_error} shows Monte Carlo simulations of the probability that the decoded bit $\hat{c}_{i'}$ at position $i'$ of the estimated codeword
+    is wrong. %, when the components of
+    %$\boldsymbol{c}$ are ordered from smallest to largest oscillation of
+    %$\left(\nabla h\right)_i$.
+    It can be observed that lower magnitudes of oscillation correlate with a higher probability that the corresponding bit was not decoded correctly.
+    Thus, this magnitude might be used as a feasible indicator
+    %for determining the probability that a given component was decoded incorrectly and, thus, 
+    for identifying \reviewone{the $N$ most likely} erroneously decoded bit positions as \reviewone{the first $N$ indices of $\boldsymbol{i}'$}:
+    \[
+    \mathcal{I}'=\{i_1', \ldots, i_N': \boldsymbol{i}' \text{ as defined in (\ref{eq:def:i_prime})} \}.%[...]"
+    \]
+    \vspace{0.75cm} 
+
+    \item \textbf{Comment 6:}
+    The performance of BP decoding should be provided as the baseline.
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. We added the corresponding BP performance curves to the figure and added the following comment:
+
+    \vspace{.1cm}
+    \reviewone{As shown in Fig., it can be seen that BP decoding performs...}
+    \vspace{0.75cm} 
+
+\end{itemize}
+
+
+\subsection{Review 2}
+
+
+\begin{itemize}
+    \item \textbf{Comment 1:} I believe that the paper makes a nice contribution to the topic of optimization-based decoding of LDPC codes. The topic is especially relevant, nowadays, for the applicability of this kind of decoders to quantum error correction - where classical BP decoding may yield limited coding gains, due to the loopy nature of the graphs.
+
+    The work is nicely-presented, solid, and the results are convincing. My only comment would be to try to put the use of this decoder in some perspective:
+
+    
+    \vspace{0.25cm}
+    \textbf{Authors:}
+    Thank you for your positive feedback. \reviewtwo{The corresponding changes are highlighted in color in the paper and listed below.}
+    
+    \vspace{0.75cm}    
+
+    \item \textbf{Comment 2:} [...] adding, on the performance charts, the performance of a standard BP decoder (it will beat your decoding algorithm, but this is not the point)
+
+
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. We added the corresponding BP performance curves to the figure and added the following comment:
+
+    \vspace{.1cm}
+    \reviewtwo{As shown in Fig., it can be seen that BP decoding performs...}
+    \vspace{0.75cm} 
+
+    \vspace{0.75cm} 
+
+
+    \item \textbf{Comment 3:} [...] explaining when this class of algorithms should be preferred to BP decoding
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. We added the following statement:
+
+    \vspace{.1cm}
+    \reviewtwo{something concerning effort?!?}
+    \vspace{0.75cm} 
+    \vspace{0.75cm} 
+
+
+\end{itemize}
+
+
+\subsection{Review 3}
+
+\begin{itemize}
+    \item \textbf{Comment 1:} The paper describes an enhancement and mitigate essential flaws found in the recently reported proximal decoding algorithm for LDPC codes, mentioned in the references section. At first the algorithm subject to the paper is interesting because the published material a few years back seem to have no substantial performance improvement, and did not seem to make any influence. It is therefore interesting to see that this paper addresses this fact and fixes the issues around the originally proposed algorithm and demonstrating up to a 1 dB coding gain as a result of these corrections and enhancements.
+
+    While I find the paper is interesting and relevant, here are my essential comments that would prevent me in favor of publication.
+
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your positive feedback. \reviewthree{The corresponding changes are highlighted in color in the paper and listed below.}
+    \vspace{0.75cm}    
+
+    \item \textbf{Comment 2:} The work is titled after linar block codes, however both the original proximal decoding paper and this work go after LDPC codes only. Clarification required as in whether the proposed method would work for any linear block code, and if so, elaboration and proof is needed as well. Currently, linear codes are only mentioned in the first two sentences of the Introduction section other than the title.
+
+
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. We also analyzed the proposed scheme for BCH codes. There, it turned out that...
+
+    
+    \vspace{.1cm}
+    \reviewthree{Some comment regarding the applicability of the proposed scheme to BCH...}
+    \vspace{0.75cm} 
+
+
+    \item \textbf{Comment 3:} Does this work (and the original work) based on BPSK modulation only? How would the code constraint polynomial change with higher order modulations? It would be interesting to see how this would change given that the polynomial is based on a nearest neighbor decision.
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. 
+
+    \vspace{.1cm}
+    \reviewthree{one sentence regarding bit-metric decoder mapping higher valued symbols to elementwise bit-LLRs. }
+    \vspace{0.75cm} 
+
+
+    \item \textbf{Comment 4:} The decoding failure rate stands out as a good analysis as in explaining the FER behavior. But if the codeword is not really converging at all, wouldn't there be simpler approaches than ML decoding to find out which one of $2^N$ codewords is the valid one?
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. 
+    
+    \vspace{.1cm}
+    \reviewthree{one sentence concerning the feasibility of using $N$ bit candidates and choosing $N$ according to the complexity; comment on trade-off w.r.t. $N$}
+    \vspace{0.75cm} 
+
+
+    \item \textbf{Comment 5:} If you can, please have a more comprehensive simulation to smooth out the curve in Fig.4.  Otherwise, please explain the odd behavior in the middle of the figure. I would also recommend a bar graph over a line graph for a better representation of the data.
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. The behavior is due to only a few errors occurring in this setting (please mind the $y$-axis). Since the relevant information is contained only in the lower values of $i'$, which are ultimately chosen to constitute $\mathcal{I}'$, only indices up to, e.g., $N=12$ are relevant. The figure has been extended to focus on the relevant region.
+    \vspace{0.75cm}
+
+
+    \item \textbf{Comment 6:} How does your algorithm handle the case when there is more than one ML in your final list? It is not shown in the algorithm. 
+    
+    \vspace{0.25cm}
+    \textbf{Authors:} 
+    Thank you for your feedback. Since Algorithm 3 (ML-in-the-List) operates on real-valued numbers, the probability of two correlations being equal is zero. \textcolor{red}{@AT: Do we need to check this?} Even if a tie were to occur, choosing either of the candidates is equivalent with respect to the ML decision rule.
+    \vspace{0.75cm}
+
+\end{itemize}
+
+
+\end{document}
--- a/res/bp_20433484.csv
+++ b/res/bp_20433484.csv
@@ -0,0 +1,55 @@
+SNR,FER, BER
+1.0, 0.660130718954248   , 0.0852528713750873  , 201
+1.5, 0.404000000000000   , 0.0521189120809614  , 201
+2.0, 0.152567975830816   , 0.0205194201655138  , 201
+2.5, 0.0608433734939759  , 0.00596517145671054 , 201
+3.0, 0.0129470580694783  , 0.00123860830178510 , 201
+3.5, 0.00181828001512233 , 0.000157819883647122, 201
+4.0, 0.000220000000000000, 1.38446077628694e-05, 201
+4.5, 2.00000000000000e-05, 2.09803921568627e-06, 58
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
Author	SHA1	Message	Date
Andreas Tsouchlos	2670cac40b	Add first review responses	2024-06-13 17:42:42 +02:00
Andreas Tsouchlos	adb7321b93	Add changes made before submission for review	2024-06-13 17:41:59 +02:00
Andreas Tsouchlos	7211d63889	Add matlab BP simulation	2024-06-13 17:35:48 +02:00