\chapter{Proximal Decoding}%
\label{chapter:proximal_decoding}

In this chapter, the proximal decoding algorithm is examined.
First, the algorithm itself is described.
Then, some noteworthy aspects of the implementation are presented.
Simulation results are shown, on the basis of which the behaviour of the
algorithm is investigated for different codes and parameters.
Finally, an improvement on proximal decoding is proposed.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Decoding Algorithm}%
\label{sec:prox:Decoding Algorithm}

Proximal decoding was proposed by Wadayama et al. as a novel formulation of
optimization-based decoding \cite{proximal_paper}.
With this algorithm, minimization is performed using the proximal gradient
method.
In contrast to \ac{LP} decoding, the objective function is based on a
non-convex optimization formulation of the \ac{MAP} decoding problem.

In order to derive the objective function, the authors begin with the
\ac{MAP} decoding rule, expressed as a continuous maximization problem%
\footnote{Expanding the domain to be continuous does not materially change the
meaning of the rule.
The only change is that what previously were \acp{PMF} now have to be expressed
in terms of \acp{PDF}.}
over $\boldsymbol{x}$:%
%
\begin{align}
\hat{\boldsymbol{x}} = \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
f_{\tilde{\boldsymbol{X}} \mid \boldsymbol{Y}}
\left( \tilde{\boldsymbol{x}} \mid \boldsymbol{y} \right)
= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}} f_{\boldsymbol{Y}
\mid \tilde{\boldsymbol{X}}}
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)%
\label{eq:prox:vanilla_MAP}
.\end{align}%
%
The likelihood $f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) $ is a known function
determined by the channel model.
The prior \ac{PDF} $f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$ is also
known, as all codewords in $\mathcal{C}$ are assumed to be equally probable.
However, since the considered domain is continuous, the prior \ac{PDF} cannot
be ignored as a constant during the minimization, as is often done, and it has
a rather unwieldy representation:%
%
\begin{align}
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right) =
\frac{1}{\left| \mathcal{C} \right| }
\sum_{\boldsymbol{c} \in \mathcal{C} }
\delta\big( \tilde{\boldsymbol{x}} - \left( -1 \right) ^{\boldsymbol{c}}\big)
\label{eq:prox:prior_pdf}
.\end{align}%
%
In order to rewrite the prior \ac{PDF}
$f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$,
the so-called \textit{code-constraint polynomial} is introduced as:%
%
\begin{align*}
h\left( \tilde{\boldsymbol{x}} \right) =
\underbrace{\sum_{i=1}^{n} \left( \tilde{x}_i^2-1 \right) ^2}_{\text{Bipolar constraint}}
+ \underbrace{\sum_{j=1}^{m} \left[
\left( \prod_{i\in N_c \left( j \right) } \tilde{x}_i \right)
-1 \right] ^2}_{\text{Parity constraint}}%
.\end{align*}%
%
The intention of this function is to provide a way to penalize vectors far
from a codeword and favor those close to one.
To achieve this, the polynomial is composed of two parts: one term
representing the bipolar constraint, driving the continuous optimization
problem toward a discrete solution, and one term representing the parity
constraints, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
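To make the role of the two terms concrete, the polynomial can be evaluated numerically: it vanishes exactly at the bipolar images of codewords and is positive everywhere else. A minimal NumPy sketch (illustrative only, assuming a small, dense 0/1 parity-check matrix; this is not the implementation used in this work):

```python
import numpy as np

def h(x, H):
    """Code-constraint polynomial for a bipolar vector x and a 0/1 parity-check matrix H."""
    bipolar = np.sum((x**2 - 1)**2)            # pushes components toward +/-1
    p = np.where(H == 1, x, 1.0).prod(axis=1)  # one product per parity check
    parity = np.sum((p - 1)**2)                # satisfied checks contribute 0
    return bipolar + parity

# Toy parity-check matrix H = [1 1 0; 0 1 1]:
H = np.array([[1, 1, 0], [0, 1, 1]])
print(h(np.array([1.0, 1.0, 1.0]), H))   # codeword (0,0,0) in bipolar form: h = 0
print(h(np.array([-1.0, 1.0, 1.0]), H))  # non-codeword (1,0,0): h = 4
```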
The prior \ac{PDF} is then approximated using the code-constraint polynomial as:%
%
\begin{align}
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)
\approx \frac{1}{Z}\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) }%
\label{eq:prox:prior_pdf_approx}
.\end{align}%
%
The authors justify this approximation by arguing that for
$\gamma \rightarrow \infty$, the approximation in equation
(\ref{eq:prox:prior_pdf_approx}) approaches the original function in equation
(\ref{eq:prox:prior_pdf}).
This approximation can then be plugged into equation (\ref{eq:prox:vanilla_MAP})
and the likelihood can be rewritten using the negative log-likelihood
$L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) = -\ln\left(
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}\left(
\boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) \right) $:%
%
\begin{align*}
\hat{\boldsymbol{x}} &= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
\mathrm{e}^{- L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) }
\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) } \\
&= \argmin_{\tilde{\boldsymbol{x}} \in \mathbb{R}^n} \big(
L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)
\big)%
.\end{align*}%
%
Thus, with proximal decoding, the objective function
$g\left( \tilde{\boldsymbol{x}} \right)$ considered is%
%
\begin{align}
g\left( \tilde{\boldsymbol{x}} \right) = L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}}
\right)
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)%
\label{eq:prox:objective_function}
\end{align}%
%
and the decoding problem is reformulated to%
%
\begin{align*}
\text{minimize}\hspace{2mm} &L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)\\
\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{x}} \in \mathbb{R}^n
.\end{align*}
%

For the solution of the approximate \ac{MAP} decoding problem, the two parts
of equation (\ref{eq:prox:objective_function}) are considered separately:
the minimization of the objective function occurs in an alternating
fashion, switching between the negative log-likelihood
$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
Two helper variables, $\boldsymbol{r}$ and $\boldsymbol{s}$, are introduced,
describing the result of each of the two steps.
The first step, minimizing the negative log-likelihood, is performed using
gradient descent:%
%
\begin{align}
\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \nabla
L\left( \boldsymbol{y} \mid \boldsymbol{s} \right),
\hspace{5mm}\omega > 0
\label{eq:prox:step_log_likelihood}
.\end{align}%
%
For the second step, minimizing the scaled code-constraint polynomial, the
proximal gradient method is used and the \textit{proximal operator} of
$\gamma h\left( \tilde{\boldsymbol{x}} \right) $ has to be computed.
It is then immediately approximated with a gradient-descent step:%
%
\begin{align*}
\textbf{prox}_{\gamma h} \left( \tilde{\boldsymbol{x}} \right) &\equiv
\argmin_{\boldsymbol{t} \in \mathbb{R}^n}
\left( \gamma h\left( \boldsymbol{t} \right) +
\frac{1}{2} \lVert \boldsymbol{t} - \tilde{\boldsymbol{x}} \rVert^2 \right)\\
&\approx \tilde{\boldsymbol{x}} - \gamma \nabla h \left( \tilde{\boldsymbol{x}} \right),
\hspace{5mm} \gamma > 0, \text{ small}
.\end{align*}%
%
The second step thus becomes%
%
\begin{align*}
\boldsymbol{s} \leftarrow \boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right),
\hspace{5mm}\gamma > 0,\text{ small}
.\end{align*}
%
While the approximation of the prior \ac{PDF} made in equation (\ref{eq:prox:prior_pdf_approx})
theoretically becomes better with larger $\gamma$, the constraint that
$\gamma$ be small is important, as it keeps the effect of
$h\left( \tilde{\boldsymbol{x}} \right) $ on the landscape of the objective
function small.
Otherwise, unwanted stationary points, including local minima, are introduced.
The authors note that ``in practice, the value of $\gamma$ should be adjusted
according to the decoding performance'' \cite[Sec. 3.1]{proximal_paper}.

In the case of \ac{AWGN}, the likelihood
$f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)$
is%
%
\begin{align*}
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
= \frac{1}{\left( 2\pi\sigma^2 \right)^{n/2}}\mathrm{e}^{
-\frac{\lVert \boldsymbol{y}-\tilde{\boldsymbol{x}}
\rVert^2 }
{2\sigma^2}}
.\end{align*}
%
Thus, the gradient of the negative log-likelihood becomes%
\footnote{For the minimization, constants can be disregarded. For this reason,
it suffices to consider only proportionality instead of equality.}%
%
\begin{align*}
\nabla L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
&\propto \nabla \lVert \boldsymbol{y} - \tilde{\boldsymbol{x}} \rVert^2\\
&\propto \tilde{\boldsymbol{x}} - \boldsymbol{y}
,\end{align*}%
%
allowing equation (\ref{eq:prox:step_log_likelihood}) to be rewritten as%
%
\begin{align*}
\boldsymbol{r} \leftarrow \boldsymbol{s}
- \omega \left( \boldsymbol{s} - \boldsymbol{y} \right)
.\end{align*}
%

One thing to consider during the actual decoding process is that the gradient
of the code-constraint polynomial can take on extremely large values.
To avoid numerical instability, an additional step is added, in which all
components of the current estimate are clipped to $\left[-\eta, \eta \right]$,
where $\eta$ is a positive constant slightly larger than one:%
%
\begin{align*}
\boldsymbol{s} \leftarrow \Pi_{\eta} \left( \boldsymbol{r}
- \gamma \nabla h\left( \boldsymbol{r} \right) \right)
,\end{align*}
%
with $\Pi_{\eta}\left( \cdot \right) $ denoting the projection onto
$\left[ -\eta, \eta \right]^n$.

The iterative decoding process resulting from these considerations is shown in
figure \ref{fig:prox:alg}.

\begin{figure}[H]
\centering

\begin{genericAlgorithm}[caption={}, label={}]
$\boldsymbol{s} \leftarrow \boldsymbol{0}$
for $K$ iterations do
    $\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \left( \boldsymbol{s} - \boldsymbol{y} \right) $
    $\boldsymbol{s} \leftarrow \Pi_\eta \left(\boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right) \right)$
    $\hat{\boldsymbol{c}} \leftarrow \text{sign}\left( \boldsymbol{s} \right) $
    if $\boldsymbol{H}\hat{\boldsymbol{c}} = \boldsymbol{0}$ do
        return $\hat{\boldsymbol{c}}$
    end if
end for
return $\hat{\boldsymbol{c}}$
\end{genericAlgorithm}

\caption{Proximal decoding algorithm for an \ac{AWGN} channel}
\label{fig:prox:alg}
\end{figure}
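The loop of figure \ref{fig:prox:alg} can be sketched as a small, self-contained Python toy (illustrative only, not the implementation used for the simulations; a dense 0/1 parity-check matrix and the bipolar-to-binary mapping $\hat{c}_i = (1 - \hat{x}_i)/2$ are assumed):

```python
import numpy as np

def grad_h(x, H):
    """Gradient of the code-constraint polynomial (dense toy version).
    Assumes no component of x is exactly zero, because of the 2/x term."""
    p = np.where(H == 1, x, 1.0).prod(axis=1)  # one product per parity check
    v = p**2 - p
    return 4.0 * (x**3 - x) + 2.0 / x * (H.T @ v)

def proximal_decode(y, H, gamma=0.05, omega=0.05, eta=1.2, K=100):
    """Proximal decoding loop for an AWGN channel (toy sketch)."""
    s = np.zeros_like(y)
    c_hat = np.zeros(len(y), dtype=int)
    for _ in range(K):
        r = s - omega * (s - y)                           # gradient step on the NLL
        s = np.clip(r - gamma * grad_h(r, H), -eta, eta)  # prox step + projection
        c_hat = (np.sign(s) < 0).astype(int)              # bipolar -> binary hard decision
        if not (H @ c_hat % 2).any():                     # all parity checks satisfied
            break
    return c_hat
```

For the toy parity-check matrix H = [[1, 1, 0], [0, 1, 1]], a received vector close to the all-plus-one point decodes to the all-zero codeword within a few iterations.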


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Implementation Details}%
\label{sec:prox:Implementation Details}

The algorithm was first implemented in Python because of the fast development
cycle and straightforward debugging it allows.
It was subsequently reimplemented in C++ using the Eigen%
\footnote{\url{https://eigen.tuxfamily.org}}
linear algebra library to achieve higher performance.
The focus was set on a fast implementation, sometimes at the expense of
memory usage.
The evaluation of the simulation results was realized entirely in Python.

The gradient of the code-constraint polynomial \cite[Sec. 2.3]{proximal_paper}
is given by%
%
\begin{align*}
\nabla h\left( \boldsymbol{x} \right) &= \begin{bmatrix}
\frac{\partial}{\partial x_1}h\left( \boldsymbol{x} \right) &
\ldots &
\frac{\partial}{\partial x_n}h\left( \boldsymbol{x} \right)
\end{bmatrix}^\text{T}, \\[1em]
\frac{\partial}{\partial x_k}h\left( \boldsymbol{x} \right) &= 4\left( x_k^2 - 1 \right) x_k
+ \frac{2}{x_k} \sum_{j\in N_v\left( k \right) }\left(
\left( \prod_{i \in N_c\left( j \right)} x_i \right)^2
- \prod_{i\in N_c\left( j \right) } x_i \right)
.\end{align*}
%
Since the products
$\prod_{i\in N_c\left( j \right) } x_i,\hspace{2mm}j\in \mathcal{J}$
are the same for all components $x_k$ of $\boldsymbol{x}$, they can be
precomputed.
Defining%
%
\begin{align*}
\boldsymbol{p} := \begin{bmatrix}
\prod_{i\in N_c\left( 1 \right) }x_i \\
\vdots \\
\prod_{i\in N_c\left( m \right) }x_i \\
\end{bmatrix}
\hspace{5mm}
\text{and}
\hspace{5mm}
\boldsymbol{v} := \boldsymbol{p}^{\circ 2} - \boldsymbol{p}
,\end{align*}
%
the gradient can be written as%
%
\begin{align*}
\nabla h\left( \boldsymbol{x} \right) =
4\left( \boldsymbol{x}^{\circ 3} - \boldsymbol{x} \right)
+ 2\boldsymbol{x}^{\circ -1} \circ \boldsymbol{H}^\text{T}
\boldsymbol{v}
,\end{align*}
%
enabling the computation of the gradient primarily through element-wise
operations and a matrix-vector multiplication.
This is beneficial, as the libraries used for the implementation are
heavily optimized for such calculations (e.g., through vectorization of the
operations).
\todo{Note about how the equation with which the gradient is calculated is
itself similar to a message-passing rule}
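Since the vectorized form must agree with the component-wise partial derivatives, the two can be checked against each other numerically. A NumPy sketch (illustrative only; a small, dense 0/1 parity-check matrix is assumed, whereas the actual implementation uses Eigen):

```python
import numpy as np

def grad_h_vectorized(x, H):
    # p_j = product over the support of check j; v = p^(o2) - p
    p = np.where(H == 1, x, 1.0).prod(axis=1)
    v = p**2 - p
    return 4.0 * (x**3 - x) + 2.0 / x * (H.T @ v)

def grad_h_componentwise(x, H):
    # direct transcription of the partial-derivative formula
    g = np.zeros_like(x)
    for k in range(len(x)):
        acc = 0.0
        for j in np.flatnonzero(H[:, k]):                 # j in N_v(k)
            prod = x[np.flatnonzero(H[j, :])].prod()      # product over i in N_c(j)
            acc += prod**2 - prod
        g[k] = 4.0 * (x[k]**2 - 1.0) * x[k] + 2.0 / x[k] * acc
    return g

H = np.array([[1, 1, 0], [0, 1, 1]])
x = np.array([0.5, -0.5, 1.0])
print(np.allclose(grad_h_vectorized(x, H), grad_h_componentwise(x, H)))  # True
```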
The projection $\Pi_{\eta}\left( \cdot \right)$ also proves straightforward to
compute, as it amounts to simply clipping each component of the vector to
$[-\eta, \eta]$ individually.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Simulation Results}%
\label{sec:prox:Simulation Results}

All simulation results presented hereafter are based on Monte Carlo
simulations.
The \ac{BER} and \ac{FER} curves in particular have been generated by
producing at least 100 frame errors for each data point, unless otherwise
stated.
\todo{Mention number of datapoints from which each graph was created for
non ber and fer curves}

Figure \ref{fig:prox:results} shows a comparison of the decoding performance
of the proximal decoding algorithm as presented by Wadayama et al. in
\cite{proximal_paper} and the implementation realized for this work.

\begin{figure}[H]
\centering

\begin{tikzpicture}
\begin{axis}[grid=both, grid style={line width=.1pt},
xlabel={$E_b / N_0$ (dB)}, ylabel={BER},
ymode=log,
legend style={at={(0.5,-0.55)},anchor=south},
width=0.75\textwidth,
height=0.5625\textwidth,
ymax=1.2, ymin=0.8e-4,
xtick={1, 2, ..., 5},
xmin=0.9, xmax=5.6,
legend columns=2,]

\addplot [ForestGreen, mark=*, line width=1pt]
table [x=SNR, y=gamma_0_15, col sep=comma] {res/proximal/ber_paper.csv};
\addlegendentry{$\gamma = 0.15$ (Wadayama et al.)}
\addplot [ForestGreen, mark=triangle, dashed, line width=1pt]
table [x=SNR, y=BER, col sep=comma,
discard if not={gamma}{0.15},
discard if gt={SNR}{5.5},]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addlegendentry{$\gamma = 0.15$ (Own results)}

\addplot [NavyBlue, mark=*, line width=1pt]
table [x=SNR, y=gamma_0_01, col sep=comma] {res/proximal/ber_paper.csv};
\addlegendentry{$\gamma = 0.01$ (Wadayama et al.)}
\addplot [NavyBlue, mark=triangle, dashed, line width=1pt]
table [x=SNR, y=BER, col sep=comma,
discard if not={gamma}{0.01},
discard if gt={SNR}{5.5},]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addlegendentry{$\gamma = 0.01$ (Own results)}

\addplot [RedOrange, mark=*, line width=1pt]
table [x=SNR, y=gamma_0_05, col sep=comma] {res/proximal/ber_paper.csv};
\addlegendentry{$\gamma = 0.05$ (Wadayama et al.)}
\addplot [RedOrange, mark=triangle, dashed, line width=1pt]
table [x=SNR, y=BER, col sep=comma,
discard if not={gamma}{0.05},
discard if gt={SNR}{5.5},]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addlegendentry{$\gamma = 0.05$ (Own results)}

\addplot [RoyalPurple, mark=*, line width=1pt]
table [x=SNR, y=BP, col sep=comma] {res/proximal/ber_paper.csv};
\addlegendentry{BP (Wadayama et al.)}
\end{axis}
\end{tikzpicture}

\caption{Simulation results\protect\footnotemark{} for $\omega = 0.05$, $K=100$}
\label{fig:prox:results}
\end{figure}
%
\footnotetext{$\left( 3, 6 \right)$-regular LDPC code with $n = 204$, $k = 102$ \cite[\text{204.33.484}]{mackay_enc}}%
%

Looking at the graph in figure \ref{fig:prox:results}, one might notice that for
a moderately chosen value of $\gamma$ ($\gamma = 0.05$) the decoding
performance is better than for low ($\gamma = 0.01$) or high
($\gamma = 0.15$) values.
The question arises whether there is some optimal value maximizing the decoding
performance, especially since the decoding performance seems to depend
dramatically on $\gamma$.
To better understand how $\gamma$ and the decoding performance are
related, figure \ref{fig:prox:results} was recreated, but with a considerably
larger selection of values for $\gamma$ (figure \ref{fig:prox:results_3d}).%
%
\begin{figure}[H]
\centering

\begin{tikzpicture}

\begin{axis}[view={75}{30},
zmode=log,
xlabel={$E_b / N_0$ (dB)},
ylabel={$\gamma$},
zlabel={BER},
legend pos=outer north east,
%legend style={at={(0.5,-0.55)},anchor=south},
ytick={0, 0.05, 0.1, 0.15},
width=0.6\textwidth,
height=0.45\textwidth,]

\addplot3[surf,
mesh/rows=17, mesh/cols=14,
colormap/viridis] table [col sep=comma,
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addlegendentry{$\gamma = \left[ 0\text{:}0.01\text{:}0.16 \right] $}
\addplot3[NavyBlue, line width=1.5] table [col sep=comma,
discard if not={gamma}{0.01},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addlegendentry{$\gamma = 0.01$}
\addplot3[RedOrange, line width=1.5] table [col sep=comma,
discard if not={gamma}{0.05},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addlegendentry{$\gamma = 0.05$}
\addplot3[ForestGreen, line width=1.5] table [col sep=comma,
discard if not={gamma}{0.15},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addlegendentry{$\gamma = 0.15$}
\end{axis}
\end{tikzpicture}

\caption{BER\protect\footnotemark{} for $\omega = 0.05$, $K=100$}
\label{fig:prox:results_3d}
\end{figure}%
%
\footnotetext{$\left( 3, 6 \right)$-regular LDPC code with $n = 204$, $k = 102$ \cite[\text{204.33.484}]{mackay_enc}}%
%
\noindent Evidently, while the performance does depend on the value of
$\gamma$, there is no single value offering optimal performance, but
rather a certain interval in which the performance stays largely the same.
When examining a number of different codes (figure
\ref{fig:prox:results_3d_multiple}), it is apparent that while the exact
landscape of the graph depends on the code, the general behaviour is the same
in each case.

\begin{figure}[H]
\centering

\begin{subfigure}[c]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[view={75}{30},
zmode=log,
xlabel={$E_b / N_0$ (dB)},
ylabel={$\gamma$},
zlabel={BER},
width=\textwidth,
height=0.75\textwidth,]
\addplot3[surf,
mesh/rows=17, mesh/cols=10,
colormap/viridis] table [col sep=comma,
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_963965.csv};
\addplot3[RedOrange, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.05},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_963965.csv};
\addplot3[NavyBlue, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.01},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_963965.csv};
\addplot3[ForestGreen, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.15},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_963965.csv};
\end{axis}
\end{tikzpicture}
\caption{$\left( 3, 6 \right)$-regular LDPC code with $n=96, k=48$
\cite[\text{96.3.965}]{mackay_enc}}
\end{subfigure}%
\hfill
\begin{subfigure}[c]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[view={75}{30},
zmode=log,
xlabel={$E_b / N_0$ (dB)},
ylabel={$\gamma$},
zlabel={BER},
width=\textwidth,
height=0.75\textwidth,]
\addplot3[surf,
mesh/rows=17, mesh/cols=10,
colormap/viridis] table [col sep=comma,
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_bch_31_26.csv};
\addplot3[RedOrange, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.05},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_bch_31_26.csv};
\addplot3[NavyBlue, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.01},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_bch_31_26.csv};
\addplot3[ForestGreen, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.15},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_bch_31_26.csv};
\end{axis}
\end{tikzpicture}
\caption{BCH code with $n=31, k=26$\\[2\baselineskip]}
\end{subfigure}

\begin{subfigure}[c]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[view={75}{30},
zmode=log,
xlabel={$E_b/N_0$ (dB)},
ylabel={$\gamma$},
zlabel={BER},
width=\textwidth,
height=0.75\textwidth,]
\addplot3[surf,
mesh/rows=17, mesh/cols=14,
colormap/viridis] table [col sep=comma,
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addplot3[RedOrange, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.05},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addplot3[NavyBlue, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.01},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addplot3[ForestGreen, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.15},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\end{axis}
\end{tikzpicture}
\caption{$\left( 3, 6 \right)$-regular LDPC code with $n=204, k=102$
\cite[\text{204.33.484}]{mackay_enc}}
\end{subfigure}%
\hfill
\begin{subfigure}[c]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[view={75}{30},
zmode=log,
xlabel={$E_b / N_0$ (dB)},
ylabel={$\gamma$},
zlabel={BER},
width=\textwidth,
height=0.75\textwidth,]
\addplot3[surf,
mesh/rows=17, mesh/cols=10,
colormap/viridis] table [col sep=comma,
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20455187.csv};
\addplot3[RedOrange, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.05},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20455187.csv};
\addplot3[NavyBlue, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.01},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20455187.csv};
\addplot3[ForestGreen, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.15},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_20455187.csv};
\end{axis}
\end{tikzpicture}
\caption{$\left( 5, 10 \right)$-regular LDPC code with $n=204, k=102$
\cite[\text{204.55.187}]{mackay_enc}}
\end{subfigure}%

\begin{subfigure}[c]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[view={75}{30},
zmode=log,
xlabel={$E_b / N_0$ (dB)},
ylabel={$\gamma$},
zlabel={BER},
width=\textwidth,
height=0.75\textwidth,]
\addplot3[surf,
mesh/rows=17, mesh/cols=10,
colormap/viridis] table [col sep=comma,
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_40833844.csv};
\addplot3[RedOrange, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.05},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_40833844.csv};
\addplot3[NavyBlue, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.01},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_40833844.csv};
\addplot3[ForestGreen, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.15},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_40833844.csv};
\end{axis}
\end{tikzpicture}
\caption{$\left( 3, 6 \right)$-regular LDPC code with $n=408, k=204$
\cite[\text{408.33.844}]{mackay_enc}}
\end{subfigure}%
\hfill
\begin{subfigure}[c]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[view={75}{30},
zmode=log,
xlabel={$E_b / N_0$ (dB)},
ylabel={$\gamma$},
zlabel={BER},
width=\textwidth,
height=0.75\textwidth,]
\addplot3[surf,
mesh/rows=17, mesh/cols=10,
colormap/viridis] table [col sep=comma,
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_pegreg252x504.csv};
\addplot3[RedOrange, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.05},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_pegreg252x504.csv};
\addplot3[NavyBlue, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.01},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_pegreg252x504.csv};
\addplot3[ForestGreen, line width=1.5] table[col sep=comma,
discard if not={gamma}{0.15},
x=SNR, y=gamma, z=BER]
{res/proximal/2d_ber_fer_dfr_pegreg252x504.csv};
\end{axis}
\end{tikzpicture}
\caption{LDPC code (Progressive Edge Growth Construction) with $n=504, k=252$
\cite[\text{PEGReg252x504}]{mackay_enc}}
\end{subfigure}%

\vspace{1cm}

\begin{subfigure}[c]{\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[hide axis,
xmin=10, xmax=50,
ymin=0, ymax=0.4,
legend style={draw=white!15!black,legend cell align=left}]
\addlegendimage{surf, colormap/viridis}
\addlegendentry{$\gamma = \left[ 0\text{ : }0.01\text{ : }0.16 \right] $};
\addlegendimage{NavyBlue, line width=1.5pt}
\addlegendentry{$\gamma = 0.01$};
\addlegendimage{RedOrange, line width=1.5pt}
\addlegendentry{$\gamma = 0.05$};
\addlegendimage{ForestGreen, line width=1.5pt}
\addlegendentry{$\gamma = 0.15$};
\end{axis}
\end{tikzpicture}
\end{subfigure}

\caption{BER for $\omega = 0.05$, $K=100$ (different codes)}
\label{fig:prox:results_3d_multiple}
\end{figure}

A similar analysis was performed to determine suitable values for the other
parameters, $\omega$, $K$, and $\eta$.

TODO

Until now, only the \ac{BER} has been considered to assess the decoding
performance.
The \ac{FER}, however, shows considerably worse performance, as can be seen in
figure \ref{TODO}.
One possible explanation might be found in the structure of the proximal
decoding algorithm \ref{TODO} itself.
As it comprises two separate steps, one responsible for addressing the
likelihood and one for addressing the constraints imposed by the parity-check
matrix, the algorithm could tend to gravitate toward the correct codeword
but then get stuck in a local minimum introduced by the code-constraint
polynomial.
This would yield fewer bit errors, while still producing a frame error.
This line of thought will be picked up in section
\ref{sec:prox:Improved Implementation} to try to improve the algorithm.

\begin{itemize}
\item Introduction
    \begin{itemize}
    \item asdf
    \item ghjk
    \end{itemize}
\item Reconstruction of results from paper
    \begin{itemize}
    \item asdf
    \item ghjk
    \end{itemize}
\item Choice of parameters, in particular gamma
    \begin{itemize}
    \item Introduction (``Looking at these results, the question arises \ldots'')
    \item Different gammas simulated for same code as in paper
    \item
    \end{itemize}
\item The FER problem
    \begin{itemize}
    \item Intro (``\acs{FER} not as good as the \acs{BER} would have one assume'')
    \item Possible explanation
    \end{itemize}
\item Computational performance
    \begin{itemize}
    \item Theoretical analysis
    \item Simulation results to substantiate theoretical analysis
    \end{itemize}
\item Conclusion
    \begin{itemize}
    \item Choice of $\gamma$ code-dependent but decoding performance largely unaffected
    by small variations
    \item Number of iterations independent of \ac{SNR}
    \item $\mathcal{O}\left( n \right)$ time complexity, implementation heavily
    optimizable
    \end{itemize}
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Improved Implementation}%
\label{sec:prox:Improved Implementation}

\begin{itemize}
\item Improvement using ``ML-on-List''
    \begin{itemize}
    \item Attach to FER problem
    \item
    \end{itemize}
\item Decoding performance and comparison with standard proximal decoding
    \begin{itemize}
    \item asdf
    \item ghjk
    \end{itemize}
\item Computational performance and comparison with standard proximal decoding
    \begin{itemize}
    \item asdf
    \item ghjk
    \end{itemize}
\item Conclusion
    \begin{itemize}
    \item Summary
    \item Up to $\SI{1}{dB}$ gain possible
    \end{itemize}
\end{itemize}