Minor wording changes. Moved references to codes to captions

Andreas Tsouchlos 2023-04-23 13:29:33 +02:00
parent 3ba87d5558
commit 5c2ffb4aa5
4 changed files with 52 additions and 91 deletions


@@ -45,15 +45,6 @@
% eprint = {https://doi.org/10.1080/24725854.2018.1550692}
}
@online{mackay_enc,
author = {MacKay, David J.C.},
title = {Encyclopedia of Sparse Graph Codes},
date = {2023-01},
url = {http://www.inference.org.uk/mackay/codes/data.html}
}
@article{proximal_algorithms,
author = {Parikh, Neal and Boyd, Stephen},
title = {Proximal Algorithms},
@@ -221,9 +212,9 @@
@online{lautern_channelcodes,
author = "Helmling, Michael and Scholl, Stefan and Gensheimer, Florian and Dietz, Tobias and Kraft, Kira and Ruzika, Stefan and Wehn, Norbert",
title = "{D}atabase of {C}hannel {C}odes and {ML} {S}imulation {R}esults",
url = {https://www.uni-kl.de/channel-codes},
date = {2023-04}
title = "{D}atabase of {C}hannel {C}odes and {ML} {S}imulation {R}esults",
url={https://www.uni-kl.de/channel-codes},
date = {2023-04}
}
@online{mackay_enc,


@@ -3,14 +3,9 @@
In this chapter, proximal decoding and \ac{LP} decoding using \ac{ADMM} are compared.
First, the two algorithms are studied on a theoretical basis.
Subsequently, their respective simulation results are examined, and their
Subsequently, their respective simulation results are examined and their
differences are interpreted based on their theoretical structure.
%some similarities between the proximal decoding algorithm
%and \ac{LP} decoding using \ac{ADMM} are be pointed out.
%The two algorithms are compared and their different computational and decoding
%performance is interpreted on the basis of their theoretical structure.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Theoretical Comparison}%
@@ -18,12 +13,11 @@ differences are interpreted based on their theoretical structure.
\ac{ADMM} and the proximal gradient method can both be expressed in terms of
proximal operators \cite[Sec. 4.4]{proximal_algorithms}.
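For reference, the proximal operator of a (closed, proper, convex) function $f$ is defined as in \cite{proximal_algorithms} by
\begin{equation*}
	\operatorname{prox}_{\lambda f}\left( \boldsymbol{v} \right) =
	\operatorname*{arg\,min}_{\boldsymbol{x}} \left( f\left( \boldsymbol{x} \right)
	+ \frac{1}{2\lambda} \left\lVert \boldsymbol{x} - \boldsymbol{v} \right\rVert_2^2 \right) ,
\end{equation*}
where $\lambda > 0$ is a generic step-size symbol used only in this reminder and is not part of the notation of this chapter.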
When using \ac{ADMM} as an optimization method to solve the \ac{LP} decoding
problem specifically, this is not quite possible because of the multiple
constraints. \todo{Elaborate}
In spite of that, the two algorithms still show some striking similarities.
Additionally, the two algorithms show some striking similarities with
regard to their general structure and the way in which the minimization of the
respective objective functions is accomplished.
To see the first of these similarities, the \ac{LP} decoding problem in
The \ac{LP} decoding problem in
equation (\ref{eq:lp:relaxed_formulation}) can be slightly rewritten using the
\textit{indicator functions} $g_j : \mathbb{R}^{d_j} \rightarrow
\left\{ 0, +\infty \right\} \hspace{1mm}, j\in\mathcal{J}$ for the polytopes
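In a notation assumed here only for illustration, with selection matrices $\boldsymbol{T}_j$ picking out the $d_j$ variables that participate in check $j$ and the linear objective of (\ref{eq:lp:relaxed_formulation}) written as $\boldsymbol{\lambda}^{\mathrm{T}} \boldsymbol{x}$, the rewritten problem then takes roughly the form
\begin{equation*}
	\underset{\boldsymbol{x} \in \left[ 0,1 \right]^n}{\text{minimize}} \quad
	\boldsymbol{\lambda}^{\mathrm{T}} \boldsymbol{x}
	+ \sum_{j \in \mathcal{J}} g_j\left( \boldsymbol{T}_j \boldsymbol{x} \right) ,
\end{equation*}
so that every parity-check constraint enters the objective through its indicator function.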
@@ -156,7 +150,7 @@ Additionally, both algorithms can be understood as message-passing algorithms,
\cite[Sec. III. D.]{original_admm} and
\cite[Sec. II. B.]{efficient_lp_dec_admm}, and proximal decoding by starting
with algorithm \ref{alg:prox}, substituting for the gradient of the
code-constraint polynomial and separating it into two parts.
code-constraint polynomial and separating the $\boldsymbol{s}$ update into two parts.
The algorithms in their message-passing form are depicted in figure
\ref{fig:comp:message_passing}.
$M_{j\to i}$ denotes a message transmitted from \ac{CN} $j$ to \ac{VN} $i$.
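As an illustration of this shared structure, the \ac{VN} update of \ac{LP} decoding using \ac{ADMM} can be written in message form, in the style common in the literature, as
\begin{equation*}
	x_i \leftarrow \Pi_{\left[ 0,1 \right]} \left( \frac{1}{d_i} \left(
	\sum_{j \in N_v\left( i \right)} M_{j \to i} - \frac{q_i}{\mu} \right) \right) ,
	\qquad M_{j \to i} = z_{j,i} - u_{j,i} ,
\end{equation*}
where $N_v\left( i \right)$ denotes the checks adjacent to \ac{VN} $i$, $d_i$ its degree, $\Pi_{\left[ 0,1 \right]}$ the projection onto the unit interval, $q_i$ the log-likelihood ratio of bit $i$, $\mu$ the penalty parameter, and $z_{j,i}$ and $u_{j,i}$ the replica and scaled dual variables held by \ac{CN} $j$ for bit $i$.
This notation is assumed here and need not coincide with the exact message definition used in figure \ref{fig:comp:message_passing}.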
@@ -246,12 +240,12 @@ respect to $n$ and are heavily parallelisable.
\section{Comparison of Simulation Results}%
\label{sec:comp:res}
The decoding performance of the two algorithms is shown in figure
The decoding performance of the two algorithms is compared in figure
\ref{fig:comp:prox_admm_dec} in the form of the \ac{FER}.
Shown as well is the performance of the improved proximal decoding
algorithm presented in section \ref{sec:prox:Improved Implementation}.
Additionally, the \ac{FER} resulting from decoding using \ac{BP} and,
wherever available, the \ac{ML} decoding \ac{FER} taken from
The \ac{FER} resulting from decoding using \ac{BP} and,
wherever available, the \ac{FER} of \ac{ML} decoding taken from
\cite{lautern_channelcodes} are plotted as a reference.
The parameters chosen for the proximal and improved proximal decoders are
$\gamma=0.05$, $\omega=0.05$, $K=200$, $\eta = 1.5$ and $N=12$.
@@ -266,15 +260,15 @@ codes with larger $n$ and reaching values of up to $\SI{2}{dB}$.
These simulation results can be interpreted with regard to the theoretical
structure of the decoding methods, as analyzed in section \ref{sec:comp:theo}.
The worse performance of proximal decoding is somewhat surprising considering
The worse performance of proximal decoding is somewhat surprising, considering
the global treatment of the constraints in contrast to the local treatment
in the case of \ac{LP} decoding using \ac{ADMM}.
It may, however, be explained by the nature of the
calculations performed in each case.
With proximal decoding, the calculations are approximate, leading
to the constraints never being quite satisfied.
With \ac{LP} decoding using \ac{ADMM}
the constraints are fulfilled for each parity check individually, after each
With \ac{LP} decoding using \ac{ADMM},
the constraints are fulfilled for each parity check individually after each
iteration of the decoding process.
The timing requirements of the decoding algorithms are visualized in figure
@@ -282,12 +276,14 @@ The timing requirements of the decoding algorithms are visualized in figure
The datapoints have been generated by evaluating the metadata from \ac{FER}
and \ac{BER} simulations using the parameters mentioned earlier when
discussing the decoding performance.
The codes considered are the same as in sections \ref{subsec:prox:comp_perf}
and \ref{subsec:admm:comp_perf}.
While the \ac{ADMM} implementation seems to be faster than the proximal
decoding and improved proximal decoding implementations, inferring some
general behavior is difficult in this case.
This is because actual implementations are being compared, making the
results dependent on factors such as the degree of optimization of each
implementation.
results dependent on factors such as the degree of optimization of each of the
implementations.
Nevertheless, the run times of the proximal decoding and the \ac{LP}
decoding using \ac{ADMM} implementations are similar, and both are
reasonably performant, owing to the parallelisable structure of the


@@ -1327,8 +1327,8 @@ In figure \ref{fig:admm:results}, the simulation results for the ``Margulis''
conducted in the context of this thesis.
The parameters chosen were $\mu=3.3$, $\rho=1.9$, $K=1000$,
$\epsilon_\text{pri}=10^{-5}$ and $\epsilon_\text{dual}=10^{-5}$,
the same as in \cite{original_admm};
the two \ac{FER} curves are practically identical.
the same as in \cite{original_admm}.
The two \ac{FER} curves are practically identical.
Also shown is the curve resulting from \ac{BP} decoding with
1000 iterations.
The two algorithms perform relatively similarly, coming within $\SI{0.5}{dB}$
@@ -1509,8 +1509,8 @@ Simulation results from a range of different codes can be used to verify this
analysis.
Figure \ref{fig:admm:time} shows the average time needed to decode one
frame as a function of its length.
\todo{List codes used}
The results are necessarily skewed because the codes considered vary not only
The codes used here are the same as in section \ref{subsec:prox:comp_perf}.
The results are necessarily skewed because these codes vary not only
in their length, but also in their construction scheme and rate.
Additionally, different optimization opportunities arise depending on the
length of a code, since for smaller codes dynamic memory allocation can be


@@ -350,7 +350,7 @@ $\gamma$ are shown, as well as the curve resulting from decoding
using a \ac{BP} decoder, as a reference.
The results from Wadayama et al. are shown with solid lines,
while the newly generated ones are shown with dashed lines.
%
\begin{figure}[h]
\centering
@@ -405,16 +405,12 @@ while the newly generated ones are shown with dashed lines.
\end{axis}
\end{tikzpicture}
\caption{Comparison of datapoints from Wadayama et al. with own simulation results%
\protect\footnotemark{}}
\caption{Comparison of datapoints from Wadayama et al. with own simulation results.
(3,6) regular \ac{LDPC} code with $n=204, k=102$ \cite[\text{204.33.484}]{mackay_enc}}
\label{fig:prox:results}
\end{figure}
%
\footnotetext{(3,6) regular \ac{LDPC} code with $n = 204$, $k = 102$
\cite[\text{204.33.484}]{mackay_enc}; $\omega = 0.05, K=200, \eta=1.5$
}%
%
\noindent It is noticeable that for a moderately chosen value of $\gamma$
It is noticeable that for a moderately chosen value of $\gamma$
($\gamma = 0.05$) the decoding performance is better than for low
($\gamma = 0.01$) or high ($\gamma = 0.15$) values.
The question arises whether there is some optimal value maximizing the decoding
@@ -472,14 +468,11 @@ rather a certain interval in which it stays largely unchanged.
\end{tikzpicture}
\caption{Visualization of the relationship between the decoding performance
\protect\footnotemark{} and the parameter $\gamma$}
and the parameter $\gamma$. (3,6) regular \ac{LDPC} code with
$n=204, k=102$ \cite[\text{204.33.484}]{mackay_enc}}
\label{fig:prox:results_3d}
\end{figure}%
%
\footnotetext{(3,6) regular \ac{LDPC} code with n = 204, k = 102
\cite[\text{204.33.484}]{mackay_enc}; $\omega = 0.05, K=200, \eta=1.5$
}%
%
This indicates that while the choice of the parameter $\gamma$
significantly affects the decoding performance, there is little benefit
in undertaking an extensive search for an exact optimum.
@@ -541,15 +534,11 @@ depicted in figure \ref{fig:prox:gamma_omega_multiple}.
\end{axis}
\end{tikzpicture}
\caption{The \ac{BER} as a function of the two step sizes\protect\footnotemark{}}
\caption{The \ac{BER} as a function of the two step sizes. (3,6) regular
\ac{LDPC} code with $n=204, k=102$ \cite[\text{204.33.484}]{mackay_enc}}
\label{fig:prox:gamma_omega}
\end{figure}%
%
\footnotetext{(3,6) regular \ac{LDPC} code with n = 204, k = 102
\cite[\text{204.33.484}]{mackay_enc}; $E_b / N_0=\SI{4}{dB}, K=100, \eta=1.5$
}%
%
To better understand how to determine the optimal value for the parameter $K$,
the average error is inspected.
@@ -605,13 +594,10 @@ stabilizes.
\end{axis}
\end{tikzpicture}
\caption{Average error for $\SI{500000}{}$ decodings\protect\footnotemark{}}
\caption{Average error for $\SI{500000}{}$ decodings. (3,6) regular \ac{LDPC} code
with $n=204, k=102$ \cite[\text{204.33.484}]{mackay_enc}}
\end{figure}%
%
\footnotetext{(3,6) regular \ac{LDPC} code with n = 204, k = 102
\cite[\text{204.33.484}]{mackay_enc}; $\gamma=0.05, \omega = 0.05, K=200, \eta=1.5$
}%
%
Changing the parameter $\eta$ does not appear to have a significant effect on
the decoding performance when keeping the value within a reasonable window
@@ -705,14 +691,11 @@ means to bring about numerical stability.
\end{axis}
\end{tikzpicture}
\caption{Comparison of \ac{FER}, \ac{BER} and decoding failure rate\protect\footnotemark{}}
\caption{Comparison of \ac{FER}, \ac{BER} and decoding failure rate. (3,6) regular
\ac{LDPC} code with $n=204, k=102$ \cite[\text{204.33.484}]{mackay_enc}}
\label{fig:prox:ber_fer_dfr}
\end{figure}%
%
\footnotetext{(3,6) regular \ac{LDPC} code with n = 204, k = 102
\cite[\text{204.33.484}]{mackay_enc}; $\omega = 0.05, K=100, \eta=1.5$
}%
%
Until now, only the \ac{BER} has been considered to gauge the decoding
performance.
@@ -758,7 +741,7 @@ iteration ($\boldsymbol{r}$ and $\boldsymbol{s}$ are counted as different
estimates and their values are interwoven to obtain the shown result), as well
as the gradients of the negative log-likelihood and the code-constraint
polynomial, which influence the next estimate.
%
\begin{figure}[h]
\begin{subfigure}[t]{0.48\textwidth}
\centering
@@ -964,16 +947,11 @@ polynomial, which influence the next estimate.
\end{tikzpicture}
\end{subfigure}
\caption{Visualization of a single decoding operation\protect\footnotemark{}
for a code with $n=7$}
\caption{Visualization of a single decoding operation. BCH$\left( 7, 4 \right)$ code}
\label{fig:prox:convergence}
\end{figure}%
%
\footnotetext{BCH$\left( 7,4 \right) $ code; $\gamma = 0.05, \omega = 0.05, K=200,
\eta = 1.5, E_b / N_0 = \SI{5}{dB}$
}%
%
\noindent It is evident that in all cases, past a certain number of
It is evident that in all cases, past a certain number of
iterations, the estimate starts to oscillate around a particular value.
At the same time, the two gradients stop getting any closer to
zero.
@@ -1159,16 +1137,11 @@ an invalid codeword.
\end{axis}
\end{tikzpicture}
\caption{Visualization of a single decoding operation\protect\footnotemark{}
for a code with $n=204$}
\caption{Visualization of a single decoding operation. (3,6) regular \ac{LDPC} code
with $n=204, k=102$ \cite[\text{204.33.484}]{mackay_enc}}
\label{fig:prox:convergence_large_n}
\end{figure}%
%
\footnotetext{(3,6) regular \ac{LDPC} code with n = 204, k = 102
\cite[\text{204.33.484}]{mackay_enc}; $\gamma=0.05, \omega = 0.05, K=200, \eta=1.5,
E_b / N_0 = \SI{5}{dB}$
}%
%
\subsection{Computational Performance}
@@ -1194,6 +1167,10 @@ practical since it is the same as that of $\ac{BP}$.
This theoretical analysis is also corroborated by the practical results shown
in figure \ref{fig:prox:time_comp}.
The codes considered are the BCH(31, 11) and BCH(31, 26) codes, a number of (3,6)
regular \ac{LDPC} codes (\cite[\text{96.3.965, 204.33.484, 408.33.844}]{mackay_enc}),
a (5,10) regular \ac{LDPC} code (\cite[\text{204.55.187}]{mackay_enc}) and a
progressive edge growth construction code (\cite[\text{PEGReg252x504}]{mackay_enc}).
Some deviations from linear behavior are unavoidable because not all codes
considered are actually \ac{LDPC} codes, or \ac{LDPC} codes constructed
according to the same scheme.
@@ -1260,7 +1237,7 @@ section \ref{subsec:prox:conv_properties} and shown in figure
\ref{fig:prox:convergence_large_n}) and the probability of having a bit
error are strongly correlated, a relationship depicted in figure
\ref{fig:prox:correlation}.
%
\begin{figure}[h]
\centering
@@ -1283,22 +1260,17 @@ error are strongly correlated, a relationship depicted in figure
\end{tikzpicture}
\caption{Correlation between the occurrence of a bit error and the
amplitude of oscillation of the gradient of the code-constraint polynomial%
\protect\footnotemark{}}
amplitude of oscillation of the gradient of the code-constraint polynomial.
(3,6) regular \ac{LDPC} code with $n=204, k=102$ \cite[\text{204.33.484}]{mackay_enc}}
\label{fig:prox:correlation}
\end{figure}%
%
\footnotetext{(3,6) regular \ac{LDPC} code with n = 204, k = 102
\cite[\text{204.33.484}]{mackay_enc}; $\gamma = 0.05, \omega = 0.05, K=100, \eta=1.5$
}%
%
\noindent The y-axis depicts whether there is a bit error and the x-axis the
The y-axis depicts whether there is a bit error and the x-axis the
variance in $\nabla h\left( \tilde{\boldsymbol{x}} \right)$ after the
100th iteration.
While this is not exactly the magnitude of the oscillation, it is
proportional to it and easier to compute.
The datapoints are taken from a single decoding operation
The datapoints are taken from a single decoding operation.
Using this observation as a rule to determine the $N\in\mathbb{N}$ most
probably wrong bits, all variations of the estimate with those bits modified
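A minimal sketch of how this candidate search could be organized is given below; the function name, the cut-off after 100 iterations, the hard-decision convention and the selection metric are illustrative assumptions rather than the actual implementation of the improved decoder.

import itertools

import numpy as np


def improved_candidate_search(x_est, grad_h_history, H, llr, N=12, skip=100):
    """Hypothetical sketch of the candidate search described above.

    x_est:          final estimate of the proximal decoder, shape (n,)
    grad_h_history: gradient of the code-constraint polynomial per iteration, shape (K, n)
    H:              parity-check matrix over GF(2), shape (m, n)
    llr:            channel log-likelihood ratios, shape (n,)
    """
    # Hard decision on the final estimate (bipolar convention assumed: negative -> 1).
    hard = (x_est < 0).astype(int)

    # Per-bit variance of the code-constraint gradient after the first `skip`
    # iterations, used as a proxy for the amplitude of the oscillation.
    var = np.var(grad_h_history[skip:], axis=0)

    # Indices of the N bits deemed most probably wrong.
    suspects = np.argsort(var)[-N:]

    best, best_metric = hard, np.inf
    # Enumerate all 2^N modifications of the suspect bits.
    for flips in itertools.product((0, 1), repeat=len(suspects)):
        cand = hard.copy()
        cand[suspects] ^= np.array(flips)
        if np.any((H @ cand) % 2):          # keep only valid codewords
            continue
        metric = float(np.sum(llr * cand))  # correlation metric (sign convention assumed)
        if metric < best_metric:
            best, best_metric = cand, metric
    return best

With $N=12$, at most $2^{12}=4096$ candidates would be examined per frame in this sketch.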
@@ -1482,7 +1454,9 @@ In some cases, a gain of up to $\SI{1}{dB}$ or higher can be achieved.
\end{axis}
\end{tikzpicture}
\caption{Simulation results for $\gamma = 0.05, \omega = 0.05, K=200, N=12$}
\caption{Comparison of the decoding performance between proximal decoding and the
improved implementation. (3,6) regular \ac{LDPC} code with $n=204, k=102$
\cite[\text{204.33.484}]{mackay_enc}}
\label{fig:prox:improved_results}
\end{figure}