10 Commits

11 changed files with 1593 additions and 592 deletions

View File

@@ -223,3 +223,12 @@
 date = {2023-04},
 url = {http://www.inference.org.uk/mackay/codes/data.html}
 }
+@article{adam,
+  title={Adam: A method for stochastic optimization},
+  author={Kingma, Diederik P. and Ba, Jimmy},
+  journal={arXiv preprint arXiv:1412.6980},
+  year={2014},
+  doi={10.48550/arXiv.1412.6980}
+}

View File

@@ -0,0 +1,18 @@
+\chapter*{Acknowledgements}
+I would like to thank Prof. Dr.-Ing. Laurent Schmalen for granting me the
+opportunity to write my bachelor's thesis at the Communications Engineering Lab,
+as well as all other members of the institute for their help and many productive
+discussions, and for creating a very pleasant environment to do research in.
+I am very grateful to Dr.-Ing. Holger Jäkel
+for kindly providing me with his knowledge and many suggestions,
+and for his constructive criticism during the preparation of this work.
+Special thanks also to Mai Anh Vu for her invaluable feedback and support
+during the entire undertaking that is this thesis.
+Finally, I would like to thank my family, who have enabled me to pursue my
+studies in a field I thoroughly enjoy and who have supported me completely
+throughout my journey.

View File

@@ -508,7 +508,7 @@ $\gamma \in \left\{ 0.01, 0.05, 0.15 \right\}$.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 legend columns=1,
 legend pos=outer north east,
@@ -549,7 +549,7 @@ $\gamma \in \left\{ 0.01, 0.05, 0.15 \right\}$.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 legend columns=1,
 legend pos=outer north east,
@@ -593,7 +593,7 @@ $\gamma \in \left\{ 0.01, 0.05, 0.15 \right\}$.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 legend columns=1,
 legend pos=outer north east,
@@ -647,7 +647,7 @@ $\gamma \in \left\{ 0.01, 0.05, 0.15 \right\}$.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 legend columns=1,
 legend pos=outer north east,
@@ -692,7 +692,7 @@ $\gamma \in \left\{ 0.01, 0.05, 0.15 \right\}$.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 legend columns=1,
 legend pos=outer north east,
@@ -735,7 +735,7 @@ $\gamma \in \left\{ 0.01, 0.05, 0.15 \right\}$.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 legend columns=1,
 legend pos=outer north east,

View File

@@ -3,7 +3,7 @@
 In this chapter, proximal decoding and \ac{LP} Decoding using \ac{ADMM} are compared.
 First, the two algorithms are studied on a theoretical basis.
-Subsequently, their respective simulation results are examined and their
+Subsequently, their respective simulation results are examined, and their
 differences are interpreted based on their theoretical structure.
@@ -32,13 +32,13 @@ $\mathcal{P}_{d_j}, \hspace{1mm} j\in\mathcal{J}$, defined as%
 %
 by moving the constraints into the objective function, as shown in figure
 \ref{fig:ana:theo_comp_alg:admm}.
-Both algorithms are composed of an iterative approach consisting of two
-alternating steps.
 The objective functions of the two problems are similar in that they
 both comprise two parts: one associated to the likelihood that a given
-codeword was sent, stemming from the channel model, and one associated
-to the constraints the decoding process is subjected to, stemming from the
+codeword was sent, arising from the channel model, and one associated
+to the constraints the decoding process is subjected to, arising from the
 code used.
+Both algorithms are composed of an iterative approach consisting of two
+alternating steps, each minimizing one part of the objective function.
 %
 \begin{figure}[h]
@@ -109,7 +109,7 @@ return $\tilde{\boldsymbol{c}}$
 \end{subfigure}%
-\caption{Comparison of the proximal gradient method and \ac{ADMM}}
+\caption{Comparison of proximal decoding and \ac{LP} decoding using \ac{ADMM}}
 \label{fig:ana:theo_comp_alg}
 \end{figure}%
 %
@@ -139,7 +139,7 @@ This means that additional redundant parity-checks can be added successively
 until the codeword returned is valid and thus the \ac{ML} solution is found
 \cite[Sec. IV.]{alp}.
-In terms of time complexity the two decoding algorithms are comparable.
+In terms of time complexity, the two decoding algorithms are comparable.
 Each of the operations required for proximal decoding can be performed
 in $\mathcal{O}\left( n \right) $ time for \ac{LDPC} codes (see section
 \ref{subsec:prox:comp_perf}).
@@ -172,10 +172,10 @@ while stopping critierion unfulfilled do
 |\vspace{0.22mm}\Reactivatenumber|
 end for
 for i in $\mathcal{I}$ do
-$s_i \leftarrow s_i + \gamma \left[ 4\left( s_i^2 - 1 \right)s_i
-\phantom{\frac{4}{s_i}}\right.$|\Suppressnumber|
-|\Reactivatenumber|$\left.+ \frac{4}{s_i}\sum_{j\in N_v\left( i \right) }
-M_{j\to i} \right] $
+$s_i\leftarrow \Pi_\eta \left( s_i + \gamma \left( 4\left( s_i^2 - 1 \right)s_i
+\phantom{\frac{4}{s_i}}\right.\right.$|\Suppressnumber|
+|\Reactivatenumber|$\left.\left.+ \frac{4}{s_i}\sum_{j\in
+N_v\left( i \right) } M_{j\to i} \right)\right) $
 $r_i \leftarrow r_i + \omega \left( s_i - y_i \right)$
 end for
 end while
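The change in this hunk wraps the gradient step of the $s_i$ update in the projection $\Pi_\eta$. A minimal Python sketch of one such variable-node sweep, assuming $\Pi_\eta$ clips each component to $[-\eta, \eta]$ (function and variable names are illustrative, not taken from the thesis code):

```python
import numpy as np

def variable_node_update(s, r, y, messages, N_v, gamma, omega, eta):
    """One sweep of the variable-node update from the proximal decoding
    pseudocode: a gradient step on the code-constraint penalty, followed
    by the projection Pi_eta (assumed here to clip to [-eta, eta])."""
    for i in range(len(s)):
        grad = 4.0 * (s[i] ** 2 - 1.0) * s[i] \
             + 4.0 / s[i] * sum(messages[j, i] for j in N_v[i])
        s[i] = np.clip(s[i] + gamma * grad, -eta, eta)  # Pi_eta projection
        r[i] = r[i] + omega * (s[i] - y[i])             # second update step
    return s, r
```

The clipping step is what the diff adds: without it, the gradient step can push $s_i$ far outside the region of interest, which the surrounding text connects to the oscillatory behavior of the decoder.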
@@ -216,7 +216,7 @@ return $\tilde{\boldsymbol{c}}$
 \end{subfigure}%
-\caption{The proximal gradient method and \ac{LP} decoding using \ac{ADMM}
+\caption{Proximal decoding and \ac{LP} decoding using \ac{ADMM}
 as message passing algorithms}
 \label{fig:comp:message_passing}
 \end{figure}%
@@ -232,7 +232,7 @@ With proximal decoding this minimization is performed for all constraints at once
 in an approximative manner, while with \ac{LP} decoding using \ac{ADMM} it is
 performed for each constraint individually and with exact results.
 In terms of time complexity, both algorithms are linear with
-respect to $n$ and are heavily parallelisable.
+respect to $n$ and are heavily parallelizable.
@@ -241,18 +241,18 @@ respect to $n$ and are heavily parallelisable.
 \label{sec:comp:res}
 The decoding performance of the two algorithms is compared in figure
-\ref{fig:comp:prox_admm_dec} in the form of the \ac{FER}.
+\ref{fig:comp:prox_admm_dec} in form of the \ac{FER}.
 Shown as well is the performance of the improved proximal decoding
 algorithm presented in section \ref{sec:prox:Improved Implementation}.
 The \ac{FER} resulting from decoding using \ac{BP} and,
-wherever available, the \ac{FER} of \ac{ML} decoding taken from
-\cite{lautern_channelcodes} are plotted as a reference.
+wherever available, the \ac{FER} of \ac{ML} decoding, taken from
+\cite{lautern_channelcodes}, are plotted as a reference.
 The parameters chosen for the proximal and improved proximal decoders are
 $\gamma=0.05$, $\omega=0.05$, $K=200$, $\eta = 1.5$ and $N=12$.
 The parameters chosen for \ac{LP} decoding using \ac{ADMM} are $\mu = 5$,
 $\rho = 1$, $K=200$, $\epsilon_\text{pri} = 10^{-5}$ and
 $\epsilon_\text{dual} = 10^{-5}$.
-For all codes considered in the scope of this work, \ac{LP} decoding using
+For all codes considered within the scope of this work, \ac{LP} decoding using
 \ac{ADMM} consistently outperforms both proximal decoding and the improved
 version, reaching very similar performance to \ac{BP}.
 The decoding gain heavily depends on the code, evidently becoming greater for
@@ -268,8 +268,12 @@ calculations performed in each case.
 With proximal decoding, the calculations are approximate, leading
 to the constraints never being quite satisfied.
 With \ac{LP} decoding using \ac{ADMM},
-the constraints are fulfilled for each parity check individualy after each
+the constraints are fulfilled for each parity check individually after each
 iteration of the decoding process.
+A further contributing factor might be the structure of the optimization
+process, as the alternating minimization with respect to the same variable
+leads to oscillatory behavior, as explained in section
+\ref{subsec:prox:conv_properties}.
 It should be noted that while in this thesis proximal decoding was
 examined with respect to its performance in \ac{AWGN} channels, in
 \cite{proximal_paper} it is presented as a method applicable to non-trivial
@@ -279,21 +283,21 @@ broadening its usefulness beyond what is shown here.
 The timing requirements of the decoding algorithms are visualized in figure
 \ref{fig:comp:time}.
 The datapoints have been generated by evaluating the metadata from \ac{FER}
-and \ac{BER} simulations using the parameters mentioned earlier when
+and \ac{BER} simulations and using the parameters mentioned earlier when
 discussing the decoding performance.
 The codes considered are the same as in sections \ref{subsec:prox:comp_perf}
 and \ref{subsec:admm:comp_perf}.
-While the \ac{ADMM} implementation seems to be faster the the proximal
-decoding and improved proximal decoding implementations, infering some
+While the \ac{ADMM} implementation seems to be faster than the proximal
+decoding and improved proximal decoding implementations, inferring some
 general behavior is difficult in this case.
 This is because of the comparison of actual implementations, making the
 results dependent on factors such as the grade of optimization of each of the
 implementations.
 Nevertheless, the run time of both the proximal decoding and the \ac{LP}
-decoding using \ac{ADMM} implementations is similar and both are
-reasonably performant, owing to the parallelisable structure of the
+decoding using \ac{ADMM} implementations is similar, and both are
+reasonably performant, owing to the parallelizable structure of the
 algorithms.
 %
 \begin{figure}[h]
 \centering
@@ -328,8 +332,6 @@ algorithms.
 \label{fig:comp:time}
 \end{figure}%
 %
-\footnotetext{asdf}
-%
 \begin{figure}[h]
 \centering
@@ -340,7 +342,7 @@ algorithms.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 ymax=1.5, ymin=8e-5,
 width=\textwidth,
@@ -376,7 +378,7 @@ algorithms.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 ymax=1.5, ymin=8e-5,
 width=\textwidth,
@@ -414,7 +416,7 @@ algorithms.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 ymax=1.5, ymin=8e-5,
 width=\textwidth,
@@ -455,7 +457,7 @@ algorithms.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 ymax=1.5, ymin=8e-5,
 width=\textwidth,
@@ -490,7 +492,7 @@ algorithms.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 ymax=1.5, ymin=8e-5,
 width=\textwidth,
@@ -523,7 +525,7 @@ algorithms.
 \begin{tikzpicture}
 \begin{axis}[
 grid=both,
-xlabel={$E_b / N_0$}, ylabel={FER},
+xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
 ymode=log,
 ymax=1.5, ymin=8e-5,
 width=\textwidth,
@@ -572,7 +574,7 @@ algorithms.
 \addlegendentry{\acs{LP} decoding using \acs{ADMM}}
 \addlegendimage{RoyalPurple, line width=1pt, mark=*, solid}
-\addlegendentry{\acs{BP} (20 iterations)}
+\addlegendentry{\acs{BP} (200 iterations)}
 \addlegendimage{Black, line width=1pt, mark=*, solid}
 \addlegendentry{\acs{ML} decoding}
@@ -580,8 +582,8 @@ algorithms.
 \end{tikzpicture}
 \end{subfigure}
-\caption{Comparison of decoding performance between proximal decoding and \ac{LP} decoding
-using \ac{ADMM}}
+\caption{Comparison of the decoding performance of the different decoder
+implementations for various codes}
 \label{fig:comp:prox_admm_dec}
 \end{figure}

View File

@@ -1,44 +1,44 @@
-\chapter{Conclusion}%
+\chapter{Conclusion and Outlook}%
 \label{chapter:conclusion}
 In the context of this thesis, two decoding algorithms were considered:
 proximal decoding and \ac{LP} decoding using \ac{ADMM}.
 The two algorithms were first analyzed individually, before comparing them
-based on simulation results as well as their theoretical structure.
+based on simulation results as well as on their theoretical structure.
 For proximal decoding, the effect of each parameter on the behavior of the
-decoder was examined, leading to an approach to choosing the value of each
-of the parameters.
+decoder was examined, leading to an approach to optimally choose the value
+of each parameter.
 The convergence properties of the algorithm were investigated in the context
 of the relatively high decoding failure rate, to derive an approach to correct
-possible wrong componets of the estimate.
-Based on this approach, an improvement over proximal decoding was suggested,
+possibly wrong components of the estimate.
+Based on this approach, an improvement of proximal decoding was suggested,
 leading to a decoding gain of up to $\SI{1}{dB}$, depending on the code and
 the parameters considered.
-For \ac{LP} decoding using \ac{ADMM}, the circumstances brought about via the
-relaxation while formulating the \ac{LP} decoding problem were first explored.
+For \ac{LP} decoding using \ac{ADMM}, the circumstances brought about by the
+\ac{LP} relaxation were first explored.
 The decomposable nature arising from the relocation of the constraints into
 the objective function itself was recognized as the major driver in enabling
-the efficent implementation of the decoding algorithm.
+an efficient implementation of the decoding algorithm.
 Based on simulation results, general guidelines for choosing each parameter
-were again derived.
+were derived.
 The decoding performance, in form of the \ac{FER}, of the algorithm was
 analyzed, observing that \ac{LP} decoding using \ac{ADMM} nearly reaches that
 of \ac{BP}, staying within approximately $\SI{0.5}{dB}$ depending on the code
 in question.
-Finally, strong parallells were discovered with regard to the theoretical
+Finally, strong parallels were discovered with regard to the theoretical
 structure of the two algorithms, both in the constitution of their respective
-objective functions as in the iterative approaches used to minimize them.
+objective functions as well as in the iterative approaches used to minimize them.
 One difference noted was the approximate nature of the minimization in the
 case of proximal decoding, leading to the constraints never being truly
 satisfied.
 In conjunction with the alternating minimization with respect to the same
-variable leading to oscillatory behavior, this was identified as the
-root cause of its comparatively worse decoding performance.
+variable, leading to oscillatory behavior, this was identified as
+a possible cause of its comparatively worse decoding performance.
 Furthermore, both algorithms were expressed as message passing algorithms,
-justifying their similar computational performance.
+illustrating their similar computational performance.
 While the modified proximal decoding algorithm presented in section
 \ref{sec:prox:Improved Implementation} shows some promising results, further
@@ -46,7 +46,13 @@ investigation is required to determine how different choices of parameters
 affect the decoding performance.
 Additionally, a more mathematically rigorous foundation for determining the
 potentially wrong components of the estimate is desirable.
-Another area benefiting from future work is the expantion of the \ac{ADMM}
+A different method to improve proximal decoding might be to use
+moment-based optimization techniques such as \textit{Adam} \cite{adam}
+to try to mitigate the effect of local minima introduced in the objective
+function as well as the adversarial structure of the minimization when employing
+proximal decoding.
+Another area benefiting from future work is the expansion of the \ac{ADMM}
 based \ac{LP} decoder into a decoder approximating \ac{ML} performance,
 using \textit{adaptive \ac{LP} decoding}.
 With this method, the successive addition of redundant parity checks is used
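The Adam-based improvement added to this conclusion is only suggested, not specified. For reference, a single Adam step as defined by Kingma and Ba \cite{adam} can be sketched as follows; how it would be wired into the proximal decoder's gradient step is left open, and all names here are illustrative:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, 2014): exponential moving averages of
    the gradient and its square, with bias correction for small t."""
    m = b1 * m + (1 - b1) * grad           # first moment estimate
    v = b2 * v + (1 - b2) * grad ** 2      # second moment estimate
    m_hat = m / (1 - b1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)              # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

The per-parameter step scaling by the second-moment estimate is what motivates the suggestion: it can help the iterate escape shallow local minima where a plain gradient step stalls.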

View File

@@ -1,16 +1,51 @@
 \chapter{Introduction}%
 \label{chapter:introduction}
+Channel coding using binary linear codes is a way of enhancing the reliability
+of data by detecting and correcting any errors that may occur during
+its transmission or storage.
+One class of binary linear codes, \ac{LDPC} codes, has become especially
+popular due to being able to reach arbitrarily small probabilities of error
+at code rates up to the capacity of the channel \cite[Sec. II.B.]{mackay_rediscovery},
+while retaining a structure that allows for very efficient decoding.
+While the established decoders for \ac{LDPC} codes, such as \ac{BP} and the
+\textit{min-sum algorithm}, offer good decoding performance, they are suboptimal
+in most cases and exhibit an \textit{error floor} for high \acp{SNR}
+\cite[Sec. 15.3]{ryan_lin_2009}, making them unsuitable for applications
+with extreme reliability requirements.
-\begin{itemize}
-\item Problem definition
-\item Motivation
-\begin{itemize}
-\item Error floor when decoding with BP (seems to not be persent with LP decoding
-\cite[Sec. I]{original_admm})
-\item Strong theoretical guarantees that allow for better and better approximations
-of ML decoding \cite[Sec. I]{original_admm}
-\end{itemize}
-\item Results summary
-\end{itemize}
+Optimization based decoding algorithms are an entirely different way of approaching
+the decoding problem.
+The first introduction of optimization techniques as a way of decoding binary
+linear codes was conducted in Feldman's 2003 Ph.D. thesis and a subsequent paper,
+establishing the field of \ac{LP} decoding \cite{feldman_thesis}, \cite{feldman_paper}.
+There, the \ac{ML} decoding problem is approximated by a \textit{linear program}, i.e.,
+a linear, convex optimization problem, which can subsequently be solved using
+several different algorithms \cite{alp}, \cite{interior_point},
+\cite{original_admm}, \cite{pdd}.
+More recently, novel approaches such as \textit{proximal decoding} have been
+introduced. Proximal decoding is based on a non-convex optimization formulation
+of the \ac{MAP} decoding problem \cite{proximal_paper}.
+The motivation behind applying optimization methods to channel decoding is to
+utilize existing techniques in the broad field of optimization theory, as well
+as to find new decoding methods not suffering from the same disadvantages as
+existing message passing based approaches or exhibiting other desirable properties.
+\Ac{LP} decoding, for example, comes with strong theoretical guarantees
+allowing it to be used as a way of closely approximating \ac{ML} decoding
+\cite[Sec. I]{original_admm},
+and proximal decoding is applicable to non-trivial channel models such
+as \ac{LDPC}-coded massive \ac{MIMO} channels \cite{proximal_paper}.
+This thesis aims to further the analysis of optimization based decoding
+algorithms as well as to verify and complement the considerations present in
+the existing literature.
+Specifically, the proximal decoding algorithm and \ac{LP} decoding using
+the \ac{ADMM} \cite{original_admm} are explored within the context of
+\ac{BPSK} modulated \ac{AWGN} channels.
+Implementations of both decoding methods are produced, and based on simulation
+results from those implementations the algorithms are examined and compared.
+Approaches to determine the optimal value of each parameter are derived and
+the computational and decoding performance of the algorithms is examined.
+An improvement on proximal decoding is suggested, achieving up to 1 dB of gain,
+depending on the parameters chosen and the code considered.

View File

@@ -5,14 +5,12 @@ This chapter is concerned with \ac{LP} decoding - the reformulation of the
 decoding problem as a linear program.
 More specifically, the \ac{LP} decoding problem is solved using \ac{ADMM}.
 First, the general field of \ac{LP} decoding is introduced.
-The application of \ac{ADMM} to the decoding problem is explained.
-Some notable implementation details are mentioned.
+The application of \ac{ADMM} to the decoding problem is explained and some
+notable implementation details are mentioned.
 Finally, the behavior of the algorithm is examined based on simulation
 results.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{LP Decoding}%
 \label{sec:lp:LP Decoding}
@@ -547,7 +545,7 @@ parity-checks until a valid result is returned \cite[Sec. IV.]{alp}.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\section{Decoding Algorithm}%
+\section{Decoding Algorithm and Implementation}%
 \label{sec:lp:Decoding Algorithm}
 The \ac{LP} decoding formulation in section \ref{sec:lp:LP Decoding}
@@ -689,7 +687,6 @@ handled at the same time.
 This can also be understood by interpreting the decoding process as a message-passing
 algorithm \cite[Sec. III. D.]{original_admm}, \cite[Sec. II. B.]{efficient_lp_dec_admm},
 depicted in algorithm \ref{alg:admm}.
-\todo{How are the variables being initialized?}
 \begin{genericAlgorithm}[caption={\ac{LP} decoding using \ac{ADMM} interpreted
 as a message passing algorithm\protect\footnotemark{}}, label={alg:admm},
@@ -735,7 +732,7 @@ before the $\boldsymbol{z}_j$ and $\boldsymbol{u}_j$ update steps (lines 4 and
 subsequently replacing $\boldsymbol{T}_j \tilde{\boldsymbol{c}}$ with the
 computed value in the two updates \cite[Sec. 3.4.3]{distr_opt_book}.
-The main computational effort in solving the linear program then amounts to
+The main computational effort in solving the linear program amounts to
 computing the projection operation $\Pi_{\mathcal{P}_{d_j}} \left( \cdot \right) $
 onto each check polytope. Various different methods to perform this projection
 have been proposed (e.g., in \cite{original_admm}, \cite{efficient_lp_dec_admm},
@@ -743,14 +740,14 @@ have been proposed (e.g., in \cite{original_admm}, \cite{efficient_lp_dec_admm},
 The method chosen here is the one presented in \cite{original_admm}.
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\section{Implementation Details}%
-\label{sec:lp:Implementation Details}
+%\section{Implementation Details}%
+%\label{sec:lp:Implementation Details}
 The development process used to implement this decoding algorithm was the same
 as outlined in section
-\ref{sec:prox:Implementation Details} for proximal decoding.
-At first, an initial version was implemented in Python, before repeating the
+\ref{sec:prox:Decoding Algorithm} for proximal decoding.
+First, an initial version was implemented in Python, before repeating the
 process using C++ to achieve higher performance.
 Again, the performance can be increased by reframing the operations in such
 a way that the computation can take place primarily with element-wise
@@ -788,9 +785,13 @@ expression to be rewritten as%
 .\end{align*}
 %
 Defining%
+\footnote{
+In this case $d_1, \ldots, d_n$ refer to the degree of the variable nodes,
+i.e., $d_i,\hspace{1mm}i\in\mathcal{I}$.
+}
 %
 \begin{align*}
-\boldsymbol{D} := \begin{bmatrix}
+\boldsymbol{d} := \begin{bmatrix}
 d_1 \\
 \vdots \\
 d_n
@@ -800,19 +801,18 @@ Defining%
 \hspace{5mm}%
 \boldsymbol{s} := \sum_{j\in\mathcal{J}} \boldsymbol{T}_j^\text{T}
 \left( \boldsymbol{z}_j - \boldsymbol{u}_j \right)
-\end{align*}%
-\todo{Rename $\boldsymbol{D}$}%
+,\end{align*}%
 %
 the $\tilde{\boldsymbol{c}}$ update can then be rewritten as%
 %
 \begin{align*}
-\tilde{\boldsymbol{c}} \leftarrow \boldsymbol{D}^{\circ \left(-1\right)} \circ
+\tilde{\boldsymbol{c}} \leftarrow \boldsymbol{d}^{\circ \left(-1\right)} \circ
 \left( \boldsymbol{s} - \frac{1}{\mu}\boldsymbol{\gamma} \right)
 .\end{align*}
 %
 This modified version of the decoding process is depicted in algorithm \ref{alg:admm:mod}.
-\begin{genericAlgorithm}[caption={\ac{LP} decoding using \ac{ADMM} algorithm with rewritten
+\begin{genericAlgorithm}[caption={The \ac{LP} decoding using \ac{ADMM} algorithm with rewritten
 update steps}, label={alg:admm:mod},
 basicstyle=\fontsize{11}{16}\selectfont
 ]
\left( \boldsymbol{z}_j - \boldsymbol{u}_j \right) $
end for
for $i$ in $\mathcal{I}$ do
$\tilde{\boldsymbol{c}} \leftarrow \boldsymbol{d}^{\circ \left( -1\right)} \circ
\left( \boldsymbol{s} - \frac{1}{\mu}\boldsymbol{\gamma} \right) $
end for
end while
return $\tilde{\boldsymbol{c}}$
\end{genericAlgorithm}
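As a rough illustration (not the thesis implementation), the rewritten element-wise $\tilde{\boldsymbol{c}}$ update can be sketched in NumPy. The names `checks`, `z`, `u`, and the degree vector `d` are assumptions standing in for the per-check index sets and the ADMM variables:

```python
import numpy as np

def c_tilde_update(gamma, mu, d, checks, z, u):
    # s = sum_j T_j^T (z_j - u_j): scatter-add each check's residual
    # back onto the variable nodes it participates in.
    s = np.zeros_like(gamma)
    for j, idx in enumerate(checks):     # idx: variable-node indices of check j
        np.add.at(s, idx, z[j] - u[j])
    # Element-wise update: c~ = d^(circ -1) o (s - gamma / mu)
    return (s - gamma / mu) / d
```

Because the division and subtraction are purely element-wise, the update vectorizes over the whole codeword at once, which is where the performance gain over a naive per-index loop comes from.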
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Analysis and Simulation Results}%
Finally, the computational performance of the implementation and time
complexity of the algorithm are studied.
As was the case in chapter \ref{chapter:proximal_decoding} for proximal decoding,
the following simulation results are based on Monte Carlo simulations;
the \ac{BER} and \ac{FER} curves have been generated by accumulating at least
100 frame errors per data point, unless explicitly specified otherwise.
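The stopping rule for these simulations can be sketched as follows. This is a minimal illustration, assuming transmission of the all-zero codeword with BPSK over an AWGN channel; `decode` is a stand-in for any of the decoders discussed:

```python
import numpy as np

def estimate_fer(decode, n, rate, ebn0_db, min_frame_errors=100, rng=None):
    # Simulate frames until at least `min_frame_errors` frame errors occurred.
    rng = rng or np.random.default_rng()
    # Noise standard deviation from Eb/N0 (in dB) and the code rate.
    sigma = np.sqrt(1.0 / (2.0 * rate * 10.0 ** (ebn0_db / 10.0)))
    frames = errors = 0
    while errors < min_frame_errors:
        y = 1.0 + sigma * rng.standard_normal(n)   # BPSK: bit 0 -> +1
        frames += 1
        errors += int(np.any(decode(y) != 0))      # any bit error -> frame error
    return errors / frames
```

Stopping on a fixed number of frame errors (rather than a fixed number of frames) keeps the relative accuracy of the FER estimate roughly constant across SNR points.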
\subsection{Choice of Parameters}
The first two parameters to be investigated are the penalty parameter $\mu$
The code chosen for this examination is a (3,6) regular \ac{LDPC} code with
$n=204$ and $k=102$ \cite[\text{204.33.484}]{mackay_enc}.
When varying $\mu$, $\rho$ is set to 1 and when varying
$\rho$, $\mu$ is set to 5.
The maximum number of iterations $K$ is set to 200 and
$\epsilon_\text{dual}$ and $\epsilon_\text{pri}$ to $10^{-5}$.
The behavior that can be observed is very similar to that of the
parameter $\gamma$ in proximal decoding, analyzed in section
\ref{sec:prox:Analysis and Simulation Results}.
A single value giving optimal performance does not exist; rather,
as long as the value is chosen within a certain range, the performance is
approximately equally good.
\begin{figure}[H]
\centering
\begin{subfigure}[c]{0.48\textwidth}
The values chosen for the rest of the parameters are the same as before.
It is visible that choosing a large value for $\rho$ as well as a small value
for $\mu$ minimizes the average number of iterations and thus the average
run time of the decoding process.
The same behavior can be observed when looking at various%
%
\begin{figure}[H]
\centering
\begin{tikzpicture}
\label{fig:admm:mu_rho_iterations}
\end{figure}%
%
\noindent different codes, as shown in figure \ref{fig:admm:mu_rho_multiple}.
To get an estimate for the maximum number of iterations $K$ necessary,
the average error during decoding can be used.
This is shown in figure \ref{fig:admm:avg_error} as an average of
$\SI{100000}{}$ decodings.
$\mu$ is set to 5, $\rho$ to $1$, and the rest of the parameters are
again chosen as $\epsilon_\text{pri}=10^{-5}$ and
$\epsilon_\text{dual}=10^{-5}$.
Similarly to the results in section \ref{subsec:prox:choice}, a dip is
visible around the $20$ iteration mark.
This is due to the fact that as the number of iterations increases,
more and more decodings converge, leaving only the mistaken ones to be
averaged.
The point at which the wrong decodings start to become dominant and the
decoding performance does not increase any longer is largely independent of
the \ac{SNR}, allowing the maximum number of iterations to be chosen without
considering the \ac{SNR}.
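A curve of this kind can be produced as sketched below; `decode_iterates` is a hypothetical generator yielding the current estimate $\hat{\boldsymbol{c}}$ after each decoding iteration:

```python
import numpy as np

def average_error_curve(decode_iterates, received, c_true, max_iter):
    # Accumulate ||c_hat - c|| per iteration, averaged over many decodings.
    err = np.zeros(max_iter)
    for y in received:
        for k, c_hat in enumerate(decode_iterates(y, max_iter)):
            err[k] += np.linalg.norm(c_hat - c_true)
    return err / len(received)
```

The per-iteration average makes the saturation point visible: once the curve flattens, additional iterations no longer improve the still-unconverged decodings, which motivates the choice of the maximum iteration count.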
\begin{figure}[H]
\centering
\begin{tikzpicture}
\begin{axis}[
grid=both,
width=0.6\textwidth,
height=0.45\textwidth,
xlabel={Iteration}, ylabel={Average $\left\Vert \hat{\boldsymbol{c}}
- \boldsymbol{c} \right\Vert$}
]
\addplot[ForestGreen, line width=1pt]
table [col sep=comma, x=k, y=err,
discard if not={SNR}{1.0},
discard if gt={k}{100}]
{res/admm/avg_error_20433484.csv};
\addlegendentry{$E_b / N_0 = \SI{1}{dB}$}
\addplot[RedOrange, line width=1pt]
table [col sep=comma, x=k, y=err,
discard if not={SNR}{2.0},
discard if gt={k}{100}]
{res/admm/avg_error_20433484.csv};
\addlegendentry{$E_b / N_0 = \SI{2}{dB}$}
\addplot[NavyBlue, line width=1pt]
table [col sep=comma, x=k, y=err,
discard if not={SNR}{3.0},
discard if gt={k}{100}]
{res/admm/avg_error_20433484.csv};
\addlegendentry{$E_b / N_0 = \SI{3}{dB}$}
\addplot[RoyalPurple, line width=1pt]
table [col sep=comma, x=k, y=err,
discard if not={SNR}{4.0},
discard if gt={k}{100}]
{res/admm/avg_error_20433484.csv};
\addlegendentry{$E_b / N_0 = \SI{4}{dB}$}
\end{axis}
\end{tikzpicture}
\caption{Average error for $\SI{100000}{}$ decodings. (3,6)
regular \ac{LDPC} code with $n=204, k=102$ \cite[\text{204.33.484}]{mackay_enc}}
\label{fig:admm:avg_error}
\end{figure}%
The last two parameters remaining to be examined are the tolerances for the
stopping criterion of the algorithm, $\epsilon_\text{pri}$ and
$\epsilon_\text{dual}$.
These are both set to the same value $\epsilon$.
The effect of their value on the decoding performance is visualized in figure
\ref{fig:admm:epsilon}.
All parameters except $\epsilon_\text{pri}$ and $\epsilon_\text{dual}$ are
kept constant, with $\mu=5$, $\rho=1$, and $E_b / N_0 = \SI{4}{dB}$,
performing a maximum of 200 iterations.
A lower value for the tolerance initially leads to a dramatic decrease in the
\ac{FER}, an effect that fades as the tolerance is decreased further.
\begin{figure}[H]
\centering
\begin{tikzpicture}
\begin{axis}[
grid=both,
xlabel={$\epsilon$}, ylabel={\acs{FER}},
ymode=log,
xmode=log,
x dir=reverse,
width=0.6\textwidth,
height=0.45\textwidth,
]
\addplot[NavyBlue, line width=1pt, densely dashed, mark=*]
table [col sep=comma, x=epsilon, y=FER,
discard if not={SNR}{3.0},]
{res/admm/fer_epsilon_20433484.csv};
\end{axis}
\end{tikzpicture}
\caption{Effect of the value of the parameters $\epsilon_\text{pri}$ and
$\epsilon_\text{dual}$ on the \acs{FER}. (3,6) regular \ac{LDPC} code with
$n=204, k=102$ \cite[\text{204.33.484}]{mackay_enc}}
\label{fig:admm:epsilon}
\end{figure}%
In conclusion, the parameter $\mu$ should be chosen comparatively small and
$\rho$ comparatively large to reduce the average runtime of the decoding
process, while keeping both within a certain range so as not to compromise
the decoding performance.
The maximum number of iterations performed can be chosen independently
of the \ac{SNR}.
Finally, small values should be given to the parameters
$\epsilon_{\text{pri}}$ and $\epsilon_{\text{dual}}$ to achieve the lowest
possible error rate.
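For reference, the role of the two tolerances can be sketched with the standard ADMM residual test. The exact residual definitions used by the implementation may differ, so the norms below are an assumption:

```python
import numpy as np

def should_stop(c_tilde, checks, z, z_prev, mu, eps_pri, eps_dual):
    # Primal residual: distance between T_j c~ and z_j for every check j.
    pri = sum(np.linalg.norm(c_tilde[idx] - z[j])
              for j, idx in enumerate(checks))
    # Dual residual: scaled change of z between consecutive iterations.
    dual = sum(mu * np.linalg.norm(z[j] - z_prev[j]) for j in range(len(z)))
    return pri < eps_pri and dual < eps_dual
```

Tightening `eps_pri`/`eps_dual` forces more iterations before the test fires, which matches the observed trade-off between runtime and error rate.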
\subsection{Decoding Performance}
In figure \ref{fig:admm:results}, the simulation results for the ``Margulis''
\ac{LDPC} code ($n=2640$, $k=1320$) presented by Barman et al. in
\cite{original_admm} are compared to the results from the simulations
conducted in the context of this thesis.
The parameters chosen were $\mu=3.3$, $\rho=1.9$, $K=1000$,
$\epsilon_\text{pri}=10^{-5}$ and $\epsilon_\text{dual}=10^{-5}$,
the same as in \cite{original_admm}.
The two \ac{FER} curves are practically identical.
Also shown is the curve resulting from \ac{BP} decoding, performing
1000 iterations.
The two algorithms perform relatively similarly, staying within $\SI{0.5}{dB}$
of one another.
\begin{figure}[H]
\centering
\begin{tikzpicture}
\begin{axis}[
grid=both,
xlabel={$E_b / N_0 \left( \text{dB} \right) $}, ylabel={\acs{FER}},
ymode=log,
width=0.6\textwidth,
height=0.45\textwidth,
legend style={at={(0.5,-0.57)},anchor=south},
legend cell align={left},
]
\addplot[Turquoise, line width=1pt, mark=*]
table [col sep=comma, x=SNR, y=FER,
discard if gt={SNR}{2.2},
]
{res/admm/fer_paper_margulis.csv};
\addlegendentry{\acs{ADMM} (Barman et al.)}
\addplot[NavyBlue, densely dashed, line width=1pt, mark=triangle]
table [col sep=comma, x=SNR, y=FER,]
{res/admm/ber_margulis264013203.csv};
\addlegendentry{\acs{ADMM} (Own results)}
\addplot[RoyalPurple, line width=1pt, mark=*]
table [col sep=comma, x=SNR, y=FER, discard if gt={SNR}{2.2},]
{res/generic/fer_bp_mackay_margulis.csv};
\addlegendentry{\acs{BP} (Barman et al.)}
\end{axis}
\end{tikzpicture}
\caption{Comparison of datapoints from Barman et al. with own simulation results.
``Margulis'' \ac{LDPC} code with $n = 2640$, $k = 1320$
\cite[\text{Margulis2640.1320.3}]{mackay_enc}}
\label{fig:admm:results}
\end{figure}%
%
In figure \ref{fig:admm:bp_multiple}, \ac{FER} curves for \ac{LP} decoding
using \ac{ADMM} and \ac{BP} are shown for various codes.
To ensure comparability, in all cases the number of iterations was set to
$K=200$.
The values of the other parameters were chosen as $\mu = 5$, $\rho = 1$,
$\epsilon_\text{pri} = 10^{-5}$ and $\epsilon_\text{dual}=10^{-5}$.
Comparing the simulation results for the different codes, it is apparent that
the difference in decoding performance depends on the code being
considered.
For all codes considered here, however, the performance of \ac{LP} decoding
using \ac{ADMM} comes close to that of \ac{BP}, again staying within
approximately $\SI{0.5}{dB}$.
\subsection{Computational Performance}
\label{subsec:admm:comp_perf}
In terms of time complexity, the three steps of the decoding algorithm
in equations (\ref{eq:admm:c_update}) - (\ref{eq:admm:u_update}) have to be
considered.
The $\tilde{\boldsymbol{c}}$- and $\boldsymbol{u}_j$-update steps are
$\mathcal{O}\left( n \right)$ \cite[Sec. III. C.]{original_admm}.
The complexity of the $\boldsymbol{z}_j$-update step depends on the projection
algorithm employed.
Since for the implementation completed for this work the projection algorithm
presented in \cite{original_admm} is used, the $\boldsymbol{z}_j$-update step
also has linear time complexity.
\begin{figure}[H]
\centering
\begin{tikzpicture}
\begin{axis}[grid=both,
xlabel={$n$}, ylabel={Time per frame (s)},
width=0.6\textwidth,
height=0.45\textwidth,
legend style={at={(0.5,-0.42)},anchor=south},
legend cell align={left},]
\addplot[NavyBlue, only marks, mark=triangle*]
table [col sep=comma, x=n, y=spf]
{res/admm/fps_vs_n.csv};
\end{axis}
\end{tikzpicture}
\caption{Timing requirements of the \ac{LP} decoding using \ac{ADMM} implementation}
\label{fig:admm:time}
\end{figure}%
Simulation results from a range of different codes can be used to verify this
analysis.
Figure \ref{fig:admm:time} shows the average time needed to decode one
frame as a function of its length.
The codes used for this consideration are the same as in section
\ref{subsec:prox:comp_perf}.
The results are necessarily skewed, because the codes vary not only
in their length, but also in their construction scheme and rate.
Additionally, different optimization opportunities arise depending on the
length of a code, since for smaller codes dynamic memory allocation can be
completely omitted.
This may explain why the datapoint at $n=504$ is higher than would be expected
with linear behavior.
Nonetheless, the simulation results roughly match the expected behavior
following from the theoretical considerations.
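The per-frame timings shown above can be gathered with a simple wall-clock measurement of the following kind (a sketch only; the actual benchmarks were run on the C++ implementation):

```python
import time

def seconds_per_frame(decode, frames):
    # Average wall-clock time of `decode` over a batch of received frames.
    start = time.perf_counter()
    for y in frames:
        decode(y)
    return (time.perf_counter() - start) / len(frames)
```

Averaging over many frames smooths out per-call timer jitter, which matters because a single decoding of a short code takes far less time than the timer resolution would reliably capture.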
\begin{figure}[H]
\centering
\vspace*{5cm}
\end{figure}
\begin{figure}[H]
\centering
\begin{subfigure}[t]{0.48\textwidth}
\end{subfigure}
\caption{Dependence of the average number of iterations required on the parameters
$\mu$ and $\rho$ for $E_b / N_0 = \SI{4}{dB}$ for various codes}
\label{fig:admm:mu_rho_multiple}
\end{figure}
\vfill
\newpage
\begin{figure}[H]
\centering
\begin{subfigure}[t]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[
grid=both,
xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
ymode=log,
ymax=1.5, ymin=8e-5,
width=\textwidth,
height=0.75\textwidth,
]
\addplot[Turquoise, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
%{res/hybrid/2d_ber_fer_dfr_963965.csv};
{res/admm/ber_2d_963965.csv};
\addplot [RoyalPurple, mark=*, line width=1pt]
table [x=SNR, y=FER, col sep=comma]
{res/generic/bp_963965.csv};
\end{axis}
\end{tikzpicture}
\caption{$\left( 3, 6 \right)$-regular \ac{LDPC} code with $n=96, k=48$
\cite[\text{96.3.965}]{mackay_enc}}
\end{subfigure}%
\hfill%
\begin{subfigure}[t]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[
grid=both,
xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
ymode=log,
ymax=1.5, ymin=8e-5,
width=\textwidth,
height=0.75\textwidth,
]
\addplot[Turquoise, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
{res/admm/ber_2d_bch_31_26.csv};
\addplot [RoyalPurple, mark=*, line width=1pt]
table [x=SNR, y=FER, col sep=comma]
{res/generic/bp_bch_31_26.csv};
\end{axis}
\end{tikzpicture}
\caption{BCH code with $n=31, k=26$}
\end{subfigure}%
\vspace{3mm}
\begin{subfigure}[t]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[
grid=both,
xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
ymode=log,
ymax=1.5, ymin=8e-5,
width=\textwidth,
height=0.75\textwidth,
]
\addplot[Turquoise, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma,
discard if not={mu}{3.0},
discard if gt={SNR}{5.5}]
{res/admm/ber_2d_20433484.csv};
\addplot [RoyalPurple, mark=*, line width=1pt]
table [x=SNR, y=FER, col sep=comma]
{res/generic/bp_20433484.csv};
\end{axis}
\end{tikzpicture}
\caption{$\left( 3, 6 \right)$-regular \ac{LDPC} code with $n=204, k=102$
\cite[\text{204.33.484}]{mackay_enc}}
\end{subfigure}%
\hfill%
\begin{subfigure}[t]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[
grid=both,
xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
ymode=log,
ymax=1.5, ymin=8e-5,
width=\textwidth,
height=0.75\textwidth,
]
\addplot[Turquoise, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
{res/admm/ber_2d_20455187.csv};
\addplot [RoyalPurple, mark=*, line width=1pt,
discard if gt={SNR}{5}]
table [x=SNR, y=FER, col sep=comma]
{res/generic/bp_20455187.csv};
\end{axis}
\end{tikzpicture}
\caption{$\left( 5, 10 \right)$-regular \ac{LDPC} code with $n=204, k=102$
\cite[\text{204.55.187}]{mackay_enc}}
\end{subfigure}% \end{subfigure}%
\vspace{3mm}
\begin{subfigure}[t]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[
grid=both,
xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
ymode=log,
ymax=1.5, ymin=8e-5,
width=\textwidth,
height=0.75\textwidth,
]
\addplot[Turquoise, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
{res/admm/ber_2d_40833844.csv};
\addplot [RoyalPurple, mark=*, line width=1pt,
discard if gt={SNR}{3}]
table [x=SNR, y=FER, col sep=comma]
{res/generic/bp_40833844.csv};
\end{axis}
\end{tikzpicture}
\caption{$\left( 3, 6 \right)$-regular \ac{LDPC} code with $n=408, k=204$
\cite[\text{408.33.844}]{mackay_enc}}
\end{subfigure}%
\hfill%
\begin{subfigure}[t]{0.48\textwidth}
\centering
\begin{tikzpicture}
\begin{axis}[
grid=both,
xlabel={$E_b / N_0$ (dB)}, ylabel={FER},
ymode=log,
ymax=1.5, ymin=8e-5,
width=\textwidth,
height=0.75\textwidth,
]
\addplot[Turquoise, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
{res/admm/ber_2d_pegreg252x504.csv};
\addplot [RoyalPurple, mark=*, line width=1pt]
table [x=SNR, y=FER, col sep=comma,
discard if gt={SNR}{3}]
{res/generic/bp_pegreg252x504.csv};
\end{axis}
\end{tikzpicture}
\caption{LDPC code (progressive edge growth construction) with $n=504, k=252$
\cite[\text{PEGReg252x504}]{mackay_enc}}
\end{subfigure}%
\vspace{5mm}
\begin{subfigure}[t]{\textwidth}
\centering
\begin{axis}[hide axis,
xmin=10, xmax=50,
ymin=0, ymax=0.4,
legend columns=1,
legend cell align={left},
legend style={draw=white!15!black}]
\addlegendimage{Turquoise, line width=1pt, mark=*}
\addlegendentry{\acs{LP} decoding using \acs{ADMM}}
\addlegendimage{RoyalPurple, line width=1pt, mark=*, solid}
\addlegendentry{\acs{BP} (200 iterations)}
\end{axis}
\end{tikzpicture}
\end{subfigure}
\caption{Comparison of the decoding performance of \ac{LP} decoding using \ac{ADMM}
and \ac{BP} for various codes}
\label{fig:admm:bp_multiple}
\end{figure}

\chapter{Theoretical Background}%
\label{chapter:theoretical_background}
In this chapter, the theoretical background necessary to understand the
decoding algorithms examined in this work is given.
First, the notation used is clarified.
The physical layer is detailed: the modulation scheme and the channel model used.
A short introduction to channel coding with binary linear codes and especially
\ac{LDPC} codes is given.
The established methods of decoding \ac{LDPC} codes are briefly explained.
Lastly, the general process of decoding using optimization techniques is described
and an overview of the utilized optimization methods is given.
Additionally, a shorthand notation will be used, denoting a set of indices as%
\hspace{5mm} m < n, \hspace{2mm} m,n\in\mathbb{Z}
.\end{align*}
%
In order to designate element-wise operations, in particular the \textit{Hadamard product}
and the \textit{Hadamard power}, the operator $\circ$ will be used:%
%
\begin{alignat*}{3}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Channel Model and Modulation}
\label{sec:theo:Preliminaries: Channel Model and Modulation}
In order to transmit a bit-word $\boldsymbol{c} \in \mathbb{F}_2^n$ of length
conducting this process, whereby \textit{data words} are mapped onto longer
\textit{codewords}, which carry redundant information.
\Ac{LDPC} codes have become especially popular, since they are able to
reach arbitrarily small probabilities of error at code rates up to the capacity
of the channel \cite[Sec. II.B.]{mackay_rediscovery}, while having a structure
that allows for very efficient decoding.
The lengths of the data words and codewords are denoted by $k\in\mathbb{N}$
@@ -97,7 +97,7 @@ the number of parity-checks:%
\boldsymbol{H}\boldsymbol{c}^\text{T} = \boldsymbol{0} \right\}
.\end{align*}
%
A data word $\boldsymbol{u} \in \mathbb{F}_2^k$ can be mapped onto a codeword
$\boldsymbol{c} \in \mathbb{F}_2^n$ using the \textit{generator matrix}
$\boldsymbol{G} \in \mathbb{F}_2^{k\times n}$:%
%
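As a small numerical illustration of this mapping (assuming the usual row-vector convention $\boldsymbol{c} = \boldsymbol{u}\boldsymbol{G}$ over $\mathbb{F}_2$), the following sketch encodes a data word with a systematic generator matrix; the specific matrix is hypothetical and chosen for illustration only, not taken from the thesis:

```python
import numpy as np

# Hypothetical systematic generator matrix of a (7,4) code over GF(2),
# used only to illustrate the mapping c = u G (mod 2).
G = np.array([
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
], dtype=np.uint8)

def encode(u: np.ndarray) -> np.ndarray:
    """Map a length-k data word u onto a length-n codeword over GF(2)."""
    return (u @ G) % 2

u = np.array([1, 0, 1, 1], dtype=np.uint8)
c = encode(u)
print(c)  # for a systematic code, the first k bits reproduce u
```

Since the matrix is systematic, the redundancy is carried entirely by the last $n - k$ parity bits.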
@@ -179,9 +179,9 @@ codewords:
&= \argmax_{c\in\mathcal{C}} \frac{f_{\boldsymbol{Y} \mid \boldsymbol{C}}
\left( \boldsymbol{y} \mid \boldsymbol{c} \right) p_{\boldsymbol{C}}
\left( \boldsymbol{c} \right)}{f_{\boldsymbol{Y}}\left( \boldsymbol{y} \right) } \\
% &= \argmax_{c\in\mathcal{C}} f_{\boldsymbol{Y} \mid \boldsymbol{C}}
% \left( \boldsymbol{y} \mid \boldsymbol{c} \right) p_{\boldsymbol{C}}
% \left( \boldsymbol{c} \right) \\
&= \argmax_{c\in\mathcal{C}}f_{\boldsymbol{Y} \mid \boldsymbol{C}}
\left( \boldsymbol{y} \mid \boldsymbol{c} \right)
.\end{align*}
@@ -204,7 +204,7 @@ Each row of $\boldsymbol{H}$, which represents one parity-check, is viewed as a
Each component of the codeword $\boldsymbol{c}$ is interpreted as a \ac{VN}.
The relationship between \acp{CN} and \acp{VN} can then be plotted by noting
which components of $\boldsymbol{c}$ are considered for which parity-check.
Figure \ref{fig:theo:tanner_graph} shows the Tanner graph for the
(7,4) Hamming code, which has the following parity-check matrix
\cite[Example 5.7.]{ryan_lin_2009}:%
%
@@ -263,7 +263,7 @@ Figure \ref{fig:theo:tanner_graph} shows the tanner graph for the
\draw (cn3) -- (c7);
\end{tikzpicture}
\caption{Tanner graph for the (7,4) Hamming code}
\label{fig:theo:tanner_graph}
\end{figure}%
%
@@ -285,15 +285,16 @@ Message passing algorithms are based on the notion of passing messages between
\acp{CN} and \acp{VN}.
\Ac{BP} is one such algorithm that is commonly used to decode \ac{LDPC} codes.
It aims to compute the posterior probabilities
$p_{C_i \mid \boldsymbol{Y}}\left(c_i = 1 | \boldsymbol{y} \right),\hspace{2mm} i\in\mathcal{I}$,
see \cite[Sec. III.]{mackay_rediscovery}, and use them to calculate the estimate
$\hat{\boldsymbol{c}}$.
For cycle-free graphs this goal is reached after a finite
number of steps and \ac{BP} is equivalent to \ac{MAP} decoding.
When the graph contains cycles, however, \ac{BP} only approximates the \ac{MAP} probabilities
and is sub-optimal.
This leads to generally worse performance than \ac{MAP} decoding for practical codes.
Additionally, an \textit{error floor} appears for very high \acp{SNR}, making
the use of \ac{BP} impractical for applications where a very low error rate is
desired \cite[Sec. 15.3]{ryan_lin_2009}.
Another popular decoding method for \ac{LDPC} codes is the
\textit{min-sum algorithm}.
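As a hedged sketch of the check node operation that gives the min-sum algorithm its name (the standard textbook form, not code from this thesis): each outgoing message to a variable node combines the sign product and the minimum magnitude of the other incoming LLRs.

```python
import numpy as np

# Standard min-sum check node update (illustrative sketch): the message
# from a check node to variable node j is the product of the signs of all
# other incoming LLRs times the minimum of their magnitudes.
def min_sum_cn_update(llrs_in: np.ndarray) -> np.ndarray:
    """Compute all outgoing CN messages from the incoming VN-to-CN LLRs."""
    n = len(llrs_in)
    out = np.empty(n)
    for j in range(n):
        others = np.delete(llrs_in, j)        # exclude the target edge
        sign = np.prod(np.sign(others))       # parity of the other signs
        out[j] = sign * np.min(np.abs(others))
    return out

msgs = min_sum_cn_update(np.array([2.0, -0.5, 1.5]))
print(msgs)
```

Replacing the exact sum-product check node rule with this min/sign approximation is what makes the algorithm cheaper than full \ac{BP}.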
@@ -341,7 +342,7 @@ In contrast to the established message-passing decoding algorithms,
the perspective then changes from observing the decoding process in its
Tanner graph representation with \acp{VN} and \acp{CN} (as shown in figure \ref{fig:dec:tanner})
to a spatial representation (figure \ref{fig:dec:spatial}),
where the codewords are some of the vertices of a hypercube.
The goal is to find the point $\tilde{\boldsymbol{c}}$,
which minimizes the objective function $g$.
@@ -457,29 +458,38 @@ which minimizes the objective function $g$.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{A Short Introduction to the Proximal Gradient Method and ADMM}
\label{sec:theo:Optimization Methods}
In this section, the general ideas behind the optimization methods used in
this work are outlined.
The application of these optimization methods to channel decoding
will be discussed in later chapters.
Two methods are introduced, the \textit{proximal gradient method} and
\ac{ADMM}.
\textit{Proximal algorithms} are algorithms for solving convex optimization
problems that rely on the use of \textit{proximal operators}.
The proximal operator $\textbf{prox}_{\lambda f} : \mathbb{R}^n \rightarrow \mathbb{R}^n$
of a function $f:\mathbb{R}^n \rightarrow \mathbb{R}$ is defined by
\cite[Sec. 1.1]{proximal_algorithms}%
%
\begin{align*}
\textbf{prox}_{\lambda f}\left( \boldsymbol{v} \right)
= \argmin_{\boldsymbol{x} \in \mathbb{R}^n} \left(
f\left( \boldsymbol{x} \right) + \frac{1}{2\lambda}\lVert \boldsymbol{x}
- \boldsymbol{v} \rVert_2^2 \right)
.\end{align*}
%
This operator computes a point that is a compromise between minimizing $f$
and staying in the proximity of $\boldsymbol{v}$.
The parameter $\lambda$ determines how heavily each term is weighted.
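For a concrete instance (an illustrative example, not taken from the thesis): for $f(x) = |x|$ the proximal operator has the well-known closed form of soft thresholding, which the sketch below checks against a brute-force minimization of the defining objective.

```python
import numpy as np

# prox of f(x) = |x| in closed form: soft thresholding,
#   prox_{lam*f}(v) = sign(v) * max(|v| - lam, 0).
def prox_abs(v: float, lam: float) -> float:
    return float(np.sign(v) * max(abs(v) - lam, 0.0))

# Brute-force check: minimize f(x) + (1/(2*lam)) * (x - v)^2 on a grid.
def prox_numeric(v: float, lam: float) -> float:
    xs = np.linspace(-5.0, 5.0, 200001)
    obj = np.abs(xs) + (xs - v) ** 2 / (2.0 * lam)
    return float(xs[np.argmin(obj)])

v, lam = 1.3, 0.5
print(prox_abs(v, lam))       # a compromise between 0 (minimizer of f) and v
print(prox_numeric(v, lam))
```

Both computations move $v$ towards the minimizer of $f$ by at most $\lambda$, illustrating the compromise described above.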
The proximal gradient method is an iterative optimization method
utilizing proximal operators, used to solve problems of the form%
%
\begin{align*}
\underset{\boldsymbol{x} \in \mathbb{R}^n}{\text{minimize}}\hspace{5mm}
f\left( \boldsymbol{x} \right) + g\left( \boldsymbol{x} \right)
\end{align*}
%
and consists of two steps: minimizing $f$ with gradient descent
@@ -492,14 +502,14 @@ and minimizing $g$ using the proximal operator
,\end{align*}
%
Since $g$ is minimized with the proximal operator and is thus not required
to be differentiable, it can be used to encode the constraints of the optimization problem
(e.g., in the form of an \textit{indicator function}, as mentioned in
\cite[Sec. 1.2]{proximal_algorithms}).
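The two alternating steps can be sketched on a scalar toy problem (an assumed example, not from the thesis): $f(x) = \tfrac{1}{2}(x-3)^2$ is handled by a gradient step and $g(x) = |x|$ by its proximal operator (soft thresholding); the minimizer of $f + g$ is $x^* = 2$.

```python
import numpy as np

def soft_threshold(v: float, lam: float) -> float:
    # proximal operator of g(x) = |x|
    return float(np.sign(v) * max(abs(v) - lam, 0.0))

def proximal_gradient(x0: float = 0.0, step: float = 0.5,
                      iters: int = 100) -> float:
    """Minimize f(x) + g(x) with f(x) = 0.5*(x-3)^2 and g(x) = |x|."""
    x = x0
    for _ in range(iters):
        grad_f = x - 3.0                             # gradient step on f
        x = soft_threshold(x - step * grad_f, step)  # prox step on g
    return x

print(proximal_gradient())  # -> 2.0, the minimizer of f + g
```

The iteration contracts towards the fixed point $x^* = 2$, where the gradient of $f$ is balanced by the subgradient of $g$.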
\ac{ADMM} is another optimization method.
In this thesis it will be used to solve a \textit{linear program}, which
is a special type of convex optimization problem in which the objective function
is linear and the constraints consist of linear equalities and inequalities.
Generally, any linear program can be expressed in \textit{standard form}%
\footnote{The inequality $\boldsymbol{x} \ge \boldsymbol{0}$ is to be
interpreted componentwise.}
@@ -507,38 +517,53 @@ interpreted componentwise.}
%
\begin{alignat}{3}
\begin{alignedat}{3}
\underset{\boldsymbol{x}\in\mathbb{R}^n}{\text{minimize }}\hspace{2mm}
&& \boldsymbol{\gamma}^\text{T} \boldsymbol{x} \\
\text{subject to }\hspace{2mm} && \boldsymbol{A}\boldsymbol{x} & = \boldsymbol{b} \\
&& \boldsymbol{x} & \ge \boldsymbol{0},
\end{alignedat}
\label{eq:theo:admm_standard}
\end{alignat}%
%
where $\boldsymbol{x}, \boldsymbol{\gamma} \in \mathbb{R}^n$, $\boldsymbol{b} \in \mathbb{R}^m$,
and $\boldsymbol{A}\in\mathbb{R}^{m \times n}$.
A technique called \textit{Lagrangian relaxation} can then be applied
\cite[Sec. 11.4]{intro_to_lin_opt_book}.
First, some of the constraints are moved into the objective function itself
and weights $\boldsymbol{\lambda}$ are introduced. A new, relaxed problem
is formulated as
%
\begin{align}
\begin{aligned}
\underset{\boldsymbol{x}\in\mathbb{R}^n}{\text{minimize }}\hspace{2mm}
& \boldsymbol{\gamma}^\text{T}\boldsymbol{x}
+ \boldsymbol{\lambda}^\text{T}\left(
\boldsymbol{A}\boldsymbol{x} - \boldsymbol{b}\right) \\
\text{subject to }\hspace{2mm} & \boldsymbol{x} \ge \boldsymbol{0},
\end{aligned}
\label{eq:theo:admm_relaxed}
\end{align}%
%
the new objective function being the \textit{Lagrangian}%
\footnote{
Depending on what literature is consulted, the definition of the Lagrangian differs
in the order of $\boldsymbol{A}\boldsymbol{x}$ and $\boldsymbol{b}$.
As will subsequently be seen, however, the only property of the Lagrangian having
any bearing on the optimization process is that minimizing it gives a lower bound
on the optimal objective of the original problem.
This property is satisfied no matter the order of the terms, and the order
chosen here is the one used in the \ac{LP} decoding literature making use of
\ac{ADMM}.
}%
%
\begin{align*}
\mathcal{L}\left( \boldsymbol{x}, \boldsymbol{\lambda} \right)
= \boldsymbol{\gamma}^\text{T}\boldsymbol{x}
+ \boldsymbol{\lambda}^\text{T}\left(
\boldsymbol{A}\boldsymbol{x} - \boldsymbol{b}\right)
.\end{align*}%
%
This problem is not directly equivalent to the original one, as the
solution now depends on the choice of the \textit{Lagrange multipliers}
$\boldsymbol{\lambda}$.
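To make the dependence on $\boldsymbol{\lambda}$ tangible, here is a toy numeric sketch (the LP is an assumed example, not from the thesis): for the program minimizing $x_1 + 2x_2$ subject to $x_1 + x_2 = 1$, $\boldsymbol{x} \ge \boldsymbol{0}$, whose optimum is $\boldsymbol{x}^* = (1, 0)$ with value $1$, each choice of $\boldsymbol{\lambda}$ yields a lower bound on the primal optimum.

```python
import numpy as np

gamma = np.array([1.0, 2.0])    # objective coefficients
A = np.array([[1.0, 1.0]])      # equality constraint A x = b
b = np.array([1.0])

def dual_function(lam: np.ndarray) -> float:
    """g(lam) = min_{x >= 0} gamma^T x + lam^T (A x - b)."""
    coeffs = gamma + A.T @ lam  # per-component slope of L in x
    if np.any(coeffs < 0):      # then L is unbounded below over x >= 0
        return float("-inf")
    return float(0.0 - lam @ b) # minimum over x >= 0 is attained at x = 0

# Weak duality: every lam gives a lower bound on the primal optimum 1;
# the bound tightens as lam approaches the dual optimum lam = -1.
for lam in ([0.0], [-0.5], [-1.0]):
    print(lam[0], dual_function(np.array(lam)))  # bounds 0.0, 0.5, 1.0
```

At $\lambda = -1$ the bound equals the primal optimum, previewing the strong duality statement below.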
@@ -562,12 +587,12 @@ Furthermore, for uniquely solvable linear programs \textit{strong duality}
always holds \cite[Theorem 4.4]{intro_to_lin_opt_book}.
This means that not only is it a lower bound, the tightest lower
bound actually reaches the value itself:
in other words, with the optimal choice of $\boldsymbol{\lambda}$,
the optimal objectives of the problems (\ref{eq:theo:admm_relaxed})
and (\ref{eq:theo:admm_standard}) have the same value, i.e.,
%
\begin{align*}
\max_{\boldsymbol{\lambda}\in\mathbb{R}^m} \, \min_{\boldsymbol{x} \ge \boldsymbol{0}}
\mathcal{L}\left( \boldsymbol{x}, \boldsymbol{\lambda} \right)
= \min_{\substack{\boldsymbol{x} \ge \boldsymbol{0} \\ \boldsymbol{A}\boldsymbol{x}
= \boldsymbol{b}}}
@@ -577,7 +602,7 @@ and (\ref{eq:theo:admm_standard}) have the same value.
Thus, we can define the \textit{dual problem} as the search for the tightest lower bound:%
%
\begin{align}
\underset{\boldsymbol{\lambda}\in\mathbb{R}^m}{\text{maximize }}\hspace{2mm}
& \min_{\boldsymbol{x} \ge \boldsymbol{0}} \mathcal{L}
\left( \boldsymbol{x}, \boldsymbol{\lambda} \right)
\label{eq:theo:dual}
@@ -600,7 +625,7 @@ using equation (\ref{eq:theo:admm_obtain_primal}); then, update $\boldsymbol{\la
using gradient descent \cite[Sec. 2.1]{distr_opt_book}:%
%
\begin{align*}
\boldsymbol{x} &\leftarrow \argmin_{\boldsymbol{x} \ge \boldsymbol{0}} \mathcal{L}\left(
\boldsymbol{x}, \boldsymbol{\lambda} \right) \\
\boldsymbol{\lambda} &\leftarrow \boldsymbol{\lambda}
+ \alpha\left( \boldsymbol{A}\boldsymbol{x} - \boldsymbol{b} \right),
@@ -608,12 +633,12 @@ using gradient descent \cite[Sec. 2.1]{distr_opt_book}:%
.\end{align*}
%
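The two dual ascent updates can be sketched numerically (a toy example, assumed rather than taken from the thesis; a strongly convex objective $f(\boldsymbol{x}) = \tfrac{1}{2}\lVert\boldsymbol{x}\rVert_2^2$ is used instead of a linear one so that the inner minimization has a unique closed-form solution):

```python
import numpy as np

# minimize 0.5*||x||^2  subject to  x1 + x2 = 1; solution x* = (0.5, 0.5).
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
alpha = 0.3                          # gradient ascent step size

lam = np.zeros(1)
for _ in range(200):
    x = -A.T @ lam                   # x-update: argmin_x L(x, lam) in closed form
    lam = lam + alpha * (A @ x - b)  # lam-update: gradient ascent on the dual
print(np.round(x, 6))                # -> [0.5 0.5]
```

The multiplier converges to the dual optimum $\lambda = -0.5$, and the corresponding $\boldsymbol{x}$ satisfies the relaxed constraint.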
The algorithm can be improved by observing that when the objective function
$g: \mathbb{R}^n \rightarrow \mathbb{R}$ is separable into a sum of
$N \in \mathbb{N}$ sub-functions
$g_i: \mathbb{R}^{n_i} \rightarrow \mathbb{R}$,
i.e., $g\left( \boldsymbol{x} \right) = \sum_{i=1}^{N} g_i
\left( \boldsymbol{x}_i \right)$,
where $\boldsymbol{x}_i\in\mathbb{R}^{n_i},\hspace{1mm} i\in [1:N]$ are subvectors of
$\boldsymbol{x}$, the Lagrangian is as well:
%
@@ -624,18 +649,18 @@ $\boldsymbol{x}$, the Lagrangian is as well:
\begin{align*}
\mathcal{L}\left( \left( \boldsymbol{x}_i \right)_{i=1}^N, \boldsymbol{\lambda} \right)
= \sum_{i=1}^{N} g_i\left( \boldsymbol{x}_i \right)
+ \boldsymbol{\lambda}^\text{T} \left(
\sum_{i=1}^{N} \boldsymbol{A}_i\boldsymbol{x}_i - \boldsymbol{b}\right)
.\end{align*}%
%
The matrices $\boldsymbol{A}_i \in \mathbb{R}^{m \times n_i}, \hspace{1mm} i \in [1:N]$
form a partition of $\boldsymbol{A}$, corresponding to
$\boldsymbol{A} = \begin{bmatrix}
\boldsymbol{A}_1 &
\ldots &
\boldsymbol{A}_N
\end{bmatrix}$.
The minimization of each term can happen in parallel, in a distributed
fashion \cite[Sec. 2.2]{distr_opt_book}.
In each minimization step, only one subvector $\boldsymbol{x}_i$ of
$\boldsymbol{x}$ is considered, regarding all other subvectors as being
@@ -643,7 +668,7 @@ constant.
This modified version of dual ascent is called \textit{dual decomposition}:
%
\begin{align*}
\boldsymbol{x}_i &\leftarrow \argmin_{\boldsymbol{x}_i \ge \boldsymbol{0}}\mathcal{L}\left(
\left( \boldsymbol{x}_i \right)_{i=1}^N, \boldsymbol{\lambda}\right)
\hspace{5mm} \forall i \in [1:N]\\
\boldsymbol{\lambda} &\leftarrow \boldsymbol{\lambda}
@@ -657,14 +682,15 @@ This modified version of dual ascent is called \textit{dual decomposition}:
It only differs in the use of an \textit{augmented Lagrangian}
$\mathcal{L}_\mu\left( \left( \boldsymbol{x} \right)_{i=1}^N, \boldsymbol{\lambda} \right)$
in order to strengthen the convergence properties.
The augmented Lagrangian extends the classical one with an additional penalty term
with the penalty parameter $\mu$:
%
\begin{align*}
\mathcal{L}_\mu \left( \left( \boldsymbol{x} \right)_{i=1}^N, \boldsymbol{\lambda} \right)
= \underbrace{\sum_{i=1}^{N} g_i\left( \boldsymbol{x}_i \right)
+ \boldsymbol{\lambda}^\text{T}\left(\sum_{i=1}^{N}
\boldsymbol{A}_i\boldsymbol{x}_i - \boldsymbol{b}\right)}
_{\text{Classical Lagrangian}}
+ \underbrace{\frac{\mu}{2}\left\Vert \sum_{i=1}^{N} \boldsymbol{A}_i\boldsymbol{x}_i
- \boldsymbol{b} \right\Vert_2^2}_{\text{Penalty term}},
\hspace{5mm} \mu > 0
@@ -674,21 +700,20 @@ The steps to solve the problem are the same as with dual decomposition, with the
condition that the step size be $\mu$:%
%
\begin{align*}
\boldsymbol{x}_i &\leftarrow \argmin_{\boldsymbol{x}_i \ge \boldsymbol{0}}\mathcal{L}_\mu\left(
\left( \boldsymbol{x} \right)_{i=1}^N, \boldsymbol{\lambda}\right)
\hspace{5mm} \forall i \in [1:N]\\
\boldsymbol{\lambda} &\leftarrow \boldsymbol{\lambda}
+ \mu\left( \sum_{i=1}^{N} \boldsymbol{A}_i\boldsymbol{x}_i
- \boldsymbol{b} \right),
\hspace{5mm} \mu > 0
.\end{align*}
%
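These alternating updates can be sketched numerically on a toy separable LP (an assumed example, not taken from the thesis): minimize $x_1 + 2x_2$ subject to $x_1 + x_2 = 1$, $\boldsymbol{x} \ge \boldsymbol{0}$, whose optimum is $\boldsymbol{x}^* = (1, 0)$. Each block $x_i$ is minimized over the augmented Lagrangian in closed form while the other is held constant, then the multiplier is updated with step $\mu$.

```python
import numpy as np

mu = 1.0                      # penalty parameter and multiplier step size
g = np.array([1.0, 2.0])      # separable linear objective g1(x1) + g2(x2)
x = np.zeros(2)
lam = 0.0

for _ in range(50):
    # x_i-update: minimize L_mu over x_i >= 0 with the other block fixed;
    # setting d/dx_i [g_i*x_i + lam*(x1+x2-1) + mu/2*(x1+x2-1)^2] = 0
    # and clipping at zero gives the closed-form updates below.
    x[0] = max(0.0, (-g[0] - lam) / mu + 1.0 - x[1])
    x[1] = max(0.0, (-g[1] - lam) / mu + 1.0 - x[0])
    # multiplier update with step size mu
    lam += mu * (x[0] + x[1] - 1.0)

print(x, lam)  # x converges to [1, 0], lam to the dual optimum -1
```

The penalty term keeps each block update well-defined despite the linear objective, which is exactly the advantage over plain dual decomposition noted above.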
In subsequent chapters, the decoding problem will be reformulated as an
optimization problem using two different methodologies.
In chapter \ref{chapter:proximal_decoding}, a non-convex optimization approach
is chosen and addressed using the proximal gradient method.
In chapter \ref{chapter:lp_dec_using_admm}, an \ac{LP}-based optimization problem is
formulated and solved using \ac{ADMM}.

@@ -14,7 +14,7 @@
\thesisSupervisor{Dr.-Ing. Holger Jäkel}
\thesisStartDate{24.10.2022}
\thesisEndDate{24.04.2023}
\thesisSignatureDate{24.04.2023}
\thesisLanguage{english}
\setlanguage
@@ -35,6 +35,7 @@
\usetikzlibrary{spy}
\usetikzlibrary{shapes.geometric}
\usetikzlibrary{arrows.meta,arrows}
\tikzset{>=latex}
\pgfplotsset{compat=newest}
\usepgfplotslibrary{colorbrewer}
@@ -209,6 +210,7 @@
%
% 6. Conclusion
\include{chapters/acknowledgements}
\tableofcontents
\cleardoublepage % make sure multipage TOCs are numbered correctly
@@ -220,7 +222,7 @@
\include{chapters/comparison}
% \include{chapters/discussion}
\include{chapters/conclusion}
% \include{chapters/appendix}
%\listoffigures