Finished first version of theoretical comparison text

This commit is contained in:
Andreas Tsouchlos 2023-04-12 22:54:33 +02:00
parent 55c9e2808b
commit a1a1fa1f71
3 changed files with 180 additions and 63 deletions

View File

@ -810,6 +810,10 @@ $\gamma \in \left\{ 0.01, 0.05, 0.15 \right\}$.
\end{figure}
\chapter{Proximal Decoding as a Message Passing Algorithm}
\label{chapter:Proximal Decoding as a Message Passing Algorithm}
%\chapter{\acs{LP} Decoding using \acs{ADMM} as a Proximal Algorithm}%
%\label{chapter:LD Decoding using ADMM as a Proximal Algorithm}
%

View File

@ -1,15 +1,11 @@
\chapter{Comparison of Proximal Decoding and \acs{LP} Decoding using \acs{ADMM}}%
\label{chapter:comparison}
TODO
In this chapter, proximal decoding and \ac{LP} decoding using \ac{ADMM} are compared.
First, the two algorithms are compared on a theoretical level.
Subsequently, their respective simulation results are examined and the
differences are interpreted on the basis of their theoretical structure.
\todo{Note: This chapter is currently only a very rough draft and is not yet ready for correction}
%In this chapter, proximal decoding and \ac{LP} Decoding using \ac{ADMM} are compared.
%First the two algorithms are compared on a theoretical basis.
%Subsequently, their respective simulation results are examined and their
%differences interpreted on the basis of their theoretical structure.
%
%some similarities between the proximal decoding algorithm
%and \ac{LP} decoding using \ac{ADMM} are pointed out.
%The two algorithms are compared and their different computational and decoding
@ -21,19 +17,39 @@ TODO
\label{sec:comp:theo}
\ac{ADMM} and the proximal gradient method can both be expressed in terms of
proximal operators.
Both are iterative methods consisting of two
alternating steps.
In both cases each step minimizes one distinct part of the objective function.
They do, however, have some fundamental differences.
In figure \ref{fig:ana:theo_comp_alg} the two algorithms are juxtaposed in their
proximal operator form, in conjunction with the optimization problems they
are meant to solve.%
proximal operators \cite[Sec. 4.4]{proximal_algorithms}.
When \ac{ADMM} is used specifically to solve the \ac{LP} decoding problem,
however, such a purely proximal-operator formulation is not quite possible
because of the multiple constraints.
In spite of that, the two algorithms still show some striking similarities.
To see the first of these similarities, the \ac{LP} decoding problem in
equation (\ref{eq:lp:relaxed_formulation}) can be slightly rewritten using the
\textit{indicator functions} $g_j : \mathbb{R}^{d_j} \rightarrow
\left\{ 0, +\infty \right\} \hspace{1mm}, j\in\mathcal{J}$ for the polytopes
$\mathcal{P}_{d_j}, \hspace{1mm} j\in\mathcal{J}$, defined as%
%
\begin{figure}[H]
\begin{align*}
g_j\left( \boldsymbol{t} \right) := \begin{cases}
0, & \boldsymbol{t} \in \mathcal{P}_{d_j} \\
+\infty, & \boldsymbol{t} \not\in \mathcal{P}_{d_j}
\end{cases}
,\end{align*}
%
by moving the constraints into the objective function, as shown in figure
\ref{fig:ana:theo_comp_alg:admm}.
Both algorithms are iterative, consisting of two alternating steps.
The objective functions of both problems are similar in that they
both comprise two parts: one associated with the likelihood that a given
codeword was sent and one associated with the constraints the decoding process
is subject to.
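To make the role of the indicator functions concrete, the following minimal
Python sketch evaluates $g_j$ via the facet description of the parity
polytope: a point $\boldsymbol{t} \in \left[ 0, 1 \right]^{d_j}$ lies in
$\mathcal{P}_{d_j}$ if and only if the most violated odd-set inequality,
obtained by thresholding at $0.5$, is satisfied.
The function names are purely illustrative and not part of either decoder.
\begin{lstlisting}[language=Python]
import numpy as np

def in_parity_polytope(t, tol=1e-9):
    """Membership test for the parity polytope P_d, i.e. g_j(t) == 0.

    Checks t in [0,1]^d together with the single most violated odd-set
    facet inequality sum_{i in V} t_i - sum_{i not in V} t_i <= |V| - 1,
    where V is obtained by thresholding at 0.5 (adjusted to odd size).
    """
    t = np.asarray(t, dtype=float)
    if np.any(t < -tol) or np.any(t > 1.0 + tol):
        return False
    V = t > 0.5
    if V.sum() % 2 == 0:                      # enforce odd |V|
        V[np.argmin(np.abs(t - 0.5))] ^= True
    return t[V].sum() - t[~V].sum() <= V.sum() - 1 + tol

def g(t):
    """Indicator function g_j: 0 inside the polytope, +inf outside."""
    return 0.0 if in_parity_polytope(t) else np.inf
\end{lstlisting}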
%
\begin{figure}[h]
\centering
\begin{subfigure}{0.48\textwidth}
\begin{subfigure}{0.42\textwidth}
\centering
\begin{align*}
@ -45,10 +61,10 @@ are meant to solve.%
\end{align*}
\begin{genericAlgorithm}[caption={}, label={},
basicstyle=\fontsize{11}{17}\selectfont
basicstyle=\fontsize{10}{18}\selectfont
]
Initialize $\boldsymbol{r}, \boldsymbol{s}, \omega, \gamma$
while stopping criterion not satisfied do
while stopping criterion unfulfilled do
$\boldsymbol{r} \leftarrow \boldsymbol{r}
+ \omega \nabla L\left( \boldsymbol{y} \mid \boldsymbol{s} \right) $
$\boldsymbol{s} \leftarrow
@ -58,44 +74,42 @@ end while
return $\boldsymbol{s}$
\end{genericAlgorithm}
\caption{Proximal gradient method}
\caption{Proximal decoding}
\label{fig:ana:theo_comp_alg:prox}
\end{subfigure}\hfill%
\begin{subfigure}{0.48\textwidth}
\begin{subfigure}{0.55\textwidth}
\centering
\begin{align*}
\text{minimize}\hspace{5mm} &
\underbrace{\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
_{\text{Likelihood}}
+ \underbrace{g\left( \boldsymbol{T}\tilde{\boldsymbol{c}} \right) }
+ \underbrace{\sum\nolimits_{j\in\mathcal{J}} g_j\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} \right) }
_{\text{Constraints}} \\
\text{subject to}\hspace{5mm} &
\tilde{\boldsymbol{c}} \in \mathbb{R}^n
% \boldsymbol{T}_j\tilde{\boldsymbol{c}} = \boldsymbol{z}_j\hspace{3mm}
% \forall j\in\mathcal{J}
\end{align*}
\begin{genericAlgorithm}[caption={}, label={},
basicstyle=\fontsize{11}{17}\selectfont
basicstyle=\fontsize{10}{18}\selectfont
]
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \nu$
while stopping criterion not satisfied do
$\tilde{\boldsymbol{c}} \leftarrow \textbf{prox}_{
\scaleto{\nu \cdot \boldsymbol{\gamma}^{\text{T}}\tilde{\boldsymbol{c}}}{8.5pt}}
\left( \tilde{\boldsymbol{c}}
- \frac{\mu}{\lambda}\boldsymbol{T}^\text{T}\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
- \boldsymbol{z} + \boldsymbol{u} \right) \right)$
$\boldsymbol{z} \leftarrow \textbf{prox}_{\scaleto{g}{7pt}}
\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
+ \boldsymbol{u} \right)$
$\boldsymbol{u} \leftarrow \boldsymbol{u}
+ \tilde{\boldsymbol{c}} - \boldsymbol{z}$
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \rho$
while stopping criterion unfulfilled do
$\tilde{\boldsymbol{c}} \leftarrow \argmin_{\tilde{\boldsymbol{c}}}
\left( \boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}
+ \frac{\rho}{2}\sum_{j\in\mathcal{J}} \left\Vert
\boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j
+ \boldsymbol{u}_j \right\Vert_2^2 \right)$
$\boldsymbol{z}_j \leftarrow \textbf{prox}_{g_j}
\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}}
+ \boldsymbol{u}_j \right), \hspace{5mm}\forall j\in\mathcal{J}$
$\boldsymbol{u}_j \leftarrow \boldsymbol{u}_j
+ \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j, \hspace{5mm}\forall j\in\mathcal{J}$
end while
return $\tilde{\boldsymbol{c}}$
\end{genericAlgorithm}
\caption{\ac{ADMM}}
\caption{\ac{LP} decoding using \ac{ADMM}}
\label{fig:ana:theo_comp_alg:admm}
\end{subfigure}%
@ -104,10 +118,6 @@ return $\tilde{\boldsymbol{c}}$
\label{fig:ana:theo_comp_alg}
\end{figure}%
%
\noindent The objective functions of both problems are similar in that they
both comprise two parts: one associated with the likelihood that a given
codeword was sent and one associated with the constraints the codeword is
subject to.
Their major difference is that while with proximal decoding the constraints
are regarded in a global context, considering all parity checks at the same
@ -124,19 +134,129 @@ with \ac{ADMM} it reduces to a number of projections onto the parity polytopes
$\mathcal{P}_{d_j}$ which always provide exact results.
The contrasting treatment of the constraints (global and approximate with
proximal decoding, local and exact with \ac{ADMM}) also leads to different
prospects when the decoding process gets stuck in a local minimum.
proximal decoding as opposed to local and exact with \ac{LP} decoding using
\ac{ADMM}) also leads to different behavior when the decoding process gets
stuck in a local minimum.
With proximal decoding this occurs due to the approximate nature of the
calculation, whereas with \ac{ADMM} it occurs due to the approximate
formulation of the constraints - not depending on the optimization method
itself.
The advantage which arises because of this when using \ac{ADMM} is that
it can be easily detected, when the algorithm gets stuck - the algorithm
returns a pseudocodeword, the components of which are fractional.
\todo{Additional constraints can then be successively added, until a valid
codeword is returned}
calculation, whereas with \ac{LP} decoding it occurs due to the approximate
formulation of the constraints, independent of the optimization method
itself.
The resulting advantage of \ac{LP} decoding is that it is easily detected
when the algorithm gets stuck: it returns a solution corresponding to a
pseudocodeword, the components of which are fractional.
Moreover, when a valid codeword is returned, it is also the \ac{ML} codeword.
This means that additional redundant parity checks can be added successively
until the returned codeword is valid and thus the \ac{ML} solution is found
\cite[Sec. IV.]{alp}.
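Detecting such a pseudocodeword amounts to checking the returned solution for
fractional components.
The following is a minimal Python sketch of this adaptive scheme; the solver
\texttt{lp\_decode} and the cut-generation helper
\texttt{add\_redundant\_checks} are hypothetical names, not functions defined
in this work.
\begin{lstlisting}[language=Python]
import numpy as np

def is_integral(c_tilde, tol=1e-5):
    """True iff every component of c_tilde is numerically 0 or 1;
    a fractional component indicates a pseudocodeword."""
    c_tilde = np.asarray(c_tilde, dtype=float)
    return bool(np.all(np.minimum(c_tilde, 1.0 - c_tilde) < tol))

def adaptive_lp_decode(gamma, H, max_rounds=20):
    """Sketch of the adaptive scheme described above: re-solve the
    relaxation, adding redundant parity checks until the solution is
    integral (and therefore the ML codeword).

    lp_decode and add_redundant_checks are assumed, hypothetical helpers.
    """
    for _ in range(max_rounds):
        c = lp_decode(gamma, H)               # solve current relaxation
        if is_integral(c):
            return np.round(c)                # ML codeword found
        H = add_redundant_checks(H, c)        # tighten the relaxation
    return None                               # report decoding failure
\end{lstlisting}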
\todo{Compare time complexity using Big-O notation}
In terms of time complexity, the two decoding algorithms are comparable.
Each of the operations required for proximal decoding can be performed
in linear time for \ac{LDPC} codes (see section \ref{subsec:prox:comp_perf}).
The same is true for the $\tilde{\boldsymbol{c}}$- and $\boldsymbol{u}$-update
steps of \ac{LP} decoding using \ac{ADMM}, while
the projection step has a worst-case time complexity of
$\mathcal{O}\left( n^2 \right)$ and an average complexity of
$\mathcal{O}\left( n \right)$ (see section TODO, \cite[Sec. VIII.]{lautern}).
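The projection step that dominates the per-iteration cost can be sketched in
Python as follows.
The sketch uses the common two-phase approach (projection onto the unit cube,
then projection onto the single most violated odd-set facet, solved by
bisection on the dual variable); it is meant as an illustration under these
assumptions, not as the exact algorithm analyzed in
\cite[Sec. VIII.]{lautern}.
\begin{lstlisting}[language=Python]
import numpy as np

def project_parity_polytope(v, iters=50):
    """Euclidean projection onto the parity polytope P_d (sketch).

    Phase 1: clip to [0,1]^d and locate the most violated odd-set facet.
    Phase 2: if that facet is violated, project onto it (intersected
    with the cube) via bisection on the dual variable beta.
    """
    v = np.asarray(v, dtype=float)
    theta = np.clip(v, 0.0, 1.0)
    V = theta > 0.5
    if V.sum() % 2 == 0:                      # facet sets have odd size
        V[np.argmin(np.abs(theta - 0.5))] ^= True
    a = np.where(V, 1.0, -1.0)                # facet normal
    rhs = V.sum() - 1.0
    if a @ theta <= rhs:
        return theta                          # clipped point lies in P_d
    lo, hi = 0.0, np.abs(v).max() + 1.0       # a @ clip(v - b*a, 0, 1)
    for _ in range(iters):                    # is non-increasing in b
        beta = 0.5 * (lo + hi)
        if a @ np.clip(v - beta * a, 0.0, 1.0) > rhs:
            lo = beta
        else:
            hi = beta
    return np.clip(v - 0.5 * (lo + hi) * a, 0.0, 1.0)
\end{lstlisting}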
Both algorithms can be understood as message-passing algorithms: \ac{LP}
decoding using \ac{ADMM} similarly to \cite[Sec. III. D.]{original_admm}
or \cite[Sec. II. B.]{efficient_lp_dec_admm}, and proximal decoding as shown in
appendix \ref{chapter:Proximal Decoding as a Message Passing Algorithm}.
The algorithms in their message-passing form are depicted in figure
\ref{fig:comp:message_passing}.
$M_{j\to i}$ denotes the message transmitted from \ac{CN} $j$ to \ac{VN} $i$.
$M_{j\to}$ signifies the special case where a \ac{CN} transmits the same
message to all of its \acp{VN}.
%
\begin{figure}[h]
\centering
\begin{subfigure}{0.5\textwidth}
\centering
\begin{genericAlgorithm}[caption={}, label={},
% basicstyle=\fontsize{10}{16}\selectfont
]
Initialize $\boldsymbol{r}, \boldsymbol{s}, \omega, \gamma$
while stopping criterion unfulfilled do
for j in $\mathcal{J}$ do
$p_j \leftarrow \prod_{i\in N_c\left( j \right) } r_i $
$M_{j\to} \leftarrow p_j^2 - p_j$|\Suppressnumber|
|\vspace{0.22mm}\Reactivatenumber|
end for
for i in $\mathcal{I}$ do
$s_i \leftarrow s_i + \gamma \left[ 4\left( s_i^2 - 1 \right)s_i
\phantom{\frac{4}{s_i}}\right.$|\Suppressnumber|
|\Reactivatenumber|$\left.+ \frac{4}{s_i}\sum_{j\in N_v\left( i \right) }
M_{j\to} \right] $
$r_i \leftarrow r_i + \omega \left( s_i - y_i \right)$
end for
end while
\end{genericAlgorithm}
\caption{Proximal decoding}
\label{fig:comp:message_passing:proximal}
\end{subfigure}%
\begin{subfigure}{0.5\textwidth}
\centering
\begin{genericAlgorithm}[caption={}, label={},
% basicstyle=\fontsize{10}{16}\selectfont
]
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \rho$
while stopping criterion unfulfilled do
for j in $\mathcal{J}$ do
$\boldsymbol{z}_j \leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{u}_j\right)$
$\boldsymbol{u}_j \leftarrow \boldsymbol{u}_j + \boldsymbol{T}_j\tilde{\boldsymbol{c}}
- \boldsymbol{z}_j$
$M_{j\to i} \leftarrow \left( z_j \right)_i - \left( u_j \right)_i,
\hspace{3mm} \forall i \in N_c\left( j \right) $
end for
for i in $\mathcal{I}$ do
$\tilde{c}_i \leftarrow \frac{1}{d_i}
\left(\sum_{j\in N_v\left( i \right) } M_{j\to i}
- \frac{\gamma_i}{\rho} \right)$|\Suppressnumber|
|\vspace{7mm}\Reactivatenumber|
end for
end while
\end{genericAlgorithm}
\caption{\ac{LP} decoding using \ac{ADMM}}
\label{fig:comp:message_passing:admm}
\end{subfigure}%
\caption{Proximal decoding and \ac{LP} decoding using \ac{ADMM}
as message-passing algorithms}
\label{fig:comp:message_passing}
\end{figure}%
%
It is evident that while the two algorithms are very similar in their general
structure, with \ac{LP} decoding using \ac{ADMM} multiple messages have to be
computed for each check node (line 6 in figure
\ref{fig:comp:message_passing:admm}), whereas
with proximal decoding the same message is transmitted to all \acp{VN}
(line 5 of figure \ref{fig:comp:message_passing:proximal}).
This means that while both algorithms have an average time complexity of
$\mathcal{O}\left( n \right)$, more arithmetic operations are required in the
\ac{ADMM} case.
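For illustration, one iteration of the proximal decoder in this
message-passing form can be written out as the following Python sketch.
Adjacency lists for $N_c\left( j \right)$ and $N_v\left( i \right)$ are
assumed, and the update rules simply mirror the pseudocode in figure
\ref{fig:comp:message_passing:proximal}.
\begin{lstlisting}[language=Python]
import numpy as np

def proximal_mp_iteration(r, s, y, checks, var_checks, gamma, omega):
    """One proximal-decoding iteration in message-passing form.

    checks[j]     -- variable indices in N_c(j)
    var_checks[i] -- check indices in N_v(i)
    """
    # Check-node step: each CN j broadcasts a single message M_j
    M = np.empty(len(checks))
    for j, vn in enumerate(checks):
        p = np.prod(r[vn])
        M[j] = p * p - p
    # Variable-node step: update s_i (assumes s_i != 0, as in the
    # pseudocode), then the residual r_i
    for i, cn in enumerate(var_checks):
        s[i] += gamma * (4.0 * (s[i] ** 2 - 1.0) * s[i]
                         + 4.0 / s[i] * M[cn].sum())
        r[i] += omega * (s[i] - y[i])
    return r, s
\end{lstlisting}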
In conclusion, the two algorithms have a very similar structure, where the
parts of the objective function relating to the likelihood and to the
constraints are minimized in an alternating fashion.
With proximal decoding this minimization is performed for all constraints at once
in an approximate manner, while with \ac{LP} decoding using \ac{ADMM} it is
performed for each constraint individually and with exact results.
In terms of time complexity, both algorithms are, on average, linear with
respect to $n$, although for \ac{LP} decoding using \ac{ADMM} significantly
more arithmetic operations are necessary in each iteration.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Comparison of Simulation Results}%
\label{sec:comp:res}
\begin{itemize}
\item The comparison of actual implementations is always debatable /
@ -153,11 +273,3 @@ codeword is returned}
\item Proximal decoding faster than \ac{ADMM} $\rightarrow$ surprising
(larger number of iterations before convergence? More values to compute for \ac{ADMM}?)
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Comparison of Simulation Results}%
\label{sec:comp:res}
TODO

View File

@ -1169,6 +1169,7 @@ an invalid codeword.
\subsection{Computational Performance}
\label{subsec:prox:comp_perf}
In order to determine the time complexity of proximal decoding for an
\ac{AWGN} channel, the decoding process as depicted in algorithm
@ -1180,7 +1181,7 @@ with $n$, the two
recursive steps in lines 3 and 4 have time complexity
$\mathcal{O}\left( n \right)$ \cite[Sec. 4.1]{proximal_paper}.
The $\text{sign}$ operation in line 5 also has $\mathcal{O}\left( n \right) $
time complexity as it only depends on the $n$ components of $\boldsymbol{s}$.
time complexity, as it only depends on the $n$ components of $\boldsymbol{s}$.
Given the sparsity of the matrix $\boldsymbol{H}$, evaluating the parity-check
condition has linear time complexity as well, since at most $n$ additions and
multiplications have to be performed.
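As an illustration, the hard decision and the parity-check evaluation can be
carried out with a single sparse matrix-vector product; the following Python
sketch assumes the \ac{BPSK} mapping $0 \mapsto +1$, $1 \mapsto -1$.
\begin{lstlisting}[language=Python]
import numpy as np
from scipy.sparse import csr_matrix

def parity_check_holds(H: csr_matrix, s: np.ndarray) -> bool:
    """Evaluate H c = 0 (mod 2) for the hard decision obtained from s.

    The sparse product costs O(nnz(H)) operations, i.e. linear in n
    for an LDPC code with bounded check-node degree.
    """
    c = (1 - np.sign(s)) // 2                 # BPSK: +1 -> 0, -1 -> 1
    syndrome = (H @ c) % 2                    # sparse mat-vec
    return not np.any(syndrome)
\end{lstlisting}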