Finished first version of theoretical comparison text
parent
55c9e2808b
commit
a1a1fa1f71
@ -810,6 +810,10 @@ $\gamma \in \left\{ 0.01, 0.05, 0.15 \right\}$.
|
||||
\end{figure}
|
||||
|
||||
|
||||
\chapter{Proximal Decoding as a Message Passing Algorithm}
|
||||
\label{chapter:Proximal Decoding as a Message Passing Algorithm}
|
||||
|
||||
|
||||
%\chapter{\acs{LP} Decoding using \acs{ADMM} as a Proximal Algorithm}%
|
||||
%\label{chapter:LD Decoding using ADMM as a Proximal Algorithm}
|
||||
%
|
||||
|
||||
@ -1,15 +1,11 @@
|
||||
\chapter{Comparison of Proximal Decoding and \acs{LP} Decoding using \acs{ADMM}}%
|
||||
\label{chapter:comparison}
|
||||
|
||||
TODO
|
||||
In this chapter, proximal decoding and \ac{LP} Decoding using \ac{ADMM} are compared.
|
||||
First the two algorithms are compared on a theoretical basis.
|
||||
Subsequently, their respective simulation results are examined and their
|
||||
differences are interpreted on the basis of their theoretical structure.
|
||||
|
||||
\todo{Note: This chapter is currently only a very rough draft and is not yet ready for correction}
|
||||
|
||||
%In this chapter, proximal decoding and \ac{LP} Decoding using \ac{ADMM} are compared.
|
||||
%First the two algorithms are compared on a theoretical basis.
|
||||
%Subsequently, their respective simulation results are examined and their
|
||||
%differences interpreted on the basis of their theoretical structure.
|
||||
%
|
||||
%some similarities between the proximal decoding algorithm
|
||||
%and \ac{LP} decoding using \ac{ADMM} are be pointed out.
|
||||
%The two algorithms are compared and their different computational and decoding
|
||||
@ -21,19 +17,39 @@ TODO
|
||||
\label{sec:comp:theo}
|
||||
|
||||
\ac{ADMM} and the proximal gradient method can both be expressed in terms of
|
||||
proximal operators.
|
||||
They are both composed of an iterative approach consisting of two
|
||||
alternating steps.
|
||||
In both cases each step minimizes one distinct part of the objective function.
|
||||
They do, however, have some fundamental differences.
|
||||
In figure \ref{fig:ana:theo_comp_alg} the two algorithms are juxtaposed in their
|
||||
proximal operator form, in conjunction with the optimization problems they
|
||||
are meant to solve.%
|
||||
proximal operators \cite[Sec. 4.4]{proximal_algorithms}.
|
||||
When using \ac{ADMM} as an optimization method to solve the \ac{LP} decoding
|
||||
problem specifically, this is not quite possible because of the multiple
|
||||
constraints.
|
||||
In spite of that, the two algorithms still show some striking similarities.
|
||||
|
||||
To see the first of these similarities, the \ac{LP} decoding problem in
|
||||
equation (\ref{eq:lp:relaxed_formulation}) can be slightly rewritten using the
|
||||
\textit{indicator functions} $g_j : \mathbb{R}^{d_j} \rightarrow
|
||||
\left\{ 0, +\infty \right\} \hspace{1mm}, j\in\mathcal{J}$ for the polytopes
|
||||
$\mathcal{P}_{d_j}, \hspace{1mm} j\in\mathcal{J}$, defined as%
|
||||
%
|
||||
\begin{figure}[H]
|
||||
\begin{align*}
|
||||
g_j\left( \boldsymbol{t} \right) := \begin{cases}
|
||||
0, & \boldsymbol{t} \in \mathcal{P}_{d_j} \\
|
||||
+\infty, & \boldsymbol{t} \not\in \mathcal{P}_{d_j}
|
||||
\end{cases}
|
||||
,\end{align*}
|
||||
%
|
||||
by moving the constraints into the objective function, as shown in figure
|
||||
\ref{fig:ana:theo_comp_alg:admm}.
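%
\noindent As a brief side note, the following sketch shows why such indicator
functions are convenient in this context.
It assumes the usual definition of the proximal operator with unit scaling,
an arbitrary point $\boldsymbol{v} \in \mathbb{R}^{d_j}$, and the notation
$\Pi_{\mathcal{P}_{d_j}}$ for the Euclidean projection onto
$\mathcal{P}_{d_j}$.
Under these assumptions, the proximal operator of $g_j$ reduces to that
projection:%
%
\begin{align*}
    \textbf{prox}_{g_j}\left( \boldsymbol{v} \right)
        &= \argmin_{\boldsymbol{t} \in \mathbb{R}^{d_j}} \left(
            g_j\left( \boldsymbol{t} \right)
            + \frac{1}{2}\left\Vert \boldsymbol{t} - \boldsymbol{v} \right\Vert_2^2
        \right) \\
        &= \argmin_{\boldsymbol{t} \in \mathcal{P}_{d_j}}
            \left\Vert \boldsymbol{t} - \boldsymbol{v} \right\Vert_2^2
        = \Pi_{\mathcal{P}_{d_j}}\left( \boldsymbol{v} \right)
.\end{align*}
%
The second equality holds because $g_j$ is $+\infty$ outside of
$\mathcal{P}_{d_j}$ and zero inside, so only points of the polytope can attain
the minimum.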
|
||||
Both algorithms are iterative and alternate between two steps.
The objective functions of both problems are similar in that they
both comprise two parts: one associated with the likelihood that a given
codeword was sent and one associated with the constraints the decoding process
is subject to.
|
||||
%
|
||||
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
|
||||
\begin{subfigure}{0.48\textwidth}
|
||||
\begin{subfigure}{0.42\textwidth}
|
||||
\centering
|
||||
|
||||
\begin{align*}
|
||||
@ -45,10 +61,10 @@ are meant to solve.%
|
||||
\end{align*}
|
||||
|
||||
\begin{genericAlgorithm}[caption={}, label={},
|
||||
basicstyle=\fontsize{11}{17}\selectfont
|
||||
basicstyle=\fontsize{10}{18}\selectfont
|
||||
]
|
||||
Initialize $\boldsymbol{r}, \boldsymbol{s}, \omega, \gamma$
|
||||
while stopping criterion not satisfied do
|
||||
while stopping criterion unfulfilled do
|
||||
$\boldsymbol{r} \leftarrow \boldsymbol{r}
|
||||
+ \omega \nabla L\left( \boldsymbol{y} \mid \boldsymbol{s} \right) $
|
||||
$\boldsymbol{s} \leftarrow
|
||||
@ -58,44 +74,42 @@ end while
|
||||
return $\boldsymbol{s}$
|
||||
\end{genericAlgorithm}
|
||||
|
||||
\caption{Proximal gradient method}
|
||||
\caption{Proximal decoding}
|
||||
\label{fig:ana:theo_comp_alg:prox}
|
||||
\end{subfigure}\hfill%
|
||||
\begin{subfigure}{0.48\textwidth}
|
||||
\begin{subfigure}{0.55\textwidth}
|
||||
\centering
|
||||
|
||||
\begin{align*}
|
||||
\text{minimize}\hspace{5mm} &
|
||||
\underbrace{\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
|
||||
_{\text{Likelihood}}
|
||||
+ \underbrace{g\left( \boldsymbol{T}\tilde{\boldsymbol{c}} \right) }
|
||||
+ \underbrace{\sum\nolimits_{j\in\mathcal{J}} g_j\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} \right) }
|
||||
_{\text{Constraints}} \\
|
||||
\text{subject to}\hspace{5mm} &
|
||||
\tilde{\boldsymbol{c}} \in \mathbb{R}^n
|
||||
% \boldsymbol{T}_j\tilde{\boldsymbol{c}} = \boldsymbol{z}_j\hspace{3mm}
|
||||
% \forall j\in\mathcal{J}
|
||||
\end{align*}
|
||||
|
||||
\begin{genericAlgorithm}[caption={}, label={},
|
||||
basicstyle=\fontsize{11}{17}\selectfont
|
||||
basicstyle=\fontsize{10}{18}\selectfont
|
||||
]
|
||||
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \nu$
|
||||
while stopping criterion not satisfied do
|
||||
$\tilde{\boldsymbol{c}} \leftarrow \textbf{prox}_{
|
||||
\scaleto{\nu \cdot \boldsymbol{\gamma}^{\text{T}}\tilde{\boldsymbol{c}}}{8.5pt}}
|
||||
\left( \tilde{\boldsymbol{c}}
|
||||
- \frac{\mu}{\lambda}\boldsymbol{T}^\text{T}\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
|
||||
- \boldsymbol{z} + \boldsymbol{u} \right) \right)$
|
||||
$\boldsymbol{z} \leftarrow \textbf{prox}_{\scaleto{g}{7pt}}
|
||||
\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
|
||||
+ \boldsymbol{u} \right)$
|
||||
$\boldsymbol{u} \leftarrow \boldsymbol{u}
|
||||
+ \boldsymbol{T}\tilde{\boldsymbol{c}} - \boldsymbol{z}$
|
||||
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \rho$
|
||||
while stopping criterion unfulfilled do
|
||||
$\tilde{\boldsymbol{c}} \leftarrow \argmin_{\tilde{\boldsymbol{c}}}
|
||||
\left( \boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}
|
||||
+ \frac{\rho}{2}\sum_{j\in\mathcal{J}} \left\Vert
|
||||
\boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j
|
||||
+ \boldsymbol{u}_j \right\Vert_2^2 \right)$
|
||||
$\boldsymbol{z}_j \leftarrow \textbf{prox}_{g_j}
|
||||
\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}}
|
||||
+ \boldsymbol{u}_j \right), \hspace{5mm}\forall j\in\mathcal{J}$
|
||||
$\boldsymbol{u}_j \leftarrow \boldsymbol{u}_j
|
||||
+ \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j, \hspace{15.25mm}\forall j\in\mathcal{J}$
|
||||
end while
|
||||
return $\tilde{\boldsymbol{c}}$
|
||||
\end{genericAlgorithm}
|
||||
|
||||
\caption{\ac{ADMM}}
|
||||
\caption{LP decoding using \ac{ADMM}}
|
||||
\label{fig:ana:theo_comp_alg:admm}
|
||||
\end{subfigure}%
|
||||
|
||||
@ -104,10 +118,6 @@ return $\tilde{\boldsymbol{c}}$
|
||||
\label{fig:ana:theo_comp_alg}
|
||||
\end{figure}%
|
||||
%
|
||||
\noindent The objective functions of both problems are similar in that they
both comprise two parts: one associated with the likelihood that a given
codeword was sent and one associated with the constraints the codeword is
subject to.
|
||||
|
||||
Their major difference is that while with proximal decoding the constraints
|
||||
are regarded in a global context, considering all parity checks at the same
|
||||
@ -124,19 +134,129 @@ with \ac{ADMM} it reduces to a number of projections onto the parity polytopes
|
||||
$\mathcal{P}_{d_j}$ which always provide exact results.
|
||||
|
||||
The contrasting treatment of the constraints (global and approximate with
|
||||
proximal decoding, local and exact with \ac{ADMM}) also leads to different
|
||||
prospects when the decoding process gets stuck in a local minimum.
|
||||
proximal decoding as opposed to local and exact with \ac{LP} decoding using
|
||||
\ac{ADMM}) also leads to different prospects when the decoding process gets
|
||||
stuck in a local minimum.
|
||||
With proximal decoding this occurs due to the approximate nature of the
|
||||
calculation, whereas with \ac{ADMM} it occurs due to the approximate
|
||||
formulation of the constraints - not depending on the optimization method
|
||||
itself.
|
||||
The advantage which arises because of this when using \ac{ADMM} is that
it can be easily detected when the algorithm gets stuck: the algorithm
returns a pseudocodeword, the components of which are fractional.
|
||||
\todo{Additional constraints can then be successively added, until a valid
|
||||
codeword is returned}
|
||||
calculation, whereas with \ac{LP} decoding it occurs due to the approximate
formulation of the constraints, independent of the optimization method
itself.
The resulting advantage of \ac{LP} decoding is that it is easy to detect
when the algorithm gets stuck: it returns a solution corresponding to a
pseudocodeword, the components of which are fractional.
|
||||
Moreover, when a valid codeword is returned, it is also the \ac{ML} codeword.
|
||||
This means that additional redundant parity-checks can be added successively
|
||||
until the codeword returned is valid and thus the \ac{ML} solution is found
|
||||
\cite[Sec. IV.]{alp}.
|
||||
|
||||
\todo{Compare time complexity using Big-O notation}
|
||||
In terms of time complexity, the two decoding algorithms are comparable.
|
||||
Each of the operations required for proximal decoding can be performed
|
||||
in linear time for \ac{LDPC} codes (see section \ref{subsec:prox:comp_perf}).
|
||||
The same is true for the $\tilde{\boldsymbol{c}}$- and $\boldsymbol{u}$-update
|
||||
steps of \ac{LP} decoding using \ac{ADMM}, while
|
||||
the projection step has a worst-case time complexity of
|
||||
$\mathcal{O}\left( n^2 \right)$ and an average complexity of
|
||||
$\mathcal{O}\left( n \right)$ (see section TODO, \cite[Sec. VIII.]{lautern}).
|
||||
|
||||
Both algorithms can be understood as message-passing algorithms: \ac{LP}
decoding using \ac{ADMM} similarly to \cite[Sec. III. D.]{original_admm}
or \cite[Sec. II. B.]{efficient_lp_dec_admm}, and proximal decoding as shown in
appendix \ref{chapter:Proximal Decoding as a Message Passing Algorithm}.
|
||||
The algorithms in their message-passing form are depicted in figure
|
||||
\ref{fig:comp:message_passing}.
|
||||
$M_{j\to i}$ denotes the message transmitted from \ac{CN} $j$ to \ac{VN} $i$.
$M_{j\to}$ signifies the special case where a \ac{CN} transmits the same
message to all \acp{VN}.
|
||||
%
|
||||
\begin{figure}[h]
|
||||
\centering
|
||||
|
||||
\begin{subfigure}{0.5\textwidth}
|
||||
\centering
|
||||
|
||||
\begin{genericAlgorithm}[caption={}, label={},
|
||||
% basicstyle=\fontsize{10}{16}\selectfont
|
||||
]
|
||||
Initialize $\boldsymbol{r}, \boldsymbol{s}, \omega, \gamma$
|
||||
while stopping criterion unfulfilled do
|
||||
for j in $\mathcal{J}$ do
|
||||
$p_j \leftarrow \prod_{i\in N_c\left( j \right) } r_i $
|
||||
$M_{j\to} \leftarrow p_j^2 - p_j$|\Suppressnumber|
|
||||
|\vspace{0.22mm}\Reactivatenumber|
|
||||
end for
|
||||
for i in $\mathcal{I}$ do
|
||||
$s_i \leftarrow s_i + \gamma \left[ 4\left( s_i^2 - 1 \right)s_i
|
||||
\phantom{\frac{4}{s_i}}\right.$|\Suppressnumber|
|
||||
|\Reactivatenumber|$\left.+ \frac{4}{s_i}\sum_{j\in N_v\left( i \right) }
|
||||
M_{j\to} \right] $
|
||||
$r_i \leftarrow r_i + \omega \left( s_i - y_i \right)$
|
||||
end for
|
||||
end while
|
||||
\end{genericAlgorithm}
|
||||
|
||||
\caption{Proximal decoding}
|
||||
\label{fig:comp:message_passing:proximal}
|
||||
\end{subfigure}%
|
||||
\begin{subfigure}{0.5\textwidth}
|
||||
\centering
|
||||
|
||||
\begin{genericAlgorithm}[caption={}, label={},
|
||||
% basicstyle=\fontsize{10}{16}\selectfont
|
||||
]
|
||||
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \rho$
|
||||
while stopping criterion unfulfilled do
|
||||
for j in $\mathcal{J}$ do
|
||||
$\boldsymbol{z}_j \leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
|
||||
\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{u}_j\right)$
|
||||
$\boldsymbol{u}_j \leftarrow \boldsymbol{u}_j + \boldsymbol{T}_j\tilde{\boldsymbol{c}}
|
||||
- \boldsymbol{z}_j$
|
||||
$M_{j\to i} \leftarrow \left( \boldsymbol{z}_j \right)_i - \left( \boldsymbol{u}_j \right)_i,
|
||||
\hspace{3mm} \forall i \in N_c\left( j \right) $
|
||||
end for
|
||||
for i in $\mathcal{I}$ do
|
||||
$\tilde{c}_i \leftarrow \frac{1}{d_i}
|
||||
\left(\sum_{j\in N_v\left( i \right) } M_{j\to i}
|
||||
- \frac{\gamma_i}{\rho} \right)$|\Suppressnumber|
|
||||
|\vspace{7mm}\Reactivatenumber|
|
||||
end for
|
||||
end while
|
||||
\end{genericAlgorithm}
|
||||
\caption{\ac{LP} decoding using \ac{ADMM}}
|
||||
\label{fig:comp:message_passing:admm}
|
||||
\end{subfigure}%
|
||||
|
||||
|
||||
\caption{The proximal gradient method and \ac{LP} decoding using \ac{ADMM}
|
||||
as message passing algorithms}
|
||||
\label{fig:comp:message_passing}
|
||||
\end{figure}%
|
||||
%
|
||||
It is evident that while the two algorithms are very similar in their general
|
||||
structure, with \ac{LP} decoding using \ac{ADMM}, multiple messages have to be
|
||||
computed for each check node (line 6 in figure
|
||||
\ref{fig:comp:message_passing:admm}), whereas
|
||||
with proximal decoding, the same message is transmitted to all \acp{VN}
|
||||
(line 5 of figure \ref{fig:comp:message_passing:proximal}).
|
||||
This means that while both algorithms have an average time complexity of
|
||||
$\mathcal{O}\left( n \right)$, more arithmetic operations are required for the
|
||||
\ac{ADMM} case.
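
As a sketch of how the closed-form $\tilde{c}_i$-update in figure
\ref{fig:comp:message_passing:admm} can be obtained from the
$\tilde{\boldsymbol{c}}$-update of \ac{LP} decoding using \ac{ADMM}, the
following derivation may be helpful.
It rests on assumptions that are not stated explicitly above: the penalty term
uses the squared Euclidean norm with the penalty parameter $\rho$, the matrices
$\boldsymbol{T}_j$ are binary selection matrices picking out the components of
$\tilde{\boldsymbol{c}}$ that participate in check $j$, and $d_i$ denotes the
degree of \ac{VN} $i$, such that
$\sum_{j\in\mathcal{J}} \boldsymbol{T}_j^\text{T}\boldsymbol{T}_j
= \text{diag}\left( d_1, \ldots, d_n \right)$.
Setting the gradient of the objective to zero then yields%
%
\begin{align*}
    \boldsymbol{0}
        &= \boldsymbol{\gamma} + \rho \sum_{j\in\mathcal{J}}
            \boldsymbol{T}_j^\text{T} \left(
                \boldsymbol{T}_j\tilde{\boldsymbol{c}}
                - \boldsymbol{z}_j + \boldsymbol{u}_j
            \right) \\
    \Rightarrow \hspace{3mm} \tilde{\boldsymbol{c}}
        &= \text{diag}\left( d_1, \ldots, d_n \right)^{-1} \left(
            \sum_{j\in\mathcal{J}} \boldsymbol{T}_j^\text{T}
                \left( \boldsymbol{z}_j - \boldsymbol{u}_j \right)
            - \frac{1}{\rho}\boldsymbol{\gamma}
        \right) \\
    \Rightarrow \hspace{3mm} \tilde{c}_i
        &= \frac{1}{d_i} \left(
            \sum_{j\in N_v\left( i \right)}
                \left( \boldsymbol{z}_j - \boldsymbol{u}_j \right)_i
            - \frac{\gamma_i}{\rho}
        \right)
,\end{align*}
%
where $\left( \cdot \right)_i$ denotes the component associated with \ac{VN}
$i$.
Under these assumptions, each summand corresponds to one message
$M_{j\to i}$, which is why a separate message is needed for every edge of the
Tanner graph.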
|
||||
|
||||
In conclusion, the two algorithms have a very similar structure, where the
|
||||
parts of the objective function relating to the likelihood and to the
|
||||
constraints are minimized in an alternating fashion.
|
||||
With proximal decoding this minimization is performed for all constraints at once
|
||||
in an approximate manner, while with \ac{LP} decoding using \ac{ADMM} it is
|
||||
performed for each constraint individually and with exact results.
|
||||
In terms of time complexity, both algorithms are, on average, linear with
|
||||
respect to $n$, although for \ac{LP} decoding using \ac{ADMM} significantly
|
||||
more arithmetic operations are necessary in each iteration.
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Comparison of Simulation Results}%
|
||||
\label{sec:comp:res}
|
||||
|
||||
\begin{itemize}
|
||||
\item The comparison of actual implementations is always debatable /
|
||||
@ -153,11 +273,3 @@ codeword is returned}
|
||||
\item Proximal decoding faster than \ac{ADMM} $\rightarrow$ unexpected
|
||||
(larger number of iterations before convergence? More values to compute for ADMM?)
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Comparison of Simulation Results}%
|
||||
\label{sec:comp:res}
|
||||
|
||||
TODO
|
||||
|
||||
|
||||
@ -1169,6 +1169,7 @@ an invalid codeword.
|
||||
|
||||
|
||||
\subsection{Computational Performance}
|
||||
\label{subsec:prox:comp_perf}
|
||||
|
||||
In order to determine the time complexity of proximal decoding for an
|
||||
\ac{AWGN} channel, the decoding process as depicted in algorithm
|
||||
@ -1180,7 +1181,7 @@ with $n$, the two
|
||||
recursive steps in lines 3 and 4 have time complexity
|
||||
$\mathcal{O}\left( n \right)$ \cite[Sec. 4.1]{proximal_paper}.
|
||||
The $\text{sign}$ operation in line 5 also has $\mathcal{O}\left( n \right) $
|
||||
time complexity as it only depends on the $n$ components of $\boldsymbol{s}$.
|
||||
time complexity, as it only depends on the $n$ components of $\boldsymbol{s}$.
|
||||
Given the sparsity of the matrix $\boldsymbol{H}$, evaluating the parity-check
|
||||
condition has linear time complexity as well, since at most $n$ additions and
|
||||
multiplications have to be performed.
|
||||
|
||||