Continued writing theoretical comparison of admm and proximal decoding
parent 2ade886191
commit 70eac9515f
@@ -43,10 +43,10 @@ proximal operators.
|
|||||||
They are both composed of an iterative approach consisting of two
|
They are both composed of an iterative approach consisting of two
|
||||||
alternating steps.
|
alternating steps.
|
||||||
In both cases each step minimizes one distinct part of the objective function.
|
In both cases each step minimizes one distinct part of the objective function.
|
||||||
The approaches they are based on, however, are fundamentally different.
|
They do, however, have some fundamental differences.
|
||||||
In figure \ref{fig:ana:theo_comp_alg} the two algorithms are juxtaposed,
|
In figure \ref{fig:ana:theo_comp_alg} the two algorithms are juxtaposed in their
|
||||||
in conjunction with the optimization problems they are meant to solve, in their
|
proximal operator form, in conjunction with the optimization problems they
|
||||||
proximal operator form.%
|
are meant to solve.%
|
||||||
%
|
%
|
||||||
\begin{figure}[H]
|
\begin{figure}[H]
|
||||||
\centering
|
\centering
|
||||||
@@ -86,8 +86,7 @@ return $\boldsymbol{s}$
|
|||||||
\text{minimize}\hspace{5mm} &
|
\text{minimize}\hspace{5mm} &
|
||||||
\underbrace{\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
|
\underbrace{\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
|
||||||
_{\text{Likelihood}}
|
_{\text{Likelihood}}
|
||||||
+ \underbrace{\sum_{j\in\mathcal{J}} g_j\left(
|
+ \underbrace{g\left( \boldsymbol{T}\tilde{\boldsymbol{c}} \right) }
|
||||||
\boldsymbol{T}_j\tilde{\boldsymbol{c}} \right) }
|
|
||||||
_{\text{Constraints}} \\
|
_{\text{Constraints}} \\
|
||||||
\text{subject to}\hspace{5mm} &
|
\text{subject to}\hspace{5mm} &
|
||||||
\tilde{\boldsymbol{c}} \in \mathbb{R}^n
|
\tilde{\boldsymbol{c}} \in \mathbb{R}^n
|
||||||
@@ -102,10 +101,12 @@ Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{
|
|||||||
while stopping criterion not satisfied do
|
while stopping criterion not satisfied do
|
||||||
$\tilde{\boldsymbol{c}} \leftarrow \textbf{prox}_{
|
$\tilde{\boldsymbol{c}} \leftarrow \textbf{prox}_{
|
||||||
\scaleto{\nu \cdot \boldsymbol{\gamma}^{\text{T}}\tilde{\boldsymbol{c}}}{8.5pt}}
|
\scaleto{\nu \cdot \boldsymbol{\gamma}^{\text{T}}\tilde{\boldsymbol{c}}}{8.5pt}}
|
||||||
\left( \boldsymbol{z} - \boldsymbol{u} \right) $
|
\left( \tilde{\boldsymbol{c}}
|
||||||
$\boldsymbol{z}_j \leftarrow \textbf{prox}_{\scaleto{g_j}{7pt}}
|
- \frac{\mu}{\lambda}\boldsymbol{T}^\text{T}\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
|
||||||
\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}}
|
- \boldsymbol{z} + \boldsymbol{u} \right) \right)$
|
||||||
+ \boldsymbol{T}_j\boldsymbol{u} \right) \hspace{5mm}\forall j\in\mathcal{J}$
|
$\boldsymbol{z} \leftarrow \textbf{prox}_{\scaleto{g}{7pt}}
|
||||||
|
\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
|
||||||
|
+ \boldsymbol{u} \right)$
|
||||||
$\boldsymbol{u} \leftarrow \boldsymbol{u}
|
$\boldsymbol{u} \leftarrow \boldsymbol{u}
|
||||||
+ \tilde{\boldsymbol{c}} - \boldsymbol{z}$
|
+ \tilde{\boldsymbol{c}} - \boldsymbol{z}$
|
||||||
end while
|
end while
|
||||||
@@ -121,35 +122,37 @@ return $\tilde{\boldsymbol{c}}$
|
|||||||
\label{fig:ana:theo_comp_alg}
|
\label{fig:ana:theo_comp_alg}
|
||||||
\end{figure}%
|
\end{figure}%
|
||||||
%
|
%
|
||||||
\todo{Show how $\tilde{\boldsymbol{c}} \leftarrow \textbf{prox}
|
|
||||||
_{1 / \mu \cdot \boldsymbol{\gamma}^{\text{T}}\tilde{\boldsymbol{c}}}
|
|
||||||
\left( \boldsymbol{z} - \boldsymbol{u} \right) $
|
|
||||||
is the same as
|
|
||||||
$\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}
|
|
||||||
+ \sum_{j\in\mathcal{J}} \boldsymbol{\lambda}^\text{T}_j
|
|
||||||
\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \right)
|
|
||||||
+ \frac{\mu}{2}\sum_{j\in\mathcal{J}}
|
|
||||||
\lVert \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \rVert^2_2$}%
|
|
||||||
%
|
|
||||||
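The first update in the \ac{ADMM} listing is inexpensive because the proximal
operator of a linear function has a closed form.
The following is a short sketch of that step, not part of the original text.
It assumes the usual definition of the proximal operator with a squared
Euclidean penalty, which may differ in scaling from the definition used
earlier in this document, and it uses a generic argument $\boldsymbol{v}$
that is introduced here only for illustration.
%
\begin{align*}
    \textbf{prox}_{\nu \cdot \boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
    \left( \boldsymbol{v} \right)
    &= \operatorname*{arg\,min}_{\tilde{\boldsymbol{c}} \in \mathbb{R}^n}
    \left( \nu \boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}
    + \frac{1}{2} \lVert \tilde{\boldsymbol{c}} - \boldsymbol{v} \rVert_2^2 \right) \\
    % Setting the gradient to zero: \nu\gamma + \tilde{c} - v = 0
    &= \boldsymbol{v} - \nu \boldsymbol{\gamma}
\end{align*}
%
Under these assumptions the update amounts to shifting its argument by
$-\nu\boldsymbol{\gamma}$, so no inner optimization is needed for the
likelihood part.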
\noindent The objective functions of both problems are similar in that they
|
\noindent The objective functions of both problems are similar in that they
|
||||||
both comprise two parts: one associated with the likelihood that a given
|
both comprise two parts: one associated with the likelihood that a given
|
||||||
codeword was sent and one associated with the constraints the codeword is
|
codeword was sent and one associated with the constraints the codeword is
|
||||||
subjected to.
|
subjected to.
|
||||||
Their major difference is that the two parts of the objective minimized with
|
|
||||||
proximal decoding are both functions of the same variable
|
Their major difference is that while with proximal decoding the constraints
|
||||||
$\tilde{\boldsymbol{x}}$, whereas with \ac{ADMM} the two parts are functions
|
are regarded in a global context, considering all parity checks at the same
|
||||||
of different variables: $\tilde{\boldsymbol{c}}$ and $\boldsymbol{z}_{[1:m]}$.
|
time in the second step, with \ac{ADMM} each parity check is
|
||||||
|
considered separately, in a more local context (line 4 in both algorithms).
|
||||||
This difference means that while with proximal decoding the alternating
|
This difference means that while with proximal decoding the alternating
|
||||||
minimization of the two parts of the objective function inevitably leads to
|
minimization of the two parts of the objective function inevitably leads to
|
||||||
oscillatory behaviour (as explained in section (TODO)), this is not the
|
oscillatory behaviour (as explained in section (TODO)), this is not the
|
||||||
case with \ac{ADMM}.
|
case with \ac{ADMM}, which partly explains the disparate decoding performance
|
||||||
|
of the two methods.
|
||||||
|
Furthermore, while with proximal decoding the step considering the constraints
|
||||||
|
is realized using gradient descent, which amounts to an approximation,
|
||||||
|
with \ac{ADMM} it reduces to a number of projections onto the parity polytopes
|
||||||
|
$\mathcal{P}_{d_j}$ (see
|
||||||
|
\ref{chapter:LD Decoding using ADMM as a Proximal Algorithm}),
|
||||||
|
which always provide exact results.
|
||||||
|
|
||||||
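The sense in which a gradient step approximates a proximal step can be made
concrete with a short sketch.
Here $h$ stands for a generic differentiable function and $\omega > 0$ for a
generic step size; both are placeholders chosen for illustration and not the
notation of the proximal decoding listing.
Any minimizer $\boldsymbol{x}^{+}$ of the proximal subproblem has to satisfy
a stationarity condition in which the gradient is evaluated at the new point,
whereas the gradient step evaluates it at the current point:
%
\begin{align*}
    \boldsymbol{x}^{+}
    = \textbf{prox}_{\omega h}\left( \boldsymbol{x} \right)
    &= \operatorname*{arg\,min}_{\boldsymbol{t}}
    \left( \omega h\left( \boldsymbol{t} \right)
    + \frac{1}{2} \lVert \boldsymbol{t} - \boldsymbol{x} \rVert_2^2 \right) \\
    % Stationarity: \omega \nabla h(x^+) + x^+ - x = 0
    \Rightarrow \hspace{5mm} \boldsymbol{x}^{+}
    &= \boldsymbol{x} - \omega \nabla h\left( \boldsymbol{x}^{+} \right)
    \approx \boldsymbol{x} - \omega \nabla h\left( \boldsymbol{x} \right)
\end{align*}
%
The two expressions agree only to first order in $\omega$, which is why the
gradient step is an approximation, while a projection onto a parity polytope
evaluates the corresponding proximal operator exactly.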
Another aspect partly explaining the disparate decoding performance is the
|
The contrasting treatment of the constraints (global and approximate with
|
||||||
difference in the minimization step handling the constraints.
|
proximal decoding, local and exact with \ac{ADMM}) also leads to different
|
||||||
While with proximal decoding it is performed using gradient
|
prospects when the decoding process gets stuck in a local minimum.
|
||||||
descent - amounting to an approximation - with \ac{ADMM} it reduces to a
|
With proximal decoding this occurs due to the approximate nature of the
|
||||||
number of projections onto the parity polytopes $\mathcal{P}_{d_j}$ - which
|
calculation, whereas with \ac{ADMM} it occurs due to the approximate
|
||||||
always provide exact results.
|
formulation of the constraints - not depending on the optimization method
|
||||||
|
itself.
|
||||||
|
The resulting advantage of \ac{ADMM} is that it is easy to detect
|
||||||
|
when the algorithm gets stuck: in that case the algorithm
|
||||||
|
returns a pseudocodeword, the components of which are fractional.
|
||||||
|
|
||||||
\begin{itemize}
|
\begin{itemize}
|
||||||
\item The comparison of actual implementations is always debatable /
|
\item The comparison of actual implementations is always debatable /
|
||||||
|
|||||||