From 70eac9515f07d30cc547ee4455e4858a3392de98 Mon Sep 17 00:00:00 2001
From: Andreas Tsouchlos
Date: Tue, 4 Apr 2023 18:16:44 +0200
Subject: [PATCH] Continued writing theoretical comparison of admm and
 proximal decoding

---
 latex/thesis/chapters/analysis_of_results.tex | 65 ++++++++++---------
 1 file changed, 34 insertions(+), 31 deletions(-)

diff --git a/latex/thesis/chapters/analysis_of_results.tex b/latex/thesis/chapters/analysis_of_results.tex
index 3b75382..37d3122 100644
--- a/latex/thesis/chapters/analysis_of_results.tex
+++ b/latex/thesis/chapters/analysis_of_results.tex
@@ -43,10 +43,10 @@ proximal operators.
 They are both composed of an iterative approach consisting of two alternating
 steps.
 In both cases each step minimizes one distinct part of the objective function.
-The approaches they are based on, however, are fundamentally different.
-In figure \ref{fig:ana:theo_comp_alg} the two algorithms are juxtaposed,
-in conjuction with the optimization problems they are meant to solve, in their
-proximal operator form.%
+They do, however, have some fundamental differences.
+In figure \ref{fig:ana:theo_comp_alg} the two algorithms are juxtaposed in their
+proximal operator form, in conjunction with the optimization problems they
+are meant to solve.%
 %
 \begin{figure}[H]
 \centering
@@ -86,8 +86,7 @@ return $\boldsymbol{s}$
 \text{minimize}\hspace{5mm} &
 \underbrace{\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
 _{\text{Likelihood}}
-+ \underbrace{\sum_{j\in\mathcal{J}} g_j\left(
-\boldsymbol{T}_j\tilde{\boldsymbol{c}} \right) }
++ \underbrace{g\left( \boldsymbol{T}\tilde{\boldsymbol{c}} \right) }
 _{\text{Constraints}} \\
 \text{subject to}\hspace{5mm} &
 \tilde{\boldsymbol{c}} \in \mathbb{R}^n
@@ -102,10 +101,12 @@ Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{
 while stopping criterion not satisfied do
 $\tilde{\boldsymbol{c}} \leftarrow \textbf{prox}_{
 \scaleto{\nu \cdot \boldsymbol{\gamma}^{\text{T}}\tilde{\boldsymbol{c}}}{8.5pt}}
-\left( \boldsymbol{z} - \boldsymbol{u} \right) $
-$\boldsymbol{z}_j \leftarrow \textbf{prox}_{\scaleto{g_j}{7pt}}
-\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}}
-+ \boldsymbol{T}_j\boldsymbol{u} \right) \hspace{5mm}\forall j\in\mathcal{J}$
+\left( \tilde{\boldsymbol{c}}
+- \frac{\mu}{\lambda}\boldsymbol{T}^\text{T}\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
+- \boldsymbol{z} + \boldsymbol{u} \right) \right)$
+$\boldsymbol{z} \leftarrow \textbf{prox}_{\scaleto{g}{7pt}}
+\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
++ \boldsymbol{u} \right)$
 $\boldsymbol{u} \leftarrow \boldsymbol{u} + \tilde{\boldsymbol{c}} - \boldsymbol{z}$
 end while
@@ -121,35 +122,37 @@ return $\tilde{\boldsymbol{c}}$
 \label{fig:ana:theo_comp_alg}
 \end{figure}%
 %
-\todo{Show how $\tilde{\boldsymbol{c}} \leftarrow \textbf{prox}
-_{1 / \mu \cdot \boldsymbol{\gamma}^{\text{T}}\tilde{\boldsymbol{c}}}
-\left( \boldsymbol{z} - \boldsymbol{u} \right) $
-is the same as
-$\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}
-+ \sum_{j\in\mathcal{J}} \boldsymbol{\lambda}^\text{T}_j
-\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \right)
-+ \frac{\mu}{2}\sum_{j\in\mathcal{J}}
-\lVert \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \rVert^2_2$}%
-%
 \noindent
 The objective functions of both problems are similar in that they both
 comprise two parts: one associated to the likelihood that a given codeword
 was sent and one associated to the constraints the codeword is subjected to.
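+% A minimal sketch, assuming $\nu > 0$ is the step size used in line 3 of
+% the \ac{ADMM} algorithm and $\boldsymbol{v}$ a generic argument: since
+% the likelihood term is linear, its proximal operator reduces to a plain
+% gradient step,
+% \begin{equation*}
+%     \textbf{prox}_{\nu \cdot \boldsymbol{\gamma}^{\text{T}}
+%     \tilde{\boldsymbol{c}}}\left( \boldsymbol{v} \right)
+%     = \underset{\boldsymbol{x} \in \mathbb{R}^n}{\arg\min} \left(
+%     \boldsymbol{\gamma}^\text{T}\boldsymbol{x}
+%     + \frac{1}{2\nu} \lVert \boldsymbol{x} - \boldsymbol{v} \rVert^2_2
+%     \right)
+%     = \boldsymbol{v} - \nu\boldsymbol{\gamma},
+% \end{equation*}
+% so line 3 simply evaluates this closed form at
+% $\boldsymbol{v} = \tilde{\boldsymbol{c}}
+% - \frac{\mu}{\lambda}\boldsymbol{T}^\text{T}\left(
+% \boldsymbol{T}\tilde{\boldsymbol{c}} - \boldsymbol{z}
+% + \boldsymbol{u} \right)$.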
-Their major difference is that the two parts of the objective minimized with
-proximal decoding are both functions of the same variable
-$\tilde{\boldsymbol{x}}$, whereas with \ac{ADMM} the two parts are functions
-of different variables: $\tilde{\boldsymbol{c}}$ and $\boldsymbol{z}_{[1:m]}$.
+
+Their major difference is that while with proximal decoding the constraints
+are regarded in a global context, considering all parity checks at the same
+time in the second step, with \ac{ADMM} each parity check is
+considered separately, in a more local context (line 4 in both algorithms).
 This difference means that while with proximal decoding the alternating
 minimization of the two parts of the objective function inevitably leads
 to oscillatory behaviour (as explained in section (TODO)), this is not the
-case with \ac{ADMM}.
+case with \ac{ADMM}, which partly explains the disparate decoding performance
+of the two methods.
+Furthermore, while with proximal decoding the step considering the constraints
+is realized using gradient descent - amounting to an approximation -
+with \ac{ADMM} it reduces to a number of projections onto the parity polytopes
+$\mathcal{P}_{d_j}$ (see
+\ref{chapter:LD Decoding using ADMM as a Proximal Algorithm}),
+which always provide exact results.
-Another aspect partly explaining the disparate decoding performance is the
-difference in the minimization step handling the constraints.
-While with proximal decoding it is performed using gradient
-descent - amounting to an approximation - with \ac{ADMM} it reduces to a
-number of projections onto the parity polytopes $\mathcal{P}_{d_j}$ - which
-always provide exact results.
+The contrasting treatment of the constraints (global and approximate with
+proximal decoding, local and exact with \ac{ADMM}) also leads to different
+prospects when the decoding process gets stuck in a local minimum.
+With proximal decoding this occurs due to the approximate nature of the
+calculation, whereas with \ac{ADMM} it occurs due to the relaxed
+formulation of the constraints - not depending on the optimization method
+itself.
+An advantage of \ac{ADMM} arising from this is that it is easy to detect
+when the algorithm gets stuck: in that case it returns a pseudocodeword,
+the components of which are fractional.
 \begin{itemize}
 \item The comparison of actual implementations is always debatable /
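+% A minimal sketch of the detection mentioned above, assuming a small
+% tolerance $\tau \in \left( 0, \frac{1}{2} \right)$ chosen for numerical
+% purposes ($\tau$ is not defined elsewhere in this chapter): the decoder
+% may declare itself stuck at a pseudocodeword whenever
+% \begin{equation*}
+%     \exists\, i \in \left\{ 1, \dots, n \right\}: \quad
+%     \tau < \tilde{c}_i < 1 - \tau,
+% \end{equation*}
+% i.e. whenever at least one component of $\tilde{\boldsymbol{c}}$ remains
+% fractional instead of converging to a (numerically) binary value.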