diff --git a/latex/thesis/chapters/analysis_of_results.tex b/latex/thesis/chapters/analysis_of_results.tex
index 1677f7c..4622937 100644
--- a/latex/thesis/chapters/analysis_of_results.tex
+++ b/latex/thesis/chapters/analysis_of_results.tex
@@ -28,3 +28,127 @@ results when comparing implementations)
 \end{itemize}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Comparison of the Proximal Decoding and LP Decoding using ADMM algorithms}%
+\label{sec:Comparison of the Proximal Decoding and LP Decoding using ADMM algorithms}
+
+In this section, some similarities between the proximal decoding algorithm
+and \ac{LP} decoding using \ac{ADMM} are pointed out.
+The two algorithms are compared, and their differing computational and
+decoding performance is explained on the basis of their theoretical
+structure.
+
+\ac{ADMM} and the proximal gradient method can both be expressed in terms of
+proximal operators.
+Both are iterative methods consisting of two alternating steps, and in both
+cases each step minimizes one distinct part of the objective function
+(simplified code sketches of both loops are given at the end of this
+section).
+The approaches they are based on, however, are fundamentally different.
+Figure \ref{fig:ana:theo_comp_alg} juxtaposes the two algorithms, in their
+proximal operator form, together with the optimization problems they are
+meant to solve.%
+%
+\begin{figure}[H]
+  \centering
+
+  \begin{subfigure}{0.48\textwidth}
+    \centering
+
+    \begin{align*}
+      \text{minimize}\hspace{2mm} & \underbrace{L\left( \boldsymbol{y} \mid
+        \tilde{\boldsymbol{x}} \right)}_{\text{Likelihood}}
+      + \underbrace{\gamma h\left( \tilde{\boldsymbol{x}} \right)}
+      _{\text{Constraints}} \\
+      \text{subject to}\hspace{2mm} &\tilde{\boldsymbol{x}} \in \mathbb{R}^n
+    \end{align*}
+
+    \begin{genericAlgorithm}[caption={}, label={},
+      basicstyle=\fontsize{10.5}{15}\selectfont
+      ]
+Initialize variables
+while stopping criterion not satisfied do
+  $\boldsymbol{r} \leftarrow \boldsymbol{r}
+    + \omega \nabla L\left( \boldsymbol{y} \mid \boldsymbol{s} \right) $
+  $\boldsymbol{s} \leftarrow
+    \text{prox}_{\gamma h}\left( \boldsymbol{r} \right) $|\Suppressnumber|
+|\Reactivatenumber|
+end while
+return $\boldsymbol{s}$
+    \end{genericAlgorithm}
+
+    \caption{Proximal gradient method}
+    \label{fig:ana:theo_comp_alg:prox}
+  \end{subfigure}\hfill%
+  \begin{subfigure}{0.48\textwidth}
+    \centering
+
+    \begin{align*}
+      \text{minimize}\hspace{5mm} &
+      \underbrace{\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
+      _{\text{Likelihood}}
+      + \underbrace{\text{TODO}}_{\text{Constraints}} \\
+%      + \sum_{j\in\mathcal{J}} \boldsymbol{\lambda}^\text{T}_j
+%      \left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \right)
+%      + \frac{\mu}{2}\sum_{j\in\mathcal{J}}
+%      \lVert \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \rVert^2_2 \\
+      \text{subject to}\hspace{5mm} & \text{TODO}
+    \end{align*}
+
+    \begin{genericAlgorithm}[caption={}, label={},
+      basicstyle=\fontsize{10.5}{15}\selectfont
+      ]
+Initialize variables
+while stopping criterion not satisfied do
+  $\tilde{\boldsymbol{c}} \leftarrow \text{prox}_{\text{TODO}}\left( \text{TODO} \right) $
+  $\boldsymbol{z} \leftarrow \text{prox}_{\text{TODO}}\left( \text{TODO} \right) $
+  $\boldsymbol{u}_j \leftarrow \boldsymbol{u}_j
+    + \boldsymbol{T}_j\tilde{\boldsymbol{c}}
+    - \boldsymbol{z}_j$
+end while
+return $\tilde{\boldsymbol{c}}$
+    \end{genericAlgorithm}
+
+    \caption{\ac{ADMM}}
+    \label{fig:ana:theo_comp_alg:admm}
+  \end{subfigure}%
+
+  \caption{Comparison of the proximal gradient method and \ac{ADMM}}
+  \label{fig:ana:theo_comp_alg}
+\end{figure}%
+%
+\noindent The objective functions of both problems are similar in that they
+both comprise two parts: one associated with the likelihood that a given
+codeword was sent and one associated with the constraints to which the
+codeword is subject.
+Their major difference is that the two parts of the objective minimized by
+proximal decoding are both functions of the same variable
+$\tilde{\boldsymbol{x}}$, whereas with \ac{ADMM} the two parts depend on
+different variables: $\tilde{\boldsymbol{c}}$ and $\boldsymbol{z}$.
+This difference means that, while with proximal decoding the alternating
+minimization of the two parts of the objective function inevitably leads to
+oscillatory behaviour (as explained in section (TODO)), this is not the
+case with \ac{ADMM}.
+
+Another aspect that partly explains the disparate decoding performance is the
+difference in the minimization step handling the constraints.
+While with proximal decoding this step is performed using gradient descent,
+which amounts to an approximation, with \ac{ADMM} it reduces to a number of
+projections onto the parity polytopes $\mathcal{P}_{d_j}$, which always yield
+exact results; a structural sketch of this projection-based loop is given at
+the end of this section.
+
+\begin{itemize}
+  \item Comparing concrete implementations is always somewhat contentious,
+    since differences in algorithm performance are difficult to separate
+    from differences in implementation quality.
+  \item No large difference in computational performance is observed: the
+    parallelism of \ac{ADMM} cannot come to fruition, as decoding is
+    performed on the same number of cores for both algorithms (multiple
+    codewords are decoded in parallel).
+  \item Nonetheless, in real-time applications, or applications whose focus
+    is not the bulk decoding of raw data, \ac{ADMM} has an advantage, since
+    a single codeword is decoded faster.
+  \item Where \ac{ADMM} is faster than proximal decoding, this can be
+    attributed to its parallelism.
+  \item Where proximal decoding is faster than \ac{ADMM}, the cause is still
+    unclear (possibly a larger number of iterations before \ac{ADMM}
+    converges?).
+\end{itemize}
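+
+To make the preceding structural comparison concrete, the following listing
+gives a minimal Python sketch of the generic two-step proximal gradient loop
+from figure \ref{fig:ana:theo_comp_alg:prox}.
+The quadratic data term and the clipping operation used here are purely
+illustrative stand-ins for the actual likelihood gradient and for the
+proximal operator of $\gamma h$ used in proximal decoding; the step size,
+stopping criterion and sign conventions are likewise generic and do not
+correspond to the evaluated implementation.
+
+\begin{genericAlgorithm}[caption={}, label={},
+  basicstyle=\fontsize{10.5}{15}\selectfont
+  ]
+import numpy as np
+
+def proximal_gradient_decode(y, grad_L, prox_gamma_h, omega=0.1,
+                             max_iter=200, tol=1e-6):
+    """Generic two-step loop: gradient step on the likelihood part,
+    proximal step on the constraint part."""
+    s = np.copy(y)                            # initialize with the channel output
+    for _ in range(max_iter):
+        r = s - omega * grad_L(y, s)          # gradient step on the likelihood term
+        s_next = prox_gamma_h(r)              # proximal step on the constraint term
+        if np.linalg.norm(s_next - s) < tol:  # simple stopping criterion
+            return s_next
+        s = s_next
+    return s
+
+# Illustrative stand-ins (assumptions, not the definitions used in this work):
+grad_L = lambda y, s: s - y                     # gradient of a quadratic data term
+prox_gamma_h = lambda r: np.clip(r, -1.0, 1.0)  # keep the estimate inside a box
+
+y = np.array([0.9, -1.2, 0.3, -0.8])
+print(proximal_gradient_decode(y, grad_L, prox_gamma_h))
+\end{genericAlgorithm}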
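+
+Analogously, the following sketch illustrates the structure of the \ac{ADMM}
+iteration from figure \ref{fig:ana:theo_comp_alg:admm}: a quadratic
+minimization for $\tilde{\boldsymbol{c}}$, a projection step for the
+auxiliary variables $\boldsymbol{z}_j$, and the dual update.
+The projection used below is a simple clipping onto the unit box and merely
+stands in for the exact projection onto the parity polytopes
+$\mathcal{P}_{d_j}$; the parity-check matrix, costs and parameter values are
+arbitrary, so the sketch only illustrates the loop structure, not a working
+\ac{LP} decoder.
+
+\begin{genericAlgorithm}[caption={}, label={},
+  basicstyle=\fontsize{10.5}{15}\selectfont
+  ]
+import numpy as np
+
+def admm_lp_decode_sketch(H, gamma, project, mu=1.0, max_iter=100):
+    """Structural sketch: c-update, z-update (projection), dual update."""
+    m, n = H.shape
+    # T_j selects the bits participating in check j (rows of the identity).
+    T = [np.eye(n)[H[j].astype(bool)] for j in range(m)]
+    z = [np.full(t.shape[0], 0.5) for t in T]   # auxiliary (replica) variables
+    u = [np.zeros(t.shape[0]) for t in T]       # scaled dual variables
+    A = sum(t.T @ t for t in T)                 # diagonal matrix of variable degrees
+    for _ in range(max_iter):
+        # c-update: minimize gamma^T c plus the quadratic penalty terms
+        rhs = sum(t.T @ (zj - uj) for t, zj, uj in zip(T, z, u)) - gamma / mu
+        c = np.linalg.solve(A, rhs)
+        # z-update: placeholder projection instead of the one onto P_dj
+        z = [project(t @ c + uj) for t, uj in zip(T, u)]
+        # dual update, exactly as in the figure
+        u = [uj + t @ c - zj for t, uj, zj in zip(T, u, z)]
+    return (c > 0.5).astype(int)
+
+box_project = lambda v: np.clip(v, 0.0, 1.0)    # placeholder, NOT the parity polytope
+
+H = np.array([[1, 1, 0, 1],
+              [0, 1, 1, 1]])
+gamma = np.array([-1.5, 0.4, -0.2, -2.0])       # illustrative LLR-like costs
+print(admm_lp_decode_sketch(H, gamma, box_project))
+\end{genericAlgorithm}
+
+In an actual \ac{LP} decoder the placeholder projection would be replaced by
+the projection onto $\mathcal{P}_{d_j}$, which is exactly the step that
+distinguishes \ac{ADMM} from the approximate, gradient-based handling of the
+constraints in proximal decoding.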