\chapter{Comparison of Proximal Decoding and \acs{LP} Decoding using \acs{ADMM}}%
\label{chapter:comparison}
In this chapter, proximal decoding and \ac{LP} decoding using \ac{ADMM} are compared.
First, the two algorithms are compared on a theoretical basis. Subsequently, their
respective simulation results are examined and their differences are interpreted on the
basis of their theoretical structure.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Theoretical Comparison}%
\label{sec:comp:theo}
\ac{ADMM} and the proximal gradient method can both be expressed in terms of proximal
operators. Both are iterative approaches consisting of two alternating steps, and in both
cases each step minimizes one distinct part of the objective function. They do, however,
have some fundamental differences. In figure~\ref{fig:ana:theo_comp_alg} the two
algorithms are juxtaposed in their proximal operator form, in conjunction with the
optimization problems they are meant to solve.%
%
\begin{figure}[H]
	\centering
	\begin{subfigure}{0.48\textwidth}
		\centering
		\begin{align*}
			\text{minimize}\hspace{2mm} & \underbrace{L\left( \boldsymbol{y} \mid
			\tilde{\boldsymbol{x}} \right)}_{\text{Likelihood}}
			+ \underbrace{\gamma h\left( \tilde{\boldsymbol{x}} \right)}
			_{\text{Constraints}} \\
			\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{x}} \in \mathbb{R}^n
		\end{align*}
		\begin{genericAlgorithm}[caption={}, label={},
			basicstyle=\fontsize{11}{17}\selectfont ]
Initialize $\boldsymbol{r}, \boldsymbol{s}, \omega, \gamma$
while stopping criterion not satisfied do
    $\boldsymbol{r} \leftarrow \boldsymbol{r} + \omega \nabla L\left( \boldsymbol{y} \mid \boldsymbol{s} \right) $
    $\boldsymbol{s} \leftarrow \textbf{prox}_{\scaleto{\gamma h}{7.5pt}}\left( \boldsymbol{r} \right) $|\Suppressnumber|
|\Reactivatenumber|
end while
return $\boldsymbol{s}$
		\end{genericAlgorithm}
		\caption{Proximal gradient method}
		\label{fig:ana:theo_comp_alg:prox}
	\end{subfigure}\hfill%
	\begin{subfigure}{0.48\textwidth}
		\centering
		\begin{align*}
			\text{minimize}\hspace{5mm} & \underbrace{\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
			_{\text{Likelihood}}
			+ \underbrace{g\left( \boldsymbol{T}\tilde{\boldsymbol{c}} \right) }
			_{\text{Constraints}} \\
			\text{subject to}\hspace{5mm} & \tilde{\boldsymbol{c}} \in \mathbb{R}^n
			% \boldsymbol{T}_j\tilde{\boldsymbol{c}} = \boldsymbol{z}_j\hspace{3mm}
			% \forall j\in\mathcal{J}
		\end{align*}
		\begin{genericAlgorithm}[caption={}, label={},
			basicstyle=\fontsize{11}{17}\selectfont ]
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \mu, \lambda$
while stopping criterion not satisfied do
    $\tilde{\boldsymbol{c}} \leftarrow \textbf{prox}_{\scaleto{\mu \cdot \boldsymbol{\gamma}^{\text{T}}\tilde{\boldsymbol{c}}}{8.5pt}} \left( \tilde{\boldsymbol{c}} - \frac{\mu}{\lambda}\boldsymbol{T}^\text{T}\left( \boldsymbol{T}\tilde{\boldsymbol{c}} - \boldsymbol{z} + \boldsymbol{u} \right) \right)$
    $\boldsymbol{z} \leftarrow \textbf{prox}_{\scaleto{g}{7pt}} \left( \boldsymbol{T}\tilde{\boldsymbol{c}} + \boldsymbol{u} \right)$
    $\boldsymbol{u} \leftarrow \boldsymbol{u} + \boldsymbol{T}\tilde{\boldsymbol{c}} - \boldsymbol{z}$
end while
return $\tilde{\boldsymbol{c}}$
		\end{genericAlgorithm}
		\caption{\ac{ADMM}}
		\label{fig:ana:theo_comp_alg:admm}
	\end{subfigure}%
	\caption{Comparison of the proximal gradient method and \ac{ADMM}}
	\label{fig:ana:theo_comp_alg}
\end{figure}%
%
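\noindent
As a brief remark on the two proximal operators appearing in the \ac{ADMM} listing of
figure~\ref{fig:ana:theo_comp_alg:admm}, both admit simple closed forms; here
$\boldsymbol{v}$ merely denotes the argument of the respective operator, and $g$ is taken
to be the indicator function of the parity-polytope constraints (see
\ref{chapter:LD Decoding using ADMM as a Proximal Algorithm}). For the scaled likelihood
term $\mu\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}$ in line 3, the minimand in
the definition of the proximal operator is quadratic in $\tilde{\boldsymbol{c}}$, so
setting its gradient to zero yields a plain shift,
\begin{align*}
	\textbf{prox}_{\mu\boldsymbol{\gamma}^\text{T}\left( \cdot \right)}
	\left( \boldsymbol{v} \right)
	= \operatorname*{arg\,min}_{\tilde{\boldsymbol{c}} \in \mathbb{R}^n}
	\left( \mu\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}
	+ \frac{1}{2}\left\| \tilde{\boldsymbol{c}} - \boldsymbol{v} \right\|_2^2 \right)
	= \boldsymbol{v} - \mu\boldsymbol{\gamma}.
\end{align*}
For the constraint term in line 4, the proximal operator of an indicator function is the
Euclidean projection onto the corresponding set, which here decomposes blockwise over the
parity checks,
\begin{align*}
	\left[ \textbf{prox}_{g}\left( \boldsymbol{v} \right) \right]_j
	= \Pi_{\mathcal{P}_{d_j}}\left( \boldsymbol{v}_j \right)
	\hspace{3mm} \forall j\in\mathcal{J},
\end{align*}
where $\Pi_{\mathcal{P}_{d_j}}$ denotes the projection onto the parity polytope
$\mathcal{P}_{d_j}$ and $\boldsymbol{v}_j$ the subvector of $\boldsymbol{v}$ associated
with parity check $j$. For proximal decoding, by contrast, the operator
$\textbf{prox}_{\gamma h}$ in line 4 has no such closed form and is approximated by a
gradient step, a difference that is taken up again below.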
\noindent
The objective functions of the two problems are similar in that both comprise two parts:
one associated with the likelihood that a given codeword was sent, and one associated
with the constraints the codeword is subjected to. Their major difference is that with
proximal decoding the constraints are treated in a global context, considering all parity
checks at the same time, whereas with \ac{ADMM} each parity check is considered
separately and in a more local context (line 4 in both algorithms). As a consequence,
with proximal decoding the alternating minimization of the two parts of the objective
function inevitably leads to oscillatory behaviour (as explained in
section~\ref{subsec:prox:conv_properties}), while this is not the case with \ac{ADMM},
which partly explains the disparate decoding performance of the two methods.
Furthermore, with proximal decoding the step considering the constraints is realized
using gradient descent and therefore amounts to an approximation, whereas with \ac{ADMM}
it reduces to a number of projections onto the parity polytopes $\mathcal{P}_{d_j}$ (see
\ref{chapter:LD Decoding using ADMM as a Proximal Algorithm}), which always provide exact
results.

The contrasting treatment of the constraints (global and approximate with proximal
decoding, local and exact with \ac{ADMM}) also leads to different prospects when the
decoding process gets stuck in a local minimum. With proximal decoding this occurs due to
the approximate nature of the calculation, whereas with \ac{ADMM} it occurs due to the
relaxed formulation of the constraints and does not depend on the optimization method
itself. The resulting advantage of \ac{ADMM} is that it is easy to detect when the
algorithm gets stuck: in this case it returns a pseudocodeword, some components of which
are fractional. Additional constraints can then be added successively until a valid
codeword is returned.
\todo{Compare time complexity using Big-O notation}
\begin{itemize}
	\item A comparison of actual implementations is always somewhat contentious, since it
		is difficult to separate differences in algorithmic performance from differences
		in implementation.
	\item No large difference in computational performance is observed: the potential for
		parallelism cannot come to fruition, as decoding is performed on the same number
		of cores for both algorithms (multiple codewords are decoded in parallel).
	\item Nonetheless, in real-time applications, or in applications where the focus is
		not the bulk decoding of raw data, \ac{ADMM} has advantages, since the decoding
		of a single codeword is performed faster.
	\item Where \ac{ADMM} is faster than proximal decoding, this can be attributed to its
		greater degree of parallelism.
	\item Where proximal decoding is faster than \ac{ADMM}, the reason is less obvious;
		possible explanations are a larger number of iterations before \ac{ADMM}
		converges and the larger number of variables that have to be computed in each
		\ac{ADMM} iteration.
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Comparison of Simulation Results}%
\label{sec:comp:res}
TODO