Added first draft of admm vs. proximal decoding comparison

2023-04-01 19:22:50 +02:00 · 2023-04-01 19:22:50 +02:00 · b0c66bb454
commit b0c66bb454
parent 8234f36278
1 changed files with 124 additions and 0 deletions
--- a/latex/thesis/chapters/analysis_of_results.tex
+++ b/latex/thesis/chapters/analysis_of_results.tex
@@ -28,3 +28,127 @@
        results when comparing implementations)
 \end{itemize}

+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Comparison of the Proximal Decoding and LP Decoding using ADMM algorithms}%
+\label{sec:Comparison of the Proximal Decoding and LP Decoding using ADMM algorithms}
+
+In this section, some similarities between the proximal decoding algorithm
+and \ac{LP} decoding using \ac{ADMM} are pointed out.
+The two algorithms are compared and their different computational and decoding
+performance is explained on the basis of their theoretical structure.
+
+\ac{ADMM} and the proximal gradient method can both be expressed in terms of
+proximal operators.
+Both are iterative and alternate between two steps,
+each of which minimizes one distinct part of the objective function.
+The approaches they are based on, however, are fundamentally different.
+In figure \ref{fig:ana:theo_comp_alg}, the two algorithms are juxtaposed
+in their proximal operator form, together with the optimization problems
+they are meant to solve.%
+%
+\begin{figure}[H]
+    \centering
+   
+    \begin{subfigure}{0.48\textwidth}
+        \centering
+   
+        \begin{align*}
+            \text{minimize}\hspace{2mm}   & \underbrace{L\left( \boldsymbol{y} \mid
+                    \tilde{\boldsymbol{x}} \right)}_{\text{Likelihood}}
+                    + \underbrace{\gamma h\left( \tilde{\boldsymbol{x}} \right)}
+                        _{\text{Constraints}} \\
+            \text{subject to}\hspace{2mm} &\tilde{\boldsymbol{x}} \in \mathbb{R}^n
+        \end{align*}
+        
+        \begin{genericAlgorithm}[caption={}, label={},
+            basicstyle=\fontsize{10.5}{15}\selectfont
+            ]
+Initialize variables
+while stopping criterion not satisfied do
+    $\boldsymbol{r} \leftarrow \boldsymbol{s}
+        - \omega \nabla L\left( \boldsymbol{y} \mid \boldsymbol{s} \right) $
+    $\boldsymbol{s} \leftarrow
+        \text{prox}_{\gamma h}\left( \boldsymbol{r} \right) $|\Suppressnumber|
+|\Reactivatenumber|
+end while
+return $\boldsymbol{s}$
+        \end{genericAlgorithm}
+
+        \caption{Proximal gradient method}
+        \label{fig:ana:theo_comp_alg:prox}
+    \end{subfigure}\hfill%
+    \begin{subfigure}{0.48\textwidth}
+        \centering
+        
+        \begin{align*}
+            \text{minimize}\hspace{5mm} &
+                \underbrace{\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
+                    _{\text{Likelihood}}
+                    + \underbrace{\text{TODO}}_{\text{Constraints}} \\
+%                + \sum_{j\in\mathcal{J}} \boldsymbol{\lambda}^\text{T}_j
+%                    \left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \right) 
+%                + \frac{\mu}{2}\sum_{j\in\mathcal{J}}
+%                    \lVert \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \rVert^2_2 \\
+            \text{subject to}\hspace{5mm} & \text{TODO}
+        \end{align*}
+    
+        \begin{genericAlgorithm}[caption={}, label={},
+            basicstyle=\fontsize{10.5}{15}\selectfont
+            ]
+Initialize variables
+while stopping criterion not satisfied do
+    $\tilde{\boldsymbol{c}} \leftarrow \text{prox}_{\text{TODO}}\left( \text{TODO} \right) $
+    $\boldsymbol{z} \leftarrow \text{prox}_{\text{TODO}}\left( \text{TODO} \right) $
+    $\boldsymbol{u}_j \leftarrow \boldsymbol{u}_j
+        + \boldsymbol{T}_j\tilde{\boldsymbol{c}}
+        - \boldsymbol{z}_j$
+end while
+return $\tilde{\boldsymbol{c}}$
+        \end{genericAlgorithm}
+    
+        \caption{\ac{ADMM}}
+        \label{fig:ana:theo_comp_alg:admm}
+    \end{subfigure}%
+    
+
+    \caption{Comparison of the proximal gradient method and \ac{ADMM}}
+    \label{fig:ana:theo_comp_alg}
+\end{figure}%
+%
+\noindent The objective functions of both problems are similar in that they
+both comprise two parts: one associated with the likelihood that a given
+codeword was sent and one associated with the constraints the codeword is
+subject to.
+Their major difference is that the two parts of the objective minimized with
+proximal decoding are both functions of the same variable
+$\tilde{\boldsymbol{x}}$, whereas with \ac{ADMM} the two parts depend on
+different variables: $\tilde{\boldsymbol{c}}$ and $\boldsymbol{z}$.
+This difference means that while with proximal decoding the alternating
+minimization of the two parts of the objective function inevitably leads to
+oscillatory behaviour (as explained in section (TODO)), this is not the
+case with \ac{ADMM}.
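+
+As a rough sketch of this structural difference, the two problems can be
+written in a generic form.
+The names $f$ and $g$ are placeholders assumed here for illustration only,
+standing for the likelihood part and the constraint part, respectively.
+The coupling
+$\boldsymbol{T}_j\tilde{\boldsymbol{c}} = \boldsymbol{z}_j$ is inferred from
+the update of $\boldsymbol{u}_j$ in
+figure \ref{fig:ana:theo_comp_alg:admm}.%
+%
+% Assumption: f and g are generic placeholders, not the final objective terms.
+\begin{align*}
+    \text{Proximal decoding:}\hspace{5mm}
+        & \min_{\tilde{\boldsymbol{x}}}\hspace{2mm}
+            f\left( \tilde{\boldsymbol{x}} \right)
+            + g\left( \tilde{\boldsymbol{x}} \right) \\
+    \text{ADMM:}\hspace{5mm}
+        & \min_{\tilde{\boldsymbol{c}}, \boldsymbol{z}}\hspace{2mm}
+            f\left( \tilde{\boldsymbol{c}} \right)
+            + g\left( \boldsymbol{z} \right)
+            \hspace{5mm}\text{subject to}\hspace{2mm}
+            \boldsymbol{T}_j\tilde{\boldsymbol{c}} = \boldsymbol{z}_j
+            \hspace{2mm} \forall j\in\mathcal{J}
+\end{align*}%
+%
+\noindent In the first form both terms act on the same iterate, whereas in
+the second form they are coupled only through the equality constraint.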
+
+Another aspect partly explaining the disparate decoding performance is the
+difference in the minimization step handling the constraints.
+With proximal decoding, this step is performed using gradient descent,
+which amounts to an approximation.
+With \ac{ADMM}, it reduces to a number of projections onto the parity
+polytopes $\mathcal{P}_{d_j}$, which always provide exact results.
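+
+The nature of this approximation can be made explicit with a short
+derivation, sketched here under the assumption that $h$ is differentiable.
+The symbols $\boldsymbol{x}^\star$ and $\boldsymbol{v}$ are introduced
+for illustration only.
+By definition of the proximal operator,%
+%
+\begin{align*}
+    \text{prox}_{\gamma h}\left( \boldsymbol{r} \right)
+        = \operatorname*{arg\,min}_{\boldsymbol{x}} \left(
+            \gamma h\left( \boldsymbol{x} \right)
+            + \frac{1}{2}\lVert \boldsymbol{x} - \boldsymbol{r} \rVert_2^2
+        \right)
+,\end{align*}%
+%
+so any minimizer $\boldsymbol{x}^\star$ satisfies the implicit condition%
+%
+\begin{align*}
+    \boldsymbol{x}^\star
+        = \boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{x}^\star \right)
+.\end{align*}%
+%
+% Assumption: evaluating the gradient at r instead of x^* is the
+% approximation referred to in the text.
+Evaluating the gradient at $\boldsymbol{r}$ instead of $\boldsymbol{x}^\star$
+yields the explicit gradient step
+$\boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right)$, which is
+only a first-order approximation in $\gamma$.
+The projection onto a parity polytope, in contrast,%
+%
+\begin{align*}
+    \Pi_{\mathcal{P}_{d_j}}\left( \boldsymbol{v} \right)
+        = \operatorname*{arg\,min}_{\boldsymbol{z} \in \mathcal{P}_{d_j}}
+            \lVert \boldsymbol{z} - \boldsymbol{v} \rVert_2^2
+,\end{align*}%
+%
+is itself the proximal operator of the indicator function of
+$\mathcal{P}_{d_j}$, so no approximation of this kind enters at that point.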
+
+\begin{itemize}
+    \item The comparison of actual implementations is always debatable,
+        since it is difficult to separate differences in
+        algorithm performance from differences in implementation
+    \item No large difference in computational performance $\rightarrow$
+        Parallelism cannot come to fruition as decoding is performed on the
+        same number of cores for both algorithms (Multiple decodings in parallel)
+    \item Nonetheless, in real-time applications / applications where the
+        focus is not the mass decoding of raw data, \ac{ADMM} has advantages,
+        since the decoding of a single codeword is performed faster
+    \item \ac{ADMM} faster than proximal decoding $\rightarrow$
+        Parallelism
+    \item Proximal decoding faster than \ac{ADMM} $\rightarrow$ unexpected
+        (larger number of iterations before convergence?)
+\end{itemize}