Finished first version of theoretical comparison text

This commit is contained in:
Andreas Tsouchlos 2023-04-12 22:54:33 +02:00
parent 55c9e2808b
commit a1a1fa1f71
3 changed files with 180 additions and 63 deletions

View File

@ -810,6 +810,10 @@ $\gamma \in \left\{ 0.01, 0.05, 0.15 \right\}$.
\end{figure}
\chapter{Proximal Decoding as a Message Passing Algorithm}
\label{chapter:Proximal Decoding as a Message Passing Algorithm}
%\chapter{\acs{LP} Decoding using \acs{ADMM} as a Proximal Algorithm}%
%\label{chapter:LD Decoding using ADMM as a Proximal Algorithm}
%

View File

@ -1,15 +1,11 @@
\chapter{Comparison of Proximal Decoding and \acs{LP} Decoding using \acs{ADMM}}%
\label{chapter:comparison}
TODO
In this chapter, proximal decoding and \ac{LP} decoding using \ac{ADMM} are compared.
First, the two algorithms are compared on a theoretical level.
Subsequently, their respective simulation results are examined and the
differences are interpreted on the basis of their theoretical structure.
\todo{Note: This chapter is currently only a very rough draft and is not yet ready for correction}
%In this chapter, proximal decoding and \ac{LP} Decoding using \ac{ADMM} are compared.
%First the two algorithms are compared on a theoretical basis.
%Subsequently, their respective simulation results are examined and their
%differences interpreted on the basis of their theoretical structure.
%
%some similarities between the proximal decoding algorithm
%and \ac{LP} decoding using \ac{ADMM} are pointed out.
%The two algorithms are compared and their different computational and decoding
@ -21,19 +17,39 @@ TODO
\label{sec:comp:theo}
\ac{ADMM} and the proximal gradient method can both be expressed in terms of
proximal operators.
Both are iterative methods consisting of two
alternating steps.
In both cases each step minimizes one distinct part of the objective function.
They do, however, have some fundamental differences.
In figure \ref{fig:ana:theo_comp_alg} the two algorithms are juxtaposed in their
proximal operator form, in conjunction with the optimization problems they
are meant to solve.%
proximal operators \cite[Sec. 4.4]{proximal_algorithms}.
When \ac{ADMM} is used specifically to solve the \ac{LP} decoding problem,
however, such a purely proximal-operator formulation is not quite possible
because of the multiple constraints.
In spite of that, the two algorithms still show some striking similarities.
To see the first of these similarities, the \ac{LP} decoding problem in
equation (\ref{eq:lp:relaxed_formulation}) can be slightly rewritten using the
\textit{indicator functions} $g_j : \mathbb{R}^{d_j} \rightarrow
\left\{ 0, +\infty \right\} \hspace{1mm}, j\in\mathcal{J}$ for the polytopes
$\mathcal{P}_{d_j}, \hspace{1mm} j\in\mathcal{J}$, defined as%
%
\begin{figure}[H]
\begin{align*}
g_j\left( \boldsymbol{t} \right) := \begin{cases}
0, & \boldsymbol{t} \in \mathcal{P}_{d_j} \\
+\infty, & \boldsymbol{t} \not\in \mathcal{P}_{d_j}
\end{cases}
,\end{align*}
%
by moving the constraints into the objective function, as shown in figure
\ref{fig:ana:theo_comp_alg:admm}.
Both algorithms are iterative, consisting of two alternating steps.
The objective functions of both problems are similar in that they
both comprise two parts: one associated with the likelihood that a given
codeword was sent and one associated with the constraints the decoding process
is subject to.
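To make the role of the indicator functions concrete, the following minimal
Python sketch evaluates $g_j$ via the facet description of the parity
polytope: a point $\boldsymbol{t} \in \left[ 0, 1 \right]^{d_j}$ lies in
$\mathcal{P}_{d_j}$ if and only if the most violated odd-set inequality,
obtained by thresholding at $0.5$, is satisfied.
The function names are purely illustrative and not part of either decoder.
\begin{lstlisting}[language=Python]
import numpy as np

def in_parity_polytope(t, tol=1e-9):
    """Membership test for the parity polytope P_d, i.e. g_j(t) == 0.

    Checks t in [0,1]^d together with the single most violated odd-set
    facet inequality sum_{i in V} t_i - sum_{i not in V} t_i <= |V| - 1,
    where V is obtained by thresholding at 0.5 (adjusted to odd size).
    """
    t = np.asarray(t, dtype=float)
    if np.any(t < -tol) or np.any(t > 1.0 + tol):
        return False
    V = t > 0.5
    if V.sum() % 2 == 0:                      # enforce odd |V|
        V[np.argmin(np.abs(t - 0.5))] ^= True
    return t[V].sum() - t[~V].sum() <= V.sum() - 1 + tol

def g(t):
    """Indicator function g_j: 0 inside the polytope, +inf outside."""
    return 0.0 if in_parity_polytope(t) else np.inf
\end{lstlisting}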
%
\begin{figure}[h]
\centering
\begin{subfigure}{0.48\textwidth}
\begin{subfigure}{0.42\textwidth}
\centering
\begin{align*}
@ -45,10 +61,10 @@ are meant to solve.%
\end{align*}
\begin{genericAlgorithm}[caption={}, label={},
basicstyle=\fontsize{11}{17}\selectfont
basicstyle=\fontsize{10}{18}\selectfont
]
Initialize $\boldsymbol{r}, \boldsymbol{s}, \omega, \gamma$
while stopping criterion not satisfied do
while stopping criterion unfulfilled do
$\boldsymbol{r} \leftarrow \boldsymbol{r}
+ \omega \nabla L\left( \boldsymbol{y} \mid \boldsymbol{s} \right) $
$\boldsymbol{s} \leftarrow
@ -58,44 +74,42 @@ end while
return $\boldsymbol{s}$
\end{genericAlgorithm}
\caption{Proximal gradient method}
\caption{Proximal decoding}
\label{fig:ana:theo_comp_alg:prox}
\end{subfigure}\hfill%
\begin{subfigure}{0.48\textwidth}
\begin{subfigure}{0.55\textwidth}
\centering
\begin{align*}
\text{minimize}\hspace{5mm} &
\underbrace{\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}}
_{\text{Likelihood}}
+ \underbrace{g\left( \boldsymbol{T}\tilde{\boldsymbol{c}} \right) }
+ \underbrace{\sum\nolimits_{j\in\mathcal{J}} g_j\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} \right) }
_{\text{Constraints}} \\
\text{subject to}\hspace{5mm} &
\tilde{\boldsymbol{c}} \in \mathbb{R}^n
% \boldsymbol{T}_j\tilde{\boldsymbol{c}} = \boldsymbol{z}_j\hspace{3mm}
% \forall j\in\mathcal{J}
\end{align*}
\begin{genericAlgorithm}[caption={}, label={},
basicstyle=\fontsize{11}{17}\selectfont
basicstyle=\fontsize{10}{18}\selectfont
]
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \nu$
while stopping criterion not satisfied do
$\tilde{\boldsymbol{c}} \leftarrow \textbf{prox}_{
\scaleto{\nu \cdot \boldsymbol{\gamma}^{\text{T}}\tilde{\boldsymbol{c}}}{8.5pt}}
\left( \tilde{\boldsymbol{c}}
- \frac{\mu}{\lambda}\boldsymbol{T}^\text{T}\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
- \boldsymbol{z} + \boldsymbol{u} \right) \right)$
$\boldsymbol{z} \leftarrow \textbf{prox}_{\scaleto{g}{7pt}}
\left( \boldsymbol{T}\tilde{\boldsymbol{c}}
+ \boldsymbol{u} \right)$
$\boldsymbol{u} \leftarrow \boldsymbol{u}
+ \tilde{\boldsymbol{c}} - \boldsymbol{z}$
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \rho$
while stopping criterion unfulfilled do
$\tilde{\boldsymbol{c}} \leftarrow \argmin_{\tilde{\boldsymbol{c}}}
\left( \boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}
+ \frac{\rho}{2}\sum_{j\in\mathcal{J}} \left\Vert
\boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j
+ \boldsymbol{u}_j \right\Vert_2^2 \right)$
$\boldsymbol{z}_j \leftarrow \textbf{prox}_{g_j}
\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}}
+ \boldsymbol{u}_j \right), \hspace{5mm}\forall j\in\mathcal{J}$
$\boldsymbol{u}_j \leftarrow \boldsymbol{u}_j
+ \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j, \hspace{5mm}\forall j\in\mathcal{J}$
end while
return $\tilde{\boldsymbol{c}}$
\end{genericAlgorithm}
\caption{\ac{ADMM}}
\caption{\ac{LP} decoding using \ac{ADMM}}
\label{fig:ana:theo_comp_alg:admm}
\end{subfigure}%
@ -104,10 +118,6 @@ return $\tilde{\boldsymbol{c}}$
\label{fig:ana:theo_comp_alg}
\end{figure}%
%
\noindent The objective functions of both problems are similar in that they
both comprise two parts: one associated with the likelihood that a given
codeword was sent and one associated with the constraints the codeword is
subject to.
Their major difference is that while with proximal decoding the constraints
are regarded in a global context, considering all parity checks at the same
@ -124,19 +134,129 @@ with \ac{ADMM} it reduces to a number of projections onto the parity polytopes
$\mathcal{P}_{d_j}$ which always provide exact results.
The contrasting treatment of the constraints (global and approximate with
proximal decoding, local and exact with \ac{ADMM}) also leads to different
prospects when the decoding process gets stuck in a local minimum.
proximal decoding as opposed to local and exact with \ac{LP} decoding using
\ac{ADMM}) also leads to different behavior when the decoding process gets
stuck in a local minimum.
With proximal decoding this occurs due to the approximate nature of the
calculation, whereas with \ac{ADMM} it occurs due to the approximate
formulation of the constraints - not depending on the optimization method
itself.
The advantage which arises because of this when using \ac{ADMM} is that
it can be easily detected, when the algorithm gets stuck - the algorithm
returns a pseudocodeword, the components of which are fractional.
\todo{Additional constraints can then be successively added, until a valid
codeword is returned}
calculation, whereas with \ac{LP} decoding it occurs due to the approximate
formulation of the constraints, independent of the optimization method
itself.
The resulting advantage of \ac{LP} decoding is that it is easily detected
when the algorithm gets stuck: it returns a solution corresponding to a
pseudocodeword, the components of which are fractional.
Moreover, when a valid codeword is returned, it is also the \ac{ML} codeword.
This means that additional redundant parity checks can be added successively
until the returned codeword is valid and thus the \ac{ML} solution is found
\cite[Sec. IV.]{alp}.
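Detecting such a pseudocodeword amounts to checking the returned solution for
fractional components.
The following is a minimal Python sketch of this adaptive scheme; the solver
\texttt{lp\_decode} and the cut-generation helper
\texttt{add\_redundant\_checks} are hypothetical names, not functions defined
in this work.
\begin{lstlisting}[language=Python]
import numpy as np

def is_integral(c_tilde, tol=1e-5):
    """True iff every component of c_tilde is numerically 0 or 1;
    a fractional component indicates a pseudocodeword."""
    c_tilde = np.asarray(c_tilde, dtype=float)
    return bool(np.all(np.minimum(c_tilde, 1.0 - c_tilde) < tol))

def adaptive_lp_decode(gamma, H, max_rounds=20):
    """Sketch of the adaptive scheme described above: re-solve the
    relaxation, adding redundant parity checks until the solution is
    integral (and therefore the ML codeword).

    lp_decode and add_redundant_checks are assumed, hypothetical helpers.
    """
    for _ in range(max_rounds):
        c = lp_decode(gamma, H)               # solve current relaxation
        if is_integral(c):
            return np.round(c)                # ML codeword found
        H = add_redundant_checks(H, c)        # tighten the relaxation
    return None                               # report decoding failure
\end{lstlisting}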
\todo{Compare time complexity using Big-O notation}
In terms of time complexity, the two decoding algorithms are comparable.
Each of the operations required for proximal decoding can be performed
in linear time for \ac{LDPC} codes (see section \ref{subsec:prox:comp_perf}).
The same is true for the $\tilde{\boldsymbol{c}}$- and $\boldsymbol{u}$-update
steps of \ac{LP} decoding using \ac{ADMM}, while
the projection step has a worst-case time complexity of
$\mathcal{O}\left( n^2 \right)$ and an average complexity of
$\mathcal{O}\left( n \right)$ (see section TODO, \cite[Sec. VIII.]{lautern}).
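The projection step that dominates the per-iteration cost can be sketched in
Python as follows.
The sketch uses the common two-phase approach (projection onto the unit cube,
then projection onto the single most violated odd-set facet, solved by
bisection on the dual variable); it is meant as an illustration under these
assumptions, not as the exact algorithm analyzed in
\cite[Sec. VIII.]{lautern}.
\begin{lstlisting}[language=Python]
import numpy as np

def project_parity_polytope(v, iters=50):
    """Euclidean projection onto the parity polytope P_d (sketch).

    Phase 1: clip to [0,1]^d and locate the most violated odd-set facet.
    Phase 2: if that facet is violated, project onto it (intersected
    with the cube) via bisection on the dual variable beta.
    """
    v = np.asarray(v, dtype=float)
    theta = np.clip(v, 0.0, 1.0)
    V = theta > 0.5
    if V.sum() % 2 == 0:                      # facet sets have odd size
        V[np.argmin(np.abs(theta - 0.5))] ^= True
    a = np.where(V, 1.0, -1.0)                # facet normal
    rhs = V.sum() - 1.0
    if a @ theta <= rhs:
        return theta                          # clipped point lies in P_d
    lo, hi = 0.0, np.abs(v).max() + 1.0       # a @ clip(v - b*a, 0, 1)
    for _ in range(iters):                    # is non-increasing in b
        beta = 0.5 * (lo + hi)
        if a @ np.clip(v - beta * a, 0.0, 1.0) > rhs:
            lo = beta
        else:
            hi = beta
    return np.clip(v - 0.5 * (lo + hi) * a, 0.0, 1.0)
\end{lstlisting}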
Both algorithms can be understood as message-passing algorithms: \ac{LP}
decoding using \ac{ADMM} similarly to \cite[Sec. III. D.]{original_admm}
or \cite[Sec. II. B.]{efficient_lp_dec_admm}, and proximal decoding as shown in
appendix \ref{chapter:Proximal Decoding as a Message Passing Algorithm}.
The algorithms in their message-passing form are depicted in figure
\ref{fig:comp:message_passing}.
$M_{j\to i}$ denotes the message transmitted from \ac{CN} $j$ to \ac{VN} $i$.
$M_{j\to}$ signifies the special case where a \ac{CN} transmits the same
message to all of its \acp{VN}.
%
\begin{figure}[h]
\centering
\begin{subfigure}{0.5\textwidth}
\centering
\begin{genericAlgorithm}[caption={}, label={},
% basicstyle=\fontsize{10}{16}\selectfont
]
Initialize $\boldsymbol{r}, \boldsymbol{s}, \omega, \gamma$
while stopping criterion unfulfilled do
for j in $\mathcal{J}$ do
$p_j \leftarrow \prod_{i\in N_c\left( j \right) } r_i $
$M_{j\to} \leftarrow p_j^2 - p_j$|\Suppressnumber|
|\vspace{0.22mm}\Reactivatenumber|
end for
for i in $\mathcal{I}$ do
$s_i \leftarrow s_i + \gamma \left[ 4\left( s_i^2 - 1 \right)s_i
\phantom{\frac{4}{s_i}}\right.$|\Suppressnumber|
|\Reactivatenumber|$\left.+ \frac{4}{s_i}\sum_{j\in N_v\left( i \right) }
M_{j\to} \right] $
$r_i \leftarrow r_i + \omega \left( s_i - y_i \right)$
end for
end while
\end{genericAlgorithm}
\caption{Proximal decoding}
\label{fig:comp:message_passing:proximal}
\end{subfigure}%
\begin{subfigure}{0.5\textwidth}
\centering
\begin{genericAlgorithm}[caption={}, label={},
% basicstyle=\fontsize{10}{16}\selectfont
]
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}, \boldsymbol{u}, \boldsymbol{\gamma}, \rho$
while stopping criterion unfulfilled do
for j in $\mathcal{J}$ do
$\boldsymbol{z}_j \leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{u}_j\right)$
$\boldsymbol{u}_j \leftarrow \boldsymbol{u}_j + \boldsymbol{T}_j\tilde{\boldsymbol{c}}
- \boldsymbol{z}_j$
$M_{j\to i} \leftarrow \left( z_j \right)_i - \left( u_j \right)_i,
\hspace{3mm} \forall i \in N_c\left( j \right) $
end for
for i in $\mathcal{I}$ do
$\tilde{c}_i \leftarrow \frac{1}{d_i}
\left(\sum_{j\in N_v\left( i \right) } M_{j\to i}
- \frac{\gamma_i}{\rho} \right)$|\Suppressnumber|
|\vspace{7mm}\Reactivatenumber|
end for
end while
\end{genericAlgorithm}
\caption{\ac{LP} decoding using \ac{ADMM}}
\label{fig:comp:message_passing:admm}
\end{subfigure}%
\caption{Proximal decoding and \ac{LP} decoding using \ac{ADMM}
as message-passing algorithms}
\label{fig:comp:message_passing}
\end{figure}%
%
It is evident that while the two algorithms are very similar in their general
structure, with \ac{LP} decoding using \ac{ADMM} multiple messages have to be
computed for each check node (line 6 in figure
\ref{fig:comp:message_passing:admm}), whereas
with proximal decoding the same message is transmitted to all \acp{VN}
(line 5 of figure \ref{fig:comp:message_passing:proximal}).
This means that while both algorithms have an average time complexity of
$\mathcal{O}\left( n \right)$, more arithmetic operations are required in the
\ac{ADMM} case.
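For illustration, one iteration of the proximal decoder in this
message-passing form can be written out as the following Python sketch.
Adjacency lists for $N_c\left( j \right)$ and $N_v\left( i \right)$ are
assumed, and the update rules simply mirror the pseudocode in figure
\ref{fig:comp:message_passing:proximal}.
\begin{lstlisting}[language=Python]
import numpy as np

def proximal_mp_iteration(r, s, y, checks, var_checks, gamma, omega):
    """One proximal-decoding iteration in message-passing form.

    checks[j]     -- variable indices in N_c(j)
    var_checks[i] -- check indices in N_v(i)
    """
    # Check-node step: each CN j broadcasts a single message M_j
    M = np.empty(len(checks))
    for j, vn in enumerate(checks):
        p = np.prod(r[vn])
        M[j] = p * p - p
    # Variable-node step: update s_i (assumes s_i != 0, as in the
    # pseudocode), then the residual r_i
    for i, cn in enumerate(var_checks):
        s[i] += gamma * (4.0 * (s[i] ** 2 - 1.0) * s[i]
                         + 4.0 / s[i] * M[cn].sum())
        r[i] += omega * (s[i] - y[i])
    return r, s
\end{lstlisting}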
In conclusion, the two algorithms have a very similar structure, where the
parts of the objective function relating to the likelihood and to the
constraints are minimized in an alternating fashion.
With proximal decoding this minimization is performed for all constraints at once
in an approximate manner, while with \ac{LP} decoding using \ac{ADMM} it is
performed for each constraint individually and with exact results.
In terms of time complexity, both algorithms are, on average, linear with
respect to $n$, although for \ac{LP} decoding using \ac{ADMM} significantly
more arithmetic operations are necessary in each iteration.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Comparison of Simulation Results}%
\label{sec:comp:res}
\begin{itemize}
\item The comparison of actual implementations is always debatable /
@ -153,11 +273,3 @@ codeword is returned}
\item Proximal decoding faster than \ac{ADMM} $\rightarrow$ surprising
(larger number of iterations before convergence? More values to compute for \ac{ADMM}?)
\end{itemize}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Comparison of Simulation Results}%
\label{sec:comp:res}
TODO

View File

@ -1169,6 +1169,7 @@ an invalid codeword.
\subsection{Computational Performance}
\label{subsec:prox:comp_perf}
In order to determine the time complexity of proximal decoding for an
\ac{AWGN} channel, the decoding process as depicted in algorithm
@ -1180,7 +1181,7 @@ with $n$, the two
recursive steps in lines 3 and 4 have time complexity
$\mathcal{O}\left( n \right)$ \cite[Sec. 4.1]{proximal_paper}.
The $\text{sign}$ operation in line 5 also has $\mathcal{O}\left( n \right) $
time complexity as it only depends on the $n$ components of $\boldsymbol{s}$.
time complexity, as it only depends on the $n$ components of $\boldsymbol{s}$.
Given the sparsity of the matrix $\boldsymbol{H}$, evaluating the parity-check
condition has linear time complexity as well, since at most $n$ additions and
multiplications have to be performed.
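As an illustration, the hard decision and the parity-check evaluation can be
carried out with a single sparse matrix-vector product; the following Python
sketch assumes the \ac{BPSK} mapping $0 \mapsto +1$, $1 \mapsto -1$.
\begin{lstlisting}[language=Python]
import numpy as np
from scipy.sparse import csr_matrix

def parity_check_holds(H: csr_matrix, s: np.ndarray) -> bool:
    """Evaluate H c = 0 (mod 2) for the hard decision obtained from s.

    The sparse product costs O(nnz(H)) operations, i.e. linear in n
    for an LDPC code with bounded check-node degree.
    """
    c = (1 - np.sign(s)) // 2                 # BPSK: +1 -> 0, -1 -> 1
    syndrome = (H @ c) % 2                    # sparse mat-vec
    return not np.any(syndrome)
\end{lstlisting}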