Restructured document
This commit is contained in: parent a0a13dbb2d, commit 9d0458f6b3
@ -104,7 +104,7 @@ The iterative algorithm can then be expressed as%
|
||||
%
|
||||
|
||||
Using the definition of the proximal operator, the $\tilde{\boldsymbol{c}}$ update step
|
||||
can be rewritten to match the definition given in section \ref{sec:dec:LP Decoding using ADMM}:%
|
||||
can be rewritten to match the definition given in section \ref{sec:lp:Decoding Algorithm}:%
|
||||
%
|
||||
\begin{align*}
|
||||
\tilde{\boldsymbol{c}} &\leftarrow \textbf{prox}_{\mu f}\left( \tilde{\boldsymbol{c}}
|
||||
|
||||
@ -1,42 +1,22 @@
|
||||
\chapter{Analysis of Results}%
|
||||
\label{chapter:Analysis of Results}
|
||||
\chapter{Comparison of Proximal Decoding and \acs{LP} Decoding using \acs{ADMM}}%
|
||||
\label{chapter:comparison}
|
||||
|
||||
TODO
|
||||
|
||||
%In this chapter, proximal decoding and \ac{LP} Decoding using \ac{ADMM} are compared.
|
||||
%First the two algorithms are compared on a theoretical basis.
|
||||
%Subsequently, their respective simulation results are examined and their
|
||||
%differences interpreted on the basis of their theoretical structure.
|
||||
%
|
||||
%some similarities between the proximal decoding algorithm
|
||||
%and \ac{LP} decoding using \ac{ADMM} are be pointed out.
|
||||
%The two algorithms are compared and their different computational and decoding
|
||||
%performance is interpreted on the basis of their theoretical structure.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{LP Decoding using ADMM}%
|
||||
\label{sec:ana:LP Decoding using ADMM}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Proximal Decoding}%
|
||||
\label{sec:ana:Proximal Decoding}
|
||||
|
||||
\begin{itemize}
|
||||
\item Parameter choice
|
||||
\item FER
|
||||
\item Improved implementation
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Comparison of BP, Proximal Decoding and LP Decoding using ADMM}%
|
||||
\label{sec:ana:Comparison of BP, Proximal Decoding and LP Decoding using ADMM}
|
||||
|
||||
\begin{itemize}
|
||||
\item Decoding performance
|
||||
\item Complexity \& runtime (mention difficulty in reaching conclusive
|
||||
results when comparing implementations)
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Theoretical Comparison of Proximal Decoding and LP Decoding using ADMM}%
|
||||
\label{sec:Theoretical Comparison of Proximal Decoding and LP Decoding using ADMM}
|
||||
|
||||
In this section, some similarities between the proximal decoding algorithm
|
||||
and \ac{LP} decoding using \ac{ADMM} are pointed out.
|
||||
The two algorithms are compared, and their differences in computational and decoding
performance are interpreted on the basis of their theoretical structure.
|
||||
\section{Theoretical Comparison}%
|
||||
\label{sec:comp:theo}
|
||||
|
||||
\ac{ADMM} and the proximal gradient method can both be expressed in terms of
|
||||
proximal operators.
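For reference, for a function $f$ and a step size $\mu > 0$, the proximal operator
appearing in both formulations is defined as%
%
\begin{align*}
    \textbf{prox}_{\mu f}\left( \boldsymbol{v} \right) =
    \argmin_{\boldsymbol{t} \in \mathbb{R}^n}
    \left( \mu f\left( \boldsymbol{t} \right)
    + \frac{1}{2} \lVert \boldsymbol{t} - \boldsymbol{v} \rVert^2 \right)
.\end{align*}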
|
||||
@ -154,6 +134,8 @@ The advantage which arises because of this when using \ac{ADMM} is that
|
||||
it can easily be detected when the algorithm gets stuck: in that case, the algorithm
returns a pseudocodeword, the components of which are fractional.
|
||||
|
||||
\todo{Compare time complexity using Big-O notation}
|
||||
|
||||
\begin{itemize}
|
||||
\item The comparison of actual implementations is always contentious,
          since it is difficult to separate differences in
|
||||
@ -169,3 +151,11 @@ returns a pseudocodeword, the components of which are fractional.
|
||||
\item Proximal decoding faster than \ac{ADMM} $\rightarrow$ surprising
          (larger number of iterations before convergence?)
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Comparison of Results}%
|
||||
\label{sec:comp:res}
|
||||
|
||||
TODO
|
||||
|
||||
@ -1,172 +1,12 @@
|
||||
\chapter{Decoding Techniques}%
|
||||
\label{chapter:decoding_techniques}
|
||||
\chapter{\acs{LP} Decoding using \acs{ADMM}}%
|
||||
\label{chapter:lp_dec_using_admm}
|
||||
|
||||
In this chapter, the decoding techniques examined in this work are detailed.
|
||||
First, an overview of the general methodology of using optimization methods
|
||||
for channel decoding is given.
|
||||
Then, the field of \ac{LP} decoding and an \ac{ADMM}-based \ac{LP} decoding
|
||||
algorithm are introduced.
|
||||
Finally, the \textit{proximal decoding} algorithm is presented.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Decoding using Optimization Methods}%
|
||||
\label{sec:dec:Decoding using Optimization Methods}
|
||||
|
||||
%
|
||||
% General methodology
|
||||
%
|
||||
|
||||
The general idea behind using optimization methods for channel decoding
|
||||
is to reformulate the decoding problem as an optimization problem.
|
||||
This new formulation can then be solved with one of the many
|
||||
available optimization algorithms.
|
||||
|
||||
Generally, the original decoding problem considered is either the \ac{MAP} or
|
||||
the \ac{ML} decoding problem:%
|
||||
%
|
||||
\begin{align}
|
||||
\hat{\boldsymbol{c}}_{\text{\ac{MAP}}} &= \argmax_{\boldsymbol{c} \in \mathcal{C}}
|
||||
p_{\boldsymbol{C} \mid \boldsymbol{Y}} \left(\boldsymbol{c} \mid \boldsymbol{y}
|
||||
\right) \label{eq:dec:map}\\
|
||||
\hat{\boldsymbol{c}}_{\text{\ac{ML}}} &= \argmax_{\boldsymbol{c} \in \mathcal{C}}
|
||||
f_{\boldsymbol{Y} \mid \boldsymbol{C}} \left( \boldsymbol{y} \mid \boldsymbol{c}
|
||||
\right) \label{eq:dec:ml}
|
||||
.\end{align}%
|
||||
%
|
||||
The goal is to arrive at a formulation where a certain objective function
|
||||
$g : \mathbb{R}^n \rightarrow \mathbb{R} $ must be minimized under certain constraints:%
|
||||
%
|
||||
\begin{align*}
|
||||
\text{minimize}\hspace{2mm} &g\left( \tilde{\boldsymbol{c}} \right)\\
|
||||
\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{c}} \in D
|
||||
,\end{align*}%
|
||||
%
|
||||
where $D \subseteq \mathbb{R}^n$ is the domain of values attainable for $\tilde{\boldsymbol{c}}$
|
||||
and represents the constraints.
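As an illustrative example, assuming the bipolar mapping $\boldsymbol{x} = \left( -1 \right) ^{\boldsymbol{c}}$
and an \ac{AWGN} channel, the \ac{ML} rule in equation (\ref{eq:dec:ml}) already has this form,
since maximizing the likelihood is then equivalent to minimizing the Euclidean distance
to the received vector:%
%
\begin{align*}
    \hat{\boldsymbol{c}}_{\text{\ac{ML}}} = \argmax_{\boldsymbol{c} \in \mathcal{C}}
    f_{\boldsymbol{Y} \mid \boldsymbol{C}} \left( \boldsymbol{y} \mid \boldsymbol{c} \right)
    = \argmin_{\boldsymbol{c} \in \mathcal{C}}
    \lVert \boldsymbol{y} - \left( -1 \right) ^{\boldsymbol{c}} \rVert^2
,\end{align*}%
%
with the constraint set $D$ in this case being the discrete set of codewords itself.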
|
||||
|
||||
In contrast to the established message-passing decoding algorithms,
|
||||
the perspective then changes from observing the decoding process in its
|
||||
Tanner graph representation with \acp{VN} and \acp{CN} (as shown in figure \ref{fig:dec:tanner})
|
||||
to a spatial representation (figure \ref{fig:dec:spatial}),
|
||||
where the codewords form a subset of the vertices of a hypercube.
|
||||
The goal is to find the point $\tilde{\boldsymbol{c}}$
that minimizes the objective function $g$.
|
||||
|
||||
%
|
||||
% Figure showing decoding space
|
||||
%
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
|
||||
\begin{subfigure}[c]{0.47\textwidth}
|
||||
\centering
|
||||
|
||||
\tikzstyle{checknode} = [color=KITblue, fill=KITblue,
|
||||
draw, regular polygon,regular polygon sides=4,
|
||||
inner sep=0pt, minimum size=12pt]
|
||||
\tikzstyle{variablenode} = [color=KITgreen, fill=KITgreen,
|
||||
draw, circle, inner sep=0pt, minimum size=10pt]
|
||||
|
||||
\begin{tikzpicture}[scale=1, transform shape]
|
||||
\node[checknode,
|
||||
label={[below, label distance=-0.4cm, align=center]
|
||||
\acs{CN}\\$\left( c_1 + c_2 + c_3 = 0 \right) $}]
|
||||
(cn) at (0, 0) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_1 \right)$}]
|
||||
(c1) at (-2, 2) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_2 \right)$}]
|
||||
(c2) at (0, 2) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_3 \right)$}]
|
||||
(c3) at (2, 2) {};
|
||||
|
||||
\draw (cn) -- (c1);
|
||||
\draw (cn) -- (c2);
|
||||
\draw (cn) -- (c3);
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Tanner graph representation of a single parity-check code}
|
||||
\label{fig:dec:tanner}
|
||||
\end{subfigure}%
|
||||
\hfill%
|
||||
\begin{subfigure}[c]{0.47\textwidth}
|
||||
\centering
|
||||
|
||||
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
|
||||
draw, circle, inner sep=0pt, minimum size=4pt]
|
||||
|
||||
\tdplotsetmaincoords{60}{25}
|
||||
\begin{tikzpicture}[scale=1, transform shape, tdplot_main_coords]
|
||||
% Cube
|
||||
|
||||
\coordinate (p000) at (0, 0, 0);
|
||||
\coordinate (p001) at (0, 0, 2);
|
||||
\coordinate (p010) at (0, 2, 0);
|
||||
\coordinate (p011) at (0, 2, 2);
|
||||
\coordinate (p100) at (2, 0, 0);
|
||||
\coordinate (p101) at (2, 0, 2);
|
||||
\coordinate (p110) at (2, 2, 0);
|
||||
\coordinate (p111) at (2, 2, 2);
|
||||
|
||||
\draw[] (p000) -- (p100);
|
||||
\draw[] (p100) -- (p101);
|
||||
\draw[] (p101) -- (p001);
|
||||
\draw[] (p001) -- (p000);
|
||||
|
||||
\draw[dashed] (p010) -- (p110);
|
||||
\draw[] (p110) -- (p111);
|
||||
\draw[] (p111) -- (p011);
|
||||
\draw[dashed] (p011) -- (p010);
|
||||
|
||||
\draw[dashed] (p000) -- (p010);
|
||||
\draw[] (p100) -- (p110);
|
||||
\draw[] (p101) -- (p111);
|
||||
\draw[] (p001) -- (p011);
|
||||
|
||||
% Polytope Vertices
|
||||
|
||||
\node[codeword] (c000) at (p000) {};
|
||||
\node[codeword] (c101) at (p101) {};
|
||||
\node[codeword] (c110) at (p110) {};
|
||||
\node[codeword] (c011) at (p011) {};
|
||||
|
||||
% Polytope Edges
|
||||
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c101);
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c110);
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c011);
|
||||
%
|
||||
% \draw[line width=1pt, color=KITblue] (c101) -- (c110);
|
||||
% \draw[line width=1pt, color=KITblue] (c101) -- (c011);
|
||||
%
|
||||
% \draw[line width=1pt, color=KITblue] (c011) -- (c110);
|
||||
|
||||
% Polytope Annotations
|
||||
|
||||
\node[color=KITblue, below=0cm of c000] {$\left( 0, 0, 0 \right) $};
|
||||
\node[color=KITblue, right=0.17cm of c101] {$\left( 1, 0, 1 \right) $};
|
||||
\node[color=KITblue, right=0cm of c110] {$\left( 1, 1, 0 \right) $};
|
||||
\node[color=KITblue, above=0cm of c011] {$\left( 0, 1, 1 \right) $};
|
||||
|
||||
% c
|
||||
|
||||
\node[color=KITgreen, fill=KITgreen,
|
||||
draw, circle, inner sep=0pt, minimum size=4pt] (c) at (0.9, 0.7, 1) {};
|
||||
\node[color=KITgreen, right=0cm of c] {$\tilde{\boldsymbol{c}}$};
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Spatial representation of a single parity-check code}
|
||||
\label{fig:dec:spatial}
|
||||
\end{subfigure}%
|
||||
|
||||
\caption{Different representations of the decoding problem}
|
||||
\end{figure}
|
||||
TODO
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{LP Decoding}%
|
||||
\label{sec:dec:LP Decoding}
|
||||
\label{sec:lp:LP Decoding}
|
||||
|
||||
\Ac{LP} decoding is a subject area introduced by Feldman et al.
|
||||
\cite{feldman_paper}. They reframe the decoding problem as an
|
||||
@ -276,7 +116,7 @@ the transfer matrix would be \cite[Sec. II, A]{efficient_lp_dec_admm}
|
||||
.\end{align*}%
|
||||
%
|
||||
|
||||
In figure \ref{fig:dec:poly}, the two relaxations are compared for an
|
||||
In figure \ref{fig:lp:poly}, the two relaxations are compared for an
|
||||
exemplary code, which is described by the generator and parity-check matrices%
|
||||
%
|
||||
\begin{align}
|
||||
@ -298,13 +138,13 @@ and has only two possible codewords:
|
||||
\begin{bmatrix} 0 & 1 & 1 \end{bmatrix} \right\}
|
||||
.\end{align*}
|
||||
%
|
||||
Figure \ref{fig:dec:poly:exact_ilp} shows the domain of exact \ac{ML} decoding.
|
||||
Figure \ref{fig:lp:poly:exact_ilp} shows the domain of exact \ac{ML} decoding.
|
||||
The first relaxation, onto the codeword polytope $\text{poly}\left( \mathcal{C} \right) $,
|
||||
is shown in figure \ref{fig:dec:poly:exact};
|
||||
is shown in figure \ref{fig:lp:poly:exact};
|
||||
this expresses the constraints of the linear program equivalent to exact \ac{ML} decoding.
|
||||
$\text{poly}\left( \mathcal{C} \right) $ is further relaxed onto the relaxed codeword polytope
|
||||
$\overline{Q}$, shown in figure \ref{fig:dec:poly:relaxed}.
|
||||
Figure \ref{fig:dec:poly:local} shows how $\overline{Q}$ is formed by intersecting the
|
||||
$\overline{Q}$, shown in figure \ref{fig:lp:poly:relaxed}.
|
||||
Figure \ref{fig:lp:poly:local} shows how $\overline{Q}$ is formed by intersecting the
|
||||
local codeword polytopes of each check node.
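Recall that the codeword polytope $\text{poly}\left( \mathcal{C} \right) $ is simply the
convex hull of the set of codewords, viewed as points in $\left[ 0, 1 \right] ^n$:%
%
\begin{align*}
    \text{poly}\left( \mathcal{C} \right) = \left\{ \sum_{\boldsymbol{c} \in \mathcal{C}}
    \lambda_{\boldsymbol{c}}\, \boldsymbol{c} : \lambda_{\boldsymbol{c}} \geq 0
    \text{ for all } \boldsymbol{c} \in \mathcal{C},\;
    \sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} = 1 \right\}
.\end{align*}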
|
||||
%
|
||||
%
|
||||
@ -368,7 +208,7 @@ local codeword polytopes of each check node.
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Set of all codewords $\mathcal{C}$}
|
||||
\label{fig:dec:poly:exact_ilp}
|
||||
\label{fig:lp:poly:exact_ilp}
|
||||
\end{subfigure}\\[1em]
|
||||
\begin{subfigure}{\textwidth}
|
||||
\centering
|
||||
@ -429,7 +269,7 @@ local codeword polytopes of each check node.
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Codeword polytope $\text{poly}\left( \mathcal{C} \right) $}
|
||||
\label{fig:dec:poly:exact}
|
||||
\label{fig:lp:poly:exact}
|
||||
\end{subfigure}
|
||||
\end{subfigure} \hfill%
|
||||
%
|
||||
@ -574,7 +414,7 @@ local codeword polytopes of each check node.
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Local codeword polytopes of the check nodes}
|
||||
\label{fig:dec:poly:local}
|
||||
\label{fig:lp:poly:local}
|
||||
\end{subfigure}\\[1em]
|
||||
\begin{subfigure}{\textwidth}
|
||||
\centering
|
||||
@ -648,7 +488,7 @@ local codeword polytopes of each check node.
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Relaxed codeword polytope $\overline{Q}$}
|
||||
\label{fig:dec:poly:relaxed}
|
||||
\label{fig:lp:poly:relaxed}
|
||||
\end{subfigure}
|
||||
\end{subfigure}
|
||||
|
||||
@ -666,7 +506,7 @@ local codeword polytopes of each check node.
|
||||
\caption{Visualization of the codeword polytope and the relaxed codeword
|
||||
polytope of the code described by equations (\ref{eq:lp:example_code_def_gen})
|
||||
and (\ref{eq:lp:example_code_def_par})}
|
||||
\label{fig:dec:poly}
|
||||
\label{fig:lp:poly}
|
||||
\end{figure}%
|
||||
%
|
||||
\noindent It can be seen that the relaxed codeword polytope $\overline{Q}$ introduces
|
||||
@ -689,10 +529,10 @@ The resulting formulation of the relaxed optimization problem becomes%
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{LP Decoding using ADMM}%
|
||||
\label{sec:dec:LP Decoding using ADMM}
|
||||
\section{Decoding Algorithm}%
|
||||
\label{sec:lp:Decoding Algorithm}
|
||||
|
||||
The \ac{LP} decoding formulation in section \ref{sec:dec:Decoding using Optimization Methods}
|
||||
The \ac{LP} decoding formulation in section \ref{sec:lp:LP Decoding}
|
||||
is a very general one that can be solved with a number of different optimization methods.
|
||||
In this work, \ac{ADMM} is examined, as its distributed nature allows for a very efficient
|
||||
implementation.
|
||||
@ -879,246 +719,12 @@ have been proposed (e.g., in \cite{original_admm}, \cite{efficient_lp_dec_admm},
|
||||
The method chosen here is the one presented in \cite{lautern}.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Implementation Details}%
|
||||
\label{sec:lp:Implementation Details}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Proximal Decoding}%
|
||||
\label{sec:dec:Proximal Decoding}
|
||||
\section{Results}%
|
||||
\label{sec:lp:Results}
|
||||
|
||||
Proximal decoding was proposed by Wadayama et al.\ as a novel formulation of
|
||||
optimization-based decoding \cite{proximal_paper}.
|
||||
With this algorithm, minimization is performed using the proximal gradient
|
||||
method.
|
||||
In contrast to \ac{LP} decoding, the objective function is based on a
|
||||
non-convex optimization formulation of the \ac{MAP} decoding problem.
|
||||
|
||||
In order to derive the objective function, the authors begin with the
|
||||
\ac{MAP} decoding rule, expressed as a continuous maximization problem%
|
||||
\footnote{The expansion of the domain to be continuous does not constitute a
|
||||
material difference in the meaning of the rule.
|
||||
The only change is that what previously were \acp{PMF} now have to be expressed
|
||||
in terms of \acp{PDF}.}
|
||||
over $\boldsymbol{x}$:%
|
||||
%
|
||||
\begin{align}
|
||||
\hat{\boldsymbol{x}} = \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
|
||||
f_{\tilde{\boldsymbol{X}} \mid \boldsymbol{Y}}
|
||||
\left( \tilde{\boldsymbol{x}} \mid \boldsymbol{y} \right)
|
||||
= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}} f_{\boldsymbol{Y}
|
||||
\mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)%
|
||||
\label{eq:prox:vanilla_MAP}
|
||||
.\end{align}%
|
||||
%
|
||||
The likelihood $f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) $ is a known function
|
||||
determined by the channel model.
|
||||
The prior \ac{PDF} $f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$ is also
|
||||
known, as the equal probability assumption is made on
|
||||
$\mathcal{C}$.
|
||||
However, since the considered domain is continuous,
|
||||
the prior \ac{PDF} cannot be ignored as a constant during the minimization
|
||||
as is often done, and has a rather unwieldy representation:%
|
||||
%
|
||||
\begin{align}
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right) =
|
||||
\frac{1}{\left| \mathcal{C} \right| }
|
||||
\sum_{\boldsymbol{c} \in \mathcal{C} }
|
||||
\delta\big( \tilde{\boldsymbol{x}} - \left( -1 \right) ^{\boldsymbol{c}}\big)
|
||||
\label{eq:prox:prior_pdf}
|
||||
.\end{align}%
|
||||
%
|
||||
In order to rewrite the prior \ac{PDF}
|
||||
$f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$,
|
||||
the so-called \textit{code-constraint polynomial} is introduced as:%
|
||||
%
|
||||
\begin{align*}
|
||||
h\left( \tilde{\boldsymbol{x}} \right) =
|
||||
\underbrace{\sum_{i=1}^{n} \left( \tilde{x_i}^2-1 \right) ^2}_{\text{Bipolar constraint}}
|
||||
+ \underbrace{\sum_{j=1}^{m} \left[
|
||||
\left( \prod_{i\in N_c \left( j \right) } \tilde{x_i} \right)
|
||||
-1 \right] ^2}_{\text{Parity constraint}}%
|
||||
.\end{align*}%
|
||||
%
|
||||
The intention of this function is to provide a way to penalize vectors far
|
||||
from a codeword and favor those close to one.
|
||||
In order to achieve this, the polynomial is composed of two parts: one term
|
||||
representing the bipolar constraint, providing for a discrete solution of the
|
||||
continuous optimization problem, and one term representing the parity
|
||||
constraints, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
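As a small example, for a single parity-check code of length $n = 3$ with one check
involving all three positions, the code-constraint polynomial reduces to%
%
\begin{align*}
    h\left( \tilde{\boldsymbol{x}} \right) =
    \sum_{i=1}^{3} \left( \tilde{x}_i^2 - 1 \right) ^2
    + \left( \tilde{x}_1 \tilde{x}_2 \tilde{x}_3 - 1 \right) ^2
,\end{align*}%
%
which is zero exactly for the bipolar vectors with an even number of $-1$ entries,
i.e., for the images $\left( -1 \right) ^{\boldsymbol{c}}$ of the codewords.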
|
||||
The prior \ac{PDF} is then approximated using the code-constraint polynomial as:%
|
||||
%
|
||||
\begin{align}
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)
|
||||
\approx \frac{1}{Z}\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) }%
|
||||
\label{eq:prox:prior_pdf_approx}
|
||||
.\end{align}%
|
||||
%
|
||||
The authors justify this approximation by arguing that for
|
||||
$\gamma \rightarrow \infty$, the approximation in equation
|
||||
(\ref{eq:prox:prior_pdf_approx}) approaches the original function in equation
|
||||
(\ref{eq:prox:prior_pdf}).
|
||||
This approximation can then be plugged into equation (\ref{eq:prox:vanilla_MAP})
|
||||
and the likelihood can be rewritten using the negative log-likelihood
|
||||
$L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) = -\ln\left(
|
||||
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}\left(
|
||||
\boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) \right) $:%
|
||||
%
|
||||
\begin{align*}
|
||||
\hat{\boldsymbol{x}} &= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
|
||||
\mathrm{e}^{- L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) }
|
||||
\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) } \\
|
||||
&= \argmin_{\tilde{\boldsymbol{x}} \in \mathbb{R}^n} \big(
|
||||
L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)
|
||||
\big)%
|
||||
.\end{align*}%
|
||||
%
|
||||
Thus, with proximal decoding, the objective function
|
||||
$g\left( \tilde{\boldsymbol{x}} \right)$ considered is%
|
||||
%
|
||||
\begin{align}
|
||||
g\left( \tilde{\boldsymbol{x}} \right) = L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}}
|
||||
\right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)%
|
||||
\label{eq:prox:objective_function}
|
||||
\end{align}%
|
||||
%
|
||||
and the decoding problem is reformulated to%
|
||||
%
|
||||
\begin{align*}
|
||||
\text{minimize}\hspace{2mm} &L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)\\
|
||||
\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{x}} \in \mathbb{R}^n
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
For the solution of the approximate \ac{MAP} decoding problem, the two parts
|
||||
of equation (\ref{eq:prox:objective_function}) are considered separately:
|
||||
the minimization of the objective function occurs in an alternating
|
||||
fashion, switching between the negative log-likelihood
|
||||
$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
|
||||
code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
|
||||
Two helper variables, $\boldsymbol{r}$ and $\boldsymbol{s}$, are introduced,
|
||||
describing the result of each of the two steps.
|
||||
The first step, minimizing the log-likelihood, is performed using gradient
|
||||
descent:%
|
||||
%
|
||||
\begin{align}
|
||||
\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \nabla
|
||||
L\left( \boldsymbol{y} \mid \boldsymbol{s} \right),
|
||||
\hspace{5mm}\omega > 0
|
||||
\label{eq:prox:step_log_likelihood}
|
||||
.\end{align}%
|
||||
%
|
||||
For the second step, minimizing the scaled code-constraint polynomial, the
|
||||
proximal gradient method is used and the \textit{proximal operator} of
|
||||
$\gamma h\left( \tilde{\boldsymbol{x}} \right) $ has to be computed.
|
||||
It is then immediately approximated using gradient descent:%
|
||||
%
|
||||
\begin{align*}
|
||||
\textbf{prox}_{\gamma h} \left( \tilde{\boldsymbol{x}} \right) &\equiv
|
||||
\argmin_{\boldsymbol{t} \in \mathbb{R}^n}
|
||||
\left( \gamma h\left( \boldsymbol{t} \right) +
|
||||
\frac{1}{2} \lVert \boldsymbol{t} - \tilde{\boldsymbol{x}} \rVert^2 \right)\\
|
||||
&\approx \tilde{\boldsymbol{x}} - \gamma \nabla h \left( \tilde{\boldsymbol{x}} \right),
|
||||
\hspace{5mm} \gamma > 0, \text{ small}
|
||||
.\end{align*}%
|
||||
%
|
||||
The second step thus becomes%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{s} \leftarrow \boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right),
|
||||
\hspace{5mm}\gamma > 0,\text{ small}
|
||||
.\end{align*}
|
||||
%
|
||||
While the approximation of the prior \ac{PDF} made in equation (\ref{eq:prox:prior_pdf_approx})
|
||||
theoretically becomes better
|
||||
with larger $\gamma$, the constraint that $\gamma$ be small is important,
|
||||
as it keeps the effect of $h\left( \tilde{\boldsymbol{x}} \right) $ on the landscape
|
||||
of the objective function small.
|
||||
Otherwise, unwanted stationary points, including local minima, are introduced.
|
||||
The authors say that ``in practice, the value of $\gamma$ should be adjusted
|
||||
according to the decoding performance.'' \cite[Sec. 3.1]{proximal_paper}.
|
||||
|
||||
%The components of the gradient of the code-constraint polynomial can be computed as follows:%
|
||||
%%
|
||||
%\begin{align*}
|
||||
% \frac{\partial}{\partial x_k} h\left( \boldsymbol{x} \right) =
|
||||
% 4\left( x_k^2 - 1 \right) x_k + \frac{2}{x_k}
|
||||
% \sum_{i\in \mathcal{B}\left( k \right) } \left(
|
||||
% \left( \prod_{j\in\mathcal{A}\left( i \right)} x_j\right)^2
|
||||
% - \prod_{j\in\mathcal{A}\left( i \right) }x_j \right)
|
||||
%.\end{align*}%
|
||||
%\todo{Only multiplication?}%
|
||||
%\todo{$x_k$: $k$ or some other indexing variable?}%
|
||||
%%
|
||||
In the case of \ac{AWGN}, the likelihood
|
||||
$f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)$
|
||||
is%
|
||||
%
|
||||
\begin{align*}
|
||||
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
= \frac{1}{\left( 2\pi\sigma^2 \right) ^{n / 2}}\mathrm{e}^{
|
||||
-\frac{\lVert \boldsymbol{y}-\tilde{\boldsymbol{x}}
|
||||
\rVert^2 }
|
||||
{2\sigma^2}}
|
||||
.\end{align*}
|
||||
%
|
||||
Thus, the gradient of the negative log-likelihood becomes%
|
||||
\footnote{For the minimization, constants can be disregarded. For this reason,
|
||||
it suffices to consider only proportionality instead of equality.}%
|
||||
%
|
||||
\begin{align*}
|
||||
\nabla L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
&\propto -\nabla \lVert \boldsymbol{y} - \tilde{\boldsymbol{x}} \rVert^2\\
|
||||
&\propto \tilde{\boldsymbol{x}} - \boldsymbol{y}
|
||||
,\end{align*}%
|
||||
%
|
||||
allowing equation (\ref{eq:prox:step_log_likelihood}) to be rewritten as%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{r} \leftarrow \boldsymbol{s}
|
||||
- \omega \left( \boldsymbol{s} - \boldsymbol{y} \right)
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
One thing to consider during the actual decoding process is that the gradient
|
||||
of the code-constraint polynomial can take on extremely large values.
|
||||
To avoid numerical instability, an additional step is added, where all
|
||||
components of the current estimate are clipped to $\left[-\eta, \eta \right]$,
|
||||
where $\eta$ is a positive constant slightly larger than one:%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{s} \leftarrow \Pi_{\eta} \left( \boldsymbol{r}
|
||||
- \gamma \nabla h\left( \boldsymbol{r} \right) \right)
|
||||
,\end{align*}
|
||||
%
|
||||
with $\Pi_{\eta}\left( \cdot \right) $ expressing the projection onto
|
||||
$\left[ -\eta, \eta \right]^n$.
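Written component-wise, this projection simply clips each entry of its argument to
the interval $\left[ -\eta, \eta \right] $:%
%
\begin{align*}
    \left[ \Pi_{\eta}\left( \boldsymbol{x} \right) \right] _i =
    \min\left( \max\left( x_i, -\eta \right) , \eta \right) ,
    \hspace{5mm} i = 1, \ldots, n
.\end{align*}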
|
||||
|
||||
The iterative decoding process resulting from these considerations is shown in
|
||||
figure \ref{fig:prox:alg}.
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
|
||||
\begin{genericAlgorithm}[caption={}, label={}]
|
||||
$\boldsymbol{s} \leftarrow \boldsymbol{0}$
|
||||
for $K$ iterations do
|
||||
$\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \left( \boldsymbol{s} - \boldsymbol{y} \right) $
|
||||
$\boldsymbol{s} \leftarrow \Pi_\eta \left(\boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right) \right)$
|
||||
$\boldsymbol{\hat{x}} \leftarrow \text{sign}\left( \boldsymbol{s} \right) $
$\boldsymbol{\hat{c}} \leftarrow \frac{1}{2}\left( \boldsymbol{1} - \boldsymbol{\hat{x}} \right) $
if $\boldsymbol{H}\boldsymbol{\hat{c}} = \boldsymbol{0}$ then
|
||||
return $\boldsymbol{\hat{c}}$
|
||||
end if
|
||||
end for
|
||||
return $\boldsymbol{\hat{c}}$
|
||||
\end{genericAlgorithm}
|
||||
|
||||
|
||||
\caption{Proximal decoding algorithm for an \ac{AWGN} channel}
|
||||
\label{fig:prox:alg}
|
||||
\end{figure}
|
||||
@ -1,34 +0,0 @@
|
||||
\chapter{Methodology and Implementation}%
|
||||
\label{chapter:methodology_and_implementation}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{General implementation process}%
|
||||
\label{sec:impl:General implementation process}
|
||||
|
||||
\begin{itemize}
|
||||
\item First Python using NumPy
|
||||
\item Then C++ using Eigen
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{LP Decoding using ADMM}%
|
||||
\label{sec:impl:LP Decoding using ADMM}
|
||||
|
||||
\begin{itemize}
|
||||
\item Choice of parameters
|
||||
\item Selected projection algorithm
|
||||
\item Adaptive linear programming decoding?
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Proximal Decoding}%
|
||||
\label{sec:impl:Proximal Decoding}
|
||||
|
||||
\begin{itemize}
|
||||
\item Choice of parameters
|
||||
\item Road to improved implementation
|
||||
\end{itemize}
|
||||
|
||||
latex/thesis/chapters/proximal_decoding.tex (new file, 264 lines)
@ -0,0 +1,264 @@
|
||||
\chapter{Proximal Decoding}%
|
||||
\label{chapter:proximal_decoding}
|
||||
|
||||
TODO
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Decoding Algorithm}%
|
||||
\label{sec:prox:Decoding Algorithm}
|
||||
|
||||
Proximal decoding was proposed by Wadayama et al.\ as a novel formulation of
|
||||
optimization-based decoding \cite{proximal_paper}.
|
||||
With this algorithm, minimization is performed using the proximal gradient
|
||||
method.
|
||||
In contrast to \ac{LP} decoding, the objective function is based on a
|
||||
non-convex optimization formulation of the \ac{MAP} decoding problem.
|
||||
|
||||
In order to derive the objective function, the authors begin with the
|
||||
\ac{MAP} decoding rule, expressed as a continuous maximization problem%
|
||||
\footnote{The expansion of the domain to be continuous does not constitute a
|
||||
material difference in the meaning of the rule.
|
||||
The only change is that what previously were \acp{PMF} now have to be expressed
|
||||
in terms of \acp{PDF}.}
|
||||
over $\boldsymbol{x}$:%
|
||||
%
|
||||
\begin{align}
|
||||
\hat{\boldsymbol{x}} = \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
|
||||
f_{\tilde{\boldsymbol{X}} \mid \boldsymbol{Y}}
|
||||
\left( \tilde{\boldsymbol{x}} \mid \boldsymbol{y} \right)
|
||||
= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}} f_{\boldsymbol{Y}
|
||||
\mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)%
|
||||
\label{eq:prox:vanilla_MAP}
|
||||
.\end{align}%
|
||||
%
|
||||
The likelihood $f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) $ is a known function
|
||||
determined by the channel model.
|
||||
The prior \ac{PDF} $f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$ is also
|
||||
known, as the equal probability assumption is made on
|
||||
$\mathcal{C}$.
|
||||
However, since the considered domain is continuous,
|
||||
the prior \ac{PDF} cannot be ignored as a constant during the minimization
|
||||
as is often done, and has a rather unwieldy representation:%
|
||||
%
|
||||
\begin{align}
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right) =
|
||||
\frac{1}{\left| \mathcal{C} \right| }
|
||||
\sum_{\boldsymbol{c} \in \mathcal{C} }
|
||||
\delta\big( \tilde{\boldsymbol{x}} - \left( -1 \right) ^{\boldsymbol{c}}\big)
|
||||
\label{eq:prox:prior_pdf}
|
||||
.\end{align}%
|
||||
%
|
||||
In order to rewrite the prior \ac{PDF}
|
||||
$f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$,
|
||||
the so-called \textit{code-constraint polynomial} is introduced as:%
|
||||
%
|
||||
\begin{align*}
|
||||
h\left( \tilde{\boldsymbol{x}} \right) =
|
||||
\underbrace{\sum_{i=1}^{n} \left( \tilde{x_i}^2-1 \right) ^2}_{\text{Bipolar constraint}}
|
||||
+ \underbrace{\sum_{j=1}^{m} \left[
|
||||
\left( \prod_{i\in N_c \left( j \right) } \tilde{x_i} \right)
|
||||
-1 \right] ^2}_{\text{Parity constraint}}%
|
||||
.\end{align*}%
|
||||
%
|
||||
The intention of this function is to provide a way to penalize vectors far
|
||||
from a codeword and favor those close to one.
|
||||
In order to achieve this, the polynomial is composed of two parts: one term
|
||||
representing the bipolar constraint, providing for a discrete solution of the
|
||||
continuous optimization problem, and one term representing the parity
|
||||
constraints, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
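As a small example, for a single parity-check code of length $n = 3$ with one check
involving all three positions, the code-constraint polynomial reduces to%
%
\begin{align*}
    h\left( \tilde{\boldsymbol{x}} \right) =
    \sum_{i=1}^{3} \left( \tilde{x}_i^2 - 1 \right) ^2
    + \left( \tilde{x}_1 \tilde{x}_2 \tilde{x}_3 - 1 \right) ^2
,\end{align*}%
%
which is zero exactly for the bipolar vectors with an even number of $-1$ entries,
i.e., for the images $\left( -1 \right) ^{\boldsymbol{c}}$ of the codewords.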
|
||||
The prior \ac{PDF} is then approximated using the code-constraint polynomial as:%
|
||||
%
|
||||
\begin{align}
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)
|
||||
\approx \frac{1}{Z}\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) }%
|
||||
\label{eq:prox:prior_pdf_approx}
|
||||
.\end{align}%
|
||||
%
|
||||
The authors justify this approximation by arguing that for
|
||||
$\gamma \rightarrow \infty$, the approximation in equation
|
||||
(\ref{eq:prox:prior_pdf_approx}) approaches the original function in equation
|
||||
(\ref{eq:prox:prior_pdf}).
|
||||
This approximation can then be plugged into equation (\ref{eq:prox:vanilla_MAP})
|
||||
and the likelihood can be rewritten using the negative log-likelihood
|
||||
$L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) = -\ln\left(
|
||||
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}\left(
|
||||
\boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) \right) $:%
|
||||
%
|
||||
\begin{align*}
|
||||
\hat{\boldsymbol{x}} &= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
|
||||
\mathrm{e}^{- L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) }
|
||||
\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) } \\
|
||||
&= \argmin_{\tilde{\boldsymbol{x}} \in \mathbb{R}^n} \big(
|
||||
L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)
|
||||
\big)%
|
||||
.\end{align*}%
|
||||
%
|
||||
Thus, with proximal decoding, the objective function
|
||||
$g\left( \tilde{\boldsymbol{x}} \right)$ considered is%
|
||||
%
|
||||
\begin{align}
|
||||
g\left( \tilde{\boldsymbol{x}} \right) = L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}}
|
||||
\right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)%
|
||||
\label{eq:prox:objective_function}
|
||||
\end{align}%
|
||||
%
|
||||
and the decoding problem is reformulated to%
|
||||
%
|
||||
\begin{align*}
|
||||
\text{minimize}\hspace{2mm} &L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)\\
|
||||
\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{x}} \in \mathbb{R}^n
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
For the solution of the approximate \ac{MAP} decoding problem, the two parts
|
||||
of equation (\ref{eq:prox:objective_function}) are considered separately:
|
||||
the minimization of the objective function occurs in an alternating
|
||||
fashion, switching between the negative log-likelihood
|
||||
$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
|
||||
code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
|
||||
Two helper variables, $\boldsymbol{r}$ and $\boldsymbol{s}$, are introduced,
|
||||
describing the result of each of the two steps.
|
||||
The first step, minimizing the log-likelihood, is performed using gradient
|
||||
descent:%
|
||||
%
|
||||
\begin{align}
|
||||
\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \nabla
|
||||
L\left( \boldsymbol{y} \mid \boldsymbol{s} \right),
|
||||
\hspace{5mm}\omega > 0
|
||||
\label{eq:prox:step_log_likelihood}
|
||||
.\end{align}%
|
||||
%
|
||||
For the second step, minimizing the scaled code-constraint polynomial, the
|
||||
proximal gradient method is used and the \textit{proximal operator} of
|
||||
$\gamma h\left( \tilde{\boldsymbol{x}} \right) $ has to be computed.
|
||||
It is then immediately approximated using gradient descent:%
|
||||
%
|
||||
\begin{align*}
|
||||
\textbf{prox}_{\gamma h} \left( \tilde{\boldsymbol{x}} \right) &\equiv
|
||||
\argmin_{\boldsymbol{t} \in \mathbb{R}^n}
|
||||
\left( \gamma h\left( \boldsymbol{t} \right) +
|
||||
\frac{1}{2} \lVert \boldsymbol{t} - \tilde{\boldsymbol{x}} \rVert^2 \right)\\
|
||||
&\approx \tilde{\boldsymbol{x}} - \gamma \nabla h \left( \tilde{\boldsymbol{x}} \right),
|
||||
\hspace{5mm} \gamma > 0, \text{ small}
|
||||
.\end{align*}%
|
||||
%
|
||||
The second step thus becomes%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{s} \leftarrow \boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right),
|
||||
\hspace{5mm}\gamma > 0,\text{ small}
|
||||
.\end{align*}
|
||||
%
|
||||
While the approximation of the prior \ac{PDF} made in equation (\ref{eq:prox:prior_pdf_approx})
|
||||
theoretically becomes better
|
||||
with larger $\gamma$, the constraint that $\gamma$ be small is important,
|
||||
as it keeps the effect of $h\left( \tilde{\boldsymbol{x}} \right) $ on the landscape
|
||||
of the objective function small.
|
||||
Otherwise, unwanted stationary points, including local minima, are introduced.
|
||||
The authors say that ``in practice, the value of $\gamma$ should be adjusted
|
||||
according to the decoding performance.'' \cite[Sec. 3.1]{proximal_paper}.
|
||||
|
||||
%The components of the gradient of the code-constraint polynomial can be computed as follows:%
|
||||
%%
|
||||
%\begin{align*}
|
||||
% \frac{\partial}{\partial x_k} h\left( \boldsymbol{x} \right) =
|
||||
% 4\left( x_k^2 - 1 \right) x_k + \frac{2}{x_k}
|
||||
% \sum_{i\in \mathcal{B}\left( k \right) } \left(
|
||||
% \left( \prod_{j\in\mathcal{A}\left( i \right)} x_j\right)^2
|
||||
% - \prod_{j\in\mathcal{A}\left( i \right) }x_j \right)
|
||||
%.\end{align*}%
|
||||
%\todo{Only multiplication?}%
|
||||
%\todo{$x_k$: $k$ or some other indexing variable?}%
|
||||
%%
|
||||
In the case of \ac{AWGN}, the likelihood
|
||||
$f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)$
|
||||
is%
|
||||
%
|
||||
\begin{align*}
|
||||
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
= \frac{1}{\left( 2\pi\sigma^2 \right) ^{n / 2}}\mathrm{e}^{
|
||||
-\frac{\lVert \boldsymbol{y}-\tilde{\boldsymbol{x}}
|
||||
\rVert^2 }
|
||||
{2\sigma^2}}
|
||||
.\end{align*}
|
||||
%
|
||||
Thus, the gradient of the negative log-likelihood becomes%
|
||||
\footnote{For the minimization, constants can be disregarded. For this reason,
|
||||
it suffices to consider only proportionality instead of equality.}%
|
||||
%
|
||||
\begin{align*}
|
||||
\nabla L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
&\propto -\nabla \lVert \boldsymbol{y} - \tilde{\boldsymbol{x}} \rVert^2\\
|
||||
&\propto \tilde{\boldsymbol{x}} - \boldsymbol{y}
|
||||
,\end{align*}%
|
||||
%
|
||||
allowing equation (\ref{eq:prox:step_log_likelihood}) to be rewritten as%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{r} \leftarrow \boldsymbol{s}
|
||||
- \omega \left( \boldsymbol{s} - \boldsymbol{y} \right)
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
One thing to consider during the actual decoding process is that the gradient
|
||||
of the code-constraint polynomial can take on extremely large values.
|
||||
To avoid numerical instability, an additional step is added, where all
|
||||
components of the current estimate are clipped to $\left[-\eta, \eta \right]$,
|
||||
where $\eta$ is a positive constant slightly larger than one:%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{s} \leftarrow \Pi_{\eta} \left( \boldsymbol{r}
|
||||
- \gamma \nabla h\left( \boldsymbol{r} \right) \right)
|
||||
,\end{align*}
|
||||
%
|
||||
with $\Pi_{\eta}\left( \cdot \right) $ expressing the projection onto
|
||||
$\left[ -\eta, \eta \right]^n$.
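Written component-wise, this projection simply clips each entry of its argument to
the interval $\left[ -\eta, \eta \right] $:%
%
\begin{align*}
    \left[ \Pi_{\eta}\left( \boldsymbol{x} \right) \right] _i =
    \min\left( \max\left( x_i, -\eta \right) , \eta \right) ,
    \hspace{5mm} i = 1, \ldots, n
.\end{align*}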
|
||||
|
||||
The iterative decoding process resulting from these considerations is shown in
|
||||
figure \ref{fig:prox:alg}.
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
|
||||
\begin{genericAlgorithm}[caption={}, label={}]
|
||||
$\boldsymbol{s} \leftarrow \boldsymbol{0}$
|
||||
for $K$ iterations do
|
||||
$\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \left( \boldsymbol{s} - \boldsymbol{y} \right) $
|
||||
$\boldsymbol{s} \leftarrow \Pi_\eta \left(\boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right) \right)$
|
||||
$\boldsymbol{\hat{x}} \leftarrow \text{sign}\left( \boldsymbol{s} \right) $
$\boldsymbol{\hat{c}} \leftarrow \frac{1}{2}\left( \boldsymbol{1} - \boldsymbol{\hat{x}} \right) $
if $\boldsymbol{H}\boldsymbol{\hat{c}} = \boldsymbol{0}$ then
|
||||
return $\boldsymbol{\hat{c}}$
|
||||
end if
|
||||
end for
|
||||
return $\boldsymbol{\hat{c}}$
|
||||
\end{genericAlgorithm}
|
||||
|
||||
|
||||
\caption{Proximal decoding algorithm for an \ac{AWGN} channel}
|
||||
\label{fig:prox:alg}
|
||||
\end{figure}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Implementation Details}%
|
||||
\label{sec:prox:Implementation Details}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Results}%
|
||||
\label{sec:prox:Results}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Improved Implementation}%
|
||||
\label{sec:prox:Improved Implementation}
|
||||
|
||||
@ -5,10 +5,11 @@ In this chapter, the theoretical background necessary to understand this
|
||||
work is given.
|
||||
First, the used notation is clarified.
|
||||
The physical aspects are detailed: the modulation scheme and channel model used.
|
||||
A short introduction of channel coding with binary linear codes and especially
|
||||
A short introduction to channel coding with binary linear codes and especially
|
||||
\ac{LDPC} codes is given.
|
||||
The established methods of decoding \ac{LDPC} codes are briefly explained.
|
||||
Lastly, the optimization methods utilized are described.
|
||||
Lastly, the general process of decoding using optimization techniques is described
|
||||
and an overview of the utilized optimization methods is given.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
@ -270,6 +271,161 @@ the use of \ac{BP} impractical for applications where a very low \ac{BER} is
|
||||
desired \cite[Sec. 15.3]{ryan_lin_2009}.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Decoding using Optimization Methods}%
|
||||
\label{sec:theo:Decoding using Optimization Methods}
|
||||
|
||||
%
|
||||
% General methodology
|
||||
%
|
||||
|
||||
The general idea behind using optimization methods for channel decoding
|
||||
is to reformulate the decoding problem as an optimization problem.
|
||||
This new formulation can then be solved with one of the many
|
||||
available optimization algorithms.
|
||||
|
||||
Generally, the original decoding problem considered is either the \ac{MAP} or
|
||||
the \ac{ML} decoding problem:%
|
||||
%
|
||||
\begin{align}
|
||||
\hat{\boldsymbol{c}}_{\text{\ac{MAP}}} &= \argmax_{\boldsymbol{c} \in \mathcal{C}}
|
||||
p_{\boldsymbol{C} \mid \boldsymbol{Y}} \left(\boldsymbol{c} \mid \boldsymbol{y}
|
||||
\right) \label{eq:dec:map}\\
|
||||
\hat{\boldsymbol{c}}_{\text{\ac{ML}}} &= \argmax_{\boldsymbol{c} \in \mathcal{C}}
|
||||
f_{\boldsymbol{Y} \mid \boldsymbol{C}} \left( \boldsymbol{y} \mid \boldsymbol{c}
|
||||
\right) \label{eq:dec:ml}
|
||||
.\end{align}%
|
||||
%
|
||||
The goal is to arrive at a formulation where a certain objective function
|
||||
$g : \mathbb{R}^n \rightarrow \mathbb{R} $ must be minimized under certain constraints:%
|
||||
%
|
||||
\begin{align*}
|
||||
\text{minimize}\hspace{2mm} &g\left( \tilde{\boldsymbol{c}} \right)\\
|
||||
\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{c}} \in D
|
||||
,\end{align*}%
|
||||
%
|
||||
where $D \subseteq \mathbb{R}^n$ is the domain of values attainable for $\tilde{\boldsymbol{c}}$
|
||||
and represents the constraints.
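As an illustrative example, assuming the bipolar mapping $\boldsymbol{x} = \left( -1 \right) ^{\boldsymbol{c}}$
and an \ac{AWGN} channel, the \ac{ML} rule in equation (\ref{eq:dec:ml}) already has this form,
since maximizing the likelihood is then equivalent to minimizing the Euclidean distance
to the received vector:%
%
\begin{align*}
    \hat{\boldsymbol{c}}_{\text{\ac{ML}}} = \argmax_{\boldsymbol{c} \in \mathcal{C}}
    f_{\boldsymbol{Y} \mid \boldsymbol{C}} \left( \boldsymbol{y} \mid \boldsymbol{c} \right)
    = \argmin_{\boldsymbol{c} \in \mathcal{C}}
    \lVert \boldsymbol{y} - \left( -1 \right) ^{\boldsymbol{c}} \rVert^2
,\end{align*}%
%
with the constraint set $D$ in this case being the discrete set of codewords itself.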
|
||||
|
||||
In contrast to the established message-passing decoding algorithms,
|
||||
the prespective then changes from observing the decoding process in its
|
||||
Tanner graph representation with \acp{VN} and \acp{CN} (as shown in figure \ref{fig:dec:tanner})
|
||||
to a spatial representation (figure \ref{fig:dec:spatial}),
|
||||
where the codewords form a subset of the vertices of a hypercube.
|
||||
The goal is to find the point $\tilde{\boldsymbol{c}}$
that minimizes the objective function $g$.
|
||||
|
||||
%
|
||||
% Figure showing decoding space
|
||||
%
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
|
||||
\begin{subfigure}[c]{0.47\textwidth}
|
||||
\centering
|
||||
|
||||
\tikzstyle{checknode} = [color=KITblue, fill=KITblue,
|
||||
draw, regular polygon,regular polygon sides=4,
|
||||
inner sep=0pt, minimum size=12pt]
|
||||
\tikzstyle{variablenode} = [color=KITgreen, fill=KITgreen,
|
||||
draw, circle, inner sep=0pt, minimum size=10pt]
|
||||
|
||||
\begin{tikzpicture}[scale=1, transform shape]
|
||||
\node[checknode,
|
||||
label={[below, label distance=-0.4cm, align=center]
|
||||
\acs{CN}\\$\left( c_1 + c_2 + c_3 = 0 \right) $}]
|
||||
(cn) at (0, 0) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_1 \right)$}]
|
||||
(c1) at (-2, 2) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_2 \right)$}]
|
||||
(c2) at (0, 2) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_3 \right)$}]
|
||||
(c3) at (2, 2) {};
|
||||
|
||||
\draw (cn) -- (c1);
|
||||
\draw (cn) -- (c2);
|
||||
\draw (cn) -- (c3);
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Tanner graph representation of a single parity-check code}
|
||||
\label{fig:dec:tanner}
|
||||
\end{subfigure}%
|
||||
\hfill%
|
||||
\begin{subfigure}[c]{0.47\textwidth}
|
||||
\centering
|
||||
|
||||
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
|
||||
draw, circle, inner sep=0pt, minimum size=4pt]
|
||||
|
||||
\tdplotsetmaincoords{60}{25}
|
||||
\begin{tikzpicture}[scale=1, transform shape, tdplot_main_coords]
|
||||
% Cube
|
||||
|
||||
\coordinate (p000) at (0, 0, 0);
|
||||
\coordinate (p001) at (0, 0, 2);
|
||||
\coordinate (p010) at (0, 2, 0);
|
||||
\coordinate (p011) at (0, 2, 2);
|
||||
\coordinate (p100) at (2, 0, 0);
|
||||
\coordinate (p101) at (2, 0, 2);
|
||||
\coordinate (p110) at (2, 2, 0);
|
||||
\coordinate (p111) at (2, 2, 2);
|
||||
|
||||
\draw[] (p000) -- (p100);
|
||||
\draw[] (p100) -- (p101);
|
||||
\draw[] (p101) -- (p001);
|
||||
\draw[] (p001) -- (p000);
|
||||
|
||||
\draw[dashed] (p010) -- (p110);
|
||||
\draw[] (p110) -- (p111);
|
||||
\draw[] (p111) -- (p011);
|
||||
\draw[dashed] (p011) -- (p010);
|
||||
|
||||
\draw[dashed] (p000) -- (p010);
|
||||
\draw[] (p100) -- (p110);
|
||||
\draw[] (p101) -- (p111);
|
||||
\draw[] (p001) -- (p011);
|
||||
|
||||
% Polytope Vertices
|
||||
|
||||
\node[codeword] (c000) at (p000) {};
|
||||
\node[codeword] (c101) at (p101) {};
|
||||
\node[codeword] (c110) at (p110) {};
|
||||
\node[codeword] (c011) at (p011) {};
|
||||
|
||||
% Polytope Edges
|
||||
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c101);
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c110);
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c011);
|
||||
%
|
||||
% \draw[line width=1pt, color=KITblue] (c101) -- (c110);
|
||||
% \draw[line width=1pt, color=KITblue] (c101) -- (c011);
|
||||
%
|
||||
% \draw[line width=1pt, color=KITblue] (c011) -- (c110);
|
||||
|
||||
% Polytope Annotations
|
||||
|
||||
\node[color=KITblue, below=0cm of c000] {$\left( 0, 0, 0 \right) $};
|
||||
\node[color=KITblue, right=0.17cm of c101] {$\left( 1, 0, 1 \right) $};
|
||||
\node[color=KITblue, right=0cm of c110] {$\left( 1, 1, 0 \right) $};
|
||||
\node[color=KITblue, above=0cm of c011] {$\left( 0, 1, 1 \right) $};
|
||||
|
||||
% c
|
||||
|
||||
\node[color=KITgreen, fill=KITgreen,
|
||||
draw, circle, inner sep=0pt, minimum size=4pt] (c) at (0.9, 0.7, 1) {};
|
||||
\node[color=KITgreen, right=0cm of c] {$\tilde{\boldsymbol{c}}$};
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Spatial representation of a single parity-check code}
|
||||
\label{fig:dec:spatial}
|
||||
\end{subfigure}%
|
||||
|
||||
\caption{Different representations of the decoding problem}
|
||||
\end{figure}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Optimization Methods}
|
||||
\label{sec:theo:Optimization Methods}
|
||||
@ -484,3 +640,4 @@ condition that the step size be $\mu$:%
|
||||
% \hspace{5mm} \mu > 0
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
|
||||
@ -213,9 +213,9 @@
|
||||
|
||||
\include{chapters/introduction}
|
||||
\include{chapters/theoretical_background}
|
||||
\include{chapters/decoding_techniques}
|
||||
\include{chapters/methodology_and_implementation}
|
||||
\include{chapters/analysis_of_results}
|
||||
\include{chapters/proximal_decoding}
|
||||
\include{chapters/lp_dec_using_admm}
|
||||
\include{chapters/comparison}
|
||||
\include{chapters/discussion}
|
||||
\include{chapters/conclusion}
|
||||
\include{chapters/appendix}
|
||||
|
||||