Restructured document
This commit is contained in: parent a0a13dbb2d, commit 9d0458f6b3
@ -104,7 +104,7 @@ The iterative algorithm can then be expressed as%
|
||||
%
|
||||
|
||||
Using the definition of the proximal operator, the $\tilde{\boldsymbol{c}}$ update step
|
||||
can be rewritten to match the definition given in section \ref{sec:dec:LP Decoding using ADMM}:%
|
||||
can be rewritten to match the definition given in section \ref{sec:lp:Decoding Algorithm}:%
|
||||
%
|
||||
\begin{align*}
|
||||
\tilde{\boldsymbol{c}} &\leftarrow \textbf{prox}_{\mu f}\left( \tilde{\boldsymbol{c}}
|
||||
|
||||
@ -1,42 +1,22 @@
|
||||
\chapter{Analysis of Results}%
|
||||
\label{chapter:Analysis of Results}
|
||||
\chapter{Comparison of Proximal Decoding and \acs{LP} Decoding using \acs{ADMM}}%
|
||||
\label{chapter:comparison}
|
||||
|
||||
TODO
|
||||
|
||||
%In this chapter, proximal decoding and \ac{LP} Decoding using \ac{ADMM} are compared.
|
||||
%First the two algorithms are compared on a theoretical basis.
|
||||
%Subsequently, their respective simulation results are examined and their
|
||||
%differences interpreted on the basis of their theoretical structure.
|
||||
%
|
||||
%some similarities between the proximal decoding algorithm
|
||||
%and \ac{LP} decoding using \ac{ADMM} are be pointed out.
|
||||
%The two algorithms are compared and their different computational and decoding
|
||||
%performance is interpreted on the basis of their theoretical structure.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{LP Decoding using ADMM}%
|
||||
\label{sec:ana:LP Decoding using ADMM}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Proximal Decoding}%
|
||||
\label{sec:ana:Proximal Decoding}
|
||||
|
||||
\begin{itemize}
|
||||
\item Parameter choice
|
||||
\item FER
|
||||
\item Improved implementation
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Comparison of BP, Proximal Decoding and LP Decoding using ADMM}%
|
||||
\label{sec:ana:Comparison of BP, Proximal Decoding and LP Decoding using ADMM}
|
||||
|
||||
\begin{itemize}
|
||||
\item Decoding performance
|
||||
\item Complexity \& runtime (mention difficulty in reaching conclusive
|
||||
results when comparing implementations)
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Theoretical Comparison of Proximal Decoding and LP Decoding using ADMM}%
|
||||
\label{sec:Theoretical Comparison of Proximal Decoding and LP Decoding using ADMM}
|
||||
|
||||
In this section, some similarities between the proximal decoding algorithm
|
||||
and \ac{LP} decoding using \ac{ADMM} are pointed out.
|
||||
The two algorithms are compared, and their differences in computational and decoding
performance are interpreted on the basis of their theoretical structure.
|
||||
\section{Theoretical Comparison}%
|
||||
\label{sec:comp:theo}
|
||||
|
||||
\ac{ADMM} and the proximal gradient method can both be expressed in terms of
|
||||
proximal operators.
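For reference, for a function $f$ and a step size $\mu > 0$, the proximal operator
appearing in both formulations is defined as%
%
\begin{align*}
    \textbf{prox}_{\mu f}\left( \boldsymbol{v} \right) =
    \argmin_{\boldsymbol{t} \in \mathbb{R}^n}
    \left( \mu f\left( \boldsymbol{t} \right)
    + \frac{1}{2} \lVert \boldsymbol{t} - \boldsymbol{v} \rVert^2 \right)
.\end{align*}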
|
||||
@ -154,6 +134,8 @@ The advantage which arises because of this when using \ac{ADMM} is that
|
||||
it can easily be detected when the algorithm gets stuck: in that case, the algorithm
returns a pseudocodeword, the components of which are fractional.
|
||||
|
||||
\todo{Compare time complexity using Big-O notation}
|
||||
|
||||
\begin{itemize}
|
||||
\item The comparison of actual implementations is always contentious,
          since it is difficult to separate differences in
|
||||
@ -169,3 +151,11 @@ returns a pseudocodeword, the components of which are fractional.
|
||||
\item Proximal decoding faster than \ac{ADMM} $\rightarrow$ surprising
          (larger number of iterations before convergence?)
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Comparison of Results}%
|
||||
\label{sec:comp:res}
|
||||
|
||||
TODO
|
||||
|
||||
@ -1,172 +1,12 @@
|
||||
\chapter{Decoding Techniques}%
|
||||
\label{chapter:decoding_techniques}
|
||||
\chapter{\acs{LP} Decoding using \acs{ADMM}}%
|
||||
\label{chapter:lp_dec_using_admm}
|
||||
|
||||
In this chapter, the decoding techniques examined in this work are detailed.
|
||||
First, an overview of the general methodology of using optimization methods
|
||||
for channel decoding is given.
|
||||
Then, the field of \ac{LP} decoding and an \ac{ADMM}-based \ac{LP} decoding
|
||||
algorithm are introduced.
|
||||
Finally, the \textit{proximal decoding} algorithm is presented.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Decoding using Optimization Methods}%
|
||||
\label{sec:dec:Decoding using Optimization Methods}
|
||||
|
||||
%
|
||||
% General methodology
|
||||
%
|
||||
|
||||
The general idea behind using optimization methods for channel decoding
|
||||
is to reformulate the decoding problem as an optimization problem.
|
||||
This new formulation can then be solved with one of the many
|
||||
available optimization algorithms.
|
||||
|
||||
Generally, the original decoding problem considered is either the \ac{MAP} or
|
||||
the \ac{ML} decoding problem:%
|
||||
%
|
||||
\begin{align}
|
||||
\hat{\boldsymbol{c}}_{\text{\ac{MAP}}} &= \argmax_{\boldsymbol{c} \in \mathcal{C}}
|
||||
p_{\boldsymbol{C} \mid \boldsymbol{Y}} \left(\boldsymbol{c} \mid \boldsymbol{y}
|
||||
\right) \label{eq:dec:map}\\
|
||||
\hat{\boldsymbol{c}}_{\text{\ac{ML}}} &= \argmax_{\boldsymbol{c} \in \mathcal{C}}
|
||||
f_{\boldsymbol{Y} \mid \boldsymbol{C}} \left( \boldsymbol{y} \mid \boldsymbol{c}
|
||||
\right) \label{eq:dec:ml}
|
||||
.\end{align}%
|
||||
%
|
||||
The goal is to arrive at a formulation where a certain objective function
|
||||
$g : \mathbb{R}^n \rightarrow \mathbb{R} $ must be minimized under certain constraints:%
|
||||
%
|
||||
\begin{align*}
|
||||
\text{minimize}\hspace{2mm} &g\left( \tilde{\boldsymbol{c}} \right)\\
|
||||
\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{c}} \in D
|
||||
,\end{align*}%
|
||||
%
|
||||
where $D \subseteq \mathbb{R}^n$ is the domain of values attainable for $\tilde{\boldsymbol{c}}$
|
||||
and represents the constraints.
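As an illustrative example, assuming the bipolar mapping $\boldsymbol{x} = \left( -1 \right) ^{\boldsymbol{c}}$
and an \ac{AWGN} channel, the \ac{ML} rule in equation (\ref{eq:dec:ml}) already has this form,
since maximizing the likelihood is then equivalent to minimizing the Euclidean distance
to the received vector:%
%
\begin{align*}
    \hat{\boldsymbol{c}}_{\text{\ac{ML}}} = \argmax_{\boldsymbol{c} \in \mathcal{C}}
    f_{\boldsymbol{Y} \mid \boldsymbol{C}} \left( \boldsymbol{y} \mid \boldsymbol{c} \right)
    = \argmin_{\boldsymbol{c} \in \mathcal{C}}
    \lVert \boldsymbol{y} - \left( -1 \right) ^{\boldsymbol{c}} \rVert^2
,\end{align*}%
%
with the constraint set $D$ in this case being the discrete set of codewords itself.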
|
||||
|
||||
In contrast to the established message-passing decoding algorithms,
|
||||
the perspective then changes from observing the decoding process in its
|
||||
Tanner graph representation with \acp{VN} and \acp{CN} (as shown in figure \ref{fig:dec:tanner})
|
||||
to a spatial representation (figure \ref{fig:dec:spatial}),
|
||||
where the codewords form a subset of the vertices of a hypercube.
|
||||
The goal is to find the point $\tilde{\boldsymbol{c}}$
that minimizes the objective function $g$.
|
||||
|
||||
%
|
||||
% Figure showing decoding space
|
||||
%
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
|
||||
\begin{subfigure}[c]{0.47\textwidth}
|
||||
\centering
|
||||
|
||||
\tikzstyle{checknode} = [color=KITblue, fill=KITblue,
|
||||
draw, regular polygon,regular polygon sides=4,
|
||||
inner sep=0pt, minimum size=12pt]
|
||||
\tikzstyle{variablenode} = [color=KITgreen, fill=KITgreen,
|
||||
draw, circle, inner sep=0pt, minimum size=10pt]
|
||||
|
||||
\begin{tikzpicture}[scale=1, transform shape]
|
||||
\node[checknode,
|
||||
label={[below, label distance=-0.4cm, align=center]
|
||||
\acs{CN}\\$\left( c_1 + c_2 + c_3 = 0 \right) $}]
|
||||
(cn) at (0, 0) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_1 \right)$}]
|
||||
(c1) at (-2, 2) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_2 \right)$}]
|
||||
(c2) at (0, 2) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_3 \right)$}]
|
||||
(c3) at (2, 2) {};
|
||||
|
||||
\draw (cn) -- (c1);
|
||||
\draw (cn) -- (c2);
|
||||
\draw (cn) -- (c3);
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Tanner graph representation of a single parity-check code}
|
||||
\label{fig:dec:tanner}
|
||||
\end{subfigure}%
|
||||
\hfill%
|
||||
\begin{subfigure}[c]{0.47\textwidth}
|
||||
\centering
|
||||
|
||||
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
|
||||
draw, circle, inner sep=0pt, minimum size=4pt]
|
||||
|
||||
\tdplotsetmaincoords{60}{25}
|
||||
\begin{tikzpicture}[scale=1, transform shape, tdplot_main_coords]
|
||||
% Cube
|
||||
|
||||
\coordinate (p000) at (0, 0, 0);
|
||||
\coordinate (p001) at (0, 0, 2);
|
||||
\coordinate (p010) at (0, 2, 0);
|
||||
\coordinate (p011) at (0, 2, 2);
|
||||
\coordinate (p100) at (2, 0, 0);
|
||||
\coordinate (p101) at (2, 0, 2);
|
||||
\coordinate (p110) at (2, 2, 0);
|
||||
\coordinate (p111) at (2, 2, 2);
|
||||
|
||||
\draw[] (p000) -- (p100);
|
||||
\draw[] (p100) -- (p101);
|
||||
\draw[] (p101) -- (p001);
|
||||
\draw[] (p001) -- (p000);
|
||||
|
||||
\draw[dashed] (p010) -- (p110);
|
||||
\draw[] (p110) -- (p111);
|
||||
\draw[] (p111) -- (p011);
|
||||
\draw[dashed] (p011) -- (p010);
|
||||
|
||||
\draw[dashed] (p000) -- (p010);
|
||||
\draw[] (p100) -- (p110);
|
||||
\draw[] (p101) -- (p111);
|
||||
\draw[] (p001) -- (p011);
|
||||
|
||||
% Polytope Vertices
|
||||
|
||||
\node[codeword] (c000) at (p000) {};
|
||||
\node[codeword] (c101) at (p101) {};
|
||||
\node[codeword] (c110) at (p110) {};
|
||||
\node[codeword] (c011) at (p011) {};
|
||||
|
||||
% Polytope Edges
|
||||
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c101);
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c110);
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c011);
|
||||
%
|
||||
% \draw[line width=1pt, color=KITblue] (c101) -- (c110);
|
||||
% \draw[line width=1pt, color=KITblue] (c101) -- (c011);
|
||||
%
|
||||
% \draw[line width=1pt, color=KITblue] (c011) -- (c110);
|
||||
|
||||
% Polytope Annotations
|
||||
|
||||
\node[color=KITblue, below=0cm of c000] {$\left( 0, 0, 0 \right) $};
|
||||
\node[color=KITblue, right=0.17cm of c101] {$\left( 1, 0, 1 \right) $};
|
||||
\node[color=KITblue, right=0cm of c110] {$\left( 1, 1, 0 \right) $};
|
||||
\node[color=KITblue, above=0cm of c011] {$\left( 0, 1, 1 \right) $};
|
||||
|
||||
% c
|
||||
|
||||
\node[color=KITgreen, fill=KITgreen,
|
||||
draw, circle, inner sep=0pt, minimum size=4pt] (c) at (0.9, 0.7, 1) {};
|
||||
\node[color=KITgreen, right=0cm of c] {$\tilde{\boldsymbol{c}}$};
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Spatial representation of a single parity-check code}
|
||||
\label{fig:dec:spatial}
|
||||
\end{subfigure}%
|
||||
|
||||
\caption{Different representations of the decoding problem}
|
||||
\end{figure}
|
||||
TODO
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{LP Decoding}%
|
||||
\label{sec:dec:LP Decoding}
|
||||
\label{sec:lp:LP Decoding}
|
||||
|
||||
\Ac{LP} decoding is a subject area introduced by Feldman et al.
|
||||
\cite{feldman_paper}. They reframe the decoding problem as an
|
||||
@ -276,7 +116,7 @@ the transfer matrix would be \cite[Sec. II, A]{efficient_lp_dec_admm}
|
||||
.\end{align*}%
|
||||
%
|
||||
|
||||
In figure \ref{fig:dec:poly}, the two relaxations are compared for an
|
||||
In figure \ref{fig:lp:poly}, the two relaxations are compared for an
|
||||
exemplary code, which is described by the generator and parity-check matrices%
|
||||
%
|
||||
\begin{align}
|
||||
@ -298,13 +138,13 @@ and has only two possible codewords:
|
||||
\begin{bmatrix} 0 & 1 & 1 \end{bmatrix} \right\}
|
||||
.\end{align*}
|
||||
%
|
||||
Figure \ref{fig:dec:poly:exact_ilp} shows the domain of exact \ac{ML} decoding.
|
||||
Figure \ref{fig:lp:poly:exact_ilp} shows the domain of exact \ac{ML} decoding.
|
||||
The first relaxation, onto the codeword polytope $\text{poly}\left( \mathcal{C} \right) $,
|
||||
is shown in figure \ref{fig:dec:poly:exact};
|
||||
is shown in figure \ref{fig:lp:poly:exact};
|
||||
this expresses the constraints of the linear program equivalent to exact \ac{ML} decoding.
|
||||
$\text{poly}\left( \mathcal{C} \right) $ is further relaxed onto the relaxed codeword polytope
|
||||
$\overline{Q}$, shown in figure \ref{fig:dec:poly:relaxed}.
|
||||
Figure \ref{fig:dec:poly:local} shows how $\overline{Q}$ is formed by intersecting the
|
||||
$\overline{Q}$, shown in figure \ref{fig:lp:poly:relaxed}.
|
||||
Figure \ref{fig:lp:poly:local} shows how $\overline{Q}$ is formed by intersecting the
|
||||
local codeword polytopes of each check node.
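Recall that the codeword polytope $\text{poly}\left( \mathcal{C} \right) $ is simply the
convex hull of the set of codewords, viewed as points in $\left[ 0, 1 \right] ^n$:%
%
\begin{align*}
    \text{poly}\left( \mathcal{C} \right) = \left\{ \sum_{\boldsymbol{c} \in \mathcal{C}}
    \lambda_{\boldsymbol{c}}\, \boldsymbol{c} : \lambda_{\boldsymbol{c}} \geq 0
    \text{ for all } \boldsymbol{c} \in \mathcal{C},\;
    \sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} = 1 \right\}
.\end{align*}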
|
||||
%
|
||||
%
|
||||
@ -368,7 +208,7 @@ local codeword polytopes of each check node.
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Set of all codewords $\mathcal{C}$}
|
||||
\label{fig:dec:poly:exact_ilp}
|
||||
\label{fig:lp:poly:exact_ilp}
|
||||
\end{subfigure}\\[1em]
|
||||
\begin{subfigure}{\textwidth}
|
||||
\centering
|
||||
@ -429,7 +269,7 @@ local codeword polytopes of each check node.
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Codeword polytope $\text{poly}\left( \mathcal{C} \right) $}
|
||||
\label{fig:dec:poly:exact}
|
||||
\label{fig:lp:poly:exact}
|
||||
\end{subfigure}
|
||||
\end{subfigure} \hfill%
|
||||
%
|
||||
@ -574,7 +414,7 @@ local codeword polytopes of each check node.
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Local codeword polytopes of the check nodes}
|
||||
\label{fig:dec:poly:local}
|
||||
\label{fig:lp:poly:local}
|
||||
\end{subfigure}\\[1em]
|
||||
\begin{subfigure}{\textwidth}
|
||||
\centering
|
||||
@ -648,7 +488,7 @@ local codeword polytopes of each check node.
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Relaxed codeword polytope $\overline{Q}$}
|
||||
\label{fig:dec:poly:relaxed}
|
||||
\label{fig:lp:poly:relaxed}
|
||||
\end{subfigure}
|
||||
\end{subfigure}
|
||||
|
||||
@ -666,7 +506,7 @@ local codeword polytopes of each check node.
|
||||
\caption{Visualization of the codeword polytope and the relaxed codeword
|
||||
polytope of the code described by equations (\ref{eq:lp:example_code_def_gen})
|
||||
and (\ref{eq:lp:example_code_def_par})}
|
||||
\label{fig:dec:poly}
|
||||
\label{fig:lp:poly}
|
||||
\end{figure}%
|
||||
%
|
||||
\noindent It can be seen that the relaxed codeword polytope $\overline{Q}$ introduces
|
||||
@ -689,10 +529,10 @@ The resulting formulation of the relaxed optimization problem becomes%
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{LP Decoding using ADMM}%
|
||||
\label{sec:dec:LP Decoding using ADMM}
|
||||
\section{Decoding Algorithm}%
|
||||
\label{sec:lp:Decoding Algorithm}
|
||||
|
||||
The \ac{LP} decoding formulation in section \ref{sec:dec:Decoding using Optimization Methods}
|
||||
The \ac{LP} decoding formulation in section \ref{sec:lp:LP Decoding}
|
||||
is a very general one that can be solved with a number of different optimization methods.
|
||||
In this work, \ac{ADMM} is examined, as its distributed nature allows for a very efficient
|
||||
implementation.
|
||||
@ -879,246 +719,12 @@ have been proposed (e.g., in \cite{original_admm}, \cite{efficient_lp_dec_admm},
|
||||
The method chosen here is the one presented in \cite{lautern}.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Implementation Details}%
|
||||
\label{sec:lp:Implementation Details}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Proximal Decoding}%
|
||||
\label{sec:dec:Proximal Decoding}
|
||||
\section{Results}%
|
||||
\label{sec:lp:Results}
|
||||
|
||||
Proximal decoding was proposed by Wadayama et al.\ as a novel formulation of
|
||||
optimization-based decoding \cite{proximal_paper}.
|
||||
With this algorithm, minimization is performed using the proximal gradient
|
||||
method.
|
||||
In contrast to \ac{LP} decoding, the objective function is based on a
|
||||
non-convex optimization formulation of the \ac{MAP} decoding problem.
|
||||
|
||||
In order to derive the objective function, the authors begin with the
|
||||
\ac{MAP} decoding rule, expressed as a continuous maximization problem%
|
||||
\footnote{The expansion of the domain to be continuous does not constitute a
|
||||
material difference in the meaning of the rule.
|
||||
The only change is that what previously were \acp{PMF} now have to be expressed
|
||||
in terms of \acp{PDF}.}
|
||||
over $\boldsymbol{x}$:%
|
||||
%
|
||||
\begin{align}
|
||||
\hat{\boldsymbol{x}} = \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
|
||||
f_{\tilde{\boldsymbol{X}} \mid \boldsymbol{Y}}
|
||||
\left( \tilde{\boldsymbol{x}} \mid \boldsymbol{y} \right)
|
||||
= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}} f_{\boldsymbol{Y}
|
||||
\mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)%
|
||||
\label{eq:prox:vanilla_MAP}
|
||||
.\end{align}%
|
||||
%
|
||||
The likelihood $f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) $ is a known function
|
||||
determined by the channel model.
|
||||
The prior \ac{PDF} $f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$ is also
|
||||
known, as the equal probability assumption is made on
|
||||
$\mathcal{C}$.
|
||||
However, since the considered domain is continuous,
|
||||
the prior \ac{PDF} cannot be ignored as a constant during the minimization
|
||||
as is often done, and has a rather unwieldy representation:%
|
||||
%
|
||||
\begin{align}
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right) =
|
||||
\frac{1}{\left| \mathcal{C} \right| }
|
||||
\sum_{\boldsymbol{c} \in \mathcal{C} }
|
||||
\delta\big( \tilde{\boldsymbol{x}} - \left( -1 \right) ^{\boldsymbol{c}}\big)
|
||||
\label{eq:prox:prior_pdf}
|
||||
.\end{align}%
|
||||
%
|
||||
In order to rewrite the prior \ac{PDF}
|
||||
$f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$,
|
||||
the so-called \textit{code-constraint polynomial} is introduced as:%
|
||||
%
|
||||
\begin{align*}
|
||||
h\left( \tilde{\boldsymbol{x}} \right) =
|
||||
\underbrace{\sum_{i=1}^{n} \left( \tilde{x_i}^2-1 \right) ^2}_{\text{Bipolar constraint}}
|
||||
+ \underbrace{\sum_{j=1}^{m} \left[
|
||||
\left( \prod_{i\in N_c \left( j \right) } \tilde{x_i} \right)
|
||||
-1 \right] ^2}_{\text{Parity constraint}}%
|
||||
.\end{align*}%
|
||||
%
|
||||
The intention of this function is to provide a way to penalize vectors far
|
||||
from a codeword and favor those close to one.
|
||||
In order to achieve this, the polynomial is composed of two parts: one term
|
||||
representing the bipolar constraint, providing for a discrete solution of the
|
||||
continuous optimization problem, and one term representing the parity
|
||||
constraints, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
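As a small example, for a single parity-check code of length $n = 3$ with one check
involving all three positions, the code-constraint polynomial reduces to%
%
\begin{align*}
    h\left( \tilde{\boldsymbol{x}} \right) =
    \sum_{i=1}^{3} \left( \tilde{x}_i^2 - 1 \right) ^2
    + \left( \tilde{x}_1 \tilde{x}_2 \tilde{x}_3 - 1 \right) ^2
,\end{align*}%
%
which is zero exactly for the bipolar vectors with an even number of $-1$ entries,
i.e., for the images $\left( -1 \right) ^{\boldsymbol{c}}$ of the codewords.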
|
||||
The prior \ac{PDF} is then approximated using the code-constraint polynomial as:%
|
||||
%
|
||||
\begin{align}
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)
|
||||
\approx \frac{1}{Z}\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) }%
|
||||
\label{eq:prox:prior_pdf_approx}
|
||||
.\end{align}%
|
||||
%
|
||||
The authors justify this approximation by arguing that for
|
||||
$\gamma \rightarrow \infty$, the approximation in equation
|
||||
(\ref{eq:prox:prior_pdf_approx}) approaches the original function in equation
|
||||
(\ref{eq:prox:prior_pdf}).
|
||||
This approximation can then be plugged into equation (\ref{eq:prox:vanilla_MAP})
|
||||
and the likelihood can be rewritten using the negative log-likelihood
|
||||
$L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) = -\ln\left(
|
||||
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}\left(
|
||||
\boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) \right) $:%
|
||||
%
|
||||
\begin{align*}
|
||||
\hat{\boldsymbol{x}} &= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
|
||||
\mathrm{e}^{- L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) }
|
||||
\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) } \\
|
||||
&= \argmin_{\tilde{\boldsymbol{x}} \in \mathbb{R}^n} \big(
|
||||
L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)
|
||||
\big)%
|
||||
.\end{align*}%
|
||||
%
|
||||
Thus, with proximal decoding, the objective function
|
||||
$g\left( \tilde{\boldsymbol{x}} \right)$ considered is%
|
||||
%
|
||||
\begin{align}
|
||||
g\left( \tilde{\boldsymbol{x}} \right) = L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}}
|
||||
\right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)%
|
||||
\label{eq:prox:objective_function}
|
||||
\end{align}%
|
||||
%
|
||||
and the decoding problem is reformulated to%
|
||||
%
|
||||
\begin{align*}
|
||||
\text{minimize}\hspace{2mm} &L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)\\
|
||||
\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{x}} \in \mathbb{R}^n
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
For the solution of the approximate \ac{MAP} decoding problem, the two parts
|
||||
of equation (\ref{eq:prox:objective_function}) are considered separately:
|
||||
the minimization of the objective function occurs in an alternating
|
||||
fashion, switching between the negative log-likelihood
|
||||
$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
|
||||
code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
|
||||
Two helper variables, $\boldsymbol{r}$ and $\boldsymbol{s}$, are introduced,
|
||||
describing the result of each of the two steps.
|
||||
The first step, minimizing the log-likelihood, is performed using gradient
|
||||
descent:%
|
||||
%
|
||||
\begin{align}
|
||||
\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \nabla
|
||||
L\left( \boldsymbol{y} \mid \boldsymbol{s} \right),
|
||||
\hspace{5mm}\omega > 0
|
||||
\label{eq:prox:step_log_likelihood}
|
||||
.\end{align}%
|
||||
%
|
||||
For the second step, minimizing the scaled code-constraint polynomial, the
|
||||
proximal gradient method is used and the \textit{proximal operator} of
|
||||
$\gamma h\left( \tilde{\boldsymbol{x}} \right) $ has to be computed.
|
||||
It is then immediately approximated using gradient descent:%
|
||||
%
|
||||
\begin{align*}
|
||||
\textbf{prox}_{\gamma h} \left( \tilde{\boldsymbol{x}} \right) &\equiv
|
||||
\argmin_{\boldsymbol{t} \in \mathbb{R}^n}
|
||||
\left( \gamma h\left( \boldsymbol{t} \right) +
|
||||
\frac{1}{2} \lVert \boldsymbol{t} - \tilde{\boldsymbol{x}} \rVert^2 \right)\\
|
||||
&\approx \tilde{\boldsymbol{x}} - \gamma \nabla h \left( \tilde{\boldsymbol{x}} \right),
|
||||
\hspace{5mm} \gamma > 0, \text{ small}
|
||||
.\end{align*}%
|
||||
%
|
||||
The second step thus becomes%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{s} \leftarrow \boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right),
|
||||
\hspace{5mm}\gamma > 0,\text{ small}
|
||||
.\end{align*}
|
||||
%
|
||||
While the approximation of the prior \ac{PDF} made in equation (\ref{eq:prox:prior_pdf_approx})
|
||||
theoretically becomes better
|
||||
with larger $\gamma$, the constraint that $\gamma$ be small is important,
|
||||
as it keeps the effect of $h\left( \tilde{\boldsymbol{x}} \right) $ on the landscape
|
||||
of the objective function small.
|
||||
Otherwise, unwanted stationary points, including local minima, are introduced.
|
||||
The authors say that ``in practice, the value of $\gamma$ should be adjusted
|
||||
according to the decoding performance.'' \cite[Sec. 3.1]{proximal_paper}.
|
||||
|
||||
%The components of the gradient of the code-constraint polynomial can be computed as follows:%
|
||||
%%
|
||||
%\begin{align*}
|
||||
% \frac{\partial}{\partial x_k} h\left( \boldsymbol{x} \right) =
|
||||
% 4\left( x_k^2 - 1 \right) x_k + \frac{2}{x_k}
|
||||
% \sum_{i\in \mathcal{B}\left( k \right) } \left(
|
||||
% \left( \prod_{j\in\mathcal{A}\left( i \right)} x_j\right)^2
|
||||
% - \prod_{j\in\mathcal{A}\left( i \right) }x_j \right)
|
||||
%.\end{align*}%
|
||||
%\todo{Only multiplication?}%
|
||||
%\todo{$x_k$: $k$ or some other indexing variable?}%
|
||||
%%
|
||||
In the case of \ac{AWGN}, the likelihood
|
||||
$f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)$
|
||||
is%
|
||||
%
|
||||
\begin{align*}
|
||||
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
= \frac{1}{\left( 2\pi\sigma^2 \right) ^{n / 2}}\mathrm{e}^{
|
||||
-\frac{\lVert \boldsymbol{y}-\tilde{\boldsymbol{x}}
|
||||
\rVert^2 }
|
||||
{2\sigma^2}}
|
||||
.\end{align*}
|
||||
%
|
||||
Thus, the gradient of the negative log-likelihood becomes%
|
||||
\footnote{For the minimization, constants can be disregarded. For this reason,
|
||||
it suffices to consider only proportionality instead of equality.}%
|
||||
%
|
||||
\begin{align*}
|
||||
\nabla L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
&\propto -\nabla \lVert \boldsymbol{y} - \tilde{\boldsymbol{x}} \rVert^2\\
|
||||
&\propto \tilde{\boldsymbol{x}} - \boldsymbol{y}
|
||||
,\end{align*}%
|
||||
%
|
||||
allowing equation (\ref{eq:prox:step_log_likelihood}) to be rewritten as%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{r} \leftarrow \boldsymbol{s}
|
||||
- \omega \left( \boldsymbol{s} - \boldsymbol{y} \right)
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
One thing to consider during the actual decoding process is that the gradient
|
||||
of the code-constraint polynomial can take on extremely large values.
|
||||
To avoid numerical instability, an additional step is added, where all
|
||||
components of the current estimate are clipped to $\left[-\eta, \eta \right]$,
|
||||
where $\eta$ is a positive constant slightly larger than one:%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{s} \leftarrow \Pi_{\eta} \left( \boldsymbol{r}
|
||||
- \gamma \nabla h\left( \boldsymbol{r} \right) \right)
|
||||
,\end{align*}
|
||||
%
|
||||
with $\Pi_{\eta}\left( \cdot \right) $ expressing the projection onto
|
||||
$\left[ -\eta, \eta \right]^n$.
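Written component-wise, this projection simply clips each entry of its argument to
the interval $\left[ -\eta, \eta \right] $:%
%
\begin{align*}
    \left[ \Pi_{\eta}\left( \boldsymbol{x} \right) \right] _i =
    \min\left( \max\left( x_i, -\eta \right) , \eta \right) ,
    \hspace{5mm} i = 1, \ldots, n
.\end{align*}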
|
||||
|
||||
The iterative decoding process resulting from these considerations is shown in
|
||||
figure \ref{fig:prox:alg}.
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
|
||||
\begin{genericAlgorithm}[caption={}, label={}]
|
||||
$\boldsymbol{s} \leftarrow \boldsymbol{0}$
|
||||
for $K$ iterations do
|
||||
$\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \left( \boldsymbol{s} - \boldsymbol{y} \right) $
|
||||
$\boldsymbol{s} \leftarrow \Pi_\eta \left(\boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right) \right)$
|
||||
$\boldsymbol{\hat{x}} \leftarrow \text{sign}\left( \boldsymbol{s} \right) $
$\boldsymbol{\hat{c}} \leftarrow \frac{1}{2}\left( \boldsymbol{1} - \boldsymbol{\hat{x}} \right) $
if $\boldsymbol{H}\boldsymbol{\hat{c}} = \boldsymbol{0}$ then
|
||||
return $\boldsymbol{\hat{c}}$
|
||||
end if
|
||||
end for
|
||||
return $\boldsymbol{\hat{c}}$
|
||||
\end{genericAlgorithm}
|
||||
|
||||
|
||||
\caption{Proximal decoding algorithm for an \ac{AWGN} channel}
|
||||
\label{fig:prox:alg}
|
||||
\end{figure}
|
||||
@ -1,34 +0,0 @@
|
||||
\chapter{Methodology and Implementation}%
|
||||
\label{chapter:methodology_and_implementation}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{General implementation process}%
|
||||
\label{sec:impl:General implementation process}
|
||||
|
||||
\begin{itemize}
|
||||
\item First Python using NumPy
|
||||
\item Then C++ using Eigen
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{LP Decoding using ADMM}%
|
||||
\label{sec:impl:LP Decoding using ADMM}
|
||||
|
||||
\begin{itemize}
|
||||
\item Choice of parameters
|
||||
\item Selected projection algorithm
|
||||
\item Adaptive linear programming decoding?
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Proximal Decoding}%
|
||||
\label{sec:impl:Proximal Decoding}
|
||||
|
||||
\begin{itemize}
|
||||
\item Choice of parameters
|
||||
\item Road to improved implementation
|
||||
\end{itemize}
|
||||
|
||||
latex/thesis/chapters/proximal_decoding.tex (new file, 264 lines)
@ -0,0 +1,264 @@
|
||||
\chapter{Proximal Decoding}%
|
||||
\label{chapter:proximal_decoding}
|
||||
|
||||
TODO
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Decoding Algorithm}%
|
||||
\label{sec:prox:Decoding Algorithm}
|
||||
|
||||
Proximal decoding was proposed by Wadayama et al.\ as a novel formulation of
|
||||
optimization-based decoding \cite{proximal_paper}.
|
||||
With this algorithm, minimization is performed using the proximal gradient
|
||||
method.
|
||||
In contrast to \ac{LP} decoding, the objective function is based on a
|
||||
non-convex optimization formulation of the \ac{MAP} decoding problem.
|
||||
|
||||
In order to derive the objective function, the authors begin with the
|
||||
\ac{MAP} decoding rule, expressed as a continuous maximization problem%
|
||||
\footnote{The expansion of the domain to be continuous does not constitute a
|
||||
material difference in the meaning of the rule.
|
||||
The only change is that what previously were \acp{PMF} now have to be expressed
|
||||
in terms of \acp{PDF}.}
|
||||
over $\boldsymbol{x}$:%
|
||||
%
|
||||
\begin{align}
|
||||
\hat{\boldsymbol{x}} = \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
|
||||
f_{\tilde{\boldsymbol{X}} \mid \boldsymbol{Y}}
|
||||
\left( \tilde{\boldsymbol{x}} \mid \boldsymbol{y} \right)
|
||||
= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}} f_{\boldsymbol{Y}
|
||||
\mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)%
|
||||
\label{eq:prox:vanilla_MAP}
|
||||
.\end{align}%
|
||||
%
|
||||
The likelihood $f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) $ is a known function
|
||||
determined by the channel model.
|
||||
The prior \ac{PDF} $f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$ is also
|
||||
known, as the equal probability assumption is made on
|
||||
$\mathcal{C}$.
|
||||
However, since the considered domain is continuous,
|
||||
the prior \ac{PDF} cannot be ignored as a constant during the minimization
|
||||
as is often done, and has a rather unwieldy representation:%
|
||||
%
|
||||
\begin{align}
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right) =
|
||||
\frac{1}{\left| \mathcal{C} \right| }
|
||||
\sum_{\boldsymbol{c} \in \mathcal{C} }
|
||||
\delta\big( \tilde{\boldsymbol{x}} - \left( -1 \right) ^{\boldsymbol{c}}\big)
|
||||
\label{eq:prox:prior_pdf}
|
||||
.\end{align}%
|
||||
%
|
||||
In order to rewrite the prior \ac{PDF}
|
||||
$f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)$,
|
||||
the so-called \textit{code-constraint polynomial} is introduced as:%
|
||||
%
|
||||
\begin{align*}
|
||||
h\left( \tilde{\boldsymbol{x}} \right) =
|
||||
\underbrace{\sum_{i=1}^{n} \left( \tilde{x_i}^2-1 \right) ^2}_{\text{Bipolar constraint}}
|
||||
+ \underbrace{\sum_{j=1}^{m} \left[
|
||||
\left( \prod_{i\in N_c \left( j \right) } \tilde{x_i} \right)
|
||||
-1 \right] ^2}_{\text{Parity constraint}}%
|
||||
.\end{align*}%
|
||||
%
|
||||
The intention of this function is to provide a way to penalize vectors far
|
||||
from a codeword and favor those close to one.
|
||||
In order to achieve this, the polynomial is composed of two parts: one term
|
||||
representing the bipolar constraint, providing for a discrete solution of the
|
||||
continuous optimization problem, and one term representing the parity
|
||||
constraints, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
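As a small example, for a single parity-check code of length $n = 3$ with one check
involving all three positions, the code-constraint polynomial reduces to%
%
\begin{align*}
    h\left( \tilde{\boldsymbol{x}} \right) =
    \sum_{i=1}^{3} \left( \tilde{x}_i^2 - 1 \right) ^2
    + \left( \tilde{x}_1 \tilde{x}_2 \tilde{x}_3 - 1 \right) ^2
,\end{align*}%
%
which is zero exactly for the bipolar vectors with an even number of $-1$ entries,
i.e., for the images $\left( -1 \right) ^{\boldsymbol{c}}$ of the codewords.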
|
||||
The prior \ac{PDF} is then approximated using the code-constraint polynomial as:%
|
||||
%
|
||||
\begin{align}
|
||||
f_{\tilde{\boldsymbol{X}}}\left( \tilde{\boldsymbol{x}} \right)
|
||||
\approx \frac{1}{Z}\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) }%
|
||||
\label{eq:prox:prior_pdf_approx}
|
||||
.\end{align}%
|
||||
%
|
||||
The authors justify this approximation by arguing that for
|
||||
$\gamma \rightarrow \infty$, the approximation in equation
|
||||
(\ref{eq:prox:prior_pdf_approx}) approaches the original function in equation
|
||||
(\ref{eq:prox:prior_pdf}).
|
||||
This approximation can then be plugged into equation (\ref{eq:prox:vanilla_MAP})
|
||||
and the likelihood can be rewritten using the negative log-likelihood
|
||||
$L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) = -\ln\left(
|
||||
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}\left(
|
||||
\boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) \right) $:%
|
||||
%
|
||||
\begin{align*}
|
||||
\hat{\boldsymbol{x}} &= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
|
||||
\mathrm{e}^{- L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) }
|
||||
\mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) } \\
|
||||
&= \argmin_{\tilde{\boldsymbol{x}} \in \mathbb{R}^n} \big(
|
||||
L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)
|
||||
\big)%
|
||||
.\end{align*}%
|
||||
%
|
||||
Thus, with proximal decoding, the objective function
|
||||
$g\left( \tilde{\boldsymbol{x}} \right)$ considered is%
|
||||
%
|
||||
\begin{align}
|
||||
g\left( \tilde{\boldsymbol{x}} \right) = L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}}
|
||||
\right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)%
|
||||
\label{eq:prox:objective_function}
|
||||
\end{align}%
|
||||
%
|
||||
and the decoding problem is reformulated to%
|
||||
%
|
||||
\begin{align*}
|
||||
\text{minimize}\hspace{2mm} &L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
+ \gamma h\left( \tilde{\boldsymbol{x}} \right)\\
|
||||
\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{x}} \in \mathbb{R}^n
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
For the solution of the approximate \ac{MAP} decoding problem, the two parts
|
||||
of equation (\ref{eq:prox:objective_function}) are considered separately:
|
||||
the minimization of the objective function occurs in an alternating
|
||||
fashion, switching between the negative log-likelihood
|
||||
$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
|
||||
code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
|
||||
Two helper variables, $\boldsymbol{r}$ and $\boldsymbol{s}$, are introduced,
|
||||
describing the result of each of the two steps.
|
||||
The first step, minimizing the log-likelihood, is performed using gradient
|
||||
descent:%
|
||||
%
|
||||
\begin{align}
|
||||
\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \nabla
|
||||
L\left( \boldsymbol{y} \mid \boldsymbol{s} \right),
|
||||
\hspace{5mm}\omega > 0
|
||||
\label{eq:prox:step_log_likelihood}
|
||||
.\end{align}%
|
||||
%
|
||||
For the second step, minimizing the scaled code-constraint polynomial, the
|
||||
proximal gradient method is used and the \textit{proximal operator} of
|
||||
$\gamma h\left( \tilde{\boldsymbol{x}} \right) $ has to be computed.
|
||||
It is then immediately approximated using gradient descent:%
|
||||
%
|
||||
\begin{align*}
|
||||
\textbf{prox}_{\gamma h} \left( \tilde{\boldsymbol{x}} \right) &\equiv
|
||||
\argmin_{\boldsymbol{t} \in \mathbb{R}^n}
|
||||
\left( \gamma h\left( \boldsymbol{t} \right) +
|
||||
\frac{1}{2} \lVert \boldsymbol{t} - \tilde{\boldsymbol{x}} \rVert^2 \right)\\
|
||||
&\approx \tilde{\boldsymbol{x}} - \gamma \nabla h \left( \tilde{\boldsymbol{x}} \right),
|
||||
\hspace{5mm} \gamma > 0, \text{ small}
|
||||
.\end{align*}%
|
||||
%
|
||||
The second step thus becomes%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{s} \leftarrow \boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right),
|
||||
\hspace{5mm}\gamma > 0,\text{ small}
|
||||
.\end{align*}
|
||||
%
|
||||
While the approximation of the prior \ac{PDF} made in equation (\ref{eq:prox:prior_pdf_approx})
|
||||
theoretically becomes better
|
||||
with larger $\gamma$, the constraint that $\gamma$ be small is important,
|
||||
as it keeps the effect of $h\left( \tilde{\boldsymbol{x}} \right) $ on the landscape
|
||||
of the objective function small.
|
||||
Otherwise, unwanted stationary points, including local minima, are introduced.
|
||||
The authors say that ``in practice, the value of $\gamma$ should be adjusted
|
||||
according to the decoding performance.'' \cite[Sec. 3.1]{proximal_paper}.
|
||||
|
||||
%The components of the gradient of the code-constraint polynomial can be computed as follows:%
|
||||
%%
|
||||
%\begin{align*}
|
||||
% \frac{\partial}{\partial x_k} h\left( \boldsymbol{x} \right) =
|
||||
% 4\left( x_k^2 - 1 \right) x_k + \frac{2}{x_k}
|
||||
% \sum_{i\in \mathcal{B}\left( k \right) } \left(
|
||||
% \left( \prod_{j\in\mathcal{A}\left( i \right)} x_j\right)^2
|
||||
% - \prod_{j\in\mathcal{A}\left( i \right) }x_j \right)
|
||||
%.\end{align*}%
|
||||
%\todo{Only multiplication?}%
|
||||
%\todo{$x_k$: $k$ or some other indexing variable?}%
|
||||
%%
|
||||
In the case of \ac{AWGN}, the likelihood
|
||||
$f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)$
|
||||
is%
|
||||
%
|
||||
\begin{align*}
|
||||
f_{\boldsymbol{Y} \mid \tilde{\boldsymbol{X}}}
|
||||
\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
= \frac{1}{\left( 2\pi\sigma^2 \right) ^{n / 2}}\mathrm{e}^{
|
||||
-\frac{\lVert \boldsymbol{y}-\tilde{\boldsymbol{x}}
|
||||
\rVert^2 }
|
||||
{2\sigma^2}}
|
||||
.\end{align*}
|
||||
%
|
||||
Thus, the gradient of the negative log-likelihood becomes%
|
||||
\footnote{For the minimization, constants can be disregarded. For this reason,
|
||||
it suffices to consider only proportionality instead of equality.}%
|
||||
%
|
||||
\begin{align*}
|
||||
\nabla L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
|
||||
&\propto -\nabla \lVert \boldsymbol{y} - \tilde{\boldsymbol{x}} \rVert^2\\
|
||||
&\propto \tilde{\boldsymbol{x}} - \boldsymbol{y}
|
||||
,\end{align*}%
|
||||
%
|
||||
allowing equation (\ref{eq:prox:step_log_likelihood}) to be rewritten as%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{r} \leftarrow \boldsymbol{s}
|
||||
- \omega \left( \boldsymbol{s} - \boldsymbol{y} \right)
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
One thing to consider during the actual decoding process is that the gradient
|
||||
of the code-constraint polynomial can take on extremely large values.
|
||||
To avoid numerical instability, an additional step is added, where all
|
||||
components of the current estimate are clipped to $\left[-\eta, \eta \right]$,
|
||||
where $\eta$ is a positive constant slightly larger than one:%
|
||||
%
|
||||
\begin{align*}
|
||||
\boldsymbol{s} \leftarrow \Pi_{\eta} \left( \boldsymbol{r}
|
||||
- \gamma \nabla h\left( \boldsymbol{r} \right) \right)
|
||||
,\end{align*}
|
||||
%
|
||||
with $\Pi_{\eta}\left( \cdot \right) $ expressing the projection onto
|
||||
$\left[ -\eta, \eta \right]^n$.
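Written component-wise, this projection simply clips each entry of its argument to
the interval $\left[ -\eta, \eta \right] $:%
%
\begin{align*}
    \left[ \Pi_{\eta}\left( \boldsymbol{x} \right) \right] _i =
    \min\left( \max\left( x_i, -\eta \right) , \eta \right) ,
    \hspace{5mm} i = 1, \ldots, n
.\end{align*}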
|
||||
|
||||
The iterative decoding process resulting from these considerations is shown in
|
||||
figure \ref{fig:prox:alg}.
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
|
||||
\begin{genericAlgorithm}[caption={}, label={}]
|
||||
$\boldsymbol{s} \leftarrow \boldsymbol{0}$
|
||||
for $K$ iterations do
|
||||
$\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \left( \boldsymbol{s} - \boldsymbol{y} \right) $
|
||||
$\boldsymbol{s} \leftarrow \Pi_\eta \left(\boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right) \right)$
|
||||
$\boldsymbol{\hat{x}} \leftarrow \text{sign}\left( \boldsymbol{s} \right) $
$\boldsymbol{\hat{c}} \leftarrow \frac{1}{2}\left( \boldsymbol{1} - \boldsymbol{\hat{x}} \right) $
if $\boldsymbol{H}\boldsymbol{\hat{c}} = \boldsymbol{0}$ then
|
||||
return $\boldsymbol{\hat{c}}$
|
||||
end if
|
||||
end for
|
||||
return $\boldsymbol{\hat{c}}$
|
||||
\end{genericAlgorithm}
|
||||
|
||||
|
||||
\caption{Proximal decoding algorithm for an \ac{AWGN} channel}
|
||||
\label{fig:prox:alg}
|
||||
\end{figure}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Implementation Details}%
|
||||
\label{sec:prox:Implementation Details}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Results}%
|
||||
\label{sec:prox:Results}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Improved Implementation}%
|
||||
\label{sec:prox:Improved Implementation}
|
||||
|
||||
@ -5,10 +5,11 @@ In this chapter, the theoretical background necessary to understand this
|
||||
work is given.
|
||||
First, the used notation is clarified.
|
||||
The physical aspects are detailed: the modulation scheme and channel model used.
|
||||
A short introduction of channel coding with binary linear codes and especially
|
||||
A short introduction to channel coding with binary linear codes and especially
|
||||
\ac{LDPC} codes is given.
|
||||
The established methods of decoding \ac{LDPC} codes are briefly explained.
|
||||
Lastly, the optimization methods utilized are described.
|
||||
Lastly, the general process of decoding using optimization techniques is described
|
||||
and an overview of the utilized optimization methods is given.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
@ -270,6 +271,161 @@ the use of \ac{BP} impractical for applications where a very low \ac{BER} is
|
||||
desired \cite[Sec. 15.3]{ryan_lin_2009}.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Decoding using Optimization Methods}%
|
||||
\label{sec:theo:Decoding using Optimization Methods}
|
||||
|
||||
%
|
||||
% General methodology
|
||||
%
|
||||
|
||||
The general idea behind using optimization methods for channel decoding
|
||||
is to reformulate the decoding problem as an optimization problem.
|
||||
This new formulation can then be solved with one of the many
|
||||
available optimization algorithms.
|
||||
|
||||
Generally, the original decoding problem considered is either the \ac{MAP} or
|
||||
the \ac{ML} decoding problem:%
|
||||
%
|
||||
\begin{align}
|
||||
\hat{\boldsymbol{c}}_{\text{\ac{MAP}}} &= \argmax_{\boldsymbol{c} \in \mathcal{C}}
|
||||
p_{\boldsymbol{C} \mid \boldsymbol{Y}} \left(\boldsymbol{c} \mid \boldsymbol{y}
|
||||
\right) \label{eq:dec:map}\\
|
||||
\hat{\boldsymbol{c}}_{\text{\ac{ML}}} &= \argmax_{\boldsymbol{c} \in \mathcal{C}}
|
||||
f_{\boldsymbol{Y} \mid \boldsymbol{C}} \left( \boldsymbol{y} \mid \boldsymbol{c}
|
||||
\right) \label{eq:dec:ml}
|
||||
.\end{align}%
|
||||
%
|
||||
The goal is to arrive at a formulation where a certain objective function
|
||||
$g : \mathbb{R}^n \rightarrow \mathbb{R} $ must be minimized under certain constraints:%
|
||||
%
|
||||
\begin{align*}
|
||||
\text{minimize}\hspace{2mm} &g\left( \tilde{\boldsymbol{c}} \right)\\
|
||||
\text{subject to}\hspace{2mm} &\tilde{\boldsymbol{c}} \in D
|
||||
,\end{align*}%
|
||||
%
|
||||
where $D \subseteq \mathbb{R}^n$ is the domain of values attainable for $\tilde{\boldsymbol{c}}$
|
||||
and represents the constraints.
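As an illustrative example, assuming the bipolar mapping $\boldsymbol{x} = \left( -1 \right) ^{\boldsymbol{c}}$
and an \ac{AWGN} channel, the \ac{ML} rule in equation (\ref{eq:dec:ml}) already has this form,
since maximizing the likelihood is then equivalent to minimizing the Euclidean distance
to the received vector:%
%
\begin{align*}
    \hat{\boldsymbol{c}}_{\text{\ac{ML}}} = \argmax_{\boldsymbol{c} \in \mathcal{C}}
    f_{\boldsymbol{Y} \mid \boldsymbol{C}} \left( \boldsymbol{y} \mid \boldsymbol{c} \right)
    = \argmin_{\boldsymbol{c} \in \mathcal{C}}
    \lVert \boldsymbol{y} - \left( -1 \right) ^{\boldsymbol{c}} \rVert^2
,\end{align*}%
%
with the constraint set $D$ in this case being the discrete set of codewords itself.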
|
||||
|
||||
In contrast to the established message-passing decoding algorithms,
|
||||
the prespective then changes from observing the decoding process in its
|
||||
Tanner graph representation with \acp{VN} and \acp{CN} (as shown in figure \ref{fig:dec:tanner})
|
||||
to a spatial representation (figure \ref{fig:dec:spatial}),
|
||||
where the codewords form a subset of the vertices of a hypercube.
|
||||
The goal is to find the point $\tilde{\boldsymbol{c}}$
that minimizes the objective function $g$.
|
||||
|
||||
%
|
||||
% Figure showing decoding space
|
||||
%
|
||||
|
||||
\begin{figure}[H]
|
||||
\centering
|
||||
|
||||
\begin{subfigure}[c]{0.47\textwidth}
|
||||
\centering
|
||||
|
||||
\tikzstyle{checknode} = [color=KITblue, fill=KITblue,
|
||||
draw, regular polygon,regular polygon sides=4,
|
||||
inner sep=0pt, minimum size=12pt]
|
||||
\tikzstyle{variablenode} = [color=KITgreen, fill=KITgreen,
|
||||
draw, circle, inner sep=0pt, minimum size=10pt]
|
||||
|
||||
\begin{tikzpicture}[scale=1, transform shape]
|
||||
\node[checknode,
|
||||
label={[below, label distance=-0.4cm, align=center]
|
||||
\acs{CN}\\$\left( c_1 + c_2 + c_3 = 0 \right) $}]
|
||||
(cn) at (0, 0) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_1 \right)$}]
|
||||
(c1) at (-2, 2) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_2 \right)$}]
|
||||
(c2) at (0, 2) {};
|
||||
\node[variablenode, label={[above, align=center] \acs{VN}\\$\left( c_3 \right)$}]
|
||||
(c3) at (2, 2) {};
|
||||
|
||||
\draw (cn) -- (c1);
|
||||
\draw (cn) -- (c2);
|
||||
\draw (cn) -- (c3);
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Tanner graph representation of a single parity-check code}
|
||||
\label{fig:dec:tanner}
|
||||
\end{subfigure}%
|
||||
\hfill%
|
||||
\begin{subfigure}[c]{0.47\textwidth}
|
||||
\centering
|
||||
|
||||
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
|
||||
draw, circle, inner sep=0pt, minimum size=4pt]
|
||||
|
||||
\tdplotsetmaincoords{60}{25}
|
||||
\begin{tikzpicture}[scale=1, transform shape, tdplot_main_coords]
|
||||
% Cube
|
||||
|
||||
\coordinate (p000) at (0, 0, 0);
|
||||
\coordinate (p001) at (0, 0, 2);
|
||||
\coordinate (p010) at (0, 2, 0);
|
||||
\coordinate (p011) at (0, 2, 2);
|
||||
\coordinate (p100) at (2, 0, 0);
|
||||
\coordinate (p101) at (2, 0, 2);
|
||||
\coordinate (p110) at (2, 2, 0);
|
||||
\coordinate (p111) at (2, 2, 2);
|
||||
|
||||
\draw[] (p000) -- (p100);
|
||||
\draw[] (p100) -- (p101);
|
||||
\draw[] (p101) -- (p001);
|
||||
\draw[] (p001) -- (p000);
|
||||
|
||||
\draw[dashed] (p010) -- (p110);
|
||||
\draw[] (p110) -- (p111);
|
||||
\draw[] (p111) -- (p011);
|
||||
\draw[dashed] (p011) -- (p010);
|
||||
|
||||
\draw[dashed] (p000) -- (p010);
|
||||
\draw[] (p100) -- (p110);
|
||||
\draw[] (p101) -- (p111);
|
||||
\draw[] (p001) -- (p011);
|
||||
|
||||
% Polytope Vertices
|
||||
|
||||
\node[codeword] (c000) at (p000) {};
|
||||
\node[codeword] (c101) at (p101) {};
|
||||
\node[codeword] (c110) at (p110) {};
|
||||
\node[codeword] (c011) at (p011) {};
|
||||
|
||||
% Polytope Edges
|
||||
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c101);
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c110);
|
||||
% \draw[line width=1pt, color=KITblue] (c000) -- (c011);
|
||||
%
|
||||
% \draw[line width=1pt, color=KITblue] (c101) -- (c110);
|
||||
% \draw[line width=1pt, color=KITblue] (c101) -- (c011);
|
||||
%
|
||||
% \draw[line width=1pt, color=KITblue] (c011) -- (c110);
|
||||
|
||||
% Polytope Annotations
|
||||
|
||||
\node[color=KITblue, below=0cm of c000] {$\left( 0, 0, 0 \right) $};
|
||||
\node[color=KITblue, right=0.17cm of c101] {$\left( 1, 0, 1 \right) $};
|
||||
\node[color=KITblue, right=0cm of c110] {$\left( 1, 1, 0 \right) $};
|
||||
\node[color=KITblue, above=0cm of c011] {$\left( 0, 1, 1 \right) $};
|
||||
|
||||
% c
|
||||
|
||||
\node[color=KITgreen, fill=KITgreen,
|
||||
draw, circle, inner sep=0pt, minimum size=4pt] (c) at (0.9, 0.7, 1) {};
|
||||
\node[color=KITgreen, right=0cm of c] {$\tilde{\boldsymbol{c}}$};
|
||||
\end{tikzpicture}
|
||||
|
||||
\caption{Spatial representation of a single parity-check code}
|
||||
\label{fig:dec:spatial}
|
||||
\end{subfigure}%
|
||||
|
||||
\caption{Different representations of the decoding problem}
|
||||
\end{figure}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Optimization Methods}
|
||||
\label{sec:theo:Optimization Methods}
|
||||
@ -484,3 +640,4 @@ condition that the step size be $\mu$:%
|
||||
% \hspace{5mm} \mu > 0
|
||||
.\end{align*}
|
||||
%
|
||||
|
||||
|
||||
@ -213,9 +213,9 @@
|
||||
|
||||
\include{chapters/introduction}
|
||||
\include{chapters/theoretical_background}
|
||||
\include{chapters/decoding_techniques}
|
||||
\include{chapters/methodology_and_implementation}
|
||||
\include{chapters/analysis_of_results}
|
||||
\include{chapters/proximal_decoding}
|
||||
\include{chapters/lp_dec_using_admm}
|
||||
\include{chapters/comparison}
|
||||
\include{chapters/discussion}
|
||||
\include{chapters/conclusion}
|
||||
\include{chapters/appendix}
|
||||
|
||||