Spell checked entire decoding techniques chapter

This commit is contained in:
Andreas Tsouchlos 2023-02-19 14:35:33 +01:00
parent 252736ff31
commit c1473c6eb4

@ -2,7 +2,7 @@
\label{chapter:decoding_techniques}
In this chapter, the decoding techniques examined in this work are detailed.
First, an overview of the general methodology of using optimization methods
for channel decoding is given. Afterwards, the specific decoding techniques
themselves are explained.
@ -31,7 +31,7 @@ the \ac{ML} decoding problem:%
.\end{align*}%
%
The goal is to arrive at a formulation where a certain objective function
$f$ must be minimized under certain constraints:%
%
\begin{align*}
\text{minimize}\hspace{2mm} &f\left( \boldsymbol{c} \right)\\
@ -44,7 +44,7 @@ constraints.
In contrast to the established message-passing decoding algorithms,
the viewpoint then changes from observing the decoding process in its
Tanner graph representation (as shown in figure \ref{fig:dec:tanner})
to a spatial representation (figure \ref{fig:dec:spatial}),
where the codewords form a subset of the vertices of a hypercube.
The goal is to find the point $\boldsymbol{c}$
that minimizes the objective function $f$.
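To make this spatial viewpoint concrete, the following Python sketch performs
brute-force \ac{ML} decoding on a toy code: it enumerates all hypercube
vertices satisfying the parity checks (i.e., all codewords) and selects the
one minimizing a linear cost built from the channel log-likelihood ratios.
The matrix and the cost vector are arbitrary illustrative choices, not values
from the referenced literature.
\begin{verbatim}
import itertools
import numpy as np

# Hypothetical toy parity-check matrix
H = np.array([[1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1]])

def ml_decode(llr, H):
    """Brute-force ML decoding: search the hypercube vertices that
    satisfy all parity checks and minimize sum_i llr_i * c_i."""
    n = H.shape[1]
    best, best_cost = None, np.inf
    for bits in itertools.product([0, 1], repeat=n):
        c = np.array(bits)
        if np.any(H @ c % 2):   # parity violated: not a codeword
            continue
        cost = llr @ c          # linear cost from the channel LLRs
        if cost < best_cost:
            best, best_cost = c, cost
    return best

llr = np.array([-1.2, 0.4, 2.1, -0.3, 0.8])  # example channel LLRs
print(ml_decode(llr, H))
\end{verbatim}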
@ -149,8 +149,8 @@ which minimizes the objective function $f$.
\node[color=KITgreen, right=0cm of c] {$\boldsymbol{c}$};
\end{tikzpicture}
\caption{Spatial representation of a single parity-check code}
\label{fig:dec:spatial}
\end{subfigure}%
\caption{Different representations of the decoding problem}
@ -171,7 +171,7 @@ representation.
To solve the resulting linear program, various optimization methods can be
used.
Feldman et al. begin by looking at the \ac{ML} decoding problem%
\footnote{They assume that all codewords are equally likely to be transmitted,
making the \ac{ML} and \ac{MAP} decoding problems equivalent.}%
%
@ -233,7 +233,7 @@ decoding, redefining the constraints in terms of the \text{codeword polytope}
which represents the \textit{convex hull} of all possible codewords,
i.e.\ the set of all convex combinations of codewords.
However, since the number of constraints needed to characterize the codeword
polytope is exponential in the code length, this formulation is relaxed further.
By observing that each check node defines its own local single parity-check
code, and thus its own \textit{local codeword polytope},
the \textit{relaxed codeword polytope} $\overline{Q}$ is defined as the intersection of all
@ -274,7 +274,7 @@ Figures \ref{fig:dec:poly:local1} and \ref{fig:dec:poly:local2} show the local
codeword polytopes of each check node.
Their intersection, the relaxed codeword polytope $\overline{Q}$, is shown in
figure \ref{fig:dec:poly:relaxed}.
It can be seen that the relaxed codeword polytope $\overline{Q}$ introduces
vertices with fractional values;
these represent erroneous non-codeword solutions to the linear program and
correspond to the so-called \textit{pseudocodewords} introduced in
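The construction of $\overline{Q}$ can be illustrated with a short Python
sketch: for each check node, every odd-sized subset of its neighborhood yields
one ``forbidden set'' inequality, and the resulting linear program is solved
with an off-the-shelf solver. The toy matrix and cost vector are hypothetical;
depending on the costs, the optimum may be integral (a codeword) or fractional
(a pseudocodeword).
\begin{verbatim}
import itertools
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy parity-check matrix
H = np.array([[1, 1, 1, 0],
              [0, 1, 1, 1]])

def lp_decode(llr, H):
    """LP decoding over the relaxed codeword polytope: for each check j
    and each odd-sized subset V of its neighborhood N(j), add
    sum_{i in V} x_i - sum_{i in N(j)\V} x_i <= |V| - 1."""
    m, n = H.shape
    A, b = [], []
    for j in range(m):
        nb = np.flatnonzero(H[j])
        for size in range(1, len(nb) + 1, 2):        # odd subsets only
            for V in itertools.combinations(nb, size):
                row = np.zeros(n)
                row[list(V)] = 1.0
                rest = [i for i in nb if i not in V]
                row[rest] = -1.0
                A.append(row)
                b.append(len(V) - 1)
    res = linprog(llr, A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(0.0, 1.0)] * n)
    return res.x                                     # may be fractional

llr = np.array([-0.9, 0.2, 0.1, -0.4])               # example costs
print(np.round(lp_decode(llr, H), 3))
\end{verbatim}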
@ -590,7 +590,7 @@ The resulting formulation of the relaxed optimization problem is the following:%
\begin{itemize}
\item Why ADMM?
\item Adaptive linear programming?
\item How ADMM is adapted to LP decoding
\end{itemize}
@ -600,7 +600,7 @@ The resulting formulation of the relaxed optimization problem is the following:%
\label{sec:dec:Proximal Decoding}
Proximal decoding was proposed by Wadayama et al. as a novel formulation of
optimization-based decoding \cite{proximal_paper}.
With this algorithm, minimization is performed using the proximal gradient
method.
In contrast to \ac{LP} decoding, the objective function is based on a
@ -624,7 +624,7 @@ The likelihood $f_{\boldsymbol{Y} \mid \boldsymbol{X}}
\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ is a known function
determined by the channel model.
The prior \ac{PDF} $f_{\boldsymbol{X}}\left( \boldsymbol{x} \right)$ is also
known, since the equal probability assumption is made on
$\mathcal{C}\left( \boldsymbol{H} \right)$.
However, because in this case the considered domain is continuous,
the prior \ac{PDF} cannot be ignored as a constant during the minimization
@ -653,9 +653,9 @@ the so-called \textit{code-constraint polynomial} is introduced:%
The intention of this function is to provide a way to penalize vectors far
from any codeword and favor those close to one.
In order to achieve this, the polynomial is composed of two parts: one term
representing the bipolar constraint, providing for a discrete solution of the
continuous optimization problem, and one term representing the parity
constraint, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
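The exact polynomial is defined in \cite{proximal_paper}; as an illustration
of this two-part structure, the following Python sketch uses one plausible
instantiation, with a bipolar term $\sum_i (x_i^2 - 1)^2$ and a parity term
penalizing checks whose bipolar product deviates from one. This concrete form
is an assumption made for the sketch, not a quotation of the paper's
definition.
\begin{verbatim}
import numpy as np

# Hypothetical toy parity-check matrix
H = np.array([[1, 1, 0, 1],
              [0, 1, 1, 1]])

def h_poly(x, H):
    """Assumed two-part code-constraint polynomial: a bipolar term
    pushing each component towards +/-1, and a parity term pushing
    the product over each check's neighborhood towards +1."""
    bipolar = np.sum((x**2 - 1.0)**2)
    parity = 0.0
    for row in H:
        prod = np.prod(x[row == 1])  # product over the check's neighborhood
        parity += (prod - 1.0)**2
    return bipolar + parity

# Near the bipolar image of the codeword (1, 1, 1, 0): small penalty
x = np.array([-0.9, -1.1, -1.0, 0.95])
print(h_poly(x, H))
\end{verbatim}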
The prior \ac{PDF} is then approximated using the code-constraint polynomial:%
%
\begin{align}
@ -666,7 +666,7 @@ The prior \ac{PDF} is then approximated using the code-constraint polynomial:%
%
The authors justify this approximation by arguing that for
$\gamma \rightarrow \infty$, the approximation in equation
\ref{eq:prox:prior_pdf_approx} approaches the original function in equation
\ref{eq:prox:prior_pdf}.
This approximation can then be plugged into equation \ref{eq:prox:vanilla_MAP}
and the likelihood can be rewritten using the negative log-likelihood
@ -707,8 +707,8 @@ of \ref{eq:prox:objective_function} are considered separately:
the minimization of the objective function occurs in an alternating
fashion, switching between the negative log-likelihood
$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
Two helper variables, $\boldsymbol{r}$ and $\boldsymbol{s}$, are introduced,
describing the result of each of the two steps.
The first step, minimizing the log-likelihood, is performed using gradient
descent:%
@ -720,10 +720,10 @@ descent:%
\label{eq:prox:step_log_likelihood}
.\end{align}%
%
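As a minimal illustration of this step, assume an AWGN channel, for which the
gradient of the negative log-likelihood is proportional to
$\boldsymbol{x} - \boldsymbol{y}$ (consistent with the proportionality
argument made further below); the update then amounts to nudging the current
estimate towards the received vector. The step size name \texttt{omega} is an
assumption of this sketch.
\begin{verbatim}
import numpy as np

def grad_step_log_likelihood(x, y, omega):
    """One gradient-descent step on the negative log-likelihood.
    For an AWGN channel, grad L(y|x) is proportional to (x - y),
    so the step moves the estimate towards the received vector y."""
    return x - omega * (x - y)

y = np.array([0.8, -1.2, 0.9, 1.1])  # received values (example)
x = np.zeros_like(y)                 # current estimate
print(grad_step_log_likelihood(x, y, omega=0.5))
\end{verbatim}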
For the second step, minimizing the scaled code-constraint polynomial, the
proximal gradient method is used and the \textit{proximal operator} of
$\gamma h\left( \boldsymbol{x} \right) $ has to be computed.
It is then immediately approximated with gradient descent:%
%
\begin{align*}
\text{prox}_{\gamma h} \left( \boldsymbol{x} \right) &\equiv
@ -773,7 +773,7 @@ is%
%
Thus, the gradient of the negative log-likelihood becomes%
\footnote{For the minimization, constants can be disregarded. For this reason,
it suffices to consider only proportionality instead of equality.}%
%
\begin{align*}
\nabla L \left( \boldsymbol{y} \mid \boldsymbol{x} \right)
@ -791,7 +791,7 @@ Allowing equation \ref{eq:prox:step_log_likelihood} to be rewritten as%
One thing to consider during the actual decoding process is that the gradient
of the code-constraint polynomial can take on extremely large values.
To avoid numerical instability, an additional step is added in which all
components of the current estimate are clipped to $\left[-\eta, \eta \right]$,
where $\eta$ is a positive constant slightly larger than one:%
%
@ -803,7 +803,7 @@ where $\eta$ is a positive constant slightly larger than one:%
$\Pi_{\eta}\left( \cdot \right) $ expressing the projection onto
$\left[ -\eta, \eta \right]^n$.
The iterative decoding process resulting from these considerations is shown in
figure \ref{fig:prox:alg}.
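The complete loop can be sketched in a few lines of Python, combining the
log-likelihood step, the code-constraint step, and the projection
$\Pi_{\eta}$. The gradient below belongs to the polynomial form assumed in the
earlier sketch, and all parameter values ($\omega$, $\gamma$, $\eta$, the
iteration count) are illustrative choices rather than values from
\cite{proximal_paper}.
\begin{verbatim}
import numpy as np

# Hypothetical toy parity-check matrix
H = np.array([[1, 1, 0, 1],
              [0, 1, 1, 1]])

def grad_h(x, H):
    """Gradient of the assumed two-part code-constraint polynomial."""
    g = 4.0 * x * (x**2 - 1.0)                # bipolar term
    for row in H:
        nb = np.flatnonzero(row)
        prod = np.prod(x[nb])
        for i in nb:                          # parity term (product rule)
            others = np.prod(x[nb[nb != i]])
            g[i] += 2.0 * (prod - 1.0) * others
    return g

def proximal_decode(y, H, omega=0.05, gamma=5.0, eta=1.2, iters=200):
    """Alternating updates: gradient step on the negative
    log-likelihood (AWGN: gradient proportional to x - y), gradient
    step on gamma*h(x), then clipping to [-eta, eta]^n."""
    s = np.zeros_like(y)
    for _ in range(iters):
        r = s - omega * (s - y)               # log-likelihood step
        s = r - omega * gamma * grad_h(r, H)  # code-constraint step
        s = np.clip(s, -eta, eta)             # projection Pi_eta
    return (s < 0).astype(int)                # bipolar -> binary decision

y = np.array([-0.7, -1.3, -0.9, 1.2])         # noisy bipolar observation
print(proximal_decode(y, H))
\end{verbatim}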
\begin{figure}[H]