Added codeword polytope figure; minor other changes

Andreas Tsouchlos 2023-02-17 00:32:07 +01:00
parent 3ec2a3fb76
commit e3adafe2ff


@@ -147,8 +147,7 @@ which minimizes the objective function $f$ (as shown in figure \ref{fig:dec:spac
\textit{integer linear program} and subsequently presented a relaxation into
a \textit{linear program}, lifting the integer requirement.
The optimization method examined in this work for solving this problem is
\ac{ADMM}.
\todo{Why choose ADMM?}
Feldman et al.\ begin by looking at the \ac{ML} decoding problem%
@@ -164,11 +163,11 @@ They suggest that maximizing the likelihood
$f_{\boldsymbol{Y} \mid \boldsymbol{C}}\left( \boldsymbol{y} \mid \boldsymbol{c} \right)$
is equivalent to minimizing the negative log-likelihood.
\ldots (Explain arriving at the cost function from the ML decoding problem)
Based on this, they propose their cost function%
\footnote{In this context, \textit{cost function} and \textit{objective function}
have the same meaning.}
for the \ac{LP} decoding problem:%
%
\begin{align*}
@@ -177,11 +176,11 @@ for the \ac{LP} decoding problem:%
\frac{f_{\boldsymbol{Y} \mid \boldsymbol{C}}
\left( Y_i = y_i \mid C_i = 0 \right) }
{f_{\boldsymbol{Y} \mid \boldsymbol{C}}
\left( Y_i = y_i \mid C_i = 1 \right) } \right)
.\end{align*}
%
%
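As a brief worked example (the channel model and the symbol $\lambda_i$ for the
resulting cost coefficient are assumptions made here purely for illustration),
consider antipodal (BPSK) signaling over an additive white Gaussian noise channel
with noise variance $\sigma^2$, where $C_i = 0$ is transmitted as $+1$ and
$C_i = 1$ as $-1$. The log-likelihood ratio then reduces to a scaled channel output:
%
\begin{align*}
\lambda_i = \ln\left(
\frac{\exp\left( -\left( y_i - 1 \right)^{2} / \left( 2\sigma^{2} \right) \right) }
{\exp\left( -\left( y_i + 1 \right)^{2} / \left( 2\sigma^{2} \right) \right) } \right)
= \frac{2 y_i}{\sigma^{2}}
.\end{align*}
%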
With this cost function, the exact integer linear program formulation of \ac{ML}
decoding is the following:%
%
\begin{align*}
@@ -190,19 +189,75 @@ decoding is the following:%
.\end{align*}%
%
As solving integer linear programs is generally NP-hard, the decoding problem \todo{New \S?}
has to be approximated by one with looser constraints.
A technique called \textit{\ac{LP} Relaxation} is applied,
essentially removing the requirement for the components of $\boldsymbol{c}$
to be integer values.
In order to provide a formal definition of the relaxed constraints, the
authors go on to define the concept of the \textit{codeword polytope}
(figure \ref{fig:dec:poly}) as
being the convex hull of all possible codewords:
%
\begin{align*}
\text{poly}\left( \mathcal{C} \right) = \left\{
\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} \boldsymbol{c}
\text{ : } \lambda_{\boldsymbol{c}} \ge 0,
\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} = 1 \right\}
.\end{align*}
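For the code in figure \ref{fig:dec:poly}, for example, choosing
$\lambda_{\left( 0,0,0 \right)} = \lambda_{\left( 1,1,0 \right)} = \frac{1}{2}$ and
setting all other weights to zero yields the point
$\left( \frac{1}{2}, \frac{1}{2}, 0 \right)$, which lies in
$\text{poly}\left( \mathcal{C} \right)$ even though it is not a codeword.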
\begin{figure}[H]
\centering
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
draw, circle, inner sep=0pt, minimum size=4pt]
\tdplotsetmaincoords{60}{245}
\begin{tikzpicture}[scale=1, transform shape, tdplot_main_coords]
% Cube
\draw[dashed] (0, 0, 0) -- (2, 0, 0);
\draw[dashed] (2, 0, 0) -- (2, 0, 2);
\draw[] (2, 0, 2) -- (0, 0, 2);
\draw[] (0, 0, 2) -- (0, 0, 0);
\draw[] (0, 2, 0) -- (2, 2, 0);
\draw[] (2, 2, 0) -- (2, 2, 2);
\draw[] (2, 2, 2) -- (0, 2, 2);
\draw[] (0, 2, 2) -- (0, 2, 0);
\draw[] (0, 0, 0) -- (0, 2, 0);
\draw[dashed] (2, 0, 0) -- (2, 2, 0);
\draw[] (2, 0, 2) -- (2, 2, 2);
\draw[] (0, 0, 2) -- (0, 2, 2);
% Codeword Polytope
\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (2, 0, 2);
\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (2, 2, 0);
\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (0, 2, 2);
\draw[line width=1pt, color=KITblue] (2, 0, 2) -- (2, 2, 0);
\draw[line width=1pt, color=KITblue] (2, 0, 2) -- (0, 2, 2);
\draw[line width=1pt, color=KITblue] (0, 2, 2) -- (2, 2, 0);
% Polytope Annotations
\node[codeword] (c000) at (0, 0, 0) {};% {$\left( 0, 0, 0 \right) $};
\node[codeword] (c101) at (2, 0, 2) {};% {$\left( 1, 0, 1 \right) $};
\node[codeword] (c110) at (2, 2, 0) {};% {$\left( 1, 1, 0 \right) $};
\node[codeword] (c011) at (0, 2, 2) {};% {$\left( 0, 1, 1 \right) $};
\node[color=KITblue, right=0cm of c000] {$\left( 0, 0, 0 \right) $};
\node[color=KITblue, above=0cm of c101] {$\left( 1, 0, 1 \right) $};
\node[color=KITblue, left=0cm of c110] {$\left( 1, 1, 0 \right) $};
\node[color=KITblue, left=-0.1cm of c011] {$\left( 0, 1, 1 \right) $};
\end{tikzpicture}
\caption{Codeword polytope of a single parity-check code}
\label{fig:dec:poly}
\end{figure}
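For the code shown in figure \ref{fig:dec:poly} --- a single parity-check code of
length three with the codewords $\left( 0, 0, 0 \right)$, $\left( 0, 1, 1 \right)$,
$\left( 1, 0, 1 \right)$ and $\left( 1, 1, 0 \right)$ --- this convex hull can, for
instance, also be written explicitly through linear inequalities (given here only as
an illustrative example):
%
\begin{align*}
\text{poly}\left( \mathcal{C} \right) = \left\{ \boldsymbol{x} \in \left[ 0, 1 \right]^{3}
\text{ : } x_1 + x_2 + x_3 \le 2,\;
x_1 \le x_2 + x_3,\;
x_2 \le x_1 + x_3,\;
x_3 \le x_1 + x_2 \right\}
.\end{align*}
%
The relaxation thus replaces the four discrete codewords by the solid tetrahedron
drawn in blue.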
\begin{itemize}
\item Equivalent \ac{ML} optimization problem
@@ -254,7 +309,7 @@ continuous optimization problem, and one term representing the parity
constraint, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
%
The equal probability assumption is made on $\mathcal{C}\left( \boldsymbol{H} \right) $,
i.e., all codewords are assumed to be equally likely.
The prior \ac{PDF} is then approximated using the code-constraint polynomial:%
%
\begin{align}
f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) =
@@ -267,12 +322,13 @@ The prior \ac{PDF} is then approximated using the code-constraint polynomial\tod
%
The authors justify this approximation by arguing that for
$\gamma \rightarrow \infty$, the right-hand side approaches the left-hand
side. In equation \ref{eq:prox:vanilla_MAP}, the prior \ac{PDF}
$f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) $ can then be replaced by the
approximation in equation \ref{eq:prox:prior_pdf_approx}, and the likelihood
can be rewritten using the negative log-likelihood
$L \left( \boldsymbol{y} \mid \boldsymbol{x} \right) = -\ln\left(
f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left(
\boldsymbol{x} \mid \boldsymbol{y} \right) \right) $:%
%
\begin{align}
\hat{\boldsymbol{x}} &= \argmax_{\boldsymbol{x} \in \mathbb{R}^{n}}
@@ -283,7 +339,7 @@ $f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left( \boldsymbol{x} \mid \boldsymbol{y}
+ \gamma h\left( \boldsymbol{x} \right)
\right)%
\label{eq:prox:approx_map_problem}
.\end{align}%
%
Thus, with proximal decoding, the objective function
$f\left( \boldsymbol{x} \right)$ to be minimized is%
@@ -292,11 +348,11 @@ $f\left( \boldsymbol{x} \right)$ to be minimized is%
f\left( \boldsymbol{x} \right) = L\left( \boldsymbol{y} \mid \boldsymbol{x} \right)
+ \gamma h\left( \boldsymbol{x} \right)%
\label{eq:prox:objective_function}
.\end{align}
For the solution of the approximate \ac{MAP} decoding problem, the two parts
of equation \ref{eq:prox:approx_map_problem} are considered separately:
the minimization of the objective function occurs in an alternating
manner, switching between the minimization of the negative log-likelihood
$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
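Schematically, one iteration of this alternating scheme can be pictured as two
gradient steps (this is only an illustrative sketch: the step size $\omega$ is a
placeholder introduced here, and the precise update rules are formulated below):
%
\begin{align*}
\boldsymbol{r}^{(k)} &= \boldsymbol{x}^{(k)}
- \omega \nabla L\left( \boldsymbol{y} \mid \boldsymbol{x}^{(k)} \right) \\
\boldsymbol{x}^{(k+1)} &= \boldsymbol{r}^{(k)}
- \omega \gamma \nabla h\left( \boldsymbol{r}^{(k)} \right)
.\end{align*}
%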
@@ -332,7 +388,7 @@ The second step thus becomes \todo{Write the formulation optimization problem pr
\hspace{5mm}\gamma > 0,\text{ small}
.\end{align*}
%
While the approximation of the prior \ac{PDF} made in equation \ref{eq:prox:prior_pdf_approx}
theoretically becomes better
with larger $\gamma$, the constraint that $\gamma$ be small is important,
as it keeps the effect of $h\left( \boldsymbol{x} \right) $ on the landscape