Added codeword polytope figure; minor other changes

This commit is contained in:
Andreas Tsouchlos 2023-02-17 00:32:07 +01:00
parent 3ec2a3fb76
commit e3adafe2ff


@ -147,8 +147,7 @@ which minimizes the objective function $f$ (as shown in figure \ref{fig:dec:spac
\textit{integer linear program} and subsequently presented a relaxation into
a \textit{linear program}, lifting the integer requirement.
The optimization method examined in this work for solving this problem is
\ac{ADMM}.
\todo{Why choose ADMM?}
Feldman et al. begin by looking at the \ac{ML} decoding problem%
@ -164,11 +163,11 @@ They suggest that maximizing the likelihood
$f_{\boldsymbol{Y} \mid \boldsymbol{C}}\left( \boldsymbol{y} \mid \boldsymbol{c} \right)$
is equivalent to minimizing the negative log-likelihood.
\ldots (Explain arriving at the cost function from the ML decoding problem)
Based on this, they propose their cost function%
\footnote{In this context, \textit{cost function} and \textit{objective function}
have the same meaning.}
for the \ac{LP} decoding problem:%
%
\begin{align*}
@ -177,11 +176,11 @@ for the \ac{LP} decoding problem:%
\frac{f_{\boldsymbol{Y} | \boldsymbol{C}}
\left( Y_i = y_i \mid C_i = 0 \right) }
{f_{\boldsymbol{Y} | \boldsymbol{C}}
\left( Y_i = y_i | C_i = 1 \right) } \right)
.\end{align*}
%
%
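Concretely, each cost coefficient $\lambda_i$ is the log-likelihood ratio of the corresponding received value. The following is a minimal sketch, not from the text: it assumes BPSK mapping ($0 \mapsto +1$, $1 \mapsto -1$) over an AWGN channel with noise variance $\sigma^2$, under which the ratio of Gaussian densities collapses to the closed form $2 y_i / \sigma^2$:

```python
import numpy as np

def llr_awgn(y, sigma2):
    """Per-bit cost coefficients lambda_i = ln(f(y_i | c_i = 0) / f(y_i | c_i = 1)).

    Assumes BPSK (0 -> +1, 1 -> -1) over an AWGN channel with noise
    variance sigma2; the Gaussian densities cancel to 2 * y_i / sigma2.
    """
    return 2.0 * np.asarray(y, dtype=float) / sigma2

y = np.array([0.9, -1.1, 0.3])        # hypothetical received values
lam = llr_awgn(y, sigma2=0.5)
# A positive lambda_i favors c_i = 0, a negative one favors c_i = 1.
```
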
With this cost function, the exact integer linear program formulation of \ac{ML}
decoding is the following:%
%
\begin{align*}
@ -190,19 +189,75 @@ decoding is the following:%
.\end{align*}%
%
As solving integer linear programs is generally NP-hard, the decoding problem \todo{New \S?}
has to be approximated by one with looser constraints.
A technique called \textit{\ac{LP} Relaxation} is applied,
essentially removing the requirement for the components of $\boldsymbol{c}$
to be integer values.
In order to provide a formal definition of the relaxed constraints, the
authors go on to define the concept of the \textit{codeword polytope}
(figure \ref{fig:dec:poly}) as
being the convex hull of all possible codewords:
%
\begin{align*}
\text{poly}\left( \mathcal{C} \right) = \left\{
\sum_{c \in \mathcal{C}} \lambda_{\boldsymbol{c}} \boldsymbol{c}
\text{ : } \lambda_{\boldsymbol{c}} \ge 0,
\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} = 1 \right\}
.\end{align*}
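Because the objective is linear in $\boldsymbol{x}$, minimizing it over $\text{poly}\left( \mathcal{C} \right)$ can be written directly in the convex-combination variables $\lambda_{\boldsymbol{c}}$, and the optimum lands on a vertex, i.e.\ a codeword. A minimal sketch of this, where the single parity-check code and the cost vector are illustrative assumptions rather than taken from the text:

```python
import numpy as np
from scipy.optimize import linprog

# Codewords of a length-3 single parity-check code (illustrative example)
C = np.array([[0, 0, 0], [1, 0, 1], [1, 1, 0], [0, 1, 1]], dtype=float)
cost = np.array([2.0, -1.0, 0.5])     # hypothetical LLR cost vector

# min_x cost^T x over poly(C): write x = sum_c lambda_c * c with
# lambda_c >= 0 and sum_c lambda_c = 1. The objective is linear in lambda.
res = linprog(C @ cost,                          # cost of each vertex weight
              A_eq=np.ones((1, len(C))), b_eq=[1.0],
              bounds=[(0, None)] * len(C))
x_hat = res.x @ C     # the minimizer is a vertex of the polytope (a codeword)
```
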
\begin{figure}[H]
\centering
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
draw, circle, inner sep=0pt, minimum size=4pt]
\tdplotsetmaincoords{60}{245}
\begin{tikzpicture}[scale=1, transform shape, tdplot_main_coords]
% Cube
\draw[dashed] (0, 0, 0) -- (2, 0, 0);
\draw[dashed] (2, 0, 0) -- (2, 0, 2);
\draw[] (2, 0, 2) -- (0, 0, 2);
\draw[] (0, 0, 2) -- (0, 0, 0);
\draw[] (0, 2, 0) -- (2, 2, 0);
\draw[] (2, 2, 0) -- (2, 2, 2);
\draw[] (2, 2, 2) -- (0, 2, 2);
\draw[] (0, 2, 2) -- (0, 2, 0);
\draw[] (0, 0, 0) -- (0, 2, 0);
\draw[dashed] (2, 0, 0) -- (2, 2, 0);
\draw[] (2, 0, 2) -- (2, 2, 2);
\draw[] (0, 0, 2) -- (0, 2, 2);
% Codeword Polytope
\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (2, 0, 2);
\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (2, 2, 0);
\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (0, 2, 2);
\draw[line width=1pt, color=KITblue] (2, 0, 2) -- (2, 2, 0);
\draw[line width=1pt, color=KITblue] (2, 0, 2) -- (0, 2, 2);
\draw[line width=1pt, color=KITblue] (0, 2, 2) -- (2, 2, 0);
% Polytope Annotations
\node[codeword] (c000) at (0, 0, 0) {};% {$\left( 0, 0, 0 \right) $};
\node[codeword] (c101) at (2, 0, 2) {};% {$\left( 1, 0, 1 \right) $};
\node[codeword] (c110) at (2, 2, 0) {};% {$\left( 1, 1, 0 \right) $};
\node[codeword] (c011) at (0, 2, 2) {};% {$\left( 0, 1, 1 \right) $};
\node[color=KITblue, right=0cm of c000] {$\left( 0, 0, 0 \right) $};
\node[color=KITblue, above=0cm of c101] {$\left( 1, 0, 1 \right) $};
\node[color=KITblue, left=0cm of c110] {$\left( 1, 1, 0 \right) $};
\node[color=KITblue, left=-0.1cm of c011] {$\left( 0, 1, 1 \right) $};
\end{tikzpicture}
\caption{Codeword polytope of a single parity-check code}
\label{fig:dec:poly}
\end{figure}
\begin{itemize}
\item Equivalent \ac{ML} optimization problem
@ -254,7 +309,7 @@ continuous optimization problem, and one term representing the parity
constraint, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
%
The equal probability assumption is made on $\mathcal{C}\left( \boldsymbol{H} \right) $.
The prior \ac{PDF} is then approximated using the code-constraint polynomial:%
%
\begin{align}
f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) =
@ -267,12 +322,13 @@ The prior \ac{PDF} is then approximated using the code-constraint polynomial\tod
%
The authors justify this approximation by arguing that for
$\gamma \rightarrow \infty$, the right-hand side approaches the left-hand
side. In equation \ref{eq:prox:vanilla_MAP}, the prior \ac{PDF}
$f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) $ can then be substituted
using equation \ref{eq:prox:prior_pdf_approx}, and the likelihood can be rewritten using
the negative log-likelihood
$L \left( \boldsymbol{y} \mid \boldsymbol{x} \right) = -\ln\left(
f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left(
\boldsymbol{x} \mid \boldsymbol{y} \right) \right) $:%
%
\begin{align}
\hat{\boldsymbol{x}} &= \argmax_{\boldsymbol{x} \in \mathbb{R}^{n}}
@ -283,7 +339,7 @@ $f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left( \boldsymbol{x} \mid \boldsymbol{y}
+ \gamma h\left( \boldsymbol{x} \right)
\right)%
\label{eq:prox:approx_map_problem}
.\end{align}%
%
Thus, with proximal decoding, the objective function
$f\left( \boldsymbol{x} \right)$ to be minimized is%
@ -292,11 +348,11 @@ $f\left( \boldsymbol{x} \right)$ to be minimized is%
f\left( \boldsymbol{x} \right) = L\left( \boldsymbol{x} \mid \boldsymbol{y} \right)
+ \gamma h\left( \boldsymbol{x} \right)%
\label{eq:prox:objective_function}
.\end{align}
For the solution of the approximate \ac{MAP} decoding problem, the two parts
of equation \ref{eq:prox:approx_map_problem} are considered separately:
the minimization of the objective function occurs in an alternating
manner, switching between the minimization of the negative log-likelihood
$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
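This alternating scheme can be sketched as two interleaved gradient steps. Everything concrete below is an assumption made for illustration only: a Gaussian negative log-likelihood $\frac{1}{2}\lVert \boldsymbol{x} - \boldsymbol{y} \rVert^2$ and a placeholder bipolar-constraint polynomial $h\left( \boldsymbol{x} \right) = \sum_i \left( 1 - x_i^2 \right)^2$ stand in for the actual $L$ and the code-constraint polynomial:

```python
import numpy as np

def alternating_sketch(y, gamma=0.05, eta=0.1, iters=200):
    """Alternating-gradient sketch for min_x L(y|x) + gamma * h(x).

    Placeholder assumptions (not from the text): Gaussian negative
    log-likelihood L = 0.5 * ||x - y||^2 and a bipolar stand-in
    constraint polynomial h(x) = sum_i (1 - x_i^2)^2.
    """
    y = np.asarray(y, dtype=float)
    x = y.copy()
    for _ in range(iters):
        x -= eta * (x - y)                        # gradient step on L
        x -= eta * gamma * (-4.0 * x * (1.0 - x**2))  # step on gamma * h
    return x

x_hat = alternating_sketch([0.9, -1.2])
# The gamma * h term nudges the estimate toward bipolar values +/-1,
# while the likelihood term keeps it close to the observation y.
```
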
@ -332,7 +388,7 @@ The second step thus becomes \todo{Write the formulation optimization problem pr
\hspace{5mm}\gamma > 0,\text{ small}
.\end{align*}
%
While the approximation of the prior \ac{PDF} made in equation \ref{eq:prox:prior_pdf_approx}
theoretically becomes better
with larger $\gamma$, the constraint that $\gamma$ be small is important,
as it keeps the effect of $h\left( \boldsymbol{x} \right) $ on the landscape