Added codeword polytope figure; minor other changes
This commit is contained in:
parent
3ec2a3fb76
commit
e3adafe2ff
@ -147,8 +147,7 @@ which minimizes the objective function $f$ (as shown in figure \ref{fig:dec:spac
\textit{integer linear program} and subsequently presented a relaxation into
a \textit{linear program}, lifting the integer requirement.
The optimization method used to solve this problem that is examined in this
work is the \ac{ADMM}.
\todo{With or without 'the'?}
work is \ac{ADMM}.
\todo{Why choose ADMM?}

Feldman et al. begin by looking at the \ac{ML} decoding problem%
@ -164,11 +163,11 @@ They suggest that maximizing the likelihood
$f_{\boldsymbol{Y} \mid \boldsymbol{C}}\left( \boldsymbol{y} \mid \boldsymbol{c} \right)$
is equivalent to minimizing the negative log-likelihood.

\ldots (Explaing arriving at cost function from ML decoding problem)
\ldots (Explain arriving at the cost function from the ML decoding problem)

Based on this, they propose their cost function%
\footnote{In this context, \textit{cost function} and \textit{objective function}
mean the same thing.}
have the same meaning.}
for the \ac{LP} decoding problem:%
%
\begin{align*}
@ -177,11 +176,11 @@ for the \ac{LP} decoding problem:%
\frac{f_{\boldsymbol{Y} | \boldsymbol{C}}
\left( Y_i = y_i \mid C_i = 0 \right) }
{f_{\boldsymbol{Y} | \boldsymbol{C}}
\left( Y_i = y_i | C_i = 1 \right) } \right) \\
\left( Y_i = y_i | C_i = 1 \right) } \right)
.\end{align*}
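For concreteness, the cost coefficients $\lambda_i$ can be evaluated once a channel model is fixed; the sketch below assumes a binary symmetric channel with crossover probability $p$ (the channel choice and the function name are illustrative assumptions, not taken from Feldman et al.):

```python
import math

def llr_coefficients(y, p=0.1):
    """Cost coefficients lambda_i = log( P(y_i | c_i = 0) / P(y_i | c_i = 1) )
    for a binary symmetric channel with crossover probability p (illustrative)."""
    coeffs = []
    for yi in y:
        p0 = (1 - p) if yi == 0 else p   # P(y_i | c_i = 0)
        p1 = p if yi == 0 else (1 - p)   # P(y_i | c_i = 1)
        coeffs.append(math.log(p0 / p1))
    return coeffs

# A received 0 yields a positive coefficient (bit 0 is cheaper), a received 1 a negative one.
print(llr_coefficients([0, 1, 0]))
```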
%
%
The exact integer linear program \todo{ILP acronym?} formulation of \ac{ML}
With this cost function, the exact integer linear program formulation of \ac{ML}
decoding is the following:%
%
\begin{align*}
@ -190,19 +189,75 @@ decoding is the following:%
.\end{align*}%
%

\ldots (LP Relaxation)
As solving integer linear programs is generally NP-hard, the decoding problem \todo{New \S?}
has to be approximated by one with looser constraints.
A technique called \textit{\ac{LP} Relaxation} is applied,
essentially removing the requirement for the components of $\boldsymbol{c}$
to be integer values.
In order to provide a formal definition of the relaxed constraints, the
authors go on to define the concept of the \textit{codeword polytope}
(figure \ref{fig:dec:poly}) as
being the convex hull of all possible codewords:
%
\begin{align*}
\text{poly}\left( \mathcal{C} \right) = \left\{
\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} \boldsymbol{c}
\text{ : } \lambda_{\boldsymbol{c}} \ge 0,
\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} = 1 \right\}
.\end{align*}

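As an illustration of this definition, the following sketch builds points of $\text{poly}\left( \mathcal{C} \right)$ for the length-3 single parity-check code of figure \ref{fig:dec:poly} as convex combinations of the codewords, and shows why a linear cost over the polytope is minimized at a codeword vertex (the cost values are hypothetical; the brute-force vertex search is an illustration, not a decoding algorithm):

```python
import itertools

# Codewords of the length-3 single parity-check code (even weight),
# i.e. the vertices of the codeword polytope: 000, 011, 101, 110.
codewords = [c for c in itertools.product([0, 1], repeat=3) if sum(c) % 2 == 0]

def convex_combination(lambdas, vertices):
    """A point of poly(C): sum_c lambda_c * c with lambda_c >= 0, sum lambda_c = 1."""
    assert all(l >= 0 for l in lambdas) and abs(sum(lambdas) - 1) < 1e-9
    return tuple(sum(l * v[i] for l, v in zip(lambdas, vertices))
                 for i in range(len(vertices[0])))

# The midpoint of two codewords lies inside the polytope:
print(convex_combination([0.5, 0.5, 0.0, 0.0], codewords))

# A linear objective over a polytope attains its minimum at a vertex, so over
# the exact codeword polytope, LP decoding recovers a codeword:
cost = [2.2, -1.9, 0.4]  # example lambda_i values (hypothetical)
best = min(codewords, key=lambda c: sum(g * ci for g, ci in zip(cost, c)))
print(best)
```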
\begin{figure}[H]
\centering

\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
draw, circle, inner sep=0pt, minimum size=4pt]

\tdplotsetmaincoords{60}{245}
\begin{tikzpicture}[scale=1, transform shape, tdplot_main_coords]
% Cube

\draw[dashed] (0, 0, 0) -- (2, 0, 0);
\draw[dashed] (2, 0, 0) -- (2, 0, 2);
\draw[] (2, 0, 2) -- (0, 0, 2);
\draw[] (0, 0, 2) -- (0, 0, 0);

\draw[] (0, 2, 0) -- (2, 2, 0);
\draw[] (2, 2, 0) -- (2, 2, 2);
\draw[] (2, 2, 2) -- (0, 2, 2);
\draw[] (0, 2, 2) -- (0, 2, 0);

\draw[] (0, 0, 0) -- (0, 2, 0);
\draw[dashed] (2, 0, 0) -- (2, 2, 0);
\draw[] (2, 0, 2) -- (2, 2, 2);
\draw[] (0, 0, 2) -- (0, 2, 2);

% Codeword Polytope

\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (2, 0, 2);
\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (2, 2, 0);
\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (0, 2, 2);

\draw[line width=1pt, color=KITblue] (2, 0, 2) -- (2, 2, 0);
\draw[line width=1pt, color=KITblue] (2, 0, 2) -- (0, 2, 2);

\draw[line width=1pt, color=KITblue] (0, 2, 2) -- (2, 2, 0);

% Polytope Annotations

\node[codeword] (c000) at (0, 0, 0) {};% {$\left( 0, 0, 0 \right) $};
\node[codeword] (c101) at (2, 0, 2) {};% {$\left( 1, 0, 1 \right) $};
\node[codeword] (c110) at (2, 2, 0) {};% {$\left( 1, 1, 0 \right) $};
\node[codeword] (c011) at (0, 2, 2) {};% {$\left( 0, 1, 1 \right) $};

\node[color=KITblue, right=0cm of c000] {$\left( 0, 0, 0 \right) $};
\node[color=KITblue, above=0cm of c101] {$\left( 1, 0, 1 \right) $};
\node[color=KITblue, left=0cm of c110] {$\left( 1, 1, 0 \right) $};
\node[color=KITblue, left=-0.1cm of c011] {$\left( 0, 1, 1 \right) $};
\end{tikzpicture}

\caption{Codeword polytope of a single parity-check code}
\label{fig:dec:poly}
\end{figure}

\begin{itemize}
\item Equivalent \ac{ML} optimization problem
@ -254,7 +309,7 @@ continuous optimization problem, and one term representing the parity
constraint, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
%
The equal probability assumption is made on $\mathcal{C}\left( \boldsymbol{H} \right) $.
The prior \ac{PDF} is then approximated using the code-constraint polynomial\todo{Italic?}:%
The prior \ac{PDF} is then approximated using the code-constraint polynomial:%
%
\begin{align}
f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) =
@ -267,12 +322,13 @@ The prior \ac{PDF} is then approximated using the code-constraint polynomial\tod
%
The authors justify this approximation by arguing that for
$\gamma \rightarrow \infty$, the right-hand side approaches the left-hand
side. In \ref{eq:prox:vanilla_MAP} the prior \ac{PDF}
side. In equation \ref{eq:prox:vanilla_MAP}, the prior \ac{PDF}
$f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) $ can then be substituted
for \ref{eq:prox:prior_pdf_approx} and the likelihood can be rewritten using
for equation \ref{eq:prox:prior_pdf_approx} and the likelihood can be rewritten using
the negative log-likelihood
$f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left( \boldsymbol{x} \mid \boldsymbol{y} \right)
= e^{- L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) }$:%
$L \left( \boldsymbol{y} \mid \boldsymbol{x} \right) = -\ln\left(
f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left(
\boldsymbol{x} \mid \boldsymbol{y} \right) \right) $:%
%
\begin{align}
\hat{\boldsymbol{x}} &= \argmax_{\boldsymbol{x} \in \mathbb{R}^{n}}
@ -283,7 +339,7 @@ $f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left( \boldsymbol{x} \mid \boldsymbol{y}
+ \gamma h\left( \boldsymbol{x} \right)
\right)%
\label{eq:prox:approx_map_problem}
\end{align}%
.\end{align}%
%
Thus, with proximal decoding, the objective function
$f\left( \boldsymbol{x} \right)$ to be minimized is%
@ -292,11 +348,11 @@ $f\left( \boldsymbol{x} \right)$ to be minimized is%
f\left( \boldsymbol{x} \right) = L\left( \boldsymbol{y} \mid \boldsymbol{x} \right)
+ \gamma h\left( \boldsymbol{x} \right)%
\label{eq:prox:objective_function}
.\end{align}\todo{Dot after equations?}
.\end{align}

For the solution of the approximate \ac{MAP} decoding problem, the two parts
of \ref{eq:prox:approx_map_problem} are considered separately from one
another: the minimization of the objective function occurs in an alternating
of equation \ref{eq:prox:approx_map_problem} are considered separately:
the minimization of the objective function occurs in an alternating
manner, switching between the minimization of the negative log-likelihood
$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
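The alternating scheme can be sketched as a pair of interleaved gradient steps; the step size, iteration count, and the quadratic stand-ins for $L$ and $h$ below are illustrative assumptions, not the decoder's actual update rules:

```python
import numpy as np

def alternating_minimization(grad_L, grad_h, x0, gamma=0.05, step=0.1, iters=100):
    """Illustrative sketch: alternate one gradient step on the negative
    log-likelihood L(y|x) with one on the scaled code-constraint term
    gamma * h(x). grad_L and grad_h are assumed callables."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - step * grad_L(x)           # descend on L(y | x)
        x = x - step * gamma * grad_h(x)   # descend on gamma * h(x)
    return x

# Toy usage with quadratic stand-ins (hypothetical): L = ||x - y||^2 / 2,
# h = ||x||^2, so grad_L(x) = x - y and grad_h(x) = 2x.
y = np.array([0.9, -1.1, 0.8])
x_hat = alternating_minimization(lambda x: x - y, lambda x: 2 * x, np.zeros(3))
print(x_hat)
```

The iterate settles near the minimizer of the combined objective, pulled slightly toward the constraint term's minimum, which mirrors how $\gamma$ trades off likelihood against the code constraint.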
@ -332,7 +388,7 @@ The second step thus becomes \todo{Write the formulation optimization problem pr
\hspace{5mm}\gamma > 0,\text{ small}
.\end{align*}
%
While the approximatin of the prior \ac{PDF} made in \ref{eq:prox:prior_pdf_approx}
While the approximation of the prior \ac{PDF} made in \ref{eq:prox:prior_pdf_approx}
theoretically becomes better
with larger $\gamma$, the constraint that $\gamma$ be small is important,
as it keeps the effect of $h\left( \boldsymbol{x} \right) $ on the landscape