diff --git a/latex/thesis/chapters/decoding_techniques.tex b/latex/thesis/chapters/decoding_techniques.tex
index 429af0e..575b455 100644
--- a/latex/thesis/chapters/decoding_techniques.tex
+++ b/latex/thesis/chapters/decoding_techniques.tex
@@ -147,8 +147,7 @@ which minimizes the objective function $f$ (as shown in figure \ref{fig:dec:spac
 \textit{integer linear program} and subsequently presented a relaxation into a
 \textit{linear program}, lifting the integer requirement.
 The optimization method used to solve this problem that is examined in this
-work is the \ac{ADMM}.
-\todo{With or without 'the'?}
+work is \ac{ADMM}.
 \todo{Why choose ADMM?}
 
 Feldman et al. begin by looking at the \ac{ML} decoding problem%
@@ -164,11 +163,11 @@ They suggest that maximizing the likelihood
 $f_{\boldsymbol{Y} \mid \boldsymbol{C}}\left( \boldsymbol{y} \mid
 \boldsymbol{c} \right)$ is equivalent to minimizing the negative
 log-likelihood.
-\ldots (Explaing arriving at cost function from ML decoding problem)
+\ldots (Explain arriving at the cost function from the ML decoding problem)
 
 Based on this, they propose their cost function%
 \footnote{In this context, \textit{cost function} and \textit{objective function}
-mean the same thing.}
+have the same meaning.}
 for the \ac{LP} decoding problem:%
 %
 \begin{align*}
@@ -177,11 +176,11 @@ for the \ac{LP} decoding problem:%
 		\frac{f_{\boldsymbol{Y} | \boldsymbol{C}}
 			\left( Y_i = y_i \mid C_i = 0 \right) }
 		{f_{\boldsymbol{Y} | \boldsymbol{C}}
-			\left( Y_i = y_i | C_i = 1 \right) } \right) \\
+			\left( Y_i = y_i \mid C_i = 1 \right) } \right)
 .\end{align*}
 %
 %
-The exact integer linear program \todo{ILP acronym?} formulation of \ac{ML}
+With this cost function, the exact integer linear program formulation of \ac{ML}
 decoding is the following:%
 %
 \begin{align*}
@@ -190,19 +189,75 @@ decoding is the following:%
 .\end{align*}%
 %
 
-\ldots (LP Relaxation)
+As solving integer linear programs is generally NP-hard, the decoding problem \todo{New \S?}
+has to be approximated by one with looser constraints.
+A technique called \textit{\ac{LP} Relaxation} is applied,
+essentially removing the requirement for the components of $\boldsymbol{c}$
+to be integer values.
+In order to provide a formal definition of the relaxed constraints, the
+authors go on to define the concept of the \textit{codeword polytope}
+(figure \ref{fig:dec:poly}) as
+the convex hull of all possible codewords:
+%
+\begin{align*}
+	\text{poly}\left( \mathcal{C} \right) = \left\{
+		\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} \boldsymbol{c}
+		\text{ : } \lambda_{\boldsymbol{c}} \ge 0,
+		\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} = 1 \right\}
+.\end{align*}
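+For the length-three single parity-check code of
+figure \ref{fig:dec:poly}, for instance, the convex combination
+$\frac{1}{2} \left( 1, 1, 0 \right) + \frac{1}{2} \left( 1, 0, 1 \right)
+= \left( 1, \frac{1}{2}, \frac{1}{2} \right) $ lies in
+$\text{poly}\left( \mathcal{C} \right) $ without being a codeword itself.
+Since the cost function is linear in $\boldsymbol{x}$, however, its minimum
+over $\text{poly}\left( \mathcal{C} \right) $ is always attained at a
+vertex, and the vertices of the codeword polytope are exactly the
+codewords; writing $\gamma_i$ for the coefficients of the cost function
+defined above, minimizing over the polytope would thus still yield an
+\ac{ML} codeword:%
+%
+\begin{align*}
+	\min_{\boldsymbol{x} \in \text{poly}\left( \mathcal{C} \right) }
+	\sum_{i=1}^{n} \gamma_i x_i =
+	\min_{\boldsymbol{c} \in \mathcal{C}}
+	\sum_{i=1}^{n} \gamma_i c_i
+.\end{align*}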
 
-%They go on to define the constraints under which this minimization is to be
-%accomplished.
-%They define the concept of the \textit{codeword polytope} as a linear
-%combination of all possible codewords, forming their convex hull:%
-%%
-%\begin{align*}
-%	\text{poly}\left( \mathcal{C} \right) = \left\{
-%		\sum_{c \in \mathcal{C}} \lambda_{\boldsymbol{c}} \boldsymbol{c}
-%		\text{ : } \lambda_{\boldsymbol{c}} \ge 0,
-%		\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} = 1 \right\}
-%.\end{align*}
+\begin{figure}[H]
+	\centering
+
+	\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
+		draw, circle, inner sep=0pt, minimum size=4pt]
+
+	\tdplotsetmaincoords{60}{245}
+	\begin{tikzpicture}[scale=1, transform shape, tdplot_main_coords]
+		% Cube
+
+		\draw[dashed] (0, 0, 0) -- (2, 0, 0);
+		\draw[dashed] (2, 0, 0) -- (2, 0, 2);
+		\draw[] (2, 0, 2) -- (0, 0, 2);
+		\draw[] (0, 0, 2) -- (0, 0, 0);
+
+		\draw[] (0, 2, 0) -- (2, 2, 0);
+		\draw[] (2, 2, 0) -- (2, 2, 2);
+		\draw[] (2, 2, 2) -- (0, 2, 2);
+		\draw[] (0, 2, 2) -- (0, 2, 0);
+
+		\draw[] (0, 0, 0) -- (0, 2, 0);
+		\draw[dashed] (2, 0, 0) -- (2, 2, 0);
+		\draw[] (2, 0, 2) -- (2, 2, 2);
+		\draw[] (0, 0, 2) -- (0, 2, 2);
+
+		% Codeword Polytope
+
+		\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (2, 0, 2);
+		\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (2, 2, 0);
+		\draw[line width=1pt, color=KITblue] (0, 0, 0) -- (0, 2, 2);
+
+		\draw[line width=1pt, color=KITblue] (2, 0, 2) -- (2, 2, 0);
+		\draw[line width=1pt, color=KITblue] (2, 0, 2) -- (0, 2, 2);
+
+		\draw[line width=1pt, color=KITblue] (0, 2, 2) -- (2, 2, 0);
+
+		% Polytope Annotations
+
+		\node[codeword] (c000) at (0, 0, 0) {};% {$\left( 0, 0, 0 \right) $};
+		\node[codeword] (c101) at (2, 0, 2) {};% {$\left( 1, 0, 1 \right) $};
+		\node[codeword] (c110) at (2, 2, 0) {};% {$\left( 1, 1, 0 \right) $};
+		\node[codeword] (c011) at (0, 2, 2) {};% {$\left( 0, 1, 1 \right) $};
+
+		\node[color=KITblue, right=0cm of c000] {$\left( 0, 0, 0 \right) $};
+		\node[color=KITblue, above=0cm of c101] {$\left( 1, 0, 1 \right) $};
+		\node[color=KITblue, left=0cm of c110] {$\left( 1, 1, 0 \right) $};
+		\node[color=KITblue, left=-0.1cm of c011] {$\left( 0, 1, 1 \right) $};
+	\end{tikzpicture}
+
+	\caption{Codeword polytope of a length-three single parity-check code}
+	\label{fig:dec:poly}
+\end{figure}
 
 \begin{itemize}
 	\item Equivalent \ac{ML} optimization problem
@@ -254,7 +309,7 @@ continuous optimization problem, and one term representing the parity
 constraint, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
 %
 All codewords in $\mathcal{C}\left( \boldsymbol{H} \right) $ are assumed to be equally probable.
-The prior \ac{PDF} is then approximated using the code-constraint polynomial\todo{Italic?}:%
+The prior \ac{PDF} is then approximated using the code-constraint polynomial:%
 %
 \begin{align}
 	f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) =
@@ -267,12 +322,13 @@ The prior \ac{PDF} is then approximated using the code-constraint polynomial\tod
 %
 The authors justify this approximation by arguing that for
 $\gamma \rightarrow \infty$, the right-hand side approaches the left-hand
-side. In \ref{eq:prox:vanilla_MAP} the prior \ac{PDF}
+side. In equation \ref{eq:prox:vanilla_MAP}, the prior \ac{PDF}
 $f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) $ can then be substituted
-for \ref{eq:prox:prior_pdf_approx} and the likelihood can be rewritten using
+by equation \ref{eq:prox:prior_pdf_approx}, and the likelihood can be rewritten using
 the negative log-likelihood
-$f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left( \boldsymbol{x} \mid \boldsymbol{y} \right)
-	= e^{- L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) }$:%
+$L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) = -\ln\left(
+	f_{\boldsymbol{Y} \mid \boldsymbol{X}}\left(
+	\boldsymbol{y} \mid \boldsymbol{x} \right) \right) $:%
 %
 \begin{align}
 	\hat{\boldsymbol{x}} &= \argmax_{\boldsymbol{x} \in \mathbb{R}^{n}}
@@ -283,7 +339,7 @@ $f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left( \boldsymbol{x} \mid \boldsymbol{y}
 		+ \gamma h\left( \boldsymbol{x} \right) \right)%
 	\label{eq:prox:approx_map_problem}
-\end{align}%
+.\end{align}%
 %
 Thus, with proximal decoding, the objective function
 $f\left( \boldsymbol{x} \right)$ to be minimized is%
@@ -292,11 +348,11 @@ $f\left( \boldsymbol{x} \right)$ to be minimized is%
 	f\left( \boldsymbol{x} \right) =
 	L\left( \boldsymbol{y} \mid \boldsymbol{x} \right)
 	+ \gamma h\left( \boldsymbol{x} \right)%
 	\label{eq:prox:objective_function}
-.\end{align}\todo{Dot after equations?}
+.\end{align}
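+
+If, for instance, transmission over an additive white Gaussian noise
+channel with noise variance $\sigma^2$ is assumed, so that
+$\boldsymbol{y} = \boldsymbol{x} + \boldsymbol{n}$ with
+$\boldsymbol{n} \sim \mathcal{N}\left( \boldsymbol{0}, \sigma^2 \boldsymbol{I} \right) $,
+the negative log-likelihood takes the simple quadratic form%
+%
+\begin{align*}
+	L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) =
+	\frac{1}{2 \sigma^2}
+	\left\lVert \boldsymbol{y} - \boldsymbol{x} \right\rVert_2^2
+	+ \text{const}
+.\end{align*}
+%
+Minimizing $L$ alone would simply return $\boldsymbol{x} = \boldsymbol{y}$;
+it is the code-constraint term $\gamma h\left( \boldsymbol{x} \right) $
+that pulls the estimate towards valid codewords.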
 
 For the solution of the approximate \ac{MAP} decoding problem, the two parts
-of \ref{eq:prox:approx_map_problem} are considered separately from one
-another: the minimization of the objective function occurs in an alternating
+of equation \ref{eq:prox:approx_map_problem} are considered separately:
+the minimization of the objective function occurs in an alternating
 manner, switching between the minimization of the negative log-likelihood
 $L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
 code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
@@ -332,7 +388,7 @@ The second step thus becomes \todo{Write the formulation optimization problem pr
 \hspace{5mm}\gamma > 0,\text{ small}
 .\end{align*}
 %
-While the approximatin of the prior \ac{PDF} made in \ref{eq:prox:prior_pdf_approx}
+While the approximation of the prior \ac{PDF} made in equation \ref{eq:prox:prior_pdf_approx}
 theoretically becomes better with larger $\gamma$, the constraint that
 $\gamma$ be small is important, as it keeps the effect of
 $h\left( \boldsymbol{x} \right) $ on the landscape