\chapter{\acs{LP} Decoding using \acs{ADMM}}%
|
|
\label{chapter:lp_dec_using_admm}
|
|
|
|
TODO
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{LP Decoding}%
|
|
\label{sec:lp:LP Decoding}
|
|
|
|
\Ac{LP} decoding was introduced by Feldman et al.\ \cite{feldman_paper}.
They reframe the decoding problem as an
\textit{integer linear program} and subsequently present two relaxations into
\textit{linear programs}: one is an \ac{LP} formulation of exact
\ac{ML} decoding, the other is an approximation with a more manageable
representation.
Various optimization methods can be used to solve the resulting linear program
(see, for example, \cite{alp}, \cite{interior_point},
\cite{efficient_lp_dec_admm}, \cite{pdd}).
|
|
|
|
Feldman et al. begin by looking at the \ac{ML} decoding problem%
|
|
\footnote{They assume that all codewords are equally likely to be transmitted,
|
|
making the \ac{ML} and \ac{MAP} decoding problems equivalent.}%
|
|
%
|
|
\begin{align}
|
|
\hat{\boldsymbol{c}}_{\text{\ac{ML}}} = \argmax_{\boldsymbol{c} \in \mathcal{C}}
|
|
f_{\boldsymbol{Y} \mid \boldsymbol{C}}
|
|
\left( \boldsymbol{y} \mid \boldsymbol{c} \right)%
|
|
\label{eq:lp:ml}
|
|
.\end{align}%
|
|
%
|
|
Assuming a memoryless channel, equation (\ref{eq:lp:ml}) can be rewritten in terms
|
|
of the \acp{LLR} $\gamma_i$ \cite[Sec. 2.5]{feldman_thesis}:%
|
|
%
|
|
\begin{align*}
|
|
\hat{\boldsymbol{c}}_{\text{\ac{ML}}} = \argmin_{\boldsymbol{c}\in\mathcal{C}}
|
|
\sum_{i=1}^{n} \gamma_i c_i,%
|
|
    \hspace{5mm} \gamma_i = \ln\left(
    \frac{f_{Y_i \mid C_i} \left( y_i \mid c_i = 0 \right) }
    {f_{Y_i \mid C_i} \left( y_i \mid c_i = 1 \right) } \right)
|
|
.\end{align*}
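%
As an illustration, and purely as an assumption made here for concreteness
(it is not part of the cited derivation), consider BPSK transmission with the
mapping $c_i = 0 \mapsto +1$, $c_i = 1 \mapsto -1$ over an AWGN channel with
noise variance $\sigma^2$.
The Gaussian normalization factors cancel in the ratio and the \acp{LLR} reduce to%
%
\begin{align*}
    \gamma_i = \ln\left(
    \frac{\exp\left( -\frac{\left( y_i - 1 \right)^2}{2\sigma^2} \right)}
    {\exp\left( -\frac{\left( y_i + 1 \right)^2}{2\sigma^2} \right)} \right)
    = \frac{\left( y_i + 1 \right)^2 - \left( y_i - 1 \right)^2}{2\sigma^2}
    = \frac{2 y_i}{\sigma^2}
.\end{align*}
%
A positive $\gamma_i$ thus favors $c_i = 0$, which is consistent with the
minimization of $\sum_{i=1}^{n} \gamma_i c_i$.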
|
|
%
|
|
The authors propose using the following cost function%
|
|
\footnote{In this context, \textit{cost function} and \textit{objective function}
|
|
have the same meaning.}
|
|
for the \ac{LP} decoding problem:%
|
|
%
|
|
\begin{align*}
|
|
g\left( \boldsymbol{c} \right) = \sum_{i=1}^{n} \gamma_i c_i
|
|
= \boldsymbol{\gamma}^\text{T}\boldsymbol{c}
|
|
.\end{align*}
|
|
%
|
|
With this cost function, the exact integer linear program formulation of \ac{ML}
|
|
decoding becomes%
|
|
%
|
|
\begin{align*}
|
|
\text{minimize }\hspace{2mm} & \boldsymbol{\gamma}^\text{T}\boldsymbol{c} \\
|
|
\text{subject to }\hspace{2mm} &\boldsymbol{c} \in \mathcal{C}
|
|
.\end{align*}%
|
|
%
|
|
%\todo{$\boldsymbol{c}$ or some other variable name? e.g. $\boldsymbol{c}^{*}$.
|
|
%Especially for the continuous variable in LP decoding}
|
|
|
|
As solving integer linear programs is NP-hard in general, this decoding problem
is approximated by a problem with looser constraints.
The technique applied is called \textit{relaxation}:
the constraints are loosened, thereby enlarging the feasible set
(e.g., by lifting the integer requirement).
First, the authors present an equivalent \ac{LP} formulation of exact \ac{ML}
decoding, redefining the constraints in terms of the \textit{codeword polytope}
|
|
%
|
|
\begin{align*}
|
|
\text{poly}\left( \mathcal{C} \right) = \left\{
|
|
\sum_{\boldsymbol{c} \in \mathcal{C}} \alpha_{\boldsymbol{c}} \boldsymbol{c}
|
|
\text{ : } \alpha_{\boldsymbol{c}} \ge 0,
|
|
\sum_{\boldsymbol{c} \in \mathcal{C}} \alpha_{\boldsymbol{c}} = 1 \right\}
|
|
,\end{align*} %
|
|
%
|
|
which is the \textit{convex hull} of all codewords,
i.e., the set of all convex combinations of codewords.
This corresponds to simply lifting the integer requirement.
|
|
However, since the number of constraints needed to characterize the codeword
|
|
polytope is exponential in the code length, this formulation is relaxed further.
|
|
By observing that each check node defines its own local single parity-check
|
|
code, and, thus, its own \textit{local codeword polytope},
|
|
the \textit{relaxed codeword polytope} $\overline{Q}$ is defined as the intersection of all
|
|
local codeword polytopes.
|
|
This consideration leads to constraints that can be described as follows
|
|
\cite[Sec. II, A]{efficient_lp_dec_admm}:%
|
|
%
|
|
\begin{align*}
|
|
\boldsymbol{T}_j \tilde{\boldsymbol{c}} \in \mathcal{P}_{d_j}
|
|
\hspace{5mm}\forall j\in \mathcal{J}
|
|
,\end{align*}%
|
|
%
|
|
where $\mathcal{P}_{d_j}$ is the \textit{check polytope}, i.e., the convex hull of all
|
|
binary vectors of length $d_j$ with even parity%
|
|
\footnote{Essentially $\mathcal{P}_{d_j}$ is the set of vectors that satisfy
|
|
parity-check $j$, but extended to the continuous domain.},
|
|
and $\boldsymbol{T}_j$ is the \textit{transfer matrix}, which selects the
|
|
neighboring variable nodes
|
|
of check node $j$ (i.e., the relevant components of $\tilde{\boldsymbol{c}}$
|
|
for parity-check $j$).
|
|
For example, if the $j$th row of the parity-check matrix
$\boldsymbol{H}$ is $\boldsymbol{h}_j =
\begin{bmatrix} 0 & 1 & 0 & 1 & 0 & 1 & 0 \end{bmatrix}$,
the transfer matrix is \cite[Sec. II, A]{efficient_lp_dec_admm}
|
|
%
|
|
\begin{align*}
|
|
\boldsymbol{T}_j =
|
|
\begin{bmatrix}
|
|
0 & 1 & 0 & 0 & 0 & 0 & 0 \\
|
|
0 & 0 & 0 & 1 & 0 & 0 & 0 \\
|
|
0 & 0 & 0 & 0 & 0 & 1 & 0 \\
|
|
\end{bmatrix}
|
|
.\end{align*}%
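%
To make the selection explicit, the following worked product (added here as an
illustration, using a generic vector $\tilde{\boldsymbol{c}}$ of length $7$)
shows that this transfer matrix picks out the second, fourth and sixth component:%
%
\begin{align*}
    \boldsymbol{T}_j \tilde{\boldsymbol{c}} =
    \begin{bmatrix}
        0 & 1 & 0 & 0 & 0 & 0 & 0 \\
        0 & 0 & 0 & 1 & 0 & 0 & 0 \\
        0 & 0 & 0 & 0 & 0 & 1 & 0 \\
    \end{bmatrix}
    \begin{bmatrix}
        \tilde{c}_1 & \tilde{c}_2 & \tilde{c}_3 & \tilde{c}_4 &
        \tilde{c}_5 & \tilde{c}_6 & \tilde{c}_7
    \end{bmatrix}^\text{T}
    =
    \begin{bmatrix}
        \tilde{c}_2 & \tilde{c}_4 & \tilde{c}_6
    \end{bmatrix}^\text{T}
.\end{align*}
%
In this case $d_j = 3$, and the constraint requires the selected
three-component vector to lie in $\mathcal{P}_{3}$.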
|
|
%
|
|
|
|
In figure \ref{fig:lp:poly}, the two relaxations are compared for an
exemplary code, which is described by the generator and parity-check matrices%
|
|
%
|
|
\begin{align}
|
|
\boldsymbol{G} =
|
|
\begin{bmatrix}
|
|
0 & 1 & 1
|
|
\end{bmatrix} \label{eq:lp:example_code_def_gen} \\[1em]
|
|
\boldsymbol{H} =
|
|
\begin{bmatrix}
|
|
1 & 1 & 1\\
|
|
0 & 1 & 1
|
|
\end{bmatrix} \label{eq:lp:example_code_def_par}
|
|
\end{align}%
|
|
%
|
|
and has only two possible codewords:
|
|
%
|
|
\begin{align*}
|
|
\mathcal{C} = \left\{ \begin{bmatrix} 0 & 0 & 0 \end{bmatrix},
|
|
\begin{bmatrix} 0 & 1 & 1 \end{bmatrix} \right\}
|
|
.\end{align*}
|
|
%
|
|
Figure \ref{fig:lp:poly:exact_ilp} shows the domain of exact \ac{ML} decoding.
|
|
The first relaxation onto the codeword polytope $\text{poly}\left( \mathcal{C} \right) $
is shown in figure \ref{fig:lp:poly:exact};
it describes the feasible set of the linear program that is equivalent to exact \ac{ML} decoding.
$\text{poly}\left( \mathcal{C} \right) $ is further relaxed onto the relaxed codeword polytope
$\overline{Q}$, shown in figure \ref{fig:lp:poly:relaxed}.
|
|
Figure \ref{fig:lp:poly:local} shows how $\overline{Q}$ is formed by intersecting the
|
|
local codeword polytopes of each check node.
|
|
%
|
|
%
|
|
%
|
|
% Codeword polytope visualization figure
|
|
%
|
|
%
|
|
\begin{figure}[H]
|
|
\centering
|
|
|
|
%
|
|
% Left side - codeword polytope
|
|
%
|
|
|
|
\begin{subfigure}[b]{0.35\textwidth}
|
|
\centering
|
|
|
|
\begin{subfigure}{\textwidth}
|
|
\centering
|
|
|
|
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
|
|
draw, circle, inner sep=0pt, minimum size=4pt]
|
|
|
|
\tdplotsetmaincoords{60}{25}
|
|
\begin{tikzpicture}[scale=0.9, tdplot_main_coords]
|
|
% Cube
|
|
|
|
\coordinate (p000) at (0, 0, 0);
|
|
\coordinate (p001) at (0, 0, 2);
|
|
\coordinate (p010) at (0, 2, 0);
|
|
\coordinate (p011) at (0, 2, 2);
|
|
\coordinate (p100) at (2, 0, 0);
|
|
\coordinate (p101) at (2, 0, 2);
|
|
\coordinate (p110) at (2, 2, 0);
|
|
\coordinate (p111) at (2, 2, 2);
|
|
|
|
\draw[] (p000) -- (p100);
|
|
\draw[] (p100) -- (p101);
|
|
\draw[] (p101) -- (p001);
|
|
\draw[] (p001) -- (p000);
|
|
|
|
\draw[dashed] (p010) -- (p110);
|
|
\draw[] (p110) -- (p111);
|
|
\draw[] (p111) -- (p011);
|
|
\draw[dashed] (p011) -- (p010);
|
|
|
|
\draw[dashed] (p000) -- (p010);
|
|
\draw[] (p100) -- (p110);
|
|
\draw[] (p101) -- (p111);
|
|
\draw[] (p001) -- (p011);
|
|
|
|
% Polytope Vertices
|
|
|
|
\node[codeword] (c000) at (p000) {};
|
|
\node[codeword] (c011) at (p011) {};
|
|
|
|
% Polytope Annotations
|
|
|
|
\node[color=KITblue, below=0cm of c000] {$\left( 0, 0, 0 \right) $};
|
|
\node[color=KITblue, above=0cm of c011] {$\left( 0, 1, 1 \right) $};
|
|
\end{tikzpicture}
|
|
|
|
\caption{Set of all codewords $\mathcal{C}$}
|
|
\label{fig:lp:poly:exact_ilp}
|
|
\end{subfigure}\\[1em]
|
|
\begin{subfigure}{\textwidth}
|
|
\centering
|
|
|
|
\begin{tikzpicture}
|
|
\node (relaxation) at (0, 0) {Relaxation};
|
|
|
|
\draw (0, 0.61) -- (relaxation);
|
|
\draw[->] (relaxation) -- (0, -0.7);
|
|
\end{tikzpicture}
|
|
|
|
\vspace{4mm}
|
|
|
|
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
|
|
draw, circle, inner sep=0pt, minimum size=4pt]
|
|
|
|
\tdplotsetmaincoords{60}{25}
|
|
\begin{tikzpicture}[scale=0.9, tdplot_main_coords]
|
|
% Cube
|
|
|
|
\coordinate (p000) at (0, 0, 0);
|
|
\coordinate (p001) at (0, 0, 2);
|
|
\coordinate (p010) at (0, 2, 0);
|
|
\coordinate (p011) at (0, 2, 2);
|
|
\coordinate (p100) at (2, 0, 0);
|
|
\coordinate (p101) at (2, 0, 2);
|
|
\coordinate (p110) at (2, 2, 0);
|
|
\coordinate (p111) at (2, 2, 2);
|
|
|
|
\draw[] (p000) -- (p100);
|
|
\draw[] (p100) -- (p101);
|
|
\draw[] (p101) -- (p001);
|
|
\draw[] (p001) -- (p000);
|
|
|
|
\draw[dashed] (p010) -- (p110);
|
|
\draw[] (p110) -- (p111);
|
|
\draw[] (p111) -- (p011);
|
|
\draw[dashed] (p011) -- (p010);
|
|
|
|
\draw[dashed] (p000) -- (p010);
|
|
\draw[] (p100) -- (p110);
|
|
\draw[] (p101) -- (p111);
|
|
\draw[] (p001) -- (p011);
|
|
|
|
% Polytope Vertices
|
|
|
|
\node[codeword] (c000) at (p000) {};
|
|
\node[codeword] (c011) at (p011) {};
|
|
|
|
% Polytope Edges
|
|
|
|
\draw[line width=1pt, color=KITblue] (c000) -- (c011);
|
|
|
|
% Polytope Annotations
|
|
|
|
\node[color=KITblue, below=0cm of c000] {$\left( 0, 0, 0 \right) $};
|
|
\node[color=KITblue, above=0cm of c011] {$\left( 0, 1, 1 \right) $};
|
|
\end{tikzpicture}
|
|
|
|
\caption{Codeword polytope $\text{poly}\left( \mathcal{C} \right) $}
|
|
\label{fig:lp:poly:exact}
|
|
\end{subfigure}
|
|
\end{subfigure} \hfill%
|
|
%
|
|
%
|
|
% Right side - relaxed polytope
|
|
%
|
|
%
|
|
\begin{subfigure}[b]{0.55\textwidth}
|
|
\centering
|
|
|
|
\begin{subfigure}{\textwidth}
|
|
\centering
|
|
|
|
\begin{minipage}{0.5\textwidth}
|
|
\centering
|
|
|
|
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
|
|
draw, circle, inner sep=0pt, minimum size=4pt]
|
|
|
|
\tdplotsetmaincoords{60}{25}
|
|
\begin{tikzpicture}[scale=0.9, tdplot_main_coords]
|
|
% Cube
|
|
|
|
\coordinate (p000) at (0, 0, 0);
|
|
\coordinate (p001) at (0, 0, 2);
|
|
\coordinate (p010) at (0, 2, 0);
|
|
\coordinate (p011) at (0, 2, 2);
|
|
\coordinate (p100) at (2, 0, 0);
|
|
\coordinate (p101) at (2, 0, 2);
|
|
\coordinate (p110) at (2, 2, 0);
|
|
\coordinate (p111) at (2, 2, 2);
|
|
|
|
\draw[] (p000) -- (p100);
|
|
\draw[] (p100) -- (p101);
|
|
\draw[] (p101) -- (p001);
|
|
\draw[] (p001) -- (p000);
|
|
|
|
\draw[dashed] (p010) -- (p110);
|
|
\draw[] (p110) -- (p111);
|
|
\draw[] (p111) -- (p011);
|
|
\draw[dashed] (p011) -- (p010);
|
|
|
|
\draw[dashed] (p000) -- (p010);
|
|
\draw[] (p100) -- (p110);
|
|
\draw[] (p101) -- (p111);
|
|
\draw[] (p001) -- (p011);
|
|
|
|
% Polytope Vertices
|
|
|
|
\node[codeword] (c000) at (p000) {};
|
|
\node[codeword] (c101) at (p101) {};
|
|
\node[codeword] (c110) at (p110) {};
|
|
\node[codeword] (c011) at (p011) {};
|
|
|
|
% Polytope Edges & Faces
|
|
|
|
\draw[line width=1pt, color=KITblue] (c000) -- (c101);
|
|
\draw[line width=1pt, color=KITblue] (c000) -- (c110);
|
|
\draw[line width=1pt, color=KITblue] (c000) -- (c011);
|
|
|
|
\draw[line width=1pt, color=KITblue] (c101) -- (c110);
|
|
\draw[line width=1pt, color=KITblue] (c101) -- (c011);
|
|
|
|
\draw[line width=1pt, color=KITblue] (c011) -- (c110);
|
|
|
|
\fill[KITblue, opacity=0.15] (p000) -- (p101) -- (p011) -- cycle;
|
|
\fill[KITblue, opacity=0.15] (p000) -- (p110) -- (p101) -- cycle;
|
|
\fill[KITblue, opacity=0.15] (p110) -- (p011) -- (p101) -- cycle;
|
|
|
|
% Polytope Annotations
|
|
|
|
\node[color=KITblue, below=0cm of c000] {$\left( 0, 0, 0 \right) $};
|
|
\node[color=KITblue, right=0.07cm of c101] {$\left( 1, 0, 1 \right) $};
|
|
\node[color=KITblue, right=0cm of c110] {$\left( 1, 1, 0 \right) $};
|
|
\node[color=KITblue, above=0cm of c011] {$\left( 0, 1, 1 \right) $};
|
|
\end{tikzpicture}
|
|
\end{minipage}%
|
|
\begin{minipage}{0.5\textwidth}
|
|
\centering
|
|
|
|
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
|
|
draw, circle, inner sep=0pt, minimum size=4pt]
|
|
|
|
\tdplotsetmaincoords{60}{25}
|
|
\begin{tikzpicture}[scale=0.9, tdplot_main_coords]
|
|
% Cube
|
|
|
|
\coordinate (p000) at (0, 0, 0);
|
|
\coordinate (p001) at (0, 0, 2);
|
|
\coordinate (p010) at (0, 2, 0);
|
|
\coordinate (p011) at (0, 2, 2);
|
|
\coordinate (p100) at (2, 0, 0);
|
|
\coordinate (p101) at (2, 0, 2);
|
|
\coordinate (p110) at (2, 2, 0);
|
|
\coordinate (p111) at (2, 2, 2);
|
|
|
|
\draw[] (p000) -- (p100);
|
|
\draw[] (p100) -- (p101);
|
|
\draw[] (p101) -- (p001);
|
|
\draw[] (p001) -- (p000);
|
|
|
|
\draw[dashed] (p010) -- (p110);
|
|
\draw[] (p110) -- (p111);
|
|
\draw[] (p111) -- (p011);
|
|
\draw[dashed] (p011) -- (p010);
|
|
|
|
\draw[dashed] (p000) -- (p010);
|
|
\draw[] (p100) -- (p110);
|
|
\draw[] (p101) -- (p111);
|
|
\draw[] (p001) -- (p011);
|
|
|
|
% Polytope Vertices
|
|
|
|
\node[codeword] (c000) at (p000) {};
|
|
\node[codeword] (c011) at (p011) {};
|
|
\node[codeword] (c100) at (p100) {};
|
|
\node[codeword] (c111) at (p111) {};
|
|
|
|
% Polytope Edges & Faces
|
|
|
|
\draw[line width=1pt, color=KITblue] (c000) -- (c011);
|
|
\draw[line width=1pt, color=KITblue] (c000) -- (c100);
|
|
\draw[line width=1pt, color=KITblue] (c100) -- (c111);
|
|
\draw[line width=1pt, color=KITblue] (c111) -- (c011);
|
|
|
|
\fill[KITblue, opacity=0.2] (p000) -- (p100) -- (p111) -- (p011) -- cycle;
|
|
|
|
% Polytope Annotations
|
|
|
|
\node[color=KITblue, below=0cm of c000] {$\left( 0, 0, 0 \right) $};
|
|
\node[color=KITblue, above=0cm of c011] {$\left( 0, 1, 1 \right) $};
|
|
\node[color=KITblue, below=0cm of c100] {$\left( 1, 0, 0 \right) $};
|
|
\node[color=KITblue, above=0cm of c111] {$\left( 1, 1, 1 \right) $};
|
|
\end{tikzpicture}
|
|
\end{minipage}
|
|
|
|
\begin{tikzpicture}
|
|
\node[color=KITblue, align=center] at (-2,0)
|
|
{$j=1$\\ $\left( c_1 + c_2+ c_3 = 0 \right) $};
|
|
\node[color=KITblue, align=center] at (2,0)
|
|
{$j=2$\\ $\left(c_2 + c_3 = 0\right)$};
|
|
\end{tikzpicture}
|
|
|
|
\caption{Local codeword polytopes of the check nodes}
|
|
\label{fig:lp:poly:local}
|
|
\end{subfigure}\\[1em]
|
|
\begin{subfigure}{\textwidth}
|
|
\centering
|
|
|
|
\begin{tikzpicture}
|
|
\draw[densely dashed] (-2, 0) -- (2, 0);
|
|
\draw[densely dashed] (-2, 0.5) -- (-2, 0);
|
|
\draw[densely dashed] (2, 0.5) -- (2, 0);
|
|
|
|
\node (intersection) at (0, -0.5) {Intersection};
|
|
|
|
\draw[densely dashed] (0, 0) -- (intersection);
|
|
\draw[densely dashed, ->] (intersection) -- (0, -1);
|
|
\end{tikzpicture}
|
|
|
|
\vspace{2mm}
|
|
|
|
\tikzstyle{codeword} = [color=KITblue, fill=KITblue,
|
|
draw, circle, inner sep=0pt, minimum size=4pt]
|
|
\tikzstyle{pseudocodeword} = [color=KITred, fill=KITred,
|
|
draw, circle, inner sep=0pt, minimum size=4pt]
|
|
|
|
\tdplotsetmaincoords{60}{25}
|
|
\begin{tikzpicture}[scale=0.9, tdplot_main_coords]
|
|
% Cube
|
|
|
|
\coordinate (p000) at (0, 0, 0);
|
|
\coordinate (p001) at (0, 0, 2);
|
|
\coordinate (p010) at (0, 2, 0);
|
|
\coordinate (p011) at (0, 2, 2);
|
|
\coordinate (p100) at (2, 0, 0);
|
|
\coordinate (p101) at (2, 0, 2);
|
|
\coordinate (p110) at (2, 2, 0);
|
|
\coordinate (p111) at (2, 2, 2);
|
|
|
|
\draw[] (p000) -- (p100);
|
|
\draw[] (p100) -- (p101);
|
|
\draw[] (p101) -- (p001);
|
|
\draw[] (p001) -- (p000);
|
|
|
|
\draw[dashed] (p010) -- (p110);
|
|
\draw[] (p110) -- (p111);
|
|
\draw[] (p111) -- (p011);
|
|
\draw[dashed] (p011) -- (p010);
|
|
|
|
\draw[dashed] (p000) -- (p010);
|
|
\draw[] (p100) -- (p110);
|
|
\draw[] (p101) -- (p111);
|
|
\draw[] (p001) -- (p011);
|
|
|
|
% Polytope Vertices
|
|
|
|
\node[codeword] (c000) at (p000) {};
|
|
\node[codeword] (c011) at (p011) {};
|
|
\node[pseudocodeword] (cpseudo) at (2, 1, 1) {};
|
|
|
|
% Polytope Edges & Faces
|
|
|
|
\draw[line width=1pt, color=KITblue] (c000) -- (c011);
|
|
\draw[line width=1pt, color=KITred] (cpseudo) -- (c000);
|
|
\draw[line width=1pt, color=KITred] (cpseudo) -- (c011);
|
|
|
|
\fill[KITred, opacity=0.2] (p000) -- (p011) -- (2,1,1) -- cycle;
|
|
|
|
% Polytope Annotations
|
|
|
|
\node[color=KITblue, below=0cm of c000] {$\left( 0, 0, 0 \right) $};
|
|
\node[color=KITblue, above=0cm of c011] {$\left( 0, 1, 1 \right) $};
|
|
\node[color=KITred, right=0cm of cpseudo]
|
|
{$\left( 1, \frac{1}{2}, \frac{1}{2} \right) $};
|
|
\end{tikzpicture}
|
|
|
|
\caption{Relaxed codeword polytope $\overline{Q}$}
|
|
\label{fig:lp:poly:relaxed}
|
|
\end{subfigure}
|
|
\end{subfigure}
|
|
|
|
\vspace*{-2.5cm}
|
|
\hspace*{-0.1\textwidth}
|
|
\begin{tikzpicture}
|
|
\draw[->] (0,0) -- (2.5, 0);
|
|
\node[above] at (1.25, 0) {Relaxation};
|
|
|
|
% Dummy node to make tikzpicture slightly larger
|
|
\node[below] at (1.25, 0) {};
|
|
\end{tikzpicture}
|
|
\vspace{2.5cm}
|
|
|
|
\caption{Visualization of the codeword polytope and the relaxed codeword
|
|
polytope of the code described by equations (\ref{eq:lp:example_code_def_gen})
|
|
and (\ref{eq:lp:example_code_def_par})}
|
|
\label{fig:lp:poly}
|
|
\end{figure}%
|
|
%
|
|
\noindent It can be seen that the relaxed codeword polytope $\overline{Q}$ introduces
vertices with fractional values;
these are non-codeword solutions of the linear program and
correspond to the so-called \textit{pseudo-codewords} introduced in
\cite{feldman_paper}.
However, since for \ac{LDPC} codes the number of constraints describing
$\overline{Q}$ grows linearly with $n$ instead of
exponentially, it is far more tractable for practical applications.
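
The effect of such a fractional vertex can be checked by hand for the example
code; the numbers below are chosen here purely for illustration and are not
taken from the cited works.
The point $\left( 1, \frac{1}{2}, \frac{1}{2} \right)$ satisfies both local
constraints, since%
%
\begin{align*}
    \begin{bmatrix} 1 & \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}
    &= \tfrac{1}{2} \begin{bmatrix} 1 & 1 & 0 \end{bmatrix}
    + \tfrac{1}{2} \begin{bmatrix} 1 & 0 & 1 \end{bmatrix}
    \in \mathcal{P}_{3}, \\
    \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}
    &= \tfrac{1}{2} \begin{bmatrix} 0 & 0 \end{bmatrix}
    + \tfrac{1}{2} \begin{bmatrix} 1 & 1 \end{bmatrix}
    \in \mathcal{P}_{2}
,\end{align*}
%
but it does not lie in $\text{poly}\left( \mathcal{C} \right)$, whose points
all have a first component equal to zero.
Assuming the \acp{LLR}
$\boldsymbol{\gamma} = \begin{bmatrix} -2 & 1 & 1 \end{bmatrix}^\text{T}$,
the cost $\boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}$ at the three
vertices of $\overline{Q}$ is%
%
\begin{align*}
    \left( 0, 0, 0 \right) \mapsto 0, \hspace{5mm}
    \left( 0, 1, 1 \right) \mapsto 2, \hspace{5mm}
    \left( 1, \tfrac{1}{2}, \tfrac{1}{2} \right) \mapsto -2 + \tfrac{1}{2} + \tfrac{1}{2} = -1
.\end{align*}
%
For this choice, the linear program over $\overline{Q}$ is minimized by the
pseudo-codeword, whereas the \ac{ML} codeword is $\left( 0, 0, 0 \right)$.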
|
|
|
|
The resulting formulation of the relaxed optimization problem becomes%
|
|
%
|
|
\begin{align}
|
|
\begin{aligned}
|
|
\text{minimize }\hspace{2mm} & \boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}} \\
|
|
\text{subject to }\hspace{2mm} &\boldsymbol{T}_j \tilde{\boldsymbol{c}} \in \mathcal{P}_{d_j}
|
|
\hspace{5mm}\forall j\in\mathcal{J}.
|
|
\end{aligned} \label{eq:lp:relaxed_formulation}
|
|
\end{align}%
|
|
|
|
One aspect that makes \ac{LP} decoding especially appealing is the strong
theoretical guarantee that comes with it, called the
\textit{\ac{ML} certificate property} \cite[Sec. III. B.]{feldman_paper}:
whenever the \ac{LP} decoder returns a valid codeword, i.e., an integral solution,
that codeword is guaranteed to be the \ac{ML} codeword.
This leads to an interesting application of \ac{LP} decoding:
\ac{ML} decoding behavior can be approximated by successively adding redundant
parity-checks until a valid result is returned \cite[Sec. IV.]{alp}.
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{Decoding Algorithm}%
|
|
\label{sec:lp:Decoding Algorithm}
|
|
|
|
The \ac{LP} decoding formulation in section \ref{sec:lp:LP Decoding}
|
|
is a very general one that can be solved with a number of different optimization methods.
|
|
In this work \ac{ADMM} is examined, as its distributed nature allows for a very efficient
|
|
implementation.
|
|
\ac{LP} decoding using \ac{ADMM} can be regarded as a message
|
|
passing algorithm with separate variable- and check-node update steps;
|
|
the resulting algorithm has a striking similarity to \ac{BP} and its computational
|
|
complexity has been demonstrated to compare favorably to \ac{BP} \cite{original_admm},
|
|
\cite{efficient_lp_dec_admm}.
|
|
|
|
The \ac{LP} decoding problem in (\ref{eq:lp:relaxed_formulation}) can be
|
|
slightly rewritten using the auxiliary variables
|
|
$\boldsymbol{z}_{[1:m]}$:%
|
|
%
|
|
\begin{align}
|
|
\begin{aligned}
|
|
\begin{array}{r}
|
|
\text{minimize }
|
|
\end{array}\hspace{0.5mm} & \boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}} \\
|
|
\begin{array}{r}
|
|
\text{subject to }\\
|
|
\phantom{te}
|
|
\end{array}\hspace{0.5mm} & \setlength{\arraycolsep}{1.4pt}
|
|
\begin{array}{rl}
|
|
\boldsymbol{T}_j\tilde{\boldsymbol{c}}
|
|
&= \boldsymbol{z}_j\\
|
|
\boldsymbol{z}_j
|
|
&\in \mathcal{P}_{d_j}
|
|
\end{array}
|
|
\hspace{5mm} \forall j\in\mathcal{J}.
|
|
\end{aligned}
|
|
\label{eq:lp:admm_reformulated}
|
|
\end{align}
|
|
%
|
|
In this form, the problem almost fits the \ac{ADMM} template described in section
|
|
\ref{sec:theo:Optimization Methods}, except for the fact that there are multiple equality
|
|
constraints $\boldsymbol{T}_j \tilde{\boldsymbol{c}} = \boldsymbol{z}_j$ and the
|
|
additional constraints $\boldsymbol{z}_j \in \mathcal{P}_{d_j} \, \forall\, j\in\mathcal{J}$.
|
|
The multiple constraints can be addressed by introducing additional terms in the
augmented Lagrangian:%
|
|
%
|
|
\begin{align*}
|
|
\mathcal{L}_{\mu}\left( \tilde{\boldsymbol{c}}, \boldsymbol{z}_{[1:m]},
|
|
\boldsymbol{\lambda}_{[1:m]} \right)
|
|
= \boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}}
|
|
+ \sum_{j\in\mathcal{J}} \boldsymbol{\lambda}^\text{T}_j
|
|
\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \right)
|
|
+ \frac{\mu}{2}\sum_{j\in\mathcal{J}}
|
|
\lVert \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \rVert^2_2
|
|
.\end{align*}%
|
|
%
|
|
The additional constraints remain in the dual optimization problem:%
|
|
%
|
|
\begin{align*}
|
|
\text{maximize } \min_{\substack{\tilde{\boldsymbol{c}} \\
|
|
\boldsymbol{z}_j \in \mathcal{P}_{d_j}\,\forall\,j\in\mathcal{J}}}
|
|
\mathcal{L}_{\mu}\left( \tilde{\boldsymbol{c}}, \boldsymbol{z}_{[1:m]},
|
|
\boldsymbol{\lambda}_{[1:m]} \right)
|
|
.\end{align*}%
|
|
%
|
|
The steps to solve the dual problem then become:
|
|
%
|
|
\begin{alignat*}{3}
|
|
\tilde{\boldsymbol{c}} &\leftarrow \argmin_{\tilde{\boldsymbol{c}}} \mathcal{L}_{\mu} \left(
|
|
\tilde{\boldsymbol{c}}, \boldsymbol{z}_{[1:m]}, \boldsymbol{\lambda}_{[1:m]} \right) \\
|
|
\boldsymbol{z}_j &\leftarrow \argmin_{\boldsymbol{z}_j \in \mathcal{P}_{d_j}}
|
|
\mathcal{L}_{\mu} \left(
|
|
\tilde{\boldsymbol{c}}, \boldsymbol{z}_{[1:m]}, \boldsymbol{\lambda}_{[1:m]} \right)
|
|
\hspace{3mm} &&\forall j\in\mathcal{J} \\
|
|
\boldsymbol{\lambda}_j &\leftarrow \boldsymbol{\lambda}_j
|
|
+ \mu\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}}
|
|
- \boldsymbol{z}_j \right)
|
|
\hspace{3mm} &&\forall j\in\mathcal{J}
|
|
.\end{alignat*}
|
|
%
|
|
Luckily, the additional constraints only affect the $\boldsymbol{z}_j$-update steps.
|
|
Furthermore, the $\boldsymbol{z}_j$-update steps can be shown to be equivalent to projections
|
|
onto the check polytopes $\mathcal{P}_{d_j}$
|
|
and the $\tilde{\boldsymbol{c}}$-update can be computed analytically%
|
|
%
|
|
\footnote{In the $\tilde{c}_i$-update rule, the term
$\left( \boldsymbol{z}_j \right)_i$ is a slight abuse of notation, as
$\boldsymbol{z}_j$ has fewer components than there are variable nodes $i$.
What is actually meant is the component of $\boldsymbol{z}_j$ that is associated
with the variable node $i$, i.e., $\left( \boldsymbol{T}_j^\text{T}\boldsymbol{z}_j\right)_i$.
The same is true for $\left( \boldsymbol{\lambda}_j \right)_i$.}
|
|
%
|
|
\cite[Sec. III. B.]{original_admm}:%
|
|
%
|
|
\begin{alignat*}{3}
|
|
\tilde{c}_i &\leftarrow \frac{1}{\left| N_v\left( i \right) \right|} \left(
|
|
\sum_{j\in N_v\left( i \right) } \Big( \left( \boldsymbol{z}_j \right)_i
|
|
- \frac{1}{\mu} \left( \boldsymbol{\lambda}_j \right)_i \Big)
|
|
- \frac{\gamma_i}{\mu} \right)
|
|
\hspace{3mm} && \forall i\in\mathcal{I} \\
|
|
\boldsymbol{z}_j &\leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
|
|
\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \frac{\boldsymbol{\lambda}_j}{\mu} \right)
|
|
\hspace{3mm} && \forall j\in\mathcal{J} \\
|
|
\boldsymbol{\lambda}_j &\leftarrow \boldsymbol{\lambda}_j
|
|
+ \mu\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}}
|
|
- \boldsymbol{z}_j \right)
|
|
\hspace{3mm} && \forall j\in\mathcal{J}
|
|
.\end{alignat*}
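%
The following sketch retraces how these update rules can be obtained from the
augmented Lagrangian; it is a reconstruction added for illustration, and the
cited work should be consulted for the rigorous derivation.
Setting the gradient of $\mathcal{L}_{\mu}$ with respect to
$\tilde{\boldsymbol{c}}$ to zero gives%
%
\begin{align*}
    \boldsymbol{\gamma}
    + \sum_{j\in\mathcal{J}} \boldsymbol{T}_j^\text{T} \boldsymbol{\lambda}_j
    + \mu \sum_{j\in\mathcal{J}} \boldsymbol{T}_j^\text{T}
    \left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \right)
    = \boldsymbol{0}
.\end{align*}
%
Since $\boldsymbol{T}_j^\text{T}\boldsymbol{T}_j$ is a diagonal matrix that
selects the neighbors of check node $j$, the $i$th component reads%
%
\begin{align*}
    \gamma_i
    + \sum_{j\in N_v\left( i \right)} \left( \boldsymbol{\lambda}_j \right)_i
    + \mu \left( \left| N_v\left( i \right) \right| \tilde{c}_i
    - \sum_{j\in N_v\left( i \right)} \left( \boldsymbol{z}_j \right)_i \right)
    = 0
,\end{align*}
%
and solving for $\tilde{c}_i$ yields the first update rule.
For the $\boldsymbol{z}_j$-update, completing the square in the terms of
$\mathcal{L}_{\mu}$ that depend on $\boldsymbol{z}_j$ gives%
%
\begin{align*}
    \boldsymbol{\lambda}_j^\text{T}
    \left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \right)
    + \frac{\mu}{2}
    \lVert \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \rVert^2_2
    = \frac{\mu}{2} \left\lVert \boldsymbol{T}_j\tilde{\boldsymbol{c}}
    + \frac{\boldsymbol{\lambda}_j}{\mu} - \boldsymbol{z}_j \right\rVert^2_2
    - \frac{1}{2\mu} \lVert \boldsymbol{\lambda}_j \rVert^2_2
.\end{align*}
%
The last term does not depend on $\boldsymbol{z}_j$, so minimizing over
$\boldsymbol{z}_j \in \mathcal{P}_{d_j}$ amounts to the Euclidean projection of
$\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \frac{\boldsymbol{\lambda}_j}{\mu}$
onto $\mathcal{P}_{d_j}$.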
|
|
%
|
|
It should be noted that all of the $\boldsymbol{z}_j$-updates can be computed simultaneously,
|
|
as they are independent of one another.
|
|
The same is true for the updates of the individual components of $\tilde{\boldsymbol{c}}$.
|
|
This representation can be slightly simplified by substituting
|
|
$\boldsymbol{\lambda}_j = \mu \cdot \boldsymbol{u}_j \,\forall\,j\in\mathcal{J}$:%
|
|
%
|
|
\begin{alignat*}{3}
|
|
\tilde{c}_i &\leftarrow \frac{1}{\left| N_v\left( i \right) \right|} \left(
|
|
\sum_{j\in N_v\left( i \right) } \Big( \left( \boldsymbol{z}_j \right)_i
|
|
- \left( \boldsymbol{u}_j \right)_i \Big)
|
|
- \frac{\gamma_i}{\mu} \right)
|
|
\hspace{3mm} && \forall i\in\mathcal{I} \\
|
|
\boldsymbol{z}_j &\leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
|
|
\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{u}_j \right)
|
|
\hspace{3mm} && \forall j\in\mathcal{J} \\
|
|
\boldsymbol{u}_j &\leftarrow \boldsymbol{u}_j
|
|
+ \boldsymbol{T}_j\tilde{\boldsymbol{c}}
|
|
- \boldsymbol{z}_j
|
|
\hspace{3mm} && \forall j\in\mathcal{J}
|
|
.\end{alignat*}
|
|
%
|
|
|
|
The reason \ac{ADMM} performs so well is the relocation of the constraints
$\boldsymbol{T}_j\tilde{\boldsymbol{c}}\in\mathcal{P}_{d_j}\,\forall\, j\in\mathcal{J}$
into the objective function itself.
|
|
The minimization of the new objective function can then take place simultaneously
|
|
with respect to all $\boldsymbol{z}_j, j\in\mathcal{J}$.
|
|
Effectively, all of the $\left|\mathcal{J}\right|$ parity constraints can be
|
|
handled at the same time.
|
|
This can also be understood by interpreting the decoding process as a message-passing
|
|
algorithm \cite[Sec. III. D.]{original_admm}, \cite[Sec. II. B.]{efficient_lp_dec_admm},
|
|
depicted in algorithm \ref{alg:admm}.
|
|
|
|
\begin{genericAlgorithm}[caption={\ac{LP} decoding using \ac{ADMM} interpreted
|
|
as a message passing algorithm\protect\footnotemark{}}, label={alg:admm},
|
|
basicstyle=\fontsize{11}{16}\selectfont
|
|
]
|
|
Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}_{[1:m]}$ and $\boldsymbol{u}_{[1:m]}$
|
|
while $\sum_{j\in\mathcal{J}} \lVert \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \rVert_2 \ge \epsilon_{\text{pri}}$ or $\sum_{j\in\mathcal{J}} \lVert \boldsymbol{z}^\prime_j - \boldsymbol{z}_j \rVert_2 \ge \epsilon_{\text{dual}}$ do
|
|
for $j$ in $\mathcal{J}$ do
|
|
$\boldsymbol{z}_j \leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
|
|
\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{u}_j \right)$
|
|
$\boldsymbol{u}_j \leftarrow \boldsymbol{u}_j
|
|
+ \boldsymbol{T}_j\tilde{\boldsymbol{c}}
|
|
- \boldsymbol{z}_j$
|
|
end for
|
|
for $i$ in $\mathcal{I}$ do
|
|
$\tilde{c}_i \leftarrow \frac{1}{\left| N_v\left( i \right) \right|} \left(
|
|
\sum_{j\in N_v\left( i \right) } \Big(
|
|
\left( \boldsymbol{z}_j \right)_i - \left( \boldsymbol{u}_j
|
|
\right)_i
|
|
\Big) - \frac{\gamma_i}{\mu} \right)$
|
|
end for
|
|
end while
|
|
\end{genericAlgorithm}
|
|
%
|
|
\footnotetext{$\epsilon_{\text{pri}} > 0$ and $\epsilon_{\text{dual}} > 0$
|
|
are additional parameters
|
|
defining the tolerances for the stopping criteria of the algorithm.
|
|
The variable $\boldsymbol{z}_j^\prime$ denotes the value of
|
|
$\boldsymbol{z}_j$ in the previous iteration.}%
|
|
%
|
|
\noindent The $\boldsymbol{z}_j$- and $\boldsymbol{u}_j$-updates can be understood as
a check-node update step (lines $3$-$6$) and the $\tilde{c}_i$-updates can be understood as
a variable-node update step (lines $7$-$9$ in algorithm \ref{alg:admm}).
The updates for each variable and check node can be performed in parallel.
|
|
|
|
The main computational effort in solving the linear program then amounts to
|
|
computing the projection operation $\Pi_{\mathcal{P}_{d_j}} \left( \cdot \right) $
|
|
onto each check polytope. Various methods to perform this projection
|
|
have been proposed (e.g., in \cite{original_admm}, \cite{efficient_lp_dec_admm},
|
|
\cite{lautern}).
|
|
The method chosen here is the one presented in \cite{lautern}.
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{Implementation Details}%
|
|
\label{sec:lp:Implementation Details}
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\section{Results}%
|
|
\label{sec:lp:Results}
|
|
|