Reworked proximal decoding
parent 51bec2b5a7
commit 252736ff31
@@ -34,8 +34,8 @@ The goal is to arrive at a formulation, where a certain objective function
 $f$ has to be minimized under certain constraints:%
 %
 \begin{align*}
-\text{minimize } f\left( \boldsymbol{c} \right)\\
-\text{subject to $\boldsymbol{c} \in D$}
+\text{minimize}\hspace{2mm} &f\left( \boldsymbol{c} \right)\\
+\text{subject to}\hspace{2mm} &\boldsymbol{c} \in D
 ,\end{align*}%
 %
 where $D$ is the domain of values attainable for $c$ and represents the
@@ -256,7 +256,7 @@ the transfer matrix would be $\boldsymbol{T}_j =
 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
-\end{bmatrix} $ (example taken from \cite[Sec. II, A]{efficient_lp_dec_admm})}%
+\end{bmatrix} $ (example taken from \cite[Sec. II, A]{efficient_lp_dec_admm})}
 (i.e. the relevant components of $\boldsymbol{c}$ for parity-check $j$)
 and $\mathcal{P}_{d}$ is the \textit{check polytope}, the convex hull of all
 binary vectors of length $d$ with even parity%
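To make the role of $\boldsymbol{T}_j$ concrete, here is a minimal NumPy sketch (not part of the commit; the function name transfer_matrix and the dense 0/1 representation of $\boldsymbol{H}$ are chosen only for illustration). It builds the selection matrix from row $j$ of the parity-check matrix and reproduces the example matrix shown in the hunk above:

    import numpy as np

    def transfer_matrix(H, j):
        # T_j has one row per edge of check j; each row is a unit vector
        # selecting one position from the support A(j) of row j of H.
        support = np.flatnonzero(H[j])
        T = np.zeros((len(support), H.shape[1]), dtype=int)
        T[np.arange(len(support)), support] = 1
        return T

    # Check node j connected to positions 1, 3 and 5 of a length-7 code:
    H = np.array([[0, 1, 0, 1, 0, 1, 0]])
    print(transfer_matrix(H, 0))
    # [[0 1 0 0 0 0 0]
    #  [0 0 0 1 0 0 0]
    #  [0 0 0 0 0 1 0]]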
@@ -274,6 +274,22 @@ Figures \ref{fig:dec:poly:local1} and \ref{fig:dec:poly:local2} show the local
 codeword polytopes of each check node.
 Their intersection, the relaxed codeword polytope $\overline{Q}$, is shown in
 figure \ref{fig:dec:poly:relaxed}.
+It can be seen that the relaxed codeword polytope $\overline{Q}$ introduces
+vertices with fractional values;
+these represent erroneous non-codeword solutions to the linear program and
+correspond to the so-called \textit{pseudocodewords} introduced in
+\cite{feldman_paper}.
+However, since for \ac{LDPC} codes $\overline{Q}$ scales linearly with $n$ instead of
+exponentially, it is far more tractable for practical applications.
+
+The resulting formulation of the relaxed optimization problem is the following:%
+%
+\begin{align*}
+\text{minimize }\hspace{2mm} &\sum_{i=1}^{n} \gamma_i c_i \\
+\text{subject to }\hspace{2mm} &\boldsymbol{T}_j \boldsymbol{c} \in \mathcal{P}_{d_j},
+\hspace{5mm}j\in\mathcal{J}
+.\end{align*}%
+%
 %
 %
 % Codeword polytope visualization figure
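As a rough illustration of the relaxed problem added above, the following sketch solves it with SciPy's linprog, describing each check polytope by the usual odd-subset (forbidden-set) inequalities. The function name lp_decode, the use of SciPy, and reading gamma as the per-bit cost (LLR) vector are assumptions made for this example, not details taken from the commit; enumerating all odd subsets is only practical because LDPC check degrees are small.

    import itertools
    import numpy as np
    from scipy.optimize import linprog

    def lp_decode(H, gamma):
        # Relaxed LP decoding: minimize gamma^T c subject to T_j c in P_{d_j}.
        # Half-space form of each check polytope: for every odd-sized subset S
        # of the check's neighbourhood N(j),
        #     sum_{i in S} c_i - sum_{i in N(j)\S} c_i <= |S| - 1.
        m, n = H.shape
        A_ub, b_ub = [], []
        for j in range(m):
            nbrs = set(np.flatnonzero(H[j]))
            for size in range(1, len(nbrs) + 1, 2):
                for S in itertools.combinations(nbrs, size):
                    row = np.zeros(n)
                    row[list(S)] = 1.0
                    row[list(nbrs - set(S))] = -1.0
                    A_ub.append(row)
                    b_ub.append(len(S) - 1.0)
        res = linprog(gamma, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                      bounds=[(0.0, 1.0)] * n, method="highs")
        return res.x  # fractional components indicate a pseudocodeword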
@@ -566,22 +582,6 @@ figure \ref{fig:dec:poly:relaxed}.
 \label{fig:dec:poly}
 \end{figure}%
 %
-It can be seen, that the relaxed codeword polytope $\overline{Q}$ introduces
-vertices with fractional values;
-these represent erroneous non-codeword solutions to the linear program and
-correspond to the so-called \textit{pseudocodewords} introduced in
-\cite{feldman_paper}.
-However, since for \ac{LDPC} codes $\overline{Q}$ scales linearly with $n$ instead of
-exponentially, it is a lot more tractable for practical applications.
-
-The resulting formulation of the relaxed optimization problem is the following:%
-%
-\begin{align*}
-\text{minimize }\hspace{2mm} &\sum_{i=1}^{n} \gamma_i c_i \\
-\text{subject to }\hspace{2mm} &\boldsymbol{T}_j \boldsymbol{c} \in \mathcal{P}_{d_j}
-\hspace{5mm}j\in\mathcal{J}
-.\end{align*}%
-%


 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -599,14 +599,16 @@ The resulting formulation of the relaxed optimization problem is the following:%
 \section{Proximal Decoding}%
 \label{sec:dec:Proximal Decoding}

-Proximal decoding was proposed by Wadayama et. al \cite{proximal_paper}.
-With this decoding algorithm, the objective function is minimized using
-the proximal gradient method.
+Proximal decoding was proposed by Wadayama et al. as a novel formulation of
+optimization-based decoding \cite{proximal_paper}.
+With this algorithm, minimization is performed using the proximal gradient
+method.
 In contrast to \ac{LP} decoding, the objective function is based on a
 non-convex optimization formulation of the \ac{MAP} decoding problem.

-In order to derive the objective function, the authors reformulate the
-\ac{MAP} decoding problem:%
+In order to derive the objective function, the authors begin with the
+\ac{MAP} decoding rule, expressed as a continuous maximization problem over
+$\boldsymbol{x}$:%
 %
 \begin{align}
 \hat{\boldsymbol{x}} = \argmax_{\boldsymbol{x} \in \mathbb{R}^{n}}
@@ -616,19 +618,37 @@ In order to derive the objective function, the authors reformulate the
 \left( \boldsymbol{y} \mid \boldsymbol{x} \right)
 f_{\boldsymbol{X}}\left( \boldsymbol{x} \right)%
 \label{eq:prox:vanilla_MAP}
+.\end{align}%
+%
+The likelihood $f_{\boldsymbol{Y} \mid \boldsymbol{X}}
+\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ is a known function
+determined by the channel model.
+The prior \ac{PDF} $f_{\boldsymbol{X}}\left( \boldsymbol{x} \right)$ is also
+known, as the equal probability assumption is made on
+$\mathcal{C}\left( \boldsymbol{H} \right)$.
+However, because in this case the considered domain is continuous,
+the prior \ac{PDF} cannot be ignored as a constant during the optimization
+as is often done, and has a rather unwieldy representation:%
+%
+\begin{align}
+f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) =
+\frac{1}{\left| \mathcal{C}\left( \boldsymbol{H} \right) \right| }
+\sum_{c \in \mathcal{C}\left( \boldsymbol{H} \right) }
+\delta\left( \boldsymbol{x} - \left( -1 \right) ^{\boldsymbol{c}}\right)
+\label{eq:prox:prior_pdf}
 \end{align}%
 %
-The likelihood is usually a known function determined by the channel model.
 In order to rewrite the prior \ac{PDF}
 $f_{\boldsymbol{X}}\left( \boldsymbol{x} \right)$,
 the so-called \textit{code-constraint polynomial} is introduced:%
 %
-\begin{align}
-h\left( \boldsymbol{x} \right) = \sum_{j=1}^{n} \left( x_j^2-1 \right) ^2
-+ \sum_{i=1}^{m} \left[
-\left( \prod_{j\in \mathcal{A}\left( i \right) } x_j \right) -1 \right] ^2%
-\label{eq:prox:ccp}
-\end{align}%
+\begin{align*}
+h\left( \boldsymbol{x} \right) =
+\underbrace{\sum_{j=1}^{n} \left( x_j^2-1 \right) ^2}_{\text{Bipolar constraint}}
++ \underbrace{\sum_{i=1}^{m} \left[
+\left( \prod_{j\in \mathcal{A}
+\left( i \right) } x_j \right) -1 \right] ^2}_{\text{Parity constraint}}%
+.\end{align*}%
 %
 The intention of this function is to provide a way to penalize vectors far
 from a codeword and favor those close to a codeword.
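A small sketch of how the code-constraint polynomial can be evaluated, assuming $\boldsymbol{H}$ is available as a dense 0/1 NumPy array; the function name is illustrative and not part of the commit:

    import numpy as np

    def code_constraint_polynomial(H, x):
        # h(x) = sum_j (x_j^2 - 1)^2 + sum_i (prod_{j in A(i)} x_j - 1)^2.
        # The bipolar term vanishes only for x in {-1, +1}^n; the parity term
        # vanishes only if every check's product of signs equals +1.
        bipolar = np.sum((x ** 2 - 1.0) ** 2)
        parity = sum(
            (np.prod(x[np.flatnonzero(row)]) - 1.0) ** 2 for row in H
        )
        return bipolar + parity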
@@ -636,69 +656,74 @@ In order to achieve this, the polynomial is composed of two parts: one term
 representing the bipolar constraint, providing for a discrete solution of the
 continuous optimization problem, and one term representing the parity
 constraint, accommodating the role of the parity-check matrix $\boldsymbol{H}$.
-%
-The equal probability assumption is made on $\mathcal{C}\left( \boldsymbol{H} \right) $.
 The prior \ac{PDF} is then approximated using the code-constraint polynomial:%
 %
 \begin{align}
-f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) =
-\frac{1}{\left| \mathcal{C}\left( \boldsymbol{H} \right) \right| }
-\sum_{c \in \mathcal{C}\left( \boldsymbol{H} \right) }
-\delta\left( \boldsymbol{x} - \left( -1 \right) ^{\boldsymbol{c}}\right)
+f_{\boldsymbol{X}}\left( \boldsymbol{x} \right)
 \approx \frac{1}{Z}e^{-\gamma h\left( \boldsymbol{x} \right) }%
 \label{eq:prox:prior_pdf_approx}
-\end{align}%
+.\end{align}%
 %
 The authors justify this approximation by arguing that for
-$\gamma \rightarrow \infty$, the right-hand side aproaches the left-hand
-side. In equation \ref{eq:prox:vanilla_MAP}, the prior \ac{PDF}
-$f_{\boldsymbol{X}}\left( \boldsymbol{x} \right) $ can then be subsituted
-for equation \ref{eq:prox:prior_pdf_approx} and the likelihood can be rewritten using
-the negative log-likelihood
+$\gamma \rightarrow \infty$, the approximation in equation
+\ref{eq:prox:prior_pdf_approx} approaches the original function in equation
+\ref{eq:prox:prior_pdf}.
+This approximation can then be plugged into equation \ref{eq:prox:vanilla_MAP}
+and the likelihood can be rewritten using the negative log-likelihood
 $L \left( \boldsymbol{y} \mid \boldsymbol{x} \right) = -\ln\left(
-f_{\boldsymbol{X} \mid \boldsymbol{Y}}\left(
-\boldsymbol{x} \mid \boldsymbol{y} \right) \right) $:%
+f_{\boldsymbol{Y} \mid \boldsymbol{X}}\left(
+\boldsymbol{y} \mid \boldsymbol{x} \right) \right) $:%
 %
-\begin{align}
+\begin{align*}
 \hat{\boldsymbol{x}} &= \argmax_{\boldsymbol{x} \in \mathbb{R}^{n}}
 e^{- L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) }
-e^{-\gamma h\left( \boldsymbol{x} \right) } \nonumber \\
+e^{-\gamma h\left( \boldsymbol{x} \right) } \\
 &= \argmin_{\boldsymbol{x} \in \mathbb{R}^n} \left(
 L\left( \boldsymbol{y} \mid \boldsymbol{x} \right)
 + \gamma h\left( \boldsymbol{x} \right)
 \right)%
-\label{eq:prox:approx_map_problem}
-.\end{align}%
+.\end{align*}%
 %
 Thus, with proximal decoding, the objective function
-$f\left( \boldsymbol{x} \right)$ to be minimized is%
+$f\left( \boldsymbol{x} \right)$ considered is%
 %
 \begin{align}
-f\left( \boldsymbol{x} \right) = L\left( \boldsymbol{x} \mid \boldsymbol{y} \right)
+f\left( \boldsymbol{x} \right) = L\left( \boldsymbol{y} \mid \boldsymbol{x} \right)
 + \gamma h\left( \boldsymbol{x} \right)%
 \label{eq:prox:objective_function}
-.\end{align}
+\end{align}%
+%
+and the decoding problem is reformulated to%
+%
+\begin{align*}
+\text{minimize}\hspace{2mm} &L\left( \boldsymbol{y} \mid \boldsymbol{x} \right)
++ \gamma h\left( \boldsymbol{x} \right)\\
+\text{subject to}\hspace{2mm} &\boldsymbol{x} \in \mathbb{R}^n
+.\end{align*}
+%

-For the solution of the approximalte \ac{MAP} decoding problem, the two parts
+For the solution of the approximate \ac{MAP} decoding problem, the two parts
 of \ref{eq:prox:objective_function} are considered separately:
 the minimization of the objective function occurs in an alternating
-manner, switching between the minimization of the negative log-likelihood
+fashion, switching between the negative log-likelihood
 $L\left( \boldsymbol{y} \mid \boldsymbol{x} \right) $ and the scaled
 code-constraint polynomial $\gamma h\left( \boldsymbol{x} \right) $.
 Two helper variables, $\boldsymbol{r}$ and $\boldsymbol{s}$, are introduced,
 describing the result of each of the two steps.
-The first step, minimizing the log-likelihood using gradient descent, yields%
+The first step, minimizing the log-likelihood, is performed using gradient
+descent:%
 %
-\begin{align*}
+\begin{align}
 \boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \nabla
 L\left( \boldsymbol{y} \mid \boldsymbol{s} \right),
 \hspace{5mm}\omega > 0
-.\end{align*}%
+\label{eq:prox:step_log_likelihood}
+.\end{align}%
 %
-For the second step, minimizig the scaled code-constraint polynomial using
-the proximal gradient method, the proximal operator of
-$\gamma h\left( \boldsymbol{x} \right) $ has to be computed and is
-immediately approximalted by a gradient-descent step:%
+For the second step, minimizing the scaled code-constraint polynomial, the
+proximal gradient method is used and the \textit{proximal operator} of
+$\gamma h\left( \boldsymbol{x} \right) $ has to be computed.
+It is then immediately approximated with a gradient-descent step:%
 %
 \begin{align*}
 \text{prox}_{\gamma h} \left( \boldsymbol{x} \right) &\equiv
@@ -709,8 +734,7 @@ immediately approximalted by a gradient-descent step:%
 \hspace{5mm} \gamma \text{ small}
 .\end{align*}%
 %
-The second step thus becomes \todo{Write the formulation optimization problem properly
-(as shown in the introductory section)}
+The second step thus becomes%
 %
 \begin{align*}
 \boldsymbol{s} \leftarrow \boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right),
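The per-component derivative of $h$ that this step needs appears (commented out) further down in this commit as $\frac{\partial}{\partial x_k} h = 4(x_k^2-1)x_k + \frac{2}{x_k}\sum_{i\in\mathcal{B}(k)}\big((\prod_{j\in\mathcal{A}(i)} x_j)^2 - \prod_{j\in\mathcal{A}(i)} x_j\big)$. The sketch below is illustrative only (function name and dense 0/1 representation of H are assumptions) and uses the algebraically equivalent form that multiplies by the product over $\mathcal{A}(i)\setminus\{k\}$ instead of dividing by $x_k$, so it remains defined when a component is exactly zero:

    import numpy as np

    def grad_h(H, x):
        # dh/dx_k = 4 (x_k^2 - 1) x_k
        #           + 2 sum_{i in B(k)} (prod_{A(i)} x_j - 1) * prod_{A(i)\{k}} x_j,
        # which equals the (2/x_k)(prod^2 - prod) form whenever x_k != 0.
        grad = 4.0 * (x ** 2 - 1.0) * x
        for row in H:
            idx = np.flatnonzero(row)
            prod = np.prod(x[idx])
            for k in idx:
                grad[k] += 2.0 * (prod - 1.0) * np.prod(x[idx[idx != k]])
        return grad

    # Second step of the alternating minimization: s <- r - gamma * grad_h(H, r)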
@@ -725,42 +749,19 @@ of the objective function small.
 Otherwise, unwanted stationary points, including local minima, are introduced.
 The authors say that in practice, the value of $\gamma$ should be adjusted
 according to the decoding performance.
-The iterative decoding process \todo{projection with $\eta$} resulting from this considreation is shown in
-figure \ref{fig:prox:alg}.

-\begin{figure}[H]
-\centering
-
-\begin{genericAlgorithm}[caption={}, label={}]
-$\boldsymbol{s} \leftarrow \boldsymbol{0}$
-for $K$ iterations do
-$\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \nabla L \left( \boldsymbol{y} \mid \boldsymbol{s} \right) $
-$\boldsymbol{s} \leftarrow \boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right) $
-$\boldsymbol{\hat{x}} \leftarrow \text{sign}\left( \boldsymbol{s} \right) $
-if $\boldsymbol{H}\boldsymbol{\hat{c}} = \boldsymbol{0}$ do
-return $\boldsymbol{\hat{c}}$
-end if
-end for
-return $\boldsymbol{\hat{c}}$
-\end{genericAlgorithm}
-
-
-\caption{Proximal decoding algorithm}
-\label{fig:prox:alg}
-\end{figure}
-
-The components of the gradient of the code-constraint polynomial can be computed as follows:%
-%
-\begin{align*}
-\frac{\partial}{\partial x_k} h\left( \boldsymbol{x} \right) =
-4\left( x_k^2 - 1 \right) x_k + \frac{2}{x_k}
-\sum_{i\in \mathcal{B}\left( k \right) } \left(
-\left( \prod_{j\in\mathcal{A}\left( i \right)} x_j\right)^2
-- \prod_{j\in\mathcal{A}\left( i \right) }x_j \right)
-.\end{align*}%
-\todo{Only multiplication?}%
-\todo{$x_k$: $k$ or some other indexing variable?}%
-%
+%The components of the gradient of the code-constraint polynomial can be computed as follows:%
+%%
+%\begin{align*}
+% \frac{\partial}{\partial x_k} h\left( \boldsymbol{x} \right) =
+% 4\left( x_k^2 - 1 \right) x_k + \frac{2}{x_k}
+% \sum_{i\in \mathcal{B}\left( k \right) } \left(
+% \left( \prod_{j\in\mathcal{A}\left( i \right)} x_j\right)^2
+% - \prod_{j\in\mathcal{A}\left( i \right) }x_j \right)
+%.\end{align*}%
+%\todo{Only multiplication?}%
+%\todo{$x_k$: $k$ or some other indexing variable?}%
+%%
 In the case of \ac{AWGN}, the likelihood
 $f_{\boldsymbol{Y} \mid \boldsymbol{X}}\left( \boldsymbol{y} \mid \boldsymbol{x} \right)$
 is%
@@ -778,12 +779,50 @@ it suffices to consider only the proportionality instead of the equality.}%
 \nabla L \left( \boldsymbol{y} \mid \boldsymbol{x} \right)
 &\propto -\nabla \lVert \boldsymbol{y} - \boldsymbol{x} \rVert^2\\
 &\propto \boldsymbol{x} - \boldsymbol{y}
-.\end{align*}%
+,\end{align*}%
 %
-The resulting iterative decoding process under the assumption of \ac{AWGN} is
-described by%
+allowing equation \ref{eq:prox:step_log_likelihood} to be rewritten as%
 %
 \begin{align*}
-\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega\left( \boldsymbol{s}-\boldsymbol{y} \right)\\
-\boldsymbol{s} \leftarrow \boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right)
+\boldsymbol{r} \leftarrow \boldsymbol{s}
+- \omega \left( \boldsymbol{s} - \boldsymbol{y} \right)
 .\end{align*}
+%
+
+One thing to consider during the actual decoding process is that the gradient
+of the code-constraint polynomial can take on extremely large values.
+In order to avoid numerical instability, an additional step is added, where all
+components of the current estimate are clipped to $\left[-\eta, \eta \right]$,
+where $\eta$ is a positive constant slightly larger than one:%
+%
+\begin{align*}
+\boldsymbol{s} \leftarrow \Pi_{\eta} \left( \boldsymbol{r}
+- \gamma \nabla h\left( \boldsymbol{r} \right) \right)
+,\end{align*}
+%
+$\Pi_{\eta}\left( \cdot \right) $ expressing the projection onto
+$\left[ -\eta, \eta \right]^n$.
+
+The iterative decoding process resulting from these considerations is shown in
+figure \ref{fig:prox:alg}.
+
+\begin{figure}[H]
+\centering
+
+\begin{genericAlgorithm}[caption={}, label={}]
+$\boldsymbol{s} \leftarrow \boldsymbol{0}$
+for $K$ iterations do
+$\boldsymbol{r} \leftarrow \boldsymbol{s} - \omega \left( \boldsymbol{s} - \boldsymbol{y} \right) $
+$\boldsymbol{s} \leftarrow \Pi_\eta \left(\boldsymbol{r} - \gamma \nabla h\left( \boldsymbol{r} \right) \right)$
+$\boldsymbol{\hat{x}} \leftarrow \text{sign}\left( \boldsymbol{s} \right) $
+if $\boldsymbol{H}\boldsymbol{\hat{c}} = \boldsymbol{0}$ do
+return $\boldsymbol{\hat{c}}$
+end if
+end for
+return $\boldsymbol{\hat{c}}$
+\end{genericAlgorithm}
+
+
+\caption{Proximal decoding algorithm}
+\label{fig:prox:alg}
+\end{figure}
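Putting the pieces together, a Python sketch of the listing added above (fig:prox:alg) for the AWGN case might look as follows. It reuses grad_h from the earlier sketch, makes the bipolar-to-binary mapping $\hat{\boldsymbol{c}} = (1-\hat{\boldsymbol{x}})/2$ (consistent with $\boldsymbol{x} = (-1)^{\boldsymbol{c}}$) explicit where the listing leaves it implicit, and the values of omega, gamma, eta and K are illustrative rather than taken from \cite{proximal_paper}:

    import numpy as np

    def proximal_decode(H, y, K=100, omega=0.2, gamma=0.05, eta=1.5):
        # Alternates a gradient step on the negative log-likelihood with a
        # gradient step on gamma * h, clipped to [-eta, eta]^n for stability.
        s = np.zeros(H.shape[1])
        c_hat = np.zeros(H.shape[1], dtype=int)
        for _ in range(K):
            r = s - omega * (s - y)                            # likelihood step (AWGN)
            s = np.clip(r - gamma * grad_h(H, r), -eta, eta)   # Pi_eta projection
            x_hat = np.where(s >= 0, 1.0, -1.0)                # hard decision
            c_hat = ((1 - x_hat) / 2).astype(int)              # x = (-1)^c
            if not np.any(H @ c_hat % 2):                      # all checks satisfied
                break
        return c_hat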