Minor wording changes; Last edit before submitting for review

parent bfd0aeaf8b
commit 9edbbda163
@@ -259,7 +259,8 @@ binary vectors of length $d_j$ with even parity%
 parity-check $j$, but extended to the continuous domain.}%
 and $\boldsymbol{T}_j$ is the \textit{transfer matrix}, which selects the
 neighboring variable nodes
-of check node $j$ (i.e., the relevant components of $\boldsymbol{c}$ for parity-check $j$).
+of check node $j$ (i.e., the relevant components of $\tilde{\boldsymbol{c}}$
+for parity-check $j$).
 For example, if the $j$th row of the parity-check matrix
 $\boldsymbol{H}$ was $\boldsymbol{h}_j =
 \begin{bmatrix} 0 & 1 & 0 & 1 & 0 & 1 & 0 \end{bmatrix}$,
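The transfer-matrix construction in this hunk can be sketched in a few lines of NumPy. This is an illustrative reading of the text, not code from the thesis; the names `h_j`, `T_j`, and `c` mirror the symbols above, and `T_j` is built by keeping exactly the rows of the identity matrix that correspond to the nonzero entries of the parity-check row:

```python
import numpy as np

# Illustrative sketch (not from the thesis): build the transfer matrix T_j
# from the j-th parity-check row h_j. T_j is a d_j x n selection matrix
# that picks out the components of c belonging to check node j's neighbors.
h_j = np.array([0, 1, 0, 1, 0, 1, 0])
idx = np.flatnonzero(h_j)               # neighboring variable nodes: 1, 3, 5
T_j = np.eye(len(h_j), dtype=float)[idx]  # d_j x n selection matrix

c = np.array([0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4])
print(T_j @ c)  # the components of c relevant to parity-check j
```

Applying `T_j` to the example vector selects exactly the components at positions 1, 3, and 5, matching the nonzero pattern of `h_j`.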
@@ -676,7 +677,7 @@ correspond to the so-called \textit{pseudo-codewords} introduced in
 However, since for \ac{LDPC} codes $\overline{Q}$ scales linearly with $n$ instead of
 exponentially, it is a lot more tractable for practical applications.
 
-The resulting formulation of the relaxed optimization problem becomes:%
+The resulting formulation of the relaxed optimization problem becomes%
 %
 \begin{align}
 \begin{aligned}
@@ -805,7 +806,7 @@ $\boldsymbol{\lambda}_j = \mu \cdot \boldsymbol{u}_j \,\forall\,j\in\mathcal{J}$
 \tilde{c}_i &\leftarrow \frac{1}{\left| N_v\left( i \right) \right|} \left(
 \sum_{j\in N_v\left( i \right) } \Big( \left( \boldsymbol{z}_j \right)_i
 - \left( \boldsymbol{u}_j \right)_i \Big)
-- \gamma_i \right)
+- \frac{\gamma_i}{\mu} \right)
 \hspace{3mm} && \forall i\in\mathcal{I} \\
 \boldsymbol{z}_j &\leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
 \boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{u}_j \right)
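The two update rules in this hunk (the variable-node average and the projected check-node update) can be sketched as one decoder iteration. This is a hedged reading of the displayed equations, not the thesis's implementation: the names `H`, `gamma`, `mu` follow the text, but `proj_parity_polytope` below is a deliberately simplified stand-in (clipping to $[0,1]^{d_j}$) for the true projection $\Pi_{\mathcal{P}_{d_j}}$ onto the parity polytope, which is considerably more involved:

```python
import numpy as np

def proj_parity_polytope(v):
    # Placeholder: clip to the unit box. The real operator projects onto
    # the parity polytope P_{d_j}; this stand-in only keeps the sketch runnable.
    return np.clip(v, 0.0, 1.0)

def admm_iteration(c, z, u, H, gamma, mu):
    """One sketched ADMM decoder iteration following the update rules above."""
    m, n = H.shape
    rows = [np.flatnonzero(H[j]) for j in range(m)]  # neighbors of each check j
    # Variable-node update: average the (z_j - u_j) messages, minus gamma_i/mu.
    for i in range(n):
        checks = [j for j in range(m) if i in rows[j]]  # N_v(i)
        acc = 0.0
        for j in checks:
            k = list(rows[j]).index(i)  # position of i among check j's neighbors
            acc += z[j][k] - u[j][k]
        c[i] = (acc - gamma[i] / mu) / len(checks)
    # Check-node update: project T_j c + u_j for every check j.
    for j in range(m):
        z[j] = proj_parity_polytope(c[rows[j]] + u[j])
    return c, z
```

The per-check slicing `c[rows[j]]` plays the role of the transfer matrix $\boldsymbol{T}_j$ from earlier in the section.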
@@ -898,8 +899,7 @@ In order to derive the objective function, the authors begin with the
 material difference in the meaning of the rule.
 The only change is that what previously were \acp{PMF} now have to be expressed
 in terms of \acp{PDF}.}
-over $\boldsymbol{x}$
-:%
+over $\boldsymbol{x}$:%
 %
 \begin{align}
 \hat{\boldsymbol{x}} = \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
@@ -1015,7 +1015,7 @@ descent:%
 %
 For the second step, minimizing the scaled code-constraint polynomial, the
 proximal gradient method is used and the \textit{proximal operator} of
-$\gamma h\left( \boldsymbol{x} \right) $ has to be computed.
+$\gamma h\left( \tilde{\boldsymbol{x}} \right) $ has to be computed.
 It is then immediately approximated with gradient-descent:%
 %
 \begin{align*}
@@ -1037,7 +1037,7 @@ The second step thus becomes%
 While the approximation of the prior \ac{PDF} made in equation (\ref{eq:prox:prior_pdf_approx})
 theoretically becomes better
 with larger $\gamma$, the constraint that $\gamma$ be small is important,
-as it keeps the effect of $h\left( \boldsymbol{x} \right) $ on the landscape
+as it keeps the effect of $h\left( \tilde{\boldsymbol{x}} \right) $ on the landscape
 of the objective function small.
 Otherwise, unwanted stationary points, including local minima, are introduced.
 The authors say that ``in practice, the value of $\gamma$ should be adjusted
@@ -1079,7 +1079,7 @@ it suffices to consider only proportionality instead of equality.}%
 &\propto \tilde{\boldsymbol{x}} - \boldsymbol{y}
 ,\end{align*}%
 %
-allowing equation \ref{eq:prox:step_log_likelihood} to be rewritten as%
+allowing equation (\ref{eq:prox:step_log_likelihood}) to be rewritten as%
 %
 \begin{align*}
 \boldsymbol{r} \leftarrow \boldsymbol{s}
@@ -203,7 +203,7 @@ by computing \cite[Sec. 2.1]{admm_distr_stats}%
 %
 
 The dual problem can then be solved iteratively using \textit{dual ascent}: starting with an
-initial estimate of $\boldsymbol{\lambda}$, calculate an estimate for $\boldsymbol{x}$
+initial estimate for $\boldsymbol{\lambda}$, calculate an estimate for $\boldsymbol{x}$
 using equation (\ref{eq:theo:admm_obtain_primal}); then, update $\boldsymbol{\lambda}$
 using gradient descent \cite[Sec. 2.1]{admm_distr_stats}:%
 %
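The dual-ascent loop described in this hunk can be illustrated on a toy equality-constrained quadratic. The problem choice below is mine, not the thesis's: minimize $\frac{1}{2}\lVert x - c\rVert^2$ subject to $Ax = b$, where the Lagrangian minimizer in $x$ is available in closed form and plays the role of the primal step from equation (\ref{eq:theo:admm_obtain_primal}), followed by the gradient step on $\boldsymbol{\lambda}$ with step size $\alpha > 0$:

```python
import numpy as np

# Toy dual ascent (illustrative, not from the thesis):
#   minimize 1/2 ||x - c||^2  subject to  Ax = b
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([3.0, -1.0])

lam = np.zeros(1)  # initial estimate for the dual variable lambda
alpha = 0.5        # step size, alpha > 0
for _ in range(200):
    x = c - A.T @ lam                 # primal step: argmin_x L(x, lam)
    lam = lam + alpha * (A @ x - b)   # dual step: gradient ascent on lambda

print(x, A @ x)  # x approaches the constrained optimum, Ax approaches b
```

With these numbers the dual update is a contraction, so the residual $Ax - b$ shrinks to zero and $x$ settles at the projection-like constrained optimum.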
@@ -215,7 +215,7 @@ using gradient descent \cite[Sec. 2.1]{admm_distr_stats}:%
 \hspace{5mm} \alpha > 0
 .\end{align*}
 %
-The algorithm can be improved by observing that when hen the objective function is separable in $\boldsymbol{x}$, the lagrangian is as well:
+The algorithm can be improved by observing that when the objective function is separable in $\boldsymbol{x}$, the lagrangian is as well:
 %
 \begin{align*}
 \text{minimize }\hspace{5mm} & \sum_{i=1}^{N} g_i\left( \boldsymbol{x}_i \right) \\
@@ -246,7 +246,7 @@ This modified version of dual ascent is called \textit{dual decomposition}:
 
 The \ac{ADMM} works the same way as dual decomposition.
 It only differs in the use of an \textit{augmented lagrangian}
-$\mathcal{L}_\mu\left( \boldsymbol{x}_{[1:N]} \boldsymbol{\lambda} \right)$
+$\mathcal{L}_\mu\left( \boldsymbol{x}_{[1:N]}, \boldsymbol{\lambda} \right)$
 in order to robustify the convergence properties.
 The augmented lagrangian extends the ordinary one with an additional penalty term
 with the penalty parameter $\mu$:
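The relationship between the ordinary and the augmented Lagrangian described in this hunk can be made concrete. The form below is the standard one from the cited ADMM literature (notation and example values are mine): the augmented Lagrangian adds the quadratic penalty $\frac{\mu}{2}\lVert Ax - b\rVert^2$, which vanishes on feasible points, so the two coincide wherever $Ax = b$ holds:

```python
import numpy as np

# Ordinary vs. augmented Lagrangian for equality constraints Ax = b
# (standard form; illustrative values, not from the thesis):
#   L(x, lam)    = f(x) + lam^T (Ax - b)
#   L_mu(x, lam) = L(x, lam) + (mu/2) ||Ax - b||^2
def lagrangian(f, A, b, x, lam):
    return f(x) + lam @ (A @ x - b)

def augmented_lagrangian(f, A, b, x, lam, mu):
    r = A @ x - b  # constraint residual; penalty term is (mu/2) ||r||^2
    return lagrangian(f, A, b, x, lam) + 0.5 * mu * (r @ r)

A = np.array([[1.0, 1.0]])
b = np.array([2.0])
f = lambda x: 0.5 * x @ x
lam = np.array([0.3])
x_feasible = np.array([1.0, 1.0])    # satisfies Ax = b: penalty contributes nothing
x_infeasible = np.array([2.0, 1.0])  # violates Ax = b: penalty term kicks in
```

On the feasible point both functions agree exactly; on the infeasible point the augmented version is strictly larger for any $\mu > 0$, which is what makes the minimization better conditioned.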