Minor wording changes; Last edit before submitting for review

Andreas Tsouchlos 2023-03-23 13:12:09 +01:00
parent bfd0aeaf8b
commit 9edbbda163
2 changed files with 11 additions and 11 deletions

View File

@@ -259,7 +259,8 @@ binary vectors of length $d_j$ with even parity%
 parity-check $j$, but extended to the continuous domain.}%
 and $\boldsymbol{T}_j$ is the \textit{transfer matrix}, which selects the
 neighboring variable nodes
-of check node $j$ (i.e., the relevant components of $\boldsymbol{c}$ for parity-check $j$).
+of check node $j$ (i.e., the relevant components of $\tilde{\boldsymbol{c}}$
+for parity-check $j$).
 For example, if the $j$th row of the parity-check matrix
 $\boldsymbol{H}$ was $\boldsymbol{h}_j =
 \begin{bmatrix} 0 & 1 & 0 & 1 & 0 & 1 & 0 \end{bmatrix}$,
@@ -676,7 +677,7 @@ correspond to the so-called \textit{pseudo-codewords} introduced in
 However, since for \ac{LDPC} codes $\overline{Q}$ scales linearly with $n$ instead of
 exponentially, it is a lot more tractable for practical applications.
-The resulting formulation of the relaxed optimization problem becomes:%
+The resulting formulation of the relaxed optimization problem becomes%
 %
 \begin{align}
 \begin{aligned}
@@ -805,7 +806,7 @@ $\boldsymbol{\lambda}_j = \mu \cdot \boldsymbol{u}_j \,\forall\,j\in\mathcal{J}$
 \tilde{c}_i &\leftarrow \frac{1}{\left| N_v\left( i \right) \right|} \left(
 \sum_{j\in N_v\left( i \right) } \Big( \left( \boldsymbol{z}_j \right)_i
 - \left( \boldsymbol{u}_j \right)_i \Big)
-- \gamma_i \right)
+- \frac{\gamma_i}{\mu} \right)
 \hspace{3mm} && \forall i\in\mathcal{I} \\
 \boldsymbol{z}_j &\leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
 \boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{u}_j \right)
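The hunk above changes the LLR term in the variable-node update from $\gamma_i$ to $\gamma_i/\mu$, matching the scaled-dual form of the ADMM in which the penalty parameter $\mu$ divides the linear term. A minimal Python sketch of that corrected update (the function name and the adjacency encoding `Nv` are illustrative, not taken from the thesis sources):

```python
import numpy as np

def variable_node_update(z, u, gamma, mu, Nv):
    """One ADMM x-update for LDPC decoding (scaled dual form).

    z, u  : dicts mapping check index j -> current z_j / u_j vectors
    gamma : per-bit LLRs, shape (n,)
    mu    : ADMM penalty parameter
    Nv    : Nv[i] = list of (j, k) pairs, meaning bit i is the k-th
            neighbor of check j (illustrative adjacency encoding)
    """
    n = len(gamma)
    c = np.zeros(n)
    for i in range(n):
        acc = sum(z[j][k] - u[j][k] for (j, k) in Nv[i])
        # gamma_i / mu: in the scaled form the LLR term is divided
        # by the penalty parameter -- the correction in this commit
        c[i] = (acc - gamma[i] / mu) / len(Nv[i])
    return c
```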
@@ -898,8 +899,7 @@ In order to derive the objective function, the authors begin with the
 material difference in the meaning of the rule.
 The only change is that what previously were \acp{PMF} now have to be expressed
 in terms of \acp{PDF}.}
-over $\boldsymbol{x}$
-:%
+over $\boldsymbol{x}$:%
 %
 \begin{align}
 \hat{\boldsymbol{x}} = \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
@@ -1015,7 +1015,7 @@ descent:%
 %
 For the second step, minimizing the scaled code-constraint polynomial, the
 proximal gradient method is used and the \textit{proximal operator} of
-$\gamma h\left( \boldsymbol{x} \right) $ has to be computed.
+$\gamma h\left( \tilde{\boldsymbol{x}} \right) $ has to be computed.
 It is then immediately approximated with gradient-descent:%
 %
 \begin{align*}
@@ -1037,7 +1037,7 @@ The second step thus becomes%
 While the approximation of the prior \ac{PDF} made in equation (\ref{eq:prox:prior_pdf_approx})
 theoretically becomes better
 with larger $\gamma$, the constraint that $\gamma$ be small is important,
-as it keeps the effect of $h\left( \boldsymbol{x} \right) $ on the landscape
+as it keeps the effect of $h\left( \tilde{\boldsymbol{x}} \right) $ on the landscape
 of the objective function small.
 Otherwise, unwanted stationary points, including local minima, are introduced.
 The authors say that ``in practice, the value of $\gamma$ should be adjusted
@@ -1079,7 +1079,7 @@ it suffices to consider only proportionality instead of equality.}%
 &\propto \tilde{\boldsymbol{x}} - \boldsymbol{y}
 ,\end{align*}%
 %
-allowing equation \ref{eq:prox:step_log_likelihood} to be rewritten as%
+allowing equation (\ref{eq:prox:step_log_likelihood}) to be rewritten as%
 %
 \begin{align*}
 \boldsymbol{r} \leftarrow \boldsymbol{s}
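The hunks in this file concern the proximal decoding scheme, in which the proximal operator of $\gamma h$ is immediately approximated by a single gradient step. One iteration of that approximated proximal-gradient scheme can be sketched as follows (a generic illustration under that approximation, not code from the thesis sources):

```python
import numpy as np

def prox_gradient_step(x, grad_f, grad_h, step, gamma):
    """One proximal-gradient iteration where prox of gamma*h is
    approximated by a single gradient step, as described in the text.
    grad_f / grad_h: gradients of the smooth part and of h."""
    v = x - step * grad_f(x)      # gradient step on the smooth part
    return v - gamma * grad_h(v)  # approximate prox of gamma*h
```

Keeping `gamma` small limits the influence of $h$ on the objective landscape, consistent with the remark about unwanted stationary points above.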

View File

@@ -203,7 +203,7 @@ by computing \cite[Sec. 2.1]{admm_distr_stats}%
 %
 The dual problem can then be solved iteratively using \textit{dual ascent}: starting with an
-initial estimate of $\boldsymbol{\lambda}$, calculate an estimate for $\boldsymbol{x}$
+initial estimate for $\boldsymbol{\lambda}$, calculate an estimate for $\boldsymbol{x}$
 using equation (\ref{eq:theo:admm_obtain_primal}); then, update $\boldsymbol{\lambda}$
 using gradient descent \cite[Sec. 2.1]{admm_distr_stats}:%
 %
@@ -215,7 +215,7 @@ using gradient descent \cite[Sec. 2.1]{admm_distr_stats}:%
 \hspace{5mm} \alpha > 0
 .\end{align*}
 %
-The algorithm can be improved by observing that when hen the objective function is separable in $\boldsymbol{x}$, the lagrangian is as well:
+The algorithm can be improved by observing that when the objective function is separable in $\boldsymbol{x}$, the lagrangian is as well:
 %
 \begin{align*}
 \text{minimize }\hspace{5mm} & \sum_{i=1}^{N} g_i\left( \boldsymbol{x}_i \right) \\
@@ -246,7 +246,7 @@ This modified version of dual ascent is called \textit{dual decomposition}:
 The \ac{ADMM} works the same way as dual decomposition.
 It only differs in the use of an \textit{augmented lagrangian}
-$\mathcal{L}_\mu\left( \boldsymbol{x}_{[1:N]} \boldsymbol{\lambda} \right)$
+$\mathcal{L}_\mu\left( \boldsymbol{x}_{[1:N]}, \boldsymbol{\lambda} \right)$
 in order to robustify the convergence properties.
 The augmented lagrangian extends the ordinary one with an additional penalty term
 with the penaly parameter $\mu$:
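The scheme this file describes, dual ascent on an augmented Lagrangian with penalty parameter $\mu$, can be sketched on a small equality-constrained quadratic. This is a generic illustration (the function name and the closed-form x-update are assumptions for the quadratic case, not code from the thesis sources): minimize $\frac{1}{2}\|\boldsymbol{x}\|^2 + \boldsymbol{q}^\top\boldsymbol{x}$ subject to $\boldsymbol{Ax} = \boldsymbol{b}$, alternating an exact x-minimization with a multiplier step of size $\mu$.

```python
import numpy as np

def augmented_dual_ascent(q, A, b, mu=1.0, iters=100):
    """Dual ascent on the augmented Lagrangian
        L_mu(x, lam) = 0.5*||x||^2 + q.x + lam.(Ax - b)
                       + (mu/2)*||Ax - b||^2.
    The x-update is the exact minimizer (quadratic objective);
    the multiplier update uses step size mu."""
    lam = np.zeros(b.size)
    K = np.eye(q.size) + mu * A.T @ A      # Hessian of L_mu in x
    for _ in range(iters):
        x = np.linalg.solve(K, -(q + A.T @ (lam - mu * b)))
        lam = lam + mu * (A @ x - b)       # multiplier (dual) update
    return x, lam
```

For instance, minimizing $\frac{1}{2}\|\boldsymbol{x}\|^2$ subject to $x_1 + x_2 = 1$ converges to $\boldsymbol{x} = (0.5, 0.5)$ with multiplier $\lambda = -0.5$, and the penalty term is what makes each multiplier step contractive.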