Fixed most TODOs in decoding techniques section

parent 8cf2e4e6d7
commit ad4dfc6325
@@ -196,12 +196,10 @@ of the \acp{LLR} $\gamma_i$ \cite[Sec. 2.5]{feldman_thesis}:%
 \hat{\boldsymbol{c}}_{\text{\ac{ML}}} = \argmin_{\boldsymbol{c}\in\mathcal{C}}
 \sum_{i=1}^{n} \gamma_i c_i,%
 \hspace{5mm} \gamma_i = \ln\left(
-\frac{f_{Y_i | C_i} \left( y_i \mid C_i = 0 \right) }
-{f_{Y_i | C_i} \left( y_i \mid C_i = 1 \right) } \right)
+\frac{f_{Y_i | C_i} \left( y_i \mid c_i = 0 \right) }
+{f_{Y_i | C_i} \left( y_i \mid c_i = 1 \right) } \right)
 .\end{align*}
 %
-\todo{$C_i$ or $c_i$?}%
-%
 The authors propose the following cost function%
 \footnote{In this context, \textit{cost function} and \textit{objective function}
 have the same meaning.}
@@ -209,15 +207,14 @@ for the \ac{LP} decoding problem:%
 %
 \begin{align*}
 g\left( \boldsymbol{c} \right) = \sum_{i=1}^{n} \gamma_i c_i
+= \boldsymbol{\gamma}^\text{T}\boldsymbol{c}
 .\end{align*}
 %
-\todo{Write as dot product}
-%
 With this cost function, the exact integer linear program formulation of \ac{ML}
-decoding is the following:%
+decoding becomes the following:%
 %
 \begin{align*}
-\text{minimize }\hspace{2mm} &\sum_{i=1}^{n} \gamma_i c_i \\
+\text{minimize }\hspace{2mm} & \boldsymbol{\gamma}^\text{T}\boldsymbol{c} \\
 \text{subject to }\hspace{2mm} &\boldsymbol{c} \in \mathcal{C}
 .\end{align*}%
 %
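The dot-product form $\boldsymbol{\gamma}^\text{T}\boldsymbol{c}$ introduced by this hunk lends itself to a compact numerical sketch. The following is illustrative only; the (3,1) repetition code, the AWGN channel with BPSK mapping $0 \mapsto +1$, and all parameter values are assumptions chosen for the example, not taken from the thesis.

```python
import numpy as np

# All codewords of a toy (3,1) repetition code (assumed for illustration).
codebook = np.array([[0, 0, 0], [1, 1, 1]])

def llr_awgn(y, sigma=0.8):
    # LLR gamma_i = ln f(y_i | c_i = 0) / f(y_i | c_i = 1) for BPSK (0 -> +1,
    # 1 -> -1) over an AWGN channel reduces to gamma_i = 2 * y_i / sigma^2.
    return 2.0 * y / sigma**2

def ml_decode(y):
    gamma = llr_awgn(y)
    costs = codebook @ gamma  # gamma^T c evaluated for every codeword at once
    return codebook[np.argmin(costs)]

print(ml_decode(np.array([0.9, -0.2, 1.1])))  # -> [0 0 0]
```

Brute-force enumeration is only feasible for tiny codes; the point here is that the ML rule is exactly the minimization of the linear cost $\boldsymbol{\gamma}^\text{T}\boldsymbol{c}$ over $\mathcal{C}$.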
@@ -234,13 +231,11 @@ decoding, redefining the constraints in terms of the \text{codeword polytope}
 %
 \begin{align*}
 \text{poly}\left( \mathcal{C} \right) = \left\{
-\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} \boldsymbol{c}
-\text{ : } \lambda_{\boldsymbol{c}} \ge 0,
-\sum_{\boldsymbol{c} \in \mathcal{C}} \lambda_{\boldsymbol{c}} = 1 \right\}
+\sum_{\boldsymbol{c} \in \mathcal{C}} \alpha_{\boldsymbol{c}} \boldsymbol{c}
+\text{ : } \alpha_{\boldsymbol{c}} \ge 0,
+\sum_{\boldsymbol{c} \in \mathcal{C}} \alpha_{\boldsymbol{c}} = 1 \right\}
 ,\end{align*} %
 %
-\todo{$\lambda$ might be confusing here}%
-%
 which represents the \textit{convex hull} of all possible codewords,
 i.e., the convex set of linear combinations of all codewords.
 This corresponds to simply lifting the integer requirement.
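The convex-combination definition of $\text{poly}(\mathcal{C})$ can be made concrete with a small numerical sketch; the (3,2) single parity-check code and the weights $\alpha_{\boldsymbol{c}}$ below are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical example: the four codewords of the (3,2) single parity-check code.
codewords = np.array([[0, 0, 0], [1, 1, 0], [0, 1, 1], [1, 0, 1]])

# Convex weights alpha_c >= 0 summing to 1, as in the definition of poly(C).
alpha = np.array([0.5, 0.25, 0.25, 0.0])
assert np.all(alpha >= 0) and np.isclose(alpha.sum(), 1.0)

# The resulting point lies in the codeword polytope but is not itself a
# codeword -- exactly the kind of fractional point the relaxation admits.
point = alpha @ codewords
print(point)  # -> [0.25 0.5  0.25]
```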
@@ -685,12 +680,11 @@ The resulting formulation of the relaxed optimization problem becomes:%
 %
 \begin{align}
 \begin{aligned}
-\text{minimize }\hspace{2mm} &\sum_{i=1}^{n} \gamma_i \tilde{c}_i \\
+\text{minimize }\hspace{2mm} & \boldsymbol{\gamma}^\text{T}\tilde{\boldsymbol{c}} \\
 \text{subject to }\hspace{2mm} &\boldsymbol{T}_j \tilde{\boldsymbol{c}} \in \mathcal{P}_{d_j}
 \hspace{5mm}\forall j\in\mathcal{J}.
 \end{aligned} \label{eq:lp:relaxed_formulation}
 \end{align}%
-\todo{Rewrite sum as dot product}
 \todo{Space before $\forall$?}
 
 
@@ -703,8 +697,7 @@ is a very general one that can be solved with a number of different optimization
 In this work \ac{ADMM} is examined, as its distributed nature allows for a very efficient
 implementation.
 \ac{LP} decoding using \ac{ADMM} can be regarded as a message
-passing algorithm with two separate update steps that can be performed
-simulatneously;
+passing algorithm with separate variable- and check-node update steps;
 the resulting algorithm has a striking similarity to \ac{BP} and its computational
 complexity has been demonstrated to compare favorably to \ac{BP} \cite{original_admm},
 \cite{efficient_lp_dec_admm}.
@@ -778,15 +771,25 @@ The steps to solve the dual problem then become:
 Luckily, the additional constaints only affect the $\boldsymbol{z}_j$-update steps.
 Furthermore, the $\boldsymbol{z}_j$-update steps can be shown to be equivalent to projections
 onto the check polytopes $\mathcal{P}_{d_j}$ \cite[Sec. III. B.]{original_admm}
-and the $\tilde{\boldsymbol{c}}$-update can be computed analytically \cite[Sec. III.]{lautern}:%
+and the $\tilde{\boldsymbol{c}}$-update can be computed analytically%
+%
+\footnote{In the $\tilde{c}_i$-update rule, the term
+$\left( \boldsymbol{z}_j \right)_i$ is a slight abuse of notation, as
+$\boldsymbol{z}_j$ has less components than there are variable-nodes $i$.
+What is actually meant is the component of $\boldsymbol{z}_j$ that is associated
+with the variable node $i$, i.e., $\left( \boldsymbol{T}_j^\text{T}\boldsymbol{z}_j\right)_i$.
+The same is true for $\left( \boldsymbol{\lambda}_j \right)_i$.}
+%
+\cite[Sec. III.]{lautern}:%
 %
 \begin{alignat*}{3}
 \tilde{c}_i &\leftarrow \frac{1}{\left| N_v\left( i \right) \right|} \left(
-\sum_{j\in N_v\left( i \right) } \Big( \left( \boldsymbol{\lambda}_j \right)_i
-- \left( \boldsymbol{z}_j \right)_i \Big) - \frac{\gamma_i}{\mu} \right)
+\sum_{j\in N_v\left( i \right) } \Big( \left( \boldsymbol{z}_j \right)_i
+- \frac{1}{\mu} \left( \boldsymbol{\lambda}_j \right)_i \Big)
+- \frac{\gamma_i}{\mu} \right)
 \hspace{3mm} && \forall i\in\mathcal{I} \\
 \boldsymbol{z}_j &\leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
-\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{\lambda}_j \right)
+\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \frac{\boldsymbol{\lambda}_j}{\mu} \right)
 \hspace{3mm} && \forall j\in\mathcal{J} \\
 \boldsymbol{\lambda}_j &\leftarrow \boldsymbol{\lambda}_j
 + \mu\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}}
@@ -794,9 +797,7 @@ and the $\tilde{\boldsymbol{c}}$-update can be computed analytically \cite[Sec.
 \hspace{3mm} && \forall j\in\mathcal{J}
 .\end{alignat*}
 %
-\todo{$\tilde{c}_i$-update With or without projection onto $\left[ 0, 1 \right] ^n$?}
-%
-One thing to note is that all of the $\boldsymbol{z}_j$-updates can be computed simultaneously,
+It should be noted that all of the $\boldsymbol{z}_j$-updates can be computed simultaneously,
 as they are independent of one another.
 The same is true for the updates of the individual components of $\tilde{\boldsymbol{c}}$.
 
@@ -809,13 +810,7 @@ Effectively, all of the $\left|\mathcal{J}\right|$ parity constraints are
 able to be handled at the same time.
 This can also be understood by interpreting the decoding process as a message-passing
 algorithm \cite[Sec. III. D.]{original_admm}, \cite[Sec. II. B.]{efficient_lp_dec_admm},
-as is shown in figure \ref{fig:lp:message_passing}
-\footnote{$\epsilon_{\text{pri}} > 0$ and $\epsilon_{\text{dual}} > 0$ are additional parameters
-defining the tolerances for the stopping criteria of the algorithm.
-The variable $\boldsymbol{z}_j^\prime$ denotes the value of
-$\boldsymbol{z}_j$ in the previous iteration.}%
-\todo{Move footnote to figure caption}%
-.%
+as is shown in figure \ref{fig:lp:message_passing}.%
 \todo{Explicitly specify sections?}%
 %
 \begin{figure}[H]
@@ -828,26 +823,35 @@ Initialize $\tilde{\boldsymbol{c}}, \boldsymbol{z}_{[1:m]}$ and $\boldsymbol{\la
 while $\sum_{j\in\mathcal{J}} \lVert \boldsymbol{T}_j\tilde{\boldsymbol{c}} - \boldsymbol{z}_j \rVert_2 \ge \epsilon_{\text{pri}}$ or $\sum_{j\in\mathcal{J}} \lVert \boldsymbol{z}^\prime_j - \boldsymbol{z}_j \rVert_2 \ge \epsilon_{\text{dual}}$ do
 for all $j$ in $\mathcal{J}$ do
 $\boldsymbol{z}_j \leftarrow \Pi_{\mathcal{P}_{d_j}}\left(
-\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{\lambda}_j \right)$
+\boldsymbol{T}_j\tilde{\boldsymbol{c}} + \frac{\boldsymbol{\lambda}_j}{\mu} \right)$
 $\boldsymbol{\lambda}_j \leftarrow \boldsymbol{\lambda}_j
 + \mu\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}}
 - \boldsymbol{z}_j \right)$
 end for
 for all $i$ in $\mathcal{I}$ do
 $\tilde{c}_i \leftarrow \frac{1}{\left| N_v\left( i \right) \right|} \left(
-\sum_{j\in N_v\left( i \right) } \Big( \left( \boldsymbol{\lambda}_j \right)_i
-- \left( \boldsymbol{z}_j \right)_i \Big) - \frac{\gamma_i}{\mu} \right)$
+\sum_{j\in N_v\left( i \right) } \Big(
+\left( \boldsymbol{z}_j \right)_i - \frac{1}{\mu} \left( \boldsymbol{\lambda}_j
+\right)_i
+\Big) - \frac{\gamma_i}{\mu} \right)$
 end for
 end while
 \end{genericAlgorithm}
 
-\caption{\ac{LP} decoding using \ac{ADMM} interpreted as a message passing algorithm}
+\caption{\ac{LP} decoding using \ac{ADMM} interpreted as a message passing algorithm%
+\protect\footnotemark{}}
 \label{fig:lp:message_passing}
 \end{figure}%
 %
-\noindent The $\tilde{c}_i$-updates can be understood as a variable-node update step,
-and the $\boldsymbol{z}_j$- and $\boldsymbol{\lambda}_j$-updates can be understood as
-a check-node update step.
+\footnotetext{$\epsilon_{\text{pri}} > 0$ and $\epsilon_{\text{dual}} > 0$
+are additional parameters
+defining the tolerances for the stopping criteria of the algorithm.
+The variable $\boldsymbol{z}_j^\prime$ denotes the value of
+$\boldsymbol{z}_j$ in the previous iteration.}%
+%
+\noindent The $\boldsymbol{z}_j$- and $\boldsymbol{\lambda}_j$-updates can be understood as
+a check-node update step (lines $3$-$6$) and the $\tilde{c}_i$-updates can be understood as
+a variable-node update step (lines $7$-$9$ in figure \ref{fig:lp:message_passing}).
 The updates for each variable- and check-node can be perfomed in parallel.
 With this interpretation it becomes clear why \ac{LP} decoding using \ac{ADMM}
 is able to achieve similar computational complexity to \ac{BP}.
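The message-passing structure of the algorithm in the figure can be sketched in Python. The projection $\Pi_{\mathcal{P}_{d_j}}$ onto the check polytope is nontrivial and is replaced here by simple clipping to the unit box, and a final clipping of $\tilde{\boldsymbol{c}}$ to $[0,1]^n$ is an added assumption, so this sketch illustrates only the update order and data flow, not a working LP decoder.

```python
import numpy as np

def admm_decode_sketch(gamma, checks, mu=3.0, n_iter=50):
    """Structural sketch of ADMM LP decoding as message passing.

    checks[j] lists the variable nodes taking part in check j, so that
    c[checks[j]] plays the role of T_j c~. Clipping to [0, 1] stands in
    for the true projection onto the check polytope P_dj (a simplifying
    assumption made to keep this sketch short).
    """
    n = len(gamma)
    c = np.full(n, 0.5)
    lam = [np.zeros(len(idx)) for idx in checks]
    z = [c[idx] for idx in checks]
    deg = np.zeros(n)
    for idx in checks:
        deg[idx] += 1.0                      # |N_v(i)| for each variable node
    for _ in range(n_iter):
        for j, idx in enumerate(checks):     # check-node updates (parallelizable)
            z[j] = np.clip(c[idx] + lam[j] / mu, 0.0, 1.0)
            lam[j] = lam[j] + mu * (c[idx] - z[j])
        acc = np.zeros(n)                    # variable-node updates (parallelizable)
        for j, idx in enumerate(checks):
            acc[idx] += z[j] - lam[j] / mu
        c = np.clip((acc - gamma / mu) / deg, 0.0, 1.0)
    return (c > 0.5).astype(int)

# Toy run: 3 variables, two checks, LLRs all favoring bit 0.
print(admm_decode_sketch(np.array([4.0, 3.0, 5.0]), [[0, 1], [1, 2]]))  # -> [0 0 0]
```

The nested loops make the parallelism discussed in the text visible: every iteration of the inner `for j` and `for i` loops touches disjoint state and could run concurrently.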
@@ -948,17 +952,13 @@ $L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) = -\ln\left(
 %
 \begin{align*}
 \hat{\boldsymbol{x}} &= \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
 \mathrm{e}^{- L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) }
 \mathrm{e}^{-\gamma h\left( \tilde{\boldsymbol{x}} \right) } \\
-&= \argmin_{\tilde{\boldsymbol{x}} \in \mathbb{R}^n} \left(
+&= \argmin_{\tilde{\boldsymbol{x}} \in \mathbb{R}^n} \big(
 L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right)
 + \gamma h\left( \tilde{\boldsymbol{x}} \right)
-\right)%
+\big)%
 .\end{align*}%
-\todo{\textbackslash left($\cdot$ \textbackslash right)\\
-$\rightarrow$\\
-\textbackslash big( $\cdot$ \textbackslash big)\\
-?}%
 %
 Thus, with proximal decoding, the objective function
 $g\left( \tilde{\boldsymbol{x}} \right)$ considered is%
@@ -1011,7 +1011,6 @@ It is then immediately approximated with gradient-descent:%
 \hspace{5mm} \gamma > 0, \text{ small}
 .\end{align*}%
 %
-\todo{explicitly state $\nabla h$?}
 The second step thus becomes%
 %
 \begin{align*}
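The gradient-descent approximation in this hunk can be sketched as a single update step on the objective $L\left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) + \gamma h\left( \tilde{\boldsymbol{x}} \right)$. The AWGN log-likelihood gradient $(\tilde{\boldsymbol{x}} - \boldsymbol{y})/\sigma^2$ and the caller-supplied penalty gradient `grad_h` are assumptions for illustration, since $\nabla h$ is not stated explicitly here.

```python
import numpy as np

def proximal_gradient_step(x, y, grad_h, gamma=0.05, step=0.1, sigma=1.0):
    """One gradient-descent step on L(y|x) + gamma * h(x).

    Assumes an AWGN channel, for which the gradient of the negative
    log-likelihood L is (x - y) / sigma^2; grad_h is a placeholder for
    the gradient of the code-constraint penalty h.
    """
    grad = (x - y) / sigma**2 + gamma * grad_h(x)
    return x - step * grad

# With a zero penalty gradient, the iterate simply moves toward y.
x = proximal_gradient_step(np.zeros(2), np.array([1.0, -1.0]),
                           lambda x: np.zeros_like(x))
print(x)  # -> [ 0.1 -0.1]
```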