Fixed tilde over x; wrote convergence properties subsection; minor changes
parent 9c9aa11669, commit 2312f40d94
@ -270,28 +270,29 @@ The gradient of the code-constraint polynomial \cite[Sec. 2.3]{proximal_paper}
is given by%
%
\begin{align*}
	\nabla h\left( \tilde{\boldsymbol{x}} \right) &= \begin{bmatrix}
		\frac{\partial}{\partial \tilde{x}_1}h\left( \tilde{\boldsymbol{x}} \right) &
		\ldots &
		\frac{\partial}{\partial \tilde{x}_n}h\left( \tilde{\boldsymbol{x}} \right)
	\end{bmatrix}^\text{T}, \\[1em]
	\frac{\partial}{\partial \tilde{x}_k}h\left( \tilde{\boldsymbol{x}} \right)
	&= 4\left( \tilde{x}_k^2 - 1 \right) \tilde{x}_k
	+ \frac{2}{\tilde{x}_k} \sum_{j\in N_v\left( k \right) }\left(
	\left( \prod_{i \in N_c\left( j \right)} \tilde{x}_i \right)^2
	- \prod_{i\in N_c\left( j \right) } \tilde{x}_i \right)
.\end{align*}
%
Since the products
$\prod_{i\in N_c\left( j \right) } \tilde{x}_i,\hspace{2mm}j\in \mathcal{J}$
are the same for all components $\tilde{x}_k$ of $\tilde{\boldsymbol{x}}$, they can be
precomputed.
Defining%
%
\begin{align*}
	\boldsymbol{p} := \begin{bmatrix}
		\prod_{i\in N_c\left( 1 \right) } \tilde{x}_i \\
		\vdots \\
		\prod_{i\in N_c\left( m \right) } \tilde{x}_i \\
	\end{bmatrix}
	\hspace{5mm}
	\text{and}
@ -302,9 +303,9 @@ Defining%
the gradient can be written as%
%
\begin{align*}
	\nabla h\left( \tilde{\boldsymbol{x}} \right) =
	4\left( \tilde{\boldsymbol{x}}^{\circ 3} - \tilde{\boldsymbol{x}} \right)
	+ 2\tilde{\boldsymbol{x}}^{\circ -1} \circ \boldsymbol{H}^\text{T}
	\boldsymbol{v}
,\end{align*}
%
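As a quick sanity check, the componentwise partial derivatives and the vectorized expression can be compared numerically. This is only a sketch: the small parity-check matrix is a made-up example, and the choice $\boldsymbol{v} = \boldsymbol{p}^{\circ 2} - \boldsymbol{p}$ is read off from the componentwise formula, since the definition of $\boldsymbol{v}$ is not repeated here.

```python
import numpy as np

# Hypothetical small parity-check matrix (two checks, three variable nodes).
H = np.array([[1, 1, 0],
              [0, 1, 1]])

def grad_h_componentwise(x):
    """Componentwise gradient of the code-constraint polynomial h."""
    g = np.empty(len(x))
    for k in range(len(x)):
        acc = 0.0
        for j in range(H.shape[0]):
            if H[j, k]:                      # j in N_v(k)
                p_j = np.prod(x[H[j] == 1])  # product over i in N_c(j)
                acc += p_j**2 - p_j
        g[k] = 4 * (x[k]**2 - 1) * x[k] + (2 / x[k]) * acc
    return g

def grad_h_vectorized(x):
    """Vectorized form 4(x^{o3} - x) + 2 x^{o-1} o H^T v with v := p^{o2} - p
    (v inferred from the componentwise expression)."""
    # p_j = product of x_i over i in N_c(j); absent factors are replaced by 1
    p = np.prod(np.where(H == 1, x, 1.0), axis=1)
    v = p**2 - p
    return 4 * (x**3 - x) + 2 / x * (H.T @ v)

x = np.array([0.9, -0.8, 1.1])
assert np.allclose(grad_h_componentwise(x), grad_h_vectorized(x))
# At a valid codeword (all-ones in bipolar representation) the constraint
# gradient vanishes:
assert np.allclose(grad_h_vectorized(np.ones(3)), 0.0)
```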
@ -331,7 +332,7 @@ The impact of the parameters $\gamma$, as well as $\omega$, $K$ and $\eta$ is
examined.
The decoding performance is assessed on the basis of the \ac{BER} and the
\ac{FER}, as well as the \textit{decoding failure rate} -- the rate at which
the algorithm produces invalid results.
The convergence properties are reviewed and related to the decoding
performance.
Finally, the computational performance is examined on a theoretical basis
@ -497,6 +498,17 @@ undertaking an extensive search for an exact optimum.
Rather, a preliminary examination providing a rough window for $\gamma$ may
be sufficient.

TODO: $\omega, K$

Changing the parameter $\eta$ does not appear to have a significant effect on
the decoding performance when keeping its value within a reasonable window
(``slightly larger than one'', as stated in \cite[Sec. 3.2]{proximal_paper}),
which seems plausible, considering its only function is ensuring numerical
stability.

\begin{itemize}
	\item Conclusion: Number of iterations independent of \ac{SNR}
\end{itemize}

\begin{figure}[H]
	\centering

@ -716,19 +728,13 @@ be sufficient.
				\addlegendentry{$\gamma = 0.15$};
			\end{axis}
		\end{tikzpicture}
	\end{subfigure}

	\caption{BER for $\omega = 0.05, K=100$ (different codes)}
	\label{fig:prox:results_3d_multiple}
\end{figure}

\subsection{Decoding Performance}

\begin{figure}[H]
@ -815,67 +821,414 @@ is ensuring numerical stability.

Until now, only the \ac{BER} has been considered to assess the decoding
performance.
The \ac{FER}, however, shows considerably worse behaviour, as can be seen in
figure \ref{fig:prox:ber_fer_dfr}.
Besides the \ac{BER} and \ac{FER} curves, the figure also shows the
\textit{decoding failure rate}.
This is the rate at which the iterative process produces invalid codewords,
i.e., the stopping criterion (line 6 of algorithm \ref{TODO}) is never
satisfied and the maximum number of iterations $K$ is reached without
converging to a valid codeword.
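In simulation, the three quantities can be tallied per frame as follows. This is a sketch: \texttt{decode} is a hypothetical stand-in for the decoder under test, assumed to return the final hard-decision estimate together with a flag indicating whether the stopping criterion was satisfied within the iteration budget $K$.

```python
import numpy as np

def tally_rates(decode, received_frames, codeword):
    """Count bit errors, frame errors, and decoding failures over many frames.

    A frame error means the estimate differs from the transmitted codeword;
    a decoding failure means the stopping criterion was never satisfied.
    """
    n, num = len(codeword), len(received_frames)
    bit_err = frame_err = failures = 0
    for y in received_frames:
        x_hat, converged = decode(y)
        bit_err += int(np.sum(x_hat != codeword))
        frame_err += int(not np.array_equal(x_hat, codeword))
        failures += int(not converged)
    return bit_err / (num * n), frame_err / num, failures / num

# Dummy decoder for illustration: "fails" whenever the frame contains a one.
cw = np.zeros(4, dtype=int)
frames = [np.zeros(4, dtype=int), np.array([1, 0, 0, 0])]
ber, fer, dfr = tally_rates(lambda y: (y, not y.any()), frames, cw)
print(ber, fer, dfr)  # -> 0.125 0.5 0.5
```

Note that a decoding failure always implies a frame error, but not vice versa, which is what makes the comparison of \ac{FER} and decoding failure rate informative.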
Three lines are plotted in each case, corresponding to different values of
the parameter $\gamma$.
The values chosen are the same as in figure \ref{fig:prox:results}, as they
seem to adequately describe the behaviour across a wide range of values
(see figure \ref{fig:prox:results_3d}).

It is apparent that the \ac{FER} and the decoding failure rate are extremely
similar, especially for higher \acp{SNR}.
This leads to the hypothesis that, at least for higher \acp{SNR}, frame errors
arise mainly due to the non-convergence of the algorithm rather than
convergence to the wrong codeword.
This course of thought will be picked up in section
\ref{sec:prox:Improved Implementation} to try to improve the algorithm.

In summary, the \ac{BER} and \ac{FER} indicate dissimilar decoding
performance.
The decoding failure rate closely resembles the \ac{FER}, suggesting that
the frame errors may largely be attributed to decoding failures.

\todo{Maybe reference to the structure of the algorithm (1 part likelihood,
1 part constraints)}

\subsection{Convergence Properties}

The previous observation that frame errors arise mainly from the
non-convergence of the algorithm, rather than from convergence to the wrong
codeword, raises the question of why the decoding process fails to converge
so often.
In figure \ref{fig:prox:convergence}, the iterative process is visualized
iteration by iteration.
In order to be able to simultaneously consider all components of the vectors
involved, a BCH code with $n=7$ and $k=4$ is chosen.
Each chart shows one component of the current estimate during a given
iteration (alternating between $\boldsymbol{r}$ and $\boldsymbol{s}$), as well
as the gradients of the negative log-likelihood and the code-constraint
polynomial, which influence the next estimate.

\begin{figure}[H]
|
||||||
|
\begin{minipage}[c]{0.25\textwidth}
|
||||||
|
\centering
|
||||||
|
|
||||||
|
\begin{tikzpicture}[scale = 0.35]
|
||||||
|
\begin{axis}[
|
||||||
|
grid=both,
|
||||||
|
xlabel={Iterations},
|
||||||
|
width=8cm,
|
||||||
|
height=3cm,
|
||||||
|
scale only axis,
|
||||||
|
xtick={0, 50, ..., 200},
|
||||||
|
xticklabels={0, 25, ..., 100},
|
||||||
|
]
|
||||||
|
\addplot [NavyBlue, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=comb_r_s_1]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [ForestGreen, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_L_1]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [RedOrange, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_h_1]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addlegendentry{est}
|
||||||
|
\addlegendentry{$\left(\nabla L \right)_2$}
|
||||||
|
\addlegendentry{$\left(\nabla h \right)_2 $}
|
||||||
|
\end{axis}
|
||||||
|
\end{tikzpicture}\\
|
||||||
|
\begin{tikzpicture}[scale = 0.35]
|
||||||
|
\begin{axis}[
|
||||||
|
grid=both,
|
||||||
|
xlabel={Iterations},
|
||||||
|
width=8cm,
|
||||||
|
height=3cm,
|
||||||
|
scale only axis,
|
||||||
|
xtick={0, 50, ..., 200},
|
||||||
|
xticklabels={0, 25, ..., 100},
|
||||||
|
]
|
||||||
|
\addplot [NavyBlue, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=comb_r_s_2]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [ForestGreen, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_L_2]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [RedOrange, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_h_2]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addlegendentry{est}
|
||||||
|
\addlegendentry{$\left(\nabla L \right)_3$}
|
||||||
|
\addlegendentry{$\left(\nabla h \right)_3 $}
|
||||||
|
\end{axis}
|
||||||
|
\end{tikzpicture}\\
|
||||||
|
\begin{tikzpicture}[scale = 0.35]
|
||||||
|
\begin{axis}[
|
||||||
|
grid=both,
|
||||||
|
xlabel={Iterations},
|
||||||
|
width=8cm,
|
||||||
|
height=3cm,
|
||||||
|
scale only axis,
|
||||||
|
xtick={0, 50, ..., 200},
|
||||||
|
xticklabels={0, 25, ..., 100},
|
||||||
|
]
|
||||||
|
\addplot [NavyBlue, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=comb_r_s_3]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [ForestGreen, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_L_3]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [RedOrange, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_h_3]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addlegendentry{est}
|
||||||
|
\addlegendentry{$\left(\nabla L \right)_4$}
|
||||||
|
\addlegendentry{$\left(\nabla h \right)_4 $}
|
||||||
|
\end{axis}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{minipage}%
|
||||||
|
\begin{minipage}[c]{0.5\textwidth}
|
||||||
|
\vspace*{-1cm}
|
||||||
|
\centering
|
||||||
|
|
||||||
|
\begin{tikzpicture}[scale = 0.85, spy using outlines={circle, magnification=6,
|
||||||
|
connect spies}]
|
||||||
|
\begin{axis}[
|
||||||
|
grid=both,
|
||||||
|
xlabel={Iterations},
|
||||||
|
width=8cm,
|
||||||
|
height=3cm,
|
||||||
|
scale only axis,
|
||||||
|
xtick={0, 50, ..., 200},
|
||||||
|
xticklabels={0, 25, ..., 100},
|
||||||
|
]
|
||||||
|
\addplot [NavyBlue, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=comb_r_s_0]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [ForestGreen, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_L_0]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [RedOrange, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_h_0]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addlegendentry{est}
|
||||||
|
\addlegendentry{$\left(\nabla L \right)_1$}
|
||||||
|
\addlegendentry{$\left(\nabla h \right)_1 $}
|
||||||
|
|
||||||
|
\coordinate (spypoint) at (axis cs:100,0.53);
|
||||||
|
\coordinate (magnifyglass) at (axis cs:175,2);
|
||||||
|
\end{axis}
|
||||||
|
\spy [black, size=2cm] on (spypoint)
|
||||||
|
in node[fill=white] at (magnifyglass);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{minipage}%
|
||||||
|
\begin{minipage}[c]{0.25\textwidth}
|
||||||
|
\centering
|
||||||
|
|
||||||
|
\begin{tikzpicture}[scale = 0.35]
|
||||||
|
\begin{axis}[
|
||||||
|
grid=both,
|
||||||
|
xlabel={Iterations},
|
||||||
|
width=8cm,
|
||||||
|
height=3cm,
|
||||||
|
scale only axis,
|
||||||
|
xtick={0, 50, ..., 200},
|
||||||
|
xticklabels={0, 25, ..., 100},
|
||||||
|
]
|
||||||
|
\addplot [NavyBlue, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=comb_r_s_4]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [ForestGreen, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_L_4]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [RedOrange, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_h_4]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addlegendentry{est}
|
||||||
|
\addlegendentry{$\left(\nabla L \right)_5$}
|
||||||
|
\addlegendentry{$\left(\nabla h \right)_5 $}
|
||||||
|
\end{axis}
|
||||||
|
\end{tikzpicture}\\
|
||||||
|
\begin{tikzpicture}[scale = 0.35]
|
||||||
|
\begin{axis}[
|
||||||
|
grid=both,
|
||||||
|
xlabel={Iterations},
|
||||||
|
width=8cm,
|
||||||
|
height=3cm,
|
||||||
|
scale only axis,
|
||||||
|
xtick={0, 50, ..., 200},
|
||||||
|
xticklabels={0, 25, ..., 100},
|
||||||
|
]
|
||||||
|
\addplot [NavyBlue, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=comb_r_s_5]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [ForestGreen, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_L_5]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [RedOrange, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_h_5]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addlegendentry{est}
|
||||||
|
\addlegendentry{$\left(\nabla L \right)_6$}
|
||||||
|
\addlegendentry{$\left(\nabla h \right)_6 $}
|
||||||
|
\end{axis}
|
||||||
|
\end{tikzpicture}\\
|
||||||
|
\begin{tikzpicture}[scale = 0.35]
|
||||||
|
\begin{axis}[
|
||||||
|
grid=both,
|
||||||
|
xlabel={Iterations},
|
||||||
|
width=8cm,
|
||||||
|
height=3cm,
|
||||||
|
scale only axis,
|
||||||
|
xtick={0, 50, ..., 200},
|
||||||
|
xticklabels={0, 25, ..., 100},
|
||||||
|
]
|
||||||
|
\addplot [NavyBlue, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=comb_r_s_6]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [ForestGreen, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_L_6]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addplot [RedOrange, mark=none, line width=1]
|
||||||
|
table [col sep=comma, x=k, y=grad_h_6]
|
||||||
|
{res/proximal/comp_bch_7_4_combined.csv};
|
||||||
|
\addlegendentry{est}
|
||||||
|
\addlegendentry{$\left(\nabla L \right)_7$}
|
||||||
|
\addlegendentry{$\left(\nabla h \right)_7 $}
|
||||||
|
\end{axis}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{minipage}
|
||||||
|
|
||||||
|
\caption{Internal variables of proximal decoder
|
||||||
|
as a function of the number of iterations ($n=7$)\protect\footnotemark{}}
|
||||||
|
\label{fig:prox:convergence}
|
||||||
|
\end{figure}%
|
||||||
|
%
|
||||||
|
\footnotetext{A single decoding is shown, using the BCH$\left( 7,4 \right) $ code;
|
||||||
|
$\gamma = 0.05, \omega = 0.05, E_b / N_0 = \SI{5}{dB}$
|
||||||
|
}%
|
||||||
|
%
|
||||||
\noindent It is evident that in all cases, past a certain number of
iterations, the estimate starts to oscillate around a particular value.
From this point on, the two gradients stop approaching zero.
In particular, the code-constraint polynomial is no longer being minimized.
As such, the constraints are not satisfied and the estimate does not
converge towards a valid codeword.

While figure \ref{fig:prox:convergence} shows only one instance of a decoding
task, it is indicative of the general behaviour of the algorithm.
This can be justified by looking at the gradients themselves.
In figure \ref{fig:prox:gradients}, the gradients of the negative
log-likelihood and the code-constraint polynomial are shown for a repetition
code with $n=2$.
Walking along the two gradients in an alternating fashion produces a net
movement in a certain direction as long as the gradients have a common
component.
As soon as this common component is exhausted, they start pulling the
estimate in opposing directions, leading to an oscillation as illustrated
in figure \ref{fig:prox:convergence}.
Consequently, this oscillation is an intrinsic property of the structure of
the proximal decoding algorithm, in which the two parts of the objective
function are minimized in an alternating manner using their gradients.
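This alternating pull can be caricatured in one dimension. The following is a toy sketch, not the actual decoder: one quadratic term pulls the estimate toward the received value, one constraint term pulls it toward the nearest modulation point $\pm 1$, and the step sizes are arbitrary.

```python
import numpy as np

# Toy 1-D caricature of the alternating minimization (NOT the actual decoder):
# f_L(x) = (x - y)^2 pulls the estimate toward the received value y, while
# f_h(x) = (x^2 - 1)^2 pulls it toward the nearest modulation point +-1.
y = 0.4                   # received value (arbitrary choice for illustration)
gamma, omega = 0.3, 0.3   # step sizes (arbitrary)

x, trace = 0.0, []
for _ in range(200):
    x -= gamma * 2 * (x - y)           # gradient step on f_L
    trace.append(x)
    x -= omega * 4 * (x**2 - 1) * x    # gradient step on f_h
    trace.append(x)

tail = np.array(trace[-40:])
# Once the common component of the two pulls is exhausted, the estimate keeps
# bouncing between them instead of settling on y or on +-1:
assert tail.max() - tail.min() > 0.2
```

The trace shows an initial net movement followed by a persistent bounded oscillation, mirroring the behaviour seen in the convergence plots.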
\begin{figure}[H]
	\centering

	\begin{subfigure}[c]{0.5\textwidth}
		\centering

		\begin{tikzpicture}
			\begin{axis}[xmin = -1.25, xmax=1.25,
				ymin = -1.25, ymax=1.25,
				xlabel={$x_1$}, ylabel={$x_2$},
				width=\textwidth,
				height=0.75\textwidth,
				grid=major, grid style={dotted},
				view={0}{90}]

				\addplot3[point meta=\thisrow{grad_norm},
					point meta min=1,
					point meta max=3,
					quiver={u=\thisrow{grad_0},
						v=\thisrow{grad_1},
						scale arrows=.05,
						every arrow/.append style={%
							line width=.3+\pgfplotspointmetatransformed/1000,
							-{Latex[length=0pt 5,width=0pt 3]}
						},
					},
					quiver/colored = {mapped color},
					colormap/rocket,
					-stealth,
					]
					table[col sep=comma] {res/proximal/2d_grad_L.csv};
			\end{axis}
		\end{tikzpicture}

		\caption{$\nabla L \left(\boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) $
		for a repetition code with $n=2$}
	\end{subfigure}%
	\hfill%
	\begin{subfigure}[c]{0.5\textwidth}
		\centering

		\begin{tikzpicture}
			\begin{axis}[xmin = -1.25, xmax=1.25,
				ymin = -1.25, ymax=1.25,
				xlabel={$x_1$}, ylabel={$x_2$},
				grid=major, grid style={dotted},
				width=\textwidth,
				height=0.75\textwidth,
				view={0}{90}]
				\addplot3[point meta=\thisrow{grad_norm},
					point meta min=1,
					point meta max=4,
					quiver={u=\thisrow{grad_0},
						v=\thisrow{grad_1},
						scale arrows=.03,
						every arrow/.append style={%
							line width=.3+\pgfplotspointmetatransformed/1000,
							-{Latex[length=0pt 5,width=0pt 3]}
						},
					},
					quiver/colored = {mapped color},
					colormap/rocket,
					-stealth,
					]
					table[col sep=comma] {res/proximal/2d_grad_h.csv};
			\end{axis}
		\end{tikzpicture}

		\caption{$\nabla h \left( \tilde{\boldsymbol{x}} \right) $
		for a repetition code with $n=2$}
	\end{subfigure}%

	\caption{Gradients of the negative log-likelihood and the code-constraint
	polynomial for a repetition code with $n=2$}
	\label{fig:prox:gradients}
\end{figure}


While the initial net movement generally points in the right direction,
owing to the gradient of the negative log-likelihood, the final oscillation
may well take place in a region of space not corresponding to a valid
codeword, leading to the aforementioned non-convergence of the algorithm.
This also partly explains the difference in decoding performance between
the \ac{BER} and the \ac{FER}, as such behaviour lowers the number of bit
errors while still yielding an invalid codeword.

When considering codes with larger $n$, the behaviour generally stays the
same, with some minor differences.
In figure \ref{fig:prox:convergence_large_n}, the decoding process is
visualized for one component of a code with $n=204$, for a single decoding.
The two gradients still begin to oppose each other and the estimate still
starts to oscillate, just as illustrated in figure
\ref{fig:prox:convergence} for a code with $n=7$.
However, in this case, the gradient of the code-constraint polynomial itself
starts to oscillate, its average value being such that the effect of the
gradient of the negative log-likelihood is counteracted.

In conclusion, as a general rule, the proximal decoding algorithm reaches
an oscillatory state which it cannot escape, as a consequence of its
structure.
In this state, the constraints may not be satisfied, leading to the
algorithm returning an invalid codeword.
\begin{figure}[H]
	\centering

	\begin{tikzpicture}
		\begin{axis}[
			grid=both,
			xlabel={Iterations},
			width=0.6\textwidth,
			height=0.45\textwidth,
			scale only axis,
			xtick={0, 100, ..., 400},
			xticklabels={0, 50, ..., 200},
			]
			\addplot [NavyBlue, mark=none, line width=1]
				table [col sep=comma, x=k, y=comb_r_s_0]
				{res/proximal/extreme_components_20433484_combined.csv};
			\addplot [ForestGreen, mark=none, line width=1]
				table [col sep=comma, x=k, y=grad_L_0]
				{res/proximal/extreme_components_20433484_combined.csv};
			\addplot [RedOrange, mark=none, line width=1]
				table [col sep=comma, x=k, y=grad_h_0]
				{res/proximal/extreme_components_20433484_combined.csv};
			\addlegendentry{est}
			\addlegendentry{$\left(\nabla L\right)_1$}
			\addlegendentry{$\left(\nabla h\right)_1$}
		\end{axis}
	\end{tikzpicture}

	\caption{Internal variables of proximal decoder as a function of the iteration ($n=204$)}
	\label{fig:prox:convergence_large_n}
\end{figure}%

\subsection{Computational Performance}

\begin{itemize}
	\item Theoretical analysis
	\item Simulation results to substantiate theoretical analysis
	\item Conclusion: $\mathcal{O}\left( n \right)$ time complexity, implementation heavily
	optimizable
\end{itemize}

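The $\mathcal{O}\left( n \right)$ claim can be made plausible by a rough per-iteration operation count. This is a sketch under the assumption of a sparse parity-check matrix with bounded row weight; the cost model and its constants are illustrative, not the analysis carried out below.

```python
# Rough per-iteration operation count for the gradient-based update
# (illustrative model): computing p costs one multiply per nonzero of H,
# v costs O(m), H^T v again one multiply-add per nonzero, and the
# elementwise terms of grad h cost O(n).
def ops_per_iteration(n, m, row_weight):
    nnz = m * row_weight          # nonzeros of a sparse H with fixed row weight
    return 2 * nnz + m + 3 * n    # p and H^T v, plus v and elementwise terms

# Doubling the code length (with m growing proportionally, as for a fixed-rate
# LDPC code) doubles the work, i.e. the cost per iteration is linear in n:
assert ops_per_iteration(2048, 1024, 6) == 2 * ops_per_iteration(1024, 512, 6)
```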
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Improved Implementation}%