Reworked rest of proximal decoding and fixed most figures and captions
parent 5c135e085e
commit 0d4b13ccda
@ -557,8 +557,8 @@ The plots have been generated by averaging the error over $\SI{500000}{}$
decodings.
As some decodings go on for more iterations than others, the number of values
which are averaged for each datapoint varies.
-This explains the dip visible in all curves around $k=20$, since after
-this point more and more correct decodings are completed,
+This explains the dip visible in all curves around the 20th iteration, since
+after this point more and more correct decodings are completed,
leaving more and more faulty ones to be averaged.
Additionally, at this point the decline in the average error stagnates,
rendering an increase in $K$ counterproductive as it only raises the average
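The dip can be reproduced with a small averaging experiment. The sketch below is illustrative only: the error traces are synthetic stand-ins, not outputs of the actual decoder.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic stand-ins for per-iteration error traces: successful
    # decodings terminate early (short traces), faulty ones run for the
    # full K iterations and level off at a nonzero error floor.
    traces = []
    for _ in range(1000):
        if rng.random() < 0.8:                    # successful decoding
            length = int(rng.integers(5, 25))     # terminates around k ~ 20
            traces.append(np.linspace(1.0, 0.0, length))
        else:                                     # faulty decoding
            traces.append(0.4 + 0.6 * np.exp(-np.arange(100) / 10))

    K = 100
    sums, counts = np.zeros(K), np.zeros(K)       # counts vary per datapoint
    for t in traces:
        m = min(len(t), K)
        sums[:m] += t[:m]
        counts[:m] += 1
    avg = sums / np.maximum(counts, 1)
    # avg dips around k ~ 20: the successful (low-error) traces drop out
    # of the average there, leaving only the faulty ones afterwards.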
@ -628,7 +628,7 @@ means to bring about numerical stability.

\subsection{Decoding Performance}

-\begin{figure}[H]
+\begin{figure}[h]
\centering

\begin{tikzpicture}
@ -749,13 +749,14 @@ non-convergence of the algorithm instead of convergence to the wrong codeword,
raises the question of why the decoding process fails to converge so often.
In figure \ref{fig:prox:convergence}, the iterative process is visualized.
In order to be able to simultaneously consider all components of the vectors
-being dealt with, a BCH code with $n=7$ and $k=4$ has been chosen.
-Each chart shows one component of the current estimate during a given
-iteration (alternating between $\boldsymbol{r}$ and $\boldsymbol{s}$), as well
+being dealt with, a BCH code with $n=7$ and $k=4$ is chosen.
+Each plot shows one component of the current estimate during a given
+iteration ($\boldsymbol{r}$ and $\boldsymbol{s}$ are counted as different
+estimates and their values are interwoven to obtain the shown result), as well
as the gradients of the negative log-likelihood and the code-constraint
polynomial, which influence the next estimate.

-\begin{figure}[H]
+\begin{figure}[h]
\begin{minipage}[c]{0.25\textwidth}
\centering

@ -955,10 +956,10 @@ polynomial, which influence the next estimate.
%
\noindent It is evident that in all cases, past a certain number of
iterations, the estimate starts to oscillate around a particular value.
-After a certain point, the two gradients stop further approaching the value
+Jointly, the two gradients stop further approaching the value
zero.
-In particular, this leads to the code-constraints polynomial not being
-minimized.
+This leads to the two terms of the objective function and in particular the
+code-constraint polynomial not being minimized.
As such, the constraints are not being satisfied and the estimate is not
converging towards a valid codeword.

@ -969,17 +970,28 @@ This can be justified by looking at the gradients themselves.
In figure \ref{fig:prox:gradients} the gradients of the negative
log-likelihood and the code-constraint polynomial for a repetition code with
$n=2$ are shown.
+The two valid codewords of the $n=2$ repetition code can be recognized in
+figure \ref{fig:prox:gradients:h} as
+$\boldsymbol{c}_1 = \begin{bmatrix} -1 & -1 \end{bmatrix} $ and
+$\boldsymbol{c}_2 = \begin{bmatrix} 1 & 1 \end{bmatrix}$;
+these are also the points producing the global minima of the code-constraint
+polynomial.
+The gradient of the negative log-likelihood points towards the received
+word as can be seen in figure \ref{fig:prox:gradients:L},
+since, assuming \ac{AWGN} and no other information, that is the
+estimate maximizing the likelihood.

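To make the direction of this gradient concrete: under \ac{AWGN} the negative log-likelihood takes, up to constants and scaling, the standard quadratic form (a sketch; the normalization used elsewhere in this work may differ):

\begin{equation*}
L\left( \tilde{\boldsymbol{x}} \right)
= \frac{1}{2\sigma^2}
\left\lVert \boldsymbol{y} - \tilde{\boldsymbol{x}} \right\rVert^2
+ \text{const}
\qquad \Rightarrow \qquad
\nabla L\left( \tilde{\boldsymbol{x}} \right)
= \frac{1}{\sigma^2} \left( \tilde{\boldsymbol{x}} - \boldsymbol{y} \right),
\end{equation*}

so $-\nabla L$ points from the current estimate straight towards the received word $\boldsymbol{y}$ and vanishes exactly there.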
It is obvious that walking along the gradients in an alternating fashion will
-produce a net movement in a certain direction, as long as the two gradients
+produce a net movement in a certain direction, as long as they
have a common component.
As soon as this common component is exhausted, they will start pulling the
estimate in opposing directions, leading to an oscillation as illustrated
in figure \ref{fig:prox:convergence}.
Consequently, this oscillation is an intrinsic property of the structure of
the proximal decoding algorithm, where the two parts of the objective function
-are minimized in an alternating manner by use of their gradients.%
+are minimized in an alternating manner by use of their gradients.
+%
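This mechanism can be reproduced in a few lines of Python. The following is a toy sketch for the $n=2$ repetition code, not the exact update rule used here: the step sizes and the concrete quartic form of $h$ are assumptions.

    import numpy as np

    # Toy setup: n=2 repetition code, valid codewords (-1,-1) and (1,1).
    # Assumed code-constraint polynomial with minima at the codewords:
    #   h(x) = (x1^2 - 1)^2 + (x2^2 - 1)^2 + (x1*x2 - 1)^2
    def grad_h(x):
        g = 4 * x * (x**2 - 1)                  # bipolarity terms
        g += 2 * (x[0] * x[1] - 1) * x[::-1]    # parity-check term
        return g

    def grad_L(x, y, sigma2=1.0):               # AWGN: L = ||y-x||^2/(2s^2)
        return (x - y) / sigma2

    y = np.array([0.9, -0.4])                   # received word, ambiguous
    x = y.copy()
    gamma = omega = 0.05
    trace = []
    for k in range(200):
        x = x - gamma * grad_L(x, y); trace.append(x.copy())
        x = x - omega * grad_h(x);    trace.append(x.copy())

    # Past a certain k the interleaved trace bounces between two points:
    # gamma*grad_L and omega*grad_h cancel on average but are individually
    # nonzero, so h is never fully minimized -- the oscillation described
    # above. Whether the resting point lies near a valid codeword depends
    # on the received word y.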
-\begin{figure}[H]
+\begin{figure}[h]
\centering

\begin{subfigure}[c]{0.5\textwidth}
@ -1058,12 +1070,6 @@ are minimized in an alternating manner by use of their gradients.%
\label{fig:prox:gradients}
\end{figure}%
%
-
-\todo{Better explain what is visible on the two gradient plots: the two valid
-codewords (-1, -1) and (1, 1); the section between them where a decision in
-which direction to move is difficult; maybe say why the gradient of $L$ points
-to one specific point}
-
While the initial net movement generally points in the right direction
owing to the gradient of the negative log-likelihood, the final oscillation
may well take place in a segment of space not corresponding to a valid
@ -1087,9 +1093,9 @@ not immediately clear which codeword is the most likely one.
Raising the value of $\gamma$ results in
$h \left( \tilde{\boldsymbol{x}} \right)$ dominating the landscape of the
objective function, thereby introducing these local minima into the objective
-function. \todo{Show equation again and explain on the basis of the equation}
+function.

-When considering codes with larger $n$, the behaviour generally stays the
+When considering codes with larger $n$ the behaviour generally stays the
same, with some minor differences.
In figure \ref{fig:prox:convergence_large_n} the decoding process is
visualized for one component of a code with $n=204$, for a single decoding.
@ -1103,9 +1109,10 @@ gradient of the negative log-likelihood is counteracted.
In conclusion, as a general rule, the proximal decoding algorithm reaches
an oscillatory state which it cannot escape as a consequence of its structure.
In this state the constraints may not be satisfied, leading to the algorithm
-returning an invalid codeword.
+exhausting its maximum number of iterations without converging and returning
+an invalid codeword.

-\begin{figure}[H]
+\begin{figure}[h]
\centering

\begin{tikzpicture}
@ -1133,11 +1140,15 @@ returning an invalid codeword.
\end{axis}
\end{tikzpicture}

-\caption{Internal variables of proximal decoder as a function of the iteration ($n=204$)}
+\caption{Visualization of a single decoding operation\protect\footnotemark{}
+for a code with $n=204$}
\label{fig:prox:convergence_large_n}
\end{figure}%

+\todo{Fix captions / footnotes referencing the different codes in all figures}
%
+\footnotetext{(3,6) regular \ac{LDPC} code with $n = 204$, $k = 102$
+\cite[\text{204.33.484}]{mackay_enc}; $\gamma=0.05, \omega = 0.05, K=200, \eta=1.5$
+}%
%


\subsection{Computational Performance}
@ -1161,16 +1172,16 @@ codes in an \ac{AWGN} channel is $\mathcal{O}\left( n \right)$, which is
practical since it is the same as that of \ac{BP}.

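The linear scaling can be made plausible by counting the work in one iteration. The sketch below is a rough operation count under assumed implementation details, not a measurement of the implementation used here:

    # Per-iteration cost sketch for a (3,6)-regular LDPC code: H has
    # 3 nonzeros per column, i.e. nnz(H) = 3n, and both gradient steps
    # touch every nonzero a constant number of times.
    def per_iteration_ops(n, col_weight=3, row_weight=6):
        m = n * col_weight // row_weight     # number of checks (n/2 here)
        grad_L_ops = 2 * n                   # (x - y) / sigma^2
        grad_h_ops = 2 * row_weight * m      # product + scatter per check
        update_ops = 4 * n                   # the two gradient steps
        return grad_L_ops + grad_h_ops + update_ops   # every term is O(n)

    for n in (204, 408, 816):
        print(n, per_iteration_ops(n))       # grows linearly with n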
This theoretical analysis is also corroborated by the practical results shown
-in figure \ref{fig:prox:time_comp}. \todo{Note about no very large $n$ codes being
-used due to memory requirements?}
+in figure \ref{fig:prox:time_comp}.
Some deviations from linear behaviour are unavoidable because not all codes
considered are actually \ac{LDPC} codes, or \ac{LDPC} codes constructed
according to the same scheme.
-\todo{Mention on what hardware the results where generated}
Nonetheless, a generally linear relationship between the average time needed to
decode a received frame and the length $n$ of the frame can be observed.
+These results were generated on an Intel Core i7-7700HQ 4-core CPU, running at
+$\SI{2.80}{GHz}$ and utilizing all cores.

-\begin{figure}[H]
+\begin{figure}[h]
\centering

\begin{tikzpicture}
@ -1186,7 +1197,7 @@ decode a received frame and the length $n$ of the frame can be observed.
\end{axis}
\end{tikzpicture}

-\caption{Time requirements of proximal decoding algorithm imlementation%
+\caption{Time requirements of the proximal decoding algorithm implementation%
\protect\footnotemark{}}
\label{fig:prox:time_comp}
\end{figure}%
@ -1224,11 +1235,12 @@ $\nabla h\left( \tilde{\boldsymbol{x}} \right) $ may be related in its
magnitude to the confidence that a given bit is correct.
And indeed, the magnitude of the oscillation of
$\nabla h\left( \tilde{\boldsymbol{x}} \right)$ (introduced previously in
-section \ref{subsec:prox:conv_properties}) and the probability of having a bit
+section \ref{subsec:prox:conv_properties} and shown in figure
+\ref{fig:prox:convergence_large_n}) and the probability of having a bit
error are strongly correlated, a relationship depicted in figure
\ref{fig:prox:correlation}.

-\begin{figure}[H]
+\begin{figure}[h]
\centering

\begin{tikzpicture}
@ -1249,19 +1261,23 @@ error are strongly correlated, a relationship depicted in figure
\end{axis}
\end{tikzpicture}

-\caption{Correlation between bit error and amplitude of oscillation}
+\caption{Correlation between the occurrence of a bit error and the
+amplitude of oscillation of the gradient of the code-constraint polynomial%
+\protect\footnotemark{}}
\label{fig:prox:correlation}
-\end{figure}
-
-\todo{Mention that the variance of the oscillation is measured
-after a given number of iterations}
+\end{figure}%
+%
+\footnotetext{(3,6) regular \ac{LDPC} code with $n = 204$, $k = 102$
+\cite[\text{204.33.484}]{mackay_enc}; $\gamma = 0.05, \omega = 0.05, K=100, \eta=1.5$
+}%
+%

\noindent The y-axis depicts whether there is a bit error and the x-axis the
-variance in $\nabla h\left( \tilde{\boldsymbol{x}} \right)$ past the iteration
-$k=100$. While this is not exactly the magnitude of the oscillation, it is
+variance in $\nabla h\left( \tilde{\boldsymbol{x}} \right)$ after the
+100th iteration.
+While this is not exactly the magnitude of the oscillation, it is
proportional to it and easier to compute.
The datapoints are taken from a single decoding operation
\todo{Generate same figure with multiple decodings}.

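A possible way to obtain this reliability metric in code; the variable names and shapes below are illustrative, only the window past iteration 100 follows the text:

    import numpy as np

    def bit_reliability(grad_history, k0=100):
        """grad_history: array of shape (K, n) holding grad h per iteration
        of one decoding. Returns the per-bit variance of the gradient past
        iteration k0 -- proportional to the oscillation amplitude, but
        cheaper to compute than the amplitude itself."""
        tail = np.asarray(grad_history)[k0:]
        return tail.var(axis=0)              # one value per bit

    # The N bits with the largest variance are flagged as most probably
    # wrong, e.g. suspects = np.argsort(reliability)[-N:] for N = 12.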
Using this observation as a rule to determine the $N\in\mathbb{N}$ most
probably wrong bits, all variations of the estimate with those bits modified
@ -1286,14 +1302,14 @@ for $K$ iterations do
end for
$\textcolor{KITblue}{\text{Find }N\text{ most probably wrong bits}}$
$\textcolor{KITblue}{\text{Generate variations } \boldsymbol{\tilde{c}}_l,\hspace{1mm}
-l\in [1:n]\text{ of } \boldsymbol{\hat{c}}\text{ with the }N\text{ bits modified}}$
+l\in \mathbb{N}\text{ of } \boldsymbol{\hat{c}}\text{ with the }N\text{ bits modified}}$
$\textcolor{KITblue}{\text{Compute }d_H\left( \boldsymbol{ \tilde{c}}_l,
\boldsymbol{\hat{c}} \right) \text{ for all valid codewords } \boldsymbol{\tilde{c}}_l}$
$\textcolor{KITblue}{\text{Output }\boldsymbol{\tilde{c}}_l\text{ with lowest }
d_H\left( \boldsymbol{ \tilde{c}}_l, \boldsymbol{\hat{c}} \right)}$
\end{genericAlgorithm}

-\todo{Not hamming distance, correlation}
+%\todo{Not hamming distance, correlation}
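A sketch of the highlighted steps is given below. The helper is hypothetical; the selection is written with the Hamming distance as in the listing above, although the commented-out todo suggests a correlation metric is the intended criterion:

    import numpy as np
    from itertools import product

    def ml_in_the_list(c_hat, suspects, H):
        """c_hat: invalid hard-decision estimate in {0,1}^n (uint8).
        suspects: indices of the N most probably wrong bits.
        H: binary parity-check matrix. Tries all 2^N patterns on the
        suspect bits and keeps the valid codeword closest to c_hat."""
        best, best_d = None, np.inf
        for pattern in product((0, 1), repeat=len(suspects)):
            cand = c_hat.copy()
            cand[list(suspects)] ^= np.array(pattern, dtype=cand.dtype)
            if np.any(H @ cand % 2):          # nonzero syndrome: invalid
                continue
            d = np.count_nonzero(cand != c_hat)   # Hamming distance
            if d < best_d:
                best, best_d = cand, d
        return best                           # None if no pattern is valid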
Figure \ref{fig:prox:improved_results} shows the gain that can be achieved
when the number $N$ is chosen to be 12.
@ -1304,13 +1320,12 @@ with solid lines and the results for the improved version are shown with
dashed lines.
For the case of $\gamma = 0.05$, the number of frame errors produced for the
datapoints at $\SI{6}{dB}$, $\SI{6.5}{dB}$ and $\SI{7}{dB}$ are
-70, 17 and 2, respectively. \todo{Redo simulation with higher number of iterations}
+70, 17 and 2, respectively.
The gain seems to depend on the value of $\gamma$, as well as to become more
pronounced for higher \ac{SNR} values.
This is to be expected, since with higher \ac{SNR} values the number of bit
errors decreases, making the correction of those errors in the ML-in-the-List
step more likely.

In figure \ref{fig:prox:improved:comp} the decoding performance
of proximal decoding and of the improved algorithm is compared for a number
of different codes.
@ -1320,8 +1335,9 @@ generate the point for the improved algorithm for $\gamma=0.05$ at
$\SI{5.5}{dB}$.
Similar behaviour can be observed in all cases, with varying improvement over
standard proximal decoding.
+In some cases, a gain of $\SI{1}{dB}$ or higher can be achieved.

-\begin{figure}[H]
+\begin{figure}[h]
\centering

\begin{tikzpicture}
@ -1459,13 +1475,12 @@ average time needed to decode a single received frame is visualized for
proximal decoding as well as for the improved algorithm.
It should be noted that some variability in the data is to be expected,
since the timing of the actual simulations depends on a multitude of other
-parameters such as the outside temperature (because of thermal throttling),
-the scheduling choices of the operating system as well as variations in the
-implementations themselves.
+parameters such as the scheduling choices of the operating system as well as
+variations in the implementations themselves.
Nevertheless, the empirical data serves, at least in part, to validate the
theoretical considerations.

-\begin{figure}[H]
+\begin{figure}[h]
\centering

\begin{tikzpicture}
@ -1501,8 +1516,7 @@ theoretical considerations.
In conclusion, the decoding performance of proximal decoding can be improved
by appending an ML-in-the-List step when the algorithm does not produce a
valid result.
-The gain can in some cases be as high as $\SI{1}{dB}$ \todo{Explicitly mention this value earlier}
-and is achievable with
+The gain can in some cases be as high as $\SI{1}{dB}$ and is achievable with
negligible computational performance penalty.
The improvement is mainly noticeable for higher \ac{SNR} values and depends on
the code as well as the chosen parameters.