Minor wording changes; Added paragraphs in proximal improvement section
commit 3b4e0e885f (parent 1632c0744e)
@@ -263,7 +263,8 @@ It was subsequently reimplemented in C++ using the Eigen%
\footnote{\url{https://eigen.tuxfamily.org}}
linear algebra library to achieve higher performance.
The focus has been set on a fast implementation, sometimes at the expense of
-memory usage.
+memory usage, somewhat limiting the size of the codes the implementation can
+be used with \todo{Is this appropriate for a bachelor's thesis?}.
The evaluation of the simulation results has been wholly realized in Python.

The gradient of the code-constraint polynomial \cite[Sec. 2.3]{proximal_paper}
@@ -859,8 +860,7 @@ the frame errors may largely be attributed to decoding failures.
The previous observation, that the \ac{FER} arises mainly due to
non-convergence of the algorithm rather than convergence to the wrong
codeword, raises the question of why the decoding process fails to converge
so often.
-In figure \ref{fig:prox:convergence}, the iterative process is visualized
-for each iteration.
+In figure \ref{fig:prox:convergence}, the iterative process is visualized.
In order to be able to simultaneously consider all components of the vectors
being dealt with, a BCH code with $n=7$ and $k=4$ is chosen.
Each chart shows one component of the current estimates during a given
@@ -1076,7 +1076,8 @@ As such, the constraints are not being satisfied and the estimate is not
converging towards a valid codeword.

While figure \ref{fig:prox:convergence} shows only one instance of a decoding
-task, it is indicative of the general behaviour of the algorithm.
+task, with no statistical significance, it is indicative of the general
+behaviour of the algorithm.
This can be justified by looking at the gradients themselves.
In figure \ref{fig:prox:gradients} the gradients of the negative
log-likelihood and the code-constraint polynomial for a repetition code with
@@ -1089,8 +1090,8 @@ estimate in opposing directions, leading to an oscillation as illustrated
in figure \ref{fig:prox:convergence}.
Consequently, this oscillation is an intrinsic property of the structure of
the proximal decoding algorithm, where the two parts of the objective function
-are minimized in an alternating manner using their gradients.
-
+are minimized in an alternating manner by use of their gradients.%
+%
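To make the alternating structure explicit, one iteration can be sketched as
\begin{align*}
\boldsymbol{r}^{(k)} &= \tilde{\boldsymbol{x}}^{(k)} - \omega \nabla L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}}^{(k)} \right), \\
\tilde{\boldsymbol{x}}^{(k+1)} &= \boldsymbol{r}^{(k)} - \gamma \nabla h \left( \boldsymbol{r}^{(k)} \right),
\end{align*}
where the step size $\omega$ is merely a symbol assumed for this illustration; the exact update rule is the one given in \cite{proximal_paper}. Whenever the two gradients point in opposing directions, the second step partially undoes the first, which is precisely the oscillation described above.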
\begin{figure}[H]
\centering

@@ -1127,6 +1128,7 @@ are minimized in an alternating manner using their gradients.

\caption{$\nabla L \left(\boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) $
for a repetition code with $n=2$}
\label{fig:prox:gradients:L}
\end{subfigure}%
\hfill%
\begin{subfigure}[c]{0.5\textwidth}
@@ -1161,10 +1163,14 @@ are minimized in an alternating manner using their gradients.

\caption{$\nabla h \left( \tilde{\boldsymbol{x}} \right) $
for a repetition code with $n=2$}
\label{fig:prox:gradients:h}
\end{subfigure}%
\end{figure}

\caption{Gradients of the negative log-likelihood and the code-constraint
polynomial}
\label{fig:prox:gradients}
\end{figure}%
%
While the initial net movement generally points in the right direction,
owing to the gradient of the negative log-likelihood, the final oscillation
may well take place in a segment of space not corresponding to a valid
@@ -1173,17 +1179,34 @@ This also partly explains the difference in decoding performance when looking
at the \ac{BER} and \ac{FER}, as it would lower the number of bit errors while
still yielding an invalid codeword.

+The higher the \ac{SNR}, the more likely the gradient of the negative
+log-likelihood is to point to a valid codeword.
+The common component of the two gradients then pulls the estimate closer to
+a valid codeword before the oscillation takes place.
+This explains why the decoding performance is so much better for higher
+\acp{SNR}.

When considering codes with larger $n$, the behaviour generally stays the
same, with some minor differences.
In figure \ref{fig:prox:convergence_large_n} the decoding process is
visualized for one component of a code with $n=204$, for a single decoding.
-The two gradients still start to fight each other and the estimate still
-starts to oscillate, the same as illustrated on the basis of figure
-\ref{fig:prox:convergence} for a code with $n=7$.
+The two gradients still eventually oppose each other and the estimate still
+starts to oscillate, the same as illustrated in figure
+\ref{fig:prox:convergence} on the basis of a code with $n=7$.
However, in this case, the gradient of the code-constraint polynomial itself
starts to oscillate, its average value being such that the effect of the
gradient of the negative log-likelihood is counteracted.

+Looking at figure \ref{fig:prox:gradients:h}, it also becomes apparent why
+the value of the parameter $\gamma$ has to be kept small, as mentioned in
+section \ref{sec:prox:Decoding Algorithm}.
+Local minima are introduced between the codewords, in the areas in which it
+is not immediately clear which codeword is the most likely one.
+Raising the value of $\gamma$ results in
+$h \left( \tilde{\boldsymbol{x}} \right)$ dominating the landscape of the
+objective function, thereby introducing these local minima into the objective
+function.
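This becomes plausible when the objective function is written informally as
\begin{equation*}
F \left( \tilde{\boldsymbol{x}} \right) = L \left( \boldsymbol{y} \mid \tilde{\boldsymbol{x}} \right) + \gamma \, h \left( \tilde{\boldsymbol{x}} \right)
\end{equation*}
(a sketch only; the precise formulation is given in \cite{proximal_paper}): the stationary points that $h$ possesses between the codewords are outweighed by the unimodal likelihood term only as long as $\gamma$ remains small.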

In conclusion, as a general rule, the proximal decoding algorithm reaches
an oscillatory state which it cannot escape as a consequence of its structure.
In this state, the constraints may not be satisfied, leading to the algorithm
@@ -1237,8 +1260,8 @@ returning an invalid codeword.
\label{sec:prox:Improved Implementation}

As mentioned earlier, frame errors seem to mainly stem from decoding failures.
-This, coupled with the fact that the \ac{BER} indicates so much better
-performance than the \ac{FER}, leads to the assumption that only a small
+Coupled with the fact that the \ac{BER} indicates so much better
+performance than the \ac{FER}, this leads to the assumption that only a small
number of components of the estimated vector may be responsible for an invalid
result.
If it were possible to limit the number of possibly wrong components of the
@@ -1247,13 +1270,66 @@ a limited number of possible results (``ML-in-the-List'' as it will
subsequently be called) to improve the decoding performance.
This concept is pursued in this section.

-\begin{itemize}
-\item Decoding performance and comparison with standard proximal decoding
-\item Computational performance and comparison with standard proximal decoding
-\item Conclusion
-\begin{itemize}
-\item Summary
-\item Up to $\SI{1}{dB}$ gain possible
-\end{itemize}
-\end{itemize}
+First, a guideline has to be found with which to assess the probability that
+a given component of an estimate is wrong.
+One compelling observation is that the closer an estimate is to a valid
+codeword, the smaller the magnitude of the gradient of the code-constraint
+polynomial, as illustrated in figure \ref{fig:prox:gradients}.
+This gives rise to the notion that the magnitude of some property or
+behaviour of $\nabla h\left( \tilde{\boldsymbol{x}} \right)$ may be related
+to the confidence that a given bit is correct.
+And indeed, the magnitude of the oscillation of
+$\nabla h\left( \tilde{\boldsymbol{x}} \right)$ (introduced in a previous
+section) and the probability of having a bit error are strongly correlated,
+a relationship depicted in figure \ref{fig:prox:correlation}.

+TODO: Figure

+\noindent The y-axis depicts whether there is a bit error and the x-axis the
+variance of $\nabla h\left( \tilde{\boldsymbol{x}} \right)$ past the
+iteration $k=100$. While this is not exactly the magnitude of the
+oscillation, it is proportional to it and easier to compute.
+The data points are taken from a single decoding operation
+\todo{Generate same figure with multiple decodings}.
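As an illustration, the reliability metric described above could be computed along the following lines. This is a Python sketch rather than the thesis implementation; the array grad_h_history and both function names are assumptions, only the cutoff $k=100$ and the use of the variance are taken from the text.

\begin{verbatim}
import numpy as np

def oscillation_variance(grad_h_history, k_min=100):
    """Per-component variance of the code-constraint gradient past
    iteration k_min.  grad_h_history has shape (num_iterations, n),
    holding one row of grad h(x) per iteration; larger values
    indicate less reliable bit estimates."""
    return np.var(grad_h_history[k_min:], axis=0)

def least_reliable_bits(grad_h_history, N, k_min=100):
    """Indices of the N components with the largest variance,
    i.e. the N most probably wrong bits."""
    variance = oscillation_variance(grad_h_history, k_min)
    return np.argsort(variance)[-N:]
\end{verbatim}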

+Using this observation as a rule to determine the $N\in\mathbb{N}$ most
+probably wrong bits, all variations of the estimate with those bits modified
+can be generated.
+An \ac{ML}-in-the-List step can then be performed in order to determine the
+most likely candidate.
+This process is outlined in figure \ref{fig:prox:improved_algorithm}.
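A sketch of this step in the same vein; bipolar $\pm 1$ estimates, an AWGN channel and the helper is_codeword (a parity-check test) are assumptions of this illustration:

\begin{verbatim}
import itertools
import numpy as np

def ml_in_the_list(x_hat, y, suspect, is_codeword):
    """Flip the suspect components of x_hat in all 2^N sign
    combinations and return the most likely valid candidate; for an
    AWGN channel, maximum likelihood amounts to minimum Euclidean
    distance to the received vector y."""
    best, best_dist = x_hat, float("inf")
    for flips in itertools.product((1.0, -1.0), repeat=len(suspect)):
        candidate = x_hat.copy()
        candidate[suspect] *= flips
        dist = np.sum((y - candidate) ** 2)
        if is_codeword(candidate) and dist < best_dist:
            best, best_dist = candidate, dist
    return best
\end{verbatim}

Since the all-ones combination reproduces x_hat itself, the original estimate remains in the list, and for small $N$ the $2^N$ candidates stay cheap to evaluate.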

+Figure \ref{fig:prox:improved_results} shows the gain that can be achieved.
+Again, three values of $\gamma$ are chosen, for which the \ac{BER}, \ac{FER}
+and decoding failure rate are plotted.
+The simulation results for the original proximal decoding algorithm are
+shown with solid lines and the results for the improved version with dashed
+lines.
+The gain seems to depend on the value of $\gamma$, as well as become more
+pronounced for higher \ac{SNR} values.
+This is to be expected, since with higher \ac{SNR} values the number of bit
+errors decreases, making the correction of those errors in the
+ML-in-the-List step more likely.
+In figure \ref{fig:prox:improved_results_multiple} the decoding performance
+of proximal decoding and of the improved algorithm is compared for a number
+of different codes.
+Similar behaviour can be observed in all cases, with varying improvement
+over standard proximal decoding.

+Interestingly, the time complexity of the improved algorithm does not differ
+much from that of standard proximal decoding.
+This is because the ML-in-the-List step is only performed when the proximal
+decoding algorithm produces an invalid result, which in absolute terms
+happens relatively infrequently.
+This is illustrated in figure \ref{fig:prox:time_complexity_comp}, where the
+average time needed to decode a single received frame is visualized for
+proximal decoding as well as for the improved algorithm.
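The corresponding control flow can be sketched as follows; proximal_decode returning the estimate together with its gradient history is assumed, the helpers are those of the previous sketches, and the choice $N=4$ is arbitrary:

\begin{verbatim}
def decode(y, N=4):
    """Proximal decoding with an ML-in-the-List fallback.  The 2^N
    enumeration only runs for the relatively rare frames on which
    proximal decoding fails, so the average decoding time barely
    changes."""
    x_hat, grad_h_history = proximal_decode(y)
    if is_codeword(x_hat):
        return x_hat  # common case: no extra work
    suspect = least_reliable_bits(grad_h_history, N)
    return ml_in_the_list(x_hat, y, suspect, is_codeword)
\end{verbatim}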

+In conclusion, the decoding performance of proximal decoding can be improved
+by appending an ML-in-the-List step when the algorithm does not produce a
+valid result.
+The gain is in some cases as high as $\SI{1}{dB}$ and can be achieved with a
+negligible computational performance penalty.
+The improvement is mainly noticeable for higher \ac{SNR} values and depends
+on the code as well as the chosen parameters.