Finished first version of comparison chapter

This commit is contained in:
Andreas Tsouchlos 2023-04-22 18:10:36 +02:00
parent c860c16a1b
commit 488949c0a9


@ -2,9 +2,9 @@
\label{chapter:comparison}
In this chapter, proximal decoding and \ac{LP} decoding using \ac{ADMM}
are compared.
First, the two algorithms are studied on a theoretical basis.
Subsequently, their respective simulation results are examined, and their
differences are interpreted based on their theoretical structure.
@ -40,10 +40,11 @@ by moving the constraints into the objective function, as shown in figure
\ref{fig:ana:theo_comp_alg:admm}.
Both algorithms follow an iterative approach consisting of two
alternating steps.
The objective functions of the two problems are similar in that they
both comprise two parts: one associated with the likelihood that a given
codeword was sent, stemming from the channel model, and one associated
with the constraints the decoding process is subject to, stemming from
the code used.
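Schematically, and only as a hedged sketch (the placeholder symbols
$f_{\mathrm{ch}}$, $g_j$ and $\boldsymbol{\lambda}$ stand for the channel
term, the per-check penalty and the \ac{LLR} vector; the precise
definitions are those of the preceding chapters), the two objectives take
the form
\[
\min_{\boldsymbol{s}}\
\underbrace{f_{\mathrm{ch}}\left(\boldsymbol{s};\boldsymbol{y}\right)}_{\text{channel model}}
+ \gamma\,\underbrace{h\left(\boldsymbol{s}\right)}_{\text{code}}
\qquad\text{and}\qquad
\min_{\tilde{\boldsymbol{c}}}\
\underbrace{\boldsymbol{\lambda}^{\mathsf{T}}\tilde{\boldsymbol{c}}}_{\text{channel model}}
+ \underbrace{\sum_{j\in\mathcal{J}} g_j\left(\tilde{\boldsymbol{c}}\right)}_{\text{code}},
\]
where $h$ denotes the code-constraint polynomial and the $g_j$ arise from
moving the parity-check constraints into the objective function.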
%
\begin{figure}[h]
@ -123,15 +124,11 @@ Their major difference is that while with proximal decoding the constraints
are regarded in a global context, considering all parity checks at the same
time, with \ac{ADMM} each parity check is
considered separately and in a more local context (line 4 in both algorithms).
This difference means that while with proximal decoding the alternating
minimization of the two parts of the objective function inevitably leads to
oscillatory behavior (as explained in section
\ref{subsec:prox:conv_properties}), this is not the case with \ac{ADMM}, which
partly explains the disparate decoding performance of the two methods.
Furthermore, while with proximal decoding the step considering the
constraints is realized using gradient descent, amounting to an
approximation, with \ac{ADMM} it reduces to a number of projections onto
the parity polytopes $\mathcal{P}_{d_j}, \hspace{1mm} j\in\mathcal{J}$,
which always provide exact results.
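As a reminder, assuming the definition used in the earlier chapters, the
parity polytope of degree $d$ is the convex hull of all even-weight
binary vectors,
\[
\mathcal{P}_d = \operatorname{conv}\left( \left\{ \boldsymbol{w} \in
\left\{ 0,1 \right\}^d : \textstyle\sum_{i=1}^{d} w_i \equiv 0
\pmod{2} \right\} \right),
\]
so that a projection onto $\mathcal{P}_{d_j}$ enforces the relaxation of
the $j$-th parity check exactly.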
The contrasting treatment of the constraints (global and approximate with
proximal decoding as opposed to local and exact with \ac{LP} decoding using
@ -142,34 +139,27 @@ calculation, whereas with \ac{LP} decoding it occurs due to the approximate
formulation of the constraints, independent of the optimization method
itself.
The advantage which arises because of this when employing \ac{LP}
decoding is the \ac{ML} certificate property: when a valid codeword is
returned, it is also the \ac{ML} codeword.
This means that additional redundant parity checks can be added successively
until the codeword returned is valid and thus the \ac{ML} solution is found
\cite[Sec. IV.]{alp}.
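A minimal sketch of this scheme is given below, with the helpers
\texttt{lp\_decode} (the \ac{ADMM}-based decoder) and
\texttt{generate\_rpc} (producing a redundant parity check that cuts off
the current fractional solution) assumed as hypothetical placeholders:
\begin{lstlisting}[language=Python]
import numpy as np

def decode_with_ml_certificate(H, llr, max_rpcs=50, tol=1e-6):
    # Hedged sketch of the successive-RPC scheme cited above; lp_decode
    # and generate_rpc are hypothetical placeholders.
    for _ in range(max_rpcs):
        x = lp_decode(H, llr)              # LP decoding step
        if np.all(np.abs(x - np.round(x)) < tol):
            return np.round(x)             # integral solution: ML codeword
        rpc = generate_rpc(H, x)           # redundant parity check cutting
        if rpc is None:                    # off the pseudocodeword
            break
        H = np.vstack([H, rpc])            # extend H and decode again
    return None                            # no ML certificate obtained
\end{lstlisting}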
In terms of time complexity, the two decoding algorithms are comparable.
Each of the operations required for proximal decoding can be performed
in $\mathcal{O}\left( n \right)$ time for \ac{LDPC} codes (see section
\ref{subsec:prox:comp_perf}).
The same is true for \ac{LP} decoding using \ac{ADMM} (see section
\ref{subsec:admm:comp_perf}).
Additionally, both algorithms can be understood as message-passing algorithms,
\ac{LP} decoding using \ac{ADMM} similarly to
\cite[Sec. III. D.]{original_admm} and
\cite[Sec. II. B.]{efficient_lp_dec_admm}, and proximal decoding by starting
with algorithm \ref{alg:prox}, substituting for the gradient of the
code-constraint polynomial and separating it into two parts.
The algorithms in their message-passing form are depicted in figure
\ref{fig:comp:message_passing}.
$M_{j\to i}$ denotes a message transmitted from \ac{CN} $j$ to \ac{VN} $i$.
$M_{j\to}$ signifies the special case where a \ac{CN} transmits the same
message to all \acp{VN}.
%
\begin{figure}[h]
\centering
@ -184,14 +174,14 @@ Initialize $\boldsymbol{r}, \boldsymbol{s}, \omega, \gamma$
while stopping criterion unfulfilled do
for j in $\mathcal{J}$ do
$p_j \leftarrow \prod_{i\in N_c\left( j \right) } r_i $
$M_{j\to i} \leftarrow p_j^2 - p_j$|\Suppressnumber|
|\vspace{0.22mm}\Reactivatenumber|
end for
for i in $\mathcal{I}$ do
$s_i \leftarrow s_i + \gamma \left[ 4\left( s_i^2 - 1 \right)s_i
\phantom{\frac{4}{s_i}}\right.$|\Suppressnumber|
|\Reactivatenumber|$\left.+ \frac{4}{s_i}\sum_{j\in N_v\left( i \right) }
M_{j\to i} \right] $
$r_i \leftarrow r_i + \omega \left( s_i - y_i \right)$
end for
end while
@ -237,15 +227,10 @@ return $\tilde{\boldsymbol{c}}$
\label{fig:comp:message_passing}
\end{figure}%
%
This message-passing structure means that both algorithms can be implemented
very efficiently, as the update steps can be performed in parallel for all
\acp{CN} and for all \acp{VN}, respectively.
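To make this concrete, the following is a minimal NumPy sketch of one
proximal decoding iteration in message-passing form, transcribed from the
listing above; the dense $0/1$ parity-check matrix \texttt{H} and the
initialization of $\boldsymbol{r}$ and $\boldsymbol{s}$ are assumed:
\begin{lstlisting}[language=Python]
import numpy as np

def proximal_iteration(H, y, s, r, gamma=0.05, omega=0.05):
    # CN update: p_j is the product of r_i over all neighboring VNs; the
    # message M_{j->i} = p_j^2 - p_j is identical for every neighbor i.
    mask = H.astype(bool)
    p = np.array([r[row].prod() for row in mask])
    M = p**2 - p
    # VN update: gradient step on the code-constraint polynomial; since
    # M_{j->i} does not depend on i, the sum over j in N_v(i) is (H^T M)_i.
    s = s + gamma * (4.0 * (s**2 - 1.0) * s + (4.0 / s) * (H.T @ M))
    # Finally, r is moved toward the channel observation y.
    r = r + omega * (s - y)
    return s, r
\end{lstlisting}
Both update loops are independent across all \acp{CN} and \acp{VN},
respectively, which is what enables the parallel implementation described
above.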
In conclusion, the two algorithms have a very similar structure, where the
parts of the objective function relating to the likelihood and to the
@ -253,9 +238,8 @@ constraints are minimized in an alternating fashion.
With proximal decoding this minimization is performed for all constraints at once
in an approximate manner, while with \ac{LP} decoding using \ac{ADMM} it is
performed for each constraint individually and with exact results.
In terms of time complexity, both algorithms are linear with
respect to $n$ and are heavily parallelisable.
@ -263,24 +247,88 @@ more arithmetic operations are necessary in each iteration.
\section{Comparison of Simulation Results}%
\label{sec:comp:res}
The decoding performance of the two algorithms is shown in figure
\ref{fig:comp:prox_admm_dec} in the form of the \ac{FER}.
Also shown are the performance of the improved proximal decoding
algorithm presented in section \ref{sec:prox:Improved Implementation}
and, where available, the \ac{ML} decoding \ac{FER}.
The parameters chosen for the proximal and improved proximal decoders are
$\gamma=0.05$, $\omega=0.05$, $K=200$, $\eta = 1.5$ and $N=12$.
The parameters chosen for \ac{LP} decoding using \ac{ADMM} are $\mu = 5$,
$\rho = 1$, $K=200$, $\epsilon_\text{pri} = 10^{-5}$ and
$\epsilon_\text{dual} = 10^{-5}$.
For all codes considered in the scope of this work, \ac{LP} decoding using
\ac{ADMM} consistently outperforms both proximal decoding and the improved
version.
The decoding gain heavily depends on the code, evidently becoming greater for
codes with larger $n$ and reaching values of up to $\SI{2}{dB}$.
These simulation results can be interpreted with regard to the theoretical
structure of the decoding methods, as analyzed in section \ref{sec:comp:theo}.
The worse performance of proximal decoding is somewhat surprising considering
the global treatment of the constraints in contrast to the local treatment
in the case of \ac{LP} decoding using \ac{ADMM}.
It may, however, be explained by the nature of the calculations performed
in each case.
With proximal decoding, the calculations are approximate, leading
to the constraints never being quite satisfied.
With \ac{LP} decoding using \ac{ADMM},
the constraints are fulfilled for each parity check individually after each
iteration of the decoding process.
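One hedged way to quantify this distinction (an illustrative metric, not
one used in the simulations of this work) is the fraction of parity
checks violated by the hard decision after each iteration:
\begin{lstlisting}[language=Python]
import numpy as np

def parity_violation(H, x_hat):
    # Fraction of parity checks violated by a hard decision in {0,1}^n;
    # (H @ x_hat) % 2 is the syndrome over GF(2) for a 0/1 matrix H.
    return np.mean((H @ x_hat) % 2 != 0)
\end{lstlisting}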
The timing requirements of the decoding algorithms are visualized in figure
\ref{fig:comp:time}.
The data points have been generated by evaluating the metadata from the
\ac{FER} and \ac{BER} simulations, using the parameters mentioned earlier
in the discussion of the decoding performance.
While in this case the \ac{LP} decoding using \ac{ADMM} implementation
seems to be faster than the proximal decoding and improved proximal
decoding implementations, inferring general behavior from these
measurements is difficult.
Since actual implementations are being compared, the results depend on
factors such as the degree of optimization of each implementation.
Nevertheless, the run times of all three implementations are similar, and
each is reasonably performant, owing to the parallelisable structure of
the algorithms.
%
\begin{figure}[h]
\centering
\begin{tikzpicture}
\begin{axis}[grid=both,
xlabel={$n$}, ylabel={Time per frame (ms)},
width=0.6\textwidth,
height=0.45\textwidth,
legend style={at={(0.5,-0.52)},anchor=south},
legend cell align={left},]
\addplot[RedOrange, only marks, mark=square*]
table [col sep=comma, x=n, y=spf,
y expr=\thisrow{spf} * 1000]
{res/proximal/fps_vs_n.csv};
\addlegendentry{Proximal decoding}
\addplot[Gray, only marks, mark=*]
table [col sep=comma, x=n, y=spf,
y expr=\thisrow{spf} * 1000]
{res/hybrid/fps_vs_n.csv};
\addlegendentry{Improved proximal decoding ($N=12$)}
\addplot[NavyBlue, only marks, mark=triangle*]
table [col sep=comma, x=n, y=spf,
y expr=\thisrow{spf} * 1000]
{res/admm/fps_vs_n.csv};
\addlegendentry{\acs{LP} decoding using \acs{ADMM}}
\end{axis}
\end{tikzpicture}
\caption{Comparison of the timing requirements of the different decoder implementations}
\label{fig:comp:time}
\end{figure}%
%
\begin{figure}[h]
\centering
\begin{subfigure}[t]{0.48\textwidth}
@ -299,11 +347,14 @@ more arithmetic operations are necessary in each iteration.
\addplot[RedOrange, line width=1pt, mark=*, solid]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/proximal/2d_ber_fer_dfr_963965.csv};
\addplot[RedOrange, line width=1pt, mark=triangle, densely dashed]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/hybrid/2d_ber_fer_dfr_963965.csv};
\addplot[NavyBlue, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
{res/admm/ber_2d_963965.csv};
\addplot[Black, line width=1pt, mark=*]
table [col sep=comma, x=SNR, y=FER,]
{res/generic/fer_ml_9633965.csv};
\end{axis}
@ -329,10 +380,13 @@ more arithmetic operations are necessary in each iteration.
\addplot[RedOrange, line width=1pt, mark=*, solid]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/proximal/2d_ber_fer_dfr_bch_31_26.csv};
\addplot[RedOrange, line width=1pt, mark=triangle, densely dashed]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/hybrid/2d_ber_fer_dfr_bch_31_26.csv};
\addplot[NavyBlue, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
{res/admm/ber_2d_bch_31_26.csv};
\addplot[Black, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma,
discard if gt={SNR}{5.5},
discard if lt={SNR}{1},
@ -364,12 +418,15 @@ more arithmetic operations are necessary in each iteration.
discard if not={gamma}{0.05},
discard if gt={SNR}{5.5}]
{res/proximal/2d_ber_fer_dfr_20433484.csv};
\addplot[RedOrange, line width=1pt, mark=triangle, densely dashed]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/hybrid/2d_ber_fer_dfr_20433484.csv};
\addplot[NavyBlue, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma,
discard if not={mu}{3.0},
discard if gt={SNR}{5.5}]
{res/admm/ber_2d_20433484.csv};
\addplot[Black, line width=1pt, mark=*]
table [col sep=comma, x=SNR, y=FER,
discard if gt={SNR}{5.5}]
{res/generic/fer_ml_20433484.csv};
@ -396,7 +453,10 @@ more arithmetic operations are necessary in each iteration.
\addplot[RedOrange, line width=1pt, mark=*, solid]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/proximal/2d_ber_fer_dfr_20455187.csv};
\addplot[RedOrange, line width=1pt, mark=triangle, densely dashed]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/hybrid/2d_ber_fer_dfr_20455187.csv};
\addplot[NavyBlue, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
{res/admm/ber_2d_20455187.csv};
\end{axis}
@ -424,7 +484,10 @@ more arithmetic operations are necessary in each iteration.
\addplot[RedOrange, line width=1pt, mark=*, solid]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/proximal/2d_ber_fer_dfr_40833844.csv};
\addplot[RedOrange, line width=1pt, mark=triangle, densely dashed]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/hybrid/2d_ber_fer_dfr_40833844.csv};
\addplot[NavyBlue, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
{res/admm/ber_2d_40833844.csv};
\end{axis}
@ -450,7 +513,10 @@ more arithmetic operations are necessary in each iteration.
\addplot[RedOrange, line width=1pt, mark=*, solid]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/proximal/2d_ber_fer_dfr_pegreg252x504.csv};
\addplot[RedOrange, line width=1pt, mark=triangle, densely dashed]
table [x=SNR, y=FER, col sep=comma, discard if not={gamma}{0.05}]
{res/hybrid/2d_ber_fer_dfr_pegreg252x504.csv};
\addplot[NavyBlue, line width=1pt, mark=*]
table [x=SNR, y=FER, col sep=comma, discard if not={mu}{3.0}]
{res/admm/ber_2d_pegreg252x504.csv};
\end{axis}
@ -470,14 +536,18 @@ more arithmetic operations are necessary in each iteration.
xmin=10, xmax=50,
ymin=0, ymax=0.4,
legend columns=1,
legend cell align={left},
legend style={draw=white!15!black}]
\addlegendimage{RedOrange, line width=1pt, mark=*, solid}
\addlegendentry{Proximal decoding}
\addlegendimage{RedOrange, line width=1pt, mark=triangle, densely dashed}
\addlegendentry{Improved proximal decoding}
\addlegendimage{NavyBlue, line width=1pt, mark=*}
\addlegendentry{\acs{LP} decoding using \acs{ADMM}}
\addlegendimage{Black, line width=1pt, mark=*, solid}
\addlegendentry{\acs{ML} decoding}
\end{axis}
\end{tikzpicture}
@ -488,32 +558,4 @@ more arithmetic operations are necessary in each iteration.
\label{fig:comp:prox_admm_dec}
\end{figure}
%
\todo{Add BP curve}