Fixed spelling and grammatical errors in theory chapter

Andreas Tsouchlos 2023-04-11 20:52:25 +02:00
parent 4981c220cd
commit 5989c33621


@@ -3,7 +3,7 @@
In this chapter, the theoretical background necessary to understand this
work is given.
First, the used notation is clarified.
First, the notation used is clarified.
The physical aspects are detailed: the modulation scheme and channel model used.
A short introduction to channel coding with binary linear codes and especially
\ac{LDPC} codes is given.
@@ -33,7 +33,7 @@ of indexed variables:%
x_{\left[ m:n \right] } &:= \left\{ x_m, x_{m+1}, \ldots, x_{n-1}, x_n \right\}
.\end{align*}
%
In order to designate elemen-twise operations, in particular the\textit{Hadamard product}
In order to designate element-wise operations, in particular the \textit{Hadamard product}
and the \textit{Hadamard power}, the operator $\circ$ will be used:%
%
\begin{alignat*}{3}
@@ -82,7 +82,7 @@ Encoding the information using \textit{binary linear codes} is one way of
conducting this process, whereby \textit{data words} are mapped onto longer
\textit{codewords}, which carry redundant information.
\Ac{LDPC} codes have become especially popular, since they are able to
reach arbitrarily small probabilities of error at coderates up to the capacity
reach arbitrarily small probabilities of error at code rates up to the capacity
of the channel \cite[Sec. II.B.]{mackay_rediscovery} while having a structure
that allows for very efficient decoding.
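To make the mapping from data words to codewords concrete, the following is a
minimal sketch with a hypothetical systematic generator matrix (not the code
used in this work): a data word of length $k = 4$ is multiplied by the
generator matrix, with all arithmetic carried out modulo 2, to obtain a
codeword of length $n = 7$.%
%
\begin{verbatim}
import numpy as np

# Hypothetical systematic generator matrix [I_4 | P]; illustration only,
# not the parity-check structure considered later in this chapter.
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

u = np.array([1, 0, 1, 1])   # data word of length k = 4
c = u @ G % 2                # codeword of length n = 7, arithmetic modulo 2
print(c)                     # [1 0 1 1 0 1 0]: u followed by 3 redundant bits
\end{verbatim}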
@@ -252,13 +252,13 @@ Figure \ref{fig:theo:tanner_graph} shows the Tanner graph for the
\label{fig:theo:tanner_graph}
\end{figure}%
%
\noindent \acp{CN} and \acp{VN}, and by extention the rows and columns of
\noindent \acp{CN} and \acp{VN}, and by extension the rows and columns of
$\boldsymbol{H}$, are indexed with the variables $j$ and $i$.
The sets of all \acp{CN} and all \acp{VN} are denoted by
$\mathcal{J} := \left[ 1:m \right]$ and $\mathcal{I} := \left[ 1:n \right]$, respectively.
The \textit{neighbourhood} of the $j$th \ac{CN}, i.e., the set of all adjacent \acp{VN},
The \textit{neighborhood} of the $j$th \ac{CN}, i.e., the set of all adjacent \acp{VN},
is denoted by $N_c\left( j \right)$.
The neighbourhood of the $i$th \ac{VN} is denoted by $N_v\left( i \right)$.
The neighborhood of the $i$th \ac{VN} is denoted by $N_v\left( i \right)$.
For the code depicted in figure \ref{fig:theo:tanner_graph}, for example,
$N_c\left( 1 \right) = \left\{ 1, 3, 5, 7 \right\}$ and
$N_v\left( 3 \right) = \left\{ 1, 2 \right\}$.
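These neighborhoods can be read directly off the parity-check matrix:
$N_c\left( j \right)$ collects the column indices of the non-zero entries in
row $j$ of $\boldsymbol{H}$, and $N_v\left( i \right)$ the row indices of the
non-zero entries in column $i$.
The following sketch illustrates this with a small, hypothetical
$\boldsymbol{H}$ (not the matrix of the figure above); indices are 1-based to
match the notation used here.%
%
\begin{verbatim}
import numpy as np

# Hypothetical 3 x 5 parity-check matrix, for illustration only.
H = np.array([[1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [1, 0, 1, 1, 1]])

def N_c(j):
    """Neighborhood of the j-th check node: adjacent variable nodes."""
    return {i + 1 for i in np.flatnonzero(H[j - 1])}

def N_v(i):
    """Neighborhood of the i-th variable node: adjacent check nodes."""
    return {j + 1 for j in np.flatnonzero(H[:, i - 1])}

print(N_c(1))   # {1, 2, 4}
print(N_v(3))   # {2, 3}
\end{verbatim}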
@@ -282,7 +282,7 @@ the use of \ac{BP} impractical for applications where a very low \ac{BER} is
desired \cite[Sec. 15.3]{ryan_lin_2009}.
Another popular decoding method for \ac{LDPC} codes is the
\textit{min-sum algorithm}.
This is a simplification of \ac{BP} using an approximation of the the
This is a simplification of \ac{BP} using an approximation of the
non-linear $\tanh$ function to improve the computational performance.
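The difference between the two check-node update rules can be sketched as
follows (a simplified illustration with hypothetical incoming log-likelihood
ratios at a single \ac{CN}, not the exact formulation used later): \ac{BP}
combines the incoming messages through the $\tanh$ rule, whereas the min-sum
algorithm replaces this by the product of the signs times the smallest
magnitude.%
%
\begin{verbatim}
import numpy as np

def cn_update_bp(incoming):
    """BP check-node update: extrinsic message per edge via the tanh rule."""
    out = []
    for k in range(len(incoming)):
        others = np.delete(incoming, k)   # all messages except the target edge
        out.append(2.0 * np.arctanh(np.prod(np.tanh(others / 2.0))))
    return np.array(out)

def cn_update_minsum(incoming):
    """Min-sum approximation: product of signs times the minimum magnitude."""
    out = []
    for k in range(len(incoming)):
        others = np.delete(incoming, k)
        out.append(np.prod(np.sign(others)) * np.min(np.abs(others)))
    return np.array(out)

llrs = np.array([1.2, -0.4, 2.5, 0.8])   # hypothetical LLRs at one check node
print(cn_update_bp(llrs))       # tanh rule
print(cn_update_minsum(llrs))   # same signs, magnitudes overestimated
\end{verbatim}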
@@ -478,13 +478,13 @@ and minimizing $g$ using the proximal operator
%
Since $g$ is minimized with the proximal operator and is thus not required
to be differentiable, it can be used to encode the constraints of the problem
(e.g., in the form of an \textit{indicator funcion}, as mentioned in
(e.g., in the form of an \textit{indicator function}, as mentioned in
\cite[Sec. 1.2]{proximal_algorithms}).
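As a brief illustration of this point (stated here for a generic closed convex
set $\mathcal{C}$ rather than for the specific constraints used later): if $g$
is chosen as the indicator function of $\mathcal{C}$, its proximal operator
reduces to the Euclidean projection $\Pi_{\mathcal{C}}$ onto $\mathcal{C}$:%
%
\begin{align*}
g\left( \boldsymbol{x} \right)
= I_{\mathcal{C}}\left( \boldsymbol{x} \right)
:= \begin{cases}
0 & \boldsymbol{x} \in \mathcal{C} \\
+\infty & \text{otherwise}
\end{cases}
\hspace{5mm} \Rightarrow \hspace{5mm}
\operatorname{prox}_{g}\left( \boldsymbol{v} \right)
= \operatorname*{arg\,min}_{\boldsymbol{x} \in \mathcal{C}}
\frac{1}{2} \left\Vert \boldsymbol{x} - \boldsymbol{v} \right\Vert_2^2
= \Pi_{\mathcal{C}}\left( \boldsymbol{v} \right)
.\end{align*}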
The \ac{ADMM} is another optimization method.
In this thesis it will be used to solve a \textit{linear program}, which
is a special type of convex optimization problem, where the objective function
is linear and the constraints consist of linear equalities and inequalities.
is linear, and the constraints consist of linear equalities and inequalities.
Generally, any linear program can be expressed in \textit{standard form}%
\footnote{The inequality $\boldsymbol{x} \ge \boldsymbol{0}$ is to be
interpreted componentwise.}
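With problem data $\boldsymbol{c}$, $\boldsymbol{A}$ and $\boldsymbol{b}$
(the symbols here follow the usual convention and are meant only as a
reminder of the general shape of (\ref{eq:theo:admm_standard}) below), the
standard form reads:%
%
\begin{align*}
\text{minimize }\hspace{5mm} & \boldsymbol{c}^\text{T} \boldsymbol{x} \\
\text{subject to }\hspace{5mm} & \boldsymbol{A}\boldsymbol{x} = \boldsymbol{b},
\hspace{5mm} \boldsymbol{x} \ge \boldsymbol{0}
.\end{align*}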
@@ -499,7 +499,7 @@ interpreted componentwise.}
\label{eq:theo:admm_standard}
\end{alignat}%
%
A technique called \textit{lagrangian relaxation} \cite[Sec. 11.4]{intro_to_lin_opt_book}
A technique called \textit{Lagrangian relaxation} \cite[Sec. 11.4]{intro_to_lin_opt_book}
can then be applied.
First, some of the constraints are moved into the objective function itself
and weights $\boldsymbol{\lambda}$ are introduced. A new, relaxed problem
@@ -515,7 +515,7 @@ is then formulated as
\label{eq:theo:admm_relaxed}
\end{align}%
%
the new objective function being the \textit{lagrangian}%
the new objective function being the \textit{Lagrangian}%
%
\begin{align*}
\mathcal{L}\left( \boldsymbol{x}, \boldsymbol{\lambda} \right)
@@ -525,10 +525,10 @@ the new objective function being the \textit{lagrangian}%
.\end{align*}%
%
This problem is not directly equivalent to the original one, as the
solution now depends on the choice of the \textit{lagrange multipliers}
solution now depends on the choice of the \textit{Lagrange multipliers}
$\boldsymbol{\lambda}$.
Interestingly, however, for this particular class of problems,
the minimum of the objective function (herafter called \textit{optimal objective})
the minimum of the objective function (hereafter called \textit{optimal objective})
of the relaxed problem (\ref{eq:theo:admm_relaxed}) is a lower bound for
the optimal objective of the original problem (\ref{eq:theo:admm_standard})
\cite[Sec. 4.1]{intro_to_lin_opt_book}:%
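A brief sketch of the reasoning, assuming (as is common) that it is the
equality constraints which have been moved into the objective: for every
$\boldsymbol{x}$ that is feasible for the original problem the added term
vanishes, so the Lagrangian coincides with the original objective there, and
minimizing over the larger feasible set of the relaxed problem can only yield
a smaller or equal value:%
%
\begin{align*}
\boldsymbol{A}\boldsymbol{x} = \boldsymbol{b}
\hspace{3mm} \Rightarrow \hspace{3mm}
\mathcal{L}\left( \boldsymbol{x}, \boldsymbol{\lambda} \right)
= \boldsymbol{c}^\text{T}\boldsymbol{x}
+ \boldsymbol{\lambda}^\text{T}\left( \boldsymbol{b}
- \boldsymbol{A}\boldsymbol{x} \right)
= \boldsymbol{c}^\text{T}\boldsymbol{x}, \\
\text{hence} \hspace{5mm}
\min_{\boldsymbol{x} \ge \boldsymbol{0}}
\mathcal{L}\left( \boldsymbol{x}, \boldsymbol{\lambda} \right)
\le \min_{\boldsymbol{A}\boldsymbol{x} = \boldsymbol{b},\
\boldsymbol{x} \ge \boldsymbol{0}}
\boldsymbol{c}^\text{T}\boldsymbol{x}
.\end{align*}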
@@ -599,7 +599,7 @@ $g_i: \mathbb{R}^{n_i} \rightarrow \mathbb{R}$,
i.e., $g\left( \boldsymbol{x} \right) = \sum_{i=1}^{N} g_i
\left( \boldsymbol{x}_i \right)$,
where $\boldsymbol{x}_i,\hspace{1mm} i\in [1:N]$ are subvectors of
$\boldsymbol{x}$, the lagrangian is as well:
$\boldsymbol{x}$, the Lagrangian is as well:
%
\begin{align*}
\text{minimize }\hspace{5mm} & \sum_{i=1}^{N} g_i\left( \boldsymbol{x}_i \right) \\
@@ -638,18 +638,18 @@ This modified version of dual ascent is called \textit{dual decomposition}:
.\end{align*}
%
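In code form, the dual-decomposition iteration can be outlined as follows
(a schematic sketch only: the per-block minimizations of the Lagrangian are
represented by hypothetical solver callbacks, and the step size is a
placeholder constant):%
%
\begin{verbatim}
import numpy as np

def dual_decomposition(blocks, b, alpha=0.1, iters=100):
    """Schematic dual decomposition.

    blocks: list of (A_i, argmin_i) pairs, where argmin_i(lam) returns the
    x_i minimizing g_i(x_i) - lam^T A_i x_i (the separable Lagrangian terms).
    """
    lam = np.zeros(len(b))
    for _ in range(iters):
        # Local minimizations; these can be carried out in parallel.
        xs = [argmin_i(lam) for (_, argmin_i) in blocks]
        # Dual ascent step on the multipliers, driven by the residual.
        residual = b - sum(A_i @ x_i for (A_i, _), x_i in zip(blocks, xs))
        lam = lam + alpha * residual
    return xs, lam
\end{verbatim}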
The \ac{ADMM} works the same way as dual decomposition.
It only differs in the use of an \textit{augmented lagrangian}
\ac{ADMM} works the same way as dual decomposition.
It only differs in the use of an \textit{augmented Lagrangian}
$\mathcal{L}_\mu\left( \boldsymbol{x}_{[1:N]}, \boldsymbol{\lambda} \right)$
in order to strengthen the convergence properties.
The augmented lagrangian extends the ordinary one with an additional penalty term
The augmented Lagrangian extends the ordinary one with an additional penalty term
with the penalty parameter $\mu$:
%
\begin{align*}
\mathcal{L}_\mu \left( \boldsymbol{x}_{[1:N]}, \boldsymbol{\lambda} \right)
= \underbrace{\sum_{i=1}^{N} g_i\left( \boldsymbol{x_i} \right)
+ \boldsymbol{\lambda}^\text{T}\left( \boldsymbol{b}
- \sum_{i=1}^{N} \boldsymbol{A}_i\boldsymbol{x}_i \right)}_{\text{Ordinary lagrangian}}
- \sum_{i=1}^{N} \boldsymbol{A}_i\boldsymbol{x}_i \right)}_{\text{Ordinary Lagrangian}}
+ \underbrace{\frac{\mu}{2}\left\Vert \sum_{i=1}^{N} \boldsymbol{A}_i\boldsymbol{x}_i
- \boldsymbol{b} \right\Vert_2^2}_{\text{Penalty term}},
\hspace{5mm} \mu > 0