Reworked theoretical background
parent c074d3e034
commit 327ad3934e
@@ -28,7 +28,8 @@ Additionally, a shorthand notation will be used to denote series of indices and
of indexed variables:%
%
\begin{align*}
-\left[ m:n \right] &:= \left\{ m, m+1, \ldots, n-1, n \right\} \\
+\left[ m:n \right] &:= \left\{ m, m+1, \ldots, n-1, n \right\},
+\hspace{5mm} m,n\in\mathbb{Z}\\
x_{\left[ m:n \right] } &:= \left\{ x_m, x_{m+1}, \ldots, x_{n-1}, x_n \right\}
.\end{align*}
%
@@ -40,7 +41,7 @@ and the \textit{Hadamard power}, the operator $\circ$ will be used:%
&:= \begin{bmatrix} a_1 b_1 & \ldots & a_n b_n \end{bmatrix} ^\text{T},
\hspace{5mm} &&\boldsymbol{a}, \boldsymbol{b} \in \mathbb{R}^n, \hspace{2mm} n\in \mathbb{N} \\
\boldsymbol{a}^{\circ k} &:= \begin{bmatrix} a_1^k \ldots a_n^k \end{bmatrix}^\text{T},
-\hspace{5mm} &&\boldsymbol{a} \in \mathbb{R}^n, \hspace{2mm}k\in \mathbb{Z}
+\hspace{5mm} &&\boldsymbol{a} \in \mathbb{R}^n, \hspace{2mm}n\in \mathbb{N}, k\in \mathbb{Z}
.\end{alignat*}
%

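To make the notation above concrete: the Hadamard product and the Hadamard power are simply elementwise operations. A minimal NumPy sketch with arbitrary example values (nothing here is taken from the thesis):

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

had_prod = a * b     # Hadamard product: [a_1*b_1, ..., a_n*b_n]
had_pow = a ** 3     # Hadamard power with k = 3: [a_1^3, ..., a_n^3]
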
@@ -59,11 +60,12 @@ This is known as modulation. The modulation scheme chosen here is \ac{BPSK}:%
.\end{align*}
%
The symbol that reaches the receiver, $\boldsymbol{y}$, is distorted by the channel.
-This distortion is described by the channel model, which here is chosen to be \ac{AWGN}:%
+This distortion is described by the channel model, which in the context of
+this thesis is chosen to be \ac{AWGN}:%
%
\begin{align*}
-\boldsymbol{y} = \boldsymbol{x} + \boldsymbol{z},
-\hspace{5mm} z_i \in \mathcal{N}\left( 0, \frac{\sigma^2}{2} \right),
+\boldsymbol{y} = \boldsymbol{x} + \boldsymbol{n},
+\hspace{5mm} n_i \in \mathcal{N}\left( 0, \frac{\sigma^2}{2} \right),
\hspace{2mm} i \in \left[ 1:n \right]
.\end{align*}
%
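As an illustration of the modulation and channel model above, the following sketch maps a codeword to BPSK symbols and adds white Gaussian noise with per-component variance sigma^2/2. The mapping 0 -> +1, 1 -> -1, the block length, and the value of sigma^2 are assumptions made for this example; the thesis may use a different sign convention.

import numpy as np

rng = np.random.default_rng(seed=0)

c = rng.integers(0, 2, size=8)                       # toy codeword bits c in F_2^n
x = 1.0 - 2.0 * c                                    # BPSK mapping, assuming 0 -> +1, 1 -> -1
sigma2 = 0.5                                         # assumed noise parameter sigma^2
n = rng.normal(0.0, np.sqrt(sigma2 / 2.0), x.shape)  # n_i drawn from N(0, sigma^2/2)
y = x + n                                            # received vector y = x + n
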
@@ -81,11 +83,11 @@ conducting this process, whereby \textit{data words} are mapped onto longer
\textit{codewords}, which carry redundant information.
\Ac{LDPC} codes have become especially popular, since they are able to
reach arbitrarily small probabilities of error at coderates up to the capacity
-of the channel \cite[Sec. II.B.]{mackay_rediscovery} and their structure allows
-for very efficient decoding.
+of the channel \cite[Sec. II.B.]{mackay_rediscovery} while having a structure
+that allows for very efficient decoding.

-The lengths of the data words and codewords are denoted by $k$ and $n$,
-respectively.
+The lengths of the data words and codewords are denoted by $k\in\mathbb{N}$
+and $n\in\mathbb{N}$, respectively.
The set of codewords $\mathcal{C} \subset \mathbb{F}_2^n$ of a binary
linear code can be represented using the \textit{parity-check matrix}
$\boldsymbol{H} \in \mathbb{F}_2^{m\times n}$, where $m$ represents
@@ -101,7 +103,7 @@ $\boldsymbol{c} \in \mathbb{F}_2^n$ using the \textit{generator matrix}
$\boldsymbol{G} \in \mathbb{F}_2^{k\times n}$:%
%
\begin{align*}
-\boldsymbol{c} = \boldsymbol{u}\boldsymbol{G}
+\boldsymbol{c} = \boldsymbol{u}^\text{T}\boldsymbol{G}
.\end{align*}
%

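The encoding step can be illustrated with a small toy code over F_2; the generator matrix G, the parity-check matrix H, and the data word below are made-up examples, not the code used in the thesis.

import numpy as np

# Toy systematic generator matrix G in F_2^{k x n} with k = 2, n = 4.
G = np.array([[1, 0, 1, 1],
              [0, 1, 0, 1]], dtype=int)
# A matching parity-check matrix H in F_2^{m x n}.
H = np.array([[1, 0, 1, 0],
              [1, 1, 0, 1]], dtype=int)

u = np.array([1, 1], dtype=int)   # data word u in F_2^k
c = (u @ G) % 2                   # codeword c = uG, arithmetic carried out mod 2

assert np.all((H @ c) % 2 == 0)   # every codeword satisfies the parity checks
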
@@ -110,7 +112,7 @@ as described in section \ref{sec:theo:Preliminaries: Channel Model and Modulatio
The received signal $\boldsymbol{y}$ is then decoded to obtain
an estimate of the transmitted codeword, $\hat{\boldsymbol{c}}$.
Finally, the encoding procedure is reversed and an estimate of the originally
-sent data word, $\hat{\boldsymbol{u}}$, is obtained.
+sent data word, $\hat{\boldsymbol{u}}$, is produced.
The methods examined in this work are all based on \textit{soft-decision} decoding,
i.e., $\boldsymbol{y}$ is considered to be in $\mathbb{R}^n$ and no preliminary decision
is made by a demodulator.
@@ -156,9 +158,6 @@ figure \ref{fig:theo:channel_overview}.%
\label{fig:theo:channel_overview}
\end{figure}

-\todo{Explicitly mention $\boldsymbol{n}$}
-\todo{Mapper $\to$ Modulator?}

The decoding process itself is generally based either on the \ac{MAP} or the \ac{ML}
criterion:%
%
@@ -183,7 +182,7 @@ This is especially true for \ac{LDPC} codes, as the established decoding
algorithms are \textit{message passing algorithms}, which are inherently
graph-based.

-Binary linear codes with a parity-check matrix $\boldsymbol{H}$ can be
+A binary linear code with a parity-check matrix $\boldsymbol{H}$ can be
visualized using a \textit{Tanner} or \textit{factor graph}:
Each row of $\boldsymbol{H}$, which represents one parity-check, is viewed as a
\ac{CN}.
@@ -263,8 +262,9 @@ The neighbourhood of the $i$th \ac{VN} is denoted by $N_v\left( i \right)$.
For the code depicted in figure \ref{fig:theo:tanner_graph}, for example,
$N_c\left( 1 \right) = \left\{ 1, 3, 5, 7 \right\}$ and
$N_v\left( 3 \right) = \left\{ 1, 2 \right\}$.

-\todo{Define $d_i$ and $d_j$}
+The degree $d_j$ of a \ac{CN} is defined as the number of adjacent \acp{VN}:
+$d_j := \left| N_c\left( j \right) \right| $; the degree of a \ac{VN} is
+similarly defined as $d_i := \left| N_v\left( i \right) \right|$.

Message passing algorithms are based on the notion of passing messages between
\acp{CN} and \acp{VN}.
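The neighbourhoods and degrees defined above can be read directly off the parity-check matrix: N_c(j) collects the positions of the ones in row j, and N_v(i) the positions of the ones in column i. A short sketch using a made-up matrix (not the code shown in figure \ref{fig:theo:tanner_graph}):

import numpy as np

# Made-up parity-check matrix with 3 check nodes and 5 variable nodes.
H = np.array([[1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [1, 0, 1, 1, 1]], dtype=int)

def N_c(j):
    """Variable nodes adjacent to check node j (1-based indices)."""
    return set(np.flatnonzero(H[j - 1, :]) + 1)

def N_v(i):
    """Check nodes adjacent to variable node i (1-based indices)."""
    return set(np.flatnonzero(H[:, i - 1]) + 1)

d_cn = {j: len(N_c(j)) for j in range(1, H.shape[0] + 1)}  # check-node degrees d_j
d_vn = {i: len(N_v(i)) for i in range(1, H.shape[1] + 1)}  # variable-node degrees d_i
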
@@ -273,13 +273,17 @@ It aims to compute the posterior probabilities
$p_{C_i \mid \boldsymbol{Y}}\left(c_i = 1 | \boldsymbol{y} \right),\hspace{2mm} i\in\mathcal{I}$
\cite[Sec. III.]{mackay_rediscovery} and use them to calculate the estimate $\hat{\boldsymbol{c}}$.
For cycle-free graphs this goal is reached after a finite
-number of steps and \ac{BP} is thus equivalent to \ac{MAP} decoding.
+number of steps and \ac{BP} is equivalent to \ac{MAP} decoding.
When the graph contains cycles, however, \ac{BP} only approximates the probabilities
and is sub-optimal.
+This leads to generally worse performance than \ac{MAP} decoding for practical codes.
+Additionally, an \textit{error floor} appears for very high \acp{SNR}, making
+the use of \ac{BP} impractical for applications where a very low \ac{BER} is
+desired \cite[Sec. 15.3]{ryan_lin_2009}.
Another popular decoding method for \ac{LDPC} codes is the
\textit{min-sum algorithm}.
This is a simplification of \ac{BP} using an approximation of the
non-linear $\tanh$ function to improve the computational performance.

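To illustrate the simplification the min-sum algorithm makes, the sketch below contrasts a textbook sum-product (tanh-rule) check-node update with its min-sum approximation, applied to the incoming messages of a single check node. This is a generic formulation for illustration only and is not claimed to be the exact update rule used later in the thesis.

import numpy as np

def check_update_sum_product(llrs):
    """BP check-node rule: for each edge, combine all other incoming LLRs
    via 2 * atanh( prod tanh(L / 2) )."""
    out = np.empty_like(llrs, dtype=float)
    for k in range(len(llrs)):
        others = np.delete(llrs, k)
        out[k] = 2.0 * np.arctanh(np.prod(np.tanh(others / 2.0)))
    return out

def check_update_min_sum(llrs):
    """Min-sum approximation: product of the signs times the minimum magnitude."""
    out = np.empty_like(llrs, dtype=float)
    for k in range(len(llrs)):
        others = np.delete(llrs, k)
        out[k] = np.prod(np.sign(others)) * np.min(np.abs(others))
    return out

llrs = np.array([1.2, -0.4, 2.5, 0.9])   # toy variable-to-check messages
print(check_update_sum_product(llrs))
print(check_update_min_sum(llrs))
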
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -438,12 +442,12 @@ which minimizes the objective function $g$.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\section{Optimization Methods}
+\section{An introduction to the proximal gradient method and ADMM}
\label{sec:theo:Optimization Methods}

\textit{Proximal algorithms} are algorithms for solving convex optimization
problems that rely on the use of \textit{proximal operators}.
-The proximal operator $\textbf{prox}_f : \mathbb{R}^n \rightarrow \mathbb{R}^n$
+The proximal operator $\textbf{prox}_{\lambda f} : \mathbb{R}^n \rightarrow \mathbb{R}^n$
of a function $f:\mathbb{R}^n \rightarrow \mathbb{R}$ is defined by
\cite[Sec. 1.1]{proximal_algorithms}%
%
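As a concrete instance of a proximal operator, the sketch below evaluates prox_{lambda f} for f(x) = ||x||_1, for which the minimization has a closed-form solution known as soft-thresholding. The choice of f is purely illustrative and is not a function appearing in the thesis.

import numpy as np

def prox_l1(v, lam):
    """prox_{lam * ||.||_1}(v) = argmin_x ||x||_1 + (1 / (2 * lam)) * ||x - v||_2^2,
    which reduces to componentwise soft-thresholding with threshold lam."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

v = np.array([1.5, -0.2, 0.7])
print(prox_l1(v, lam=0.5))   # approximately [1.0, 0.0, 0.2]
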
@@ -456,8 +460,8 @@ of a function $f:\mathbb{R}^n \rightarrow \mathbb{R}$ is defined by
This operator computes a point that is a compromise between minimizing $f$
and staying in the proximity of $\boldsymbol{v}$.
The parameter $\lambda$ determines how heavily each term is weighed.
-The \textit{proximal gradient method} is an iterative optimization method used to
-solve problems of the form%
+The \textit{proximal gradient method} is an iterative optimization method
+utilizing proximal operators, used to solve problems of the form%
%
\begin{align*}
\text{minimize}\hspace{5mm}f\left( \boldsymbol{x} \right) + g\left( \boldsymbol{x} \right)
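A generic sketch of the proximal gradient method described above, alternating a gradient step on the smooth part f with a proximal step on g; the fixed step size and the interface of prox_g are assumptions made for this illustration, not the exact configuration used in the thesis.

import numpy as np

def proximal_gradient(grad_f, prox_g, x0, step, num_iter=200):
    """Iterate x <- prox_{step * g}( x - step * grad_f(x) )."""
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iter):
        x = prox_g(x - step * grad_f(x), step)
    return x

# Toy problem: minimize 0.5 * ||x - b||_2^2 + ||x||_1.
b = np.array([2.0, -0.3, 0.8])
grad_f = lambda x: x - b
prox_g = lambda v, lam: np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)
x_opt = proximal_gradient(grad_f, prox_g, x0=np.zeros(3), step=0.5)

For this toy problem the iteration settles at b soft-thresholded with threshold 1, which is the closed-form minimizer.
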
@@ -473,11 +477,14 @@ and minimizing $g$ using the proximal operator
,\end{align*}
%
Since $g$ is minimized with the proximal operator and is thus not required
-to be differentiable, it can be used to encode the constraints of the problem.
+to be differentiable, it can be used to encode the constraints of the problem
+(e.g., in the form of an \textit{indicator function}, as mentioned in
+\cite[Sec. 1.2]{proximal_algorithms}).

-A special case of convex optimization problems are \textit{linear programs}.
-These are problems where the objective function is linear and the constraints
-consist of linear equalities and inequalities.
The \ac{ADMM} is another optimization method.
+In this thesis it will be used to solve a \textit{linear program}, which
+is a special type of convex optimization problem, where the objective function
+is linear and the constraints consist of linear equalities and inequalities.
Generally, any linear program can be expressed in \textit{standard form}%
\footnote{The inequality $\boldsymbol{x} \ge \boldsymbol{0}$ is to be
interpreted componentwise.}