Reworked theoretical background

parent c074d3e034
commit 327ad3934e
@@ -28,7 +28,8 @@ Additionally, a shorthand notation will be used to denote series of indices and
 of indexed variables:%
 %
 \begin{align*}
-\left[ m:n \right] &:= \left\{ m, m+1, \ldots, n-1, n \right\} \\
+\left[ m:n \right] &:= \left\{ m, m+1, \ldots, n-1, n \right\},
+\hspace{5mm} m,n\in\mathbb{Z}\\
 x_{\left[ m:n \right] } &:= \left\{ x_m, x_{m+1}, \ldots, x_{n-1}, x_n \right\}
 .\end{align*}
 %
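For concreteness, a small worked instance of this shorthand: with $m=2$ and $n=5$, $\left[ 2:5 \right] = \left\{ 2, 3, 4, 5 \right\}$ and $x_{\left[ 2:5 \right]} = \left\{ x_2, x_3, x_4, x_5 \right\}$.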
@@ -40,7 +41,7 @@ and the \textit{Hadamard power}, the operator $\circ$ will be used:%
 &:= \begin{bmatrix} a_1 b_1 & \ldots & a_n b_n \end{bmatrix} ^\text{T},
 \hspace{5mm} &&\boldsymbol{a}, \boldsymbol{b} \in \mathbb{R}^n, \hspace{2mm} n\in \mathbb{N} \\
 \boldsymbol{a}^{\circ k} &:= \begin{bmatrix} a_1^k & \ldots & a_n^k \end{bmatrix}^\text{T},
-\hspace{5mm} &&\boldsymbol{a} \in \mathbb{R}^n, \hspace{2mm}k\in \mathbb{Z}
+\hspace{5mm} &&\boldsymbol{a} \in \mathbb{R}^n, \hspace{2mm}n\in \mathbb{N}, k\in \mathbb{Z}
 .\end{alignat*}
 %

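A minimal NumPy sketch of the two operations defined above, with toy vectors (variable names are illustrative, not from the thesis):

    # Hadamard (elementwise) product and Hadamard power of small vectors
    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 5.0, 6.0])

    print(a * b)   # [ 4. 10. 18.]  ->  [a_1 b_1, ..., a_n b_n]^T
    print(a ** 2)  # [ 1.  4.  9.]  ->  Hadamard power with k = 2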
@@ -59,11 +60,12 @@ This is known as modulation. The modulation scheme chosen here is \ac{BPSK}:%
 .\end{align*}
 %
 The symbol that reaches the receiver, $\boldsymbol{y}$, is distorted by the channel.
-This distortion is described by the channel model, which here is chosen to be \ac{AWGN}:%
+This distortion is described by the channel model, which in the context of
+this thesis is chosen to be \ac{AWGN}:%
 %
 \begin{align*}
-\boldsymbol{y} = \boldsymbol{x} + \boldsymbol{z},
-\hspace{5mm} z_i \in \mathcal{N}\left( 0, \frac{\sigma^2}{2} \right),
+\boldsymbol{y} = \boldsymbol{x} + \boldsymbol{n},
+\hspace{5mm} n_i \sim \mathcal{N}\left( 0, \frac{\sigma^2}{2} \right),
 \hspace{2mm} i \in \left[ 1:n \right]
 .\end{align*}
 %
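A short simulation sketch of this channel model, assuming the common BPSK mapping $0 \mapsto +1$, $1 \mapsto -1$ (the thesis may fix a different sign convention):

    # BPSK modulation followed by AWGN with per-component variance sigma^2/2
    import numpy as np

    rng = np.random.default_rng(0)
    sigma = 0.8

    c = rng.integers(0, 2, size=8)                  # codeword bits in F_2
    x = 1.0 - 2.0 * c                               # BPSK symbols in {+1, -1}
    noise = rng.normal(0.0, sigma / np.sqrt(2), 8)  # std = sqrt(sigma^2 / 2)
    y = x + noise                                   # received soft values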
@@ -81,11 +83,11 @@ conducting this process, whereby \textit{data words} are mapped onto longer
 \textit{codewords}, which carry redundant information.
 \Ac{LDPC} codes have become especially popular, since they are able to
 reach arbitrarily small probabilities of error at coderates up to the capacity
-of the channel \cite[Sec. II.B.]{mackay_rediscovery} and their structure allows
-for very efficient decoding.
+of the channel \cite[Sec. II.B.]{mackay_rediscovery} while having a structure
+that allows for very efficient decoding.

-The lengths of the data words and codewords are denoted by $k$ and $n$,
-respectively.
+The lengths of the data words and codewords are denoted by $k\in\mathbb{N}$
+and $n\in\mathbb{N}$, respectively.
 The set of codewords $\mathcal{C} \subset \mathbb{F}_2^n$ of a binary
 linear code can be represented using the \textit{parity-check matrix}
 $\boldsymbol{H} \in \mathbb{F}_2^{m\times n}$, where $m$ represents
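A sketch of the parity-check representation: $\boldsymbol{c}$ is a codeword exactly when all $m$ parity-checks are satisfied, i.e. $\boldsymbol{H}\boldsymbol{c}^\text{T} = \boldsymbol{0}$ over $\mathbb{F}_2$. The matrix below is a toy example, not a code from the thesis:

    # syndrome check: all-zero syndrome <=> c is a codeword
    import numpy as np

    H = np.array([[1, 1, 0, 1, 0],    # m = 2 parity-checks
                  [0, 1, 1, 0, 1]])   # n = 5 code bits
    c = np.array([1, 0, 1, 1, 1])

    syndrome = (H @ c) % 2
    print(syndrome)                   # [0 0] -> c satisfies both checks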
@@ -101,7 +103,7 @@ $\boldsymbol{c} \in \mathbb{F}_2^n$ using the \textit{generator matrix}
 $\boldsymbol{G} \in \mathbb{F}_2^{k\times n}$:%
 %
 \begin{align*}
-\boldsymbol{c} = \boldsymbol{u}\boldsymbol{G}
+\boldsymbol{c} = \boldsymbol{u}^\text{T}\boldsymbol{G}
 .\end{align*}
 %

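A sketch of the encoding step with a toy generator matrix (written here in the row-vector convention $\boldsymbol{c} = \boldsymbol{u}\boldsymbol{G}$; the transposed form above only changes whether $\boldsymbol{u}$ is read as a row or a column vector):

    # encoding over F_2: integer matrix product reduced modulo 2
    import numpy as np

    G = np.array([[1, 0, 0, 1, 1],    # k = 3, n = 5
                  [0, 1, 0, 1, 0],
                  [0, 0, 1, 0, 1]])
    u = np.array([1, 0, 1])           # data word

    c = (u @ G) % 2                   # codeword in F_2^5
    print(c)                          # [1 0 1 1 0]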
@@ -110,7 +112,7 @@ as described in section \ref{sec:theo:Preliminaries: Channel Model and Modulatio
 The received signal $\boldsymbol{y}$ is then decoded to obtain
 an estimate of the transmitted codeword, $\hat{\boldsymbol{c}}$.
 Finally, the encoding procedure is reversed and an estimate of the originally
-sent data word, $\hat{\boldsymbol{u}}$, is obtained.
+sent data word, $\hat{\boldsymbol{u}}$, is produced.
 The methods examined in this work are all based on \textit{soft-decision} decoding,
 i.e., $\boldsymbol{y}$ is considered to be in $\mathbb{R}^n$ and no preliminary decision
 is made by a demodulator.
@@ -156,9 +158,6 @@ figure \ref{fig:theo:channel_overview}.%
 \label{fig:theo:channel_overview}
 \end{figure}

-\todo{Explicitly mention $\boldsymbol{n}$}
-\todo{Mapper $\to$ Modulator?}
-
 The decoding process itself is generally based either on the \ac{MAP} or the \ac{ML}
 criterion:%
 %
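For reference, the standard form of the two criteria (the exact notation used in the thesis may differ): $\hat{\boldsymbol{c}}_\text{MAP} = \operatorname{arg\,max}_{\boldsymbol{c} \in \mathcal{C}} p\left( \boldsymbol{c} \mid \boldsymbol{y} \right)$, while $\hat{\boldsymbol{c}}_\text{ML} = \operatorname{arg\,max}_{\boldsymbol{c} \in \mathcal{C}} p\left( \boldsymbol{y} \mid \boldsymbol{c} \right)$; the two coincide for uniformly distributed codewords.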
@@ -183,7 +182,7 @@ This is especially true for \ac{LDPC} codes, as the established decoding
 algorithms are \textit{message passing algorithms}, which are inherently
 graph-based.

-Binary linear codes with a parity-check matrix $\boldsymbol{H}$ can be
+A binary linear code with a parity-check matrix $\boldsymbol{H}$ can be
 visualized using a \textit{Tanner} or \textit{factor graph}:
 Each row of $\boldsymbol{H}$, which represents one parity-check, is viewed as a
 \ac{CN}.
@@ -263,8 +262,9 @@ The neighbourhood of the $i$th \ac{VN} is denoted by $N_v\left( i \right)$.
 For the code depicted in figure \ref{fig:theo:tanner_graph}, for example,
 $N_c\left( 1 \right) = \left\{ 1, 3, 5, 7 \right\}$ and
 $N_v\left( 3 \right) = \left\{ 1, 2 \right\}$.
-\todo{Define $d_i$ and $d_j$}
+The degree $d_j$ of a \ac{CN} is defined as the number of adjacent \acp{VN}:
+$d_j := \left| N_c\left( j \right) \right|$; the degree of a \ac{VN} is
+similarly defined as $d_i := \left| N_v\left( i \right) \right|$.

 Message passing algorithms are based on the notion of passing messages between
 \acp{CN} and \acp{VN}.
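A sketch of how the neighbourhoods and degrees just defined can be read off a parity-check matrix (toy $\boldsymbol{H}$, not the code from the thesis figure; indices are 1-based to match the text):

    # Tanner-graph neighbourhoods: rows of H are CNs, columns are VNs
    import numpy as np

    H = np.array([[1, 1, 0, 1, 0],
                  [0, 1, 1, 0, 1]])

    def N_c(j):   # VNs adjacent to the j-th CN (row j of H)
        return [i + 1 for i in range(H.shape[1]) if H[j - 1, i]]

    def N_v(i):   # CNs adjacent to the i-th VN (column i of H)
        return [j + 1 for j in range(H.shape[0]) if H[j, i - 1]]

    print(N_c(1), len(N_c(1)))   # neighbourhood and degree d_j of CN 1
    print(N_v(2), len(N_v(2)))   # neighbourhood and degree d_i of VN 2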
@@ -273,13 +273,17 @@ It aims to compute the posterior probabilities
 $p_{C_i \mid \boldsymbol{Y}}\left(c_i = 1 | \boldsymbol{y} \right),\hspace{2mm} i\in\mathcal{I}$
 \cite[Sec. III.]{mackay_rediscovery} and use them to calculate the estimate $\hat{\boldsymbol{c}}$.
 For cycle-free graphs this goal is reached after a finite
-number of steps and \ac{BP} is thus equivalent to \ac{MAP} decoding.
+number of steps and \ac{BP} is equivalent to \ac{MAP} decoding.
 When the graph contains cycles, however, \ac{BP} only approximates the probabilities
 and is sub-optimal.
 This leads to generally worse performance than \ac{MAP} decoding for practical codes.
 Additionally, an \textit{error floor} appears for very high \acp{SNR}, making
 the use of \ac{BP} impractical for applications where a very low \ac{BER} is
 desired \cite[Sec. 15.3]{ryan_lin_2009}.
+Another popular decoding method for \ac{LDPC} codes is the
+\textit{min-sum algorithm}.
+This is a simplification of \ac{BP} using an approximation of the
+non-linear $\tanh$ function to improve the computational performance.


 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
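A sketch contrasting the exact tanh-based check-node update of \ac{BP} with the min-sum approximation mentioned in the added lines, for the \ac{LLR} messages entering a single \ac{CN} (illustrative values only; the full iterative schedule and the exclusion of the target edge are omitted):

    import numpy as np

    llrs = np.array([1.2, -0.4, 2.5])    # incoming LLR messages

    def cn_update_bp(l):
        # exact update: 2 * atanh( prod_i tanh(l_i / 2) )
        return 2.0 * np.arctanh(np.prod(np.tanh(l / 2.0)))

    def cn_update_min_sum(l):
        # approximation: product of signs times the smallest magnitude
        return np.prod(np.sign(l)) * np.min(np.abs(l))

    print(cn_update_bp(llrs))       # approx -0.18
    print(cn_update_min_sum(llrs))  # -0.4, same sign, larger magnitude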
@@ -438,12 +442,12 @@ which minimizes the objective function $g$.


 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\section{Optimization Methods}
+\section{An introduction to the proximal gradient method and ADMM}
 \label{sec:theo:Optimization Methods}

 \textit{Proximal algorithms} are algorithms for solving convex optimization
 problems that rely on the use of \textit{proximal operators}.
-The proximal operator $\textbf{prox}_f : \mathbb{R}^n \rightarrow \mathbb{R}^n$
+The proximal operator $\textbf{prox}_{\lambda f} : \mathbb{R}^n \rightarrow \mathbb{R}^n$
 of a function $f:\mathbb{R}^n \rightarrow \mathbb{R}$ is defined by
 \cite[Sec. 1.1]{proximal_algorithms}%
 %
@@ -456,8 +460,8 @@ of a function $f:\mathbb{R}^n \rightarrow \mathbb{R}$ is defined by
 This operator computes a point that is a compromise between minimizing $f$
 and staying in the proximity of $\boldsymbol{v}$.
 The parameter $\lambda$ determines how heavily each term is weighed.
-The \textit{proximal gradient method} is an iterative optimization method used to
-solve problems of the form%
+The \textit{proximal gradient method} is an iterative optimization method
+utilizing proximal operators, used to solve problems of the form%
 %
 \begin{align*}
 \text{minimize}\hspace{5mm}f\left( \boldsymbol{x} \right) + g\left( \boldsymbol{x} \right)
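A minimal sketch of the resulting iteration, $\boldsymbol{x}^{k+1} = \textbf{prox}_{\lambda g}\left( \boldsymbol{x}^k - \lambda \nabla f\left( \boldsymbol{x}^k \right) \right)$, for the classic instance $f\left( \boldsymbol{x} \right) = \frac{1}{2}\lVert \boldsymbol{A}\boldsymbol{x}-\boldsymbol{b} \rVert_2^2$ and $g = \lVert \cdot \rVert_1$, whose proximal operator is soft-thresholding (the problem data here is made up):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.normal(size=(10, 5))
    b = rng.normal(size=10)

    lam = 1.0 / np.linalg.norm(A, 2) ** 2   # step size <= 1/L, L = ||A||_2^2

    def prox_l1(v, t):
        # prox of t * ||.||_1: componentwise soft-thresholding
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    x = np.zeros(5)
    for _ in range(200):
        grad = A.T @ (A @ x - b)            # gradient step on the smooth f
        x = prox_l1(x - lam * grad, lam)    # proximal step on g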
@@ -473,11 +477,14 @@ and minimizing $g$ using the proximal operator
 ,\end{align*}
 %
 Since $g$ is minimized with the proximal operator and is thus not required
-to be differentiable, it can be used to encode the constraints of the problem.
+to be differentiable, it can be used to encode the constraints of the problem
+(e.g., in the form of an \textit{indicator function}, as mentioned in
+\cite[Sec. 1.2]{proximal_algorithms}).

-A special case of convex optimization problems are \textit{linear programs}.
-These are problems where the objective function is linear and the constraints
-consist of linear equalities and inequalities.
+The \ac{ADMM} is another optimization method.
+In this thesis it will be used to solve a \textit{linear program}, which
+is a special type of convex optimization problem, where the objective function
+is linear and the constraints consist of linear equalities and inequalities.
 Generally, any linear program can be expressed in \textit{standard form}%
 \footnote{The inequality $\boldsymbol{x} \ge \boldsymbol{0}$ is to be
 interpreted componentwise.}
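A sketch of the indicator-function remark above: the proximal operator of the indicator function of a convex set reduces to Euclidean projection onto that set, here the nonnegative orthant $\left\{ \boldsymbol{x} : \boldsymbol{x} \ge \boldsymbol{0} \right\}$ appearing in the standard-form constraint:

    import numpy as np

    def prox_indicator_nonneg(v, lam=1.0):
        # lam has no effect for indicator functions; the prox is simply
        # the Euclidean projection onto {x : x >= 0}
        return np.maximum(v, 0.0)

    print(prox_indicator_nonneg(np.array([-1.5, 0.3, 2.0])))  # [0.  0.3 2. ]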