diff --git a/src/thesis/chapters/2_fundamentals.tex b/src/thesis/chapters/2_fundamentals.tex index be5ce44..a9acd07 100644 --- a/src/thesis/chapters/2_fundamentals.tex +++ b/src/thesis/chapters/2_fundamentals.tex @@ -35,20 +35,21 @@ algorithm. % Codewords, n, k, rate % -One particularly important class of coding schemes is that of binary -linear block codes. -The information to be protected takes the form of a sequence of +Binary linear block codes form one particularly important class of +coding schemes. +The information to be protected is represented by a sequence of binary symbols, which is split into separate blocks. -Each block is encoded, transmitted, and decoded separately. +Then, each block is encoded, transmitted, and decoded separately. The encoding step introduces redundancy by mapping input messages $\bm{u} \in \mathbb{F}_2^k$ of length $k \in \mathbb{N}$ (called the \textit{information length}) onto \textit{codewords} $\bm{x} \in \mathbb{F}_2^n$ of length $n \in \mathbb{N}$ (called the \textit{block length}) with $n > k$. -A measure of the amount of introduced redundancy is the \textit{code -rate} $R = k/n$. -We call the set of all codewords $\mathcal{C}$ the \textit{code} -\cite[Sec.~3.1.1]{ryan_channel_2009}. +The \textit{code rate} $R = k/n$ is a measure of the amount of +introduced redundancy. +We call the set of all codewords +$\mathcal{C} = \{\bm{x}^{(1)}, \bm{x}^{(2)}, \ldots, \bm{x}^{(2^k)}\}$ +the \textit{code} \cite[Sec.~3.1.1]{ryan_channel_2009}. % % d_min and the [] Notation @@ -77,7 +78,7 @@ $[n,k,d_\text{min}]$. % Parity checks, H, and the syndrome % -A particularly elegant way of describing the code space $C$ is the +A particularly elegant way of describing the code space $\mathcal{C}$ is the notion of \textit{parity checks}. Since $\lvert \mathcal{C} \rvert = 2^k$ and $\lvert \mathbb{F}_2^n \rvert = 2^n$, there are $n-k$ conditions that constrain the additional @@ -86,14 +87,14 @@ These conditions, called parity checks, take the form of equations over $\mathbb{F}_2^n$, linking the individual positions of each codeword. We can arrange the coefficients of these equations in a \textit{parity-check matrix} (\acs{pcm}) $\bm{H} \in -\mathbb{F}_2^{(n-k) \times n}$ and equivalently define the code as -\cite[Sec.~3.1.1]{ryan_channel_2009} +\mathbb{F}_2^{(n-k) \times n}$, $\text{rank}(\bm{H}) = n-k$, and +equivalently define the code as \cite[Sec.~3.1.1]{ryan_channel_2009} \begin{align*} - \mathcal{C} = \left\{ \bm{x} \in \mathbb{F}_2^n : + \mathcal{C} := \text{kern}(\bm{H}) = \left\{ \bm{x} \in \mathbb{F}_2^n : \bm{H}\bm{x}^\text{T} = \bm{0} \right\} .% \end{align*} -Note that in general we may have linearly dependent parity checks, +In general, we may have linearly dependent parity checks, prompting us to define the \ac{pcm} as $\bm{H} \in \mathbb{F}_2^{m\times n}$ with $\hspace{2mm} m \ge n-k$ instead. The \textit{syndrome} $\bm{s} = \bm{H} \bm{v}^\text{T}$ describes @@ -118,9 +119,9 @@ $\bm{y} \in \mathbb{R}^n$, and \textit{hard-decision} decoding, where $\bm{y} \in \mathbb{F}_2^n$ \cite[Sec.~1.5.1.3]{ryan_channel_2009}. Finally, the decoder is responsible for obtaining an estimate $\hat{\bm{u}} \in \mathbb{F}_2^k$ of the original input message. -This is done by first finding an estimate $\hat{\bm{x}}$ of the sent +This can be done by first finding an estimate $\hat{\bm{x}}$ of the sent codeword and undoing the encoding. -The decoding problem that we generally attempt to solve thus consists +The decoding problem that we attempt to solve thus consists in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$.
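+As a brief illustration of these definitions, consider the $[7,4,3]$ Hamming code with its \ac{pcm} written in one common form (the specific matrix below is chosen here purely as an example) +\begin{align*} \bm{H} = \begin{pmatrix} 1 & 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \end{pmatrix} .% \end{align*} +The vector $\bm{x} = (1,0,0,0,1,1,0)$ fulfills all three parity checks, i.e., $\bm{H}\bm{x}^\text{T} = \bm{0}$, and is therefore a codeword. +Flipping its fourth bit yields $\bm{v} = (1,0,0,1,1,1,0)$ with syndrome $\bm{s} = \bm{H}\bm{v}^\text{T} = (1,1,1)^\text{T} \neq \bm{0}$, so the violated parity checks expose the error before any estimate $\hat{\bm{x}}$ is formed.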
\begin{figure}[t] @@ -168,9 +169,9 @@ in finding the best estimate $\hat{\bm{x}}$ given $\bm{y}$. % Shannon's noisy-channel coding theorem is stated for codes whose block -length approaches infinity. This suggests that as the block length -becomes larger, the performance of the considered codes should -generally improve. +length $n$ approaches infinity. +This suggests that as the block length becomes larger, the +performance of the considered codes should generally improve. However, the size of the \ac{pcm} of a linear block code grows quadratically with $n$. This would quickly render decoding intractable as we increase the @@ -189,13 +190,14 @@ This is exactly the motivation behind \ac{ldpc} codes These differ from ``classical codes'' in their decoding algorithms: Classical codes are usually decoded using one-step hard-decision decoding, whereas modern codes are suitable for iterative soft-decision -decoding \cite[Preface]{ryan_channel_2009}. The iterative decoding algorithms +decoding \cite[Preface]{ryan_channel_2009}. +For \ac{ldpc} codes, the iterative decoding algorithms are generally defined in terms of message passing on the \textit{Tanner graph} of a code. The Tanner graph is a bipartite graph that constitutes an alternative representation of the \ac{pcm}. We define two types of nodes: \Acp{vn}, corresponding to codeword bits, and \acp{cn}, corresponding to individual parity checks. -We then construct the Tanner graph by connecting each \ac{cn} to +Then, we construct the Tanner graph by connecting each \ac{cn} to the \acp{vn} that make up the corresponding parity check \cite[Sec.~5.1.2]{ryan_channel_2009}. \Cref{PCM and Tanner graph of the Hamming code} shows the Tanner @@ -273,10 +275,10 @@ Mathematically, we represent a \ac{vn} using the index $i \in and a \ac{cn} using the index $j \in \mathcal{J} := \left[ 0 : m-1 \right]$. We can then encode the information contained in the graph by defining -the neighborhood of a variable node $i$ as +the neighborhood of a \ac{vn} $i$ as $\mathcal{N}_\text{V} (i) = \left\{ j \in \mathcal{J} : \bm{H}_{j,i} = 1 \right\}$ -and that of a check node $j$ as +and the neighborhood of a \ac{cn} $j$ as $\mathcal{N}_\text{C} (j) = \left\{ i \in \mathcal{I} : \bm{H}_{j,i} = 1 \right\}$. @@ -379,15 +381,15 @@ the numbers of ones, of their rows and columns are constant Already during their introduction, regular \ac{ldpc} codes were shown to have a minimum distance scaling linearly with the block length $n$ for large values \cite[Ch.~2,~Theorem~1]{gallager_low_1960}, -which leads to the fact that they do not exhibit an error floor under -\ac{ml} decoding. -Irregular codes, on the other hand, generally do exhibit an error floor, -while their redeeming quality is the ability to reach near-capacity -performance in the waterfall region \cite[Intro.]{costello_spatially_2014}. +which leads to a more favorable behavior of the error rate at high +signal-to-noise ratios. +Irregular codes, on the other hand, generally exhibit a more pronounced +error floor. +However, they have the ability to reach near-capacity performance +in the waterfall region \cite[Intro.]{costello_spatially_2014}.
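+To make the graph notation concrete, consider once more the illustrative Hamming-code \ac{pcm} from above; being small and dense, it is not the \ac{pcm} of an actual \ac{ldpc} code, but it suffices to illustrate the definitions. +For that matrix, $\mathcal{N}_\text{V}(3) = \{0, 1, 2\}$, since column $3$ contains a one in every row, and $\mathcal{N}_\text{C}(0) = \{0, 1, 3, 4\}$, since row $0$ contains ones in exactly these columns. +Its column weights $(2,2,2,3,1,1,1)$ are not constant, so the corresponding Tanner graph is irregular, whereas every row weight equals $4$. +For a $(3,6)$-regular \ac{ldpc} code, in contrast, $\lvert \mathcal{N}_\text{V}(i) \rvert = 3$ and $\lvert \mathcal{N}_\text{C}(j) \rvert = 6$ hold for all $i$ and $j$; counting edges then gives $m = n \cdot 3/6$, i.e., a code rate of $R = 1 - 3/6 = 1/2$ if all parity checks are linearly independent.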
\subsection{Spatially-Coupled LDPC Codes} -A recent development in the field of \ac{ldpc} codes is that of +A more recent development in the field of \ac{ldpc} codes is that of \ac{sc}-\ac{ldpc} codes. Their key feature is that they combine the best properties of regular and irregular codes. @@ -399,11 +401,12 @@ waterfall region \cite[Intro.]{costello_spatially_2014}. The essential property of \ac{sc}-\ac{ldpc} codes is that codewords from different \textit{spatial positions}, which would ordinarily be sent one after the other independently, are linked. -This is achieved by connecting some \acp{vn} of one spatial position to -\acp{cn} of another, resulting in a \ac{pcm} of the form +This is achieved by introducing edges between \acp{vn} of one spatial +position and \acp{cn} of another, resulting in a \ac{pcm} of the form \cite[Eq.~1]{hassan_fully_2016} % -\begin{align*} +\begin{align} + \label{eq:PCM} \bm{H} = \begin{pmatrix} \bm{H}_0(1) & & \\ @@ -413,10 +416,11 @@ This is achieved by connecting some \acp{vn} of one spatial position to & & \bm{H}_K(L) \\ \end{pmatrix} , -\end{align*} +\end{align} % where $K \in \mathbb{N}$ is the \textit{coupling width} and $L \in \mathbb{N}$ is the number of spatial positions. +The parts of the \ac{pcm} left empty in \Cref{eq:PCM} are filled with zeros. This construction results in a Tanner graph as depicted in \Cref{fig:sc-ldpc-tanner}. @@ -513,7 +517,7 @@ Note that at the first few spatial positions some \acp{cn} have lower degrees. This leads to more reliable information about the \acp{vn} that, as we will see, is later passed to subsequent spatial positions during decoding. -This is precisely the effect that leads to the good performance of +This is precisely the effect that leads to the improved performance of \ac{sc}-\ac{ldpc} codes in the waterfall region \cite{costello_spatially_2014}. \subsection{Iterative Decoding} @@ -521,15 +525,14 @@ This is precisely the effect that leads to the good performance of % Introduction -\ac{ldpc} codes are generally decoded using efficient iterative -algorithms, something that is possible due to their sparsity -\cite[Sec.~5.3]{ryan_channel_2009}. -The algorithm originally proposed alongside LDPC codes for this -purpose by Gallager in 1960 is now known as the \ac{spa} +Owing to their sparse graphs, \ac{ldpc} codes admit efficient +iterative decoding algorithms \cite[Sec.~5.3]{ryan_channel_2009}. +The decoding algorithm originally proposed alongside \ac{ldpc} codes by +Gallager in 1960 is now known as the \ac{spa} \cite[5.4.1]{ryan_channel_2009}, also called \acf{bp}. The optimality criterion the \ac{spa} is built around is a -symbol-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}. +bit-wise \ac{map} decision \cite[Sec.~5.4.1]{ryan_channel_2009}. The core idea of the resulting algorithm is to view \acp{cn} and \acp{vn} as representing individual local codes. A \ac{cn} represents a single parity check on the connected \acp{vn}, @@ -539,11 +542,11 @@ should agree on its value; it can therefore be understood as a repetition code. The algorithm alternates between consolidating soft information about the \acp{vn} in the \acp{cn}, and consolidating soft information about the \acp{cn} in the \acp{vn}. -To this end, messages are passed back and forth along the edges of -the Tanner graph. +To this end, messages computed in the nodes are passed back and forth +along the edges of the Tanner graph. $L_{i\rightarrow j}$ represents a message passed from \ac{vn} $i$ to -\ac{cn} j, $L_{i\leftarrow j}$ represents a message passed from -\ac{cn} j to \ac{vn} i. +\ac{cn} $j$, while $L_{i\leftarrow j}$ represents a message passed from +\ac{cn} $j$ to \ac{vn} $i$.
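+As a small example, in the illustrative \ac{pcm} from above, \ac{vn} $3$ with $\mathcal{N}_\text{V}(3) = \{0, 1, 2\}$ sends the messages $L_{3\rightarrow 0}$, $L_{3\rightarrow 1}$, and $L_{3\rightarrow 2}$ and receives $L_{3\leftarrow 0}$, $L_{3\leftarrow 1}$, and $L_{3\leftarrow 2}$; one pair of oppositely directed messages thus travels along every edge of the Tanner graph.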
The \acp{vn} additionally receive messages \cite[5.4.2]{ryan_channel_2009} \begin{align*} \tilde{L}_i = \log \frac{P(X=0 \vert Y=y)}{P(X=1 \vert Y=y)}, @@ -574,7 +577,7 @@ possible cycles and are thus especially problematic. % Min-sum algorithm -A simplification of the \ac{spa} is the min-sum decoder. Here, the +A simplification of the \ac{spa} is the min-sum algorithm. Here, the \ac{cn} update is approximated as \cite[Sec.~5.5.1]{ryan_channel_2009} \begin{align*} L_{i \leftarrow j} = \prod_{i' \in \mathcal{N}_\text{C}(j)\setminus \{i\}}