diff --git a/latex/thesis/chapters/decoding_techniques.tex b/latex/thesis/chapters/decoding_techniques.tex
index 9fc48f7..803414c 100644
--- a/latex/thesis/chapters/decoding_techniques.tex
+++ b/latex/thesis/chapters/decoding_techniques.tex
@@ -259,7 +259,8 @@ binary vectors of length $d_j$ with even parity%
 parity-check $j$, but extended to the continuous domain.}%
 and $\boldsymbol{T}_j$ is the \textit{transfer matrix}, which selects
 the neighboring variable nodes
-of check node $j$ (i.e., the relevant components of $\boldsymbol{c}$ for parity-check $j$).
+of check node $j$ (i.e., the relevant components of $\tilde{\boldsymbol{c}}$
+for parity-check $j$).
 For example, if the $j$th row of the parity-check matrix $\boldsymbol{H}$ was
 $\boldsymbol{h}_j = \begin{bmatrix} 0 & 1 & 0 & 1 & 0 & 1 & 0 \end{bmatrix}$,
@@ -676,7 +677,7 @@ correspond to the so-called \textit{pseudo-codewords} introduced in
 However, since for \ac{LDPC} codes $\overline{Q}$ scales linearly with $n$
 instead of exponentially, it is a lot more tractable for practical applications.
-The resulting formulation of the relaxed optimization problem becomes:%
+The resulting formulation of the relaxed optimization problem becomes%
 %
 \begin{align}
 \begin{aligned}
@@ -805,7 +806,7 @@ $\boldsymbol{\lambda}_j = \mu \cdot \boldsymbol{u}_j \,\forall\,j\in\mathcal{J}$
 \tilde{c}_i &\leftarrow \frac{1}{\left| N_v\left( i \right) \right|}
 \left( \sum_{j\in N_v\left( i \right) } \Big( \left( \boldsymbol{z}_j \right)_i
 - \left( \boldsymbol{u}_j \right)_i \Big)
- - \gamma_i \right)
+ - \frac{\gamma_i}{\mu} \right)
 \hspace{3mm} && \forall i\in\mathcal{I} \\
 \boldsymbol{z}_j &\leftarrow \Pi_{\mathcal{P}_{d_j}}\left( \boldsymbol{T}_j\tilde{\boldsymbol{c}} + \boldsymbol{u}_j \right)
@@ -898,8 +899,7 @@ In order to derive the objective function, the authors begin with the
 material difference in the meaning of the rule.
 The only change is that what previously were \acp{PMF} now have to be
 expressed in terms of \acp{PDF}.}
-over $\boldsymbol{x}$
-:%
+over $\boldsymbol{x}$:%
 %
 \begin{align}
 \hat{\boldsymbol{x}} = \argmax_{\tilde{\boldsymbol{x}} \in \mathbb{R}^{n}}
@@ -1015,7 +1015,7 @@ descent:%
 %
 For the second step, minimizing the scaled code-constraint polynomial,
 the proximal gradient method is used and the \textit{proximal operator} of
-$\gamma h\left( \boldsymbol{x} \right) $ has to be computed.
+$\gamma h\left( \tilde{\boldsymbol{x}} \right) $ has to be computed.
 It is then immediately approximated with gradient-descent:%
 %
 \begin{align*}
@@ -1037,7 +1037,7 @@ The second step thus becomes%
 While the approximation of the prior \ac{PDF} made in equation
 (\ref{eq:prox:prior_pdf_approx}) theoretically becomes better with larger
 $\gamma$, the constraint that $\gamma$ be small is important,
-as it keeps the effect of $h\left( \boldsymbol{x} \right) $ on the landscape
+as it keeps the effect of $h\left( \tilde{\boldsymbol{x}} \right) $ on the landscape
 of the objective function small.
 Otherwise, unwanted stationary points, including local minima, are introduced.
 The authors say that ``in practice, the value of $\gamma$ should be adjusted
@@ -1079,7 +1079,7 @@ it suffices to consider only proportionality instead of equality.}%
 &\propto \tilde{\boldsymbol{x}} - \boldsymbol{y}
 ,\end{align*}%
 %
-allowing equation \ref{eq:prox:step_log_likelihood} to be rewritten as%
+allowing equation (\ref{eq:prox:step_log_likelihood}) to be rewritten as%
 %
 \begin{align*}
 \boldsymbol{r} \leftarrow \boldsymbol{s}
diff --git a/latex/thesis/chapters/theoretical_background.tex b/latex/thesis/chapters/theoretical_background.tex
index aa6497d..245fd54 100644
--- a/latex/thesis/chapters/theoretical_background.tex
+++ b/latex/thesis/chapters/theoretical_background.tex
@@ -203,7 +203,7 @@ by computing \cite[Sec. 2.1]{admm_distr_stats}%
 %
 The dual problem can then be solved iteratively using \textit{dual ascent}:
 starting with an
-initial estimate of $\boldsymbol{\lambda}$, calculate an estimate for $\boldsymbol{x}$
+initial estimate for $\boldsymbol{\lambda}$, calculate an estimate for $\boldsymbol{x}$
 using equation (\ref{eq:theo:admm_obtain_primal}); then, update $\boldsymbol{\lambda}$
 using gradient descent \cite[Sec. 2.1]{admm_distr_stats}:%
 %
@@ -215,7 +215,7 @@ using gradient descent \cite[Sec. 2.1]{admm_distr_stats}:%
 \hspace{5mm} \alpha > 0
 .\end{align*}
 %
-The algorithm can be improved by observing that when hen the objective function is separable in $\boldsymbol{x}$, the lagrangian is as well:
+The algorithm can be improved by observing that when the objective function is separable in $\boldsymbol{x}$, the lagrangian is as well:
 %
 \begin{align*}
 \text{minimize }\hspace{5mm} & \sum_{i=1}^{N} g_i\left( \boldsymbol{x}_i \right) \\
@@ -246,7 +246,7 @@ This modified version of dual ascent is called \textit{dual decomposition}:
 The \ac{ADMM} works the same way as dual decomposition.
 It only differs in the use of an \textit{augmented lagrangian}
-$\mathcal{L}_\mu\left( \boldsymbol{x}_{[1:N]} \boldsymbol{\lambda} \right)$
+$\mathcal{L}_\mu\left( \boldsymbol{x}_{[1:N]}, \boldsymbol{\lambda} \right)$
 in order to robustify the convergence properties.
 The augmented lagrangian extends the ordinary one with an additional
 penalty term with the penaly parameter $\mu$:
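Note on the `decoding_techniques.tex` hunk at line 805: it changes the ADMM variable-node update from subtracting $\gamma_i$ to subtracting $\gamma_i/\mu$, which is what minimizing $\gamma_i \tilde{c}_i + \frac{\mu}{2}\sum_{j \in N_v(i)} (\tilde{c}_i - (\boldsymbol{z}_j)_i + (\boldsymbol{u}_j)_i)^2$ actually gives. A quick numerical sanity check with made-up scalar values (function names and numbers below are illustrative, not from the thesis):

```python
# Sanity check of the corrected x-update: the minimizer of the scalar function
#   f(c) = gamma*c + (mu/2) * sum_j (c - z[j] + u[j])^2
# is  c* = ( sum_j (z[j] - u[j]) - gamma/mu ) / deg,
# i.e. the objective coefficient gamma enters scaled by 1/mu,
# matching the "- \frac{\gamma_i}{\mu}" side of the hunk.

def x_update(z, u, gamma, mu):
    """Closed-form per-variable update (illustrative toy version)."""
    deg = len(z)
    return (sum(zj - uj for zj, uj in zip(z, u)) - gamma / mu) / deg

def grad(c, z, u, gamma, mu):
    """Derivative of f at c; zero at the true minimizer."""
    return gamma + mu * sum(c - zj + uj for zj, uj in zip(z, u))

# Arbitrary toy values for one variable node of degree 3.
z, u = [0.2, 0.9, 0.4], [0.1, -0.3, 0.0]
gamma, mu = 1.5, 3.0
c_star = x_update(z, u, gamma, mu)
print(abs(grad(c_star, z, u, gamma, mu)) < 1e-9)  # -> True
```

With the uncorrected update (subtracting `gamma` instead of `gamma / mu`), the gradient at the returned point is nonzero whenever $\mu \neq 1$, which is exactly the bug the hunk fixes.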
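Note on the `theoretical_background.tex` hunks: they describe dual ascent as alternating between minimizing the Lagrangian over $\boldsymbol{x}$ for fixed $\boldsymbol{\lambda}$ and taking a gradient step on $\boldsymbol{\lambda}$. A minimal numerical sketch of that loop on an assumed toy problem (minimize $\frac{1}{2}\|\boldsymbol{x}-\boldsymbol{c}\|^2$ subject to $\sum_i x_i = 0$; everything below is illustrative, not thesis code):

```python
# Dual-ascent sketch for: minimize 0.5*||x - c||^2  s.t.  sum(x) = 0.
# Lagrangian: L(x, lam) = 0.5*||x - c||^2 + lam * sum(x).
# x-step has a closed form here: x_i = c_i - lam.
# lam-step is a gradient ascent step on the dual: lam += alpha * sum(x).

def dual_ascent(c, alpha=0.1, iters=200):
    lam = 0.0                                # initial estimate for lambda
    for _ in range(iters):
        x = [ci - lam for ci in c]           # primal minimizer for fixed lam
        lam += alpha * sum(x)                # gradient step on the dual
    return x, lam

x, lam = dual_ascent([1.0, 2.0, 3.0])
print([round(v, 6) for v in x])   # -> [-1.0, 0.0, 1.0]
print(round(lam, 6))              # -> 2.0
```

The result matches the analytic solution $x = c - \bar{c}\,\mathbf{1}$ (subtract the mean), and for a small enough step size $\alpha > 0$ the iteration converges geometrically, as the dual-ascent discussion in the patched section assumes.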