diff --git a/latex/thesis/chapters/proximal_decoding.tex b/latex/thesis/chapters/proximal_decoding.tex
index 4604def..fe7d578 100644
--- a/latex/thesis/chapters/proximal_decoding.tex
+++ b/latex/thesis/chapters/proximal_decoding.tex
@@ -140,8 +140,12 @@ descent:%
 .\end{align}%
 %
 For the second step, minimizing the scaled code-constraint polynomial, the
-proximal gradient method is used and the \textit{proximal operator} of
+proximal gradient method is used \todo{The proximal gradient method is not
+just used for the second step. It is the name for the alternating iterative process}
+and the \textit{proximal operator} of
 $\gamma h\left( \tilde{\boldsymbol{x}} \right) $ has to be computed.
+\todo{Note about how the proximal gradient method is meant for convex optimization
+problems but used for a non-convex problem in this case?}
 It is then immediately approximated with gradient-descent:%
 %
 \begin{align*}
@@ -304,7 +308,7 @@ the gradient can be written as%
 %
 enabling its computation primarily with element-wise operations and
 matrix-vector multiplication.
-This is beneficial, as the libraries used for the implementation are
+This is beneficial, as the libraries employed for the implementation are
 heavily optimized for such calculations (e.g., through vectorization of the
 operations).
 \todo{Note about how the equation with which the gradient is calculated is
@@ -430,7 +434,8 @@ Evidently, while the decoding performance does depend on the value of
 $\gamma$, there is no single optimal value offering optimal performance, but
 rather a certain interval in which it stays largely unchanged.
 When examining a number of different codes (figure
-\ref{fig:prox:results_3d_multiple}), it is apparent that while the exact
+\ref{fig:prox:results_3d_multiple}), \todo{Move figure to appendix?}
+it is apparent that while the exact
 landscape of the graph depends on the code, the general behaviour is the same
 in each case.
 
@@ -483,25 +488,59 @@ in each case.
 \cite[\text{204.33.484}]{mackay_enc}; $\omega = 0.05, K=200, \eta=1.5$
 }%
 %
-\noindent This indicates \todo{This is a result fit for the conclusion}
-that while the choice of the parameter $\gamma$ significantly
-affects the decoding performance, there is not much benefit attainable in
-undertaking an extensive search for an exact optimum.
+\noindent This indicates that while the choice of the parameter $\gamma$
+significantly affects the decoding performance, there is not much benefit
+attainable in undertaking an extensive search for an exact optimum.
 Rather, a preliminary examination providing a rough window for $\gamma$ may
 be sufficient.
-TODO: $\omega, K$
+The parameter $\gamma$ describes the step size for the optimization step
+dealing with the code-constraint polynomial;
+the parameter $\omega$ describes the step size for the step dealing with the
+negative log-likelihood.
+The relationship between $\omega$ and $\gamma$ is studied in figure
+\ref{TODO}.
+The \ac{SNR} is kept constant at $\SI{4}{dB}$.
+The behaviour is similar to that observed for $\gamma$: the \ac{BER} is
+minimized when the value is kept within certain bounds, without displaying a
+clear optimum.
+It is noteworthy that the decoder seems to achieve the best performance for
+similar values of the two step sizes.
+Again, this observation applies to a multitude of different codes, depicted
+in figure \ref{TODO}.
+
+To better understand how to determine the optimal value for the parameter $K$,
+the average error is inspected.
+This time, $\gamma$ and $\omega$ are held constant and the average error is
+observed during each iteration of the decoding process for a number of
+different \acp{SNR}.
+The plots have been generated by averaging the error over TODO decodings.
+As some decodings go on for more iterations than others, the number of values
+averaged for each data point varies.
+This explains the bump observable around $k=\text{TODO}$, since after
+this point more and more correct decodings converge and stop iterating,
+leaving more and more faulty ones to be averaged.
+Remarkably, the \ac{SNR} does not seem to have any impact on the number of
+iterations necessary to reach the point at which the average error
+stabilizes.
+Furthermore, the improvement in decoding performance stagnates at a certain
+point, rendering a further increase in $K$ counterproductive, as it only
+raises the average runtime of the decoding process.
 
 Changing the parameter $\eta$ does not appear to have a significant effect on
 the decoding performance when keeping the value within a reasonable window
-(''slightly larger than one``, as stated in \cite[Sec. 3.2]{proximal_paper}),
+(``slightly larger than one'', as stated in \cite[Sec. 3.2]{proximal_paper}),
 which seems plausible considering its only function is ensuring numerical
 stability.
 
-Summarizing the above considerations, \ldots
-
-\begin{itemize}
- \item Conclusion: Number of iterations independent of \ac{SNR}
-\end{itemize}
+Summarizing the above considerations, an intricate strategy to find the exact
+optimal values for the parameters $\gamma$ and $\omega$ appears to bring
+limited benefit;
+an initial rudimentary examination to find the general bounds in which the two
+values should lie is sufficient.
+The choice of the parameter $K$ is independent of the \ac{SNR}, and raising
+its value above a certain threshold does not improve the decoding performance.
+The exact choice of $\eta$ is insignificant and the parameter is only relevant
+as a means to ensure numerical stability.
 
 \begin{figure}[H]
 \centering
@@ -1272,6 +1311,7 @@ used due to memory requirements?}
 Some deviations from linear behaviour are unavoidable because not all codes
 considered are actually \ac{LDPC} codes, or \ac{LDPC} codes constructed
 according to the same scheme.
+\todo{Mention on which hardware the results were generated}
 Nontheless, a generally linear relationship between the average time needed to
 decode a received frame and the length $n$ of the frame can be observed.
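For concreteness, the sketch below illustrates the alternating iteration that the added paragraphs describe: a gradient step on the negative log-likelihood with step size omega, followed by a gradient-descent approximation of the proximal step on the scaled code-constraint polynomial with step size gamma, repeated for at most K iterations. It is only a sketch under stated assumptions, not the thesis implementation: the helpers grad_nll and grad_h, the initialization at the channel output, the BPSK hard-decision convention, and the parity-check stopping rule are hypothetical; the defaults omega = 0.05 and K = 200 merely mirror the figure caption quoted in the diff, gamma = 0.05 is a placeholder inside the broad working interval the text reports, and the eta safeguard for numerical stability is omitted.

import numpy as np

def proximal_decode(y, H, grad_nll, grad_h, gamma=0.05, omega=0.05, K=200):
    # Alternating iteration sketched from the description above:
    # step 1 follows the negative log-likelihood gradient (step size omega),
    # step 2 approximates the proximal step on the code-constraint
    # polynomial by a gradient step (step size gamma).
    x = np.asarray(y, dtype=float).copy()  # start the search at the channel output (assumption)
    bits = (x < 0).astype(int)             # hard decision, assuming BPSK mapping 0 -> +1, 1 -> -1
    for _ in range(K):
        x -= omega * grad_nll(x, y)        # step 1: likelihood term
        x -= gamma * grad_h(x)             # step 2: code-constraint term
        bits = (x < 0).astype(int)
        if not np.any((H @ bits) % 2):     # all parity checks satisfied: stop early
            break
    return bits

The early exit mirrors the observation in the added text that correct decodings converge and stop iterating before the iteration cap K is reached, which is what produces the bump in the averaged error curves.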