From b76bf585cfd8090bee37246151e2203587e03ab6 Mon Sep 17 00:00:00 2001 From: Tale152 Date: Sat, 19 Nov 2022 18:32:23 +0100 Subject: [PATCH] completed ch 7 highlighting --- .../chapter_7/sections/5_proto-mapreduce.tex | 16 +++---- .../sections/6_real-world_experiments.tex | 42 +++++++++---------- 2 files changed, 30 insertions(+), 28 deletions(-) diff --git a/document/chapters/chapter_7/sections/5_proto-mapreduce.tex b/document/chapters/chapter_7/sections/5_proto-mapreduce.tex index c0efb7c..16c3293 100644 --- a/document/chapters/chapter_7/sections/5_proto-mapreduce.tex +++ b/document/chapters/chapter_7/sections/5_proto-mapreduce.tex @@ -1,16 +1,18 @@ \section{Proto-MapReduce} -The MapReduce implementation used in this prototype is a simplified version of what discussed in \textit{chapter \ref{the_mapreduce_paradigm}}; said simplifications, while making it easier to implement, have the side effect of negatively influencing performances but, given the available time limitations, they are still acceptable in a prototypical context where the focus is to demonstrate the feasibility of mobile devices Contribution. +\textbf{The MapReduce implementation used in this prototype is a simplified version} of what was discussed in \textit{chapter \ref{the_mapreduce_paradigm}}; said simplifications, while making it \textbf{easier to implement}, have the \textbf{side effect of negatively influencing performance but}, given the available time limitations, they are \textbf{still acceptable in a prototypical context where the focus is to demonstrate the feasibility of mobile devices' Contribution}. -The first difference comes from how the data are handled. 
As explained before, the MapReduce paradigm handles a number of splits M and progressively applies the Map function to said splits; the intermediate results are grouped by key (using a partitioning function) in order to obtain R (with R { @@ -33,7 +33,7 @@ \subsection{Computation} } \end{lstlisting} -Every data region is computed by the Map Workers and, after every point in a particular region is classified, the output (visualized in \textit{figure \ref{fig:computation_region_computation}}) can be computed in the reduce function which simply reunites the intermediate results with the same key (hence belonging to the same class) in a single array. +Every data region is computed by the Map Workers and, \textbf{after every point in a particular region is classified, the output} (visualized in \textit{figure \ref{fig:computation_region_computation}}) \textbf{can be computed in the reduce function, which simply reunites the intermediate results with the same key} (hence belonging to the same class) in a single array. \begin{lstlisting} const reduceFunction = (p1, p2) => { @@ -49,7 +49,7 @@ \subsection{Computation} \label{fig:computation_region_computation} \end{figure} -As can be seen in \textit{figure \ref{fig:computation_final_result}}, after every region is mapped and then reduced, each point is assigned to one of the three classes. +As can be seen in \textit{figure \ref{fig:computation_final_result}}, after every region is mapped and then reduced, \textbf{each point is assigned to one of the three classes}. 
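The grouping-and-reuniting step described above can be sketched in plain JavaScript. This is an illustrative assumption, not the prototype's actual code: the `groupByKey` helper, the class names, and the sample data are invented, and the pairwise reduce function here only mirrors the array-concatenation behavior the text attributes to the listing's `reduceFunction`.

```javascript
// Hedged sketch: fold intermediate [key, value] pairs emitted by the
// Map Workers into a single array of points per class.

// Pairwise reduce: reunites two partial arrays of points belonging to
// the same class (mirrors the behavior described for reduceFunction).
const reduceFunction = (p1, p2) => p1.concat(p2);

// Hypothetical helper: groups [key, value] pairs by key.
const groupByKey = (pairs) => {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
};

// Invented intermediate results from two Map Workers: each value is an
// array of already-classified points.
const intermediate = [
  ["classA", [[0, 1]]],
  ["classB", [[5, 5]]],
  ["classA", [[1, 0]]],
];

// Fold each group with the pairwise reduce function, obtaining one
// array of points per class.
const result = new Map();
for (const [key, values] of groupByKey(intermediate)) {
  result.set(key, values.reduce(reduceFunction));
}
// result.get("classA") → [[0, 1], [1, 0]]
```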
\begin{figure}[!ht] \centering @@ -58,13 +58,13 @@ \subsection{Computation} \label{fig:computation_final_result} \end{figure} -Five experiments were performed: +\textbf{Five experiments} were performed: \begin{itemize} - \item 1000 values (10 regions, 100 points for each region) - \item 10000 values (100 regions, 100 points for each region)(shown in \textit{figure \ref{fig:computation_final_result}}) - \item 100000 values (100 regions, 1000 points for each region) - \item 1000000 values (1000 regions, 1000 points for each region) - \item 5000000 values (2000 regions, 2500 points for reach region) + \item \textbf{1000 values} (10 regions, 100 points for each region) + \item \textbf{10000 values} (100 regions, 100 points for each region) (shown in \textit{figure \ref{fig:computation_final_result}}) + \item \textbf{100000 values} (100 regions, 1000 points for each region) + \item \textbf{1000000 values} (1000 regions, 1000 points for each region) + \item \textbf{5000000 values} (2000 regions, 2500 points for each region) \end{itemize} \subsection{Setup} @@ -84,17 +84,17 @@ \subsection{Setup} \label{fig:experiment_devices_setup} \end{figure} -The setup of Contributing Endpoints was thus composed by 2 Interconnected Desktop Clients and 5 Interconnected Mobile Clients, placed in a 100 km range in central Italy, which were forcibly chosen isolating them in a dedicated American server in order to perform multiple experiments with the same setup. +The setup of Contributing Endpoints was thus composed of \textbf{2 Interconnected Desktop Clients} and \textbf{5 Interconnected Mobile Clients}, placed in a \textbf{100 km range} in central Italy, whose selection was forced by isolating them in a dedicated American server in order to perform multiple experiments with the same setup. 
-The Invoking Endpoint Prototype instance was placed in the E location (although it was not executed in the same computer which ran the Interconnected Desktop Client); said Invoking Endpoint requested the following resources for executing the MapReduce computation: +\textbf{The Invoking Endpoint Prototype instance was placed in the E location} (although it was not executed on the same computer that ran the Interconnected Desktop Client); said Invoking Endpoint requested the following resources for executing the MapReduce computation: \begin{itemize} - \item 4 Map Workers - \item 2 Reduce Workers + \item \textbf{4 Map Workers} + \item \textbf{2 Reduce Workers} \end{itemize} -Including the implicit MapReduce Master (which will be taken by one of the two computers), this adds up to the 7 devices specified earlier, which were used in each of the five experiments. +\textbf{Including the implicit MapReduce Master} (a role taken by one of the two computers), \textbf{this adds up to the 7 devices specified earlier}, which were used in each of the five experiments. \subsection{Results} -\textit{Figure \ref{fig:experiment_results}} shows the results for the five experiments performed, focusing on the total time and the average time taken by each value (both measured in milliseconds). Once again, these results were obtained using a very small pool of devices and the simplified nature of the MapReduce algorithm in this prototype significantly slows down the whole process (primarily because the intermediate results are first sent back to the MapReduce Master that then forwards them to the Reduce Worker, instead using a direct connection between Map Worker and Reduce Worker). +\textit{Figure \ref{fig:experiment_results}} shows the \textbf{results for the five experiments performed}, focusing on the \textbf{total time} and the \textbf{average time taken by each value} (both \textbf{measured in milliseconds}). 
Once again, these results were obtained using a very small pool of devices, and the simplified nature of the MapReduce algorithm in this prototype significantly slows down the whole process (primarily because the intermediate results are first sent back to the MapReduce Master, which then forwards them to the Reduce Worker, instead of using a direct connection between Map Worker and Reduce Worker). \begin{figure}[!ht] \centering @@ -103,13 +103,13 @@ \subsection{Results} \label{fig:experiment_results} \end{figure} -The first significant observation can be made looking at the first two experiments: the average time for each value drastically drops (\texttildelow6.7 times faster); this can be explained by considering that the total time also includes the recruitment phase where no computation is executed. In other terms, the number of values used in the first experiment is so small that their computation time becomes irrelevant, meaning that the recruitment phase is basically the only factor that influences the average time. The more values are computed (assuming the same number of devices are used), the less impactful the recruitment time becomes. - -Finally, \textit{figure \ref{fig:experiment_results_avg_ms_per_value}} focuses on comparing the average time for each value; it becomes apparent that, despite the not optimized algorithms used in this prototype, the more values are computed, the greater the advantage becomes, showing that a distributed computation participated also by mobile devices is not only feasible, but it can also provide value to the Customer. +The \textbf{first significant observation} can be made \textbf{looking at the first two experiments}: the \textbf{average time for each value drastically drops} (\texttildelow6.7 times faster); this can be \textbf{explained by considering that the total time also includes the recruitment phase where no computation is executed}. 
In other words, \textbf{the number of values used in the first experiment is so small that their computation time becomes irrelevant}, meaning that the recruitment phase is basically the only factor that influences the average time. \textbf{The more values are computed (assuming the same number of devices are used), the less impactful the recruitment time becomes}. \begin{figure}[!ht] \centering \includegraphics[scale=0.55]{document/chapters/chapter_7/images/experiment_results_avg_ms_per_value.png} \caption{Average time (milliseconds) for each value} \label{fig:experiment_results_avg_ms_per_value} -\end{figure} \ No newline at end of file +\end{figure} + +Finally, \textit{figure \ref{fig:experiment_results_avg_ms_per_value}} focuses on \textbf{comparing the average time for each value}; it becomes apparent that, despite the unoptimized algorithms used in this prototype, \textbf{the more values are computed, the greater the advantage becomes, showing that a distributed computation in which mobile devices also participate is not only feasible, but it can also provide value to the Customer}.
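The amortization effect described above can be illustrated with a small numeric sketch. The constants below are invented for illustration and are not the measured values from the experiments; only the shape of the formula (a fixed recruitment cost spread over n values) reflects the text.

```javascript
// Hedged sketch with made-up constants: why the average time per value
// drops as the number of values grows. The total time includes a fixed
// recruitment phase during which no computation is executed, so
//   avg(n) = (recruitmentMs + n * computeMsPerValue) / n
// which tends to computeMsPerValue as n increases.

const recruitmentMs = 5000;      // hypothetical fixed recruitment cost
const computeMsPerValue = 0.05;  // hypothetical per-value compute cost

const avgMsPerValue = (n) => (recruitmentMs + n * computeMsPerValue) / n;

// With few values the recruitment phase dominates the average...
const avgSmall = avgMsPerValue(1000);    // (5000 + 50) / 1000 = 5.05
// ...while with many values it becomes almost irrelevant.
const avgLarge = avgMsPerValue(5000000); // (5000 + 250000) / 5000000 = 0.051
```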