Examiner 2 corrections; add new appendix with stats methodology

atyndall · Jun 20, 2015 · efda2ba
1 parent dbba277
Showing 11 changed files with 143 additions and 82 deletions.
diff --git a/conclusion/conclusion.pdf b/conclusion/conclusion.pdf
diff --git a/conclusion/conclusion.tex b/conclusion/conclusion.tex
@@ -16,7 +16,7 @@ \subsection{Low Cost}
 Right now we are at the stage where this technology is economical for researchers to investigate, but a future where it becomes economical for consumers is approaching fast. We believe by selecting the components that we have at the current price point, we have met the project's goal of low cost.
 
 \subsection{Non-Invasive}
-To ensure wide adoption, minimizing privacy concerns is necessary. We viewed creating a system with little means by which to  surveil occupants as the best way to minimize such concerns.
+To ensure wide adoption, minimising privacy concerns is necessary. We viewed creating a system with little means by which to surveil occupants as the best way to minimise such concerns.
 
 As discussed in the Literature Review (\Fref{sec:litreview:sensors:analysis}), we concluded that the \mlx provides the best trade-off between accuracy and non-invasiveness of those sensing systems studied. It provides this trade-off from two different angles; the infra-red aspect and the low-resolution aspect. 
 
@@ -27,9 +27,9 @@ \subsection{Non-Invasive}
 \subsection{Reliable}
 Creating a system that is wholly automated and can detect occupants with a high level of accuracy is important to ensure that climate control and other occupant-driven tasks are reliably executed.
 
-As discussed in \Fref{subsec:classification}, we were unable to replicate ThermoSense's RMSEs of 0.346, 0.409 and 0.385 with either $k$-nearest Neighbours, Linear Regression, or Multi-Layer Perceptron respectively. This suggests that the classifiers ThermoSense used were highly sensitive to their sensor's specific properties.
+As discussed in \Fref{subsec:classification}, we were unable to replicate ThermoSense's Root-Mean-Square Errors (RMSEs) of 0.346, 0.385 and 0.409~occupants with $k$-Nearest Neighbours, Linear Regression, or Multi-Layer Perceptron respectively. This suggests that the classifiers ThermoSense used were highly sensitive to their sensor's specific properties.
 
-However among our own selected machine learning algorithms, K* and C4.5 achieved accuracies in the 80\%+ range, exceeding our original goal of 75\%. These algorithms also improved upon ThermoSense's best RMSE with RMSEs of 0.304 and 0.314 respectively. Both of these algorithms leverage entropy measures as a way of partitioning data, suggesting that entropy-based approaches may be particularly suited to our dataset.
+However, among our own selected machine learning algorithms, K* and C4.5 achieved accuracies in the 80\%+ range, exceeding our original goal of 75\%. These algorithms also improved upon ThermoSense's best RMSE with RMSEs of 0.304 and 0.314~occupants respectively. Both of these algorithms leverage entropy measures as a way of partitioning data, suggesting that entropy-based approaches may be particularly suited to our dataset.
 
 Using the K* or C4.5 machine learning algorithm, we are confident that this prototype could achieve appropriate levels of accuracy for its occupancy goals, and believe that the reliability requirements of our project have been met.
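For a sense of scale, the relative RMSE improvement over ThermoSense's best reported figure (0.346 occupants, KNN) can be worked out from the numbers quoted in this section. This is a back-of-envelope calculation on those values only, not a result reported in the thesis:

```latex
% requires amsmath; uses only the RMSE values quoted in the paragraph above
\begin{align*}
\text{K*:}   \quad & \frac{0.346 - 0.304}{0.346} \approx 0.121 \quad (\text{about } 12.1\%) \\
\text{C4.5:} \quad & \frac{0.346 - 0.314}{0.346} \approx 0.092 \quad (\text{about } 9.2\%)
\end{align*}
```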
 

diff --git a/evaluation/evaluation.pdf b/evaluation/evaluation.pdf
diff --git a/evaluation/evaluation.tex b/evaluation/evaluation.tex
@@ -107,7 +107,7 @@ \subsection{Executing Weka Tests}
 
 Weka's ``iBk'' function is used to perform a KNN calculation, configuring \texttt{distanceWeighting} to be ``Weight by 1-distance'' and \texttt{KNN} to be 5, to make the classification as similar in function to the ThermoSense approach as is possible. ThermoSense does not specify what validation technique they used, so we elected to use a 10-fold cross-validation. We limit the scope of our study to $k = 5$.
 
-Thirdly, they use a Linear Regression model of $y = \beta_A A + \beta_S S + \beta $, whereby $A$ is the number of active pixels, $S$ is the size of the largest connected component, and the $\beta$ values represent the corresponding coefficients. They opt to exclude the third feature, the number of connected components, as their testing indicates that excluding it minimizes the Root Mean Squared Error (RMSE) further. 
+Thirdly, they use a Linear Regression model of $y = \beta_A A + \beta_S S + \beta$, where $A$ is the number of active pixels, $S$ is the size of the largest connected component, and the $\beta$ values represent the corresponding coefficients. They opt to exclude the third feature, the number of connected components, as their testing indicates that excluding it minimises the Root-Mean-Square Error (RMSE) further.
 
 We use Weka's ``LinearRegression'' function, and exclude the number of connected components (\texttt{numconnected}) attribute from the feature vector list, as ThermoSense does, to attempt to match this approach. We limit the scope of our study to Linear Regressions that exclude the number of connected components.
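The effect of dropping the attribute can be written out explicitly. In the sketch below, $C$ (the number of connected components) and its coefficient $\beta_C$ are symbols introduced here purely for illustration; they do not appear in the text above. $\beta$ is the intercept, as in the original model:

```latex
% Full three-feature model (hypothetical notation: C and \beta_C are assumptions)
\[ y_{\text{full}} = \beta_A A + \beta_S S + \beta_C C + \beta \]
% Reduced model used by ThermoSense and in this replication:
% the same form with the C term removed, and the remaining
% coefficients refitted on the two-feature data
\[ y = \beta_A A + \beta_S S + \beta \]
```

Removing \texttt{numconnected} from the feature vector before fitting corresponds to the second form: the regression never sees $C$, so no coefficient is estimated for it.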
 
@@ -208,16 +208,16 @@ \subsection{Classification}
 \centering
 \begin{tabular}{|l|r|r|r|}
 \hline
-\textbf{Classifier} & \textbf{RMSE} & \textbf{Precision (\%)} & \textbf{Correlation ($r$)} \\ \hline
+\textbf{Classifier} & \textbf{RMSE$^1$} & \textbf{Precision (\%)} & \textbf{Correlation ($r$)} \\ \hline
 \multicolumn{4}{|c|}{\cellcolor{black!15} ThermoSense Actual}                         \\ \hline
-KNN$^1$             & 0.346         &             &              \\ \hline
-Lin Reg$^2$         & 0.385         &             & 0.926        \\ \hline
+KNN$^2$             & 0.346         &             &              \\ \hline
+Lin Reg$^3$         & 0.385         &             & 0.926        \\ \hline
 MLP                 & 0.409         &             & 0.945        \\ \hline
 \multicolumn{4}{|c|}{\cellcolor{black!15} ThermoSense Replication}                    \\ \hline
-KNN (Nom)$^1$       & 0.364         & 65.65       &              \\ \hline
+KNN (Nom)$^2$       & 0.364         & 65.65       &              \\ \hline
 MLP                 & 0.592         &             & 0.687        \\ \hline
-Lin Reg$^2$         & 0.525         &             & 0.589        \\ \hline
-KNN (Num)$^1$       & 1.123         &             & 0.377        \\ \hline
+Lin Reg$^3$         & 0.525         &             & 0.589        \\ \hline
+KNN (Num)$^2$       & 1.123         &             & 0.377        \\ \hline
 \multicolumn{4}{|c|}{\cellcolor{black!15} Numeric}                                    \\ \hline
 K*                  & 0.423         &             & 0.760        \\ \hline
 0-R                 & 0.651         &             & -0.118       \\ \hline
@@ -229,25 +229,26 @@ \subsection{Classification}
 N. Bayes            & 0.405         & 63.59       &              \\ \hline
 0-R                 & 0.442         & 49.74       &              \\ \hline
 \end{tabular}\\
-\parbox{300pt}{
-$^1$: Includes zero occupant cases in training data \\
-$^2$: Excludes number of connected components feature \\
-\%: Precision, measuring a nominal test result \\
-$r$: Correlation coefficient, measuring a numeric test result \\
+\parbox{300pt}{\footnotesize
+$^1$: Model deviation from occupant ground truth (see Appendix \ref{sec:rmse}) \\
+$^2$: Includes zero occupant cases in training data \\
+$^3$: Excludes number of connected components feature \\
+\%: Precision (see Appendix \ref{sec:precision}), measuring a nominal test result \\
+$r$: Pearson's $r$ (see Appendix \ref{sec:correlation}), measuring a numeric test result \\
 }
 \caption{Results of the Classification Experiment Set, replicating ThermoSense algorithms and using self-selected algorithms}
 \label{tab:results:set1}
 \end{table}
 
 Significant care was taken to ensure that the same classification parameters were used between our experiments and those performed in ThermoSense to provide as accurate as possible a comparison between our results. However, there were some ambiguities with the ThermoSense results that have made it more difficult to determine which parameters to choose. In particular, with reference to the $k$-Nearest Neighbours tests (KNN), it was ambiguous in the ThermoSense paper as to whether this data used a nominal classification or a numeric classification.
 
-Because of this, four tests were performed overall to replicate the ThermoSense results as closely as possible; KNN tests for both numeric and nominal representations of data, a Multi-Layer Perceptron numeric test (MLP) and a Linear Regression numeric test (Lin Reg). With these tests we found that our prototype did not achieve comparable results. ThermoSense reported correlation coefficients ($r$) of around 0.9 for their MLP and Lin Reg tests, however we could not replicate these results, with our best being 0.69 and 0.59 respectively. We were also unable to replicate the low Root Mean Squared Errors (RMSEs) reported by ThermoSense, with their RMSEs for KNN, MLP and Lin Reg being 0.346, 0.385 and 0.409 respectively, while ours were 0.364 (KNN Nominal Case), 1.123 (KNN Numeric Case), 0.592 (MLP) and 0.525 (Lin Reg). Our numeric KNN test performed worse than the 0-R benchmark for numeric tests, indicating an exceedingly poor classification result, with it achieving an RMSE of 1.123 vs. the 0-R's 0.651.
+Because of this, four tests were performed overall to replicate the ThermoSense results as closely as possible: KNN tests for both numeric and nominal representations of data, a Multi-Layer Perceptron numeric test (MLP) and a Linear Regression numeric test (Lin Reg). With these tests (\Fref{tab:results:set1}) we found that our prototype did not achieve comparable results. ThermoSense reported correlation coefficients ($r$, see Appendix \ref{sec:correlation}) of around 0.9 for their MLP and Lin Reg tests; however, we could not replicate these results, with our best being 0.69 and 0.59 respectively. We were also unable to replicate the low Root-Mean-Square Errors (RMSEs) reported by ThermoSense, with their RMSEs for KNN, Lin Reg and MLP being 0.346, 0.385 and 0.409~occupants respectively, while ours were 0.364 (KNN Nominal Case), 1.123 (KNN Numeric Case), 0.592 (MLP) and 0.525~occupants (Lin Reg). Our numeric KNN test performed worse than the 0-R benchmark for numeric tests, indicating an exceedingly poor classification result, with it achieving an RMSE of 1.123~occupants vs. the 0-R's 0.651~occupants.
 
-For our own proposed nominal classification algorithms, our accuracies were significantly improved, and in some cases exceeded the RMSEs reported by ThermoSense. Within our dataset, the K* and C4.5 algorithms were most accurate, with accuracies of 82.56\% and 82.39\% respectively. They both achieved RMSEs lower than the best achieved by ThermoSense, with their 0.304 and 0.314 a significant improvement on ThermoSense's KNN RMSE of 0.346.
+For our own proposed nominal classification algorithms, accuracy improved markedly, and in some cases the RMSEs were lower than those reported by ThermoSense. Within our dataset, the K* and C4.5 algorithms were most accurate, with accuracy (or precision, see Appendix \ref{sec:precision}) of 82.56\% and 82.39\% respectively. They both achieved RMSEs lower than the best achieved by ThermoSense, with their 0.304 and 0.314~occupants improving on ThermoSense's KNN RMSE of 0.346~occupants.
 
-Following down the ranking, our nominal MLP performed next best, with an accuracy of 77.14\%, and an RMSE of 0.362, which is slightly higher than ThermoSense's best result. Following, the Support Vector Machine (SVM) implementation achieved a relatively poor accuracy of 67.18\% with an RMSE of 0.398, and finally the Naive Bayes (N. Bayes) approach, achieved the worst accuracy of 63.59\% with an RMSE of 0.405. None of these techniques however achieved an RMSE or accuracy worse than our 0-R benchmark, which achieved an RMSE of 0.442 and an accuracy of 49.74\%.
+Further down the ranking, our nominal MLP performed next best, with an accuracy of 77.14\% and an RMSE of 0.362~occupants, which is slightly higher than ThermoSense's best result. Next, the Support Vector Machine (SVM) implementation achieved a relatively poor accuracy of 67.18\% with an RMSE of 0.398~occupants, and finally the Naive Bayes (N. Bayes) approach achieved the worst accuracy of 63.59\% with an RMSE of 0.405~occupants. None of these techniques, however, achieved an RMSE or accuracy worse than our 0-R benchmark, which achieved an RMSE of 0.442~occupants and an accuracy of 49.74\%.
 
-In our sole numeric choice of K*, we found that it achieved a better correlation than any replicated ThermoSense technique, with $r = 0.760$. Additionally, its RMSE of 0.423 was also superior.
+In our sole numeric choice of K*, we found that it achieved a better correlation than any replicated ThermoSense technique, with $r = 0.760$. Its RMSE of 0.423~occupants was also lower than that of every replicated numeric ThermoSense technique.
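For reference, the correlation figures in the table are Pearson's $r$ between predicted and actual occupant counts. A sketch of the standard sample definition follows, assuming $n$ test instances with actual counts $y_i$ and predictions $\hat{y}_i$; this notation is introduced here and may differ from that used in the appendix:

```latex
% y_i: actual occupant count, \hat{y}_i: predicted count (assumed notation)
% \bar{y} and \bar{\hat{y}} are the respective sample means
\[
r = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}
         {\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\,
          \sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}
\]
```

Because $r$ measures linear association rather than agreement, a model with a systematic offset can score a high $r$ alongside a high RMSE, which is one reason to read the two columns together.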
 
 \clearpage{}
 

diff --git a/litreview/litreview.pdf b/litreview/litreview.pdf
diff --git a/litreview/litreview.tex b/litreview/litreview.tex
@@ -7,7 +7,7 @@ \chapter{Literature Review}
 
 These quantitative requirements can be used to exclude sensing options that clearly cannot meet the requirements before the more specific qualitative accessibility criteria will be considered for the remaining sensors. 
 
-The quantitative criteria elements are;
+The quantitative criteria elements are:
 \begin{enumerate}
  \item \emph{Presence}: Is there any occupant present in the sensed area?
  \item \emph{Count}: How many occupants are there in the sensed area?
@@ -38,7 +38,7 @@ \subsection{Static Traits}
 \label{subsubsec:litreview:sensors:intrinsic:static}
 Static traits are physiologically derived, and are present in most occupants. One key static trait that can be used for occupant sensing is that of thermal emissions. All human occupants emit distinctive thermal radiation in both resting and active states. The heat signatures of these emissions could potentially be measured with some apparatus, counted, and used to provide Presence and Count information to a sensor system, without providing Identity information.
 
-Beltran, Erickson and Cerpa~\cite{beltran2013thermosense} propose ThermoSense, a system that uses a type of thermal sensor known as a \iar. This sensor is much like a camera in that it has a field of view which is divided into ``pixels''; in this case an $8\times8$ grid of detected temperatures. This sensor is mounted on an embedded device on the ceiling, along with a \pir for basic motion detection and uses machine learning algorithms to detect human heat signatures within the raw thermal and motion data it collects. ThermoSense measures accuracy with Root Mean Squared Error (RMSE), an average of the absolute value that their prediction deviated from the true result. They achieve an RMSE of 0.35 occupants, which they indicate is sufficient for accurate occupancy detection.
+Beltran, Erickson and Cerpa~\cite{beltran2013thermosense} propose ThermoSense, a system that uses a type of thermal sensor known as a \iar. This sensor is much like a camera in that it has a field of view which is divided into ``pixels''; in this case an $8\times8$ grid of detected temperatures. This sensor is mounted on an embedded device on the ceiling, along with a \pir for basic motion detection, and the system uses machine learning algorithms to detect human heat signatures within the raw thermal and motion data it collects. ThermoSense measures accuracy with Root-Mean-Square Error (RMSE), the square root of the mean squared deviation between their model's occupancy predictions and the ground truth data (see Appendix \ref{sec:rmse} for further explanation). They achieve an RMSE of 0.35 occupants, indicating that the predicted occupancy typically deviated from the actual value by roughly 0.35 occupants, which they report is sufficient for accurate occupancy detection.
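Since RMSE and mean absolute error (MAE) are easy to conflate, the standard definitions are sketched below, with $y_i$ the ground-truth occupant count, $\hat{y}_i$ the predicted count and $n$ the number of predictions. The notation and the toy numbers are assumptions made here for illustration, not data from ThermoSense or this thesis:

```latex
% Standard definitions (assumed notation)
\[
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2},
\qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert \hat{y}_i - y_i \rvert
\]
% Toy example with made-up errors of 0, 0, 0 and 1 occupant
\[
\mathrm{RMSE} = \sqrt{\tfrac{0 + 0 + 0 + 1}{4}} = 0.5,
\qquad
\mathrm{MAE} = \tfrac{0 + 0 + 0 + 1}{4} = 0.25
\]
```

RMSE is never smaller than MAE and weights large errors more heavily, so an RMSE of 0.35 occupants bounds the mean absolute deviation from above rather than equalling it.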
 
 Another static trait is \cdi emissions, which, like thermal emissions, are emitted by human occupants in both resting and active states. By measuring the build-up of \cdi within a given area, one can use a variety of mathematical models of human \cdi production to determine the likely number of occupants present. Hailemariam \etal~\cite{hailemariam2011real} trialled this as part of a sensor fusion within the context of an office environment, achieving a $\sim95\%$ accuracy. Such a sensing system could provide both the Presence and Count information, and exclude the Identity information as required. However, \cdi based detection methods have serious drawbacks: the \cdi feedback mechanism is slow, taking hours of continuous occupancy to correctly identify the presence of people, as discussed by Fisk, Faulkner and Sullivan~\cite{fisk2006accuracy}. In a residential environment, occupants are more likely to be moving between rooms than in an office, so the system may have a more difficult time detecting in that situation. Similarly, such systems can be interfered with by other elements that control the \cdi build-up in a space, such as air conditioners and open windows. This is also much more of a concern in a residential environment compared to the studied office space, as the average residence can have numerous such confounding factors that cannot easily be controlled for.
 
@@ -68,7 +68,7 @@ \subsection{Instrumented Traits}
 
 Balaji \etal~\cite{balaji2013sentinel} also leverage smartphones to determine occupancy, but in a more broad enterprise environment: Wireless device association logs are analysed to determine which access points in a building a given occupant is connected to. If this access point falls within the radio range of their designated ``personal space'', they are considered to be occupying that personal space. This technique cannot be applied to a residential environment, as there are usually not multiple wireless hotspots present.
 
-Finally, Gupta, Intille and Larson~\cite{gupta2009adding} use the GPS functions of the smartphone to perform optimisation on heating and cooling systems by calculating the ``travel-to-home'' time of occupants at all times and ensuring at every distance the house is minimally heated such that if the potential occupant were to travel home, the house would be at the correct temperature when they arrived. While this system does achieve similar potential air-conditioning energy savings, it is not room-level modular, and also presupposes an occupant whose primary energy costs are from incorrect heating when away from home, which isn't necessarily the case for the elderly or disabled demographics considered in this dissertation.
+Finally, Gupta, Intille and Larson~\cite{gupta2009adding} use the GPS functions of the smartphone to perform optimisation on heating and cooling systems by calculating the ``travel-to-home'' time of occupants at all times and ensuring at every distance the house is minimally heated such that if the potential occupant were to travel home, the house would be at the correct temperature when they arrived. While this system does achieve similar potential air-conditioning energy savings, it is not room-level modular, and also presupposes an occupant whose primary energy costs are from incorrect heating when away from home, which is not necessarily the case for the elderly or disabled demographics considered in this dissertation.
 
 \subsection{Correlative Traits}
 \label{subsubsec:litreview:sensors:extrinsic:correlative}

diff --git a/proposal/proposal.tex b/proposal/proposal.tex
@@ -147,7 +147,7 @@ \section{Software and Hardware Requirements}
 
 \end{description}
 
-\renewcommand{\bibname}{\section{Proposal References} \vskip -1.75cm}
+\renewcommand{\bibname}{\tocless\section{Proposal References} \vskip -1.75cm}
 \putbib
 
 \end{bibunit}

diff --git a/references/primary.bib b/references/primary.bib
@@ -266,6 +266,14 @@ @misc{ArduinoForum
   year = {2012}
 }
 
+@misc{WekaCorrelation,
+  author = {{Weka}},
+  title = {{C}orrelation{A}ttribute{E}val},
+  howpublished = {\url{http://weka.sourceforge.net/doc.dev/weka/attributeSelection/CorrelationAttributeEval.html}},
+  note = {Retrieved June 20, 2015},
+  year = {2014}
+}
+
 @inproceedings{erickson2011observe,
   title={{OBSERVE}: Occupancy-based system for efficient reduction of {HVAC} energy},
   author={Erickson, Varick L and Carreira-Perpi{\~n}{\'a}n, Miguel {\'A} and Cerpa, Alberto E},
@@ -400,3 +408,14 @@ @book{han2011data
   year={2011},
   publisher={Elsevier Science}
 }
+
+@article{willmott2005advantages,
+  title={Advantages of the mean absolute error ({MAE}) over the root mean square error ({RMSE}) in assessing average model performance},
+  author={Willmott, Cort J and Matsuura, Kenji},
+  journal={Climate Research},
+  volume={30},
+  number={1},
+  pages={79--82},
+  year={2005},
+  publisher={Inter-Research}
+}