Application of ML methods to a medical dataset to determine the most significant risk factors and predict outcomes.
Associations between various health metrics and coronary heart disease were assessed using machine learning approaches. Prediction models were trained and tested using naïve Bayes (Gaussian and complement) and linear regression, along with an ensemble approach intended to increase predictive power.
Note that no findings from the ensemble approach are included in the paper at this time. Additionally, several of the techniques used when the paper was written should be improved upon, for example by using logistic regression instead of linear regression and by using built-in functions for k-fold cross-validation.
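
As a rough illustration of the two suggested improvements, the sketch below uses scikit-learn's built-in cross-validation helpers with logistic regression and Gaussian naïve Bayes. It is not the code used for the paper: the data is a synthetic stand-in generated with `make_classification`, the outcome is assumed to be a binary label, and the fold count, feature count, and model settings are arbitrary choices made for the example. Complement naïve Bayes is left out because it expects non-negative features, which the synthetic data does not guarantee.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the medical dataset (assumption: binary outcome,
# 8 numeric health metrics). Replace with the real features and labels.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)

# Built-in stratified k-fold splitter: keeps the class ratio similar in each fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    # Scaling lives inside the pipeline so it is refit on each training fold only.
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "gaussian_nb": GaussianNB(),
}

# cross_val_score fits and evaluates each model once per fold.
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

for name, fold_scores in scores.items():
    print(f"{name}: mean accuracy {fold_scores.mean():.3f} "
          f"(std {fold_scores.std():.3f})")
```

Wrapping the scaler and classifier in one pipeline avoids leaking information from the held-out fold into preprocessing, which is easy to get wrong in a hand-written k-fold loop.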