Description:
In this project, I employed several supervised learning algorithms to model individuals' income using data collected from the 1994 U.S. Census. I then chose the best candidate algorithm from preliminary results and further optimized it to best model the data. My goal with this implementation was to construct a model that accurately predicts whether an individual makes more than $50,000.
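The compare-then-tune workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the notebook's code: the toy rows, the two threshold "model families", and their parameter grids are all hypothetical stand-ins for real census features and library estimators.

```python
# Toy stand-in for census rows: (years_of_education, hours_per_week, earns_over_50k).
# All values are invented for illustration only.
valid = [(9, 40, 0), (13, 45, 1), (16, 50, 1), (10, 20, 0), (12, 42, 0), (14, 60, 1)]
test = [(11, 35, 0), (15, 45, 1), (13, 40, 1), (8, 50, 0)]


def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for edu, hours, label in rows if predict(edu, hours) == label)
    return hits / len(rows)


def make_education_rule(threshold):
    """Hypothetical model family: predict >50K when education meets the threshold."""
    return lambda edu, hours: int(edu >= threshold)


def make_hours_rule(threshold):
    """Hypothetical model family: predict >50K when weekly hours meet the threshold."""
    return lambda edu, hours: int(hours >= threshold)


# Each family: (factory, default parameter, grid of parameters to search).
families = {
    "education_rule": (make_education_rule, 12, range(8, 17)),
    "hours_rule": (make_hours_rule, 40, range(20, 61, 5)),
}

# Step 1: preliminary comparison of the candidates with default settings.
prelim = {
    name: accuracy(factory(default), valid)
    for name, (factory, default, _grid) in families.items()
}
best_name = max(prelim, key=prelim.get)

# Step 2: grid-search the winning family's parameter on the validation rows.
factory, _default, grid = families[best_name]
best_threshold = max(grid, key=lambda t: accuracy(factory(t), valid))

# Step 3: report the tuned model on rows not used for selection or tuning.
tuned_accuracy = accuracy(factory(best_threshold), test)
```

Keeping a separate `test` split for the final score avoids reporting a number that was already used to pick the model and its parameter.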
Project Notebook: Finding Donors
Description:
Going forward, AI algorithms will be incorporated into more and more everyday applications. For example, you might want to include an image classifier in a smartphone app. To do this, you'd use a deep learning model trained on hundreds of thousands of images as part of the overall application architecture. A large part of software development in the future will be using these types of models as common parts of applications.
In this project, I trained an image classifier to recognize different species of flowers. You can imagine using something like this in a phone app that tells you the name of the flower your camera is looking at. In practice, one would train this classifier and then export it for use in an application. The classifier was trained on a dataset of 102 flower categories.
Project Notebook: Image Classifier
Description:
In this project, I applied unsupervised learning techniques to identify segments of the population that form the core customer base for a mail-order sales company in Germany. These segments can then be used to direct marketing campaigns towards audiences that will have the highest expected rate of return. The data was provided by Bertelsmann Arvato Analytics and represents a real-life data science task.
First, the general demographics data are clustered with a KMeans clustering algorithm. The same parameters are then applied to the customer dataset to investigate whether the customers follow the same distribution across clusters.
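The idea of fitting clusters on the general population and then reusing them on the customers can be sketched as below. This is a minimal sketch under assumptions: the 2-D points are invented, and a plain-Python Lloyd's iteration stands in for a library KMeans implementation such as the one the notebook presumably uses.

```python
import math
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's iterations starting from the given centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its members; keep it if the group is empty.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def cluster_shares(points, centroids):
    """Fraction of `points` assigned to each cluster."""
    counts = Counter(nearest(p, centroids) for p in points)
    return [counts[i] / len(points) for i in range(len(centroids))]


# Invented 2-D features, e.g. two scaled demographic attributes.
population = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
customers = [(5.1, 5.0), (4.9, 5.2), (5.0, 4.8), (1.1, 1.0)]

# Fit on the general population only, then reuse the same centroids for customers.
centroids = kmeans(population, centroids=[population[0], population[3]])
population_shares = cluster_shares(population, centroids)
customer_shares = cluster_shares(customers, centroids)

# A ratio above 1 means the cluster is over-represented among customers.
representation = [c / p for c, p in zip(customer_shares, population_shares)]
```

Clusters with a ratio well above 1 are the natural candidates for a targeted campaign, while ratios below 1 mark segments the company rarely reaches.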
Project Notebook: Customer Segmentations
Description:
In this project, I analyzed the 2011-2018 Stack Overflow developer survey data to write a blog post presenting a comprehensive study of data science careers. The project notebook can be found below.
Project Notebook: Understanding the Career of Data Scientists
Blog Post: Understanding the Career of Data Scientist Using the Data Science Way
Description:
In this project, I built a data transformation and machine learning pipeline that classifies messages into categories. The pipeline is ultimately built into a Flask application. The project includes a web app where an emergency worker can input a new message and get classification results in several categories. The web app also displays visualizations of the data. The project notebook can be found below.
Project Notebook: Machine Learning Pipeline
Description:
In this project, I developed a recommendation engine using IBM community articles and user interactions. The project serves as a prototype for an IBM article recommender system. The project will be hosted on
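One common way to build recommendations from user-article interactions is user-user collaborative filtering, sketched below. This is an assumption-laden illustration rather than the notebook's method: the interaction pairs, the `recommend` helper, and the overlap-count similarity are all hypothetical.

```python
from collections import Counter

# Hypothetical (user_id, article_id) interaction pairs.
interactions = [
    ("u1", "a1"), ("u1", "a2"),
    ("u2", "a1"), ("u2", "a2"), ("u2", "a3"),
    ("u3", "a2"), ("u3", "a4"),
    ("u4", "a5"),
]

# Build each user's set of viewed articles.
seen = {}
for user, article in interactions:
    seen.setdefault(user, set()).add(article)


def recommend(user, top_n=2):
    """Rank unseen articles by the reading overlap between `user` and their readers."""
    scores = Counter()
    for other, articles in seen.items():
        if other == user:
            continue
        overlap = len(seen[user] & articles)  # similarity = number of shared articles
        for article in articles - seen[user]:
            scores[article] += overlap
    # Drop articles backed only by users with no overlap, then sort by
    # score (descending) and article id so ties resolve deterministically.
    ranked = sorted(
        (item for item in scores.items() if item[1] > 0),
        key=lambda item: (-item[1], item[0]),
    )
    return [article for article, _score in ranked[:top_n]]
```

For example, `recommend("u1")` ranks `a3` ahead of `a4` because `u2` shares two articles with `u1` while `u3` shares one. A user with no overlap, such as `u4`, gets an empty list here and would need a fallback such as most-popular articles.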
Project Notebook: Recommendations with IBM
Description:
In the capstone project, I built a distributed machine learning model on user log data using Spark, a big data processing toolkit. The primary objective was to predict each user's likelihood of churning. Most of the data visualizations were completed on a small subset of the data, while the full-dataset analytics were performed on AWS EMR.
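Before a churn model can be trained, event-level log rows usually have to be collapsed into one record per user. The sketch below shows that step in plain Python so that it runs without a cluster. The field names, the page values, and the choice of a "Cancel" event as the churn marker are assumptions, not details taken from the notebooks. In Spark, the same aggregation would typically be expressed as a group-by over the user id.

```python
from collections import defaultdict

# Hypothetical log rows: (user_id, page). Field names and page values are assumptions.
events = [
    ("u1", "Play"), ("u1", "Play"), ("u1", "Cancel"),
    ("u2", "Play"), ("u2", "Help"), ("u2", "Play"), ("u2", "Play"),
]


def summarize(events, churn_page="Cancel"):
    """Collapse event rows into one record per user: activity counts plus a churn flag."""
    users = defaultdict(lambda: {"events": 0, "plays": 0, "churned": 0})
    for user_id, page in events:
        row = users[user_id]
        row["events"] += 1
        if page == "Play":
            row["plays"] += 1
        if page == churn_page:
            row["churned"] = 1  # label used as the prediction target
    return dict(users)


per_user = summarize(events)
```

Each resulting record pairs per-user features (`events`, `plays`) with the `churned` label, which is the shape a classifier expects as input.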
Project Notebook: Spark - Subset Analytics
Project Notebook: Spark - Full Dataset Analytics
Blog Post: Understanding Customer Churning with Big Data Analytics
Disclaimer: Remember to provide proper citation if you want to use any part of this code.