Data Science Portfolio Projects

Project 1 - Predicting Donors' Income using supervised learning

Description:

In this project, I employed several supervised learning algorithms to model individuals' income using data collected from the 1994 U.S. Census. I then chose the best candidate algorithm from the preliminary results and further optimized it to best model the data. My goal with this implementation was to construct a model that accurately predicts whether an individual makes more than $50,000.

Project Notebook: Finding Donors

Project 2 - Flower Image Classifier Application

Description:

Going forward, AI algorithms will be incorporated into more and more everyday applications. For example, you might want to include an image classifier in a smart phone app. To do this, you'd use a deep learning model trained on hundreds of thousands of images as part of the overall application architecture. A large part of software development in the future will be using these types of models as common parts of applications.

In this project, I trained an image classifier to recognize different species of flowers. You can imagine using something like this in a phone app that tells you the name of the flower your camera is looking at. In practice, one would train this classifier, then export it for use in an application. The classifier was trained on a dataset of 102 flower categories.

Project Notebook: Image Classifier

Project 3 - Identifying Customer Segments

Description:

In this project, I applied unsupervised learning techniques to identify segments of the population that form the core customer base for a mail-order sales company in Germany. These segments can then be used to direct marketing campaigns towards the audiences with the highest expected rate of return. The data was provided by Bertelsmann Arvato Analytics and represents a real-life data science task.

First, the general demographics data are clustered with a KMeans clustering algorithm. The same fitted parameters are then applied to the customer dataset to investigate whether the customers follow the same distribution across clusters.
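The comparison step described above can be sketched as follows. This is a minimal, dependency-free illustration and not the notebook's actual code: the centroids stand in for a model already fitted on the general population, and the data points, feature count, and function names are all made up for the example.

```python
from collections import Counter
from math import dist


def assign_cluster(point, centroids):
    """Return the index of the centroid nearest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def cluster_shares(points, centroids):
    """Fraction of `points` assigned to each cluster, indexed by cluster id."""
    counts = Counter(assign_cluster(p, centroids) for p in points)
    return [counts[i] / len(points) for i in range(len(centroids))]


# Hypothetical centroids, standing in for a model fitted on the general population.
centroids = [(0.0, 0.0), (5.0, 5.0)]

# Toy, made-up data: two features per person.
general_population = [(0.1, 0.2), (0.3, -0.1), (4.9, 5.1), (5.2, 4.8)]
customers = [(5.1, 5.0), (4.8, 5.3), (5.0, 4.7), (0.2, 0.1)]

general = cluster_shares(general_population, centroids)
customer = cluster_shares(customers, centroids)

# A ratio above 1 marks a cluster that is over-represented among customers.
ratios = [c / g for c, g in zip(customer, general)]
```

Clusters with a ratio well above 1 are the natural candidates for a marketing campaign. With real data, a cluster that is empty in the general population would need a guard against division by zero.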

Project Notebook: Customer Segmentations

Project 4 - Data Science Blog

Description:

In this project, I analyzed the 2011-2018 Stack Overflow developer survey data to write a blog post presenting a comprehensive study of data science careers. The project notebook can be found below.

Project Notebook: Understanding the Career of Data Scientists

Blog Post: Understanding the Career of Data Scientist Using the Data Science Way

Project 5 - Disaster Response Pipeline

Description:

In this project, I built a data transformation and machine learning pipeline capable of classifying messages. The pipeline is built into a Flask application. The project includes a web app where an emergency worker can input a new message and get classification results in several categories. The web app also displays visualizations of the data. The project notebooks can be found below.

Project Notebook: ETL

Project Notebook: Machine Learning Pipeline

Disaster Response App

Project 6 - Recommendation System

Description:

In this project, I developed a recommendation engine using IBM community articles and user interaction data. This project serves as a prototype for IBM's article recommendation system. The project will be hosted on

Project Notebook: Recommendations with IBM
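One common way to build recommendations from user-article interactions is user-user collaborative filtering: find the users whose reading history overlaps most with the target user, then suggest what they read that the target has not. The sketch below illustrates that idea only; it is not necessarily the method used in the notebook, and the user ids, article ids, and the `recommend` function are hypothetical.

```python
def recommend(user, interactions, top_n=3):
    """Recommend unseen articles from the users whose reading history overlaps most."""
    seen = interactions[user]
    # Rank the other users by how many articles they share with `user`.
    neighbors = sorted(
        (u for u in interactions if u != user),
        key=lambda u: len(interactions[u] & seen),
        reverse=True,
    )
    recommendations = []
    for neighbor in neighbors:
        if not interactions[neighbor] & seen:
            break  # the remaining users share nothing with `user`
        # Sort so the order of suggestions is deterministic.
        for article in sorted(interactions[neighbor] - seen):
            if article not in recommendations:
                recommendations.append(article)
            if len(recommendations) == top_n:
                return recommendations
    return recommendations


# Made-up interaction data: user id -> set of article ids the user has read.
interactions = {
    "u1": {"a1", "a2", "a3"},
    "u2": {"a1", "a2", "a4"},
    "u3": {"a3", "a5"},
    "u4": {"a6"},
}
recs = recommend("u1", interactions)
```

A user with no overlapping neighbors gets an empty list here. A fuller system would typically fall back to the most popular articles in that case.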

Capstone Project - Spark Distributed Analytics

Description:

In the capstone project, I built a distributed machine learning model on user log data using Spark, a big data processing toolkit. The primary objective was to predict the probability of churn for every user. Most of the data visualizations were completed on a small subset of the data, while the full-dataset analytics were performed using AWS EMR.
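Churn prediction from an event log generally starts by collapsing the events into one row per user, with features and a churn label. The sketch below shows that aggregation in plain Python rather than Spark so that it runs anywhere. The field names (`user_id`, `page`), the `Cancel` page used as the churn signal, and the `build_user_table` helper are assumptions for illustration and are not taken from the project.

```python
from collections import defaultdict

# Hypothetical log rows; the field names and the "Cancel" page are assumptions.
log = [
    {"user_id": 1, "page": "Play"},
    {"user_id": 1, "page": "Play"},
    {"user_id": 1, "page": "Cancel"},
    {"user_id": 2, "page": "Play"},
    {"user_id": 2, "page": "Help"},
]


def build_user_table(rows, churn_page="Cancel"):
    """Collapse event rows into one record per user: event count and churn label."""
    table = defaultdict(lambda: {"events": 0, "churned": 0})
    for row in rows:
        record = table[row["user_id"]]
        record["events"] += 1
        if row["page"] == churn_page:
            record["churned"] = 1
    return dict(table)


users = build_user_table(log)
```

In Spark the same step would be a group-by on the user id with aggregate expressions, and the resulting per-user table is what a classifier would be trained on.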

Project Notebook: Spark - Subset Analytics

Project Notebook: Spark - Full Dataset Analytics

Blog Post: Understanding Customer Churning with Big Data Analytics

Disclaimer: Remember to provide proper citation if you want to use any part of this code.

