CICF Week 11

This week we will look at some data wrangling on a tabular dataset. We will then fit a decision tree and a random forest model to the data.

Tutorial

This class focuses on the tools and concepts you might encounter in cyberinfrastructure, so we will not cover the mathematics behind the machine learning algorithms in much depth. If you find the techniques interesting, I encourage you to look at the materials listed in the Resources section at the end of this file. This tutorial and the next are based on the Practical Deep Learning for Coders lessons by Jeremy Howard.

Random Forests

This section is modeled after the excellent tutorial by Jeremy Howard titled "How Random Forests Really Work". I recommend reading it for more detail on how decision trees and random forests work.
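As a rough intuition before opening the notebook: a decision tree repeatedly picks the column and threshold that best separate the target values, and a random forest averages many such trees, each fit on a random subset of the data. The sketch below shows only the first idea, scoring a single split with Gini impurity. The function names and the toy numbers are made up for illustration; they are not taken from the notebook or from titanic.csv.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def split_score(values, labels, threshold):
    """Size-weighted Gini impurity of the two groups made by `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def best_split(values, labels):
    """Return the threshold with the lowest split score, or None if no split exists."""
    # The largest value is skipped because it would leave the right group empty.
    candidates = sorted(set(values))[:-1]
    if not candidates:
        return None
    return min(candidates, key=lambda t: split_score(values, labels, t))


# Hypothetical toy data, for illustration only.
ages = [4, 8, 30, 35, 40, 60]
survived = [1, 1, 0, 0, 1, 0]

threshold = best_split(ages, survived)
print(threshold, split_score(ages, survived, threshold))
```

A full decision tree applies the same search recursively to each resulting group, across all columns rather than just one.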

Open your VM and run git pull in the cicf folder. Then install the packages needed for this week's notebook:

sudo apt install graphviz
pip install --upgrade jupyter-core nbconvert seaborn fastai

We are going to work with the Titanic dataset, a passenger manifest from the Titanic. Let's first look at the data.
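A first look at a tabular file usually means counting rows and checking which columns have missing values. The sketch below does this with Python's standard csv module so it runs without extra packages; the notebook may well use other tools. The helper name `summarize_csv` and the sample rows are hypothetical and are not taken from titanic.csv.

```python
import csv
import io


def summarize_csv(fileobj):
    """Return (row_count, missing_per_column) for a CSV file with a header row."""
    reader = csv.DictReader(fileobj)
    missing = {name: 0 for name in reader.fieldnames}
    rows = 0
    for row in reader:
        rows += 1
        for name, value in row.items():
            if name is None:
                # Extra cells beyond the header are ignored in this sketch.
                continue
            if value is None or value.strip() == "":
                missing[name] += 1
    return rows, missing


# Made-up rows for illustration only; the real column names may differ.
sample = io.StringIO(
    "Survived,Pclass,Sex,Age\n"
    "0,3,male,22\n"
    "1,1,female,\n"
    "1,3,female,26\n"
)

rows, missing = summarize_csv(sample)
print(rows, missing)
```

To run it against the real file, open it with `open("titanic.csv", newline="")` and pass the file object to `summarize_csv`.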

The rest of this section is in the notebook random-forest.ipynb.

Resources

Sources for the tutorial notebook:
