This repo contains code for the paper Stochastic Optimization Forests.

Code structure

The tree and forest classes are in tree.py. The splitting criterion implementations for the newsvendor problem, CVaR optimization, mean-variance optimization, and shortest path optimization are in newsvendor/nv_tree_utilities.py, cvar/cvar_tree_utilities.py, mean_var/meanvar_tree_utilities.py, and uber/cvar_tree_utilities.py, respectively. The experiment scripts are the experiment_*.py files in each directory; run one with 'python experiment_name.py'.
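
The experiment scripts can also be listed and launched from Python. The following is a minimal sketch, not part of the repo: the helper names `list_experiments` and `run_experiment` are hypothetical, and running each script from its own directory is an assumption about how the scripts resolve their paths.

```python
import subprocess
import sys
from pathlib import Path

# Directory names taken from the paths listed in this readme.
EXPERIMENT_DIRS = ("cvar", "mean_var", "newsvendor", "uber")


def list_experiments(repo_root, dirs=EXPERIMENT_DIRS):
    """Return all experiment_*.py scripts found under the given directories."""
    root = Path(repo_root)
    scripts = []
    for name in dirs:
        directory = root / name
        if not directory.is_dir():
            continue  # skip directories that are absent from this checkout
        scripts.extend(sorted(directory.glob("experiment_*.py")))
    return scripts


def run_experiment(script):
    """Run one script with the current interpreter.

    Assumption: the script is meant to be started from its own directory,
    so the working directory is set to the script's parent.
    """
    script = Path(script).resolve()
    return subprocess.run(
        [sys.executable, script.name], cwd=script.parent, check=True
    )
```

Calling `list_experiments(".")` from the repo root returns the script paths; passing one of them to `run_experiment` has the same effect as changing into that directory and calling 'python experiment_name.py'.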

Part of the code for tree and forest classes builds on the EconML package:

EconML: A Python Package for ML-Based Heterogeneous Treatment Effects Estimation. https://github.com/microsoft/EconML, 2019. Version 0.x.

Generating the figures and tables

Generating a figure takes three steps: first run the corresponding experiment script in its directory, which stores the experimental results in .pkl files; then use prepare_plot_data.ipynb to transform the .pkl files into .csv files; finally use the .Rmd file in that directory to generate the plots.
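
The middle step of this pipeline, turning a .pkl file into a .csv file, can be sketched as follows. This is only an illustration and not the contents of prepare_plot_data.ipynb: the layout of the pickled results is not described here, so the sketch assumes a dict that maps a method name to a list of risk values, and the function name `pkl_to_csv` and the column names are hypothetical.

```python
import csv
import pickle


def pkl_to_csv(pkl_path, csv_path):
    """Flatten a pickled {method: [risk, ...]} dict into long-format CSV rows.

    Assumption: the pickle holds a dict of lists. The real experiment
    outputs may be structured differently.
    """
    with open(pkl_path, "rb") as f:
        results = pickle.load(f)

    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["method", "run", "risk"])
        for method, risks in results.items():
            for run, risk in enumerate(risks):
                writer.writerow([method, run, risk])
```

A long format with one row per method and run is chosen here because it is the shape that tidyverse plotting code usually consumes most easily.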

CVaR Portfolio optimization

Figure 2

Figure 2(a): cvar/experiment_cvar_lognormal.py --> cvar/risk_cvar_lognormal.pkl --> cvar/risk_lognormal.csv --> cvar/Plotting_cvar.Rmd
Figure 2(b): cvar/experiment_cvar_lognormal.py --> cvar/feature_imp_cvar_lognormal.pkl --> cvar/feature_imp_cvar_lognormal.csv --> cvar/Plotting_cvar.Rmd

Figure 7 - 9

Figure 7: cvar/experiment_cvar_lognormal.py --> cvar/feature_split_cvar_lognormal.pkl --> cvar/feature_split_cvar_lognormal.csv --> cvar/Plotting_cvar.Rmd
Figure 8: cvar/experiment_cvar_lognormal_oracle.py --> cvar/risk_cvar_lognormal_oracle.pkl --> cvar/risk_lognormal_oracle.csv --> cvar/Plotting_cvar.Rmd
Figure 9: cvar/experiment_cvar_lognormal_objcoef.py --> cvar/risk_cvar_lognormal_objcoef.pkl --> cvar/risk_lognormal_objcoef.csv --> cvar/Plotting_cvar.Rmd

Figure 10

Figure 10(a): cvar/experiment_cvar_normal.py --> cvar/risk_cvar_normal.pkl --> cvar/risk_normal.csv --> cvar/Plotting_cvar.Rmd
Figure 10(b): cvar/experiment_cvar_normal_oracle.py --> cvar/risk_cvar_normal_oracle.pkl --> cvar/risk_normal_oracle.csv --> cvar/Plotting_cvar.Rmd

Uber experiment

All raw data files are in uber/data.
See uber/data_downloading.R and uber/preprocessing.R for data collection and preprocessing.

Figure 3

uber/experiment_downtown_years.py --> uber/downtown_risks_forest_years_halfyear.pkl, uber/downtown_risks_forest_years_oneyear.pkl, uber/downtown_risks_forest_years_onehalfyear.pkl, uber/downtown_risks_forest_years_twoyear.pkl --> uber/downtown_risks_forest_years_halfyear.csv, uber/downtown_risks_forest_years_oneyear.csv, uber/downtown_risks_forest_years_onehalfyear.csv, uber/downtown_risks_forest_years_twoyear.csv --> Plotting_uber.Rmd

Newsvendor

Figure 5

Fig 5(a): newsvendor/experiment_nv_n.py --> newsvendor/risk_n.pkl --> newsvendor/risk_nv_n.csv --> newsvendor/Plotting_newsvendor.Rmd
Fig 5(b): newsvendor/experiment_nv_n.py --> newsvendor/feature_split_n.pkl --> newsvendor/feature_split_nv_n.csv --> newsvendor/Plotting_newsvendor.Rmd
Fig 5(c): newsvendor/experiment_nv_n.py --> newsvendor/feature_importance_n.pkl --> newsvendor/feature_importance_n.csv --> newsvendor/Plotting_newsvendor.Rmd

Figure 6

Fig 6(a): newsvendor/experiment_nv_p.py --> newsvendor/risk_p.pkl --> newsvendor/risk_nv_p.csv --> newsvendor/Plotting_newsvendor.Rmd
Fig 6(b): newsvendor/experiment_nv_highdim.py --> newsvendor/risk_highdim.pkl --> newsvendor/risk_highdim.csv --> newsvendor/Plotting_newsvendor.Rmd

Mean-variance optimization

Figure 4

Fig 4(a): mean_var/experiment_meanvar_stoch.py --> mean_var/rel_risk_meanvar_normal_stoch.pkl --> mean_var/rel_risk_full.csv --> mean_var/Plotting_meanvar.Rmd
Fig 4(b): mean_var/experiment_meanvar_stoch.py --> mean_var/feature_split_meanvar_normal_stoch.pkl --> mean_var/feature_freq_full.csv --> mean_var/Plotting_meanvar.Rmd
Fig 4(c): mean_var/experiment_meanvar_stoch.py --> mean_var/cond_violation_meanvar_normal_stoch.pkl --> mean_var/cond_violation_full.csv --> mean_var/Plotting_meanvar.Rmd
Fig 4(d): mean_var/experiment_meanvar_stoch.py --> mean_var/mean_violation_meanvar_normal_stoch.pkl --> mean_var/marginal_violation_full.csv --> mean_var/Plotting_meanvar.Rmd

Figure 12

Fig 12(a): mean_var/experiment_var_normal_oracle.py --> mean_var/risk_var_normal_oracle.pkl --> mean_var/risk_var_normal_oracle.csv --> mean_var/Plotting_var.Rmd
Fig 12(b): mean_var/experiment_var_normal_oracle.py --> mean_var/feature_split_var_normal_oracle.pkl --> mean_var/feature_split_var_normal_oracle.csv --> mean_var/Plotting_var.Rmd
Fig 12(c): mean_var/experiment_var_normal.py --> mean_var/risk_var_normal.pkl --> mean_var/risk_var_normal.csv --> mean_var/Plotting_var.Rmd

Figure 13

Fig 13(a): mean_var/experiment_meanvar_stoch_oracle.py --> mean_var/rel_risk_meanvar_normal_stoch_oracle.pkl --> mean_var/rel_risk_full_oracle.csv --> mean_var/Plotting_meanvar.Rmd
Fig 13(b): mean_var/experiment_meanvar_stoch.py --> mean_var/rel_risk_meanvar_normal_stoch_oracle.pkl --> mean_var/feature_freq_full_oracle.csv --> mean_var/Plotting_meanvar.Rmd

Figure 14

Fig 14: mean_var/experiment_meanvar_stoch_R.py --> mean_var/rel_risk_meanvar_normal_stoch_R.pkl --> mean_var/rel_risk_full_R.csv --> mean_var/Plotting_meanvar.Rmd

Honest forests

Fig 15(a): cvar/experiment_cvar_lognormal_honesty.py --> cvar/risk_cvar_lognormal_honesty.pkl --> cvar/risk_lognormal_honesty.csv --> cvar/Plotting_cvar.Rmd
Fig 15(b): newsvendor/experiment_nv_honesty.py --> newsvendor/risk_nv_honesty.pkl --> newsvendor/risk_nv_honesty.csv --> newsvendor/Plotting_newsvendor.Rmd

Running time

Table 1: cvar/speed_cvar.ipynb --> time_cvar.pkl
Table 2: mean_var/speed_meanvar.ipynb --> time_meanvar.pkl

Dependencies

python 3.6.10

gurobipy 9.0.2
joblib 0.16.0
numpy 1.19.1
scikit-learn 0.23.2
scipy 1.3.1

R 3.6.1

latex2exp 0.4.0
tidyverse 1.3.0
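
Before running the experiments, the Python pins listed above can be compared against the local environment. The sketch below is an illustration only: the helper names are hypothetical, it relies on each module exposing a `__version__` attribute, and gurobipy is left out because this readme does not say how its version is exposed.

```python
import importlib

# Versions pinned in the dependency list above. Keys are import names
# (scikit-learn is imported as "sklearn").
PINNED = {
    "joblib": "0.16.0",
    "numpy": "1.19.1",
    "sklearn": "0.23.2",
    "scipy": "1.3.1",
}


def installed_version(module_name):
    """Return the module's __version__, or None if it is missing or has none."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_mismatches(pinned=PINNED):
    """Map each module whose installed version differs from its pin
    to a (pinned, installed) pair."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = installed_version(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

An empty dict from `version_mismatches()` means every listed module matches its pin; a `None` in the second position means the module could not be imported.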
