THE VARIABLE SELECTION JAR
This jar file runs the Preferential Diversity variable selection method.
Prior knowledge on relationships between variables can optionally be included in the analysis.
Graphical model structure learning algorithms can then be run on the selected variables.
NOTE: THE INPUT FILE MUST BE A TAB-DELIMITED .TXT FILE
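Because the jar expects tab-delimited input, it can help to check a file before running it. The following is a minimal sketch, not part of PrefDiv.jar: check_tab_delimited is a hypothetical helper name, and the check assumes the first line sets the expected column count and that the file has at least two columns.

```shell
# Hypothetical helper (not part of PrefDiv.jar): succeeds only if every line
# of the file has the same number of tab-separated fields, and more than one.
check_tab_delimited() {
  awk -F'\t' '
    NR == 1 { expected = NF }        # field count of the first line
    NF != expected {                 # report any line that disagrees
      printf "line %d: %d fields, expected %d\n", NR, NF, expected
      bad = 1
    }
    END {
      # An empty file, or a first line with no tabs, also fails the check.
      if (NR == 0 || expected < 2) bad = 1
      exit bad + 0
    }
  ' "$1"
}
```

For example, `check_tab_delimited data.txt && java -jar PrefDiv.jar -data data.txt -numSelect 50 -t Target` runs the selection only if the check passes.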
Usage: java -jar PrefDiv.jar
See the Excel sheet for a detailed description of the parameters.
Examples:
1. Select the top-50 variables without using prior knowledge
java -jar PrefDiv.jar -data data.txt -numSelect 50 -t Target
2. Select the top-50 variables using piPref-Div, with prior knowledge read from Prior_Directory
java -jar PrefDiv.jar -data data.txt -numSelect 50 -t Target -priors Prior_Directory
3. Use internal cross-validation to choose the number of variables to select, based on prediction accuracy
java -jar PrefDiv.jar -data data.txt -cv 5 1,5,10,25,50,100 -t Target -priors Prior_Directory
4. Select the top-50 variables using piPref-Div but keep the demographic variables (Gender, Age, Race)
java -jar PrefDiv.jar -data data.txt -numSelect 50 -t Target -priors Prior_Directory -keep Gender Age Race
5. Select the top-50 variables using piPref-Div and run StEPS to get a causal graph
java -jar PrefDiv.jar -data data.txt -numSelect 50 -t Target -priors Prior_Directory -keep Gender Age Race -useCausalGraph
6. Select the top-50 variables using piPref-Div and run piMGM to get a causal graph
java -jar PrefDiv.jar -data data.txt -numSelect 50 -t Target -priors Prior_Directory -keep Gender Age Race -useCausalGraph piMGM
7. Select the top-50 variables using piPref-Div and run StEPS to get a causal graph of the PCA-summarized clusters
java -jar PrefDiv.jar -data data.txt -numSelect 50 -t Target -priors Prior_Directory -keep Gender Age Race -useCausalGraph -ctype pca