
Commit

Internal change
PiperOrigin-RevId: 424659743
achoum authored and copybara-github committed Jan 27, 2022
1 parent 945af7a commit d3fc8f8
Showing 3 changed files with 67 additions and 2 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -1,6 +1,6 @@
# Changelog

## 0.2.3 - ????
## 0.2.3 - 2021-01-27

### Features

5 changes: 5 additions & 0 deletions documentation/installation.md
@@ -15,6 +15,11 @@ interfaces.
* [Linux / MacOS](#linux--macos)
* [Windows](#windows)
* [Running a minimal example](#running-a-minimal-example)
* [Compilation on and for Raspberry Pi](#compilation-on-and-for-raspberry-pi)
* [Install requirements](#install-requirements)
* [Compile Bazel](#compile-bazel)
* [Compile YDF](#compile-ydf)
* [Test YDF](#test-ydf)
* [Using the C++ library](#using-the-c-library)
* [Troubleshooting](#troubleshooting)

62 changes: 61 additions & 1 deletion documentation/learners.md
@@ -105,6 +105,23 @@ the gradient of the loss relative to the model output).
- Rolling number of trees used to detect validation loss increase and trigger
early stopping.

#### [focal_loss_alpha](../yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.proto?q=symbol:focal_loss_alpha)

- **Type:** Real **Default:** 0.5 **Possible values:** min:0 max:1

- EXPERIMENTAL. Weighting parameter for the focal loss: positive samples are
weighted by alpha and negative samples by (1-alpha). The default value of 0.5
means no class-level weighting. Only used with the focal loss, i.e.
`loss="BINARY_FOCAL_LOSS"`.

#### [focal_loss_gamma](../yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.proto?q=symbol:focal_loss_gamma)

- **Type:** Real **Default:** 2 **Possible values:** min:0

- EXPERIMENTAL. Exponent of the misprediction term in the focal loss;
corresponds to the gamma parameter in https://arxiv.org/pdf/1708.02002.pdf.
Only used with the focal loss, i.e. `loss="BINARY_FOCAL_LOSS"`. A
configuration sketch covering both focal-loss parameters follows.
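
The sketch below illustrates how the two focal-loss parameters might be set
together through the TensorFlow Decision Forests Keras wrapper. The keyword
names (`loss`, `focal_loss_alpha`, `focal_loss_gamma`) are assumed to mirror
the YDF hyperparameter names documented above; they are not taken from this
diff.

```python
# Sketch (assumed parameter names): train a GBT with the focal loss on an
# imbalanced binary classification task.
import pandas as pd
import tensorflow_decision_forests as tfdf

train_df = pd.read_csv("train.csv")  # hypothetical dataset with a binary "label" column
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label="label")

model = tfdf.keras.GradientBoostedTreesModel(
    loss="BINARY_FOCAL_LOSS",  # the focal loss must be requested explicitly
    focal_loss_alpha=0.25,     # positives weighted by 0.25, negatives by 0.75 (value used in the paper)
    focal_loss_gamma=2.0,      # misprediction exponent; 2 is the paper's choice and the YDF default
)
model.fit(train_ds)
```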

#### [forest_extraction](../yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.proto?q=symbol:forest_extraction)

- **Type:** Categorical **Default:** MART **Possible values:** MART, DART
@@ -133,6 +150,17 @@ the gradient of the loss relative to the model output).

- How to grow the tree.<br>- `LOCAL`: Each node is split independently of the other nodes. In other words, as long as a node satisfies the split constraints (e.g. maximum depth, minimum number of observations), the node will be split. This is the "classical" way to grow decision trees.<br>- `BEST_FIRST_GLOBAL`: The node with the best loss reduction among all the nodes of the tree is selected for splitting. This method is also called "best first" or "leaf-wise growth". See "Best-first decision tree learning", Shi and "Additive logistic regression: A statistical view of boosting", Friedman for more details.
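
As an illustration, leaf-wise growth could be requested as in the sketch
below. The `growing_strategy` and `max_num_nodes` keyword names are
assumptions mirroring the YDF hyperparameter names, as exposed by the
TensorFlow Decision Forests wrapper.

```python
# Sketch (assumed parameter names): grow each tree leaf-wise ("best-first")
# instead of the classical depth-wise ("LOCAL") strategy.
import tensorflow_decision_forests as tfdf

model = tfdf.keras.GradientBoostedTreesModel(
    growing_strategy="BEST_FIRST_GLOBAL",
    max_num_nodes=64,  # with best-first growth, tree size is usually capped by node count rather than depth
)
```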

#### [honest](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:honest)

- **Type:** Categorical **Default:** false **Possible values:** true, false

- In honest trees, different training examples are used to infer the structure
and the leaf values. This regularization technique trades examples for bias
estimates. It might increase or reduce the quality of the model. See
"Generalized Random Forests", Athey et al. In that paper, honest trees are
trained with the Random Forest algorithm with sampling without replacement.
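
A minimal sketch of enabling this option, assuming the `honest` keyword of the
TensorFlow Decision Forests wrapper matches the hyperparameter name above:

```python
# Sketch (assumed parameter name): infer the tree structure and the leaf
# values from disjoint subsets of the training examples.
import tensorflow_decision_forests as tfdf

model = tfdf.keras.GradientBoostedTreesModel(honest=True)
```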

#### [in_split_min_examples_check](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:in_split_min_examples_check)

- **Type:** Categorical **Default:** true **Possible values:** true, false
@@ -185,7 +213,7 @@ the gradient of the loss relative to the model output).

- **Type:** Categorical **Default:** DEFAULT **Possible values:** DEFAULT,
BINOMIAL_LOG_LIKELIHOOD, SQUARED_ERROR, MULTINOMIAL_LOG_LIKELIHOOD,
LAMBDA_MART_NDCG5, XE_NDCG_MART
LAMBDA_MART_NDCG5, XE_NDCG_MART, BINARY_FOCAL_LOSS

- The loss optimized by the model. If not specified (DEFAULT), the loss is selected automatically according to the "task" and label statistics. For example, if task=CLASSIFICATION and the label has two possible values, the loss will be set to BINOMIAL_LOG_LIKELIHOOD. Possible values are:<br>- `DEFAULT`: Select the loss automatically according to the task and label statistics.<br>- `BINOMIAL_LOG_LIKELIHOOD`: Binomial log likelihood. Only valid for binary classification.<br>- `SQUARED_ERROR`: Least square loss. Only valid for regression.<br>- `MULTINOMIAL_LOG_LIKELIHOOD`: Multinomial log likelihood, i.e. cross-entropy. Only valid for binary or multi-class classification.<br>- `LAMBDA_MART_NDCG5`: LambdaMART with NDCG5.<br>- `XE_NDCG_MART`: Cross Entropy Loss NDCG. See arxiv.org/abs/1911.09798.<br>- `BINARY_FOCAL_LOSS`: Focal loss (see https://arxiv.org/pdf/1708.02002.pdf and the focal_loss_alpha / focal_loss_gamma parameters above). Only valid for binary classification.<br>
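
The sketch below contrasts the automatic loss selection with an explicit
choice. The `tfdf.keras` wrapper and its keyword names are assumptions, not
part of this diff.

```python
# Sketch: with the default setting the learner picks the loss from the task
# and label statistics (e.g. BINOMIAL_LOG_LIKELIHOOD for binary
# classification); an explicit value overrides that selection.
import tensorflow_decision_forests as tfdf

auto_loss_model = tfdf.keras.GradientBoostedTreesModel()  # loss selected automatically

regression_model = tfdf.keras.GradientBoostedTreesModel(
    task=tfdf.keras.Task.REGRESSION,
    loss="SQUARED_ERROR",  # explicit least-squares loss for regression
)
```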

@@ -501,6 +529,17 @@ It is probably the most well-known of the Decision Forest training algorithms.

- How to grow the tree.<br>- `LOCAL`: Each node is split independently of the other nodes. In other words, as long as a node satisfies the split constraints (e.g. maximum depth, minimum number of observations), the node will be split. This is the "classical" way to grow decision trees.<br>- `BEST_FIRST_GLOBAL`: The node with the best loss reduction among all the nodes of the tree is selected for splitting. This method is also called "best first" or "leaf-wise growth". See "Best-first decision tree learning", Shi and "Additive logistic regression: A statistical view of boosting", Friedman for more details.

#### [honest](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:honest)

- **Type:** Categorical **Default:** false **Possible values:** true, false

- In honest trees, different training examples are used to infer the structure
and the leaf values. This regularization technique trades examples for bias
estimates. It might increase or reduce the quality of the model. See
"Generalized Random Forests", Athey et al. In that paper, honest trees are
trained with the Random Forest algorithm with sampling without replacement.

#### [in_split_min_examples_check](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:in_split_min_examples_check)

- **Type:** Categorical **Default:** true **Possible values:** true, false
@@ -610,6 +649,16 @@ It is probably the most well-known of the Decision Forest training algorithms.
- Random seed for the training of the model. Learners are expected to be
deterministic for a given random seed.

#### [sampling_with_replacement](../yggdrasil_decision_forests/learner/random_forest/random_forest.proto?q=symbol:sampling_with_replacement)

- **Type:** Categorical **Default:** true **Possible values:** true, false

- If true, the training examples are sampled with replacement. If false, the
training examples are sampled without replacement. Only used when
"bootstrap_training_dataset=true". If false (sampling without replacement)
and if "bootstrap_size_ratio=1" (default), all the examples are used to
train all the trees, which is probably not what you want.
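
As a sketch of the interaction described above (keyword names assumed to
mirror the YDF hyperparameters via the TensorFlow Decision Forests wrapper),
each tree could be given a 50% sample drawn without replacement:

```python
# Sketch (assumed parameter names): Random Forest where each tree is trained
# on 50% of the examples sampled without replacement, instead of the classical
# bootstrap (100% of the examples sampled with replacement).
import tensorflow_decision_forests as tfdf

model = tfdf.keras.RandomForestModel(
    bootstrap_training_dataset=True,  # keep per-tree sampling enabled
    sampling_with_replacement=False,  # sample without replacement...
    bootstrap_size_ratio=0.5,         # ...and reduce the sample size, otherwise every tree sees all examples
)
```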

#### [sorting_strategy](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:sorting_strategy)

- **Type:** Categorical **Default:** PRESORT **Possible values:** IN_NODE,
@@ -741,6 +790,17 @@ used to grow the tree while the second is used to prune the tree.

- How to grow the tree.<br>- `LOCAL`: Each node is split independently of the other nodes. In other words, as long as a node satisfies the split constraints (e.g. maximum depth, minimum number of observations), the node will be split. This is the "classical" way to grow decision trees.<br>- `BEST_FIRST_GLOBAL`: The node with the best loss reduction among all the nodes of the tree is selected for splitting. This method is also called "best first" or "leaf-wise growth". See "Best-first decision tree learning", Shi and "Additive logistic regression: A statistical view of boosting", Friedman for more details.

#### [honest](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:honest)

- **Type:** Categorical **Default:** false **Possible values:** true, false

- In honest trees, different training examples are used to infer the structure
and the leaf values. This regularization technique trades examples for bias
estimates. It might increase or reduce the quality of the model. See
"Generalized Random Forests", Athey et al. In that paper, honest trees are
trained with the Random Forest algorithm with sampling without replacement.

#### [in_split_min_examples_check](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:in_split_min_examples_check)

- **Type:** Categorical **Default:** true **Possible values:** true, false
