diff --git a/CHANGELOG.md b/CHANGELOG.md
index c118ecab..0ae7c569 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,6 @@
# Changelog
-## 0.2.3 - ????
+## 0.2.3 - 2021-01-27
### Features
diff --git a/documentation/installation.md b/documentation/installation.md
index ace7a620..a5201221 100644
--- a/documentation/installation.md
+++ b/documentation/installation.md
@@ -15,6 +15,11 @@ interfaces.
* [Linux / MacOS](#linux--macos)
* [Windows](#windows)
* [Running a minimal example](#running-a-minimal-example)
+ * [Compilation on and for Raspberry Pi](#compilation-on-and-for-raspberry-pi)
+ * [Install requirements](#install-requirements)
+ * [Compile Bazel](#compile-bazel)
+ * [Compile YDF](#compile-ydf)
+ * [Test YDF](#test-ydf)
* [Using the C++ library](#using-the-c-library)
* [Troubleshooting](#troubleshooting)
diff --git a/documentation/learners.md b/documentation/learners.md
index f2c111ee..d7195616 100644
--- a/documentation/learners.md
+++ b/documentation/learners.md
@@ -105,6 +105,23 @@ the gradient of the loss relative to the model output).
- Rolling number of trees used to detect validation loss increase and trigger
early stopping.
+#### [focal_loss_alpha](../yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.proto?q=symbol:focal_loss_alpha)
+
+- **Type:** Real **Default:** 0.5 **Possible values:** min:0 max:1
+
+- EXPERIMENTAL. Weighting parameter for focal loss: positive samples are
+  weighted by alpha, negative samples by (1-alpha). The default value of 0.5
+  means no active class-level weighting. Only used with focal loss, i.e.
+  `loss="BINARY_FOCAL_LOSS"`.
+
+#### [focal_loss_gamma](../yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.proto?q=symbol:focal_loss_gamma)
+
+- **Type:** Real **Default:** 2 **Possible values:** min:0
+
+- EXPERIMENTAL. Exponent of the misprediction (modulating) term in focal
+  loss; corresponds to the gamma parameter in
+  https://arxiv.org/pdf/1708.02002.pdf. Only used with focal loss, i.e.
+  `loss="BINARY_FOCAL_LOSS"`.
+
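+For illustration only, a minimal sketch of how these two parameters enter the
+focal loss of https://arxiv.org/pdf/1708.02002.pdf and how they might be set.
+It assumes the generic hyper-parameters above are exposed under the same names
+by the TensorFlow Decision Forests Keras API, which is not documented on this
+page:
+
+```python
+import pandas as pd
+import tensorflow_decision_forests as tfdf
+
+# Binary classification dataset with an imbalanced "label" column
+# (the path and column name are placeholders).
+train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(
+    pd.read_csv("train.csv"), label="label")
+
+# Focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where
+# alpha_t = focal_loss_alpha for positive examples and
+# 1 - focal_loss_alpha for negative examples.
+model = tfdf.keras.GradientBoostedTreesModel(
+    loss="BINARY_FOCAL_LOSS",
+    focal_loss_alpha=0.25,  # weight of the positive class
+    focal_loss_gamma=2.0,   # value used in the paper's experiments
+)
+model.fit(train_ds)
+```
+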
#### [forest_extraction](../yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.proto?q=symbol:forest_extraction)
- **Type:** Categorical **Default:** MART **Possible values:** MART, DART
@@ -133,6 +150,17 @@ the gradient of the loss relative to the model output).
- How to grow the tree.
- `LOCAL`: Each node is split independently of the other nodes. In other words, as long as a node satisfies the split constraints (e.g. maximum depth, minimum number of observations), the node will be split. This is the "classical" way to grow decision trees.
- `BEST_FIRST_GLOBAL`: The node with the best loss reduction among all the nodes of the tree is selected for splitting. This method is also called "best first" or "leaf-wise growth". See "Best-first decision tree learning", Shi and "Additive logistic regression : A statistical view of boosting", Friedman for more details.
+#### [honest](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:honest)
+
+- **Type:** Categorical **Default:** false **Possible values:** true, false
+
+- In honest trees, different training examples are used to infer the structure
+  and the leaf values. This regularization technique trades training examples
+  for less biased estimates. It might increase or reduce the quality of the
+  model. See "Generalized Random Forests", Athey et al. In this paper, honest
+  trees are trained with the Random Forest algorithm using sampling without
+  replacement.
+
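+As an illustration only, a sketch of enabling honest trees. It assumes the
+`honest` hyper-parameter is exposed under the same name by the TensorFlow
+Decision Forests Keras API, which is not documented on this page:
+
+```python
+import pandas as pd
+import tensorflow_decision_forests as tfdf
+
+train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(
+    pd.read_csv("train.csv"), label="label")  # placeholder dataset
+
+# With honest=True, each tree uses one subset of the training examples to
+# select its splits and a disjoint subset to estimate its leaf values.
+model = tfdf.keras.GradientBoostedTreesModel(honest=True)
+model.fit(train_ds)
+```
+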
#### [in_split_min_examples_check](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:in_split_min_examples_check)
- **Type:** Categorical **Default:** true **Possible values:** true, false
@@ -185,7 +213,7 @@ the gradient of the loss relative to the model output).
- **Type:** Categorical **Default:** DEFAULT **Possible values:** DEFAULT,
BINOMIAL_LOG_LIKELIHOOD, SQUARED_ERROR, MULTINOMIAL_LOG_LIKELIHOOD,
- LAMBDA_MART_NDCG5, XE_NDCG_MART
+ LAMBDA_MART_NDCG5, XE_NDCG_MART, BINARY_FOCAL_LOSS
- The loss optimized by the model. If not specified (DEFAULT) the loss is selected automatically according to the "task" and label statistics. For example, if task=CLASSIFICATION and the label has two possible values, the loss will be set to BINOMIAL_LOG_LIKELIHOOD. Possible values are:
- `DEFAULT`: Select the loss automatically according to the task and label statistics.
- `BINOMIAL_LOG_LIKELIHOOD`: Binomial log likelihood. Only valid for binary classification.
- `SQUARED_ERROR`: Least square loss. Only valid for regression.
- `MULTINOMIAL_LOG_LIKELIHOOD`: Multinomial log likelihood i.e. cross-entropy. Only valid for binary or multi-class classification.
- `LAMBDA_MART_NDCG5`: LambdaMART with NDCG5.
- `XE_NDCG_MART`: Cross Entropy Loss NDCG. See arxiv.org/abs/1911.09798.
@@ -501,6 +529,17 @@ It is probably the most well-known of the Decision Forest training algorithms.
- How to grow the tree.
- `LOCAL`: Each node is split independently of the other nodes. In other words, as long as a node satisfies the split constraints (e.g. maximum depth, minimum number of observations), the node will be split. This is the "classical" way to grow decision trees.
- `BEST_FIRST_GLOBAL`: The node with the best loss reduction among all the nodes of the tree is selected for splitting. This method is also called "best first" or "leaf-wise growth". See "Best-first decision tree learning", Shi and "Additive logistic regression : A statistical view of boosting", Friedman for more details.
+#### [honest](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:honest)
+
+- **Type:** Categorical **Default:** false **Possible values:** true, false
+
+- In honest trees, different training examples are used to infer the structure
+  and the leaf values. This regularization technique trades training examples
+  for less biased estimates. It might increase or reduce the quality of the
+  model. See "Generalized Random Forests", Athey et al. In this paper, honest
+  trees are trained with the Random Forest algorithm using sampling without
+  replacement.
+
#### [in_split_min_examples_check](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:in_split_min_examples_check)
- **Type:** Categorical **Default:** true **Possible values:** true, false
@@ -610,6 +649,16 @@ It is probably the most well-known of the Decision Forest training algorithms.
- Random seed for the training of the model. Learners are expected to be
  deterministic given the random seed.
+#### [sampling_with_replacement](../yggdrasil_decision_forests/learner/random_forest/random_forest.proto?q=symbol:sampling_with_replacement)
+
+- **Type:** Categorical **Default:** true **Possible values:** true, false
+
+- If true, the training examples are sampled with replacement. If false, the
+  training examples are sampled without replacement. Only used when
+  "bootstrap_training_dataset=true". If false (sampling without replacement)
+  and if "bootstrap_size_ratio=1" (default), all the examples are used to
+  train each tree (you probably do not want that).
+
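+As an illustration only, a sketch of a Random Forest trained with sampling
+without replacement. It assumes the hyper-parameters are exposed under the
+same names by the TensorFlow Decision Forests Keras API, which is not
+documented on this page:
+
+```python
+import pandas as pd
+import tensorflow_decision_forests as tfdf
+
+train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(
+    pd.read_csv("train.csv"), label="label")  # placeholder dataset
+
+# Each tree sees 50% of the examples, sampled without replacement. Keeping
+# bootstrap_size_ratio at its default of 1.0 together with
+# sampling_with_replacement=False would give every tree the full dataset.
+model = tfdf.keras.RandomForestModel(
+    bootstrap_size_ratio=0.5,
+    sampling_with_replacement=False,
+)
+model.fit(train_ds)
+```
+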
#### [sorting_strategy](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:sorting_strategy)
- **Type:** Categorical **Default:** PRESORT **Possible values:** IN_NODE,
@@ -741,6 +790,17 @@ used to grow the tree while the second is used to prune the tree.
- How to grow the tree.
- `LOCAL`: Each node is split independently of the other nodes. In other words, as long as a node satisfies the split constraints (e.g. maximum depth, minimum number of observations), the node will be split. This is the "classical" way to grow decision trees.
- `BEST_FIRST_GLOBAL`: The node with the best loss reduction among all the nodes of the tree is selected for splitting. This method is also called "best first" or "leaf-wise growth". See "Best-first decision tree learning", Shi and "Additive logistic regression : A statistical view of boosting", Friedman for more details.
+#### [honest](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:honest)
+
+- **Type:** Categorical **Default:** false **Possible values:** true, false
+
+- In honest trees, different training examples are used to infer the structure
+  and the leaf values. This regularization technique trades training examples
+  for less biased estimates. It might increase or reduce the quality of the
+  model. See "Generalized Random Forests", Athey et al. In this paper, honest
+  trees are trained with the Random Forest algorithm using sampling without
+  replacement.
+
#### [in_split_min_examples_check](../yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto?q=symbol:in_split_min_examples_check)
- **Type:** Categorical **Default:** true **Possible values:** true, false