diff --git a/CHANGELOG.md b/CHANGELOG.md
index 167b03a6..9f15fc8d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,16 +2,27 @@
 
 ## HEAD
 
+## 1.5.0 - 2023-07-03
+
 ### Feature
 
 - Rename experimental_analyze_model_and_dataset to analyze_model_and_dataset
 - Add new GBT loss function `POISSON` for Poisson log likelihood.
 - Go API: Categorical string values available for inspection.
 - Improved training speed for unit-weight datasets.
+- Support for MHLD oblique decision trees.
+- Multi-threaded RMSE computation.
+- Added Uint8 inference engine.
+- Added Multi-task learning where the output of models trained as "secondary"
+  are used as input for the models trained as "primary"
 
 ### Fix
 
 - Go API: fixed typo on OutOfVocabulary constant.
+- Error messages for Uplift models.
+- Remove owner leakage in the model compiler.
+- Fix buggy restriction for SelGB sampling
+- Improve documentation.
 
 ## 1.4.0 - 2023-03-20
 
diff --git a/documentation/rtd/hyper_parameters.md b/documentation/rtd/hyper_parameters.md
index 3d442089..d774c920 100644
--- a/documentation/rtd/hyper_parameters.md
+++ b/documentation/rtd/hyper_parameters.md
@@ -283,9 +283,9 @@ reasonable time.
 
 - **Type:** Categorical **Default:** DEFAULT **Possible values:** DEFAULT,
   BINOMIAL_LOG_LIKELIHOOD, SQUARED_ERROR, MULTINOMIAL_LOG_LIKELIHOOD,
-  LAMBDA_MART_NDCG5, XE_NDCG_MART, BINARY_FOCAL_LOSS
+  LAMBDA_MART_NDCG5, XE_NDCG_MART, BINARY_FOCAL_LOSS, POISSON
 
-- The loss optimized by the model. If not specified (DEFAULT) the loss is selected automatically according to the \"task\" and label statistics. For example, if task=CLASSIFICATION and the label has two possible values, the loss will be set to BINOMIAL_LOG_LIKELIHOOD. Possible values are:
- `DEFAULT`: Select the loss automatically according to the task and label statistics.
- `BINOMIAL_LOG_LIKELIHOOD`: Binomial log likelihood. Only valid for binary classification.
- `SQUARED_ERROR`: Least square loss. Only valid for regression.
- `MULTINOMIAL_LOG_LIKELIHOOD`: Multinomial log likelihood i.e. cross-entropy. Only valid for binary or multi-class classification.
- `LAMBDA_MART_NDCG5`: LambdaMART with NDCG5.
- `XE_NDCG_MART`: Cross Entropy Loss NDCG. See arxiv.org/abs/1911.09798.
+- The loss optimized by the model. If not specified (DEFAULT) the loss is selected automatically according to the \"task\" and label statistics. For example, if task=CLASSIFICATION and the label has two possible values, the loss will be set to BINOMIAL_LOG_LIKELIHOOD. Possible values are:
- `DEFAULT`: Select the loss automatically according to the task and label statistics.
- `BINOMIAL_LOG_LIKELIHOOD`: Binomial log likelihood. Only valid for binary classification.
- `SQUARED_ERROR`: Least square loss. Only valid for regression.
- `POISSON`: Poisson log likelihood loss. Mainly used for counting problems. Only valid for regression.
- `MULTINOMIAL_LOG_LIKELIHOOD`: Multinomial log likelihood i.e. cross-entropy. Only valid for binary or multi-class classification.
- `LAMBDA_MART_NDCG5`: LambdaMART with NDCG5.
- `XE_NDCG_MART`: Cross Entropy Loss NDCG. See arxiv.org/abs/1911.09798.
 
 #### [max_depth](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto)
 
@@ -379,7 +379,7 @@ reasonable time.
 #### [sampling_method](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.proto)
 
 - **Type:** Categorical **Default:** RANDOM **Possible values:** NONE, RANDOM,
-  GOSS
+  GOSS, SELGB
 
 - Control the sampling of the datasets used to train individual trees.
- NONE: No sampling is applied. This is equivalent to RANDOM sampling with \"subsample=1\".
- RANDOM (default): Uniform random sampling. Automatically selected if "subsample" is set.
- GOSS: Gradient-based One-Side Sampling. Automatically selected if "goss_alpha" or "goss_beta" is set.
- SELGB: Selective Gradient Boosting. Automatically selected if "selective_gradient_boosting_ratio" is set. Only valid for ranking.
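
Reviewer note on the `GOSS` option listed above: the sketch below illustrates Gradient-based One-Side Sampling as it is generally described, not the code in this repository. It keeps the examples with the largest absolute gradients, draws a uniform sample from the remainder, and up-weights that sample by `(1 - alpha) / beta`. The function name `goss_sample` and its arguments are hypothetical. That `alpha` and `beta` play the roles of "goss_alpha" and "goss_beta" is an assumption, and the library's actual implementation may differ.

```python
import random


def goss_sample(gradients, alpha, beta, seed=0):
    """Return (example_index, weight) pairs chosen by a GOSS-style sampler.

    Hypothetical helper for illustration only; not part of the library.
    alpha: fraction of examples kept for having the largest |gradient|.
    beta: fraction of examples sampled uniformly from the remainder.
    """
    rng = random.Random(seed)
    n = len(gradients)

    # Rank examples by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)

    num_top = int(alpha * n)
    num_rest = int(beta * n)
    top, rest = order[:num_top], order[num_top:]

    # Uniform sample (without replacement) from the small-gradient examples.
    sampled = rng.sample(rest, min(num_rest, len(rest)))

    # Up-weight the sampled examples to compensate for the ones dropped.
    scale = (1.0 - alpha) / beta if beta > 0 else 0.0
    return [(i, 1.0) for i in top] + [(i, scale) for i in sampled]


if __name__ == "__main__":
    grads = [0.1, -5.0, 0.2, 3.0, -0.3, 0.05, 1.0, -0.4]
    for index, weight in goss_sample(grads, alpha=0.25, beta=0.5):
        print(index, weight)
```

With `alpha=0.25` and `beta=0.5` on eight examples, the two largest-gradient examples are always kept with weight 1.0 and four of the remaining six are drawn with weight 1.5.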