From 8bf9db09166de189c0e0c2f8633a17b4bcb7a845 Mon Sep 17 00:00:00 2001 From: Richard Stotz Date: Wed, 21 Aug 2024 03:21:05 -0700 Subject: [PATCH] Prepare release of TF-DF 1.10.0 and YDF 1.10.0 and PYDF 0.7.0 PiperOrigin-RevId: 665797043 --- CHANGELOG.md | 3 +- documentation/public/docs/hyperparameters.md | 92 +++++++------------ .../port/python/CHANGELOG.md | 14 ++- .../port/python/config/setup.py | 4 +- .../port/python/dev_requirements.txt | 2 +- .../pybind11_protobuf/workspace.bzl | 4 +- .../port/python/tools/build_test_linux.sh | 2 +- .../port/python/tools/release_macos.sh | 3 +- .../port/python/tools/release_windows.bat | 2 +- .../port/python/ydf/cc/BUILD | 4 + .../port/python/ydf/model/generic_model.py | 6 -- .../port/python/ydf/version.py | 2 +- .../utils/compatibility.h | 1 + 13 files changed, 63 insertions(+), 76 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 7fc4feb9..376703a8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,7 +3,7 @@ Note: This is the changelog of the C++ library. The Python port has a separate Changelog under `yggdrasil_decision_forests/port/python/CHANGELOG.md`. -## Head +## 1.10.0 - 2024-08-21 ### Features @@ -11,6 +11,7 @@ Changelog under `yggdrasil_decision_forests/port/python/CHANGELOG.md`. - The default value of `num_candidate_attributes` in the CART learner is changed from 0 (Random Forest style sampling) to -1 (no sampling). This is the generally accepted logic of CART. +- Added support for GCS for file I/O. ## 1.9.0 - 2024-03-12 diff --git a/documentation/public/docs/hyperparameters.md b/documentation/public/docs/hyperparameters.md index 37883238..8c0075ed 100644 --- a/documentation/public/docs/hyperparameters.md +++ b/documentation/public/docs/hyperparameters.md @@ -24,11 +24,6 @@ learner: "RANDOM_FOREST" num_trees: 1000 } ``` - -## Table of content - -[TOC] - ## GRADIENT_BOOSTED_TREES A [Gradient Boosted Trees](https://statweb.stanford.edu/~jhf/ftp/trebst.pdf) @@ -427,14 +422,15 @@ reasonable time. - Coefficient applied to each tree prediction. A small value (0.02) tends to give more accurate results (assuming enough trees are trained), but results - in larger models. Analogous to neural network learning rate. + in larger models. Analogous to neural network learning rate. Fixed to 1.0 + for DART models. #### [sorting_strategy](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) - **Type:** Categorical **Default:** PRESORT **Possible values:** IN_NODE, - PRESORT + PRESORT, FORCE_PRESORT, AUTO -- How are sorted the numerical features in order to find the splits
- PRESORT: The features are pre-sorted at the start of the training. This solution is faster but consumes much more memory than IN_NODE.
- IN_NODE: The features are sorted just before being used in the node. This solution is slow but consumes little amount of memory.
. +- How the numerical features are sorted to find the splits
- AUTO: Selects the most efficient method among IN_NODE, FORCE_PRESORT, and LAYER.
- IN_NODE: The features are sorted just before being used in the node. This solution is slow but consumes little memory.
- FORCE_PRESORT: The features are pre-sorted at the start of training. This solution is faster but consumes much more memory than IN_NODE.
- PRESORT: Automatically chooses between FORCE_PRESORT and IN_NODE.
. #### [sparse_oblique_max_num_projections](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) @@ -473,7 +469,7 @@ reasonable time. - **Type:** Categorical **Default:** AXIS_ALIGNED **Possible values:** AXIS_ALIGNED, SPARSE_OBLIQUE, MHLD_OBLIQUE -- What structure of split to consider for numerical features.
- `AXIS_ALIGNED`: Axis aligned splits (i.e. one condition at a time). This is the "classical" way to train a tree. Default value.
- `SPARSE_OBLIQUE`: Sparse oblique splits (i.e. random splits one a small number of features) from "Sparse Projection Oblique Random Forests", Tomita et al., 2020.
- `MHLD_OBLIQUE`: Multi-class Hellinger Linear Discriminant splits from "Classification Based on Multivariate Contrast Patterns", Canete-Sifuentes et al., 2029 +- What structure of split to consider for numerical features.
- `AXIS_ALIGNED`: Axis aligned splits (i.e. one condition at a time). This is the "classical" way to train a tree. Default value.
- `SPARSE_OBLIQUE`: Sparse oblique splits (i.e. random splits on a small number of features) from "Sparse Projection Oblique Random Forests", Tomita et al., 2020.
- `MHLD_OBLIQUE`: Multi-class Hellinger Linear Discriminant splits from "Classification Based on Multivariate Contrast Patterns", Canete-Sifuentes et al., 2019

#### [subsample](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.proto)

@@ -536,8 +532,8 @@ reasonable time.

## RANDOM_FOREST

-A Random Forest (https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf) is
-a collection of deep CART decision trees trained independently and without
+A [Random Forest](https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf)
+is a collection of deep CART decision trees trained independently and without
 pruning. Each tree is trained on a random subset of the original training
 dataset (sampled with replacement).

@@ -853,9 +849,9 @@ reasonable time.

#### [sorting_strategy](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto)

- **Type:** Categorical **Default:** PRESORT **Possible values:** IN_NODE,
-  PRESORT
+  PRESORT, FORCE_PRESORT, AUTO

-- How are sorted the numerical features in order to find the splits
- PRESORT: The features are pre-sorted at the start of the training. This solution is faster but consumes much more memory than IN_NODE.
- IN_NODE: The features are sorted just before being used in the node. This solution is slow but consumes little amount of memory.
. +- How the numerical features are sorted to find the splits
- AUTO: Selects the most efficient method among IN_NODE, FORCE_PRESORT, and LAYER.
- IN_NODE: The features are sorted just before being used in the node. This solution is slow but consumes little memory.
- FORCE_PRESORT: The features are pre-sorted at the start of training. This solution is faster but consumes much more memory than IN_NODE.
- PRESORT: Automatically chooses between FORCE_PRESORT and IN_NODE.
. #### [sparse_oblique_max_num_projections](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) @@ -894,7 +890,7 @@ reasonable time. - **Type:** Categorical **Default:** AXIS_ALIGNED **Possible values:** AXIS_ALIGNED, SPARSE_OBLIQUE, MHLD_OBLIQUE -- What structure of split to consider for numerical features.
- `AXIS_ALIGNED`: Axis aligned splits (i.e. one condition at a time). This is the "classical" way to train a tree. Default value.
- `SPARSE_OBLIQUE`: Sparse oblique splits (i.e. random splits one a small number of features) from "Sparse Projection Oblique Random Forests", Tomita et al., 2020.
- `MHLD_OBLIQUE`: Multi-class Hellinger Linear Discriminant splits from "Classification Based on Multivariate Contrast Patterns", Canete-Sifuentes et al., 2029 +- What structure of split to consider for numerical features.
- `AXIS_ALIGNED`: Axis aligned splits (i.e. one condition at a time). This is the "classical" way to train a tree. Default value.
- `SPARSE_OBLIQUE`: Sparse oblique splits (i.e. random splits on a small number of features) from "Sparse Projection Oblique Random Forests", Tomita et al., 2020.
- `MHLD_OBLIQUE`: Multi-class Hellinger Linear Discriminant splits from "Classification Based on Multivariate Contrast Patterns", Canete-Sifuentes et al., 2019

#### [uplift_min_examples_in_treatment](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto)

@@ -1134,9 +1130,9 @@ The hyper-parameter protobuffers are used with the C++ and CLI APIs.

#### [sorting_strategy](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto)

- **Type:** Categorical **Default:** IN_NODE **Possible values:** IN_NODE,
-  PRESORT
+  PRESORT, FORCE_PRESORT, AUTO

-- How are sorted the numerical features in order to find the splits
- PRESORT: The features are pre-sorted at the start of the training. This solution is faster but consumes much more memory than IN_NODE.
- IN_NODE: The features are sorted just before being used in the node. This solution is slow but consumes little amount of memory.
. +- How the numerical features are sorted to find the splits
- AUTO: Selects the most efficient method among IN_NODE, FORCE_PRESORT, and LAYER.
- IN_NODE: The features are sorted just before being used in the node. This solution is slow but consumes little memory.
- FORCE_PRESORT: The features are pre-sorted at the start of training. This solution is faster but consumes much more memory than IN_NODE.
- PRESORT: Automatically chooses between FORCE_PRESORT and IN_NODE.
. #### [sparse_oblique_max_num_projections](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) @@ -1175,7 +1171,7 @@ The hyper-parameter protobuffers are used with the C++ and CLI APIs. - **Type:** Categorical **Default:** AXIS_ALIGNED **Possible values:** AXIS_ALIGNED, SPARSE_OBLIQUE, MHLD_OBLIQUE -- What structure of split to consider for numerical features.
- `AXIS_ALIGNED`: Axis aligned splits (i.e. one condition at a time). This is the "classical" way to train a tree. Default value.
- `SPARSE_OBLIQUE`: Sparse oblique splits (i.e. random splits one a small number of features) from "Sparse Projection Oblique Random Forests", Tomita et al., 2020.
- `MHLD_OBLIQUE`: Multi-class Hellinger Linear Discriminant splits from "Classification Based on Multivariate Contrast Patterns", Canete-Sifuentes et al., 2029 +- What structure of split to consider for numerical features.
- `AXIS_ALIGNED`: Axis aligned splits (i.e. one condition at a time). This is the "classical" way to train a tree. Default value.
- `SPARSE_OBLIQUE`: Sparse oblique splits (i.e. random splits on a small number of features) from "Sparse Projection Oblique Random Forests", Tomita et al., 2020.
- `MHLD_OBLIQUE`: Multi-class Hellinger Linear Discriminant splits from "Classification Based on Multivariate Contrast Patterns", Canete-Sifuentes et al., 2019

#### [uplift_min_examples_in_treatment](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto)

@@ -1325,7 +1321,8 @@ The hyper-parameter protobuffers are used with the C++ and CLI APIs.

 - Coefficient applied to each tree prediction. A small value (0.02) tends to
   give more accurate results (assuming enough trees are trained), but results
-  in larger models. Analogous to neural network learning rate.
+  in larger models. Analogous to neural network learning rate. Fixed to 1.0
+  for DART models.

#### [use_hessian_gain](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/gradient_boosted_trees/gradient_boosted_trees.proto)

@@ -1343,8 +1340,8 @@ The hyper-parameter protobuffers are used with the C++ and CLI APIs.

## ISOLATION_FOREST

-An Isolation Forest (https://ieeexplore.ieee.org/abstract/document/4781136) is a
-collection of decision trees trained without labels and independently to
+An [Isolation Forest](https://ieeexplore.ieee.org/abstract/document/4781136) is
+a collection of decision trees trained without labels and independently to
 partition the feature space. The Isolation Forest prediction is an anomaly
 score that indicates whether an example originates from the same distribution
 as the training examples. We refer to Isolation Forest as both the original algorithm

@@ -1455,11 +1452,12 @@ The hyper-parameter protobuffers are used with the C++ and CLI APIs.

#### [max_depth](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto)

-- **Type:** Integer **Default:** 16 **Possible values:** min:-1
+- **Type:** Integer **Default:** -2 **Possible values:** min:-2

 - Maximum depth of the tree. `max_depth=1` means that all trees will be roots.
-  `max_depth=-1` means that tree depth is not restricted by this parameter.
-  Values <= -2 will be ignored.
+  `max_depth=-1` means that tree depth is unconstrained by this parameter.
+  `max_depth=-2` means that the maximum depth is log2(number of sampled
+  examples per tree) (default).

#### [max_num_nodes](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto)

@@ -1496,15 +1494,6 @@ The hyper-parameter protobuffers are used with the C++ and CLI APIs.

   numerical features, the value is capped automatically. The value 1 is
   allowed but results in ordinary (non-oblique) splits.

-#### [mhld_oblique_sample_attributes](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto)
-
-- **Type:** Categorical **Default:** false **Possible values:** true, false
-
-- For MHLD oblique splits i.e. `split_axis=MHLD_OBLIQUE`. If true, applies the
-  attribute sampling controlled by the "num_candidate_attributes" or
-  "num_candidate_attributes_ratio" parameters. If false, all the attributes
-  are tested.
-
#### [min_examples](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto)

- **Type:** Integer **Default:** 5 **Possible values:** min:1

@@ -1566,16 +1555,10 @@ The hyper-parameter protobuffers are used with the C++ and CLI APIs.

 
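The sampling and depth defaults above interact: with the isolation forest's default of 256 sampled examples per tree (see `subsample_count` below) and `max_depth=-2`, each tree is capped at depth log2(256) = 8. A minimal sketch with the YDF Python API; the `ydf` package and the `IsolationForestLearner` parameter names are assumed from this documentation rather than a verified release:

```python
# Sketch: how subsample_count bounds tree depth when max_depth=-2 (default).
import math

import numpy as np
import ydf  # assumed: the YDF Python port exposing these hyperparameters

rng = np.random.default_rng(0)
data = {"f1": rng.normal(size=1000), "f2": rng.normal(size=1000)}

subsample_count = 256  # default: 256 examples sampled per tree
implied_depth_cap = int(math.log2(subsample_count))  # = 8

model = ydf.IsolationForestLearner(
    subsample_count=subsample_count,  # max_depth left at its -2 default
).train(data)

scores = model.predict(data)  # higher scores indicate more anomalous examples
```

Setting `max_depth` explicitly overrides the log2 cap, per the description above.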
#### [sorting_strategy](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) -- **Type:** Categorical **Default:** PRESORT **Possible values:** IN_NODE, - PRESORT - -- How are sorted the numerical features in order to find the splits
- PRESORT: The features are pre-sorted at the start of the training. This solution is faster but consumes much more memory than IN_NODE.
- IN_NODE: The features are sorted just before being used in the node. This solution is slow but consumes little amount of memory.
. - -#### [sparse_oblique_max_num_projections](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) - -- **Type:** Integer **Default:** 6000 **Possible values:** min:1 +- **Type:** Categorical **Default:** AUTO **Possible values:** IN_NODE, + PRESORT, FORCE_PRESORT, AUTO -- For sparse oblique splits i.e. `split_axis=SPARSE_OBLIQUE`. Maximum number of projections (applied after the num_projections_exponent).
Oblique splits try out max(p^num_projections_exponent, max_num_projections) random projections for choosing a split, where p is the number of numerical features. Increasing "max_num_projections" increases the training time but not the inference time. In late stage model development, if every bit of accuracy if important, increase this value.
The paper "Sparse Projection Oblique Random Forests" (Tomita et al, 2020) does not define this hyperparameter. +- How are sorted the numerical features in order to find the splits
- AUTO: Selects the most efficient method among IN_NODE, FORCE_PRESORT, and LAYER.
- IN_NODE: The features are sorted just before being used in the node. This solution is slow but consumes little memory.
- FORCE_PRESORT: The features are pre-sorted at the start of training. This solution is faster but consumes much more memory than IN_NODE.
- PRESORT: Automatically chooses between FORCE_PRESORT and IN_NODE.
. #### [sparse_oblique_normalization](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) @@ -1584,12 +1567,6 @@ The hyper-parameter protobuffers are used with the C++ and CLI APIs. - For sparse oblique splits i.e. `split_axis=SPARSE_OBLIQUE`. Normalization applied on the features, before applying the sparse oblique projections.
- `NONE`: No normalization.
- `STANDARD_DEVIATION`: Normalize the feature by the estimated standard deviation on the entire train dataset. Also known as Z-Score normalization.
- `MIN_MAX`: Normalize the feature by the range (i.e. max-min) estimated on the entire train dataset. -#### [sparse_oblique_num_projections_exponent](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) - -- **Type:** Real **Default:** 2 **Possible values:** min:0 - -- For sparse oblique splits i.e. `split_axis=SPARSE_OBLIQUE`. Controls of the number of random projections to test at each node.
Increasing this value very likely improves the quality of the model, drastically increases the training time, and doe not impact the inference time.
Oblique splits try out max(p^num_projections_exponent, max_num_projections) random projections for choosing a split, where p is the number of numerical features. Therefore, increasing this `num_projections_exponent` and possibly `max_num_projections` may improve model quality, but will also significantly increase training time.
Note that the complexity of (classic) Random Forests is roughly proportional to `num_projections_exponent=0.5`, since it considers sqrt(num_features) for a split. The complexity of (classic) GBDT is roughly proportional to `num_projections_exponent=1`, since it considers all features for a split.
The paper "Sparse Projection Oblique Random Forests" (Tomita et al, 2020) recommends values in [1/4, 2]. - #### [sparse_oblique_projection_density_factor](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) - **Type:** Real **Default:** 2 **Possible values:** min:0 @@ -1606,27 +1583,28 @@ The hyper-parameter protobuffers are used with the C++ and CLI APIs. #### [split_axis](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) - **Type:** Categorical **Default:** AXIS_ALIGNED **Possible values:** - AXIS_ALIGNED, SPARSE_OBLIQUE, MHLD_OBLIQUE + AXIS_ALIGNED, SPARSE_OBLIQUE -- What structure of split to consider for numerical features.
- `AXIS_ALIGNED`: Axis aligned splits (i.e. one condition at a time). This is the "classical" way to train a tree. Default value.
- `SPARSE_OBLIQUE`: Sparse oblique splits (i.e. random splits one a small number of features) from "Sparse Projection Oblique Random Forests", Tomita et al., 2020.
- `MHLD_OBLIQUE`: Multi-class Hellinger Linear Discriminant splits from "Classification Based on Multivariate Contrast Patterns", Canete-Sifuentes et al., 2029 +- What structure of split to consider for numerical features.
- `AXIS_ALIGNED`: Axis aligned splits (i.e. one condition at a time). This is the "classical" way to train a tree. Default value.
- `SPARSE_OBLIQUE`: Sparse oblique splits (i.e. random splits on a small number of features) from "Sparse Projection Oblique Random Forests", Tomita et al., 2020. This includes the splits described in "Extended Isolation Forests" (Sahand Hariri et al., 2018). #### [subsample_count](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/isolation_forest/isolation_forest.proto) -- **Type:** Integer **Default:** 300 **Possible values:** min:0 +- **Type:** Integer **Default:** 256 **Possible values:** min:0 - Number of examples used to grow each tree. Only one of "subsample_ratio" and - "subsample_count" can be set. If neither is set, "subsample_count" is - assumed to be equal to 256. This is the default value recommended in the - isolation forest paper. + "subsample_count" can be set. By default, sample 256 examples per tree. Note + that this parameter also restricts the tree's maximum depth to log2(examples + used per tree) unless max_depth is set explicitly. #### [subsample_ratio](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/isolation_forest/isolation_forest.proto) -- **Type:** Integer **Default:** 300 **Possible values:** min:0 +- **Type:** Real **Default:** 1 **Possible values:** min:0 - Ratio of number of training examples used to grow each tree. Only one of - "subsample_ratio" and "subsample_count" can be set. If neither is set, - "subsample_count" is assumed to be equal to 256. This is the default value - recommended in the isolation forest paper. + "subsample_ratio" and "subsample_count" can be set. By default, sample 256 + examples per tree. Note that this parameter also restricts the tree's + maximum depth to log2(examples used per tree) unless max_depth is set + explicitly. #### [uplift_min_examples_in_treatment](https://github.com/google/yggdrasil-decision-forests/blob/main/yggdrasil_decision_forests/learner/decision_tree/decision_tree.proto) diff --git a/yggdrasil_decision_forests/port/python/CHANGELOG.md b/yggdrasil_decision_forests/port/python/CHANGELOG.md index e015d990..bedb1c8a 100644 --- a/yggdrasil_decision_forests/port/python/CHANGELOG.md +++ b/yggdrasil_decision_forests/port/python/CHANGELOG.md @@ -1,6 +1,6 @@ # Changelog -## Head +## 0.7.0 - 2024-08-21 ### Feature @@ -13,12 +13,13 @@ - Models can be pickled safely. - Native support for Xarray as a dataset format for all operations (e.g., training, evaluation, predictions). -- The output of `model.to_jax_function` can then be converted to a TensorFlow - Lite model. +- The output of `model.to_jax_function` can be converted to a TensorFlow Lite + model. - Change the default number of examples to scan when training on files to determine the semantic and dictionaries of columns from 10k to 100k. - Various improvements of error messages. - Evaluation for Anomaly Detection models. +- Oblique splits for Anomaly Detection models. ### Fix @@ -31,6 +32,13 @@ multidimensional categorical integers. - Fix error when defining categorical sets for non-ragged multidimensional inputs. +- MacOS: Fix compatibility with other protobuf-using libraries such as + Tensorflow. + +#### Release music + +Rondo Alla ingharese quasi un capriccio "Die Wut über den verlorenen Groschen", +Op. 129. 
Ludwig van Beethoven ## 0.6.0 - 2024-07-04 diff --git a/yggdrasil_decision_forests/port/python/config/setup.py b/yggdrasil_decision_forests/port/python/config/setup.py index 36e4fb19..29d657d7 100644 --- a/yggdrasil_decision_forests/port/python/config/setup.py +++ b/yggdrasil_decision_forests/port/python/config/setup.py @@ -22,13 +22,13 @@ from setuptools.command.install import install from setuptools.dist import Distribution -_VERSION = "0.6.0" +_VERSION = "0.7.0" with open("README.md", "r", encoding="utf-8") as fh: long_description = fh.read() REQUIRED_PACKAGES = [ - "numpy<2.0.0", + "numpy", "absl_py", "protobuf>=3.14", ] diff --git a/yggdrasil_decision_forests/port/python/dev_requirements.txt b/yggdrasil_decision_forests/port/python/dev_requirements.txt index d644b780..d2fbf648 100644 --- a/yggdrasil_decision_forests/port/python/dev_requirements.txt +++ b/yggdrasil_decision_forests/port/python/dev_requirements.txt @@ -4,7 +4,7 @@ pydantic requests fastapi[standard]>=0.112.0,<0.113.0 tensorflow_decision_forests; platform_machine != 'aarch64' and python_version >= '3.9' and python_version < '3.12' -tensorflow; platform_machine != 'aarch64' +tensorflow; platform_machine != 'aarch64' and python_version >= '3.9' and python_version < '3.12' portpicker matplotlib scikit-learn diff --git a/yggdrasil_decision_forests/port/python/oss_third_party/pybind11_protobuf/workspace.bzl b/yggdrasil_decision_forests/port/python/oss_third_party/pybind11_protobuf/workspace.bzl index 4a0f35a1..99280dea 100644 --- a/yggdrasil_decision_forests/port/python/oss_third_party/pybind11_protobuf/workspace.bzl +++ b/yggdrasil_decision_forests/port/python/oss_third_party/pybind11_protobuf/workspace.bzl @@ -3,8 +3,8 @@ load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive") def deps(): - PYBIND_PROTOBUF_COMMIT_HASH = "3d7834b607758bbd2e3d210c6c478453922f20c0" - PYBIND_PROTOBUF_SHA = "89ba0a6eb92a834dc08dc199da5b94b4648168c56d5409116f9b7699e5350f11" + PYBIND_PROTOBUF_COMMIT_HASH = "f1b245929759230f31cdd1e5f9e0e69f817fed95" + PYBIND_PROTOBUF_SHA = "7eeabdaa39d5b1f48f1feb0894d6b7f02f77964e2a6bc1eaa4a90fe243e0a34c" http_archive( name = "com_google_pybind11_protobuf", strip_prefix = "pybind11_protobuf-{commit}".format(commit = PYBIND_PROTOBUF_COMMIT_HASH), diff --git a/yggdrasil_decision_forests/port/python/tools/build_test_linux.sh b/yggdrasil_decision_forests/port/python/tools/build_test_linux.sh index 557fdb96..695f5cc9 100755 --- a/yggdrasil_decision_forests/port/python/tools/build_test_linux.sh +++ b/yggdrasil_decision_forests/port/python/tools/build_test_linux.sh @@ -33,7 +33,7 @@ build_and_maybe_test () { echo " Compiler : $CC" bazel version - local ARCHITECTURE=$(uname --m) + local ARCHITECTURE=$(uname -m) local flags="--config=linux_cpp17 --features=-fully_static_link" if [ "$ARCHITECTURE" == "x86_64" ]; then diff --git a/yggdrasil_decision_forests/port/python/tools/release_macos.sh b/yggdrasil_decision_forests/port/python/tools/release_macos.sh index 5d6333bf..9ee5d655 100755 --- a/yggdrasil_decision_forests/port/python/tools/release_macos.sh +++ b/yggdrasil_decision_forests/port/python/tools/release_macos.sh @@ -14,6 +14,7 @@ # limitations under the License. +# Running this script inside a python venv may not work. 
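+# If a venv is already active, `deactivate` it first (assumed remedy): the
+# loop below activates its own venv for each Python version.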
set -vex declare -a python_versions=("3.8" "3.9" "3.10" "3.11" "3.12") @@ -27,7 +28,7 @@ do source ${TMPDIR}venv/bin/activate pip install --upgrade pip - echo "Building with $(python3 -V 2>&1)" + echo "Building with $(python -V 2>&1)" bazel clean --expunge RUN_TESTS=0 CC="clang" ./tools/build_test_linux.sh diff --git a/yggdrasil_decision_forests/port/python/tools/release_windows.bat b/yggdrasil_decision_forests/port/python/tools/release_windows.bat index 19c7c035..814613c4 100644 --- a/yggdrasil_decision_forests/port/python/tools/release_windows.bat +++ b/yggdrasil_decision_forests/port/python/tools/release_windows.bat @@ -34,7 +34,7 @@ cls setlocal -set YDF_VERSION=0.5.0 +set YDF_VERSION=0.7.0 set BAZEL=bazel.exe set BAZEL_SH=C:\msys64\usr\bin\bash.exe set BAZEL_FLAGS=--config=windows_cpp20 --config=windows_avx2 diff --git a/yggdrasil_decision_forests/port/python/ydf/cc/BUILD b/yggdrasil_decision_forests/port/python/ydf/cc/BUILD index 40ea0854..3ce2aacc 100644 --- a/yggdrasil_decision_forests/port/python/ydf/cc/BUILD +++ b/yggdrasil_decision_forests/port/python/ydf/cc/BUILD @@ -11,6 +11,10 @@ package( pybind_extension( name = "ydf", srcs = ["ydf.cc"], + linkopts = select({ + "@bazel_tools//src/conditions:darwin": ["-Wl,-exported_symbol,_PyInit_ydf"], + "//conditions:default": [], + }), deps = [ "//ydf/dataset:dataset_cc", "//ydf/learner:learner_cc", diff --git a/yggdrasil_decision_forests/port/python/ydf/model/generic_model.py b/yggdrasil_decision_forests/port/python/ydf/model/generic_model.py index 409c2376..558b6e2b 100644 --- a/yggdrasil_decision_forests/port/python/ydf/model/generic_model.py +++ b/yggdrasil_decision_forests/port/python/ydf/model/generic_model.py @@ -897,12 +897,6 @@ def pre_processing(raw_features): force: Try to export even in currently unsupported environments. WARNING: Setting this to true may crash the Python runtime. """ - if platform.system() == "Darwin" and not force: - raise ValueError( - "Exporting to TensorFlow is currently broken on MacOS and may crash" - " the current Python process. To proceed anyway, add parameter" - " `force=True`." - ) if mode == "keras": log.warning( diff --git a/yggdrasil_decision_forests/port/python/ydf/version.py b/yggdrasil_decision_forests/port/python/ydf/version.py index abef65f6..5bd6b189 100644 --- a/yggdrasil_decision_forests/port/python/ydf/version.py +++ b/yggdrasil_decision_forests/port/python/ydf/version.py @@ -12,4 +12,4 @@ # See the License for the specific language governing permissions and # limitations under the License. -version = "0.6.0" +version = "0.7.0" diff --git a/yggdrasil_decision_forests/utils/compatibility.h b/yggdrasil_decision_forests/utils/compatibility.h index fac95744..9ec2e11a 100644 --- a/yggdrasil_decision_forests/utils/compatibility.h +++ b/yggdrasil_decision_forests/utils/compatibility.h @@ -24,6 +24,7 @@ #include +#include #include #include "absl/types/optional.h"
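For context on the `generic_model.py` hunk above, which removes the macOS guard: a minimal sketch of the TensorFlow export it re-enables. The dataset and model are hypothetical, and `to_tensorflow_saved_model` with a `mode` argument is inferred from the `mode == "keras"` check visible in the hunk, not from a verified signature:

```python
# Sketch: TensorFlow export that previously raised ValueError on macOS
# unless force=True, and now runs through the normal path.
import numpy as np
import ydf  # assumed: the YDF Python port

data = {
    "f1": np.random.uniform(size=200),
    "label": np.random.uniform(size=200) > 0.5,
}
model = ydf.RandomForestLearner(label="label").train(data)

# "keras" is the mode checked (and warned about) in the hunk above; it is
# used here only to illustrate the call shape.
model.to_tensorflow_saved_model("/tmp/ydf_saved_model", mode="keras")
```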