Batch trust regions implementation of TURBO (#791)
* Add support for local models and datasets (WIP)

* Add unit test for local models (WIP)

* Update multi model/dataset test (WIP)

* Add unit test for keep datasets in regions

* Add more tests and move to local tags class

* Always include global dataset in mapping

* Add filter_mask method to trust region

* Add more testing

* Fix mypy model type issues

* Add ask_tell testing

* Fix summary init when only global dataset

* Remove walrus operator

* Update test, ask_tell data not changed in-place

* Add some test comments

* Add some rule comments

* Allow input-multi-observers for batch observer

* Allow multiple models/datasets for base rule

* Support multiple models/datasets in region selects

* Fix TR plotting history colors

* Add notebook init points explanation

* Rename region index and add init param

* WIP

* Add more TURBO implementation + attempt type fixes

* Remove old comment

* Tidy-up redundant expression

* Fix TURBOBox and temp changes in TURBO for match

* Add intermediate box region class

* Keep full datasets along with filtered ones

* Move subspace update to a new rule method

* Add temp notebook for TURBO comparisons

* Save TR subspaces in acquire to re-use later

* Update notebook to use TURBOBox

* Make changes from PR feedback

* Fix rename after merge

* Fix compare notebook after merge

* Move rule create later and tidy filtering dataset

* More testing in notebook

* Remove redundant dataset filtering

* Address some of the recent feedback

* Fix dataset mypy error

* Copy dataset in optimizers to avoid changing it

* Share DatasetChecker and tidy-up exp values in tests

* Address more feedback

* Remove prev TURBO and update tests to use new class

* Remove notebook for testing

* Create dataset and update at start of optim

* Avoid default num_models in integ tests

* Fix old python typing issue

* Address feedback

* Address more comments

* Only copy state with track_state==True

* Add comment explaining copy

* Deepcopy subspace internal to rule

* Keep global datasets unfiltered

* Add notebook intro and improve TREGO text

* Use flatten_... func and add comment

* Improve TR explanations

* Fix merge issues

* Clarify parallel acq comments
khurram-ghani authored Dec 18, 2023
1 parent f766953 commit cf09b0f
Showing 12 changed files with 806 additions and 642 deletions.
141 changes: 90 additions & 51 deletions docs/notebooks/trust_region.pct.py
@@ -17,7 +17,24 @@
# %% [markdown]
# # Trust region Bayesian optimization
#
# We will demonstrate three trust region Bayesian optimization algorithms in this tutorial.
# This notebook guides you through practical examples of trust region Bayesian optimization,
# illustrating algorithms like TREGO <cite data-cite="diouane2022trego"/> and TuRBO
# <cite data-cite="eriksson2019scalable"/> that could be beneficial for optimizing high-dimensional
# spaces. Trieste provides a flexible framework for implementing custom algorithms by encapsulating
# the behavior of rules and regions into separate abstract classes, `BatchTrustRegion` and
# `UpdatableTrustRegion` respectively.
#
# Trust region optimization is a general strategy used in optimization problems where the solution
# space is navigated by fitting a localized model, such as a quadratic approximation, around
# the current point estimate and then using this model to take an optimization step. The region
# where this approximation is considered trustworthy is the "trust region". Within this region, the
# algorithm deems the model's predictions to be reliable enough to base decisions on. After each
# step, the performance of the actual function is evaluated and the trust region is adjusted
# accordingly -- typically expanded if the model was predictive or contracted if it wasn't.
#
# In trust region Bayesian optimization, a probabilistic model, often a Gaussian Process,
# replaces simpler models like quadratics to inform the optimization. This model can be applied
# globally across the entire search space or adjusted to accommodate local regions.
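
# %% [markdown]
# As a purely illustrative aside (this is not Trieste code), the generic loop described
# above can be sketched in a few lines. The quadratic surrogate, the improvement ratio and
# the 0.25/0.75 resize thresholds below are conventional but hypothetical choices.

# %%
import numpy as np  # also imported with the rest of the dependencies in the next cell


def sketch_trust_region_minimize(f, x, radius=1.0, num_steps=20):
    """Sketch only: minimize a 1-D function using a quadratic trust region model."""
    fx = f(x)
    for _ in range(num_steps):
        # Fit a quadratic model to three points spanning the current region.
        xs = np.array([x - radius, x, x + radius])
        a, b, c = np.polyfit(xs, [f(v) for v in xs], deg=2)
        # Take the step that minimizes the model, clipped to the region bounds.
        vertex = -b / (2 * a) if a > 1e-12 else x - radius
        x_new = float(np.clip(vertex, x - radius, x + radius))
        f_new, m_new = f(x_new), np.polyval([a, b, c], x_new)
        # Compare actual vs predicted improvement: expand the region if the
        # model was predictive, contract it if it was not.
        rho = (fx - f_new) / max(fx - m_new, 1e-12)
        radius = radius * 2.0 if rho > 0.75 else radius * 0.5 if rho < 0.25 else radius
        if f_new < fx:  # accept the step only if the true objective improved
            x, fx = x_new, f_new
    return x


sketch_trust_region_minimize(lambda v: (v - 1.5) ** 2 + 0.5, x=5.0)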

# %%
import numpy as np
@@ -63,24 +80,26 @@ def build_model():
# %% [markdown]
# ## Trust region `TREGO` acquisition rule
#
# First we show how to run Bayesian optimization with the `TREGO` algorithm. This is a trust region
# algorithm that alternates between regular EGO steps and local steps within one trust region
# (see <cite data-cite="diouane2022trego"/>).
# First we demonstrate how to run Bayesian optimization with the `TREGO` algorithm, which alternates
# between regular EGO steps and local steps within one trust region (see
# <cite data-cite="diouane2022trego"/>).
#
# ### Create `TREGO` rule and run optimization loop
#
# We can run the Bayesian optimization loop by defining a `BayesianOptimizer` and calling its
# `optimize` method with the trust region rule. Once the optimization loop is complete, the
# optimizer will return one new query point for every step in the loop; that's 5 points in total.
#
# In order to create the `TREGO` rule, we use the `BatchTrustRegionBox` class. This class supports
# multiple trust regions, but here we only need one region of type `TREGOBox`. The `TREGOBox` class
# implements the `TREGO` algorithm inside a single trust region. Note: we cover batch trust regions in
# more detail in the next section.
# The trust region rule is created by instantiating the concrete `BatchTrustRegionBox` class. This
# is a "meta" rule that manages the acquisition from multiple possible regions by applying a
# base-rule to each region. The default base-rule is `EfficientGlobalOptimization`, but a different
# base-rule can be provided as an argument to `BatchTrustRegionBox`. Here we explicitly set it to
# make usage clear.
#
# `TREGO` is a "meta" rule that applies a base-rule, either inside a trust region or the whole
# space. The default base-rule is `EfficientGlobalOptimization`, but a different base-rule can be
# provided as an argument to `TREGO`. Here we explicitly set it to make usage clear.
# The regions themselves are implemented as separate classes. The `TREGO` algorithm in this example
# requires a single region that alternates between the full search space and the trust region. This
# is achieved by creating one instance of the `TREGOBox` class, which is responsible for managing
# the state, initialization and update of the region.

# %%
trego_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
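
# %% [markdown]
# The diff elides the remainder of this cell. As a hedged sketch only -- assuming `TREGOBox`
# is constructed from the search space and explicitly passing the default
# `EfficientGlobalOptimization` base-rule, as described above -- the full construction
# would read roughly:

# %%
# Sketch: a single TREGO region managed by the batch trust region "meta" rule.
trego_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
    trieste.acquisition.rule.TREGOBox(search_space),
    rule=trieste.acquisition.rule.EfficientGlobalOptimization(),
)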
@@ -100,7 +119,7 @@ def build_model():
#
# Let's take a look at where we queried the observer, the original query points (crosses), new
# query points (dots) and the optimum point found (purple dot), and where they lie with respect to
# the contours of the Branin.
# the contours of the Branin function.

# %%
from trieste.experimental.plotting import plot_bo_points, plot_function_2d
@@ -125,14 +144,14 @@ def plot_final_result(_dataset: trieste.data.Dataset) -> None:
# %% [markdown]
# We can also visualize the progress of the optimization by plotting the acquisition space at each
# step. This space is either the full search space or the trust region, depending on the step, and
# is shown as a translucent box; with the current optimum point in a region shown in matching
# color.
# is shown as a translucent box. The new query points per region are plotted in matching color.
#
# Note there is only one trust region in this plot, however the rule in the next section will show
# multiple trust regions.
# Note there is only one trust region in this plot, however the rules in the following sections will
# show multiple trust regions.

# %%
import base64
from typing import Optional

import IPython
import matplotlib.pyplot as plt
@@ -144,7 +163,10 @@ def plot_final_result(_dataset: trieste.data.Dataset) -> None:
)


def plot_history(result: trieste.bayesian_optimizer.OptimizationResult) -> None:
def plot_history(
result: trieste.bayesian_optimizer.OptimizationResult,
num_query_points: Optional[int] = None,
) -> None:
frames = []
for step, hist in enumerate(
result.history + [result.final_result.unwrap()]
@@ -154,6 +176,7 @@ def plot_history(result: trieste.bayesian_optimizer.OptimizationResult) -> None:
search_space.lower,
search_space.upper,
hist,
num_query_points=num_query_points,
num_init=num_initial_data_points,
)

@@ -176,36 +199,33 @@ def plot_history(result: trieste.bayesian_optimizer.OptimizationResult) -> None:
# %% [markdown]
# ## Batch trust region rule
#
# Next we demonstrate how to run Bayesian optimization with the batch trust region rule.
# Next we demonstrate how to run Bayesian optimization in a parallel batch with 5 regions.
#
# ### Create the batch trust region acquisition rule
#
# We achieve Bayesian optimization with trust regions by specifying `BatchTrustRegionBox` as the
# acquisition rule.
#
# This rule needs an initial number `num_query_points` of sub-spaces (or trust regions) to be
# provided and performs optimization in parallel across all these sub-spaces. Each region
# contributes one query point, resulting in each acquisition step collecting `num_query_points`
# points overall. As the optimization process continues, the bounds of these sub-spaces are
# dynamically updated. In this example, we create 5 `SingleObjectiveTrustRegionBox` regions. This
# class encapsulates the behavior of a trust region in a single sub-space; being responsible for
# Multiple trust regions are created by providing a list of regions to the `BatchTrustRegionBox`
# rule constructor. In this example, we create 5 `SingleObjectiveTrustRegionBox` regions. This
# class encapsulates the behavior of a single trust region, and is responsible for
# maintaining its own state, initializing it, and updating it after each step.
# Each region contributes one query point, resulting in each acquisition step collecting 5
# points overall. As the optimization process continues, the bounds of these regions are
# dynamically updated.
#
# In addition, `BatchTrustRegionBox` is a "meta" rule that requires the specification of a
# batch acquisition base-rule for performing optimization; for our example we use
# `EfficientGlobalOptimization` coupled with the `ParallelContinuousThompsonSampling` acquisition
# function.
# In order to perform parallel acquisition across multiple regions with the
# `BatchTrustRegionBox` "meta" rule, we need to specify a batch base-rule that supports
# parallel optimization. For our example
# we use `EfficientGlobalOptimization` coupled with the `ParallelContinuousThompsonSampling`
# acquisition function. The batch size is controlled by the `num_query_points` argument.
#
# Note: in this example the number of sub-spaces/regions is equal to the number of batch query
# Note: in this example the number of regions is equal to the number of batch query
# points in the base-rule. This results in each region contributing one query point to the overall
# batch. However, it is possible to generate multiple query points from each region by setting
# `num_query_points` to be a multiple `Q` of the number of regions. In this case, each region will
# `num_query_points` to be a multiple `Q` of the number of regions. In that case, each region will
# contribute `Q` query points to the overall batch (see the sketch after the next cell).

# %%
num_query_points = 5

init_subspaces = [
init_regions = [
trieste.acquisition.rule.SingleObjectiveTrustRegionBox(search_space)
for _ in range(num_query_points)
]
@@ -214,7 +234,7 @@
num_query_points=num_query_points,
)
batch_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
init_subspaces, base_rule
init_regions, base_rule
)
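
# %% [markdown]
# As a hedged illustration of the note above (a hypothetical configuration, not run here),
# each of the 5 regions could instead contribute `Q = 2` points per step by scaling the
# base-rule batch size. The `q_multiple` and `sketch_` names are ours, not Trieste's.

# %%
q_multiple = 2
sketch_base_rule = trieste.acquisition.rule.EfficientGlobalOptimization(
    builder=trieste.acquisition.ParallelContinuousThompsonSampling(),
    # 5 regions x Q = 2 points gives 10 query points per acquisition step.
    num_query_points=q_multiple * len(init_regions),
)
sketch_batch_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
    init_regions, sketch_base_rule
)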

# %% [markdown]
@@ -246,49 +266,68 @@ def plot_history(result: trieste.bayesian_optimizer.OptimizationResult) -> None:
plot_history(result)

# %% [markdown]
# ## Trust region `TurBO` acquisition rule
# ## Trust region `TuRBO` acquisition rule
#
# Finally, we show how to run Bayesian optimization with the `TurBO` algorithm. This is a
# Finally, we show how to run Bayesian optimization with the `TuRBO` algorithm. This is a
# trust region algorithm that uses local models and datasets to approximate the objective function
# within one trust region.
# within their respective trust regions (see <cite data-cite="eriksson2019scalable"/>).
#
# ### Create `TurBO` rule and run optimization loop
# ### Create `TuRBO` rule and run optimization loop
#
# As before, this meta-rule requires the specification of an acquisition base-rule for performing
# optimization within the trust region; for our example we use the `DiscreteThompsonSampling` rule.
# optimization within the trust regions; for our example we use the `DiscreteThompsonSampling` rule.
#
# Note that trieste maintains a global model that is, by default, automatically trained on each
# iteration. However, this global model is unused for `TurBO`; which uses a local model instead.
# As fitting the global model would be redundant and wasteful, we switch its training off by
# setting `fit_model=False` in the `optimize` method.
# We create 2 `TuRBO` trust regions and associated local models by initially copying the global
# model (using `copy_to_local_models`). The optimizer will return `num_query_points` new query
# points for each region in every step of the loop. With 5 steps and 2 regions, that's 30 points in
# total.
#
# Note: this behavior of the base-rule `num_query_points` argument is different from the
# batch-trust-region example above. In the batch-trust-region example, the total number of
# query points returned per step was `num_query_points`. In this example, the total number of query
# points returned per step is `num_query_points * num_regions`. This depends on whether the
# base-rule is run in parallel across all regions or in parallel only within a region. This
# example runs the acquisition sequentially, one region at a time, whereas the
# batch-trust-region case performed the acquisition in parallel across all regions. Fully
# parallel acquisition is only supported when using the `EfficientGlobalOptimization`
# base-rule without local models.

# %%
turbo_acq_rule = trieste.acquisition.rule.TURBO(
search_space, rule=trieste.acquisition.rule.DiscreteThompsonSampling(500, 3)
num_regions = 2
num_query_points = 3

turbo_regions = [
trieste.acquisition.rule.TURBOBox(search_space) for _ in range(num_regions)
]
dts_rule = trieste.acquisition.rule.DiscreteThompsonSampling(
num_search_space_samples=500, num_query_points=num_query_points
)
turbo_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
turbo_regions, dts_rule
)

bo = trieste.bayesian_optimizer.BayesianOptimizer(observer, search_space)

num_steps = 5
result = bo.optimize(
num_steps,
initial_data,
build_model(),
{trieste.observer.OBJECTIVE: initial_data},
trieste.acquisition.utils.copy_to_local_models(build_model(), num_regions),
turbo_acq_rule,
track_state=True,
fit_model=False,
)
dataset = result.try_get_final_dataset()

# %% [markdown]
# ### Visualizing `TurBO` results
# ### Visualizing `TuRBO` results
#
# We display the results as before.

# %%
plot_final_result(dataset)

# %%
plot_history(result)
plot_history(result, num_regions * num_query_points)

# %% [markdown]
# ## LICENSE
8 changes: 5 additions & 3 deletions tests/integration/test_bayesian_optimization.py
@@ -44,7 +44,6 @@
)
from trieste.acquisition.optimizer import generate_continuous_optimizer
from trieste.acquisition.rule import (
TURBO,
AcquisitionRule,
AsynchronousGreedy,
AsynchronousOptimization,
@@ -56,6 +55,7 @@
EfficientGlobalOptimization,
SingleObjectiveTrustRegionBox,
TREGOBox,
TURBOBox,
)
from trieste.acquisition.sampler import ThompsonSamplerFromTrajectory
from trieste.acquisition.utils import copy_to_local_models
@@ -214,7 +214,10 @@ def GPR_OPTIMIZER_PARAMS() -> Tuple[str, List[ParameterSet]]:
),
pytest.param(
10,
TURBO(ScaledBranin.search_space, rule=DiscreteThompsonSampling(500, 3)),
BatchTrustRegionBox(
TURBOBox(ScaledBranin.search_space),
DiscreteThompsonSampling(500, 3),
),
id="Turbo",
),
pytest.param(
@@ -692,7 +695,6 @@ def _test_optimizer_finds_minimum(
minimum_rtol=rtol_level,
minimum_step_number=2,
),
fit_model=not isinstance(acquisition_rule, TURBO),
fit_initial_model=False,
)
