Batch trust regions implementation of TURBO (#791)
* Add support for local models and datasets (WIP)

* Add unit test for local models (WIP)

* Update multi model/dataset test (WIP)

* Add unit test for keep datasets in regions

* Add more tests and move to local tags class

* Always include global dataset in mapping

* Add filter_mask method to trust region

* Add more testing

* Fix mypy model type issues

* Add ask_tell testing

* Fix summary init when only global dataset

* Remove walrus operator

* Update test, ask_tell data not changed in-place

* Add some test comments

* Add some rule comments

* Allow input-multi-observers for batch observer

* Allow multiple models/datasets for base rule

* Support multiple models/datasets in region selects

* Fix TR plotting history colors

* Add notebook init points explanation

* Rename region index and add init param

* WIP

* Add more TURBO implementation + attempt type fixes

* Remove old comment

* Tidy-up redundant expression

* Fix TURBOBox and temp changes in TURBO for match

* Add intermediate box region class

* Keep full datasets along with filtered ones

* Move subspace update to a new rule method

* Add temp notebook for TURBO comparisons

* Save TR subspaces in acquire to re-use later

* Update notebook to use TURBOBox

* Make changes from PR feedback

* Fix rename after merge

* Fix compare notebook after merge

* Move rule create later and tidy filtering dataset

* More testing in notebook

* Remove redundant dataset filtering

* Address some of the recent feedback

* Fix dataset mypy error

* Copy dataset in optimizers to avoid changing it

* Share DatasetChecker and tidy-up exp values in tests

* Address more feedback

* Remove prev TURBO and update tests to use new class

* Remove notebook for testing

* Create dataset and update at start of optim

* Avoid default num_models in integ tests

* Fix old python typing issue

* Address feedback

* Address more comments

* Only copy state with track_state==True

* Add comment explaining copy

* Deepcopy subspace internal to rule

* Keep global datasets unfiltered

* Add notebook intro and improve TREGO text

* Use flatten_... func and add comment

* Improve TR explanations

* Fix merge issues

* Clarify parallel acq comments
khurram-ghani authored Dec 18, 2023
1 parent f766953 commit cf09b0f
Showing 12 changed files with 806 additions and 642 deletions.
141 changes: 90 additions & 51 deletions docs/notebooks/trust_region.pct.py
@@ -17,7 +17,24 @@
# %% [markdown]
# # Trust region Bayesian optimization
#
# We will demonstrate three trust region Bayesian optimization algorithms in this tutorial.
# This notebook guides you through practical examples of trust region Bayesian optimization,
# illustrating algorithms like TREGO <cite data-cite="diouane2022trego"/> and TuRBO
# <cite data-cite="eriksson2019scalable"/> that could be beneficial for optimizing high-dimensional
# spaces. Trieste provides a flexible framework for implementing custom algorithms by encapsulating
# the behavior of rules and regions into separate abstract classes, `BatchTrustRegion` and
# `UpdatableTrustRegion` respectively.
#
# Trust region optimization is a general strategy used in optimization problems where the solution
# space is navigated by fitting a localized model, such as a quadratic approximation, around
# the current point estimate and then using this model to take an optimization step. The region
# where this approximation is considered trustworthy is the "trust region". Within this region, the
# algorithm deems the model's predictions to be reliable enough to base decisions on. After each
# step, the performance of the actual function is evaluated and the trust region is adjusted
# accordingly -- typically expanded if the model was predictive or contracted if it wasn't.
#
# In trust region Bayesian optimization, a probabilistic model, often a Gaussian Process,
# replaces simpler models like quadratics to inform the optimization. This model can be applied
# globally across the entire search space or adjusted to accommodate local regions.
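
# %% [markdown]
# As a purely illustrative aside (this is not Trieste code), the generic loop described
# above can be sketched in a few lines. The quadratic surrogate, the improvement ratio and
# the 0.25/0.75 resize thresholds below are conventional but hypothetical choices.

# %%
import numpy as np  # also imported with the rest of the dependencies in the next cell


def sketch_trust_region_minimize(f, x, radius=1.0, num_steps=20):
    """Sketch only: minimize a 1-D function using a quadratic trust region model."""
    fx = f(x)
    for _ in range(num_steps):
        # Fit a quadratic model to three points spanning the current region.
        xs = np.array([x - radius, x, x + radius])
        a, b, c = np.polyfit(xs, [f(v) for v in xs], deg=2)
        # Take the step that minimizes the model, clipped to the region bounds.
        vertex = -b / (2 * a) if a > 1e-12 else x - radius
        x_new = float(np.clip(vertex, x - radius, x + radius))
        f_new, m_new = f(x_new), np.polyval([a, b, c], x_new)
        # Compare actual vs predicted improvement: expand the region if the
        # model was predictive, contract it if it was not.
        rho = (fx - f_new) / max(fx - m_new, 1e-12)
        radius = radius * 2.0 if rho > 0.75 else radius * 0.5 if rho < 0.25 else radius
        if f_new < fx:  # accept the step only if the true objective improved
            x, fx = x_new, f_new
    return x


sketch_trust_region_minimize(lambda v: (v - 1.5) ** 2 + 0.5, x=5.0)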

# %%
import numpy as np
@@ -63,24 +80,26 @@ def build_model():
# %% [markdown]
# ## Trust region `TREGO` acquisition rule
#
# First we show how to run Bayesian optimization with the `TREGO` algorithm. This is a trust region
# algorithm that alternates between regular EGO steps and local steps within one trust region
# (see <cite data-cite="diouane2022trego"/>).
# First we demonstrate how to run Bayesian optimization with the `TREGO` algorithm, which alternates
# between regular EGO steps and local steps within one trust region (see
# <cite data-cite="diouane2022trego"/>).
#
# ### Create `TREGO` rule and run optimization loop
#
# We can run the Bayesian optimization loop by defining a `BayesianOptimizer` and calling its
# `optimize` method with the trust region rule. Once the optimization loop is complete, the
# optimizer will return one new query point for every step in the loop; that's 5 points in total.
#
# In order to create the `TREGO` rule, we use the `BatchTrustRegionBox` class. This class supports
# multiple trust regions, but here we only need one region of type `TREGOBox`. The `TREGOBox` class
# implements the `TREGO` algorithm inside a single trust region. Note: we cover batch trust regions in
# more detail in the next section.
# The trust region rule is created by instantiating the concrete `BatchTrustRegionBox` class. This
# is a "meta" rule that manages the acquisition from multiple possible regions by applying a
# base-rule to each region. The default base-rule is `EfficientGlobalOptimization`, but a different
# base-rule can be provided as an argument to `BatchTrustRegionBox`. Here we explicitly set it to
# make usage clear.
#
# `TREGO` is a "meta" rule that applies a base-rule, either inside a trust region or the whole
# space. The default base-rule is `EfficientGlobalOptimization`, but a different base-rule can be
# provided as an argument to `TREGO`. Here we explicitly set it to make usage clear.
# The regions themselves are implemented as separate classes. The `TREGO` algorithm in this example
# requires a single region that alternates between the full search space and the trust region. This
# is achieved by creating one instance of the `TREGOBox` class, which is responsible for managing
# the state, initialization and update of the region.

# %%
trego_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
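
# %% [markdown]
# The diff elides the remainder of this cell. As a hedged sketch only -- assuming `TREGOBox`
# is constructed from the search space and explicitly passing the default
# `EfficientGlobalOptimization` base-rule, as described above -- the full construction
# would read roughly:

# %%
# Sketch: a single TREGO region managed by the batch trust region "meta" rule.
trego_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
    trieste.acquisition.rule.TREGOBox(search_space),
    rule=trieste.acquisition.rule.EfficientGlobalOptimization(),
)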
@@ -100,7 +119,7 @@ def build_model():
#
# Let's take a look at where we queried the observer, the original query points (crosses), new
# query points (dots) and the optimum point found (purple dot), and where they lie with respect to
# the contours of the Branin.
# the contours of the Branin function.

# %%
from trieste.experimental.plotting import plot_bo_points, plot_function_2d
@@ -125,14 +144,14 @@ def plot_final_result(_dataset: trieste.data.Dataset) -> None:
# %% [markdown]
# We can also visualize the progress of the optimization by plotting the acquisition space at each
# step. This space is either the full search space or the trust region, depending on the step, and
# is shown as a translucent box; with the current optimum point in a region shown in matching
# color.
# is shown as a translucent box. The new query points per region are plotted in matching color.
#
# Note there is only one trust region in this plot, however the rule in the next section will show
# multiple trust regions.
# Note there is only one trust region in this plot, however the rules in the following sections will
# show multiple trust regions.

# %%
import base64
from typing import Optional

import IPython
import matplotlib.pyplot as plt
@@ -144,7 +163,10 @@ def plot_final_result(_dataset: trieste.data.Dataset) -> None:
)


def plot_history(result: trieste.bayesian_optimizer.OptimizationResult) -> None:
def plot_history(
result: trieste.bayesian_optimizer.OptimizationResult,
num_query_points: Optional[int] = None,
) -> None:
frames = []
for step, hist in enumerate(
result.history + [result.final_result.unwrap()]
@@ -154,6 +176,7 @@ def plot_history(result: trieste.bayesian_optimizer.OptimizationResult) -> None:
search_space.lower,
search_space.upper,
hist,
num_query_points=num_query_points,
num_init=num_initial_data_points,
)

@@ -176,36 +199,33 @@ def plot_history(result: trieste.bayesian_optimizer.OptimizationResult) -> None:
# %% [markdown]
# ## Batch trust region rule
#
# Next we demonstrate how to run Bayesian optimization with the batch trust region rule.
# Next we demonstrate how to run Bayesian optimization in a parallel batch with 5 regions.
#
# ### Create the batch trust region acquisition rule
#
# We achieve Bayesian optimization with trust regions by specifying `BatchTrustRegionBox` as the
# acquisition rule.
#
# This rule needs an initial number `num_query_points` of sub-spaces (or trust regions) to be
# provided and performs optimization in parallel across all these sub-spaces. Each region
# contributes one query point, resulting in each acquisition step collecting `num_query_points`
# points overall. As the optimization process continues, the bounds of these sub-spaces are
# dynamically updated. In this example, we create 5 `SingleObjectiveTrustRegionBox` regions. This
# class encapsulates the behavior of a trust region in a single sub-space; being responsible for
# Multiple trust regions are created by providing a list of regions to the `BatchTrustRegionBox`
# rule constructor. In this example, we create 5 `SingleObjectiveTrustRegionBox` regions. This
# class encapsulates the behavior of a single trust region, and is responsible for
# maintaining its own state, initializing it, and updating it after each step.
# Each region contributes one query point, resulting in each acquisition step collecting 5
# points overall. As the optimization process continues, the bounds of these regions are
# dynamically updated.
#
# In addition, `BatchTrustRegionBox` is a "meta" rule that requires the specification of a
# batch acquisition base-rule for performing optimization; for our example we use
# `EfficientGlobalOptimization` coupled with the `ParallelContinuousThompsonSampling` acquisition
# function.
# In order to perform parallel acquisition across multiple regions with the
# `BatchTrustRegionBox` "meta" rule, we need to specify a batch base-rule that supports
# parallel optimization. For our example
# we use `EfficientGlobalOptimization` coupled with the `ParallelContinuousThompsonSampling`
# acquisition function. The batch size is controlled by the `num_query_points` argument.
#
# Note: in this example the number of sub-spaces/regions is equal to the number of batch query
# Note: in this example the number of regions is equal to the number of batch query
# points in the base-rule. This results in each region contributing one query point to the overall
# batch. However, it is possible to generate multiple query points from each region by setting
# `num_query_points` to be a multiple `Q` of the number of regions. In this case, each region will
# `num_query_points` to be a multiple `Q` of the number of regions. In that case, each region will
# contribute `Q` query points to the overall batch (see the sketch after the next cell).

# %%
num_query_points = 5

init_subspaces = [
init_regions = [
trieste.acquisition.rule.SingleObjectiveTrustRegionBox(search_space)
for _ in range(num_query_points)
]
@@ -214,7 +234,7 @@
num_query_points=num_query_points,
)
batch_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
init_subspaces, base_rule
init_regions, base_rule
)
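
# %% [markdown]
# As a hedged illustration of the note above (a hypothetical configuration, not run here),
# each of the 5 regions could instead contribute `Q = 2` points per step by scaling the
# base-rule batch size. The `q_multiple` and `sketch_` names are ours, not Trieste's.

# %%
q_multiple = 2
sketch_base_rule = trieste.acquisition.rule.EfficientGlobalOptimization(
    builder=trieste.acquisition.ParallelContinuousThompsonSampling(),
    # 5 regions x Q = 2 points gives 10 query points per acquisition step.
    num_query_points=q_multiple * len(init_regions),
)
sketch_batch_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
    init_regions, sketch_base_rule
)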

# %% [markdown]
@@ -246,49 +266,68 @@ def plot_history(result: trieste.bayesian_optimizer.OptimizationResult) -> None:
plot_history(result)

# %% [markdown]
# ## Trust region `TurBO` acquisition rule
# ## Trust region `TuRBO` acquisition rule
#
# Finally, we show how to run Bayesian optimization with the `TurBO` algorithm. This is a
# Finally, we show how to run Bayesian optimization with the `TuRBO` algorithm. This is a
# trust region algorithm that uses local models and datasets to approximate the objective function
# within one trust region.
# within their respective trust regions (see <cite data-cite="eriksson2019scalable"/>).
#
# ### Create `TurBO` rule and run optimization loop
# ### Create `TuRBO` rule and run optimization loop
#
# As before, this meta-rule requires the specification of an acquisition base-rule for performing
# optimization within the trust region; for our example we use the `DiscreteThompsonSampling` rule.
# optimization within the trust regions; for our example we use the `DiscreteThompsonSampling` rule.
#
# Note that trieste maintains a global model that is, by default, automatically trained on each
# iteration. However, this global model is unused for `TurBO`; which uses a local model instead.
# As fitting the global model would be redundant and wasteful, we switch its training off by
# setting `fit_model=False` in the `optimize` method.
# We create 2 `TuRBO` trust regions and associated local models by initially copying the global
# model (using `copy_to_local_models`). The optimizer will return `num_query_points` new query
# points for each region in every step of the loop. With 5 steps and 2 regions, that's 30 points in
# total.
#
# Note: this behavior of the base-rule `num_query_points` argument is different from the
# batch-trust-region example above. In the batch-trust-region example, the total number of
# query points returned per step was `num_query_points`. In this example, the total number of query
# points returned per step is `num_query_points * num_regions`. This depends on whether the
# base-rule is run in parallel across all regions or in parallel only within a region. This
# example runs the acquisition sequentially, one region at a time, whereas the
# batch-trust-region case performed the acquisition in parallel across all regions. Fully
# parallel acquisition is only supported when using the `EfficientGlobalOptimization`
# base-rule without local models.

# %%
turbo_acq_rule = trieste.acquisition.rule.TURBO(
search_space, rule=trieste.acquisition.rule.DiscreteThompsonSampling(500, 3)
num_regions = 2
num_query_points = 3

turbo_regions = [
trieste.acquisition.rule.TURBOBox(search_space) for _ in range(num_regions)
]
dts_rule = trieste.acquisition.rule.DiscreteThompsonSampling(
num_search_space_samples=500, num_query_points=num_query_points
)
turbo_acq_rule = trieste.acquisition.rule.BatchTrustRegionBox(
turbo_regions, dts_rule
)

bo = trieste.bayesian_optimizer.BayesianOptimizer(observer, search_space)

num_steps = 5
result = bo.optimize(
num_steps,
initial_data,
build_model(),
{trieste.observer.OBJECTIVE: initial_data},
trieste.acquisition.utils.copy_to_local_models(build_model(), num_regions),
turbo_acq_rule,
track_state=True,
fit_model=False,
)
dataset = result.try_get_final_dataset()

# %% [markdown]
# ### Visualizing `TurBO` results
# ### Visualizing `TuRBO` results
#
# We display the results as before.

# %%
plot_final_result(dataset)

# %%
plot_history(result)
plot_history(result, num_regions * num_query_points)

# %% [markdown]
# ## LICENSE
8 changes: 5 additions & 3 deletions tests/integration/test_bayesian_optimization.py
@@ -44,7 +44,6 @@
)
from trieste.acquisition.optimizer import generate_continuous_optimizer
from trieste.acquisition.rule import (
TURBO,
AcquisitionRule,
AsynchronousGreedy,
AsynchronousOptimization,
@@ -56,6 +55,7 @@
EfficientGlobalOptimization,
SingleObjectiveTrustRegionBox,
TREGOBox,
TURBOBox,
)
from trieste.acquisition.sampler import ThompsonSamplerFromTrajectory
from trieste.acquisition.utils import copy_to_local_models
@@ -214,7 +214,10 @@ def GPR_OPTIMIZER_PARAMS() -> Tuple[str, List[ParameterSet]]:
),
pytest.param(
10,
TURBO(ScaledBranin.search_space, rule=DiscreteThompsonSampling(500, 3)),
BatchTrustRegionBox(
TURBOBox(ScaledBranin.search_space),
DiscreteThompsonSampling(500, 3),
),
id="Turbo",
),
pytest.param(
@@ -692,7 +695,6 @@ def _test_optimizer_finds_minimum(
minimum_rtol=rtol_level,
minimum_step_number=2,
),
fit_model=not isinstance(acquisition_rule, TURBO),
fit_initial_model=False,
)
