
Composition constraint not satisfied? #71

Open

henkjanvanmanen opened this issue Oct 7, 2024 · 9 comments

@henkjanvanmanen

Hello,

I like the Honegumi philosophy and decided to try to reproduce the results from the tutorial Multi Objective Optimization of Polymers for Strength and Biodegradability using the Google Colab link (mobo.ipynb) provided on the Honegumi website.

When running ax_client.get_trials_data_frame() after the tutorial's 35 trials (Sobol and BoTorch) and inspecting the resulting dataframe, I noticed that the composition constraint (the sum of the x1-x5 feature values should equal 1.0) was not satisfied for the large majority of trials.
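
For reference, a minimal sketch of the check I ran (column names follow the tutorial; the tolerance is my own choice):

```python
import numpy as np

# Pull all trials (Sobol + BoTorch) into a dataframe; columns include x1..x5.
df = ax_client.get_trials_data_frame()

# The composition constraint requires x1 + x2 + x3 + x4 + x5 == 1.0.
comp_sum = df[["x1", "x2", "x3", "x4", "x5"]].sum(axis=1)
violations = ~np.isclose(comp_sum, 1.0)
print(f"{violations.sum()} of {len(df)} trials violate the constraint")
```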

Can someone confirm that this is indeed the case?
Could there be an issue with the way the composition constraint is currently defined/coded?

Thanks for any feedback/help!

@henkjanvanmanen (Author)

I'm digging a bit deeper into the code. Could it be that the ax_client.get_trials_data_frame() dataframe does not show the updated x5 (1.0 - (x1+x2+x3+x4))? In that case the sum only appears not to add up to one in the dataframe, while the composition constraint is in fact correctly applied.

@sgbaird (Owner) commented Oct 8, 2024

> I'm digging a bit deeper into the code. Could it be that the ax_client.get_trials_data_frame() dataframe does not show the updated x5 (1.0 - (x1+x2+x3+x4))? In that case the sum only appears not to add up to one in the dataframe, while the composition constraint is in fact correctly applied.

Thanks for catching this! @AndrewFalkowski, could you update the tutorial so that x5 is "hidden" from Ax's perspective? (I.e., it shouldn't appear in the search space; it gets calculated outside of Ax before the parameters are passed into the objective function.)
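
For concreteness, a minimal sketch of that pattern (the experiment name, bounds, and objective names are assumptions; measure_properties is the tutorial's dummy objective, whose exact signature and return shape may differ):

```python
from ax.service.ax_client import AxClient, ObjectiveProperties

ax_client = AxClient(random_seed=12345)
ax_client.create_experiment(
    name="polymer_mobo",  # hypothetical name
    # Only x1..x4 are exposed to Ax; x5 is implied by the composition.
    parameters=[
        {"name": f"x{i}", "type": "range", "bounds": [0.0, 1.0]}
        for i in range(1, 5)
    ],
    # Keep the exposed fractions from exceeding the total composition.
    parameter_constraints=["x1 + x2 + x3 + x4 <= 1.0"],
    objectives={
        "strength": ObjectiveProperties(minimize=False),
        "biodegradability": ObjectiveProperties(minimize=False),
    },
)

params, trial_index = ax_client.get_next_trial()
# x5 is computed outside of Ax, so the composition always sums to 1.0.
x5 = 1.0 - (params["x1"] + params["x2"] + params["x3"] + params["x4"])
results = measure_properties({**params, "x5": x5})
ax_client.complete_trial(trial_index=trial_index, raw_data=results)
```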

@henkjanvanmanen (Author)

> Thanks for catching this! @AndrewFalkowski, could you update the tutorial so that x5 is "hidden" from Ax's perspective? (I.e., it shouldn't appear in the search space; it gets calculated outside of Ax before the parameters are passed into the objective function.)

Great! I really like the tutorials with examples from the materials field, thanks a lot!

@henkjanvanmanen (Author)

Hi,

I'm still trying to wrap my head around how the composition constraint is currently applied in Honegumi/Ax. Using the example from above (where x5 = 1.0 - (x1+x2+x3+x4)), I understand that the updated x5 is submitted to the dummy objective function (called "measure_properties" in the Honegumi tutorial) in order to calculate the (dummy) response. However, it seems that the AxClient instance itself does not update x5 and so continues to work with x1-x5 values that do not satisfy the composition constraint of 1.0. Doesn't this affect the calculation of newly suggested trials via Bayesian optimization? Maybe this is clearly not an issue and my question simply stems from my lack of knowledge of Bayesian optimization (I'm very new to it), but I wanted to ask.
Thanks!

@sgbaird (Owner) commented Nov 13, 2024

Hi @henkjanvanmanen, I just finished updating the tutorial (sorry for the delay). This should now be self-consistent. Could you have a look and let me know if this makes sense now?

sgbaird reopened this Nov 13, 2024
@sgbaird (Owner) commented Nov 13, 2024

@henkjanvanmanen, facebook/Ax#727 might also provide some helpful context.

@henkjanvanmanen (Author)

Hi @sgbaird, thanks for updating the mobo tutorial. Looking at the updated notebook, the ax_client instance now never sees/uses the x5 parameter, correct?

I ran the updated notebook both in Colab and in my local environment (both with the same AxClient random_seed of 12345). The output from Colab is shown below (output from running in my local environment is quite similar).

[image: Pareto front plot from the updated notebook run in Colab]

What I notice is that the location of the Pareto front differs from the image shown on the Honegumi site: the Bayesian optimization now finds lower biodegradability values (max. around 11) than before (max. around 16). What could explain this? And do you see this as well?

@sgbaird (Owner) commented Nov 16, 2024

Since the search space changed, the optimization won't be identical. Also, it's a single search campaign, so there's stochasticity in the performance. If you want to dig in further, it would probably be best to do two things:

  1. Create a massive set of parameter combinations (e.g., 10k-100k; I have an example that I'll try to dig up) and use it to approximate the true Pareto front (see the sketch below). Keep in mind the objective function was made up to keep things simple while still illustrating the concept.
  2. Run repeat campaigns.

This gets into the topic of robust benchmarking. Given that the objective function is made up, I'm not sure it's worth spending a lot of time on this specific task.
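
As a rough illustration of point 1, a sketch of approximating the Pareto front by dense sampling of the composition simplex (here evaluate is a hypothetical wrapper around the tutorial's measure_properties; adapt it to the actual signature):

```python
import numpy as np

rng = np.random.default_rng(0)

# Dirichlet samples are uniform on the simplex, so every row of X
# satisfies x1 + x2 + x3 + x4 + x5 == 1.0 by construction.
X = rng.dirichlet(np.ones(5), size=10_000)

# Hypothetical wrapper returning (strength, biodegradability) per row.
Y = np.array([evaluate(x) for x in X])

def pareto_mask(Y):
    """Boolean mask of non-dominated rows, assuming both objectives are maximized."""
    mask = np.ones(len(Y), dtype=bool)
    for i, y in enumerate(Y):
        if mask[i]:
            # y is knocked out if some point is >= in all objectives and > in one.
            dominated_by = np.all(Y >= y, axis=1) & np.any(Y > y, axis=1)
            mask[i] = not dominated_by.any()
    return mask

front = Y[pareto_mask(Y)]
print(f"Approximate Pareto front: {len(front)} of {len(Y)} points")
```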

