
Composition constraint not satisfied? #71

Open

henkjanvanmanen opened this issue Oct 7, 2024 · 9 comments

@henkjanvanmanen

Hello,

I like the Honegumi philosophy and decided to try to reproduce the results from the tutorial Multi Objective Optimization of Polymers for Strength and Biodegradability using the Google Colab link (mobo.ipynb) provided on the Honegumi website.

When running ax_client.get_trials_data_frame() after the tutorial's 35 trials (Sobol and BoTorch) and inspecting the resulting dataframe, I noticed that the composition constraint (the sum of the x1-x5 feature values should equal 1.0) was not satisfied for the large majority of trials.
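
For reference, a minimal sketch of the check I ran (column names follow the tutorial; the tolerance is my own choice):

```python
import numpy as np

# Pull all trials (Sobol + BoTorch) into a dataframe; columns include x1..x5.
df = ax_client.get_trials_data_frame()

# The composition constraint requires x1 + x2 + x3 + x4 + x5 == 1.0.
comp_sum = df[["x1", "x2", "x3", "x4", "x5"]].sum(axis=1)
violations = ~np.isclose(comp_sum, 1.0)
print(f"{violations.sum()} of {len(df)} trials violate the constraint")
```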

Can someone confirm that this is indeed the case?
Could there be an issue with the way the composition constraint is currently defined/coded?

Thanks for any feedback/help!

@henkjanvanmanen (Author)

I'm digging a bit deeper into the code. Could it be that the ax_client.get_trials_data_frame() dataframe does not show the updated x5 (1.0 - (x1+x2+x3+x4))? In that case the sum only appears not to add up to one in the dataframe, while the composition constraint is in fact correctly applied.

@sgbaird (Owner) commented Oct 8, 2024

> I'm digging a bit deeper into the code. Could it be that the ax_client.get_trials_data_frame() dataframe does not show the updated x5 (1.0 - (x1+x2+x3+x4))? In that case the sum only appears not to add up to one in the dataframe, while the composition constraint is in fact correctly applied.

Thanks for catching this! @AndrewFalkowski, could you update the tutorial so that x5 is "hidden" from Ax's perspective? (I.e., it shouldn't appear in the search space; it gets calculated outside of Ax before the parameters are passed into the objective function.)
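
For concreteness, a minimal sketch of that pattern (the experiment name, bounds, and objective names are assumptions; measure_properties is the tutorial's dummy objective, whose exact signature and return shape may differ):

```python
from ax.service.ax_client import AxClient, ObjectiveProperties

ax_client = AxClient(random_seed=12345)
ax_client.create_experiment(
    name="polymer_mobo",  # hypothetical name
    # Only x1..x4 are exposed to Ax; x5 is implied by the composition.
    parameters=[
        {"name": f"x{i}", "type": "range", "bounds": [0.0, 1.0]}
        for i in range(1, 5)
    ],
    # Keep the exposed fractions from exceeding the total composition.
    parameter_constraints=["x1 + x2 + x3 + x4 <= 1.0"],
    objectives={
        "strength": ObjectiveProperties(minimize=False),
        "biodegradability": ObjectiveProperties(minimize=False),
    },
)

params, trial_index = ax_client.get_next_trial()
# x5 is computed outside of Ax, so the composition always sums to 1.0.
x5 = 1.0 - (params["x1"] + params["x2"] + params["x3"] + params["x4"])
results = measure_properties({**params, "x5": x5})
ax_client.complete_trial(trial_index=trial_index, raw_data=results)
```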

@henkjanvanmanen (Author)

> Thanks for catching this! @AndrewFalkowski, could you update the tutorial so that x5 is "hidden" from Ax's perspective? (I.e., it shouldn't appear in the search space; it gets calculated outside of Ax before the parameters are passed into the objective function.)

Great! I really like the tutorials with examples from the materials field, thanks a lot!

@henkjanvanmanen (Author)

Hi,

I'm still trying to wrap my head around how the composition constraint is currently applied in Honegumi/Ax. Using the example from above (where x5 = 1.0 - (x1+x2+x3+x4)), I understand that the updated x5 is submitted to the dummy objective function (called "measure_properties" in the Honegumi tutorial) in order to calculate the (dummy) response. However, it seems that the AxClient instance itself does not update x5 and so continues to work with x1-x5 values that do not satisfy the composition constraint of 1.0. Doesn't this affect the calculation of newly suggested trials via Bayesian optimization? Maybe this is clearly not an issue and my question simply stems from my lack of knowledge of Bayesian optimization (I'm very new to it), but I wanted to ask.
Thanks!

@sgbaird (Owner) commented Nov 13, 2024

Hi @henkjanvanmanen, I just finished updating the tutorial (sorry for the delay). This should now be self-consistent. Could you have a look and let me know if this makes sense now?

sgbaird reopened this Nov 13, 2024
@sgbaird (Owner) commented Nov 13, 2024

@henkjanvanmanen, facebook/Ax#727 might also provide some helpful context.

@henkjanvanmanen (Author)

Hi @sgbaird, thanks for updating the mobo tutorial. Looking at the updated notebook, the ax_client instance now never sees/uses the x5 parameter, correct?

I ran the updated notebook both in Colab and in my local environment (both with the same AxClient random_seed of 12345). The output from Colab is shown below (output from running in my local environment is quite similar).

[image: Pareto front plot from the updated notebook run in Colab]

What I notice is that the location of the Pareto front differs from the image shown on the Honegumi site: the Bayesian optimization now finds lower biodegradability values (max. around 11) than before (max. around 16). What could explain this? And do you see this as well?

@sgbaird (Owner) commented Nov 16, 2024

Since the search space changed, the optimization won't be identical. Also, it's a single search campaign, so there's stochasticity in the performance. If you want to dig in further, it would probably be best to do two things:

  1. Create a massive set of parameter combinations (e.g., 10k-100k; I have an example that I'll try to dig up) and use it to approximate the true Pareto front (see the sketch below). Keep in mind the objective function was made up to keep things simple while still illustrating the concept.
  2. Run repeat campaigns.

This gets into the topic of robust benchmarking. Given that the objective function is made up, I'm not sure it's worth spending a lot of time on this specific task.
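
As a rough illustration of point 1, a sketch of approximating the Pareto front by dense sampling of the composition simplex (here evaluate is a hypothetical wrapper around the tutorial's measure_properties; adapt it to the actual signature):

```python
import numpy as np

rng = np.random.default_rng(0)

# Dirichlet samples are uniform on the simplex, so every row of X
# satisfies x1 + x2 + x3 + x4 + x5 == 1.0 by construction.
X = rng.dirichlet(np.ones(5), size=10_000)

# Hypothetical wrapper returning (strength, biodegradability) per row.
Y = np.array([evaluate(x) for x in X])

def pareto_mask(Y):
    """Boolean mask of non-dominated rows, assuming both objectives are maximized."""
    mask = np.ones(len(Y), dtype=bool)
    for i, y in enumerate(Y):
        if mask[i]:
            # y is knocked out if some point is >= in all objectives and > in one.
            dominated_by = np.all(Y >= y, axis=1) & np.any(Y > y, axis=1)
            mask[i] = not dominated_by.any()
    return mask

front = Y[pareto_mask(Y)]
print(f"Approximate Pareto front: {len(front)} of {len(Y)} points")
```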

