[Feature] Variational Bayesian last layer models as surrogate models #2754
base: main
Conversation
Edit: Max also responded. As he mentions, dependencies are a pain. Since this isn't much code, it makes sense to copy it over, and we can always reconsider if there is substantial new functionality and more extensive unit tests.
Thanks for this contribution, Paul! This was a very intriguing paper and it's great to see it in BoTorch.
We typically don't take on new external dependencies, but if it's pure PyTorch that should be OK, as long as it's possible for your package to have some sort of unit or integration test that ensures any future changes to your package do not cause problems for the BoTorch implementation/functionality demonstrated in the notebook.
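For illustration, a guard test along these lines could pin down that integration surface. This is a minimal sketch assuming the dependency were kept; `vbll.Regression`, its constructor arguments, and the `.predictive` attribute are assumptions about the upstream API, not a verified signature:

```python
import torch
import vbll  # the external dependency under discussion


def test_vbll_regression_smoke():
    # Exercise the small slice of vbll that BoTorch would rely on, so that
    # breaking upstream changes surface here first. The constructor arguments
    # and the `.predictive` attribute are assumed, not verified.
    layer = vbll.Regression(32, 1, regularization_weight=1.0)
    out = layer(torch.randn(8, 32))
    # BoTorch needs a predictive distribution exposing mean/variance.
    assert out.predictive.mean.shape[0] == 8
```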
On Fri, Feb 21, 2025 at 9:57 AM Paul Brunzema wrote:
Motivation
This PR adds variational Bayesian last layers (VBLLs) [1], which demonstrated very promising results in the context of BO in our last paper [2], to BoTorch. The goal is to provide a BoTorch-compatible implementation of VBLL surrogates for standard use cases (single-output models), making them accessible to the community as quickly as possible. This PR does not yet contain all the features discussed in [2], such as continual learning. If there is interest in adding continual learning as well, I am happy to do so down the line!
The VBLLs can be used in standard acquisition functions such as (log)EI, but they are especially nice for Thompson sampling, as the Thompson sample of a Bayesian last layer model is a differentiable standard feed-forward neural network, which is useful for (almost) global optimization of the sample for the next query location.
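To make this concrete, here is a minimal sketch (not code from this PR; the backbone architecture and posterior parameters below are invented for the example) of why a Thompson sample of a Bayesian last layer model is an ordinary differentiable network:

```python
import torch
import torch.nn as nn

# Deterministic feature extractor plus a Gaussian posterior over the
# last-layer weights (all numbers here are illustrative).
backbone = nn.Sequential(nn.Linear(1, 64), nn.ELU(), nn.Linear(64, 32), nn.ELU())
w_mean = torch.zeros(32)      # posterior mean of the last-layer weights
w_chol = 0.1 * torch.eye(32)  # Cholesky factor of the posterior covariance

# A Thompson sample fixes a single draw of the last-layer weights ...
w_sample = w_mean + w_chol @ torch.randn(32)

def sample_path(x: torch.Tensor) -> torch.Tensor:
    # ... so the sampled function is a standard feed-forward network that is
    # differentiable w.r.t. its input and cheap to optimize with gradients.
    return backbone(x) @ w_sample

x = torch.rand(10, 1, requires_grad=True)
sample_path(x).sum().backward()  # gradients w.r.t. the query locations
```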
Implementation details
This PR adds the implementation to the community folders. Also here, if there is large interest in the model, I am happy to help merge it into the main part of the repo. The added files of this PR are the following:
```
botorch_community
|-- acquisition
|   |-- bll_thompson_sampling.py   # TS for Bayesian last layer models
|-- models
|   |-- vblls.py                   # BoTorch wrapper for VBLLs
|-- posteriors
|   |-- bll_posterior.py           # Posterior class for Bayesian last layer models
notebooks_community
|-- vbll_thompson_sampling.ipynb   # Tutorial on how to use the VBLL model
test_community
|-- models
|   |-- test_vblls.py              # Tests for the VBLL model functionality (backbone freezing for feature reuse, etc.)
```
The current implementation builds directly on the VBLL repo <https://github.com/VectorInstitute/vbll>, which is actively maintained and depends only on PyTorch. Using this repo allows improvements (e.g., better variational posterior initialization) to directly benefit BO.
Have you read the Contributing Guidelines on pull requests <https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests>?
Yes.
Test Plan
The PR does not change any functionality of the current code base. The core functionality of the VBLLs should be covered by test_vblls.py. Let me know if further tests are required.
Related PRs
This PR does not change functionality, and I did not see any PRs regarding last layer models in BoTorch. Maybe this implementation can also be useful for other BLLs.
References
[1] J. Harrison, J. Willes, J. Snoek. Variational Bayesian Last Layers <https://arxiv.org/abs/2404.11599>. International Conference on Learning Representations (ICLR), 2024.
[2] P. Brunzema, M. Jordahn, J. Willes, S. Trimpe, J. Snoek, J. Harrison. Bayesian Optimization via Continual Variational Last Layer Training <https://arxiv.org/abs/2412.09477>. International Conference on Learning Representations (ICLR), 2025.
Commit Summary
- f599471: add vbll surrogate and notebook
- 6b82184: update notebook and base implementation
- 1b0f899: update optim config
- 2606ac3: add tests for vbll model
- 873163f: update tutorial for vblls
File Changes (6 files: https://github.com/pytorch/botorch/pull/2754/files)
- A botorch_community/acquisition/bll_thompson_sampling.py (133)
- A botorch_community/models/vblls.py (418)
- A botorch_community/posteriors/__init__.py (4)
- A botorch_community/posteriors/bll_posterior.py (54)
- A notebooks_community/vbll_thompson_sampling.ipynb (803)
- A test_community/models/test_vblls.py (186)
Patch Links:
- https://github.com/pytorch/botorch/pull/2754.patch
- https://github.com/pytorch/botorch/pull/2754.diff
Thanks for putting this up, @brunzema. The notebook looks great, and I plan to review this PR in more detail over the next day or two.

Regarding the dependency on vbll: right now it looks like the code only uses ~120 lines of pure torch code from the vbll repo. My preference would be to move those relevant pieces of the vbll code into a helper module (and clearly attribute the source there, of course) so we can avoid the dependency for now. If we do end up expanding the functionality and using additional features from vbll, then I'd be happy to reconsider (provided the vbll repo adds proper unit tests and a CI setup).
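A minimal sketch of what the top of such a vendored helper module could look like; the file name and wording are placeholders, not what this PR ended up using:

```python
# botorch_community/models/vbll_helper.py  (hypothetical file name)
#
# The code below is copied, with minor modifications, from the vbll package
# (https://github.com/VectorInstitute/vbll) to avoid taking on an external
# dependency. All credit to the original authors; see the upstream repo for
# the maintained implementation and its license.
```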
Unit tests are failing because vbll is not installed in the CI. This will not be necessary if we move the minimal code required into a helper module that we include here.
Oops, submitted prematurely. Here is the rest of the review.
"\n", | ||
" ax.plot(x_test, mean, label=\"Posterior predictive\", color=\"tab:blue\")\n", | ||
"\n", | ||
" # Posterior samples\n", |
The TS samples shown are not the samples that were optimized to obtain the next candidate point. This initially caused some confusion; can you make the plots consistent, i.e., show the set of TSs that were actually optimized to obtain the next candidate?
Mhm, yeah, that's true. I have now removed the samples from the plot to avoid this. I could also extend BLLMaxPosteriorSampling to return the sampled functions, but this use case seemed too specific. Do you think this is also relevant beyond visualization? Then this might be nice, imo.
@eytan Thank you for the nice comment and @Balandat thank you for the detailed review! Just wanted to quickly give an update on the vblls: I talked to my collaborators and we are happy to move the relevant part (regression layer + some utils) to botorch. I will update this PR to incorporate all suggested changes within the next few days + include the vbll code 👍
@Balandat Thank you again for the detailed review--I really appreciate it! I've updated the PR to address all your points, but please let me know if any further changes are needed. I'm happy to make any additional updates! The biggest change is of course the added code from the vbll package; let me know if the way I have now included it is OK (additional file + credit at the top).
@brunzema thanks a lot for the updates - will review this within the next couple of days!
Codecov Report
Attention: Patch coverage is 93.48%, with 39 lines in your changes missing coverage.
Additional details and impacted files:

```
@@            Coverage Diff            @@
##             main    #2754     +/-  ##
=========================================
- Coverage  100.00%   99.79%   -0.21%
=========================================
  Files         206      211       +5
  Lines       18599    19197     +598
=========================================
+ Hits        18599    19158     +559
- Misses          0       39      +39
```

View full report in Codecov by Sentry.
Thanks for the updates and in particular for avoiding another external dependency! I left a number of inline comments; here are some higher-level ones:
- Please rebase this on a recent version of master (the base commit of this PR is pretty old).
- Please address the flake8 (incl. line length) and import sorting errors. You can install the pre-commit hooks to make sure that your code conforms to the standard (see https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pre-commit-hooks).
- I spotted some rather problematic numerical code in the vbll helpers; let's update that (highlighted inline), shall we?
@brunzema checking in here, anything needed to get this over the finish line? Seems like we're very close.
@Balandat no, the delay is fully on me, sorry! Everything is pretty much done; flake8 is addressed, and also the import sorting +
Excellent!
@Balandat I updated the PR. Not sure why I thought the importing was fixed yesterday; I still had the type checking in place. I noticed that some files in the main repo also use it, so it should be fine? That said, I think a cleaner approach (which I've implemented in the updated PR) is to extract the abstract Bayesian last-layer (BLL) class and use it as a parent class/interface. This should also make it more extensible for future BLL models.
Thanks! Overall this looks great.
Before I merge this in, could you please increase unit test coverage? It looks like quite a lot of things are not covered by tests, including some important parts such as BLLPosterior.rsample().
The tutorial failure is unrelated; we'll fix this on our end.
Hey @Balandat, again took a while--sorry about that! I tried to push the test coverage to 100%. The only place I am unsure about is the following:
```python
def _optimize_sample_path():
    ...
    optimization_successful = False
    for j in range(num_restarts):
        # map to bounds
        x0 = lb + (ub - lb) * x0s[j]

        # optimize sample path
        res = scipy.optimize.minimize(
            func, x0, jac=grad_func, bounds=bounds, method="L-BFGS-B"
        )

        # check if optimization was successful; warn on failed restarts
        if res.success:
            optimization_successful = True
        else:
            logger.warning(f"Optimization failed with message: {res.message}")

        # store the candidate
        X_cand[j, :] = torch.from_numpy(res.x).to(dtype=torch.float64)
        Y_cand[j] = torch.tensor([-res.fun], dtype=torch.float64)

    if not optimization_successful:
        raise RuntimeError("All optimization attempts on the sample path failed.")
```
Here I am unsure how to actually test this, as it has never happened to me before. I am inclined to put a
Awesome, thanks for pushing this along. Once we cover the last few lines with tests this can go in.
I'd recommend having a lightweight mocked test that mocks out
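A minimal sketch of such a mocked test, assuming pytest, that the call being mocked out is scipy.optimize.minimize, and that the helper's import path and call arguments (shown as placeholders) are filled in:

```python
from unittest import mock

import numpy as np
import pytest
import scipy.optimize


def test_all_restarts_fail_raises():
    # Make every L-BFGS-B restart report failure so the RuntimeError branch
    # runs without constructing a pathological objective. res.x must be a
    # numpy array, since the loop stores it via torch.from_numpy even for
    # failed restarts.
    failed = scipy.optimize.OptimizeResult(
        x=np.zeros(1), fun=0.0, success=False, message="mocked failure"
    )
    with mock.patch("scipy.optimize.minimize", return_value=failed):
        with pytest.raises(RuntimeError, match="All optimization attempts"):
            _optimize_sample_path(...)  # placeholder for the helper's real args
```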
Yes, having those tests separately would be great. Looks like the misses are mainly in