
[Feature] Variational Bayesian last layer models as surrogate models #2754

Open · wants to merge 24 commits into base: main
Conversation

@brunzema (Contributor)

Motivation

This PR adds variational Bayesian last layers (VBLLs) [1], which showed very promising results in the context of BO in our recent paper [2], to BoTorch. The goal is to provide a BoTorch-compatible implementation of VBLL surrogates for standard use cases (single-output models), making them accessible to the community as quickly as possible. This PR does not yet contain all the features discussed in [2], such as continual learning. If there is interest in adding continual learning as well, I am happy to do so down the line!

VBLLs can be used with standard acquisition functions such as (log)EI, but they are especially well suited to Thompson sampling: a Thompson sample of a Bayesian last layer model is a deterministic, differentiable feed-forward neural network, which makes (almost) global optimization of the sample for the next query location straightforward.
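To illustrate why this is convenient, here is a minimal numpy/scipy sketch (all names and values below are illustrative, not the PR's API): a fixed random-feature map stands in for the trained backbone, and a single draw of last-layer weights yields a deterministic sample path that a gradient-based optimizer can maximize.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Illustrative stand-in for a trained backbone: a fixed random-feature map.
W = rng.normal(size=(16, 1))
b = rng.uniform(0, 2 * np.pi, size=16)

def features(x):
    # Maps inputs to the 16-dim "last layer" feature space.
    return np.cos(W @ np.atleast_2d(x).T + b[:, None]).T

# Illustrative posterior over last-layer weights (mean and diagonal std).
w_mean, w_std = np.zeros(16), 0.1 * np.ones(16)

# One Thompson sample of the weights fixes the function completely ...
w_sample = w_mean + w_std * rng.standard_normal(16)

# ... so the sample path is an ordinary deterministic function of x and
# can be handed (negated, for minimization) to a gradient-based optimizer.
def neg_sample_path(x):
    return -(features(x) @ w_sample).item()

res = minimize(
    neg_sample_path,
    x0=rng.uniform(-1.0, 1.0, size=1),
    bounds=[(-1.0, 1.0)],
    method="L-BFGS-B",
)
x_next = res.x  # candidate query location for the next evaluation
```

In the actual model the features come from the trained network, so the sample path is differentiable end to end and can be optimized with multiple restarts.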

Implementation details

This PR adds the implementation to the community folders. Here too, if there is broad interest in the model, I am happy to help merge it into the main part of the repo. The files added in this PR are the following:

botorch_community
|-- acquisition
|   |-- bll_thompson_sampling.py # TS for Bayesian last layer models
|-- models
|   |-- vblls.py # BoTorch wrapper for VBLLs
|-- posteriors
|   |-- bll_posterior.py # Posterior class for Bayesian last layer models
notebooks_community
|-- vbll_thompson_sampling.ipynb # Tutorial on how to use the VBLL model
test_community
|-- models
|   |-- test_vblls.py # Tests for the VBLL model functionality (backbone freezing for feature reuse, etc.)

The current implementation builds directly on the VBLL repo, which is actively maintained and depends only on PyTorch. Using this repo allows upstream improvements, e.g., better variational posterior initialization, to directly benefit BO.

Have you read the Contributing Guidelines on pull requests?

Yes.

Test Plan

The PR does not change any functionality of the current code base. The core functionality of the VBLLs should be covered by test_vblls.py. Let me know if further tests are required.

Related PRs

This PR does not change existing functionality, and I did not see any PRs regarding last layer models in BoTorch. This implementation may also be useful for other BLLs.

References

[1] J. Harrison, J. Willes, J. Snoek. Variational Bayesian Last Layers. International Conference on Learning Representations (ICLR), 2024.

[2] P. Brunzema, M. Jordahn, J. Willes, S. Trimpe, J. Snoek, J. Harrison. Bayesian Optimization via Continual Variational Last Layer Training. International Conference on Learning Representations (ICLR), 2025.

@facebook-github-bot added the CLA Signed label Feb 21, 2025
@brunzema brunzema marked this pull request as ready for review February 25, 2025 08:27
@eytan (Contributor) commented Feb 25, 2025 via email

@Balandat (Contributor)

Thanks for putting this up, @brunzema. The notebook looks great, and I plan to review this PR in more detail over the next day or two.

Regarding the dependency on vbll: Right now it looks like the code only uses ~120 lines of pure torch code from the vbll repo (namely https://github.com/VectorInstitute/vbll/blob/main/vbll/layers/regression.py#L34-L149 plus the minimal function https://github.com/VectorInstitute/vbll/blob/main/vbll/utils/distributions.py#L94-L98). It seems questionable to me to take that dependency, especially since the vbll repo doesn't include unit tests or CI. We spend a nontrivial amount of time fixing issues with downstream dependencies, so we are really careful about adding them only when truly necessary.

My preference would be to move the relevant pieces of the vbll code mentioned above into a helper module (and clearly attribute the source there, of course) so we can avoid the dependency for now. If we do end up expanding the functionality and use additional features from vbll, then I'd be happy to reconsider (provided the vbll repo adds proper unit tests and a CI setup).

@Balandat (Contributor) left a comment

Unit tests are failing because vbll is not installed in the CI. This will not be necessary if we move the minimal required code into a helper module that we include here.

@Balandat (Contributor) left a comment

oops, submitted prematurely. Here is the rest of the review.

"\n",
" ax.plot(x_test, mean, label=\"Posterior predictive\", color=\"tab:blue\")\n",
"\n",
" # Posterior samples\n",
Contributor

The TS samples shown are not the samples that were optimized to obtain the next candidate point. This initially caused some confusion; can you make the plots consistent, i.e., show the set of TSs that were optimized to obtain the next candidate?

Contributor Author

Mhm, yeah, that's true. I have now removed the samples from the plot to avoid this. I could also have BLLMaxPosteriorSampling return the sampled functions, but this use case seemed too specific. Do you think this is also relevant beyond visualization? Then this might be nice, imo.

@brunzema (Contributor Author)

@eytan Thank you for the nice comment and @Balandat thank you for the detailed review!

Just wanted to quickly give an update on the vblls: I talked to my collaborators, and we are happy to move the relevant part (regression layer + some utils) into botorch. I will update this PR to incorporate all suggested changes within the next few days and include the vbll code 👍

@brunzema (Contributor Author)

@Balandat Thank you again for the detailed review--I really appreciate it! I’ve updated the PR to address all your points, but please let me know if any further changes are needed. I’m happy to make any additional updates!

The biggest change is of course the added code from the vbll package; let me know if the way I have now included it is ok (additional file + credit at the top).

@Balandat (Contributor)

@brunzema thanks a lot for the updates - will review this within the next couple of days!


codecov bot commented Mar 10, 2025

Codecov Report

Attention: Patch coverage is 92.52874% with 39 lines in your changes missing coverage. Please review.

Project coverage is 99.79%. Comparing base (39dd171) to head (ae1dca7).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
botorch_community/models/vbll_helper.py 85.82% 37 Missing ⚠️
...rch_community/acquisition/bll_thompson_sampling.py 96.66% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##              main    #2754      +/-   ##
===========================================
- Coverage   100.00%   99.79%   -0.21%     
===========================================
  Files          206      211       +5     
  Lines        18599    19197     +598     
===========================================
+ Hits         18599    19158     +559     
- Misses           0       39      +39     


@Balandat (Contributor) left a comment
Thanks for the updates and in particular for avoiding another external dependency! I left a number of inline comments, here are some higher level ones:

  1. Please rebase this onto a recent version of main (the base commit of this PR is pretty old).
  2. Please address the flake8 (incl. line length) and import sorting errors - you can install the pre-commit hooks to make sure that your code conforms to the standard (see https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pre-commit-hooks).
  3. I spotted some rather problematic numerical code in the vbll helpers; let's update that (highlighted inline), shall we?

@Balandat (Contributor)

@brunzema checking in here, anything needed to get this over the finish line? Seems like we're very close.

@brunzema (Contributor Author)

@Balandat no, the delay is fully on me, sorry! Everything is pretty much done: flake8 is addressed, and with the import sorting plus `from __future__ import annotations` I no longer need the not-so-nice TYPE_CHECKING condition 👍 I will do a fresh rebase and a final pass tomorrow, so hopefully by morning your time I will have submitted the updated PR :)

@Balandat (Contributor)

Excellent!

@brunzema brunzema requested a review from Balandat March 27, 2025 14:12
@brunzema
@brunzema (Contributor Author)

@Balandat I updated the PR. Not sure why I thought the importing was fixed yesterday—I still had the type checking in place. I noticed that some files in the main repo also use it, so it should be fine?

That said, I think a cleaner approach (which I’ve implemented in the updated PR) is to extract the abstract Bayesian last-layer (BLL) class and use it as a parent class/interface. This should also make it more extensible for future BLL models.

botorch_community
|-- acquisition
|   |-- bll_thompson_sampling.py # AbstractBLL to specify "interface"
|-- models
|   |-- blls.py # Define AbstractBLL model
|   |-- vblls.py # VBLL inherit from AbstractBLL
|-- posteriors
|   |-- bll_posterior.py # AbstractBLL to specify "interface"
...
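A minimal sketch of this interface pattern (class and method names here are hypothetical, not the actual PR API) could look like the following: an abstract base class declares what any Bayesian last layer model must provide, and concrete models such as the VBLL subclass it.

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of an abstract BLL interface; the names are
# illustrative, not the PR's actual class or method names.
class AbstractBLL(ABC):
    """Interface that concrete Bayesian last layer models implement."""

    @abstractmethod
    def posterior(self, X):
        """Return the posterior at the query points X."""

    @abstractmethod
    def sample_function(self):
        """Return one Thompson sample as a deterministic callable."""

class ToyBLL(AbstractBLL):
    # Trivial concrete model, just to show the subclassing contract.
    def posterior(self, X):
        return [0.0 for _ in X]

    def sample_function(self):
        return lambda x: x
```

Acquisition and posterior code can then type against the abstract class, so future BLL variants only need to implement the declared methods.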

@Balandat (Contributor) left a comment

Thanks! Overall this looks great.

Before I merge this in, could you please increase the unit test coverage? It looks like quite a lot of things are not covered by tests, including some important parts such as BLLPosterior.rsample().

The tutorial failure is unrelated, we'll fix this on our end.

@brunzema (Contributor Author)

brunzema commented Apr 9, 2025

hey @Balandat, this again took a while, sorry about that! I tried to push the test coverage to 100%. The only places I am unsure about are the following:

  1. After our short discussion, I have now added an error to the numerical optimization of the posterior sample paths (see #2754 (comment)):
def _optimize_sample_path():
    ...

    optimization_successful = False
    for j in range(num_restarts):
        # map to bounds
        x0 = lb + (ub - lb) * x0s[j]

        # optimize sample path
        res = scipy.optimize.minimize(
            func, x0, jac=grad_func, bounds=bounds, method="L-BFGS-B"
        )

        # check if optimization was successful
        if res.success:
            optimization_successful = True
        else:
            logger.warning(f"Optimization failed with message: {res.message}")

        # store the candidate
        X_cand[j, :] = torch.from_numpy(res.x).to(dtype=torch.float64)
        Y_cand[j] = torch.tensor([-res.fun], dtype=torch.float64)

    if not optimization_successful:
        raise RuntimeError("All optimization attempts on the sample path failed.")
        

Here I am unsure how to actually test this, as it has never happened to me before. I am inclined to put a # pragma: no cover here. What do you think?

  2. Should I write a separate test for the vbll helpers? The uncovered lines are essentially all properties, and the important parts (the different parameterizations of the VBLL head) are now all covered in the VBLL test. What is your opinion? Would the best place be in utils?

@Balandat (Contributor)

Balandat commented Apr 9, 2025

Awesome, thanks for pushing this along. Once we cover the last few lines with tests this can go in.

Here I am unsure how to actually test this as this never happened to be before. I am inclined to put a # pragma: no cover here. What do you think?

I'd recommend having a lightweight mocked test that mocks out scipy.optimize.minimize to return a mocked OptimizeResult where success = False.
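The suggested mocking pattern might look like the following sketch; `optimize_with_restarts` is a stand-in for the actual sampling routine, which the real test would import and call instead.

```python
from types import SimpleNamespace
from unittest import mock

import numpy as np
import scipy.optimize

# Stand-in for the code under test: restarts the optimizer and raises
# if no restart succeeds (mirrors the snippet quoted above).
def optimize_with_restarts(func, x0s):
    success = False
    for x0 in x0s:
        res = scipy.optimize.minimize(func, x0, method="L-BFGS-B")
        success = success or res.success
    if not success:
        raise RuntimeError("All optimization attempts on the sample path failed.")

# A fake OptimizeResult that always reports failure.
failed = SimpleNamespace(
    success=False, message="mocked failure", x=np.zeros(1), fun=0.0
)

# Patch minimize so every restart "fails", then assert the error path fires.
with mock.patch.object(scipy.optimize, "minimize", return_value=failed):
    try:
        optimize_with_restarts(lambda x: float(x[0] ** 2), [np.zeros(1)] * 3)
        raised = False
    except RuntimeError:
        raised = True

assert raised  # the failure branch is now exercised by the test
```

Because `minimize` is replaced entirely, the test is fast and deterministic, and the objective is never actually evaluated.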

Should I write a separate test for the vbll helpers? The not covered lines are essentially all properties and the important parts (different parameterization of the VBLL head) are now all covered in the VBLL test. What is your opinion? Best place would be in utils?

Yes, having those tests separately would be great. Looks like the misses are mainly in botorch_community/models/vbll_helper.py, so I would just test these in a new test_community/models/test_vbll_helper.py module.
