
Score-based iid sampling #1381

Open · wants to merge 26 commits into main

Conversation

@manuelgloeckler (Contributor) commented Jan 30, 2025

Completes the missing features for score estimation from #1226 (a minimal usage sketch follows the list below):

  • IID interface
  • IID util functions
  • FNPE
  • GAUSS
  • JAC
  • test
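A minimal end-to-end sketch of how the iid sampling is meant to be used. The import paths and trainer calls are assumptions based on the existing sbi API, and the iid_method values are the ones listed in the docstring diff reviewed below, so treat this as illustrative rather than definitive:

```python
import torch
from sbi.inference import NPSE
from sbi.utils import BoxUniform

# Toy Gaussian simulator around theta.
prior = BoxUniform(low=-2 * torch.ones(2), high=2 * torch.ones(2))
theta = prior.sample((5_000,))
x = theta + 0.5 * torch.randn_like(theta)

inference = NPSE(prior=prior)
score_estimator = inference.append_simulations(theta, x).train()
posterior = inference.build_posterior(score_estimator)

# Ten iid trials of the same underlying parameter; iid_method selects how
# the per-trial scores are composed ("fnpe", "gauss", "auto_gauss", "jac_gauss").
x_o = 0.5 * torch.randn(10, 2)
samples = posterior.sample((1_000,), x=x_o, iid_method="auto_gauss")
```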

@manuelgloeckler (Contributor, Author) commented Feb 17, 2025

Okay, everything should be implemented now. This actually became quite a big PR. A few more points:

  • Check if the batched Jacobian via torch.func.vmap actually works correctly (see the sketch after this list)
  • Check whether the above or other factors cause performance degradation in jac_gauss (this can be sensitive to how the network is preconditioned)
  • Add an API to pass hyperparameters to the IID method (and make iid_methods more customizable, i.e., auto_gauss)
  • Multivariate priors
  • General empirical-prior support for automatic denoising and marginalization (then auto_gauss should become the default)
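For the first checkbox, a quick sanity check of the torch.func.vmap-batched Jacobian against per-sample autograd Jacobians could look like this; the score function is a toy analytic stand-in, not the PR's estimator:

```python
import torch
from torch.func import jacrev, vmap

def score(theta: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    # Toy analytic score with known Jacobian -exp(-t) * I.
    return -theta * torch.exp(-t)

thetas = torch.randn(16, 3)  # batch of parameters
t = torch.tensor(0.5)

# Batched Jacobian d score / d theta: one (3, 3) matrix per batch element.
batched_jac = vmap(jacrev(score, argnums=0), in_dims=(0, None))(thetas, t)

# Reference: per-sample Jacobians via torch.autograd.
reference = torch.stack([
    torch.autograd.functional.jacobian(lambda th: score(th, t), th) for th in thetas
])
assert torch.allclose(batched_jac, reference, atol=1e-6)
```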


codecov bot commented Feb 17, 2025

Codecov Report

Attention: Patch coverage is 92.88991% with 31 lines in your changes missing coverage. Please review.

Project coverage is 79.07%. Comparing base (18f92b1) to head (fd0f964).
Report is 7 commits behind head on main.

Files with missing lines Patch % Lines
sbi/utils/score_utils.py 91.66% 16 Missing ⚠️
sbi/inference/potentials/score_fn_iid.py 92.53% 15 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1381       +/-   ##
===========================================
- Coverage   89.31%   79.07%   -10.24%     
===========================================
  Files         119      121        +2     
  Lines        8779     9311      +532     
===========================================
- Hits         7841     7363      -478     
- Misses        938     1948     +1010     
Flag Coverage Δ
unittests 79.07% <92.88%> (-10.24%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
sbi/inference/posteriors/score_posterior.py 81.18% <100.00%> (-15.83%) ⬇️
sbi/inference/potentials/score_based_potential.py 82.85% <100.00%> (-14.12%) ⬇️
sbi/samplers/score/correctors.py 98.18% <100.00%> (+46.00%) ⬆️
sbi/samplers/score/diffuser.py 90.74% <100.00%> (+5.83%) ⬆️
sbi/inference/potentials/score_fn_iid.py 92.53% <92.53%> (ø)
sbi/utils/score_utils.py 91.66% <91.66%> (ø)

... and 36 files with indirect coverage changes

@manuelgloeckler

This is now basically done. The review should probably wait until the other score branch and the type fixes are merged.

But the major changes are:

  • ScoreFnIID classes, which manage the score composition
  • ScoreUtil, which has a bunch of helpers for "automatic" marginalization and denoising of PyTorch distributions (i.e., what the user can pass as the prior). If there is no analytic solution (or the user does not pass a prior), it falls back to a rather good MoG approximation (see the sketch below for the analytic Gaussian case).
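For the analytic case, the key ingredient behind such denoising helpers is the conjugate-Gaussian posterior p(theta_0 | theta_t). A minimal univariate sketch, illustrative only and not the PR's actual helper:

```python
import torch
from torch.distributions import Normal

def gaussian_denoising_posterior(
    prior: Normal, theta_t: torch.Tensor, m_t: torch.Tensor, s_t: torch.Tensor
) -> Normal:
    """Analytic p(theta_0 | theta_t) when theta_t | theta_0 ~ N(m_t * theta_0, s_t**2)
    and theta_0 ~ prior. Standard conjugate-Gaussian algebra, for illustration only."""
    prior_var = prior.scale**2
    post_var = 1.0 / (1.0 / prior_var + m_t**2 / s_t**2)
    post_mean = post_var * (prior.loc / prior_var + m_t * theta_t / s_t**2)
    return Normal(post_mean, post_var.sqrt())

# Example: standard-normal prior, moderately noised observation.
post = gaussian_denoising_posterior(
    Normal(torch.tensor(0.0), torch.tensor(1.0)),
    theta_t=torch.tensor(0.8),
    m_t=torch.tensor(0.9),
    s_t=torch.tensor(0.5),
)
```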

@janfb linked an issue Feb 20, 2025 that may be closed by this pull request
@janfb (Contributor) commented Feb 20, 2025

@manuelgloeckler #1370 has been merged into main. Please merge main to resolve the conflicts showing up here.

Contributor
@janfb left a comment

Wow, great effort! 👏 Thanks for adding all those methods!

Looks great overall, but I was a bit confused by the class structure in score_fn_iid.py and added a couple of comments.
Also, the tests can be refactored a bit.

Please note that you might have to rebase or merge again with main once #1404 is merged.

@@ -123,6 +125,9 @@ def sample(
steps: Number of steps to take for the Euler-Maruyama method.
ts: Time points at which to evaluate the diffusion process. If None, a
linear grid between t_max and t_min is used.
iid_method: Which method to use for computing the score in the iid setting.
We currently support "fnpe", "gauss", "auto_gauss", "jac_gauss".
Contributor

This is great! Can you please add a bit of detail about these methods, e.g., the full names? We should then also cover these options in the extended NPSE tutorial. Can you please add a note under #1392?

Contributor Author

These are essentially the names of the methods, but I can add a bit of detail on what they do and what to watch out for (from an applied perspective).

@@ -138,7 +143,10 @@ def sample(

x = self._x_else_default_x(x)
x = reshape_to_batch_event(x, self.score_estimator.condition_shape)
self.potential_fn.set_x(x, x_is_iid=True)
is_iid = x.ndim > 1 and x.shape[0] > 1
Contributor

Above, x is reshaped to "batch_event", so it will always be ndim > 1, no?

Contributor Author

Good catch, I think so too.
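To illustrate the point (shapes assumed for illustration, not taken from the sbi internals):

```python
import torch

x_single = torch.randn(3)            # one observation with event_shape (3,)
x_batched = x_single.reshape(1, -1)  # what a reshape to (batch, *event_shape) yields
assert x_batched.ndim > 1            # always true after the reshape
is_iid = x_batched.shape[0] > 1      # False here: only one trial, so not iid
```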

@@ -176,6 +184,7 @@ def sample(
def _sample_via_diffusion(
self,
sample_shape: Shape = torch.Size(),
x: Optional[Tensor] = None,
Contributor

Why do we need x here? I think it is just taken internally from potential_fn.x_o.

Contributor Author

I did not add this; it was part of the #1226 changes. But I can have a look.

@@ -244,6 +253,7 @@ def _sample_via_diffusion(
def sample_via_ode(
self,
sample_shape: Shape = torch.Size(),
x: Optional[Tensor] = None,
Contributor

Why do we need x here? I think it is just taken internally from potential_fn.x_o.

Contributor Author

I did not add this; it was part of the #1226 changes. But I can have a look.

@@ -57,7 +58,8 @@ def __init__(
score_estimator: ConditionalScoreEstimator,
prior: Optional[Distribution],
x_o: Optional[Tensor] = None,
iid_method: str = "iid_bridge",
iid_method: str = "auto_gauss",
iid_params: Optional[dict] = None,
Contributor

dict vs Dict. Also, can we specify the types of the dict, e.g., Dict[str, float], or will it have different types?

Contributor Author

I would go with Dict[str, Any]; the values can range from floats to tensors to strings.
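For reference, a possible annotation with heterogeneous values; the keys below are hypothetical placeholders, not the PR's actual hyperparameters:

```python
from typing import Any, Dict, Optional

import torch

iid_params: Optional[Dict[str, Any]] = {
    "num_precision_samples": 100,  # int: hypothetical budget for precision estimation
    "prior_mean": torch.zeros(2),  # Tensor
    "denoising": "analytic",       # str: hypothetical mode flag
}
```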


theta = prior.sample((num_simulations,))
x = linear_gaussian(theta, likelihood_shift, likelihood_cov)

score_estimator = inference.append_simulations(theta, x).train(
training_batch_size=100,
training_batch_size=200, max_num_epochs=400
Contributor

Do we still need this increased num_epochs? Wasn't this fixed with the recent update on convergence?

Contributor Author

Yeah, I will look into this. The tests did, however, fail on the new setting (although not by much; I can just loosen the tolerance of the checks a bit). In general, these tests are slightly more sensitive: a small change in approximation performance on a single trial will propagate across the iid trials.

check_c2st(
samples,
target_samples,
alg=f"npse-vp-gaussian-2D-{iid_method}-{num_trial}iid-trials",
Contributor

Add num_dim to the alg string instead of the hard-coded 2D?

check_c2st(
samples, target_samples, alg=f"npse-vp-gaussian-1D-{num_trials}iid-trials"

@pytest.mark.slow
Contributor

This is essentially the same test as above but with prior=Uniform, no? So I suggest merging the two tests and just using pytest.mark.parametrize("prior", ("gaussian", "uniform", None)).

Or am I missing something?
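A sketch of the suggested merged parametrization; the helper and test names are illustrative, not the actual test-suite code:

```python
import pytest
import torch
from torch.distributions import Independent, MultivariateNormal, Uniform

def make_prior(prior_type, num_dim=2):
    # Illustrative helper mapping the parametrized string to a prior object.
    if prior_type == "gaussian":
        return MultivariateNormal(torch.zeros(num_dim), torch.eye(num_dim))
    if prior_type == "uniform":
        return Independent(Uniform(-2 * torch.ones(num_dim), 2 * torch.ones(num_dim)), 1)
    return None  # let the inference fall back to its default behavior

@pytest.mark.slow
@pytest.mark.parametrize("prior_type", ("gaussian", "uniform", None))
def test_npse_iid_gaussian_linear(prior_type):
    prior = make_prior(prior_type)
    # ... run NPSE with iid trials as in the existing tests, then check_c2st(...)
    assert prior is None or prior.sample((5,)).shape == (5, 2)
```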

],
)
@pytest.mark.parametrize("d", [1, 2, 3])
def test_score_fn_iid_on_different_priors(sde_type, iid_method, d):
Contributor

Please add a docstring with a basic explanation.

"jac_gauss",
],
)
@pytest.mark.parametrize("d", [1, 2, 3])
Contributor

Use num_dim here as well (instead of d).

Successfully merging this pull request may close these issues:

missing features and todos for score estimation

3 participants