SNPE_C: Lower acceptance rate when using a large number of simulations #1416
-
Hi, just wanted to ask: is it expected to have a lower acceptance rate when training on a larger simulation dataset (23M)? I found that there seem to be leakage problems when sampling the posterior for some observed data points, which wasn't the case with a posterior trained on a smaller simulation dataset (2M). I am working with a simulator with 6 model parameters and 9 observables. Thanks!
-
Hi there!

I think this behavior is possible. I expect that the observations for which the acceptance rate is low are misspecified, i.e., they systematically differ from the simulated training dataset. If this is the case, then the posterior is ill-defined, and no amount of training data will fix the issue.

A simple fix is to add noise to the simulated training dataset so that it covers a broader range of observations; see the sketch below. There is also a range of more advanced methods (see e.g. here), but these are not implemented in the `sbi` toolbox.
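For concreteness, here is a minimal, self-contained sketch of the noise idea, assuming the standard single-round SNPE workflow in `sbi`. The toy simulator, prior bounds, dataset size, and noise scale are placeholders, not your actual setup (6 parameters, 9 observables, 23M simulations):

```python
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

# Toy stand-in for the real problem: 6 parameters, 9 observables.
prior = BoxUniform(low=-2.0 * torch.ones(6), high=2.0 * torch.ones(6))
WEIGHTS = torch.randn(6, 9)  # fixed hypothetical forward map

def toy_simulator(theta: torch.Tensor) -> torch.Tensor:
    return theta @ WEIGHTS + 0.1 * torch.randn(theta.shape[0], 9)

theta = prior.sample((10_000,))
x = toy_simulator(theta)

# Broaden the training data with additive Gaussian noise on the observables.
# The scale is a tuning choice; it should roughly match how far real
# observations might deviate from the simulator.
sigma = 0.05 * x.std(dim=0)
x_noisy = x + sigma * torch.randn_like(x)

inference = SNPE(prior=prior)
density_estimator = inference.append_simulations(theta, x_noisy).train()
posterior = inference.build_posterior(density_estimator)

# Sampling rejects draws that fall outside the prior support; `sbi` warns
# when the acceptance rate is low, which is how leakage shows up in practice.
x_o = toy_simulator(prior.sample((1,)))
samples = posterior.sample((1_000,), x=x_o)
```

The only change relative to the standard workflow is the `x_noisy` line; everything else is plain single-round SNPE.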
Hope this helps!
Michael