
BUG: HurdleGamma results in large number of divergences, even under the correct model #7630

Open
Dpananos opened this issue Dec 27, 2024 · 2 comments

@Dpananos (Contributor)

Describe the issue:

A simple HurdleGamma model produces a very large number of divergences, even when the priors are tightly centered on the true parameter values and the data-generating process matches the model.

Some chains get "stuck": they do not move from their initialized values.

For more detail, see this thread on the PyMC community forums.


Reproducible code example:

import numpy as np
import pandas as pd
import pymc as pm

# Hypothetical data-generating process, assumed here so the example runs
# end to end; the original report simulates from the same hurdle-gamma
# model that is fit below.
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
mu_true = np.exp(0.5 + 0.3 * x)  # log link, matching the model below
sigma_true = 1.0
nonzero = rng.random(n) < 0.7  # true psi = 0.7
y = np.where(
    nonzero,
    rng.gamma(shape=(mu_true / sigma_true) ** 2, scale=sigma_true**2 / mu_true),
    0.0,
)
df = pd.DataFrame({"x": x, "y": y})

with pm.Model() as model:
    X = pm.Data("X", df.x.values, dims="ix")

    # Log-linear model for the Gamma mean.
    b0 = pm.Normal("b0", 0, 1)
    b1 = pm.Normal("b1", 0, 1)
    eta = b0 + b1 * X
    mu = pm.math.exp(eta)

    sigma = pm.Exponential("sigma", 1)
    psi = pm.Uniform("psi", 0, 1)
    Yobs = pm.HurdleGamma("Yobs", psi=psi, mu=mu, sigma=sigma, observed=y)

with model:
    idata = pm.sample()
    idata.extend(pm.sample_posterior_predictive(idata))
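
For reference, one quick way to see the scale of the problem (a diagnostic sketch, not part of the original report) is to count the divergent transitions recorded in the sampler statistics:

# Total divergent transitions across all chains.
n_divergent = int(idata.sample_stats["diverging"].sum())
print(f"{n_divergent} divergences")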

Error message:

No response

PyMC version information:

5.19.1

Context for the issue:

No response

@Dpananos Dpananos added the bug label Dec 27, 2024
@Dpananos Dpananos changed the title BUG: <Please write a comprehensive title after the 'BUG: ' prefix> BUG: HurdleGamma results in large number of divergences, even under the correct model Dec 27, 2024
@ricardoV94 (Member)

I suspect some instability in the truncation logp or its gradient.
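
One way to probe for that instability (a diagnostic sketch using the public pm.logp API; the parameter values are illustrative) is to evaluate the HurdleGamma logp at points approaching zero, where the truncated continuous component begins:

import pymc as pm

# Evaluate the hurdle-gamma log-density near the zero boundary, where the
# epsilon-truncated Gamma component could be numerically unstable.
dist = pm.HurdleGamma.dist(psi=0.5, mu=1.0, sigma=1.0)
for v in [0.0, 1e-12, 1e-8, 1e-4, 0.1]:
    print(v, pm.logp(dist, v).eval())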

@Dpananos (Contributor, Author) commented Dec 27, 2024

Let me know if this isn't helpful. I've been reading Some Mixture Modeling Basics by Michael Betancourt, which has informed the contents of this comment.

Hurdle models are currently implemented as a mixture of a DiracDelta and the chosen continuous density. As it stands, DiracDelta is a probability mass function, which means its logp method returns 0 or -infinity, corresponding to 1 or 0 on the pdf scale.

pm.logp(pm.DiracDelta.dist(c=0.0), 0.0).eval()
# array(0., dtype=float32)

pm.logp(pm.DiracDelta.dist(c=0.0), 1.0).eval()
# array(-inf, dtype=float32)

If there were a continuous analog of DiracDelta whose density (rather than probability mass) were 0 when value != c and concentrated all of its mass at value = c, then the mixture probabilities would be correct (I think), and there would be no need for the machine-epsilon "hack", which I suspect is the source of the problem.

Would it be possible to use pytensor.tensor.switch here to return the appropriate density or probability value as needed?
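
For concreteness, here is a minimal sketch of what such a switch-based logp might look like (a hypothetical helper, not PyMC's actual implementation; the function name and signature are illustrative):

import pytensor.tensor as pt
import pymc as pm

# Hypothetical switch-based hurdle-gamma log-density: a point mass at zero
# with probability 1 - psi, and an untruncated Gamma density otherwise.
def hurdle_gamma_logp(value, psi, mu, sigma):
    gamma = pm.Gamma.dist(mu=mu, sigma=sigma)
    return pt.switch(
        pt.eq(value, 0),
        pt.log1p(-psi),                       # P(zero) = 1 - psi
        pt.log(psi) + pm.logp(gamma, value),  # psi * Gamma density
    )

One caveat: pt.switch evaluates both branches, so the Gamma logp (and its gradient) at value = 0 would still need to be handled carefully.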
