
Frailty survival model #580

Merged 24 commits into pymc-devs:main on Nov 28, 2023

Conversation

@NathanielF (Contributor) commented Oct 2, 2023

Frailty Models - Hierarchical Survival Models

Related to this issue: #579

Leaving as Draft for now.

📚 Documentation preview 📚: https://pymc-examples--580.org.readthedocs.build/en/580/


@review-notebook-app bot commented Oct 8, 2023 (conversation on ReviewNB)

ricardoV94 commented on 2023-10-08T23:16:38Z:

Any reason not to use pm.Censored for the likelihood?

NathanielF commented on 2023-10-09T07:54:47Z:

In the case of the CoxPH regression the "Poisson trick" is a classic of the literature and it works to give me the results I was expecting. It's also consistent with the approach already documented in Austin's notebook.

More generally, in the case of the AFT models below, I tried using pm.Censored with the Weibull regression, but it gave me garbage results and was a lot slower than using the Potential. And again, in the case of the Weibull AFT, my current parameterisation gives the expected results.

But maybe I'm missing something, was there a pattern of usage you had in mind w.r.t. using censored likelihoods for survival?


@ricardoV94 (Member) commented Oct 9, 2023

> But maybe I'm missing something, was there a pattern of usage you had in mind w.r.t. using censored likelihoods for survival?

From a first glance it sounded like you just wrote your own Censored likelihood by hand. Since we implemented it in PyMC we've been moving examples towards using the Censored factory instead because that fits much more with the PyMC vibe. Similarly, nobody is writing Potentials for Mixture likelihoods either, because we have pm.Mixture.

> In the case of the CoxPH regression the "Poisson trick" is a classic of the literature

I am not familiar with the Poisson trick. Google didn't elucidate it quickly for me.

@ricardoV94 (Member) commented Oct 9, 2023

> More generally, in the case of the AFT models below, I tried using pm.Censored with the Weibull regression, but it gave me garbage results and was a lot slower than using the Potential.

I was talking about this one yes, not the Poisson trick. Sounds like a bug or a difference in the parametrization of the PyMC-defined Weibull and what you're comparing against. Using pm.Censored could be slower but shouldn't give different results. Logp wise it should be equivalent to what you did with the Potential.
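For context, the two encodings being compared here are logp-equivalent for right-censored data: a hand-written pm.Potential that adds the Weibull log-survival term for the censored rows, or the pm.Censored factory. A minimal sketch with made-up data (not the notebook's model):

```python
import numpy as np
import pymc as pm

# Hypothetical data: observed times y and a right-censoring indicator cens
y = np.array([1.0, 4.0, 3.0, 7.0, 7.0])
cens = np.array([False, False, True, False, True])


def weibull_logsf(t, alpha, beta):
    # Weibull log-survival function: log S(t) = -(t / beta) ** alpha
    return -((t / beta) ** alpha)


# Hand-written encoding: uncensored rows get the Weibull logp, censored rows
# contribute their log-survival term through a Potential
with pm.Model() as potential_model:
    alpha = pm.HalfNormal("alpha", 2.0)
    beta = pm.HalfNormal("beta", 10.0)
    pm.Weibull("uncensored", alpha=alpha, beta=beta, observed=y[~cens])
    pm.Potential("censored", weibull_logsf(y[cens], alpha, beta))

# pm.Censored encoding of the same right-censored likelihood: censored rows have
# upper equal to the observed value, uncensored rows are effectively unbounded
with pm.Model() as censored_model:
    alpha = pm.HalfNormal("alpha", 2.0)
    beta = pm.HalfNormal("beta", 10.0)
    dist = pm.Weibull.dist(alpha=alpha, beta=beta)
    pm.Censored("obs", dist, lower=None, upper=np.where(cens, y, np.inf), observed=y)
```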

@NathanielF (Contributor, Author)

Can you show me an example of how to use the censored distribution for AFT models? This is what I tried:

[screenshot of the attempted model]

@NathanielF (Contributor, Author)

> I am not familiar with the Poisson trick. Google didn't elucidate it quickly for me.

See maybe here: https://cran.r-project.org/web/packages/survival/vignettes/approximate.pdf
Also, in a textbook I have at home...
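For readers unfamiliar with the term, the "Poisson trick" referred to here is the standard piecewise-exponential equivalence: after splitting each subject's follow-up into intervals, the right-censored survival likelihood is proportional to a Poisson likelihood on the per-interval event indicators, with the time at risk entering as an exposure. A minimal sketch of that idea in PyMC, using made-up data (not the notebook's model):

```python
import numpy as np
import pandas as pd
import pymc as pm

# Hypothetical long-format data: one row per subject per interval they were at risk in
df = pd.DataFrame({
    "interval": [0, 1, 0, 0, 1, 2],                 # index of the time interval
    "event":    [0, 1, 1, 0, 0, 1],                 # 1 if the subject failed in that interval
    "exposure": [2.0, 1.3, 0.7, 2.0, 2.0, 0.4],     # time at risk within the interval
    "x":        [0.1, 0.1, -0.3, 0.8, 0.8, 0.8],    # a covariate
})
n_intervals = df["interval"].max() + 1

with pm.Model() as poisson_trick:
    # Piecewise-constant baseline hazard, one parameter per interval
    lambda0 = pm.Gamma("lambda0", alpha=0.01, beta=0.01, shape=n_intervals)
    beta = pm.Normal("beta", 0.0, 1.0)

    # Expected event count = hazard * time at risk; the Poisson likelihood on the
    # event indicators then reproduces the piecewise-exponential survival likelihood
    mu = lambda0[df["interval"].values] * pm.math.exp(beta * df["x"].values) * df["exposure"].values
    pm.Poisson("obs", mu=mu, observed=df["event"].values)
```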

@ricardoV94 (Member) commented Oct 9, 2023

In that example you didn't pass observed.

When you have a mix of censored and uncensored observations, you have to pass an array to upper with +inf (or anything above y) for the uncensored observations and y for the censored ones. Something like:

pm.Censored("obs", dist, lower=None, upper=np.where(cens, y, np.inf), observed=y)

@NathanielF (Contributor, Author)

Ah... I did miss observed... but now I just get a crash:

[screenshots of the traceback]

Similar crashes for different upper bounds.

@ricardoV94 (Member)

Maybe try upper=np.where(cens, y, y+1). The last value has to be larger than the observations when they are not censored. np.inf could be leading to numerical issues in the logcdf.

@NathanielF (Contributor, Author) commented Oct 9, 2023

So that does fit:

[screenshot of the fitted model]

But it gives predictions out of line with lifelines and my prior Weibull fit. The top table here shows the new fit's predictions; the bottom shows the lifelines predictions, and my Potential-based fit agreed with lifelines here.

[screenshot of the prediction tables]

@ricardoV94 (Member) commented Oct 9, 2023

Your upper is still weird. Do you have a constant censoring at y=20? If that's the case you should just be able to set upper=20.

upper is the point of censoring. You are not allowed to observe any value beyond upper. If it matches exactly with upper it means that observation was censored. If it is below, it was not censored. upper=np.where(cens, y+1, 20) is odd. It says that for the cases where cens is True, the censoring point is y+1 (which means it won't be treated as censored in the logp, because it's always higher than the observed value), otherwise it's 20.
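To summarise that rule, a short sketch with hypothetical y and cens arrays (not the notebook's data) showing the two valid ways to build upper:

```python
import numpy as np

# Hypothetical data: observed or censored times and a right-censoring flag
y = np.array([3.0, 12.0, 7.0, 12.0])
cens = np.array([False, True, False, True])

# Constant censoring at a known time (e.g. everyone still under observation at t=20 is censored)
upper_constant = 20

# Observation-specific censoring: censored rows get upper equal to their observed time,
# uncensored rows get any value strictly above what was observed
upper_mixed = np.where(cens, y, np.inf)  # or y.max() + 1 if np.inf misbehaves in the logcdf
```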

@NathanielF (Contributor, Author) commented Oct 9, 2023

Ugh, sorry. Don't know where my head is this morning!

The max y in the data set is 12, but it crashes under this parameterisation:

[screenshot of the traceback]

@ricardoV94 (Member) commented Oct 9, 2023

Just as a sanity check, do you get a -inf logp at the starting point if you just use a vanilla Weibull (without censoring)? I wonder if the PyMC parametrization is just different from what you were working with.
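One way to run that sanity check, as a sketch (here `model` is assumed to be the Weibull model from the screenshots):

```python
# Per-variable log-probability evaluated at the model's initial point;
# a -inf entry identifies the term that breaks sampling
print(model.point_logps())
```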

@NathanielF (Contributor, Author)

You mean, just run the observed like this?

[screenshot of the model]

Seems to run fine.

@ricardoV94 (Member) commented Oct 9, 2023

No, run it with all observations, censored and non-censored, as if they were all uncensored.

@NathanielF (Contributor, Author)

Seems to run just fine with all observations:

[screenshot of the model output]

@ricardoV94 (Member) commented Oct 9, 2023

Interesting, that suggests a precision issue with the CDF (or a bug in the CDF). Can you open a GitHub issue with a minimal reproducible example?

@NathanielF (Contributor, Author)

@ricardoV94 added the issue here: pymc-devs/pytensor#471

@NathanielF (Contributor, Author) replied on the remaining ReviewNB threads: "Fixed", "Fixed", "Adjusted", "Adjusted", "Added".

@NathanielF (Contributor, Author)

Thanks so much for the detailed feedback!

I think I've addressed all the above comments and it's much stronger now.

@NathanielF (Contributor, Author), on two further ReviewNB threads: "Indeed!" and "Replaced with print statement".

@NathanielF (Contributor, Author)

I'm happy with this now! Caught a few extra typos and mistakes in the writing, but it's basically the same, with adjustments for your feedback @drbenvincent. Thanks again for the detailed review!

@NathanielF (Contributor, Author)

Just giving this another nudge @drbenvincent. If you have time it'd be great to get it over the line this week.

@drbenvincent (Contributor)

Thanks for the nudge. Fingers crossed I'll have time in the next few days.

@drbenvincent (Contributor), on ReviewNB:

"In the context of a failure modelling" -> "In the context of failure modelling"

A further ReviewNB thread reply: "Thanks"

@drbenvincent (Contributor)

Found a typo: "strucuture"

@drbenvincent (Contributor)

It could be worth clarifying this is Wilkinson notation?

[screenshot]

@drbenvincent (Contributor)

A pedantic/stylistic point, but we could label the line (presumably the posterior mean) and make the 50% and 99% credible intervals the same colour but different opacity?

[plot screenshot]

@drbenvincent (Contributor)

I'm not sure how I feel about this, but I'm wondering if, in the case of these utility functions, we might want to add type hints just to help the reader out a bit:

def cum_hazard(hazard):
    return hazard.cumsum(dim="intervals")


def survival(hazard):
    return np.exp(-cum_hazard(hazard))


def get_mean(trace):
    return trace.mean(("draw", "chain"))
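A sketch of what those helpers could look like with type hints, assuming the hazard posterior and the trace are xarray.DataArray objects with an "intervals" dimension, as in the notebook:

```python
import numpy as np
import xarray as xr


def cum_hazard(hazard: xr.DataArray) -> xr.DataArray:
    """Cumulative hazard along the piecewise-constant intervals."""
    return hazard.cumsum(dim="intervals")


def survival(hazard: xr.DataArray) -> xr.DataArray:
    """Survival function S(t) = exp(-cumulative hazard)."""
    return np.exp(-cum_hazard(hazard))


def get_mean(trace: xr.DataArray) -> xr.DataArray:
    """Posterior mean over chains and draws."""
    return trace.mean(("draw", "chain"))
```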

@drbenvincent (Contributor)

Where we have the az.plot_compare cell, it could be good to add a reference to the model comparison tag (https://pymcio--580.org.readthedocs.build/projects/examples/en/580/blog/tag/model-comparison.html)

@drbenvincent (Contributor) commented Nov 28, 2023

Missing capitalisation; it doesn't follow on from a previous unfinished sentence.

"which allows us to pull out the gender specific..."

Same here:

"which suggests that the model over.."

and

"where we see a stark difference..."

@drbenvincent (Contributor)

Cool! Have added a set of comments above. Mostly small grammatical or stylistic things. Great addition to the set of examples.

@NathanielF (Contributor, Author)

Thanks @drbenvincent, addressed those notes! Pleased with this one.

@drbenvincent (Contributor) left a review: Nicely done

@drbenvincent merged commit cd77a9b into pymc-devs:main on Nov 28, 2023
2 checks passed
@NathanielF (Contributor, Author)

Thank you Ben! Appreciate it!
