[ENH] Tweedie distribution, incl mathematics and design #429

Open
fkiraly opened this issue Jul 18, 2024 · 4 comments
Labels
implementing algorithms Implementing algorithms, estimators, objects native to skpro module:probability&simulation probability distributions and simulators

Comments

@fkiraly
Collaborator

fkiraly commented Jul 18, 2024

For #423 and other GLM interfaces, we would need a Tweedie distribution.

The Tweedie distribution has a number of difficulties:

  • it is parameterized by a power parameter p, a real number; different values of p yield special cases that include both well-known and obscure distribution families. This raises the design question of how these families and the Tweedie distribution should relate
  • for some p it is a mixed distribution (for 1 < p < 2 it has a point mass at zero and a continuous part on the positive reals), which cannot be represented in the scipy interface, so a scipy implementation could not cover all p. In fact, scipy has no Tweedie distribution at all, see the discussion here: Add Tweedie distributions to scipy.stats scipy/scipy#11291 (comment)
  • how best to compute the various functions for the more obscure Tweedie ED (exponential dispersion) subfamilies is still a matter of current research, even for cdf or ppf
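
To make the first two points concrete, here is a minimal sketch of how p selects the subfamily, and of the point mass at zero for 1 < p < 2. The function names `tweedie_family` and `cpg_params` are hypothetical and not part of skpro or scipy; the mapping uses the standard Tweedie parameterization with mean mu and dispersion phi, so that the variance is phi * mu^p.

```python
import math


def tweedie_family(p):
    """Name the Tweedie subfamily for power parameter p (hypothetical helper)."""
    if p < 0:
        return "extreme stable"
    if p == 0:
        return "normal"
    if 0 < p < 1:
        return None  # no Tweedie model exists for 0 < p < 1
    if p == 1:
        return "Poisson"
    if 1 < p < 2:
        return "compound Poisson-gamma"
    if p == 2:
        return "gamma"
    if p == 3:
        return "inverse Gaussian"
    return "positive stable"  # p > 2 and p != 3


def cpg_params(mu, phi, p):
    """Map Tweedie (mu, phi, p), 1 < p < 2, to compound Poisson-gamma parameters.

    Returns the Poisson rate, and the shape and scale of the gamma summands.
    """
    if not 1 < p < 2:
        raise ValueError("compound Poisson-gamma requires 1 < p < 2")
    lam = mu ** (2 - p) / (phi * (2 - p))
    shape = (2 - p) / (p - 1)
    scale = phi * (p - 1) * mu ** (p - 1)
    return lam, shape, scale


lam, shape, scale = cpg_params(mu=2.0, phi=1.0, p=1.5)
prob_zero = math.exp(-lam)  # P(Y = 0): the discrete part of the mixed distribution
print(tweedie_family(1.5), lam * shape * scale, prob_zero)
```

The product `lam * shape * scale` recovers the mean mu, and `prob_zero` is strictly positive, which is the part a purely continuous or purely discrete interface cannot express.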

There is some discussion about Tweedie in the issue related to the sklearn Tweedie regressor: #423 (comment)

@fkiraly fkiraly added module:probability&simulation probability distributions and simulators implementing algorithms Implementing algorithms, estimators, objects native to skpro labels Jul 18, 2024
@fkiraly
Collaborator Author

fkiraly commented Jul 18, 2024

Regarding point 1, architecture, I would propose a delegator design, using the _DelegatedDistribution class:

  • we first implement the different subfamilies
  • then Tweedie can be a delegator, delegating to the subfamily determined by p
  • the delegator itself still has to be changed to delegate private methods rather than public ones, given the new distributions interface. This is a separate matter, but a precondition for a clean implementation
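
A minimal sketch of the delegator idea above. All class names here (`BaseDist`, `Normal`, `Gamma`, `TweedieDelegator`) are hypothetical stand-ins written in plain Python; this does not use skpro's actual _DelegatedDistribution, and only two subfamilies are wired in for illustration. Public methods live once in the base class, and the delegator forwards only the private ones.

```python
import math


class BaseDist:
    """Hypothetical base class: public methods wrap private ones."""

    def pdf(self, x):
        # shared input handling would live in the public layer
        return self._pdf(float(x))

    def mean(self):
        return self._mean()

    def var(self):
        return self._var()


class Normal(BaseDist):
    def __init__(self, mu, sigma):
        self.mu, self.sigma = mu, sigma

    def _pdf(self, x):
        z = (x - self.mu) / self.sigma
        return math.exp(-0.5 * z * z) / (self.sigma * math.sqrt(2 * math.pi))

    def _mean(self):
        return self.mu

    def _var(self):
        return self.sigma ** 2


class Gamma(BaseDist):
    def __init__(self, shape, scale):
        self.shape, self.scale = shape, scale

    def _pdf(self, x):
        if x <= 0:
            return 0.0
        a, s = self.shape, self.scale
        log_pdf = (a - 1) * math.log(x) - x / s - math.lgamma(a) - a * math.log(s)
        return math.exp(log_pdf)

    def _mean(self):
        return self.shape * self.scale

    def _var(self):
        return self.shape * self.scale ** 2


class TweedieDelegator(BaseDist):
    """Picks a subfamily from p, then forwards the private methods to it."""

    def __init__(self, mu, phi, p):
        self.mu, self.phi, self.p = mu, phi, p
        self.delegate_ = self._pick_subfamily()

    def _pick_subfamily(self):
        if self.p == 0:  # normal with variance phi
            return Normal(self.mu, math.sqrt(self.phi))
        if self.p == 2:  # gamma with mean mu and variance phi * mu^2
            return Gamma(1.0 / self.phi, self.mu * self.phi)
        raise NotImplementedError(f"no subfamily wired in for p={self.p}")

    def _pdf(self, x):
        return self.delegate_._pdf(x)

    def _mean(self):
        return self.delegate_._mean()

    def _var(self):
        return self.delegate_._var()


d = TweedieDelegator(mu=2.0, phi=0.5, p=2)
print(d.mean(), d.var(), d.pdf(1.0))
```

Forwarding `_pdf` rather than `pdf` means the public layer runs only once, on the delegator, which is the reason for delegating private methods in this sketch.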

@fkiraly
Collaborator Author

fkiraly commented Jul 18, 2024

Regarding point 3: section 3 of Withers and Nadarajah, "On the compound Poisson-gamma distribution", gives numerically exact approximation formulae for the cdf and ppf of the compound Poisson-gamma (CPG) distribution.
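
For orientation, a baseline one could compare such formulae against is the direct series from the definition of the CPG: condition on the Poisson count N, and use that a sum of n gamma variables with common scale is again gamma. The sketch below is this naive series, not the formulae from the paper; the function names are hypothetical, and the incomplete gamma routine is a simple power series that is only adequate for moderate arguments.

```python
import math


def reg_lower_gamma(a, x, tol=1e-14, max_iter=10_000):
    """Regularized lower incomplete gamma P(a, x) via its power series.

    Only suitable for moderate x: the leading term underflows for very large x.
    """
    if x <= 0:
        return 0.0
    # k = 0 term: x^a * exp(-x) / Gamma(a + 1), computed in log space
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1))
    total = term
    for k in range(1, max_iter):
        term *= x / (a + k)  # ratio of consecutive series terms
        total += term
        if term < tol * total:
            break
    return min(total, 1.0)


def cpg_cdf(y, lam, shape, scale, tol=1e-12, max_terms=10_000):
    """cdf of Y = X_1 + ... + X_N, N ~ Poisson(lam), X_i ~ Gamma(shape, scale)."""
    if y < 0:
        return 0.0
    pois = math.exp(-lam)  # P(N = 0), the point mass at zero
    total = pois
    cum = pois  # cumulative Poisson mass, used as stopping rule
    n = 0
    while 1.0 - cum > tol and n < max_terms:
        n += 1
        pois *= lam / n  # P(N = n)
        cum += pois
        # given N = n, the sum is Gamma(n * shape, scale)
        total += pois * reg_lower_gamma(n * shape, y / scale)
    return total


print(cpg_cdf(0.0, lam=2.0, shape=1.0, scale=1.0))  # equals exp(-2), the mass at zero
print(cpg_cdf(3.0, lam=2.0, shape=1.0, scale=1.0))
```

A ppf could be obtained from this by bisection on `cpg_cdf`, with the caveat that the ppf is exactly zero for all probabilities up to exp(-lam).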

@fkiraly
Collaborator Author

fkiraly commented Aug 15, 2024

@ShreeshaM07, could you kindly reply to this so I can assign you?

@ShreeshaM07
Contributor

Sure. I have been working on the implementation of the Tweedie distribution. So far, I have completed the pdf and cdf for one of its subfamilies, the compound Poisson-gamma distribution, in #428.

Projects
Status: PR in progress