Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Tweedie Distribution #428

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from
Draft

Conversation

ShreeshaM07
Copy link
Contributor

Reference Issues/PRs

related to #423

What does this implement/fix? Explain your changes.

Implements a Tweedie distribution in skpro by working the compound poisson gamma case when power parameter 1<p<2. Calls the other pre-existing distributions in other interger cases

Does your contribution introduce a new dependency? If yes, which one?

No

What should a reviewer concentrate their feedback on?

If this idea would be suitable or is there something that might hinder the further extension as skpro does not handle case when 2<power<3.

Did you add any tests for the change?

no

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the skpro root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR).maintenance - CI, test framework, release.
    See here for full badge reference
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/taskname.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured
    dependency isolation, see the estimator dependencies guide.

@ShreeshaM07 ShreeshaM07 marked this pull request as draft July 17, 2024 18:16
@ShreeshaM07
Copy link
Contributor Author

@fkiraly I have tried to implement pdf of the compound poisson-gamma case when power is in (1,2). But I am still unsure what can be done to handle the case when power is in (2,3) or when it is p<0.

Need some thought on whether this would be a good idea to distinguish them based on the power parameters in the way I have done or is there any better way.

Copy link
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, this is incomplete - only covers p between 1 and 2, no?

May I suggest calling this CompoundPoissonGamma instead, and later using it as one of the Tweedie ED components in a delegator, as suggested here?
#423

@ShreeshaM07
Copy link
Contributor Author

hm, this is incomplete - only covers p between 1 and 2, no?

Yes I will be splitting it into Compound poisson gamma and then calling it in the tweedie distribution. However I still face the problem of integrating the pdf to get the cdf. Is there any method in scipy apart from using quad to integrate the pdf manually?

Also facing the same issue for finding the ppf some ideas would be helpful. I've asked gpt and it suggests using a scipy.optimize.brentq to find the ppf I'm not very sure if that will work correct.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 19, 2024

Brute force integration and such are typically last resort defaults.

I would suggest to check the approximation formulae along the lines of those linked in this comment: #429 (comment)

@ShreeshaM07
Copy link
Contributor Author

I was going through the section 3 of the Withers Nadaraja paper and it seems to me that the cdf ppf
image
is still not a simple term it although it looks straight forward.
It requires the summation of the infinite series of the Hermite polynomial
image
where $\phi(x)=(e^-\frac{(x)^2}{2})$ . and $\Phi(x)$ is the wright_bessel function.

I am not sure how we can do this efficiently. Also where is the power parameter here I do not seem to spot it it should ideally be present right as it will alter the value of the cdf.
Some insight here would be helpful.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 20, 2024

Also where is the power parameter here I do not seem to spot it it should ideally be present right as it will alter the value of the cdf.

it's implicit in the constants $c_{rj}$ (the formula below the screenshot box, in the paper)

@ShreeshaM07
Copy link
Contributor Author

it's implicit in the constants crj (the formula below the screenshot box, in the paper)

I see then how do I generate these crj from the power parameter alone.

And also any thoughts on how this infinite series summation can be made efficient. Is there any way this expression will simplify to something without a summation?

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 22, 2024

I see then how do I generate these crj from the power parameter alone.

I think you need to map them, following through the correspondence between Tweedie and CPG parameters. I would recommend you start implementing CPG in pure CPG parameters, and the mapping is done in the delegator.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 22, 2024

And also any thoughts on how this infinite series summation can be made efficient. Is there any way this expression will simplify to something without a summation?

I don't think so - the usual procedure is to prescribe an epsilon and abort once the change becomes less than that epsilon - this is sound as long as you know the magnitudes are decreasing, and you have some theory that links magnitude of individual terms to magnitude of the total remainder.

@ShreeshaM07
Copy link
Contributor Author

ShreeshaM07 commented Jul 24, 2024

I don't think so - the usual procedure is to prescribe an epsilon and abort once the change becomes less than that epsilon - this is sound as long as you know the magnitudes are decreasing, and you have some theory that links magnitude of individual terms to magnitude of the total remainder.

Can we not just keep the pdf = poisson.pdf(x) * gamma.pdf(x) where gamma will have the $\alpha=i*\alpha$ where $\alpha$ would have been the shape otherwise and i is at all the integers. so now we get the complete poisson gamma distribution. For the cdf we must sum these up for all values keeping a tolerance does that work?

For now i have implemented the CPG distributions pdf manually as given in the paper by Nadaraja and Withers. I'll try to implement the cdf next.

@ShreeshaM07
Copy link
Contributor Author

I am not able to get the gr(x) polynomial could you please help out as to what that is.

@fkiraly fkiraly added enhancement module:probability&simulation probability distributions and simulators implementing algorithms Implementing algorithms, estimators, objects native to skpro labels Jul 24, 2024
@ShreeshaM07 ShreeshaM07 requested a review from fkiraly July 25, 2024 10:12
@ShreeshaM07
Copy link
Contributor Author

@fkiraly I have implemented the cdf as per the Witheres Nadaraja paper. But I think there is some small issue as the function is not converging to a sum within [0,1] when the iterations(r) are large. Some help in finding out the fault would be helpful.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 25, 2024

Sure!

What would help me if you write a snippet of code, either assuming the branch or just numpy imports, that computes the sum.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 25, 2024

I am not able to get the gr(x) polynomial could you please help out as to what that is.

I have no context here, what is the gr polynomial, what does it mean to "get" it, and why do you need to get it?

@ShreeshaM07
Copy link
Contributor Author

What would help me if you write a snippet of code, either assuming the branch or just numpy imports, that computes the sum.

With reference to this theorem
Screenshot from 2024-07-25 14-11-33

cdf

  • equation(3.8) is the cdf which requires sum(equation(3.10)) for all positive integer r.
  • Now the equation(3.10) requires the sum of for values of j from 1 to r.
  • cdf is being computed by
    def _cdf(self, x):
        """Cumulative distribution function.

        Parameters
        ----------
        x : 2D np.ndarray, same shape as ``self``
            values to evaluate the cdf at

        Returns
        -------
        2D np.ndarray, same shape as ``self``
            cdf values at the given points
        """
        rho = self.alpha
        max_r = 10  # Adjust as needed
        phi_x = np.exp(-(x**2) / 2) / np.sqrt(2 * np.pi)
        series_sum = np.zeros_like(x, dtype=float)
        for r in range(1, max_r + 1):
            hr_x = self._compute_hr(x, r, rho)
            series_sum += hr_x
            # print(f"r={r}, hr_x={hr_x}, series_sum={series_sum}")
            # if np.any(np.abs(hr_x) < 1e-6):  # Convergence check
            #     break
        cdf = phi_x * (1 + series_sum)
        return cdf
  • Using $\phi(x)=\frac{e^-\frac{x^2}{2}}{sqrt(2*pi)}$.
  • He(r+2j,x) is being computed by
from scipy.special import eval_hermitenorm
eval_hermitenorm(r + 2 * j, x)

in the method

    def _compute_hr(self, x, r, rho):
        from scipy.special import eval_hermitenorm, factorial

        hr = np.zeros_like(x, dtype=float)
        for j in range(1, r + 1):
            H = eval_hermitenorm(r + 2 * j, x)
            crj = self._compute_crj(r, j, rho)
            hr += H * crj / factorial(j)
        return hr
  • This now requires the crj term for each value of r which is being computed by.
    def _compute_crj(self, r, j, rho):
        from itertools import combinations_with_replacement

        from scipy.special import factorial

        c_rj = 0
        partitions = [
            p for p in combinations_with_replacement(range(1, r + 1), j) if sum(p) == r
        ]
        for partition in partitions:
            term = 1
            for s_i in partition:
                term *= factorial(rho + 1 + s_i) / (
                    factorial(rho - 1) * factorial(s_i + 2)
                )
            c_rj += term
        c_rj *= (rho**2 + rho) ** (-r / 2 - j)
        return c_rj
  • So the idea here is that the partitions array contains the possible arrays which in turn contain values of si (that is required in calculation of crj as per the equation in the paper) where sum(si) = r for different values of r.
    From this I am then finding the nCr formula and multiplying all the possible arrays for a specific r value and then summing them up to make crj.
  • I am then multiplying the resultant crj by the constant term $(\rho^2 + \rho)^(-\frac{r}{2}-j)$ to get the final required crj.
  • There seems to be some fault in the way crj is being calculated as the sum is not converging to a value between [0,1] when r=say 10.

ppf

  • the ppf requires the computing of the equation(3.7) where $\Phi(x)$ is the standard normal cumulative distribution and $\lambda$ is the self.lambda but I am not sure what the definition of gr(x) here is. So I now need the definition of this polynomial g(r,x).

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 26, 2024

Just following through the references, my interpretation: the $g_r$ are the g-polynomials in the Cornish-Fisher expansion, see here: https://en.wikipedia.org/wiki/Cornish%E2%80%93Fisher_expansion

To avoid confusion, there are g- and h-polynomials, and also f-polynomials, and Withers claims there is a mistake in some original explicit formulae for the low order g-polynomials in the papers.

The trail of references is:

  • paper cited references "Withers - Asymptotic expansion for Distributions and Quantiles with Power Series Cumulants, JRSS, 1984". Section 3 contains the g-polynomials.
  • This in turn references an explicit formula of g-polynomials in terms of f-polynomials in "order inference for asymptoticaliy normal random variables. Sankhya B, 1982", Theorem A2 contains the expression.

This seems to become the definition of a rabbit hole...

I would suggest trying to implement fast numpy based functions for the various polynomials or expressions, in cases they cannot be obtained from scipy or numpy directly. I would also test these individually against known values and cover them well with tests.

For expressions such as the standard normal pdf, I would use scipy or numpy where available, instead of reimplementing them, to avoid little bugs like wrong signs, missing factors, etc.

@ShreeshaM07
Copy link
Contributor Author

ShreeshaM07 commented Jul 30, 2024

Just following through the references, my interpretation: the gr are the g-polynomials in the Cornish-Fisher expansion, see here: https://en.wikipedia.org/wiki/Cornish%E2%80%93Fisher_expansion

I am not quite sure I see an implementation of $g_r$ polynomial here. It only shows the $h_r$ polynomial .

I am not able to find any pre-existing method in scipy to interface the computation of $c_{rj}$. From the various implementations I have done so far it does not produce results in the interval [0,1] for the cdf. It would be of much help if you could take a look at the implementation of the _compute_crj method. As from what I infer from the Withers and Natharaja paper the implementation of the crj seems to be correct however it does not produce the right results.

I've also tried using $l_r$ polynomial for the calculation of the same $c_{rj}$ but that was producing the exact same results for the respective values of x and not in range [0,1]. So I am not really sure what exactly is going wrong.

@ShreeshaM07
Copy link
Contributor Author

This is the code that was used to test

from skpro.distributions.cmp_pois_gam import CmpPoissonGamma
import numpy as np
lambda_param = 2.0
alpha_param = 1.5
beta_param = 1.0

cmp_poisson_gamma_dist = CmpPoissonGamma(lambda_=lambda_param, alpha=alpha_param, beta=beta_param)

x_values = np.array([[0.5, 1.0, 1.5], [2.0, 2.5, 3.0]])
p_values = np.array([[0.1, 0.25, 0.5], [0.75, 0.9, 0.95]])

pdf_values = cmp_poisson_gamma_dist._pdf(x_values)
pmf_values = cmp_poisson_gamma_dist._pmf(np.floor(x_values).astype(int))
cdf_values = cmp_poisson_gamma_dist._cdf(x_values)
# ppf_values = cmp_poisson_gamma_dist._ppf(p_values)

print(f"PDF values:\n{pdf_values}")
print(f"PMF values:\n{pmf_values}")
print(f"CDF values:\n{cdf_values}")
# print(f"PPF values:\n{ppf_values}")

Output

PDF values:
[[0.17620241 0.19445143 0.19319874]
 [0.18385598 0.17024862 0.154407  ]]
PMF values:
[[0.13533528 0.27067057 0.27067057]
 [0.27067057 0.27067057 0.18044704]]
CDF values:
[[ 2483.23279161 -1265.84513228   273.00637241]
 [  233.08588426  -333.03847813   238.31041401]]

Regarding the implementation specifics is present in the comment.

@fkiraly
Copy link
Collaborator

fkiraly commented Jul 30, 2024

ok, I see, this is typical for "implementation from paper" - it is not unusual that it looks like things have been correctly implemented, but they then do not behave as expected. My primary suggestion for continuing with this is to proceed with "bottom up validation", i.e., first test correctness of the lower down functions like crj and gr.

What I would do, in sequence:

  • replace common functions computed explicitly by scipy functions, e.g., normal pdf, cdf used in the formulae
  • compute some values of crj manually. Add these as tests for crj
  • compute some values of gr manually. Add these as tests for gr
  • then, compute some pdf and pmf values that are easy to get. Add as tests for tweedie.

Debug wherever manually computed values do not agree with machine computed ones.

If we did this, and pdf, pmf values fail to match the constraints still, then there is supsicion that there is an error in the math, in the paper. But until we can conclude this, we need to do the tedious manual verification above.

Of course we can decide to work on sth else if this feels like too much of a rabbit hole.

@ShreeshaM07
Copy link
Contributor Author

This comment is the output for the final output of cdf.

Now for the individual values itself for example
the hermite polynomial for respective values of x as given in the comment linked above is

[[ -3240.81005859    936.           5141.87841797]
 [ -3342.         -13760.63232422  12312.        ]]

when r=5 and j=3 so the above array contains $He_{11}(x)$ and
$c_rj$ for the same is 0.018245457192850898
and $j!$ = 6.0

and the value $\frac{He_{r+2j}(x)*c_{rj}}{j!}$ for this iteration is

[[ -9.8550102    2.84629132  15.63598709]
 [-10.16271966 -41.844838    37.43967816]]

Now doing this for all values of j from 1 to r where r=5 gives $h_5(x)=$

[[ -8.43080033   7.69420971   6.21852576]
 [-23.28549087   8.26764324  73.77196399]]

Now these are smaller for small values of r but when r is larger the hermite polynomial also becomes bigger
for example when r=8 and j=8
we get hermite $He_{24}(x)$ =

[[-2.64545228e+11  9.25901342e+10  2.46363203e+11]
 [-7.97293258e+11  1.45418958e+12 -1.42824459e+12]]

and $c_{rj}$=6.779681328689356e-05
and 8!=40320.0

the value of this iteration given by $\frac{He_{r+2j}(x)*c_{rj}}{j!}$
is

[[ -444.82448913   155.68740181   414.25198522]
 [-1340.62356486  2445.17409824 -2401.54841047]]

and doing this for all values of j from 1 to r=8 gives
$h_8$=sum( $\frac{He_{r+2j}(x)*c_{rj}}{j!}$) we get

[[ -369.49482312    70.70751258   412.63323489]
 [-1030.47215469  1589.06248313 -1351.95793231]]
```.

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 6, 2024

could you kindly do this bottom-up, with some values calculated manually, and saying explicitly which object in the papers this corresponds to?

E.g., sth like

manual computation for he_5

$He_5 = ...$

using the code

In the implementation, we call he(5), this gives ...

@ShreeshaM07
Copy link
Contributor Author

ShreeshaM07 commented Aug 7, 2024

So the whole stack output for say $h_5$ =

1. r =  5  and j = 1
2. He( 7 , [[0.5 1.  1.5]
 [2.  2.5 3. ]] ) = 
 [[ -40.0234375  -20.          54.4921875]
 [  86.         -62.3046875 -396.       ]]
3. crj for r= 5 j= 1  =  0.030768701028203366
4. j! 1.0
5. h_ 5 upto from j=1 to j= 1  =  [[ -1.23146918  -0.61537402   1.67665383]
 [  2.64610829  -1.9170343  -12.18440561]]
-------------------------------
1. r =  5  and j = 2
2. He( 9 , [[0.5 1.  1.5]
 [2.  2.5 3. ]] ) = 
 [[ 326.53320312   28.         -541.21289062]
 [-190.         1431.10351562 1620.        ]]
3. crj for r= 5 j= 2  =  0.034147998534292366
4. j! 2.0
5. h_ 5 upto from j=1 to j= 2  =  [[ 4.34375849 -0.13730204 -7.56401467]
 [-0.59795157 22.51762607 15.47547321]]
-------------------------------
1. r =  5  and j = 3
2. He( 11 , [[0.5 1.  1.5]
 [2.  2.5 3. ]] ) = 
 [[ -3240.81005859    936.           5141.87841797]
 [ -3342.         -13760.63232422  12312.        ]]
3. crj for r= 5 j= 3  =  0.018245457192850898
4. j! 6.0
5. h_ 5 upto from j=1 to j= 3  =  [[ -5.51125171   2.70898928   8.07197242]
 [-10.76067123 -19.32721193  52.91515137]]
-------------------------------
1. r =  5  and j = 4
2. He( 13 , [[0.5 1.  1.5]
 [2.  2.5 3. ]] ) = 
 [[  37809.77648926  -23672.          -47160.5592041 ]
 [  84398.           73069.20471191 -350568.        ]]
3. crj for r= 5 j= 4  =  0.004783453196627576
4. j! 24.0
5. h_ 5 upto from j=1 to j= 4  =  [[  2.02463563  -2.00909006  -1.32762456]
 [  6.06074056  -4.76374856 -16.95674948]]
-------------------------------
1. r =  5  and j = 5
2. He( 15 , [[0.5 1.  1.5]
 [2.  2.5 3. ]] ) = 
 [[ -505845.15194702   469456.           365090.80709839]
 [-1419802.           630472.64480591  4389552.        ]]
3. crj for r= 5 j= 5  =  0.0024803090649180024
4. j! 120.0
5. h_ 5 upto from j=1 to j= 5  =  [[ -8.43080033   7.69420971   6.21852576]
 [-23.28549087   8.26764324  73.77196399]]
-------------------------------
6. h_ 5  =  [[ -8.43080033   7.69420971   6.21852576]
 [-23.28549087   8.26764324  73.77196399]]

This corresponds to
image

The eqn 2) gives the Hermite polynomial $He_{r+2j}(x)$.
In the code it corresponds to the call

H = eval_hermitenorm(r + 2 * j, x)

The eqn 3) computes the $c_{rj}$ and is being computed using the call

    def _compute_crj(self, r, j, rho):
        from itertools import combinations_with_replacement

        from scipy.special import comb

        partitions = np.array(
            [
                p
                for p in combinations_with_replacement(range(1, r + 1), j)
                if sum(p) == r
            ]
        )
        if partitions.size == 0:
            return 0

        term = np.prod([comb(rho + 1 + s_i, s_i + 2) for s_i in partitions.T], axis=0)
        c_rj = np.sum(term)
        c_rj *= (rho**2 + rho) ** (-r / 2 - j)
        return c_rj

Which corresponds to
image

Here $\rho$ corresponds to $\alpha$ in our code.

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 8, 2024

can we schedule a walkthrough for the Monday tech session, or whichever next is convenient? I think this needs discussion with questions and explanations.

@ShreeshaM07
Copy link
Contributor Author

can we schedule a walkthrough for the Monday tech session, or whichever next is convenient? I think this needs discussion with questions and explanations.

Yeah I think Monday tech session should work as per my current schedule.

@ShreeshaM07
Copy link
Contributor Author

ShreeshaM07 commented Aug 12, 2024

So I have implemented the permutation of the partitions that we spotted. The cdf values did come down by a lot to the order 10^0. But it still has some values outside the [0,1] range. Not sure how to tackle that.

The current cdf output

CDF values:
[[ 0.21771439  0.13619774  0.40067358]
 [-1.52041722  3.87381296 -4.79373376]]

The change handles it this way

[[1 2]
 [2 1]]
1. r =  3  and j = 2

where the array is the partition array when r=3 and j=2. So sum is 3 and number of terms that add upto 3 is 2.

The code doing that is

        partitions = np.array(
            [
                p
                for p in combinations_with_replacement(range(1, r + 1), j)
                if sum(p) == r
            ]
        )

        partitions = np.vstack([np.array(list(set(permutations(sub_list)))) for sub_list in partitions])

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 12, 2024

hm, this seems correct, if a bit inefficient (and likely subject to combinatorial explosion for larger r) - a better solution would be a direct computation for the multiplicity factor, though I suppose we can leave optimization until after the entire code is correct.

The next oddity I am spotting are the negative values.

I am suspecting pre-factors in capital and lower case phi - can you point me to the exact python code that you have for these two?

@ShreeshaM07
Copy link
Contributor Author

I am suspecting pre-factors in capital and lower case phi - can you point me to the exact python code that you have for these two?

Which pre-factors in capital are we talking about. The lower case $\phi(x)$ is the normal pdf and is being calculated by

phi_x = norm.pdf(x)

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 13, 2024

right here in the cdf
image
there is a capital phi and a lower case phi

@ShreeshaM07
Copy link
Contributor Author

right here in the cdf image there is a capital phi and a lower case phi

The capital phi represents the normal cdf but I have not used it anywhere in my code since I am implementing the equation (3.8) which is the 4th equation here in the paper for the cdf which is an approximation of equation (3.5).

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 13, 2024

Slightly confused - 3.8 is an equation for the pdf, not cdf. How are you using it in implementing the cdf?

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 13, 2024

Waitasec I am now noticing this:

equation(3.8) is the cdf which requires sum(equation(3.10)) for all positive integer r.

Equation 3.8 is the equation for pdf, not cdf! The "dot" above the $P_\lambda$ is physicist notation for derivative. Odd choice in a statistics paper, and strictly speaking wrong since in the narrow sense it refers to derivative with respect to time.

Equation 3.5 is the equation for cdf.

@ShreeshaM07
Copy link
Contributor Author

Ahhh I see thats a blunder. But all the functions created are still useful nevertheless. I shall make the change and commit it.

@ShreeshaM07
Copy link
Contributor Author

So now these are the outputs

PDF values:
[[0.17620241 0.19445143 0.19319874]
 [0.18385598 0.17024862 0.154407  ]]
PMF values:
[[0.13533528 0.27067057 0.27067057]
 [0.27067057 0.27067057 0.18044704]]
CDF values:
[[0.82052395 0.85803452 0.89180471]
 [0.90106802 0.95702399 1.0135732 ]]

for the following code

from skpro.distributions.cmp_pois_gam import CmpPoissonGamma
import numpy as np
lambda_param = 2.0
alpha_param = 1.5
beta_param = 1.0

cmp_poisson_gamma_dist = CmpPoissonGamma(lambda_=lambda_param, alpha=alpha_param, beta=beta_param)

x_values = np.array([[0.5, 1.0, 1.5], [2.0, 2.5, 3.0]])
p_values = np.array([[0.1, 0.25, 0.5], [0.75, 0.9, 0.95]])

pdf_values = cmp_poisson_gamma_dist._pdf(x_values)
pmf_values = cmp_poisson_gamma_dist._pmf(np.floor(x_values).astype(int))
cdf_values = cmp_poisson_gamma_dist._cdf(x_values)

print(f"PDF values:\n{pdf_values}")
print(f"PMF values:\n{pmf_values}")
print(f"CDF values:\n{cdf_values}")

@ShreeshaM07
Copy link
Contributor Author

The reason why the value is going a little out of bound for x=3,x=4 etc maybe due to the minor division errors that keep on aggregating while calculating . It is outputting cdf=1.0 for large x like 20,100,200 etc.

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 14, 2024

Nice! My gut feeling says that we have caught all logical inaccuracies, and the remainder is - as you say - probably indeed numerical issues.

Specifically, I think the large coefficients are contributive here, the naive calculation will lead to severe error propagation.

I would suggest to add tests for the individual functions in play.

The typical next steps would be to use numerical "tricks of the trade" to bring down the error, and at least ensure properties such as monotonicity for the cdf.

@ShreeshaM07
Copy link
Contributor Author

I tried plotting the cdf to see if it is monotonically increasing atleast not decreasing by a large amount due to the division errors. It works well on most inputs but there seemed to be some anomaly in the 1st case any reason why it is coming that way?
image
image
image
image

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 15, 2024

it seems to have problems in case 3 as well.

The oscillations in the 1st case do not look like numerical artefacts, but still might be - we should investigate.

Could you print the coefficients maybe, to see how large they get? And if you do any divisions, how large numerators/denominators are?

@ShreeshaM07
Copy link
Contributor Author

From some playing around with the values I have found that the oscillations are directly dependent only on the $\lambda$ variable. Not sure if it is the expected behaviour.
image
image
image
image
image

  • Upon modifying alpha and beta after attaining a monotonic curve the monotonicity remains same.
    image
    image
    image
    image

@ShreeshaM07
Copy link
Contributor Author

ShreeshaM07 commented Aug 15, 2024

The 2 other contributors to the coefficients are $c_{rj}$ and $j!$ which are always positive so it does not cause any oscillation.
The hermite polynomials does oscillate
image
So it probably makes sense for our cdf to oscillate as well. But this oscillation is getting neutralised when the $\lambda$ is big as is expected from the eqn.
image

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 15, 2024

does this equation converge to something monotonic if you take enough terms?

@ShreeshaM07
Copy link
Contributor Author

@fkiraly I have followed through the references as mentioned in this <a href="https://github.com/sktime/skpro/pull/428#issuecomment-2252710366>comment and I dont think I see anything about the $g_r$ polynomial which is required for calculating the ppf of CPG.

I could only find some formula for first 4 $g_r$ polynomials in one of Withers's papers.
image

Could you please have a look at it again and give a more explicit form for the $g_r$ polynomial.

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 20, 2024

do you have a link to Fisher/Cornish 1960, or is that a physical library-only reference?

@ShreeshaM07
Copy link
Contributor Author

do you have a link to Fisher/Cornish 1960, or is that a physical library-only reference?

The screenshot of the first 4 $g_r$ polynomials I have attached above can be found here it is the 17th reference that was present in the "Withers and Nadaraja On the Compound Poisson Gamma Distribution".

Regarding the Cornish Fisher 1960 expansions it can be found here. I think it can be accessed only through an institute account.

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 22, 2024

that sounds like the wrong paper? Similar name, but the author is someone named Ulyanov from Moscov State U.

@ShreeshaM07
Copy link
Contributor Author

that sounds like the wrong paper? Similar name, but the author is someone named Ulyanov from Moscov State U.

Yea I wasn't able to find any original one as such.

@fkiraly
Copy link
Collaborator

fkiraly commented Aug 29, 2024

Hm, feels like we are a little stuck then. Should we park this? Would be a bit of a shame.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement implementing algorithms Implementing algorithms, estimators, objects native to skpro module:probability&simulation probability distributions and simulators
Projects
Status: PR in progress
Development

Successfully merging this pull request may close these issues.

2 participants