Add loo_expectation #2301

aadya940 · 2023-12-22T17:51:40Z

Related to #2059

📚 Documentation preview 📚: https://arviz--2301.org.readthedocs.build/en/2301/

aadya940 · 2024-01-07T13:14:21Z

Hello! Could anyone please review this PR and let me know if I'm missing something?

OriolAbril

Really sorry about not reviewing until now. Let me know if you are still interested; if not, I will finish this PR so that you get credit for the contribution when it is merged.

cc @sethaxen: I would particularly appreciate help making sure I am not messing up the input/output shapes.

Also, API-wise, the most common scenario seems to be using posterior predictive samples as values. Given that the first input is InferenceData, what would you think about allowing values=None and using the posterior predictive group in that case?
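A minimal sketch of what that fallback could look like. The helper name `resolve_values` and the error message are hypothetical and not part of this PR; a `SimpleNamespace` stands in for InferenceData so the snippet runs on its own:

```python
from types import SimpleNamespace


def resolve_values(data, values=None):
    """Hypothetical helper: use the posterior predictive group when values is None."""
    if values is not None:
        return values
    if not hasattr(data, "posterior_predictive"):
        raise ValueError("values is None and data has no posterior_predictive group")
    return data.posterior_predictive


# Stand-in for an InferenceData object that has a posterior_predictive group
idata = SimpleNamespace(posterior_predictive="posterior predictive samples")
default_values = resolve_values(idata)
explicit_values = resolve_values(idata, values=[1.0, 2.0])
```

Explicitly passed values always take precedence, so existing calls would keep their behaviour.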

OriolAbril · 2024-12-20T18:21:11Z

arviz/stats/stats.py

+        values: ndarray
+            A vector of quantities to compute expectations for.


We should add information about the shape of values here. My understanding is that it should be an array/DataArray with the same shape as the pointwise log likelihood (e.g. chain, draw, obs_id), which in general won't work with the code as written. I think there should be an extra check for DataArray input so that the chain and draw dimensions get stacked into a single __sample__ dimension; otherwise something like:

loo_expectation(data, data.posterior_predictive.y)

would not work as is.
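One possible shape for that check, as a sketch only. The helper name `stack_sample_dims` is hypothetical, and random data stands in for `data.posterior_predictive.y`:

```python
import numpy as np
import xarray as xr


def stack_sample_dims(values):
    """Stack chain and draw into __sample__ when a DataArray has both dimensions."""
    if isinstance(values, xr.DataArray) and {"chain", "draw"} <= set(values.dims):
        return values.stack(__sample__=("chain", "draw"))
    return values


rng = np.random.default_rng(0)
# Stand-in for posterior predictive samples with dims (chain, draw, obs_id)
y = xr.DataArray(rng.normal(size=(4, 100, 8)), dims=("chain", "draw", "obs_id"))
stacked = stack_sample_dims(y)
print(stacked.dims)
```

Inputs that are already stacked, or that are plain NumPy arrays, pass through unchanged.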

OriolAbril · 2024-12-20T18:27:51Z

arviz/stats/stats.py

+        pointwise: bool, optional
+            If True the pointwise predictive accuracy will be returned. Defaults to
+            ``stats.ic_pointwise`` rcParam.


This can be removed; it is not used anywhere.

OriolAbril · 2024-12-20T18:28:44Z

arviz/stats/stats.py

+        **kwargs:
+            Additional keyword arguments to pass to the `psislw` function.


psislw only takes two arguments, both of which are already provided explicitly, so passing any kwargs here would raise an unexpected keyword argument error when calling psislw.
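To illustrate the failure mode without depending on ArviZ, here is a sketch with a stand-in function. `psislw_stub` and `loo_expectation_stub` are hypothetical names that only mimic the two-parameter call pattern described above, and `normalize` is an arbitrary extra keyword:

```python
def psislw_stub(log_weights, reff=1.0):
    """Hypothetical stand-in that accepts the same two parameters as psislw."""
    return log_weights, None


def loo_expectation_stub(log_likelihood, reff=1.0, **kwargs):
    # Forwarding kwargs to a function that has no matching parameters
    return psislw_stub(log_likelihood, reff=reff, **kwargs)


try:
    loo_expectation_stub([0.0], reff=1.0, normalize=True)
except TypeError as err:
    message = str(err)
    print(message)
```

Any extra keyword fails in the same way, so dropping `**kwargs` from the signature would surface the mistake earlier.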

OriolAbril · 2024-12-20T18:30:25Z

arviz/stats/stats.py

+        expectation: float
+            The computed expectation of `values` across LOO posteriors.


This should also indicate the expected output shape. From the examples in https://mc-stan.org/loo/reference/E_loo.html, it looks like it should have the shape of the pointwise log likelihood values minus the __sample__ dimension.

OriolAbril · 2024-12-20T18:31:58Z

arviz/stats/stats.py

+    log_weights, _ = psislw(-log_likelihood, reff=reff, **kwargs)
+
+    # Numerically stable Weighted sum
+    # Do computations in the log-space for numerical stability


Right before that I would add a check for DataArrays (the preferred input type) to see whether they have chain and draw dimensions and, if so, stack them. Then sum only along the __sample__ dimension. Also, since only stack, pointwise functions, and sum are used, I think a Dataset should also be a valid input, which would allow idata.posterior_predictive as the default input for values.
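As a quick check that a Dataset goes through the same operations, here is a sketch with random data standing in for idata.posterior_predictive (the variable name `y` and the sizes are made up):

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(1)
# Stand-in for a posterior_predictive group with a single variable "y"
ds = xr.Dataset({"y": (("chain", "draw", "obs_id"), rng.normal(size=(2, 50, 3)))})

# stack and sum apply to every variable in the Dataset
stacked = ds.stack(__sample__=("chain", "draw"))
total = stacked.sum(dim="__sample__")
print(total["y"].dims)
```

The result is again a Dataset, with one value per observation for each variable.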

OriolAbril · 2024-12-20T18:33:08Z

arviz/stats/stats.py

+    # Numerically stable Weighted sum
+    # Do computations in the log-space for numerical stability
+    w_exp = log_weights + np.log(np.abs(values))
+    _expectation = (np.sign(values) * np.exp(w_exp)).sum()


Suggested change

-    _expectation = (np.sign(values) * np.exp(w_exp)).sum()
+    expectation = (np.sign(values) * np.exp(w_exp)).sum(dim="__sample__")

The variable only exists within the function's scope, so there is no need to prefix its name with an underscore.
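A small sketch of the suggested line on stacked inputs. Uniform log weights stand in for the output of psislw (assumed normalized so the weights sum to one per observation), which means the result should match a plain mean over samples:

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
dims = ("obs_id", "__sample__")
values = xr.DataArray(rng.normal(size=(8, 400)), dims=dims)

# Assumption: stand-in for normalized log weights, here uniform (1/400 each)
log_weights = xr.DataArray(np.full((8, 400), -np.log(400)), dims=dims)

# Combine weights and magnitudes in log-space, then restore the sign
w_exp = log_weights + np.log(np.abs(values))
expectation = (np.sign(values) * np.exp(w_exp)).sum(dim="__sample__")
print(expectation.dims)
```

Summing along `__sample__` only leaves one expectation per observation, which matches the output shape discussed in the docstring comment.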

OriolAbril · 2024-12-20T18:34:06Z

arviz/tests/base_tests/test_stats.py

+    log_likelihood = get_log_likelihood(centered_eight)
+    log_likelihood = log_likelihood.stack(__sample__=("chain", "draw"))
+    values = np.arange(1, log_likelihood.shape[-1] + 1)


I think values should be the log likelihood directly, with no extra processing and without using its shape to create different objects.

aadya940 added 5 commits December 22, 2023 23:14

Add loo_expectation in arviz.stats

1d6d44c

Add tests for loo_expectation

efa85cf

pylint

04de7ca

black

99ee68d

black

a9779d1

sethaxen self-requested a review February 14, 2024 13:59

OriolAbril self-requested a review February 21, 2024 22:07

OriolAbril reviewed Dec 20, 2024
