Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document that VonMises standard deviation and variance are circular #1705

Open
itsdfish opened this issue Apr 1, 2023 · 7 comments
Open
Labels
documentation Improvements of the documentation

Comments

@itsdfish
Copy link

itsdfish commented Apr 1, 2023

Hi,

I noticed that the VonMises standard deviation is wrong. Consider the following:

using Distributions
using Random 

Random.seed!(65)

dist = VonMises(0.0, 2.0)
x = rand(dist, 100_000)
@show std(x)
@show std(dist)

Output

std(x) = 0.8741380214035047
std(dist) = 0.5497502542391335

Version

Distributions v0.25.86
Julia 1.9.0 rc2

Update:

I'm not familiar with variance vs. circular variance. Could this be the reason for the discrepancy? If so, it might be good to make a note in the documentation.

@andreasnoack
Copy link
Member

Could this be the reason for the discrepancy?

Yes. That is the difference. See https://en.wikipedia.org/wiki/Directional_statistics.

julia> sqrt(1 - abs(mean(exp.(complex.(0.0, x)))))
0.5492460106924333

Indeed, this could be better documented, so let's keep this issue open to track that. I've adjusted the title accordingly.

@andreasnoack andreasnoack added the documentation Improvements of the documentation label Apr 2, 2023
@andreasnoack andreasnoack changed the title VonMises standard deviation appears to be wrong Document that VonMises standard deviation and variance are circular Apr 2, 2023
@aplavin
Copy link
Contributor

aplavin commented Sep 2, 2023

What's the decision, if any?

Currently, var is assumed to be circular:

julia> using Distribustions

julia> d = VonMises(0, 2)

julia> var(d)
0.30222534203599194

julia> var(rand(d, 10^6))
0.7641079384821371

julia> using DirectionalStatistics

julia> Circular.var(rand(d, 10^6))
0.3036343981785453

while std is probably just incorrect:

julia> std(d)
0.5497502542391335

julia> std(rand(d, 10^6))
0.8744747812547986

julia> Circular.std(rand(d, 10^6))
0.8485527593646777

The path forward seems either:

  • Keep var as-is, fix std, update docs to refer to circular statistics
  • Convert both var and std to regular (non-circular), keep docs as-is

Technically, the latter seems more consistent with other distributions... It can potentially be confusing if std(d) != std(rand(d, N)).
But are there actual scenarios when non-circular std/var of VonMises are needed?

@ParadaCarleton
Copy link
Contributor

  • Keep var as-is, fix std, update docs to refer to circular statistics

I believe this is the intention; I'd be happy to review a PR on this if mentioned.

@itsdfish
Copy link
Author

itsdfish commented Sep 4, 2023

I can submit a PR to make a note about circular statistics. I'm not sure where to update the docs, but I identified at least three places where the docs could be updated. @ParadaCarleton, do you prefer any or all of these? Please advise.

"""
std(d::UnivariateDistribution)
Return the standard deviation of distribution `d`, i.e. `sqrt(var(d))`.
"""

var(d::VonMises) = 1 - besselix(1, d.κ) / d.I0κx

"""
std(d::UnivariateDistribution)
Return the standard deviation of distribution `d`, i.e. `sqrt(var(d))`.
"""

@itsdfish
Copy link
Author

itsdfish commented Sep 4, 2023

Sorry. I misunderstood that std has an error. I assumed std(d) = sqrt(var(d)).

@mateuszbaran
Copy link

mateuszbaran commented Sep 29, 2023

Technically, the latter seems more consistent with other distributions... It can potentially be confusing if std(d) != std(rand(d, N)).
But are there actual scenarios when non-circular std/var of VonMises are needed?

I find using circular variance for for circular distributions confusing. Non-circular std/var is reasonable to use when the distribution is (approximately) supported on an arc.

Von Mises Fisher distribution can be defined on any-dimensional sphere and has a valid, general notion of variance: http://www.math.ualberta.ca/~thillen/paper/vonmises.pdf . Were it implemented in Distributions.jl, should it use circular or generic definition when the sphere happens to be a circle?

@aplavin
Copy link
Contributor

aplavin commented Dec 29, 2023

For convenience, I added Circular.var() etc overloads for this distribution to DirectionalStatistics.jl – to return circular variance and std. Now, we have there:

Circular.mean(d) == Circular.mean(rand(d, N))
Circular.var(d) == Circular.var(rand(d, N))
Circular.std(d) == Circular.std(rand(d, N))

These relations are very nice to have for consistency!
So, maybe regular Statistics.std(VonMises) and var() should return regular std and var values, not circular ones? Then they'll also enjoy the same consistency.
Or throw an error if not useful/not possible analytically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements of the documentation
Projects
None yet
Development

No branches or pull requests

5 participants