Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Equality of OrderedDict #82

Open
lucaferranti opened this issue Dec 31, 2021 · 7 comments
Open

Equality of OrderedDict #82

lucaferranti opened this issue Dec 31, 2021 · 7 comments
Labels
breaking Change breaks behavior and so needs to go into a major release

Comments

@lucaferranti
Copy link

I am a little confused by the following, is it intentional that == does not take into account the order of keys for OrderedDict?

julia> a1 = OrderedDict(:A => 1, :B => 2, :C => 3)
OrderedDict{Symbol, Int64} with 3 entries:
  :A => 1
  :B => 2
  :C => 3

julia> a2 = OrderedDict(:B => 2, :A => 1, :C => 3)
OrderedDict{Symbol, Int64} with 3 entries:
  :B => 2
  :A => 1
  :C => 3

julia> a1 == a2
true

this is mainly motivated by testing purposes, as I think in tests one wants to check that also the keys order is correct. Well, arguably one can always do

@test all(kv1 == kv2 for (kv1, kv2) in zip(a1, a2))

so maybe it's not a biggie.

@oxinabox
Copy link
Member

oxinabox commented Jan 11, 2022

This is because it is falling back to the generic == on AbstractDicts, which are unordered.

julia> od = OrderedDict(1=>2)
OrderedDict{Int64, Int64} with 1 entry:
  1 => 2

julia> @which ==(od, od)
==(l::AbstractDict, r::AbstractDict) in Base at abstractdict.jl:468

Whether or not this is sensible is not clear to me.

For what it is worth our behavour disagrees with python.

>>> from collections import OrderedDict
>>>  d = OrderedDict.fromkeys('abcde')
>>>  d2 = OrderedDict.fromkeys('edcba')
>>> d==d2
False

>>> d==d
True

Python treats ordered dicts as equal only if they have the same contents in the same order)

@einarpersson
Copy link

I definitely expected that == would differentiate between two OrderedDicts if they differ in order. Isn´t that the whole point, i.e. saying that "no these are not same despite having the same elements, because they are in different order".

@jariji
Copy link

jariji commented Sep 1, 2023

For any type T, x::T and y::T should not have x == y unless x and y agree on all aspects of their public API. Since order is part of the public API of Ordered containers, they should not be ==(::T, ::T) unless they agree on order.

@fingolfin
Copy link
Contributor

@jariji that is a very vague characterization, it is not clear to me what that really would mean.

If I understand you correctly (and I may not) then Julia is violating this principle already in many ways. E.g.:

julia> 1 == true
true

julia> !true
false

julia> !1
ERROR: MethodError: no method matching !(::Int64)
The function `!` exists, but no method is defined for this combination of argument types.

Arguably ! is part of the public API of booleans. And it not working on other integers also is public behavior.

OK perhaps you'll argue this doesn't count because one is an error. So how about this:

julia> x = Int8(127); y = UInt8(127);

julia> x == y
true

julia> x + x == y + y
false

julia> x + x == x + y
false

@fingolfin fingolfin added the breaking Change breaks behavior and so needs to go into a major release label Nov 25, 2024
@fingolfin
Copy link
Contributor

Don't get me wrong, this is not meant to argue that the current behavior is right / correct / preferable / whatever, just that I don't buy this particular argument.

Another aspect to consider is that changing the behavior of == would be a breaking change, i.e. it would have to go into a major release like 2.0.0.

Unlike some other breaking changes we may want to apply, like fixing some type piracy issues (see issue #25 ) this one is more tricky as it has the potential to introduce one of the worst kinds of bugs: code that previously worked correctly in downstream packages could be suddenly start producing wrong results. That is really undesirable.

Thus, personally I think the ship has sailed on this one. If this was day 1, I'd be fine with taking the order into account, but there are now > 350 direct dependencies on this package in the ecosystem, and a couple thousand indirect one. Thus as outlined I believe changing this risks breaking too much code in a really bad way. The gain seems minimal.

So I would say "By now, this is indeed as intended", and we should document it explicitly as such.

To still help users, perhaps we could introduce a "compare while taking order into account" method. Say is_strictly_equal or so

Also perhaps it would help to add a new StrictOrderedDict type which behaves like OrderedDict except for equality (and perhaps a few additional points). To implement this, one could add another type parameter strict to OrderedDict, so the implementation would be the same. Roughly like this:

mutable struct GenOrderedDict{K,V,strict} <: AbstractDict{K,V}
    slots::Vector{Int32}
    keys::Vector{K}
    vals::Vector{V}
    ndel::Int
    maxprobe::Int
    dirty::Bool
end

const OrderedDict{K,V} = GenOrderedDict{K,V,false}
const StrictOrderedDict{K,V} = GenOrderedDict{K,V,true}

Of course then one still would have to duplicate a bunch of constructors and adjust other places to use GenOrderedDict instead of OrderedDict

Any code which currently only expects an OrderedDict would not accept a StrictOrderedDict out of box -- but that's a feature, not a bug, as the idea precisely was to not sneak in the changed behavior but rather require code to explicitly support it...

@jariji
Copy link

jariji commented Nov 25, 2024

@fingolfin I say "For any type T, x::T and y::T ...", assuming x and y have the same type. But in the example of x::Bool; y::Int; x==y, x and y have different types. I was not making an argument about what == means across different types, which is more complicated. I'm only addressing the simpler case of a single type.

@fingolfin
Copy link
Contributor

but there is also the matter of transitivity: x == y and y == z should perhaps imply x == z (granted I am sure Julia also violates that already).

In any case it does not change the fact that surely there is code out there relying on the current behavior, so changing it would be a (painful) breaking change

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Change breaks behavior and so needs to go into a major release
Projects
None yet
Development

No branches or pull requests

5 participants