FIX: `assert_identical` now considers `xindexes` + improve `RangeIndex` equals #11035

ianhi · 2025-12-18T20:48:50Z

identical and friends now also compare xindexes, checking that they:

- are the same type
- are equal if they define __equals__
- otherwise, fall back to comparing their coords.
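The three checks above can be sketched roughly as follows. This is a minimal illustration and not xarray's actual implementation: `indexes_identical`, `WithEquals`, and `NoEquals` are hypothetical names, and it assumes the equality hook is a method named `equals`, as used on `RangeIndex` later in this thread.

```python
from typing import Any, Mapping


def indexes_identical(
    a_index: Any,
    b_index: Any,
    a_coords: Mapping[str, list],
    b_coords: Mapping[str, list],
) -> bool:
    """Hypothetical helper mirroring the three checks described above."""
    # 1. The two indexes must be exactly the same type.
    if type(a_index) is not type(b_index):
        return False
    # 2. Prefer the index's own equality method when it defines one.
    equals = getattr(a_index, "equals", None)
    if callable(equals):
        return bool(equals(b_index))
    # 3. Otherwise fall back to comparing the coordinate values.
    return dict(a_coords) == dict(b_coords)


class WithEquals:
    """Hypothetical index that knows how to compare itself."""

    def __init__(self, values):
        self.values = values

    def equals(self, other):
        return self.values == other.values


class NoEquals:
    """Hypothetical index without an equality method."""


same = indexes_identical(WithEquals([1, 2]), WithEquals([1, 2]), {}, {})
mixed = indexes_identical(WithEquals([1, 2]), NoEquals(), {}, {})
by_coords = indexes_identical(NoEquals(), NoEquals(), {"x": [1]}, {"x": [1]})
```

Checking the type first means a `PandasIndex` and a custom index over the same values are reported as different, which matches the "Differing index types" example further down.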

Note for Reviewers

This PR contains the initial fix for assert_identical. However, this uncovered several tests that would otherwise have been failing. The rest of this PR corrects those issues. I tried to keep the fixes in separate commits as much as possible.

RangeIndex - `dbb649a`

RangeIndex was failing assert_identical due to floating point error accumulation after slicing. I updated the RangeIndex equals method to use np.isclose by default, with the ability to fall back to exact comparison.
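As a rough sketch of the idea (tolerant comparison by default, exact comparison on request), here is a standard-library version that uses `math.isclose` in place of `np.isclose`. `SimpleRange`, the `exact` parameter, and the tolerances are assumptions for illustration, not the real `RangeIndex` API. The stop and step values are taken from the reproduction later in this thread, and the start value of 0.1 is assumed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleRange:
    """Hypothetical stand-in for a range-like index (start, stop, step)."""

    start: float
    stop: float
    step: float

    def equals(self, other: "SimpleRange", exact: bool = False) -> bool:
        mine = (self.start, self.stop, self.step)
        theirs = (other.start, other.stop, other.step)
        if exact:
            # Strict comparison: any floating point difference counts.
            return mine == theirs
        # Tolerant comparison: ignore differences at the rounding-error level.
        return all(
            math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
            for a, b in zip(mine, theirs)
        )


sliced = SimpleRange(0.1, 0.30000000000000004, 0.10000000000000002)
fresh = SimpleRange(0.1, 0.3, 0.09999999999999999)

tolerant_result = sliced.equals(fresh)
exact_result = sliced.equals(fresh, exact=True)
```

With these inputs the tolerant comparison treats the two ranges as equal while the exact one does not.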

test_concat - `dcfc46e`

It seems like this test was not checking the correct behavior. I think this changed at some point after #6385 and it wasn't caught by the identical check.

groupby intervalindex - `1a1219a`

Similar to concat: the current behavior is that an IntervalIndex gets constructed.

timedelta dtype - `6214d4f`

I think this was a typo that wasn't caught. The data is encoded with s, so I would expect it to come back with s, not ns.

The thing I feel least confident about is how nice the formatting of the assertion error diff looks. Some examples (from the tests):

AssertionError: Left and right Dataset objects are not identical
Indexes only on the right object: ['time_metadata']

E       AssertionError: Left and right DataArray objects are not identical
E       Differing indexes:
E       L   group_bins           IntervalIndex([(-2, -1], (-1, 0], (0, 1], (1, 2]], dtype='interval[int64, right]', name='group_bins')
E       R   group_bins           Index([(-2, -1], (-1, 0], (0, 1], (1, 2]], dtype='object', name='group_bins')

AssertionError: Left and right Dataset objects are not identical
Differing indexes:
    Indexes only on the left object: ['x']
    Indexes only on the right object: ['y']

AssertionError: Left and right Dataset objects are not identical
Differing indexes:
    Differing index types: ['x: PandasIndex vs CustomIndex']

AssertionError: Left and right Dataset objects are not identical
Differing coordinates:
L * x        (z) int64 32B 10 10 20 20
R * x        (z) int64 32B 10 20 10 20
L * y        (z) <U1 16B 'a' 'b' 'a' 'b'
R * y        (z) <U1 16B 'a' 'a' 'b' 'b'
L * z        (z) object 32B MultiIndex
R * z        (z) object 32B MultiIndex
Differing data variables:
L   data     (z) int64 32B 1 2 3 4
R   data     (z) int64 32B 1 3 2 4
Differing indexes:
    Differing index values: ['x', 'y', 'z']
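For readers who want to see how messages of this shape could be assembled, here is a simplified sketch. It is not the `diff_indexes_repr` added in this PR: `diff_index_names` is a hypothetical helper that takes plain dicts mapping coordinate names to index objects, and it only covers the "only on left/right" and "differing types" cases.

```python
def diff_index_names(a_indexes: dict, b_indexes: dict) -> str:
    """Hypothetical, simplified formatter for index differences."""
    lines = []
    only_left = sorted(set(a_indexes) - set(b_indexes))
    only_right = sorted(set(b_indexes) - set(a_indexes))
    if only_left:
        lines.append(f"    Indexes only on the left object: {only_left}")
    if only_right:
        lines.append(f"    Indexes only on the right object: {only_right}")
    # For names present on both sides, report mismatched index types.
    type_diffs = [
        f"{name}: {type(a_indexes[name]).__name__} vs {type(b_indexes[name]).__name__}"
        for name in sorted(set(a_indexes) & set(b_indexes))
        if type(a_indexes[name]) is not type(b_indexes[name])
    ]
    if type_diffs:
        lines.append(f"    Differing index types: {type_diffs}")
    if not lines:
        return ""
    return "\n".join(["Differing indexes:", *lines])


# Plain Python objects stand in for real index objects here.
missing_report = diff_index_names({"x": 1}, {"y": 2})
type_report = diff_index_names({"x": 1}, {"x": "a"})
```

Returning an empty string when nothing differs lets a caller append the section to the overall assertion message only when it has content.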

🤖 Ideas and directions mine, typing by claude. I went through a few rounds of local review and revision before opening.

- Closes "Should assert_identical also compare indexes?" #11033
- Tests added
- User visible changes (including notable bug fixes) are documented in whats-new.rst
- [NA?] New functions/methods are listed in api.rst

ianhi · 2025-12-18T20:50:32Z

Thinking about @keewis's comment: #11033 (comment)

Here I have it attempt to compare via __equals__ if available, and only then fall back to a coord-level comparison. But I have not changed indexes_equal to do a similar thing. We can add that here if you'd like, or leave that function be.

ianhi · 2025-12-18T20:58:37Z

Oh no. I should have run the full test suite locally instead of just my new ones

Not super stoked to go in and modify many tests 🤔

FAILED xarray/tests/test_backends.py::TestGenericNetCDFData::test_roundtrip_timedelta_data - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['td']
FAILED xarray/tests/test_backends.py::TestScipyInMemoryData::test_roundtrip_timedelta_data - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['td']
FAILED xarray/tests/test_backends.py::TestScipyFileObject::test_roundtrip_timedelta_data - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['td']
FAILED xarray/tests/test_backends.py::TestScipyFilePath::test_roundtrip_timedelta_data - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['td']
FAILED xarray/tests/test_formatting.py::TestFormatting::test_diff_array_repr - AssertionError: assert 'Left and rig...ription: desc' == 'Left and rig...ription: desc'
  
  Skipping 401 identical leading characters in diff, use -v to show
    4 16B 1 2
  + Indexes only on the left object:  ['y']
  + Indexes with differing values: ['x']
    Differing attributes:
    L   units: m
    R   units: kg
    Attributes only on the left object:
        description: desc
FAILED xarray/tests/test_formatting.py::TestFormatting::test_diff_dataset_repr - AssertionError: assert 'Left and rig...ription: desc' == 'Left and rig...ription: desc'
  
  Skipping 606 identical leading characters in diff, use -v to show
    4 16B 3 4
  + Indexes only on the left object:  ['y']
  + Indexes with differing values: ['x']
    Differing attributes:
    L   title: mytitle
    R   title: newtitle
    Attributes only on the left object:
        description: desc
FAILED xarray/tests/test_concat.py::TestConcatDataset::test_concat_promote_shape_with_scalar_coordinates - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['x']
FAILED xarray/tests/test_groupby.py::TestDataArrayGroupBy::test_groupby_bins_multidim - AssertionError: Left and right DataArray objects are not identical
Indexes with differing values: ['group_bins']
FAILED xarray/tests/test_dataset.py::TestDataset::test_rename_dims - AssertionError: Left and right Dataset objects are not identical
Indexes only on the right object: ['x']
FAILED xarray/tests/test_dataset.py::TestDataset::test_rename_vars - AssertionError: Left and right Dataset objects are not identical
Indexes only on the right object: ['x_new']
FAILED xarray/tests/test_dataset.py::TestDataset::test_expand_dims_create_index_from_iterable - AssertionError: Left and right Dataset objects are not identical
Indexes only on the left object:  ['x']
FAILED xarray/tests/test_dataset.py::TestDataset::test_to_and_from_dict_with_nan_nat[array] - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['t']
FAILED xarray/tests/test_groupby.py::test_multiple_groupers_mixed[True-True] - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['x_bins']
FAILED xarray/tests/test_groupby.py::test_multiple_groupers_mixed[True-False] - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['x_bins']
FAILED xarray/tests/test_range_index.py::test_range_index_isel - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['x']
FAILED xarray/tests/test_range_index.py::test_range_index_sel - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['x']
FAILED xarray/tests/test_groupby.py::test_multiple_groupers_mixed[False-True] - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['x_bins']
FAILED xarray/tests/test_groupby.py::test_multiple_groupers_mixed[False-False] - AssertionError: Left and right Dataset objects are not identical
Indexes with differing values: ['x_bins']

ianhi · 2025-12-18T21:03:30Z

draft until tests are something resembling passing

ianhi · 2025-12-18T21:33:27Z

Ok so a number of the test failures are basically floating point accumulation errors. e.g.:

import xarray as xr
from xarray.indexes import RangeIndex

# Create a RangeIndex-backed dataset
index = RangeIndex.arange(0.0, 1.0, 0.1, dim='x')
ds = xr.Dataset(coords=xr.Coordinates.from_xindex(index))

# Slice it
sliced = ds.isel(x=slice(1, 3))

# Create a fresh RangeIndex with the 'same' values
fresh_index = RangeIndex.arange(0.1, 0.3, 0.1, dim='x')
fresh = xr.Dataset(coords=xr.Coordinates.from_xindex(fresh_index))

# Compare the indexes
sliced_idx = sliced.xindexes['x']
fresh_idx = fresh.xindexes['x']

print('Both have the same coordinate values:')
print(f'  sliced.x.values: {sliced.x.values}')  # [0.1 0.2]
print(f'  fresh.x.values:  {fresh.x.values}')   # [0.1 0.2]

print('But the internal RangeIndex state differs due to floating point:')
print(f'  sliced: stop={sliced_idx.stop}, step={sliced_idx.step}')
# sliced: stop=0.30000000000000004, step=0.10000000000000002
print(f'  fresh:  stop={fresh_idx.stop}, step={fresh_idx.step}')
# fresh:  stop=0.3, step=0.09999999999999999

print(f'sliced_idx.equals(fresh_idx): {sliced_idx.equals(fresh_idx)}')  # False
print(f'sliced.identical(fresh): {sliced.identical(fresh)}')  # False

gives:

Both have the same coordinate values:
  sliced.x.values: [0.1 0.2]
  fresh.x.values:  [0.1 0.2]
But the internal RangeIndex state differs due to floating point:
  sliced: stop=0.30000000000000004, step=0.10000000000000002
  fresh:  stop=0.3, step=0.09999999999999999
sliced_idx.equals(fresh_idx): False
sliced.identical(fresh): False

So I have added a backwards-compatibility path: passing check_default_indexes=False now also implies not checking indexes for identity. This makes far fewer tests fail, which is nice, but it is probably still worthwhile to go through them all one by one.
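The underlying floating point effect can be reproduced with plain Python floats, independent of xarray. This is only an illustration of the arithmetic, not a claim about how RangeIndex computes its stop and step internally.

```python
import math

start, step = 0.1, 0.1

# Reaching "0.3" by accumulating two steps from the start value...
accumulated_stop = start + 2 * step
# ...versus writing the same endpoint as a literal.
literal_stop = 0.3

exactly_equal = accumulated_stop == literal_stop
approximately_equal = math.isclose(accumulated_stop, literal_stop)

print(accumulated_stop)      # 0.30000000000000004
print(exactly_equal)         # False
print(approximately_equal)   # True
```

An exact comparison of the two endpoints fails even though they describe the same coordinate values, which is why a tolerance-based comparison is the friendlier default for range-like indexes.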

Review note on the test_concat fix (dcfc46e). Without this you get:

AssertionError: Left and right Dataset objects are not identical
Differing indexes:
L   x  IntervalIndex([(-1, 0], (0, 1]], dtype='interval[int64, right]', name='x')
R   x  Index([(-1, 0], (0, 1]], dtype='object', name='x')

Review note on the timedelta test fix (6214d4f): this just matches what it is set to above. It was not caught before by assert_identical.

max-sixty · 2025-12-18T22:22:34Z

I would support making incremental changes if that lets us make changes — e.g. make the change to the function, fix a few of the tests, but then have an LLM set some flag check_indexes=False and a TODO in 50 places

and then future contributions can work through the 50 places...

ianhi · 2025-12-18T22:31:41Z

> I would support making incremental changes if that lets us make changes, e.g. make the change to the function, fix a few of the tests, but then have an LLM set some flag check_indexes=False and a TODO in 50 places
>
> and then future contributions can work through the 50 places...

I got sucked into a rhythm. I've fixed most of the issues and left a commit-by-commit breakdown in the first comment. The remaining ones, which I ran out of steam to fully fix, are the changes in test_units and test_dataset, where I use an escape hatch with a TODO: 27b4275

keewis · 2025-12-18T22:30:29Z

xarray/core/formatting.py

        return compat


+def diff_indexes_repr(a_indexes, b_indexes, col_width: int = 20) -> str:


I've implemented something very similar: https://github.com/xarray-contrib/xdggs/blob/52d8b1dd23bf809757c7e3f5c04945129f6905af/xdggs/tests/__init__.py#L64-L105

oh neat! I'll take a close look tomorrow. Is there anything different we should do here that would have made your xdggs use cases easier?

there's not much that's different (the diff formatting is slightly different). However, compared to indexes_equal it may be worth grouping indexes with indexes.group_by_index() (which would mean we don't have to worry about caching)
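To illustrate the grouping idea: when several coordinates share one index object (as x, y and z do in the MultiIndex example in the PR description), grouping by index means each index is compared once instead of once per coordinate. The sketch below is a simplified stand-in and not xarray's `group_by_index()`; `group_coords_by_index` and `FakeMultiIndex` are hypothetical names.

```python
from typing import Any, Hashable


def group_coords_by_index(
    indexes: dict[Hashable, Any],
) -> list[tuple[Any, list[Hashable]]]:
    """Hypothetical helper: group coordinate names by the index backing them."""
    groups: dict[int, tuple[Any, list[Hashable]]] = {}
    for name, index in indexes.items():
        # Key on object identity so that coordinates backed by the same
        # index object land in the same group.
        groups.setdefault(id(index), (index, []))[1].append(name)
    return list(groups.values())


class FakeMultiIndex:
    """Hypothetical index shared by several coordinates."""


midx = FakeMultiIndex()
plain = object()
grouped = group_coords_by_index({"x": midx, "y": midx, "z": midx, "t": plain})
group_names = [names for _, names in grouped]
```

Here the three coordinates backed by `midx` collapse into a single group, so a comparison loop over `grouped` touches that index only once and no per-index result cache is needed.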

keewis · 2025-12-18T22:32:10Z

xarray/core/formatting.py

+            try:
+                a_repr = inline_index_repr(
+                    a_indexes.to_pandas_indexes()[key], max_width=70
+                )
+                b_repr = inline_index_repr(
+                    b_indexes.to_pandas_indexes()[key], max_width=70
+                )
+            except TypeError:
+                # Custom indexes may not support to_pandas_index()
+                a_repr = repr(a_idx)
+                b_repr = repr(b_idx)


might be worth calling index._repr_inline_(max_width=70) with a fallback to repr(index)

Is this a well-defined API for a custom index to support? Definitely happy to add it; I'm just also wondering whether the knowledge that it is helpful is (or should be) written down somewhere.

we already use those in inline_index_repr, so yes, this should be well defined.

This should definitely be part of the custom index development page, and worth adding if it is not already part of that.
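The suggestion in this thread (prefer `_repr_inline_(max_width=70)` and fall back to `repr(index)`) could look roughly like this. `inline_repr_or_fallback`, `WithInline`, and `Plain` are hypothetical names for illustration; the sketch only assumes that an index may or may not define `_repr_inline_` accepting a `max_width` argument.

```python
def inline_repr_or_fallback(index: object, max_width: int = 70) -> str:
    """Hypothetical helper: use _repr_inline_ if present, else repr()."""
    repr_inline = getattr(index, "_repr_inline_", None)
    if callable(repr_inline):
        return repr_inline(max_width=max_width)
    # Indexes without an inline repr still get a usable description.
    return repr(index)


class WithInline:
    """Hypothetical index that provides a compact inline repr."""

    def _repr_inline_(self, max_width: int) -> str:
        return "WithInline(...)"[:max_width]


class Plain:
    """Hypothetical index that only defines __repr__."""

    def __repr__(self) -> str:
        return "Plain()"


inline_result = inline_repr_or_fallback(WithInline())
fallback_result = inline_repr_or_fallback(Plain())
```

Looking the method up with `getattr` avoids a `try`/`except TypeError` around a pandas conversion, so custom indexes that cannot be converted to pandas are handled by the same code path.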

keewis · 2025-12-18T22:35:59Z

> I ran out of steam to fully fix are the changes in test_units

test_units predates the custom indexes, which means it tries not to create any indexes (units would get stripped by the pandas index). If there are indexes anyways it might be worth marking (those would be bugs)

ianhi · 2025-12-18T22:45:06Z

> If there are indexes anyways it might be worth marking (those would be bugs)

see: 27b4275

I'm happy to open an issue about this instead of fixing it here, if I understand correctly that there is a bug here?

keewis · 2025-12-18T22:48:13Z

that would be great, thanks. In the long run I'd like to replace those with the tests in xarray-array-testing but didn't have time to make progress on that.

max-sixty · 2025-12-21T03:32:25Z

thanks @ianhi !

ianhi added 4 commits December 18, 2025 15:30

FIX: assert_identical compares xindexes

81615ed

move logic to be more consistent

287812f

combine tests to simplify

6f27ac7

add reprs to dataset fixtures

7fce3c3

github-actions bot added the topic-indexing label Dec 18, 2025

ianhi changed the title ~~True ident~~ FIX: assert_identical now considers xindexes Dec 18, 2025

ianhi changed the title ~~FIX: assert_identical now considers xindexes~~ FIX: assert_identical now considers xindexes Dec 18, 2025

ianhi marked this pull request as draft December 18, 2025 21:03

update formatting tests

0ac4870

backcompat to escape index checks

2167013

make range_index equals float error tolerant

dbb649a

ianhi changed the title ~~FIX: assert_identical now considers xindexes~~ FIX: assert_identical now considers xindexes + improve RangeIndex equals Dec 18, 2025

ianhi added 7 commits December 18, 2025 17:00

RangeIndex whatsnew

b2ffd28

clean up whatsnew

eccb09c

better index value diff formatting

6e4099a

formatting change tests

348306c

fix test_concat test

dcfc46e

update groupby tests for IntervalIndex

1a1219a

correct timedelta encoding in test

6214d4f

escape remaining failing tests with TODO notes

27b4275

keewis reviewed Dec 18, 2025

remove over-verbose AI addition

789b824

ianhi marked this pull request as ready for review December 18, 2025 22:44

ianhi mentioned this pull request Dec 18, 2025

📚 Add repr information to CustomIndex docs #11036

Open

max-sixty added the plan to merge Final call for comments label Dec 19, 2025

max-sixty merged commit 0c07685 into pydata:main Dec 21, 2025
52 checks passed

spencerkclark mentioned this pull request Dec 22, 2025

⚠️ Nightly upstream-dev CI failed ⚠️ #11043

Closed

sshekhar563 mentioned this pull request Dec 22, 2025

Add _repr_inline_ documentation to CustomIndex docs #11046

Open

ianhi deleted the true-ident branch December 22, 2025 15:47

martinfleis mentioned this pull request Dec 29, 2025

⚠️ Nightly upstream-dev CI failed ⚠️ xarray-contrib/xvec#136

Closed
