Skip to content

Conversation

@angela-tarantula
Copy link

@angela-tarantula angela-tarantula commented Oct 13, 2025

Fixes #138
Fixes #124

TL;DR

When DiffBuilder optimized a Remove + Add pair into a Move, some operations targeting nested positions within the same array were not reindexed correctly. The undo hooks only compared exact paths, leaving child paths (e.g., /0/x/1/y) unchanged even when /0/x is affected. This PR updates the undo logic to use prefix-based matching on parsed path segments and introduces helpers to safely adjust non-terminal path components.

Background / Problem

You can reproduce #138 and #124 by repeatedly running:

The core failure: once an Add/Remove pair is collapsed into a Move, operations that refer to deeper locations in the same array (e.g., /0/x/1/y) were not having their indices updated, because the undo hooks only matched exact string paths.

Example from #138: at the first relevant undo call, path == '0/x' but self.path == '0/x/1/y'; the equality check fails, so the subsequent operation’s index never decrements.

Root cause

  • _on_undo_remove / _on_undo_add used if self.path == path, which only fires when the target is exactly the array path, not when the target is under that array.
  • PatchOperation only exposed key (last path part), so we couldn’t adjust indices in the middle of the path (e.g., /0/x/1/y).

What this changes

  1. Prefix matching by parts
    • _on_undo_remove / _on_undo_add now receive sub_parts and use _is_prefix(sub_parts, parts) to detect impact on nested paths, not just exact matches.
  2. Part-wise path utilities
    • PatchOperation gains get_part/set_part and small increment/decrement helpers. MoveOperation mirrors these for from paths. This lets undo logic update indices in the middle of a path (not only the last part).
  3. Call sites updated
    • DiffBuilder passes op.pointer.parts[:-1] into the undo hooks so they can adjust the correct array index segment.
  4. Smaller correctness tweak
    • Align exception type in MoveOperation.from_key from TypeError to ValueError (consistent with key).

Design notes

  • Helpers are intentionally minimal: they assume the caller passes a numeric segment where needed; bad calls will surface as errors rather than being silently ignored.
  • We operate on unescaped parts (consistent with existing key semantics). Prefix checks at the parts-level avoid bugs caused by '/' and '~' in string paths.
  • The undo hooks’ signatures change (private methods), which keeps the patch clear and localized.
  • MoveOperation re-parses the from pointer in its helper methods, mirroring the existing from_key pattern. Caching is a possible follow-up if we want.

Backward compatibility

  • Public API: unchanged. Kept the existing path property even though internal logic now prefers parsed parts.
  • Private methods: _on_undo_remove / _on_undo_add now accept sub_parts rather than a string path.

Performance

  • Complexity is unchanged. _is_prefix is O(k) in path length, same as the prior equality check in practice.

Tests

New regression tests (plus a variant with escaping):

  • test_issue_138 — nested array index updates after a move optimization.
  • test_issue_138b — same as above under a key containing '/' (escaping check).
  • test_issue_124 — similar failure mode with different shapes/ordering.

All existing tests pass locally. Coverage is unchanged.

Migration / risk

  • Risk is limited to consumers that subclass and override the private _on_undo hooks; they’ll need to switch to the sub_parts argument. Typical users of the public API are unaffected.
  • Behavior change is strictly corrective. Patches that previously mis-indexed nested paths now apply correctly.

When a RemoveOperation or AddOperation is removed, subsequent operations
must have their loactions adjusted whenever they depended on deleted
operation. Previously, the logic only tested self.path == path, which
fails to account for edge cases where self.path.startswith(path) but
self.path != path.
Necessary for the case where prefix == 'a/b' and path == 'a/bc/d' in
which case it should return False, not True.

I also considered using JsonPointer.contains(), but for some reason
JsonPointer('/a/b').contains(JsonPointer('/')) returns False, so it's easier to just compare strings.
Comparing string paths to determine if one path depends on another's
removal is prone to error due to special characters '/' and '~'.
Comparing parts is the correct approach.
@angela-tarantula angela-tarantula changed the title DiffBuilder: fix reindexing for nested array paths (prefix-based undo) DiffBuilder: fix reindexing of operations involving nested paths Oct 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Wrong Patch Generated Bug: patch didn't catch changed index causing the apply to fail

1 participant