
refactor: store hnsw metadata together with index#7242

Open
BorysTheDev wants to merge 5 commits into main from move_metadata_to_index_serialization


BorysTheDev commented Apr 30, 2026

Summary: Refactors HNSW snapshot/replication serialization so the entry-point metadata is stored inline with the vector-index opcode stream instead of in a separate AUX JSON field.

Changes:

  • Embed enterpoint_node into RDB_OPCODE_VECTOR_INDEX immediately after index_key
  • Remove save/load of the hnsw-index-metadata AUX field and pending-metadata tracking
  • Update restore path to pass HnswIndexMetadata through the vector-index handler and deferred-node structs
  • Harden RestoreFromNodes by validating wire ordering (internal_id == i) and using nodes.size() for capacity
  • Simplify post-load rebuild logic: each index decides whether to use the “restored” rebuild path based on its own graph + key-index state
  • Always drain buffered vector updates after post-load rebuild completes (per-index no-op when nothing was buffered)

Technical Notes: This changes the on-wire layout of RDB_OPCODE_VECTOR_INDEX and removes the separate AUX metadata emission/consumption, so mixed-version compatibility needs to be considered wherever RDB/full-sync is used.
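To make the layout change concrete, here is a minimal sketch of a record header that carries the entry point inline, in the order the description gives: index_key, then enterpoint_node, then elements_number. Everything here is an assumption for illustration: `VectorIndexHeader`, `EncodeHeader`, `DecodeHeader`, and the fixed-width little-endian integers are hypothetical and do not reproduce the real RDB length encoding or the PR's actual code.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical header for one vector-index record. Field order follows the
// PR description: index_key, then enterpoint_node, then elements_number.
struct VectorIndexHeader {
  std::string index_key;
  uint64_t enterpoint_node = 0;
  uint64_t elements_number = 0;
};

// Toy fixed-width little-endian integer (assumption: the real RDB stream
// uses its own length encoding, which is not reproduced here).
inline void PutU64(std::string* out, uint64_t v) {
  for (int i = 0; i < 8; ++i)
    out->push_back(static_cast<char>((v >> (8 * i)) & 0xFF));
}

inline bool GetU64(std::string_view* in, uint64_t* v) {
  if (in->size() < 8) return false;
  *v = 0;
  for (int i = 0; i < 8; ++i)
    *v |= static_cast<uint64_t>(static_cast<unsigned char>((*in)[i])) << (8 * i);
  in->remove_prefix(8);
  return true;
}

inline std::string EncodeHeader(const VectorIndexHeader& h) {
  std::string out;
  PutU64(&out, h.index_key.size());
  out += h.index_key;
  PutU64(&out, h.enterpoint_node);  // inline metadata, right after index_key
  PutU64(&out, h.elements_number);
  return out;
}

// Returns nullopt on a truncated buffer instead of reading past the end.
inline std::optional<VectorIndexHeader> DecodeHeader(std::string_view in) {
  VectorIndexHeader h;
  uint64_t key_len = 0;
  if (!GetU64(&in, &key_len) || in.size() < key_len) return std::nullopt;
  h.index_key.assign(in.data(), key_len);
  in.remove_prefix(key_len);
  if (!GetU64(&in, &h.enterpoint_node)) return std::nullopt;
  if (!GetU64(&in, &h.elements_number)) return std::nullopt;
  return h;
}
```

A reader that expects the old layout would interpret the enterpoint_node bytes as the element count, which is why the mixed-version concern applies to any change of this shape.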

Copilot AI review requested due to automatic review settings April 30, 2026 08:19

augmentcode Bot commented Apr 30, 2026

🤖 Augment PR Summary

Summary: Refactors HNSW snapshot/full-sync serialization so the graph entry-point metadata is stored inline with the vector-index opcode stream (instead of a separate AUX JSON field).

Changes:

  • Extend RDB_OPCODE_VECTOR_INDEX payload to include enterpoint_node after index_key
  • Remove save/load of the hnsw-index-metadata AUX field and related pending-metadata plumbing
  • Thread HnswIndexMetadata through the vector-index handler and deferred-restore structs
  • Harden HNSW restore by enforcing wire-ordering (nodes[i].internal_id == i) and deriving capacity from nodes.size()
  • Make post-load rebuild choose the “restored” path per index based on actual graph+key-index state
  • After post-load rebuild, drain any buffered vector updates to transition indices back to normal operation

Technical Notes: This updates the on-wire layout for RDB_OPCODE_VECTOR_INDEX and removes the AUX emission/consumption, so mixed-version RDB/full-sync compatibility needs to be considered where applicable.
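The per-index rebuild decision and the unconditional drain described in the last two bullets could look roughly like the sketch below. The type `IndexLoadState`, the functions `ChooseRebuildPath` and `DrainBufferedUpdates`, and the exact predicate (a non-empty graph plus a non-empty key index) are assumptions made for illustration; the PR's real condition may differ.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the state one index holds after loading.
struct IndexLoadState {
  size_t graph_nodes = 0;        // nodes restored into the HNSW graph
  size_t key_index_entries = 0;  // documents known to the key index
  std::vector<std::function<void()>> buffered_updates;
};

enum class RebuildPath { kRestored, kFull };

// Assumption: the "restored" path is usable only when a graph was actually
// loaded and the key index is populated. Each index decides from its own
// state rather than from a global flag.
inline RebuildPath ChooseRebuildPath(const IndexLoadState& s) {
  const bool restored = s.graph_nodes > 0 && s.key_index_entries > 0;
  return restored ? RebuildPath::kRestored : RebuildPath::kFull;
}

// Drains unconditionally and returns how many updates were applied.
// When nothing was buffered this is a no-op that returns 0.
inline size_t DrainBufferedUpdates(IndexLoadState* s) {
  auto pending = std::move(s->buffered_updates);
  s->buffered_updates.clear();  // leave the moved-from vector in a known state
  for (auto& apply : pending) apply();
  return pending.size();
}
```

Draining every index regardless of which path it took avoids tracking whether any updates were buffered in the first place.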


@augmentcode augmentcode Bot left a comment


Review completed. 4 suggestions posted.


Comment thread src/server/rdb_load.cc
Comment thread src/server/rdb_load.cc
Comment thread src/server/rdb_load.cc
Comment thread src/core/search/hnsw_index.cc

Copilot AI left a comment


Pull request overview

Refactors HNSW (vector index) replication/RDB serialization to store the HNSW entry-point metadata inline with RDB_OPCODE_VECTOR_INDEX node data, removing the separate hnsw-index-metadata AUX field.

Changes:

  • Extend RDB_OPCODE_VECTOR_INDEX payload to include enterpoint_node ahead of elements_number.
  • Remove collection/serialization/loading of hnsw-index-metadata AUX data and plumb metadata through pending-node restoration.
  • Update HNSW restore logic to derive capacity from nodes.size() and validate the entry point by relying on the wire-ordering invariant (internal_id == i).
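As an illustration of the restore hardening in the last bullet, the sketch below checks that nodes arrive in wire order, bounds-checks the entry point in O(1), and takes capacity from nodes.size(). `NodeData`, `RestorePlan`, and `ValidateNodes` are hypothetical names, and the rule that the entry point must index into the node vector is an assumption, not the PR's confirmed check.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical node record; only the field needed for the check is shown.
struct NodeData {
  uint32_t internal_id = 0;
};

struct RestorePlan {
  bool ok;
  size_t capacity;
  std::string error;
};

inline RestorePlan ValidateNodes(const std::vector<NodeData>& nodes,
                                 uint64_t enterpoint_node) {
  // Wire-ordering invariant: the i-th node on the wire has internal_id == i.
  for (size_t i = 0; i < nodes.size(); ++i) {
    if (nodes[i].internal_id != i)
      return {false, 0, "node " + std::to_string(i) + " is out of order"};
  }
  // With the invariant established, the entry point can be validated by a
  // bounds check instead of a search through the nodes (assumed rule).
  if (!nodes.empty() && enterpoint_node >= nodes.size())
    return {false, 0, "entry point is out of range"};
  // Capacity comes from what was actually received, not from a separate
  // count field that could disagree with the payload.
  return {true, nodes.size(), ""};
}
```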

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| src/server/search/serialization_utils.cc | Writes enterpoint_node inline as part of vector index opcode serialization. |
| src/server/rdb_save.h | Removes hnsw_index_metadata from global snapshot header data. |
| src/server/rdb_save.cc | Stops collecting/writing hnsw-index-metadata AUX fields. |
| src/server/rdb_load_context.h | Removes pending AUX metadata storage; stores metadata with deferred nodes; adds restored-flag. |
| src/server/rdb_load_context.cc | Uses deferred-node metadata directly; replaces metadata list with a restored-flag. |
| src/server/rdb_load.h | Updates RestoreVectorIndex API to accept metadata parameter. |
| src/server/rdb_load.cc | Parses new vector-index wire format and passes metadata through restore/defer paths. |
| src/core/search/hnsw_index.h | Updates metadata documentation to reflect inline restore needs. |
| src/core/search/hnsw_index.cc | Updates restore path to use O(1) entry-point checks and nodes.size()-based capacity. |

Comment thread src/server/search/serialization_utils.cc
Comment thread src/server/rdb_load.cc
Comment thread src/server/rdb_load.cc Outdated
Comment thread src/server/rdb_load_context.h
Comment thread src/core/search/hnsw_index.cc
@BorysTheDev
Contributor Author

augment review


@augmentcode augmentcode Bot left a comment


Review completed. 2 suggestions posted.


Comment thread src/server/search/serialization_utils.cc
Comment thread src/server/rdb_load_context.cc
@BorysTheDev
Contributor Author

augment review


@augmentcode augmentcode Bot left a comment


Review completed. 2 suggestions posted.


Comment thread src/core/search/hnsw_index.cc
Comment thread src/server/rdb_load_context.cc Outdated
@BorysTheDev
Contributor Author

augment review


@augmentcode augmentcode Bot left a comment


Review completed. 1 suggestion posted.


Comment thread src/core/search/hnsw_index.cc Outdated
@BorysTheDev
Contributor Author

augment review


@augmentcode augmentcode Bot left a comment


Review completed. No suggestions at this time.


2 participants