refactor: store hnsw metadata together with index #7242
Augment PR Summary
Summary: Refactors HNSW snapshot/full-sync serialization so the graph entry-point metadata is stored inline with the vector-index opcode stream (instead of a separate AUX JSON field).
Changes:
Technical Notes: This updates the on-wire layout for …
Pull request overview
Refactors HNSW (vector index) replication/RDB serialization to store the HNSW entry-point metadata inline with RDB_OPCODE_VECTOR_INDEX node data, removing the separate hnsw-index-metadata AUX field.
Changes:
- Extend `RDB_OPCODE_VECTOR_INDEX` payload to include `enterpoint_node` ahead of `elements_number`.
- Remove collection/serialization/loading of `hnsw-index-metadata` AUX data and plumb metadata through pending-node restoration.
- Update HNSW restore logic to derive capacity from `nodes.size()` and validate entry-point via wire ordering assumptions.
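
To make the field order described above concrete, here is a minimal sketch of a payload header that writes `enterpoint_node` before `elements_number` and reads them back in the same order. Everything except those two field names is an assumption for illustration: the `VectorIndexHeader` struct, the `WriteHeader`/`ReadHeader` helpers, and the fixed-width little-endian encoding are hypothetical and do not reflect the actual RDB length encoding or the real serializer in `serialization_utils.cc`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical header for one vector-index payload.
// Only the two field names come from the PR description.
struct VectorIndexHeader {
  uint64_t enterpoint_node;  // written first (new inline metadata)
  uint64_t elements_number;  // written second
};

// Appends v as 8 little-endian bytes.
// Assumption: illustrative encoding, not the real RDB length encoding.
inline void PutU64(std::vector<uint8_t>& out, uint64_t v) {
  for (int i = 0; i < 8; ++i) {
    out.push_back(static_cast<uint8_t>(v >> (8 * i)));
  }
}

// Reads 8 little-endian bytes at pos and advances pos.
// Returns nullopt if fewer than 8 bytes remain.
inline std::optional<uint64_t> GetU64(const std::vector<uint8_t>& in, size_t& pos) {
  if (pos > in.size() || in.size() - pos < 8) return std::nullopt;
  uint64_t v = 0;
  for (int i = 0; i < 8; ++i) {
    v |= static_cast<uint64_t>(in[pos + i]) << (8 * i);
  }
  pos += 8;
  return v;
}

// Writer side: entry point goes ahead of the element count.
inline void WriteHeader(std::vector<uint8_t>& out, const VectorIndexHeader& h) {
  PutU64(out, h.enterpoint_node);
  PutU64(out, h.elements_number);
}

// Reader side: must consume the fields in exactly the same order.
inline std::optional<VectorIndexHeader> ReadHeader(const std::vector<uint8_t>& in, size_t& pos) {
  auto enterpoint = GetU64(in, pos);
  if (!enterpoint) return std::nullopt;
  auto count = GetU64(in, pos);
  if (!count) return std::nullopt;
  return VectorIndexHeader{*enterpoint, *count};
}
```

Because the reader consumes fields positionally, a loader that expects the old layout would interpret `enterpoint_node` as `elements_number`, which is why the field order matters for mixed-version setups.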
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/server/search/serialization_utils.cc | Writes enterpoint_node inline as part of vector index opcode serialization. |
| src/server/rdb_save.h | Removes hnsw_index_metadata from global snapshot header data. |
| src/server/rdb_save.cc | Stops collecting/writing hnsw-index-metadata AUX fields. |
| src/server/rdb_load_context.h | Removes pending AUX metadata storage; stores metadata with deferred nodes; adds restored-flag. |
| src/server/rdb_load_context.cc | Uses deferred-node metadata directly; replaces metadata list with a restored-flag. |
| src/server/rdb_load.h | Updates RestoreVectorIndex API to accept metadata parameter. |
| src/server/rdb_load.cc | Parses new vector-index wire format and passes metadata through restore/defer paths. |
| src/core/search/hnsw_index.h | Updates metadata documentation to reflect inline restore needs. |
| src/core/search/hnsw_index.cc | Updates restore path to use O(1) entry-point checks and nodes.size()-based capacity. |
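
The `hnsw_index.cc` row mentions O(1) entry-point checks and `nodes.size()`-based capacity. The sketch below shows one way such a restore step could look. The review does not say what the "wire ordering assumptions" are, so the rule used here (the entry-point node is the first node in the stream) is purely an assumption, as are the `RestoredNode`, `HnswMetadata`, `RestorePlan`, and `PlanRestore` names; none of them are taken from the actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical node record read from the opcode stream.
struct RestoredNode {
  uint64_t internal_id;
};

// Hypothetical inline metadata; the field name follows the PR text.
struct HnswMetadata {
  uint64_t enterpoint_node;
};

// Result of planning a restore: whether it may proceed and with what capacity.
struct RestorePlan {
  bool ok;
  size_t capacity;
  std::string error;
};

// Derives capacity from nodes.size() and checks the entry point in O(1).
// Assumption (not confirmed by the source): the writer emits the entry-point
// node first, so comparing against nodes.front() replaces a linear scan.
inline RestorePlan PlanRestore(const HnswMetadata& meta,
                               const std::vector<RestoredNode>& nodes) {
  if (nodes.empty()) {
    return {true, 0, ""};  // empty index: nothing to validate
  }
  if (nodes.front().internal_id != meta.enterpoint_node) {
    return {false, 0, "entry point is not the first node on the wire"};
  }
  return {true, nodes.size(), ""};
}
```

The point of the sketch is the trade-off: relying on stream order keeps validation constant-time, but it couples the loader to the writer's ordering, so any change to that ordering has to be made on both sides.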
Summary: Refactors HNSW snapshot/replication serialization so the entry-point metadata is stored inline with the vector-index opcode stream instead of in a separate AUX JSON field.
Changes:
Technical Notes: This changes the on-wire layout of RDB_OPCODE_VECTOR_INDEX and removes the separate AUX metadata emission/consumption, so mixed-version compatibility needs to be considered wherever RDB/full-sync is used.