Commit 492175a

bingyanglin and muXxer committed
feat(iota-core): backfill epochs_v2 and seed it on snapshot restore (#11697)
# Description of change

Makes a gRPC fullnode's `epochs_v2` index complete since genesis on every bootstrap path and enforces it at startup: the node closes any gap before services start, or refuses to run instead of silently serving an incomplete index. This is the consumer side of #11453's `EPOCH_INFO` file. No object/`previous_transaction_checkpoint` backfill: the snapshot-publisher node resyncs from genesis instead, enforced by the writer's existing refusal of `None` rows.

- **Synchronous startup backfill (`iota-node`, `iota-config`).** A node that detects a gap (`epochs_v2_gap`: index short of the last executed closed epoch) fetches only MANIFEST + `EPOCH_INFO` from the new `state_snapshot_read_config`, seeds `[0, snapshot_epoch]`, closes any residual above it from local history, seeds the open epoch's row, and re-checks; a remaining gap, or no configured source, aborts startup. Runs before live indexing exists, so the watermark has one writer at a time and the previous design's background task, retry/backoff, and `epoch_watermark_lock` are gone.
- **Local indexing fallback (`iota-core`).** When the latest published snapshot lags local execution (delayed snapshot pipeline), `index_missing_epochs_locally` replays only the missing epochs' closing checkpoints, located via the never-pruned `epoch_last_checkpoint_map`; best-effort up to the pruning horizon.
- **Atomic `EpochIndexed` advance (`iota-core`).** The live path advances the watermark in the same batch as the close-of-epoch row (gap-aware `try_advance_epoch_indexed_watermark`); `reconcile_epoch_indexed_watermark` remains only to jump across a seeded prefix.
- **Restore builds the whole gRPC index store (`iota-tool`, `iota-snapshot`, `iota-core`).** `download_formal_snapshot` tees the restored object stream into the live-state indexers (`RestoreWithGrpcIndexes`), seeds the epoch rows, and finalizes the store (`Watermark::Indexed`, then `meta`; a crash before `meta` leaves a store the next open wipes and re-inits). The node opens it in place instead of re-indexing the whole restored state; opt out with `--skip-grpc-indexes`. `init` and the restore share one indexing implementation (`GrpcLiveObjectRestorer`; `ParMakeLiveObjectIndexer` is lifetime-generic now).
- **Chain-identity gate (`iota-snapshot`).** `verify_and_restore_epoch_info` rejects a snapshot whose manifest `chain_id` differs from this node's chain before writing any row.
- **`RestoreEpochInfo` trait (`iota-snapshot`).** Separate single-method trait instead of a new `Restore` method: the two cover different snapshot payloads with different targets, so each call site requires exactly the capability it uses; the unified indexer (#11023) can implement both.
- **No `epochs` migration (`iota-core`).** The deprecated `epochs` CF is dropped without migration: its rows lack the end-of-epoch fields, so they could never satisfy `EpochIndexed` and the backfill would overwrite them anyway.
- **Open-epoch seeding (`iota-core`).** `initialize_current_epoch_info` keys off the open epoch (`open_epoch_of`): a restore lands on a closing checkpoint, and seeding that checkpoint's own (closed) epoch would leave the open epoch's row permanently missing and wedge the watermark. Its start checkpoint derives from `epoch_last_checkpoint_map` (new seq-only `CheckpointStore` accessor; the backwards scan over prunable summaries is deleted).
- **Default snapshot source (`setups`).** Fullnode setups for mainnet/testnet/devnet ship a default `state-snapshot-read-config` pointing at the IOTA Foundation buckets; ignored unless the gRPC API is enabled and the index is incomplete.
- **Misc.** Uploader honors `state_snapshot_write_config.concurrency` (default still 20); `get_latest_available_epoch` delegates to `iota_snapshot::reader::latest_available_epoch`; `snapshots.mdx` documents the restore flag and the backfill setup.

## Links to any relevant issues

follow-up (PR-2) of #11453 · part of #11023

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] I have checked that new and existing unit tests pass locally with my changes

---------

Co-authored-by: muXxer <git@muxxer.de>
1 parent 650092f commit 492175a
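
The synchronous startup backfill described in the commit message can be pictured with a small model. This is only a sketch under stated assumptions, not the code from this commit: `EpochIndex`, `first_gap`, and `backfill_at_startup` are hypothetical names, the index is reduced to a set of epoch numbers, and the open-epoch seeding step is left out.

```rust
use std::collections::BTreeSet;

/// Hypothetical in-memory stand-in for the `epochs_v2` index: the set of
/// epochs that currently have a row.
#[derive(Default)]
struct EpochIndex {
    rows: BTreeSet<u64>,
}

impl EpochIndex {
    /// First epoch in `[0, last_closed]` without a row, if any.
    fn first_gap(&self, last_closed: u64) -> Option<u64> {
        (0..=last_closed).find(|e| !self.rows.contains(e))
    }
}

/// Assumed flow: seed from the snapshot, fill the residual from local
/// history, re-check, and fail if a gap remains or no source is configured.
fn backfill_at_startup(
    index: &mut EpochIndex,
    last_closed: u64,
    snapshot_epoch: Option<u64>, // `None` models "no read config set"
    local_history: &BTreeSet<u64>, // epochs still replayable locally
) -> Result<(), String> {
    if index.first_gap(last_closed).is_none() {
        return Ok(()); // index already complete, nothing to fetch
    }
    let snapshot_epoch = snapshot_epoch
        .ok_or_else(|| "gap detected but no snapshot source configured".to_string())?;
    // Seed [0, snapshot_epoch], capped at the last closed epoch.
    index.rows.extend(0..=snapshot_epoch.min(last_closed));
    // Close the residual above the snapshot from local history (best effort).
    for epoch in snapshot_epoch.saturating_add(1)..=last_closed {
        if local_history.contains(&epoch) {
            index.rows.insert(epoch);
        }
    }
    // Re-check: a remaining gap aborts startup.
    match index.first_gap(last_closed) {
        None => Ok(()),
        Some(epoch) => Err(format!("epochs_v2 still missing epoch {epoch}")),
    }
}
```

In this model, a snapshot at epoch 3 plus local history for epochs 4 and 5 closes the gap up to epoch 5, while a pruned epoch 4 makes the function return an error, mirroring the "refuse to run" behaviour.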

21 files changed

Lines changed: 2046 additions & 372 deletions
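
The gap-aware watermark advance and the reconcile step from the commit message can be sketched as follows. The semantics here are an assumption made for illustration (the watermark is the highest epoch `E` such that every epoch in `[0, E]` has a row); `Store`, `close_epoch`, and `reconcile` are hypothetical names, not the functions touched by this commit.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of the epoch rows plus the `EpochIndexed` watermark.
/// `None` means no epoch is covered yet.
struct Store {
    rows: BTreeSet<u64>,
    watermark: Option<u64>,
}

impl Store {
    /// The epoch the watermark would move to next.
    fn next_expected(&self) -> u64 {
        self.watermark.map_or(0, |w| w + 1)
    }

    /// Live path: write the close-of-epoch row and, in the same step,
    /// advance the watermark only if doing so skips no epoch.
    /// Returns whether the watermark moved.
    fn close_epoch(&mut self, epoch: u64) -> bool {
        self.rows.insert(epoch);
        if epoch == self.next_expected() {
            self.watermark = Some(epoch);
            true
        } else {
            false
        }
    }

    /// Reconcile: after rows were seeded in bulk, jump the watermark across
    /// the contiguous prefix that is now present.
    fn reconcile(&mut self) {
        while self.rows.contains(&self.next_expected()) {
            self.watermark = Some(self.next_expected());
        }
    }
}
```

Closing epoch 0 advances the watermark; closing epoch 2 while epoch 1 is missing writes the row but leaves the watermark at 0; once epoch 1 is seeded, `reconcile` moves it to 2.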

Cargo.lock

Lines changed: 1 addition & 0 deletions

crates/iota-config/src/node.rs

Lines changed: 6 additions & 0 deletions
@@ -193,6 +193,12 @@ pub struct NodeConfig {
     #[serde(default)]
     pub state_snapshot_write_config: StateSnapshotConfig,

+    /// Read-side formal-snapshot source. When set, a running fullnode
+    /// background-backfills its gRPC `epochs_v2` table from the snapshot's
+    /// `EPOCH_INFO`. Disabled when `None` (the default).
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub state_snapshot_read_config: Option<ObjectStoreConfig>,
+
     #[serde(default)]
     pub indexer_max_subscriptions: Option<usize>,

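
How the new `state_snapshot_read_config` field might gate startup can be sketched as a small decision function. This follows the commit description (the source is ignored unless the gRPC API is enabled and the index is incomplete; a gap with no source aborts startup) rather than the doc comment above. `StartupAction` and `plan_backfill` are hypothetical names introduced for illustration.

```rust
/// Hypothetical outcome of the startup check.
#[derive(Debug, PartialEq)]
enum StartupAction {
    /// Nothing to do: gRPC API disabled or index already complete.
    Skip,
    /// Fetch MANIFEST + `EPOCH_INFO` from the configured source.
    Backfill,
    /// Gap present but no source configured: refuse to start.
    Abort,
}

/// Assumed decision table; the three inputs stand in for the node's real
/// state and for `state_snapshot_read_config.is_some()`.
fn plan_backfill(grpc_enabled: bool, index_has_gap: bool, read_config_set: bool) -> StartupAction {
    match (grpc_enabled, index_has_gap, read_config_set) {
        (false, _, _) | (_, false, _) => StartupAction::Skip,
        (true, true, true) => StartupAction::Backfill,
        (true, true, false) => StartupAction::Abort,
    }
}
```

Under these assumptions a default read config shipped with a fullnode setup costs nothing on a node whose index is already complete, because the `Skip` branch is taken before the source is consulted.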

crates/iota-core/src/checkpoints/mod.rs

Lines changed: 11 additions & 1 deletion
@@ -803,14 +803,24 @@ impl CheckpointStore {
         &self,
         epoch_id: EpochId,
     ) -> IotaResult<Option<VerifiedCheckpoint>> {
-        let seq = self.tables.epoch_last_checkpoint_map.get(&epoch_id)?;
+        let seq = self.get_epoch_last_checkpoint_seq_number(epoch_id)?;
         let checkpoint = match seq {
             Some(seq) => self.get_checkpoint_by_sequence_number(seq)?,
             None => None,
         };
         Ok(checkpoint)
     }

+    /// Sequence number of `epoch_id`'s last checkpoint. Unlike
+    /// [`Self::get_epoch_last_checkpoint`], this does not require the summary
+    /// itself to still be present: the underlying map is never pruned.
+    pub fn get_epoch_last_checkpoint_seq_number(
+        &self,
+        epoch_id: EpochId,
+    ) -> Result<Option<CheckpointSequenceNumber>, TypedStoreError> {
+        self.tables.epoch_last_checkpoint_map.get(&epoch_id)
+    }
+
     pub fn insert_epoch_last_checkpoint(
         &self,
         epoch_id: EpochId,
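
The difference between the two accessors in this hunk shows up once summaries are pruned. The following is a minimal model, assuming two plain maps in place of the real typed store; `MiniCheckpointStore` and its methods are hypothetical names and summaries are reduced to strings.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `CheckpointStore`: one never-pruned map from
/// epoch to last checkpoint sequence number, one prunable map of summaries.
#[derive(Default)]
struct MiniCheckpointStore {
    epoch_last_checkpoint_map: HashMap<u64, u64>,
    summaries: HashMap<u64, String>,
}

impl MiniCheckpointStore {
    /// Record an epoch's last checkpoint and store its summary.
    fn insert(&mut self, epoch: u64, seq: u64, summary: &str) {
        self.epoch_last_checkpoint_map.insert(epoch, seq);
        self.summaries.insert(seq, summary.to_string());
    }

    /// Models pruning: drops summaries below `seq` but never touches the
    /// epoch map.
    fn prune_summaries_below(&mut self, seq: u64) {
        self.summaries.retain(|s, _| *s >= seq);
    }

    /// Seq-only lookup: answers from the never-pruned map alone.
    fn last_checkpoint_seq(&self, epoch: u64) -> Option<u64> {
        self.epoch_last_checkpoint_map.get(&epoch).copied()
    }

    /// Full lookup: needs the summary to still be present.
    fn last_checkpoint(&self, epoch: u64) -> Option<String> {
        let seq = self.last_checkpoint_seq(epoch)?;
        self.summaries.get(&seq).cloned()
    }
}
```

After pruning, `last_checkpoint` returns `None` for an old epoch while `last_checkpoint_seq` still returns the sequence number, which is why a caller that only needs the sequence number can use the seq-only accessor instead of scanning prunable summaries.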
