
refactor(core): PoC of decoupling node from JSON-RPC #12047

Draft
bingyanglin wants to merge 10 commits into develop from refactor/decouple-core-jsonrpc-types

Conversation

@bingyanglin

@bingyanglin bingyanglin commented Jun 26, 2026

Contributor

Description of change

Note: this branch will be rebased after #12035 is merged to develop. For ease of review, check only a7a94b1.

The indexer is now the sole JSON-RPC server. This removes JSON-RPC serving (and its backing index) from iota-node/iota-core, plus the code that existed only to support it. The client SDK and the iota-json-rpc* crates stay (used by the indexer / CLI / graphql).

  • iota-node — removed the JSON-RPC HTTP server (build_http_server/build_kv_store), IndexStore wiring, verify_indexes, the /health endpoint, and the iota-json-rpc{,-api} deps. gRPC + TransactionOrchestrator kept.
  • iota-core — deleted jsonrpc_index (IndexStore), subscription_handler, streamer, verify_indexes; removed the AuthorityState index field/methods, the commit-path index/subscription hooks, and the pruner's index logic.
  • iota-core decoupled from iota-json-rpc-types: dev_inspect_transaction_block/dry_exec_transaction now return raw node types (TransactionEffects/TransactionEvents/ExecutionResult); consumers (unit tests, transactional-test-runner, benchmark) migrated. iota-core has zero JSON-RPC deps.
  • iota-json-rpc — stripped node-side serving (StateRead + AuthorityState-backed API structs); kept the server infra + helpers the indexer/graphql reuse.
  • iota-config — removed dead JSON-RPC NodeConfig fields (json_rpc_address, enable_index_processing, jsonrpc_server_type, indexer_max_subscriptions, iota_names_config, num_epochs_to_retain_for_indexes, enable_secondary_index_checks) + swarm-config/swarm/localnet/test-cluster plumbing.
  • health — node health is now served via the existing gRPC LedgerService.GetHealth (same checkpoint-lag check the HTTP /health did).
  • iota-tool / iota-open-rpc / iota-types — removed index db-tool subcommands; re-pointed open-rpc spec generation at iota-json-rpc-api doc types (output byte-identical); dropped the unused IndexStoreNotAvailable error.
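
The health bullet above says the gRPC endpoint keeps the checkpoint-lag check. A minimal sketch of what such a check could look like, assuming it compares the latest executed checkpoint's timestamp with the wall clock; `Health`, `check_checkpoint_lag`, and the threshold are hypothetical names and values, not taken from this PR:

```rust
use std::time::Duration;

/// Outcome of a hypothetical checkpoint-lag health probe.
#[derive(Debug, PartialEq, Eq)]
pub enum Health {
    Ok,
    Lagging { lag: Duration },
}

/// Compares the timestamp of the latest executed checkpoint with the current
/// wall-clock time (both in milliseconds since the Unix epoch).
/// `threshold` is an assumed operator-chosen bound, not a value from the PR.
pub fn check_checkpoint_lag(
    now_ms: u64,
    latest_checkpoint_ms: u64,
    threshold: Duration,
) -> Health {
    // saturating_sub: a checkpoint timestamp slightly ahead of the local
    // clock counts as zero lag instead of underflowing.
    let lag = Duration::from_millis(now_ms.saturating_sub(latest_checkpoint_ms));
    if lag <= threshold {
        Health::Ok
    } else {
        Health::Lagging { lag }
    }
}
```

Keeping the check a pure function of two timestamps makes it easy to serve from either transport, which is presumably why moving it from HTTP to gRPC needs no behavior change.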

Links to any relevant issues

fixes #11260

How the change has been tested

  • Basic tests (linting, compilation, formatting, unit/integration tests)
  • Patch-specific tests (correctness, functionality coverage)
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that new and existing unit tests pass locally with my changes

bingyanglin and others added 9 commits June 24, 2026 16:18
Bumps the formal-snapshot format to **V2** so new snapshots carry the
data the indexer and other downstream consumers need to rebuild state
without an archival replay:

- **Per-object previous-transaction checkpoint** inline in `.obj`
records: `StoreObjectV2` on the row layer, unified `LiveObject` on the
snapshot wire.
- **Per-epoch metadata** in a new `EPOCH_INFO` file alongside the bucket
files.

**Scope: producer-side only.** This PR teaches a single writer node to
*produce* V2 snapshots; the V1 reader path is preserved on disk and
remains unchanged in behavior. The reader-side restore from `EPOCH_INFO`
and the one-time backfill of `previous_transaction_checkpoint` for
pre-V2 objects are the PR-2 follow-up tracked in **#10957**.

- Magic bumped to `0x00B7EC76` (V1 was `0x00B7EC75`) — V1/V2 readers
fail-fast on each other.
- Records are BCS-encoded `LiveObject` (unified type), carrying
`previous_transaction_checkpoint: u64` inline alongside the live
`Object`. The on-disk shape, not the Rust type name, is what changes
between V1 and V2.
- `LiveObject::Wrapped` removed from the live-set view (the enum
collapsed to a plain struct); `StoreObject::Wrapped` on the row layer
stays (distinct `OBJECT_WRAPPED` digest).
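
The fail-fast behavior on the bumped magic can be sketched as follows. The two constants are the ones quoted above; the big-endian encoding, `MagicError`, and `check_magic` are assumptions made for illustration only:

```rust
/// Magic values quoted in the commit message above.
pub const MAGIC_V1: u32 = 0x00B7_EC75;
pub const MAGIC_V2: u32 = 0x00B7_EC76;

#[derive(Debug, PartialEq, Eq)]
pub enum MagicError {
    TooShort,
    WrongVersion { found: u32 },
}

/// Checks the first four bytes of a file against the expected magic and
/// returns the remaining payload on success.
/// Assumption: the magic is stored big-endian; the real byte order may differ.
pub fn check_magic(bytes: &[u8], expected: u32) -> Result<&[u8], MagicError> {
    if bytes.len() < 4 {
        return Err(MagicError::TooShort);
    }
    let (head, rest) = bytes.split_at(4);
    let found = u32::from_be_bytes([head[0], head[1], head[2], head[3]]);
    if found != expected {
        // A V1 reader handed a V2 file (or the reverse) stops here,
        // before any record is decoded.
        return Err(MagicError::WrongVersion { found });
    }
    Ok(rest)
}
```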

- Unchanged from V1.

- Layout: `magic(0x9000C001, 4 B) | bcs(EpochInfo::V1 { entries:
Vec<Option<EpochInfoEntry>> })`.
- One entry per epoch in `[0, snapshot_epoch]`.
- Integrity anchored by `FileMetadata::sha3_digest` in MANIFEST (same as
`.obj`/`.ref`).

```rust
pub struct EpochInfoEntry {
    pub first_checkpoint: CheckpointSequenceNumber,
    pub start_system_state: Vec<u8>,                              // bcs(IotaSystemState)
    pub last_checkpoint_summary: Option<CertifiedCheckpointSummary>,
    pub end_of_epoch_tx_events: Option<TransactionEvents>,
}
```

`start_system_state` is opaque BCS bytes so the inner
`IotaSystemStateV1/V2/…` can evolve without forcing an `EpochInfoEntry`
schema change.

- Source of truth is the gRPC indexer's `epoch_info` table on
`IndexStoreTables` (sibling of the existing `epochs` table). Populated
by `grpc_indexes::index_epoch` from `CheckpointData` over two checkpoint
boundaries:
  - **Boundary 1** (prev epoch closes): insert `first_checkpoint` +
    `start_system_state` for the new epoch.
  - **Boundary 2** (new epoch closes): upsert `last_checkpoint_summary` +
    `end_of_epoch_tx_events` and advance `Watermark::EpochIndexed`.
- Snapshot V2 writer pre-publish gate: refuses to publish unless
`Watermark::EpochIndexed >= snapshot_epoch`, so emitted snapshots are
complete-by-construction.
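
A toy model of the two-boundary population and the pre-publish gate described above. `EpochRow`, `EpochIndex`, and their methods are hypothetical stand-ins; a single `last_checkpoint` field stands in for the end-of-epoch data, and seeding epoch 0 at construction is an assumption:

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EpochRow {
    pub first_checkpoint: u64,
    /// Stands in for `last_checkpoint_summary` + `end_of_epoch_tx_events`.
    pub last_checkpoint: Option<u64>,
}

#[derive(Debug, Default)]
pub struct EpochIndex {
    rows: BTreeMap<u64, EpochRow>,
    /// Highest epoch whose row is complete; `None` until the first close.
    epoch_indexed: Option<u64>,
}

impl EpochIndex {
    /// Assumption: epoch 0 has no preceding boundary, so it is seeded here.
    pub fn with_genesis() -> Self {
        let mut idx = Self::default();
        idx.rows.insert(0, EpochRow { first_checkpoint: 0, last_checkpoint: None });
        idx
    }

    /// Handles the checkpoint that closes `closing_epoch`.
    pub fn on_epoch_boundary(&mut self, closing_epoch: u64, last_checkpoint: u64) {
        // Boundary 2 for the closing epoch: fill in the end-of-epoch data
        // and advance the watermark.
        if let Some(row) = self.rows.get_mut(&closing_epoch) {
            row.last_checkpoint = Some(last_checkpoint);
            self.epoch_indexed = Some(closing_epoch);
        }
        // Boundary 1 for the next epoch: insert its start data.
        self.rows.insert(
            closing_epoch + 1,
            EpochRow { first_checkpoint: last_checkpoint + 1, last_checkpoint: None },
        );
    }

    /// Pre-publish gate: refuse unless the watermark covers the snapshot epoch.
    pub fn can_publish(&self, snapshot_epoch: u64) -> bool {
        matches!(self.epoch_indexed, Some(w) if w >= snapshot_epoch)
    }
}
```

Because the gate reads only the watermark, a snapshot for an epoch whose closing row has not been written yet is rejected without scanning the table.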

The single network node that produces the formal snapshot must:

1. Run as a **fullnode** (validators do not run grpc_indexes).
2. Have **`enable_grpc_api = true`** (so `epoch_info` is populated).
3. Run the PR-2 backfill once before publishing the first V2 snapshot
(so `epoch_info` covers `[0, current_epoch]`); after that, live indexing
keeps it complete forever.

All three are enforced at node startup or snapshot-publish time with
operator-facing error messages.

- V2 readers accept V2 snapshots and restore object state from
`LiveObject` records that carry `previous_transaction_checkpoint`
inline.
- V2 readers validate the manifest lists `EPOCH_INFO` (fail-fast on a
missing entry) but **do not** download or parse the file in this PR —
that's PR 2.
- V1 readers are untouched and continue to consume V1 snapshots as
before.

~~- Reader: download + verify + parse `EPOCH_INFO`, populate the
indexer's `epoch_info` table.~~
- Reader: parse `EPOCH_INFO` and dispatch each `EpochInfoEntry` through
a new `Restore` trait method (mirroring the object-write generalization
in #11559). Concrete consumer-side persistence lives behind the trait;
the Indexer-specific impl is tracked in #11023.
- One-time backfill of `epoch_info` for pre-PR epochs (so the writer's
watermark precondition can clear).
- Backfill of `previous_transaction_checkpoint` for objects in snapshots
produced before this PR.

fixes #11254

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have checked that new and existing unit tests pass locally with
my changes

---------

Co-authored-by: muXxer <git@muxxer.de>
…11697)

# Description of change

Makes a gRPC fullnode's `epochs_v2` index complete since genesis on
every bootstrap path and enforces it at startup: the node closes any gap
before services start, or refuses to run instead of silently serving an
incomplete index — the consumer side of #11453's `EPOCH_INFO` file. No
object/`previous_transaction_checkpoint` backfill: the
snapshot-publisher node resyncs from genesis instead, enforced by the
writer's existing refusal of `None` rows.

- **Synchronous startup backfill (`iota-node`, `iota-config`).** A node
that detects a gap (`epochs_v2_gap`: index short of the last executed
closed epoch) fetches only MANIFEST + `EPOCH_INFO` from the new
`state_snapshot_read_config`, seeds `[0, snapshot_epoch]`, closes any
residual above it from local history, seeds the open epoch's row, and
re-checks; a remaining gap — or no configured source — aborts startup.
Runs before live indexing exists, so the watermark has one writer at a
time and the previous design's background task, retry/backoff, and
`epoch_watermark_lock` are gone.
- **Local indexing fallback (`iota-core`).** When the latest published
snapshot lags local execution (delayed snapshot pipeline),
`index_missing_epochs_locally` replays only the missing epochs' closing
checkpoints, located via the never-pruned `epoch_last_checkpoint_map`;
best-effort up to the pruning horizon.
- **Atomic `EpochIndexed` advance (`iota-core`).** The live path
advances the watermark in the same batch as the close-of-epoch row
(gap-aware `try_advance_epoch_indexed_watermark`);
`reconcile_epoch_indexed_watermark` remains only to jump across a seeded
prefix.
- **Restore builds the whole gRPC index store (`iota-tool`,
`iota-snapshot`, `iota-core`).** `download_formal_snapshot` tees the
restored object stream into the live-state indexers
(`RestoreWithGrpcIndexes`), seeds the epoch rows, and finalizes the
store (`Watermark::Indexed`, then `meta` — a crash before `meta` leaves
a store the next open wipes and re-inits). The node opens it in place
instead of re-indexing the whole restored state; opt out with
`--skip-grpc-indexes`. `init` and the restore share one indexing
implementation (`GrpcLiveObjectRestorer`; `ParMakeLiveObjectIndexer` is
lifetime-generic now).
- **Chain-identity gate (`iota-snapshot`).**
`verify_and_restore_epoch_info` rejects a snapshot whose manifest
`chain_id` differs from this node's chain before writing any row.
- **`RestoreEpochInfo` trait (`iota-snapshot`).** Separate single-method
trait instead of a new `Restore` method: the two cover different
snapshot payloads with different targets, so each call site requires
exactly the capability it uses; the unified indexer (#11023) can
implement both.
- **No `epochs` migration (`iota-core`).** The deprecated `epochs` CF is
dropped without migration: its rows lack the end-of-epoch fields, so
they could never satisfy `EpochIndexed` and the backfill would overwrite
them anyway.
- **Open-epoch seeding (`iota-core`).** `initialize_current_epoch_info`
keys off the open epoch (`open_epoch_of`): a restore lands on a closing
checkpoint, and seeding that checkpoint's own (closed) epoch would leave
the open epoch's row permanently missing and wedge the watermark. Its
start checkpoint derives from `epoch_last_checkpoint_map` (new seq-only
`CheckpointStore` accessor; the backwards scan over prunable summaries
is deleted).
- **Default snapshot source (`setups`).** Fullnode setups for
mainnet/testnet/devnet ship a default `state-snapshot-read-config`
pointing at the IOTA Foundation buckets; ignored unless the gRPC API is
enabled and the index is incomplete.
- **Misc.** Uploader honors `state_snapshot_write_config.concurrency`
(default still 20); `get_latest_available_epoch` delegates to
`iota_snapshot::reader::latest_available_epoch`; `snapshots.mdx`
documents the restore flag and the backfill setup.
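
One way to read "gap-aware" in the watermark bullet above: the watermark only moves across consecutive finalized epochs and stops at the first hole. The sketch below illustrates that reading; `try_advance_watermark` and the `finalized` map are hypothetical and are not the signature of `try_advance_epoch_indexed_watermark`:

```rust
use std::collections::BTreeMap;

/// Advances the watermark across every consecutive finalized epoch.
/// `finalized` maps epoch -> whether its close-of-epoch row is complete.
/// Returns the new watermark (`None` if epoch 0 is not finalized yet).
pub fn try_advance_watermark(
    current: Option<u64>,
    finalized: &BTreeMap<u64, bool>,
) -> Option<u64> {
    let mut watermark = current;
    loop {
        // The next epoch the watermark would have to cover.
        let next = match watermark {
            Some(w) => w + 1,
            None => 0,
        };
        match finalized.get(&next) {
            Some(true) => watermark = Some(next),
            // Missing or still-open row: stop, never jump over a gap.
            _ => return watermark,
        }
    }
}
```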

## Links to any relevant issues

follow-up (PR-2) of #11453 · part of #11023

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have checked that new and existing unit tests pass locally with
my changes

---------

Co-authored-by: muXxer <git@muxxer.de>
# Description of change

Some cleanup for #11697 

- **Local epoch backfill: only missing data counts as pruned
(`iota-core`, `iota-types`).** Real storage failures were swallowed as
"data pruned, retry once a newer snapshot is published". Absent-data
errors now carry `Kind::Missing` (was `custom`) and only those end the
best-effort replay; anything else propagates. Nothing else reads the
kind.
- **Operator doc fixes (`iota-config`, `docs`).** The config doc still
described the removed background task (the backfill is synchronous and
gates startup); the snapshots guide now documents the refuse-to-start
case when even the latest snapshot is older than the node's pruned-away
history.
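
The error-kind split in the first bullet can be sketched like this. `Kind::Missing` is named in the description; `Kind::Custom`, `StorageError`, and `replay_best_effort` are simplified, hypothetical stand-ins for the real types:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Missing,
    Custom,
}

#[derive(Debug, PartialEq, Eq)]
pub struct StorageError {
    pub kind: Kind,
    pub message: String,
}

/// Replays `epochs` in order through the `index_epoch` callback and returns
/// how many were indexed before the replay ended.
pub fn replay_best_effort<F>(epochs: &[u64], mut index_epoch: F) -> Result<usize, StorageError>
where
    F: FnMut(u64) -> Result<(), StorageError>,
{
    let mut done = 0;
    for &epoch in epochs {
        match index_epoch(epoch) {
            Ok(()) => done += 1,
            // Absent data means the pruning horizon was reached: end the
            // best-effort replay and keep what was indexed so far.
            Err(e) if e.kind == Kind::Missing => break,
            // Anything else is a real storage failure and must propagate.
            Err(e) => return Err(e),
        }
    }
    Ok(done)
}
```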

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [ ] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have checked that new and existing unit tests pass locally with
my changes
…rom formal snapshot's EPOCH_INFO (#11868)

# Description of change

The formal-snapshot restore currently depends on two buckets: the
snapshot bucket and the checkpoint-archive bucket, where the default
summary-sync mode only ever fetches the end-of-epoch summaries. Snapshot
V2's EPOCH_INFO file already carries exactly those. It contains one
certified closing summary per epoch since genesis. Additionally, the
gRPC epochs_v2 seeding (restore and node startup backfill) trusted the
bucket after a chain-id check only.

This PR makes the default restore archive-free and anchors every
consumed EPOCH_INFO byte to the operator's genesis:

- `VerifiedEpochInfo` witness (iota-snapshot): `verify_epoch_info_chain`
is its only constructor -> chain-id check, contiguity from epoch 0, and
the committee-chain walk from the genesis committee (built on
`CommitteeChainVerifier`). Holding the witness is proof of verification;
`restore_epoch_info` now lives on it, so unverified rows structurally
cannot be written.
- Archive-free default restore (iota-tool): EPOCH_INFO is downloaded and
chain-verified up front (one small file — also rejecting
wrong-network/tampered snapshots before any large download);
`sync_summaries_from_epoch_info` seeds the genesis checkpoint, every
epoch's verified closing summary (via
`CheckpointStore::insert_verified_checkpoint`, which also maintains the
never-pruned `epoch_last_checkpoint_map`), and the committees, then sets
the four watermarks.
- `--all-checkpoints` stays archive-backed (full summary history has no
snapshot replacement); the archive config is now required only with that
flag. `start_summary_sync` is reduced to that mode, and the stale TODO
plus the redundant manual `insert_epoch_last_checkpoint` are removed
(summary sync maintains the map in both modes).
- Node startup backfill chain-verifies too:
`backfill_epochs_v2_from_snapshot` takes the genesis committee from the
node config and goes through the witness, closing the bucket-trust gap.
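
The witness idea above, reduced to its core: a type whose only constructor runs the checks, with the restore method living on that type. This sketch covers only the chain-id and contiguity checks; the committee-chain walk is omitted, and `EpochEntry`, `VerifiedEpochs`, and `VerifyError` are hypothetical simplifications of the real types:

```rust
mod verified {
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct EpochEntry {
        pub epoch: u64,
        pub chain_id: [u8; 4],
    }

    #[derive(Debug, PartialEq, Eq)]
    pub enum VerifyError {
        WrongChain { epoch: u64 },
        NotContiguous { expected: u64, found: u64 },
    }

    /// Witness type: `entries` is private to this module, so the only way to
    /// obtain a value from outside is through `verify`.
    #[derive(Debug)]
    pub struct VerifiedEpochs {
        entries: Vec<EpochEntry>,
    }

    impl VerifiedEpochs {
        pub fn verify(entries: Vec<EpochEntry>, chain_id: [u8; 4]) -> Result<Self, VerifyError> {
            for (i, entry) in entries.iter().enumerate() {
                let expected = i as u64;
                // Contiguity from epoch 0.
                if entry.epoch != expected {
                    return Err(VerifyError::NotContiguous { expected, found: entry.epoch });
                }
                // Chain-id check.
                if entry.chain_id != chain_id {
                    return Err(VerifyError::WrongChain { epoch: entry.epoch });
                }
            }
            Ok(Self { entries })
        }

        /// Restore lives on the witness, so unverified rows cannot reach `sink`.
        pub fn restore(&self, sink: &mut Vec<u64>) {
            sink.extend(self.entries.iter().map(|e| e.epoch));
        }
    }
}

use verified::{EpochEntry, VerifiedEpochs, VerifyError};
```

The module boundary is what enforces the guarantee: code outside `verified` cannot write `VerifiedEpochs { entries }` directly, so the compiler rejects any path that skips `verify`.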

Tests: snapshot-test fixtures upgraded to real committee-signed
summaries; six new unit tests cover accept+restore and every rejection
path (wrong chain id, wrong genesis committee, tampered summary, missing
end-of-epoch data, non-contiguous entries); e2e tests adapted to the
witness API and hardened against a timing flake
(`wait_until_executed_open_epoch`).

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [ ] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have checked that new and existing unit tests pass locally with
my changes

### Release Notes

- [ ] Protocol:
- [x] Nodes (Validators and Full nodes): Restoring a node from a formal
snapshot no longer requires a checkpoint-archive bucket. The required
checkpoint summaries now come from the snapshot itself and are
cryptographically verified against the genesis committee.
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [ ] CLI:
- [ ] Rust SDK:
- [ ] gRPC:
# Description of change

Formal-snapshot restore no longer touches the checkpoint archive. The
default (and now only) path seeds the end-of-epoch summaries and
committees from the snapshot's chain-verified EPOCH_INFO; the
`--all-checkpoints` flag, the `archive_store_config` plumbing, and the
`start_summary_sync_from_archive` machinery are removed from
`download_formal_snapshot`.

The full-summary-history download, previously bolted onto restore as
`--all-checkpoints`, becomes a standalone command,
`iota-tool backfill-checkpoint-summaries`, runnable on any stopped node. It
downloads every intermediate checkpoint summary from the archive into
the node's checkpoint store up to `min(highest_synced, archive_latest)`,
optionally verifying the chain pairwise from genesis. It only adds
historical summaries below the node's existing watermarks. No watermark
is moved, so it's safe to re-run, and it's decoupled from restore (a
node restored from a formal snapshot can become a full-header source for
peers, or serve historical checkpoint queries).
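
A rough sketch of the backfill loop described above: cap the range at `min(highest_synced, archive_latest)`, optionally check that each summary links to its predecessor, and insert only what is missing so a re-run is a no-op. `Summary`, the `u64` digests, and both functions are hypothetical simplifications; the archive slice is assumed to be sorted from sequence number 0:

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Summary {
    pub seq: u64,
    pub digest: u64,
    pub previous_digest: Option<u64>,
}

/// Upper bound quoted in the description.
pub fn backfill_upper_bound(highest_synced: u64, archive_latest: u64) -> u64 {
    highest_synced.min(archive_latest)
}

/// Copies summaries `0..=upper` from `archive` into `store` and returns how
/// many were newly inserted. No watermark is touched.
pub fn backfill(
    store: &mut BTreeMap<u64, Summary>,
    archive: &[Summary],
    upper: u64,
    verify: bool,
) -> Result<usize, String> {
    let mut inserted = 0;
    let mut prev: Option<&Summary> = None;
    for summary in archive.iter().take_while(|s| s.seq <= upper) {
        if verify {
            // Pairwise check: each summary must name its predecessor's digest
            // (the first one must name none).
            let expected = prev.map(|p| p.digest);
            if summary.previous_digest != expected {
                return Err(format!("summary {} does not link to its predecessor", summary.seq));
            }
        }
        // Only add what is missing, which keeps re-runs idempotent.
        if !store.contains_key(&summary.seq) {
            store.insert(summary.seq, summary.clone());
            inserted += 1;
        }
        prev = Some(summary);
    }
    Ok(inserted)
}
```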

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [ ] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have checked that new and existing unit tests pass locally with
my changes

### Release Notes

- [ ] Protocol:
- [ ] Nodes (Validators and Full nodes):
- [ ] Indexer:
- [ ] JSON-RPC:
- [ ] GraphQL:
- [x] CLI: Added `backfill-checkpoint-summaries` command to `iota-tool`
to download the full checkpoint summary history for a stopped node
- [ ] Rust SDK:
- [ ] gRPC:
)

Hardens the formal-snapshot `EPOCH_INFO` trust model. Each entry
previously carried `start_system_state` as raw BCS bytes taken on faith
from the (unsigned) snapshot transport. Now every field is proven
against its certified `last_checkpoint_summary`, and each epoch's start
state is derived from the *next* epoch's verified boundary objects
(epoch 0 from genesis). A restoring node trusts only data reachable from
a signed summary.

- On-disk entry `EpochInfoV1Entry` (moved to `iota-types`): drop
`start_system_state`; carry `last_checkpoint_summary`,
`last_checkpoint_contents`, `end_of_epoch_tx_effects`,
`end_of_epoch_tx_events`, and raw
`next_epoch_start_system_state_objects` (`0x5` + inner state).
`epoch`/`start_checkpoint` are derived from the signed summaries, not
stored.
- Add `verify_epoch_boundary_proof`: hash-chains contents → effects →
events → start-state objects back to the signed summary; rejects on any
mismatch.
- `verify_epoch_info_chain` now also takes `genesis_system_state` (epoch
0's start, which no entry proves); derives each entry's
`epoch`/`start_checkpoint` from the signed summaries and cross-checks
the start committee against the certified chain.
- Decode each epoch's `system_state` from its boundary's digest-verified
object bytes instead of trusting `start_system_state` (which can't
round-trip byte-identically against the effects' object digest).
- `EpochInfoV2` (the `epochs_v2` row, `iota-types`): hold the on-disk
entry directly as `epoch_info_entry: Option<EpochInfoV1Entry>`
(finalized ⟺ `Some`); store only the non-derivable start-of-epoch
identity (`epoch`, `start_checkpoint`, `start_timestamp_ms`,
`system_state`) and expose
`protocol_version`/`reference_gas_price`/`end_timestamp_ms`/`end_checkpoint`
as derived getters.
- `iota-core/grpc_indexes.rs`: at each boundary capture contents,
epoch-change effects, and system-state objects into the entry; load
events during sparse checkpoint assembly so the entry is complete.
- New helpers (`iota-types`): `CheckpointData::end_of_epoch_transaction`
(runtime-checked, not `debug_assert`),
`CheckpointContents::end_of_epoch_execution_digests`,
`get_iota_system_state_objects`.
- `iota-snapshot/writer.rs`: write the row's `epoch_info_entry` straight
into `EPOCH_INFO` (no projection step).
- `iota-grpc-server` `get_epoch`: serve the derived getters instead of
the removed stored fields.
- `iota-node` / `iota-tool`: pass `genesis.iota_system_object()` to the
verifier.
- `simulacrum`: set `epoch_info_entry` to `None`.
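
The hash-chaining idea behind `verify_epoch_boundary_proof` can be illustrated with a toy model. Everything here is an assumption for illustration: the structs are simplified, the way each layer commits to the next is guessed from the description, and `DefaultHasher` stands in for the cryptographic digests the real code would use:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy digest. NOT cryptographic; it only stands in for the real digests.
fn digest<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

#[derive(Hash)]
pub struct Contents {
    pub effects_digest: u64,
}

#[derive(Hash)]
pub struct Effects {
    pub events_digest: u64,
    pub object_digests: Vec<u64>,
}

pub struct BoundaryProof {
    pub contents: Contents,
    pub effects: Effects,
    pub events: Vec<u8>,
    pub objects: Vec<Vec<u8>>,
}

/// Walks contents -> effects -> events -> start-state objects, checking each
/// layer against the digest committed by the layer above it.
/// `signed_contents_digest` is the value taken from the certified summary.
pub fn verify_boundary(signed_contents_digest: u64, p: &BoundaryProof) -> Result<(), &'static str> {
    if digest(&p.contents) != signed_contents_digest {
        return Err("contents do not match the signed summary");
    }
    if digest(&p.effects) != p.contents.effects_digest {
        return Err("effects do not match contents");
    }
    if digest(&p.events) != p.effects.events_digest {
        return Err("events do not match effects");
    }
    let object_digests: Vec<u64> = p.objects.iter().map(|o| digest(o)).collect();
    if object_digests != p.effects.object_digests {
        return Err("start-state objects do not match effects");
    }
    Ok(())
}

/// Builds a self-consistent proof plus the digest a summary would commit to.
pub fn build_proof(events: Vec<u8>, objects: Vec<Vec<u8>>) -> (u64, BoundaryProof) {
    let effects = Effects {
        events_digest: digest(&events),
        object_digests: objects.iter().map(|o| digest(o)).collect(),
    };
    let contents = Contents { effects_digest: digest(&effects) };
    let signed = digest(&contents);
    (signed, BoundaryProof { contents, effects, events, objects })
}
```

Since only the summary is signed, tampering with any lower layer changes a digest somewhere on the path and the walk rejects it.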

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have checked that new and existing unit tests pass locally with
my changes

---------

Co-authored-by: muXxer <git@muxxer.de>
Moves the per-epoch verified chain (`epoch_info`) out of the gRPC index
store into the `CheckpointStore`, so every node (validators and full
nodes) holds it. Unblocks the future summary pruning, lets snapshots
publish without `enable_grpc_api`, and decouples the chain from
`--skip-grpc-indexes`.

- New `epoch_info` + `epoch_info_watermark` CFs in `CheckpointStore`,
never pruned; logic in new `checkpoints/epoch_info.rs`.
- Checkpoint executor populates the chain live at every boundary
(validators included).
- gRPC index store drops its `epochs_v2` CF and `index_epoch`; keeps
only transaction + live-state indexes.
- `GetEpoch` now reads the `CheckpointStore` (`get_epoch_info` moved to
`GrpcStateReader`).
- Snapshot writer/uploader read `EPOCH_INFO` from the `CheckpointStore`;
`enable_grpc_api` requirement removed.
- Node startup `seed_epoch_info` fills any historical gap before
serving: a **recognized chain** (mainnet/testnet/current devnet)
restores from that chain's formal-snapshot `EPOCH_INFO` first, then
replays local checkpoints for the residual tail; an **unrecognized
network** rebuilds from local checkpoints only (no snapshot). A residual
gap is **fatal on a recognized chain** (every node must hold the
verified chain since genesis) and a warning otherwise (left unfilled —
such a node won't produce snapshots or serve the epoch gRPC API for
those epochs).
- The formal-snapshot source is **hardcoded per chain** (no operator
config); fetched `EPOCH_INFO` is verified against the genesis committee.
One-time migration aid for pruned upgrading nodes, removed one release
later (#12028).
- Local rebuild reconstructs the chain from genesis using only each
epoch's **boundary (change-epoch) transaction** — cheap, and succeeds
even when non-boundary transactions or old object versions have been
pruned.
- Restore tool seeds the chain unconditionally; `--skip-grpc-indexes`
now governs only live-state indexes.
- Docs: `snapshots.mdx` made node-generic; the per-epoch-metadata
backfill section removed (the source is now built-in). No new node
config.
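
The startup policy above (snapshot first on a recognized chain, local replay for the tail, fatal versus warning on a residual gap) can be modeled as a small decision function. All names and the coverage model are hypothetical: the snapshot is assumed to cover epochs `[0, snapshot_up_to]`, and local replay is assumed to cover everything from `local_from` up to the last closed epoch:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Network {
    Recognized,
    Unrecognized,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SeedOutcome {
    Complete,
    /// Gap left unfilled on an unrecognized network (warning only).
    WarnIncomplete { missing_from: u64 },
}

/// Decides whether `[0, last_closed]` can be filled.
/// `snapshot_up_to`: last epoch the formal snapshot covers, `None` if the
/// snapshot is unavailable. `local_from`: earliest epoch local checkpoints
/// can still rebuild, `None` if local replay is impossible.
pub fn seed_epoch_info(
    network: Network,
    last_closed: u64,
    snapshot_up_to: Option<u64>,
    local_from: Option<u64>,
) -> Result<SeedOutcome, String> {
    // First epoch that is still missing.
    let mut next = 0u64;
    // Recognized chains restore from the snapshot first; unrecognized
    // networks never consult it.
    if let (Network::Recognized, Some(s)) = (network, snapshot_up_to) {
        next = s.min(last_closed) + 1;
    }
    // Local replay fills the residual tail if it reaches back far enough.
    if let Some(l) = local_from {
        if l <= next {
            next = last_closed + 1;
        }
    }
    if next > last_closed {
        return Ok(SeedOutcome::Complete);
    }
    match network {
        Network::Recognized => Err(format!("epoch_info gap from epoch {next}: refusing to start")),
        Network::Unrecognized => Ok(SeedOutcome::WarnIncomplete { missing_from: next }),
    }
}
```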

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [x] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have checked that new and existing unit tests pass locally with
my changes

---------

Co-authored-by: muXxer <git@muxxer.de>
…ils (#12040)

# Description of change

If the snapshot download fails, a node that still has its full local
history now rebuilds `epoch_info` from that local data instead of
refusing to start. It only fails if neither the snapshot nor local data
can complete the chain (e.g. a pruned node that also can't reach the
snapshot).

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
- [ ] Patch-specific tests (correctness, functionality coverage)
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I have checked that new and existing unit tests pass locally with
my changes
# Description of change

This patch introduces three changes motivated by the parallel work on
using the V2 API in the indexer.

1. Updates the progress bar on regular ticks, so it doesn't wait for
download completion before appearing during a restore.
2. Exposes a consuming `VerifiedEpochInfo::into_parts` method that
extracts all inner values, so that custom per-epoch indexing logic can be
implemented efficiently.
3. Exposes a getter for `start_system_states` as a slice, to
complement the existing API and make documentation more concise.
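
The difference between the borrowing getter and the consuming `into_parts` can be sketched as below. The struct layout, the field types, and the `from_verified_parts` constructor are placeholders invented for this example; only the names `into_parts` and `start_system_states` come from the description:

```rust
/// Placeholder for the verified witness; the real type holds richer data.
pub struct VerifiedEpochs {
    epochs: Vec<u64>,
    start_system_states: Vec<Vec<u8>>,
}

impl VerifiedEpochs {
    /// Hypothetical constructor used only to build the example value.
    pub fn from_verified_parts(epochs: Vec<u64>, start_system_states: Vec<Vec<u8>>) -> Self {
        Self { epochs, start_system_states }
    }

    /// Borrowing getter: callers read the states without taking ownership.
    pub fn start_system_states(&self) -> &[Vec<u8>] {
        &self.start_system_states
    }

    /// Consuming accessor: moves every inner value out without cloning, so a
    /// custom indexer can take ownership of the data it is about to persist.
    pub fn into_parts(self) -> (Vec<u64>, Vec<Vec<u8>>) {
        (self.epochs, self.start_system_states)
    }
}
```

Taking `self` by value means the witness cannot be used after `into_parts`, which avoids copying potentially large per-epoch payloads.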

## Links to any relevant issues

Part of #11023 

## How the change has been tested

- [x] Basic tests (linting, compilation, formatting, unit/integration
tests)
@bingyanglin bingyanglin self-assigned this Jun 26, 2026
@bingyanglin bingyanglin added the node Issues related to the Core Node team label Jun 26, 2026
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Jun 26, 2026
@bingyanglin bingyanglin force-pushed the refactor/decouple-core-jsonrpc-types branch from a7a94b1 to 88c1242 Compare June 26, 2026 14:25
@bingyanglin bingyanglin changed the title refactor(core): PoC of decoupling iota-core from JSON-RPC refactor(core): PoC of decoupling node from JSON-RPC Jun 26, 2026

Labels

core-protocol documentation Improvements or additions to documentation node Issues related to the Core Node team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

gRPC: Decouple from JSON-RPC

4 participants