Conversation

@drahnr (Contributor, Author) commented Jan 19, 2026

From #1326 (comment):

Could we move all loading-related methods into a separate file? Basically, I'd split this file into:

src/state/mod.rs
src/state/loader.rs

I think overall, we need 3 loading modes:

* Load everything from memory (when rocksdb feature is disabled).
* Use persistent data from RocksDB (when rocksdb feature is enabled). This would probably require passing filepaths to the RocksDB databases on startup (or at least assuming some defaults).
* Rebuild RocksDB data from the database (when rocksdb feature is enabled and we are starting a new node, or the user sets some flag that indicates that RocksDB databases are to be re-built).

I think currently we have 1 and 2, and maybe some parts of 3.
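
For illustration, the three modes could be modeled roughly along these lines (a minimal sketch with hypothetical names, not the actual miden-node types):

use std::path::PathBuf;

/// Sketch only: names are illustrative, not the actual miden-node API.
enum TreeStoreMode {
    /// 1. Load everything from memory (rocksdb feature disabled).
    InMemory,
    /// 2. Reuse persistent RocksDB data at the given location (rocksdb feature enabled).
    Persistent { rocksdb_dir: PathBuf },
    /// 3. Drop any existing RocksDB data and rebuild it from the sqlite database.
    RebuildFromSqlite { rocksdb_dir: PathBuf },
}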

Changes

  • Adds a --force-rebuild flag to the CLI to force rebuilding of the tree stores from sqlite
  • Errors out if we detect a mismatch between the tree stores and sqlite

/// has diverged from the database. This will delete existing tree storage and rebuild
/// it from scratch, which may take some time for large databases.
#[arg(long = "force-rebuild-tree-storage", default_value_t = false)]
force_rebuild_tree_storage: bool,

@drahnr (Contributor, Author)

Relevant code to review 👆
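
For context, a minimal sketch of how a flag like the one above is typically honored at startup (hand-written illustration under assumed layout, not the code under review):

use std::path::Path;

// Illustration only: delete-then-rebuild gated by the flag above.
fn init_tree_storage(force_rebuild: bool, rocksdb_dir: &Path) -> std::io::Result<()> {
    if force_rebuild && rocksdb_dir.exists() {
        // Drop the existing RocksDB data so it is rebuilt from sqlite below.
        std::fs::remove_dir_all(rocksdb_dir)?;
    }
    // ... open the tree stores, repopulating from sqlite if they are missing ...
    Ok(())
}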

@Mirko-von-Leipzig (Collaborator) commented Jan 20, 2026

I'm not convinced we want this to be an explicit option. We can also trigger this by manually deleting the database, which should be sufficient so long as things are sort of working okay.

From a deployment perspective, this is problematic because we would have it permanently set to either true or false. Doing this as a once-off would require manual intervention in any case.

@bobbinth (Contributor)

Good point! I think manually removing the files is fine. We should probably just document this (as the behavior may not be obvious).

@drahnr (Contributor, Author)

I think the option is preferred, since it doesn't require knowing what all of this means. Deleting a specific subset of files seems like an unnecessary footgun for an operator.

@Mirko-von-Leipzig (Collaborator)

But there is no operator that will be able to use it, since you cannot "deploy" with this option? It's only usable in a local context imo.

We similarly don't have an option to rebuild the sqlite database, for example. Though technically it's possible from raw blocks.

imo the fewer options there are, the better. The rarer the occurrence, the more "manual" it should be. If this becomes a common problem, then sure, we can address it differently.

@drahnr (Contributor, Author)

I agree that it should be done as a subcommand; the current form would require a separate deployment cycle, which makes it less useful.
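
A rough sketch of the subcommand shape being suggested (hypothetical clap definitions, not part of this PR):

use clap::{Parser, Subcommand};

/// Hypothetical CLI layout: the rebuild becomes a one-off maintenance command.
#[derive(Parser)]
struct Cli {
    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Run the node as usual.
    Start,
    /// Rebuild the tree stores from sqlite, then exit.
    RebuildTreeStorage,
}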

@drahnr (Contributor, Author) commented Jan 19, 2026

Left markers for the most relevant code changes.

@bobbinth (Contributor) left a comment

Looks good! Thank you! I left some comments inline - many of them are for the future.

Comment on lines +208 to +209
NullifierTree::with_storage_from_entries(self, entries)
.map_err(StateInitializationError::FailedToCreateNullifierTree)
@bobbinth (Contributor)

Not for this PR, but it would be great to make the methodology of loading account and nullifier trees consistent. Currently, for the nullifier tree we use NullifierTree::with_storage_from_entries() and for the account tree we use AccountTree::new().

Some of these changes may need to happen in miden-base. Let's create issues for these (unless we have them already).

@drahnr (Contributor, Author)

My thinking is that with_storage_from_entries won't work anyway for larger nullifier sets, and hence this will go away in the next PR, #1536.

/// has diverged from the database. This will delete existing tree storage and rebuild
/// it from scratch, which may take some time for large databases.
#[arg(long = "force-rebuild-tree-storage", default_value_t = false)]
force_rebuild_tree_storage: bool,

@Mirko-von-Leipzig (Collaborator)

Currently we return an error if we have a root mismatch during apply_block.

if nullifier_tree_update.as_mutation_set().root() != header.nullifier_root() {
return Err(InvalidBlockError::NewBlockInvalidNullifierRoot.into());
}

Should we have a similar plan if the corruption happens "live"? It's a bit of a weird situation since we can't know if it's an SMT bug or a block-building bug.

@drahnr (Contributor, Author)

I think we'd just bubble the error up, find the corrupted tree store on startup, and handle it that way.

@Mirko-von-Leipzig (Collaborator)

Right, but this doesn't actually bubble up to anywhere. This is part of the gRPC server on the store side, so it gets sent back to the block-producer, which shrugs, rolls the block back, and tries again.

@drahnr (Contributor, Author)

I'll create a follow-up issue; maybe it's time for fatality to disambiguate which errors are relevant for the query response and which are an actual issue for the running service.
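
A hand-rolled sketch of that split (variant names are illustrative; the fatality crate automates this kind of fatal/non-fatal distinction):

use thiserror::Error;

/// Illustration only: separating "reject this request" from "the service is broken".
#[derive(Debug, Error)]
enum ApplyBlockError {
    /// Non-fatal: the submitted block is bad; report it in the query response.
    #[error("new block has an invalid nullifier root")]
    InvalidNullifierRoot,
    /// Fatal: our own tree storage diverged; the service should shut down.
    #[error("tree storage diverged from the sqlite database")]
    TreeStorageDiverged,
}

impl ApplyBlockError {
    fn is_fatal(&self) -> bool {
        matches!(self, Self::TreeStorageDiverged)
    }
}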

drahnr changed the title from "feat: add force rebuild and add a sanity check for sync state of rocksdb vs sqlite" to "feat: add a sanity check for sync state of rocksdb vs sqlite, minor partitioning changes" on Jan 20, 2026
drahnr force-pushed the bernhard-startup-build-and-rebuild-rocksdb branch from 8c2d310 to a4f0119 on January 20, 2026 13:04
drahnr force-pushed the bernhard-startup-build-and-rebuild-rocksdb branch from a4f0119 to 24b0977 on January 20, 2026 13:05
ntx_builder_listener,
block_producer_listener,
data_directory: dir,
grpc_timeout: std::time::Duration::from_secs(30),

@drahnr (Contributor, Author)

How did this compile previously??

drahnr merged commit 0fa7208 into next on Jan 20, 2026 (18 checks passed)
drahnr deleted the bernhard-startup-build-and-rebuild-rocksdb branch on January 20, 2026 14:09
Comment on lines +94 to +96
nullifier_tree: miden_protocol::block::nullifier_tree::NullifierTree<
miden_protocol::crypto::merkle::smt::LargeSmt<S>,
>,
@bobbinth (Contributor)

nit (but let's address in a follow-up): we don't need to use fully-qualified paths here.
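
For instance, with use imports the field would shorten to (same paths as in the snippet above):

use miden_protocol::block::nullifier_tree::NullifierTree;
use miden_protocol::crypto::merkle::smt::LargeSmt;

// Same field as above, minus the fully-qualified paths.
nullifier_tree: NullifierTree<LargeSmt<S>>,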
