@brice-stacks
Contributor

Improves the UX by merging `validate-naka-block` and `validate-block`, improves lookup time by optimizing the queries, and improves the output for a better experience. Adds an `--early-exit` flag that exits on the first error instead of processing all blocks.

@codecov
codecov bot commented Dec 2, 2025

Codecov Report

❌ Patch coverage is 1.70213% with 231 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.95%. Comparing base (7ff75a2) to head (48db5d0).
⚠️ Report is 52 commits behind head on develop.

Files with missing lines Patch % Lines
contrib/stacks-inspect/src/lib.rs 0.00% 230 Missing ⚠️
stackslib/src/chainstate/stacks/db/blocks.rs 80.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6735      +/-   ##
===========================================
+ Coverage    70.05%   71.95%   +1.89%     
===========================================
  Files          578      579       +1     
  Lines       358373   360160    +1787     
===========================================
+ Hits        251064   259140    +8076     
+ Misses      107309   101020    -6289     
Files with missing lines Coverage Δ
contrib/stacks-inspect/src/main.rs 0.00% <ø> (ø)
stackslib/src/chainstate/stacks/db/blocks.rs 82.66% <80.00%> (+10.09%) ⬆️
contrib/stacks-inspect/src/lib.rs 4.51% <0.00%> (-15.37%) ⬇️

... and 390 files with indirect coverage changes


Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Comment on lines +227 to +231
fn collect_block_entries_for_selection(
db_path: &str,
selection: &BlockSelection,
chainstate: &StacksChainState,
) -> Vec<BlockScanEntry> {

sanity check: isn't this method currently returning 2n blocks when called with BlockSelection::Last and BlockSelection::First? seems like it will get the first N and last N blocks from both tables

Contributor Author

You're right. I decided to just remove `First`, since that doesn't really seem like a useful scenario anyway, and then fixed `Last`.

> I decided to just remove first, since that doesn't really seem like a useful scenario anyway

And if someone needs it after all, they can use a height range starting at 0.
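The thread does not say how `Last` was fixed. One way to avoid returning 2N entries is to merge the per-table candidates and keep only the N highest. The sketch below assumes that approach; the `Entry` type and `last_n` function are hypothetical names, not code from this PR.

```rust
/// Hypothetical, simplified stand-in for a scanned block entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub height: u64,
    pub hash: String,
}

/// Combine candidates from both staging tables and keep only the `n`
/// highest blocks overall, instead of `n` from each table.
pub fn last_n(epoch2: &[Entry], nakamoto: &[Entry], n: usize) -> Vec<Entry> {
    let mut merged: Vec<Entry> = epoch2.iter().chain(nakamoto.iter()).cloned().collect();
    // Sort by height, highest first, then cut down to n entries in total.
    merged.sort_by(|a, b| b.height.cmp(&a.height));
    merged.truncate(n);
    merged
}
```

Each table only needs to contribute its own top N candidates, because no block outside those can be among the overall top N.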

panic!("Failed to open staging blocks DB at {staging_blocks_db_path}: {e}");
});
let sql = format!(
"SELECT index_block_hash, consensus_hash, anchored_block_hash, height FROM staging_blocks {clause}"

nit: since we are only consuming `index_block_hash`, maybe we can simplify the query to `SELECT index_block_hash FROM staging_blocks {clause}`

Contributor Author

Yeah, I had that there originally because I needed other fields when I was also reusing the new structure in other places, but then ended up deleting that code. I'll delete these now.

for (i, index_block_hash) in index_block_hashes.iter().enumerate() {
if i % 100 == 0 {
println!("Checked {i}...");
let sql = format!("SELECT index_block_hash, height FROM nakamoto_staging_blocks {clause}");

nit: since we are only consuming `index_block_hash`, maybe we can simplify the query to `SELECT index_block_hash FROM nakamoto_staging_blocks {clause}`

Comment on lines 232 to 233
let mut seen = HashSet::new();
let mut entries = Vec::new();

suuuuper nit: These two variables each hold over 5 million `index_block_hash` values when the function is called with `BlockSelection::All`, stored as 64-byte hex strings. That is roughly 320 MB per variable (plus additional overhead from the data structures). It's not a concern right now, but we could reduce memory usage by about half if we move the

let index_block_hash = StacksBlockId::from_hex(&index_block_hash_hex).unwrap();

processing inside this function and avoid keeping the hex strings around.
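A minimal sketch of that idea follows: decode each hex string as the row is read, so only the 32-byte form is retained. `BlockId` is a hypothetical stand-in for `StacksBlockId`, and `collect_ids` is an illustrative helper, not a function from this PR.

```rust
/// Hypothetical stand-in for `StacksBlockId`: 32 raw bytes rather than
/// a 64-character hex string.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct BlockId(pub [u8; 32]);

impl BlockId {
    /// Decode a 64-character hex string; returns `None` on a bad length or digit.
    pub fn from_hex(hex: &str) -> Option<Self> {
        let bytes = hex.as_bytes();
        if bytes.len() != 64 {
            return None;
        }
        let mut out = [0u8; 32];
        for (i, pair) in bytes.chunks_exact(2).enumerate() {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            out[i] = (hi * 16 + lo) as u8;
        }
        Some(BlockId(out))
    }
}

/// Convert every row's hex string while iterating, so the hex strings are
/// dropped immediately and only the compact ids are collected.
pub fn collect_ids<I>(rows: I) -> Result<Vec<BlockId>, String>
where
    I: IntoIterator<Item = String>,
{
    rows.into_iter()
        .map(|hex| BlockId::from_hex(&hex).ok_or_else(|| format!("invalid block id: {hex}")))
        .collect()
}
```

Returning a `Result` here instead of calling `unwrap()` is a design choice of the sketch; the snippet quoted above unwraps directly.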

as always, feel free to skip the changes for the nits

Contributor Author

This is a good point. Also, a good reason to just remove the seen tracking altogether.

Contributor Author

updated

Remove the `seen` tracking since it doesn't really gain us much and uses
a lot of space when processing many blocks.
Comment on lines +188 to +189
if start > end {
return Err("<start-block> must be <= <end-block>".into());

If we want to keep end exclusive (see other comment), start == end wouldn't make much sense (it would always be zero blocks), so it would indicate a likely mistake. So in that case maybe we should do

Suggested change
- if start > end {
-     return Err("<start-block> must be <= <end-block>".into());
+ if start >= end {
+     return Err("<start-block> must be < <end-block>".into());

instead?

Comment on lines +153 to +154
let blocks = end.saturating_sub(*start);
format!("WHERE orphaned = 0 ORDER BY index_block_hash ASC LIMIT {start}, {blocks}")

This makes end exclusive. We probably want it inclusive for consistency with HeightRange?
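For illustration, an inclusive end could be turned into an offset and row count roughly as follows. This is a sketch under assumptions: `inclusive_window` and `limit_clause` are hypothetical helpers, and it covers only the arithmetic for a single table, not how rows from the two staging tables are combined.

```rust
/// Turn an inclusive `[start, end]` range into `(offset, count)`.
/// With an inclusive end, `start == end` is valid and selects one block.
pub fn inclusive_window(start: u64, end: u64) -> Result<(u64, u64), String> {
    if start > end {
        return Err("<start-block> must be <= <end-block>".into());
    }
    // `end - start` cannot underflow here; adding 1 can overflow only for
    // the full u64 range, which is reported as an error.
    let count = (end - start).checked_add(1).ok_or("range too large")?;
    Ok((start, count))
}

/// Build the SQLite `LIMIT <offset>, <count>` fragment for the range.
pub fn limit_clause(start: u64, end: u64) -> Result<String, String> {
    let (offset, count) = inclusive_window(start, end)?;
    Ok(format!("LIMIT {offset}, {count}"))
}
```

With this convention, blocks 10 through 20 yield 11 rows, whereas `end.saturating_sub(*start)` yields 10.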

Actually even worse, I don't think this even works?

Say the user wants blocks 10 through 20. If there are 10 Nakamoto blocks with a hash that is smaller than any epoch2 block, then the first block in the overall result set is the first epoch2 block. But because our OFFSET was 10, we never retrieved that block.
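The problem can be reproduced with plain sorted slices standing in for the two tables. The sketch below contrasts applying the offset per table with one possible fix: read the first `start + count` rows from each table without an offset, merge them, and only then apply the window. Both function names are hypothetical, and the fix shown is an assumption, not necessarily what the PR ended up doing.

```rust
/// Problematic approach: apply the same offset and count to each table
/// separately, then combine. Both inputs are assumed sorted ascending.
pub fn per_table_window(a: &[&str], b: &[&str], start: usize, count: usize) -> Vec<String> {
    let mut out: Vec<String> = a
        .iter()
        .skip(start)
        .take(count)
        .chain(b.iter().skip(start).take(count))
        .map(|s| s.to_string())
        .collect();
    out.sort();
    out
}

/// One possible fix: take the first `start + count` rows of each table,
/// merge them into a single sorted list, and apply the window afterwards.
pub fn merged_window(a: &[&str], b: &[&str], start: usize, count: usize) -> Vec<String> {
    let upper = start + count;
    let mut merged: Vec<String> = a
        .iter()
        .take(upper)
        .chain(b.iter().take(upper))
        .map(|s| s.to_string())
        .collect();
    merged.sort();
    merged.into_iter().skip(start).take(count).collect()
}
```

The fix works because the first `start + count` rows of the merged ordering can only come from the first `start + count` rows of each individual table.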

next_staging_block.commit_burn,
next_staging_block.sortition_burn,
);
Ok(())

I assume you meant to change replay_block to return a Result<,> and then return that result here?

Returning an unconditional Ok(()) regardless of what happens in replay_block (if it even ever returns -- it has some process::exits in there) doesn't seem like it's the intent.
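A minimal sketch of the shape being suggested, with the caller propagating the result instead of returning a fixed `Ok(())`. Everything here is hypothetical: the error type, the simplified `replay_block` signature, and the `replay_all` driver are illustrations, not the real functions in `stacks-inspect`.

```rust
/// Hypothetical error type for a failed replay.
#[derive(Debug, PartialEq)]
pub enum ReplayError {
    CostMismatch { expected: u64, actual: u64 },
}

/// Simplified stand-in for `replay_block`: reports failure through its
/// return value instead of exiting the process.
pub fn replay_block(expected_cost: u64, actual_cost: u64) -> Result<(), ReplayError> {
    if expected_cost == actual_cost {
        Ok(())
    } else {
        Err(ReplayError::CostMismatch { expected: expected_cost, actual: actual_cost })
    }
}

/// The caller returns whatever `replay_block` returned.
pub fn replay_staging_block(expected_cost: u64, actual_cost: u64) -> Result<(), ReplayError> {
    replay_block(expected_cost, actual_cost)
}

/// Driver loop: collect failures, optionally stopping at the first one
/// (the behavior described for `--early-exit`).
pub fn replay_all(blocks: &[(u64, u64)], early_exit: bool) -> Vec<(usize, ReplayError)> {
    let mut failures = Vec::new();
    for (i, &(expected, actual)) in blocks.iter().enumerate() {
        if let Err(err) = replay_staging_block(expected, actual) {
            failures.push((i, err));
            if early_exit {
                break;
            }
        }
    }
    failures
}
```

Reporting errors through `Result` leaves the exit decision with the caller, which is what makes an early-exit option straightforward to implement in one place.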

3 participants