
Add dedicated IBD rules for Nakamoto #5655

Open · wants to merge 39 commits into base: develop
Conversation

@jcnelson (Member) commented on Jan 4, 2025:

This fixes #5642 by adding a dedicated IBD (initial block download) inference rule for Nakamoto. The Stacks node is in IBD mode when either of the following conditions is true:

  • The sortition height is less than the Bitcoin block height
  • The highest available tenure (as determined by the Nakamoto block downloader) is higher than the ongoing Stacks tenure

This (hopefully) fixes some edge cases we've seen on testnet whereby a node can erroneously believe it is not synced when it really is, which impaired the affected node's ability to participate in StackerDB replication. I intend to test this on naka3.sh, on testnet, and on a single mainnet signer.

Writing the code for that second criterion led to the discovery of #5649 and #5650, since I had been using rc_consensus_hash as the ongoing Stacks tenure when in fact it was not.

EDIT: this also contains #5667, so let's merge that first.
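
For illustration, here is a minimal sketch of how the two IBD conditions above could be combined. The names (`sortition_height`, `burnchain_height`, `highest_available_tenure`, `ongoing_tenure`) are assumptions for the example, not the identifiers actually used in this PR.

```rust
// A minimal sketch of the two IBD conditions described above. All names here
// (sortition_height, burnchain_height, highest_available_tenure, ongoing_tenure)
// are illustrative assumptions, not the PR's actual identifiers.
fn is_in_ibd(
    sortition_height: u64,
    burnchain_height: u64,
    highest_available_tenure: Option<u64>,
    ongoing_tenure: u64,
) -> bool {
    // Condition 1: the node's sortitions lag the Bitcoin (burnchain) chain tip.
    let behind_burnchain = sortition_height < burnchain_height;

    // Condition 2: the Nakamoto block downloader knows of a tenure higher than
    // the ongoing Stacks tenure. If no highest tenure is known, this condition
    // does not apply.
    let behind_tenures =
        highest_available_tenure.map_or(false, |highest| highest > ongoing_tenure);

    behind_burnchain || behind_tenures
}
```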

@jcnelson jcnelson requested a review from a team as a code owner January 4, 2025 05:38
@obycode (Contributor) previously approved these changes on Jan 6, 2025, commenting:

LGTM!

@jbencin (Contributor) previously approved these changes on Jan 6, 2025, commenting:

Left some style comments but looks fine to me

@aldur aldur added this to the 3.1.0.0.3 milestone Jan 7, 2025
@jcnelson jcnelson modified the milestones: 3.1.0.0.4, 3.1.0.0.3 Jan 13, 2025
@aldur aldur modified the milestones: 3.1.0.0.3, 3.1.0.0.4 Jan 13, 2025
@aldur aldur modified the milestones: 3.1.0.0.4, 3.1.0.0.5 Jan 21, 2025
@jferrant (Contributor) previously approved these changes on Jan 21, 2025.

@obycode left a comment:

Just some minor comments for now, but there are conflicts and failing tests.

@aldur aldur modified the milestones: 3.1.0.0.5, 3.1.0.0.6 Feb 4, 2025
@kantai kantai changed the title Fix/5642 Add dedicated IBD rules for Nakamoto Feb 4, 2025
Comment on lines 280 to 278
if TEST_BLOCK_ANNOUNCE_STALL.get() {
if relay::fault_injection::stacks_announce_is_blocked() {
A contributor commented:

These seem like they could be independently controlled... TEST_BLOCK_ANNOUNCE_STALL is used in the integration tests to prevent the miner themselves from announcing a new block (this is used to test that the signer set can announce blocks by themselves) -- but if the relayer stall is also active, it seems like it prevents the chains coordinator thread from waking up?

@jcnelson (Member, Author) replied:

At the time this PR was written, this was an attempt to make some CI tests less flaky. It appears that not only is this not necessary anymore, but also my code here has led to CI breakage.
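
As an aside on the reviewer's point about independent control, here is a sketch (using assumed names, not the actual stacks-core API) of keeping the two fault-injection switches separate, so that stalling the miner's own announcement never implies stalling the relayer:

```rust
// Hypothetical stand-ins for TEST_BLOCK_ANNOUNCE_STALL and the relayer's
// announce-blocked fault-injection flag; each can be flipped by test code
// without affecting the other.
use std::sync::atomic::{AtomicBool, Ordering};

static MINER_ANNOUNCE_STALL: AtomicBool = AtomicBool::new(false);
static RELAYER_ANNOUNCE_BLOCKED: AtomicBool = AtomicBool::new(false);

/// Test hook: should the miner skip announcing its own block?
fn miner_announce_is_stalled() -> bool {
    MINER_ANNOUNCE_STALL.load(Ordering::SeqCst)
}

/// Test hook: should the relayer skip announcing Stacks blocks?
fn relayer_announce_is_blocked() -> bool {
    RELAYER_ANNOUNCE_BLOCKED.load(Ordering::SeqCst)
}
```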

Comment on lines 955 to 956
config.tenure_last_block_proposal_timeout = Duration::from_secs(0);

A contributor commented:

How is the forked_tenure_testing test related to this changeset? Why did this test need to be altered, and can you describe the intended behavior changes to the test?

@jcnelson (Member, Author) replied:

I've reverted it -- it was part of me trying to un-flake this test from a while ago.

Comment on lines +417 to +419
// if the highest available tenure is known, then is it the same as the ongoing stacks
// tenure? If so, then we're not IBD. If not, then we're IBD.
// If it is not known, then we're not in IBD.
A contributor commented:

Doesn't this mean that a running node operating at chain tip would switch into IBD = true at every tenure boundary?

@@ -400,6 +400,67 @@ impl RelayerThread {
|| !self.config.miner.wait_for_block_download
}

/// Compute and set the global IBD flag from a NetworkResult
A contributor commented:

Suggested change:
- /// Compute and set the global IBD flag from a NetworkResult
+ /// Compute and set the global initial block download (IBD) flag using data from the given NetworkResult

@kantai (Contributor) left a comment:

This changeset appears to have broken many of the integration tests.

This should also have some kind of unit test (or tight assertions in an integration test) about what the expected value of IBD should be given different values of the NetworkResult: it's not clear to me what the intended behavior around the arrival of a new tenure is, and then what the downstream impact on miner commitments would end up being.

I also have some questions about the necessity of this -- it seems like #5735 actually resolves the testnet genesis sync issues (and the mainnet ones as well), so what is this PR solving? Is it speeding up genesis sync? Restarts? I can't tell from looking at this PR.
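
To illustrate the kind of unit-level assertions requested above, here is a hedged sketch of a test. The helper is a local stand-in for the PR's IBD rule (assumed signature and values, not the actual RelayerThread method); a real test would drive the decision from a NetworkResult rather than bare heights.

```rust
// A sketch of unit tests pinning down the expected IBD value for a few cases.
// `is_in_ibd` below is a stand-in with an assumed signature, not the PR's code.
#[cfg(test)]
mod ibd_flag_tests {
    fn is_in_ibd(
        sortition_height: u64,
        burnchain_height: u64,
        highest_available_tenure: Option<u64>,
        ongoing_tenure: u64,
    ) -> bool {
        sortition_height < burnchain_height
            || highest_available_tenure.map_or(false, |h| h > ongoing_tenure)
    }

    #[test]
    fn ibd_when_sortitions_lag_burnchain() {
        assert!(is_in_ibd(90, 100, None, 5));
    }

    #[test]
    fn ibd_when_a_newer_tenure_is_downloadable() {
        assert!(is_in_ibd(100, 100, Some(6), 5));
    }

    #[test]
    fn not_ibd_at_chain_tip() {
        assert!(!is_in_ibd(100, 100, Some(5), 5));
    }
}
```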

@aldur aldur removed this from the 3.1.0.0.6 milestone Feb 11, 2025
Labels: none yet
Projects: 📋 Backlog
6 participants