Contributor

@kianenigma kianenigma commented Oct 3, 2025

This PR moves all operations related to staking elections from a mandatory on_initialize, which takes no account of weight, to an optional on_poll with accurate pre-execution weight checking.

Why

  • on_initialize is a mandatory hook. If a single parachain block happens to contain too many such hooks, the block can never be authored and imported. Solo and relay chains are more forgiving: you get one slow block instead of an indefinite stall.
  • For example, message-queue XCMs, the scheduler, and MBMs might overlap with the staking on_initialize in AH (unlikely, but entirely possible) and put the chain at risk.
  • In contrast, poll hooks:
    • Might not be called at all by frame-executive (e.g. during MBMs).
    • Have access to a clear WeightMeter, allowing the hook to decide whether to proceed or not.

Functional Changes

As seen by the minimal diff in existing tests, this change is almost a no-op in the absence of weight scarcity. The only difference is that the start signal from the signed pallet to the verifier pallet is now sent at the end of the signed phase, not at the beginning of the signed validation phase.

Non-Functional Changes

  • Now, the only pallets that call on_poll are multi_block (only the parent, not verifier and signed), and staking_async. This makes the code easier to audit.
  • Removes a lot of on_initialize terminology from weight functions
  • Cleans up some stale variations in the mock setup that allowed us to skip the signed pallet's on-initialize. These no longer make sense, as the parent pallet is the only one that calls on_poll.

Implementation/Review Notes

Overall Design

The overall idea is to move all operations to a model similar to dispatchables: before executing f(input) -> Result, we have access to a w(input) -> Weight that gives us the pre-execution weight. If the pre-execution weight fits in the remaining weight, we proceed with the execution. The execution may override the pre-execution weight with a smaller value if it wishes.

/// ### Type
///
/// The commonly used `(Weight, Box<dyn Fn() -> Option<Weight>>)` should be interpreted as such:
///
/// * The `Weight` is the pre-computed worst case weight of the operation that we are going to
///   do.
/// * The `Box<dyn Fn() -> Option<Weight>>` is the function that represents the work, which
///   will consume at most the said amount of weight.
///   * Optionally, it can return an updated weight that is more "accurate", based on the
///     execution.
fn per_block_exec(current_phase: Phase<T>) -> (Weight, Box<dyn Fn() -> Option<Weight>>) {
    ...
}
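
To make the "check first, then execute" flow concrete, here is a minimal sketch of how a caller might consume such a `(Weight, ExecuteFn)` pair. It is not the code in this PR: `Weight` is simplified to a `u64`, `Meter` is a hypothetical stand-in for the real `WeightMeter`, and `try_exec` is an illustrative helper name. Clamping the returned weight to the worst case is also an assumption of this sketch.

```rust
// Simplified stand-in for the runtime `Weight` type (assumption for this sketch).
type Weight = u64;

/// Hypothetical, simplified meter: tracks consumed weight against a fixed limit.
struct Meter {
    limit: Weight,
    consumed: Weight,
}

impl Meter {
    fn new(limit: Weight) -> Self {
        Self { limit, consumed: 0 }
    }

    /// Weight still available in this block.
    fn remaining(&self) -> Weight {
        self.limit.saturating_sub(self.consumed)
    }

    /// Whether `w` fits in the remaining weight.
    fn can_consume(&self, w: Weight) -> bool {
        w <= self.remaining()
    }

    /// Registers `w` as consumed.
    fn consume(&mut self, w: Weight) {
        self.consumed = self.consumed.saturating_add(w);
    }
}

/// The work to run; it may return a more accurate (smaller) weight.
type ExecuteFn = Box<dyn Fn() -> Option<Weight>>;

/// Runs `exec` only if its worst-case weight fits in the meter.
/// Returns `true` if the work was executed, `false` if it was skipped.
fn try_exec(meter: &mut Meter, worst_case: Weight, exec: ExecuteFn) -> bool {
    if !meter.can_consume(worst_case) {
        // Not enough weight left: skip and leave the meter untouched.
        return false;
    }
    // Use the refined weight if one is returned, never more than the worst case.
    let actual = exec().unwrap_or(worst_case).min(worst_case);
    meter.consume(actual);
    true
}
```

With a limit of 100, an operation whose worst case is 60 but which reports 40 leaves 60 remaining; a following operation with a worst case of 70 is skipped without touching the meter.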

Export Weight

While working on this PR, I realized that we were never registering the weight of the export process. The export is managed by the staking pallet, which previously had no way to know the weight of each export step.

We now alter the ElectionProvider::status interface so that it signals not only whether we are ready, but also, when we are, the weight of the next elect call.

fn status() -> Result<Option<Weight>, ()> {
	match <CurrentPhase<T>>::get() {
		// we're not doing anything.
		Phase::Off => Err(()),

		// we're doing sth but not ready.
		Phase::Signed(_) |
		Phase::SignedValidation(_) |
		Phase::Unsigned(_) |
		Phase::Snapshot(_) |
		Phase::Emergency => Ok(None),

		// we're ready, and this is the weight of the next step
		Phase::Done => Ok(Some(T::WeightInfo::export_non_terminal())),
		Phase::Export(p) =>
			if p.is_zero() {
				Ok(Some(T::WeightInfo::export_terminal()))
			} else {
				Ok(Some(T::WeightInfo::export_non_terminal()))
			},
	}
}
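
As an illustration of how a caller might interpret the new return type, here is a minimal sketch. It is not the staking pallet's actual logic: `Weight` is simplified to a `u64`, and `Step` and `next_step` are hypothetical names introduced only for this example.

```rust
// Simplified stand-in for the runtime `Weight` type (assumption for this sketch).
type Weight = u64;

/// Hypothetical outcome of one polling step on the caller side.
#[derive(Debug, PartialEq)]
enum Step {
    /// `Err(())`: the provider is not doing anything.
    Idle,
    /// `Ok(None)`: the provider is working, but not ready for `elect`.
    NotReady,
    /// Ready, but the next `elect` does not fit in the remaining weight.
    Deferred,
    /// `elect` can be called; carries the weight to register for it.
    Elect(Weight),
}

/// Maps a `status()` result and the remaining block weight to a decision.
fn next_step(status: Result<Option<Weight>, ()>, remaining: Weight) -> Step {
    match status {
        Err(()) => Step::Idle,
        Ok(None) => Step::NotReady,
        // Ready, but too expensive for this block: try again later.
        Ok(Some(w)) if w > remaining => Step::Deferred,
        Ok(Some(w)) => Step::Elect(w),
    }
}
```

The point of the design is that the caller learns the cost of the next export step before calling `elect`, so it can defer the call instead of overrunning the block.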

Integration

The only breaking change of this PR is:

impl multi_block::Config for Runtime {
    // .. 
    type Signed = multi_block_signed::Pallet<Self>;
}

TODO

  • Unit tests
  • Run all papi-integration tests at the end once.
  • Weight update

@kianenigma kianenigma requested a review from a team as a code owner October 3, 2025 12:26
@kianenigma kianenigma added T2-pallets This PR/Issue is related to a particular pallet. A4-backport-unstable2507 Pull request must be backported to the unstable2507 release branch labels Oct 3, 2025
@paritytech-workflow-stopper

All GitHub workflows were cancelled due to the failure of one of the required jobs.
Failed workflow url: https://github.com/paritytech/polkadot-sdk/actions/runs/18223135929
Failed job name: cargo-clippy

@sigurpol sigurpol self-requested a review October 3, 2025 13:21
@kianenigma
Contributor Author

/cmd bench --help

Contributor

github-actions bot commented Oct 6, 2025

Command help:
usage: /cmd bench [-h] [--quiet] [--clean] [--image IMAGE]
                  [--runtime [{dev,westend,rococo,asset-hub-westend,asset-hub-rococo,bridge-hub-rococo,bridge-hub-westend,collectives-westend,coretime-rococo,coretime-westend,glutton-westend,people-rococo,people-westend} ...]]
                  [--pallet [PALLET ...]] [--fail-fast]

options:
  -h, --help            show this help message and exit
  --quiet               Won't print start/end/failed messages in PR
  --clean               Clean up the previous bot's & author's comments in PR
  --image IMAGE         Override docker image '--image
                        docker.io/paritytech/ci-unified:latest'
  --runtime [{dev,westend,rococo,asset-hub-westend,asset-hub-rococo,bridge-hub-rococo,bridge-hub-westend,collectives-westend,coretime-rococo,coretime-westend,glutton-westend,people-rococo,people-westend} ...]
                        Runtime(s) space separated
  --pallet [PALLET ...]
                        Pallet(s) space separated
  --fail-fast           Fail fast on first failed benchmark

**Examples**:
 Runs all benchmarks 
 /cmd bench

 Runs benchmarks for pallet_balances and pallet_multisig for all runtimes which have these pallets. **--quiet** makes it to output nothing to PR but reactions
 /cmd bench --pallet pallet_balances pallet_xcm_benchmarks::generic --quiet
 
 Runs bench for all pallets for westend runtime and fails fast on first failed benchmark
 /cmd bench --runtime westend --fail-fast
 
 Does not output anything and cleans up the previous bot's & author command triggering comments in PR 
 /cmd bench --runtime westend rococo --pallet pallet_balances pallet_multisig --quiet --clean

@kianenigma
Contributor Author

/cmd bench --pallet pallet_election_provider_multi_block pallet_election_provider_multi_block_signed pallet_election_provider_multi_block_verifier pallet_election_provider_multi_block_unsigned --runtime asset-hub-westend

Contributor

github-actions bot commented Oct 6, 2025

Command "bench --pallet pallet_election_provider_multi_block pallet_election_provider_multi_block_signed pallet_election_provider_multi_block_verifier pallet_election_provider_multi_block_unsigned --runtime asset-hub-westend" has started 🚀 See logs here

Contributor

github-actions bot commented Oct 6, 2025

Command "bench --pallet pallet_election_provider_multi_block pallet_election_provider_multi_block_signed pallet_election_provider_multi_block_verifier pallet_election_provider_multi_block_unsigned --runtime asset-hub-westend" has failed ❌! See logs here

//!
//! ### Phase Transition
//!
//! Within all 4 pallets only the parent pallet is allowed to move forward the phases. As of now,

@andreitrand andreitrand Oct 6, 2025

Nit: s/"move forward the phases"/"move the phases forward"/

_ => T::WeightInfo::on_initialize_nothing(),
};
fn on_poll(_now: BlockNumberFor<T>, weight_meter: &mut WeightMeter) {
// we need current phase to be read in any case -- we can live with it.

@andreitrand andreitrand Oct 6, 2025

Nit: Comment not clear enough, maybe rephrase?

crate::log!(info, "TESTING: Starting election at block {}", _now);
crate::mock::MultiBlock::start().unwrap();
}
// NOTE: why in here? bc it is more accessible, for example `roll_to_with_ocw`.

@andreitrand andreitrand Oct 6, 2025

Nit: s/bc/because/ (for extra clarity)

/// * Upon last page of `Phase::Signed`, instruct the `Verifier` to start, if any solution
/// exists.
///
/// What it does not:

Nit: "what it does not do"

Phase::Off => Err(()),

// we're doing sth but not read.
// we're doing sth but not ready.

@andreitrand andreitrand Oct 6, 2025

Nit: s/sth/something/

);

// we have 1 block left in signed verification, but we cannot do anything here.
// we have 2 block left in signed verification, but we cannot do anything here.

Nit: s/block/blocks/


// we have 1 block left in signed verification, but we cannot do anything here.
// we have 2 block left in signed verification, but we cannot do anything here.
// status is not set, as not enough time to do anything.

Nit: (...) as there's not enough time (...)

Self::Export(0) => Self::Off,
Self::Export(non_zero_left) =>
Self::Export(non_zero_left.defensive_saturating_sub(One::one())),
// Export never moves forward via this function, and is always manually set in `elect`

Nit: (...) set in the elect code path.

/// * Optionally, it can return an updated weight that is more "accurate", based on the
/// execution.
fn per_block_exec(current_phase: Phase<T>) -> (Weight, Box<dyn Fn() -> Option<Weight>>) {
type ExecuteFn = Box<dyn Fn() -> Option<Weight>>;

@andreitrand andreitrand Oct 6, 2025

Since the syntax Box<dyn Fn() -> Option<Weight>> is reused several times in this PR, wouldn't it make sense (for extra clarity) to define it as the type ExecuteFn at the start of the file and then reuse that type alias instead?

/// called. `Ok(false)` means we are doing something, but work is still ongoing. `elect` should
/// not be called. `Ok(true)` means we are done and ready for a call to `elect`.
fn status() -> Result<bool, ()>;
/// * `Err(())` should signal that we are not doing anything, and `elect` should def. not be

Nit: given the emphasis on it here, I suggest expanding "def." to the full word "definitely".
