Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(resharding) - Make shard ids non-contiguous #12181

Open
wants to merge 39 commits into
base: master
Choose a base branch
from

Conversation

wacban
Copy link
Contributor

@wacban wacban commented Oct 1, 2024

This is part 1 of adding support for non-contiguous shard ids. The principle idea is to make ShardId into a newtype so that it's not possible to use it to index arrays with chunk data. In addition I'm adding ShardIndex type and a mapping between shard indices and shard ids so that it's possible to covert one to another as necessary.

The TLDR of this approach is to make the types right, fix compiler errors and pray to the software gods that things work out.

I am now giving up on trying to make the migration in a single PR. Instead I am introducing some temporary structures and methods that are compatible with both approaches. My current plan for the migration is as follows:

  1. Switch to the new ShardId definition.
  2. Fix some number of compilation errors (using the temporary objects)
  3. Switch back to the old definition
  4. PR, review, merge
  5. Repeat 1-4 until there are no more errors.
  6. Cleanup the temporary objects
  7. Adjust some tests to use the new ShardLayout with non-contiguous shard ids.
  8. Try to get rid of the mapping wherever possible

There are a few common themes in this PR:

  • read the shard layout and convert shard id to shard index in order to use it to index some array or chunk data
  • replace enumerate with reading the shard id directly from the chunk header / other chunk data
  • replace using shard id by adding enumerate to get the shard index
  • add ? to shard id in tracing logs because the newtype ShardId doesn't work without it

must-review files:

  • shard_layout.rs
  • primitives-core/src/types.rs
  • shard_assignment.rs

good-to-review files:

  • state_transition_data.rs

/// indices in range 0..NUM_SHARDS and casting to ShardId. Once the transition
/// if fully complete it potentially may be simplified to a regular type alias.
///
/// TODO get rid of serde
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What prevents removing serde now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests use serde as a snapshotting library to make sure old layouts are not
changed accidentally.

PartialOrd,
Ord,
)]
pub struct ShardId(u64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be easily changed to ShardId(u32)?

}

/// Get the numerical value of the shard id. This should not be used as an
/// index into an array, as the shard id may be any arbitrary number.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add a doc-link to get_shard_index?


impl ShardId {
/// Create a new shard id. Please note that this function should not be used
/// directly. Instead the ShardId should be obtained from the shard_layout.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does seem slightly counter-intuitive to say new shouldn't be used, given the fact it's a trivial constructor. What are the risks?

@@ -444,6 +476,14 @@ impl ShardLayout {
pub fn shard_uids(&self) -> impl Iterator<Item = ShardUId> + '_ {
self.shard_ids().map(|shard_id| ShardUId::from_shard_id_and_layout(shard_id, self))
}

pub fn get_shard_index(&self, shard_id: ShardId) -> usize {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we should say this returns a index from 0..num_shards, with stable ordering

btw will there be guarantees about the index of a shard S when shard_layout changes and S is in the old and new layout?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will add comments and implementation later. I just want to make sure that this approach works fine first.

btw will there be guarantees about the index of a shard S when shard_layout changes and S is in the old and new layout?

Good question and it's the opposite. The shard indices will change for almost all shards during resharding. It is ok though because the shard index is maintained in the shard layout. The users should make sure to use the correct shard layout when getting the shard index - but again I think this expectation is reasonable and quite common.

@shreyan-gupta shreyan-gupta self-requested a review October 2, 2024 06:49
Copy link
Contributor

@shreyan-gupta shreyan-gupta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future iterations, we should try to get rid of shard_index from places like sample_chunk_producer and ChunkEndorsementsBitmap by changing their structure, but for now this approach looks reasonable.


/// Get the numerical value of the shard id. This should not be used as an
/// index into an array, as the shard id may be any arbitrary number.
pub fn get(self) -> u64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we keep shard_id.into() as primary conversion method instead of shard_id.get() while we are sorting this mess? That looks more natural to me and the calls to .into() are always explicit in rust

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, we could use ShardId::from(id) instead of ShardId::new(id)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for into. see my similar comment about operations involving u64 or other integers as well.

shard_uids.iter().map(|_| HashSet::new()).collect();

let mut shard_account_ids: BTreeMap<ShardId, HashSet<AccountId>> =
shard_ids.iter().map(|&shard_id| (shard_id, HashSet::new())).collect();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: shard_layout.shard_ids().map(...).collect();

Copy link
Contributor

@tayfunelmas tayfunelmas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general a big +1 for the new type ShardId(u64)!

@@ -85,7 +85,7 @@ impl CongestionControl {
// `clamped_f64_fraction` clamps to exactly 1.0.
if congestion == 1.0 {
// Red traffic light: reduce to minimum speed
if sender_shard == self.info.allowed_shard() as u64 {
if sender_shard.get() == self.info.allowed_shard() as u64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for these kinds of operations (to hide the internal u64 more), would it make sense to not have get() but overriding the operation involving u64 (in this particular case for example overriding == op or adding equals(u64))?

similarly for ops like self.set_allowed_shard(allowed_shard.get() as u16);, defining into() operations?


/// Get the numerical value of the shard id. This should not be used as an
/// index into an array, as the shard id may be any arbitrary number.
pub fn get(self) -> u64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for into. see my similar comment about operations involving u64 or other integers as well.

This is non-trivial change. I adjusted the shard assignment to fully
operate only on ShardIndex instead of ShardId.
@Trisfald
Copy link
Contributor

Trisfald commented Oct 2, 2024

I'm not sure if it's in the scope of this PR, but I had issues trying to initialize genesis with certain ShardLayoutV2. It fails around this point when shard ids aren't contiguous.

EDIT: nvm I think it will be fixed with this PR

let congestion_seed = apply_state.block_height.wrapping_add(apply_state.shard_id);
// TODO(wacban) Using non-contiguous shard id here breaks some
// assumptions. The shard index should be used here instead.
let congestion_seed =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the worst that can happen? The allowed shard is not chosen fairly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could lead to some shards never be the allowed shard. I'm not sure as I'm avoiding digging deeper into those non-trivial cases as this task is massive as it stands. For now a todo should do :)

@wacban
Copy link
Contributor Author

wacban commented Oct 2, 2024

I'm not sure if it's in the scope of this PR, but I had issues trying to initialize genesis with certain ShardLayoutV2. It fails around this point when shard ids aren't contiguous.

EDIT: nvm I think it will be fixed with this PR

I think it should be possible to use ShardLayoutV2 in tests as long as you set it up so that the shard ids are contiguous, ordered, etc. Resharding would break that however so yeah you'll need to wait for this PR.

@wacban
Copy link
Contributor Author

wacban commented Oct 4, 2024

@wacban
Copy link
Contributor Author

wacban commented Oct 4, 2024

@shreyan-gupta and @tayfunelmas regarding the new(), get() and into() - I'm on board. For now the temporary solution is to use the new_shard_id_tmp and shard_id_as_u32 methods that are compatible with both old and new ways. Once the migration is done I hope I can easily come back and automatically refactor that to some nice idiomatic rust patterns.

@wacban wacban marked this pull request as ready for review October 4, 2024 15:50
@wacban wacban requested a review from a team as a code owner October 4, 2024 15:50
Copy link

codecov bot commented Oct 4, 2024

Codecov Report

Attention: Patch coverage is 76.94581% with 234 lines in your changes missing coverage. Please review.

Project coverage is 71.63%. Comparing base (0aa1343) to head (ce4e061).

Files with missing lines Patch % Lines
chain/chain/src/chain.rs 64.94% 14 Missing and 20 partials ⚠️
chain/client/src/sync/external.rs 13.04% 18 Missing and 2 partials ⚠️
nearcore/src/state_sync.rs 5.88% 6 Missing and 10 partials ⚠️
chain/chain/src/runtime/tests.rs 72.09% 12 Missing ⚠️
chain/client/src/client.rs 63.63% 6 Missing and 6 partials ⚠️
chain/chain/src/test_utils/kv_runtime.rs 74.41% 7 Missing and 4 partials ⚠️
chain/client/src/debug.rs 31.25% 10 Missing and 1 partial ⚠️
chain/indexer/src/streamer/mod.rs 0.00% 10 Missing ⚠️
chain/chain/src/store/mod.rs 40.00% 5 Missing and 4 partials ⚠️
chain/epoch-manager/src/lib.rs 66.66% 0 Missing and 8 partials ⚠️
... and 31 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12181      +/-   ##
==========================================
+ Coverage   71.60%   71.63%   +0.02%     
==========================================
  Files         824      825       +1     
  Lines      165513   165855     +342     
  Branches   165513   165855     +342     
==========================================
+ Hits       118516   118808     +292     
- Misses      41875    41892      +17     
- Partials     5122     5155      +33     
Flag Coverage Δ
backward-compatibility 0.17% <0.00%> (-0.01%) ⬇️
db-migration 0.17% <0.00%> (-0.01%) ⬇️
genesis-check 1.26% <0.64%> (+<0.01%) ⬆️
integration-tests 38.71% <35.27%> (-0.01%) ⬇️
linux 71.40% <76.05%> (+<0.01%) ⬆️
linux-nightly 71.20% <76.84%> (+<0.01%) ⬆️
macos 54.24% <66.84%> (+0.13%) ⬆️
pytests 1.57% <1.43%> (-0.01%) ⬇️
sanity-checks 1.38% <0.79%> (-0.01%) ⬇️
unittests 65.37% <71.47%> (+0.03%) ⬆️
upgradability 0.21% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@Trisfald Trisfald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGMT

I left a few comment, not very important
(I didn't go very in depth in a few files.. it's a very long PR)

@@ -214,12 +214,14 @@ impl TestEnv {
// TODO(congestion_control): pass down prev block info and read congestion info from there
// For now, just use default.
let prev_block_hash = self.head.last_block_hash;
let state_root = self.state_roots[shard_id as usize];
let epoch_id =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could just unwrap there

for (shard_id, chunk_header) in block.chunks().iter().enumerate() {

let epoch_id = block.header().epoch_id();
let shard_layout = self.epoch_manager.get_shard_layout(&epoch_id)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a fn to get shard_layout from block could help with readability .. a little

let prev_head = self.chain_store.head()?;
let is_caught_up = block_preprocess_info.is_caught_up;
let provenance = block_preprocess_info.provenance.clone();
let block_start_processing_time = block_preprocess_info.block_start_processing_time;
// TODO(#8055): this zip relies on the ordering of the apply_results.
// TODO(wacban): do the above todo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a TODO to do a TODO? 🔥

let upper_bound = StatePartKey(sync_hash, shard_id + 1, 0);

// The upper bound shard id is not a valid ShardId. It should only be
// used as a uppoer bound for the shard id in the database.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// used as a uppoer bound for the shard id in the database.
// used as a upper bound for the shard id in the database.

@@ -800,7 +800,7 @@ pub fn report_recorded_column_sizes(trie: &Trie, apply_state: &ApplyState) {
// Tracing span to measure time spent on reporting column sizes.
let _span = tracing::debug_span!(
target: "runtime", "report_recorded_column_sizes",
shard_id = apply_state.shard_id,
shard_id = ?apply_state.shard_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You did already change it everywhere, but we could have added a Display impl for shard_id 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried, it didn't work. tracing library requires some tracing::Value trait to be implemented but it's actually not possible to add it for new types.

/// Historically the ShardId was always in the range 0..NUM_SHARDS and was used
/// as the shard index. This is no longer the case, and the ShardIndex should be
/// used instead.
pub type ShardIndex = usize;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any plan to transform this into a newtype in a future PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm good question. It's meant to be used as an index to arrays with chunk data so kinda exactly like usize. I think let's keep it as type alias for now. Please share your thoughts though.

@@ -589,16 +590,21 @@ impl EpochManagerAdapter for MockEpochManager {
&self,
_prev_hash: &CryptoHash,
shard_ids: Vec<ShardId>,
) -> Result<Vec<ShardId>, Error> {
Ok(shard_ids)
) -> Result<Vec<(ShardId, ShardIndex)>, Error> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to return an error if shard layout is V2?

@@ -1453,8 +1456,18 @@ impl Client {
) {
let chunk_header = partial_chunk.cloned_header();
self.chain.blocks_delay_tracker.mark_chunk_completed(&chunk_header);

// TODO(#10569) We would like a proper error handling here instead of `expect`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which raises the question: is anybody able to recover from this failure?

@@ -122,7 +122,7 @@ pub trait EpochManagerAdapter: Send + Sync {
&self,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update the comment to reflect the new return type? I won't suggest changing the method name unless somebody sees any value in doing so

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also valid for other modified methods in this file

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants