forked from circlefin/malachite
Open
Description
The WAL StartedHeight message handler returns early without sending a reply when the requested height equals the current height, causing the RecvErr error in the consensus engine.
Root Cause
In code/crates/engine/src/wal.rs:
Msg::StartedHeight(height, reply_to) => {
    if state.height == height {
        debug!(%height, "WAL already at height, ignoring");
        return Ok(()); // ❌ No reply sent to caller
    }
    state.height = height;
    self.started_height(state, height, reply_to).await?;
}
When the WAL is already at the requested height, it logs "WAL already at height, ignoring" and exits with Ok(()) without sending a reply through the channel.
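The early return can be modeled with plain standard-library channels. This is a minimal sketch, not the actual ractor-based handler: WalState, the Vec<String> reply type, the call helper, and both handler functions are hypothetical names introduced for illustration. The "fixed" variant replies with an empty list when the WAL is already at the requested height; what the real handler should reply with in that case is an assumption here, not something this issue specifies.

```rust
use std::sync::mpsc::{channel, RecvError, Sender};

/// Hypothetical stand-in for the WAL actor state.
struct WalState {
    height: u64,
}

/// Models the handler described above: the early return drops
/// `reply_to` without ever sending on it.
fn handle_started_height_buggy(state: &mut WalState, height: u64, reply_to: Sender<Vec<String>>) {
    if state.height == height {
        return; // `reply_to` is dropped here, so the caller gets no reply
    }
    state.height = height;
    let _ = reply_to.send(Vec::new());
}

/// One possible shape of a fix (an assumption): reply on every path.
fn handle_started_height_fixed(state: &mut WalState, height: u64, reply_to: Sender<Vec<String>>) {
    if state.height != height {
        state.height = height;
    }
    // Always answer the caller, even when there is nothing new to do.
    let _ = reply_to.send(Vec::new());
}

/// Hypothetical request/reply helper: create a reply channel, run the
/// handler, then wait for the answer.
fn call(
    handler: fn(&mut WalState, u64, Sender<Vec<String>>),
    state: &mut WalState,
    height: u64,
) -> Result<Vec<String>, RecvError> {
    let (tx, rx) = channel();
    handler(state, height, tx);
    rx.recv()
}
```

With a state already at height 5, call(handle_started_height_buggy, &mut state, 5) returns an error because the sender was dropped, while the fixed variant returns Ok for the same request.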
- Consensus receives a StartHeight or RestartHeight message.
- It calls wal_fetch(height) and executes: ractor::call!(self.wal, WalMsg::StartedHeight, height)?
- The WAL handler does not send a reply if state.height == height, which leads to a timeout and the RecvErr error.
- Consensus gets stuck waiting for a response (code ref).
Consequences
- The consensus ractor::call! at consensus.rs never receives a reply.
- The channel eventually closes or times out, causing the Messaging(RecvErr) error.
- Consensus remains stuck, unable to progress.
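The two failure modes named above, a closed channel versus a timeout, can be told apart on the caller side. The sketch below uses std::sync::mpsc rather than ractor's reply port, so it only illustrates the general mechanism; CallOutcome, wait_for_reply, and demo_outcomes are hypothetical names, and the 50 ms timeout is an arbitrary value chosen for the example.

```rust
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical classification of what a waiting caller can observe.
#[derive(Debug, PartialEq)]
enum CallOutcome {
    Replied(u64),
    SenderDropped,
    TimedOut,
}

/// Wait for a reply, distinguishing a dropped sender from a silent one.
fn wait_for_reply(rx: &Receiver<u64>, timeout: Duration) -> CallOutcome {
    match rx.recv_timeout(timeout) {
        Ok(value) => CallOutcome::Replied(value),
        // The handler returned and dropped its reply handle without sending.
        Err(RecvTimeoutError::Disconnected) => CallOutcome::SenderDropped,
        // The reply handle is still alive somewhere, but nothing was sent in time.
        Err(RecvTimeoutError::Timeout) => CallOutcome::TimedOut,
    }
}

/// Produce both failure modes: first a dropped sender, then a held but silent one.
fn demo_outcomes() -> (CallOutcome, CallOutcome) {
    let timeout = Duration::from_millis(50);

    let (tx_dropped, rx_dropped) = channel::<u64>();
    drop(tx_dropped); // mirrors the early return that discards the reply handle
    let dropped = wait_for_reply(&rx_dropped, timeout);

    let (_tx_held, rx_held) = channel::<u64>(); // sender kept alive, never used
    let silent = wait_for_reply(&rx_held, timeout);

    (dropped, silent)
}
```

In this model, an early return that drops the reply handle surfaces as a receive error as soon as the handler finishes, rather than only after a timeout expires.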