
Conversation

@julienrbrt
Member

@julienrbrt julienrbrt commented Oct 30, 2025

ref: #1914

A choice has been made to put this logic in the executor and avoid extending the reaper and the sequencer. Updating the reaper would mean passing the last fetched DA height down across all components, which adds a lot of complexity. Adding it in the sequencer may be preferable, but that makes inclusion in a sync node less straightforward; this is what is being investigated.

Compared to the previous implementation, a forced transaction does not have any structure: it should be the raw format expected by the execution client. This keeps ev-node knowing nothing about the transaction. No signature checks, no validation of correctness. The execution client must make sure to reject gibberish transactions.

  • implement for executor
    • add size checks, to make sure batch doesn't go over max bytes
  • implement for syncer
  • requirements:
    • syncer must be able to detect if sequencer was malicious and hasn't included a forced tx
    • verify inclusion of the fetched tx bytes within the block data
    • if that happened, reject all future sequencer blocks
      • eventually, become based and keep producing blocks
  • tool for submitting forced txs in celestia easily -> feat(server): add force inclusion cmd ev-abci#295
  • make sure epochs are in sync between syncer and executor

---- for later, won't be included in this pr (ref #2797 (comment))

  • improve offline capabilities (if the sequencer goes offline, full nodes become fully based)
    • when the sequencer comes back online, it needs to check whether any block was produced :/

@github-actions
Contributor

github-actions bot commented Nov 1, 2025

The latest Buf updates on your PR. Results from workflow CI and Release / buf-check (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Nov 10, 2025, 3:58 PM |

@claude
Contributor

claude bot commented Nov 1, 2025

Claude finished @julienrbrt's task.


Comprehensive Code Review: Forced Inclusion Mechanism

I've completed a thorough review of this forced inclusion implementation. This is a significant feature adding censorship resistance and based sequencer capabilities to ev-node. Overall, the implementation is well-architected and follows Go best practices, but there are several areas that need attention before merging.


Summary Assessment

Strengths:

  • Well-documented ADR with clear rationale and design decisions
  • Clean separation of concerns with modular architecture
  • Comprehensive test coverage for core functionality
  • Epoch-based fetching efficiently minimizes DA queries
  • Two-tier size validation prevents DoS attacks
  • Proper error wrapping and context propagation

Areas of Concern:

  • Some concurrency safety issues in critical paths
  • Missing validation for edge cases in epoch boundaries
  • Potential memory leaks with unbounded queues
  • Insufficient error handling in syncer verification
  • Performance considerations for high-throughput scenarios

Critical Issues (Must Fix)

1. Concurrency Safety in Based Sequencer

Location: sequencers/based/sequencer.go:138-164

Issue: The createBatchFromQueue method modifies s.txQueue without holding a mutex, but this can be called concurrently with GetNextBatch which also modifies the queue.

func (s *BasedSequencer) createBatchFromQueue(maxBytes uint64) *coresequencer.Batch {
    // No mutex held here!
    if len(s.txQueue) == 0 {
        return &coresequencer.Batch{Transactions: nil}
    }
    
    // ... modifies s.txQueue ...
}

Risk: Race conditions could lead to data corruption or panics from concurrent slice access.

Recommendation:

type BasedSequencer struct {
    // ... existing fields ...
    mu sync.RWMutex  // Add mutex for txQueue protection
}

func (s *BasedSequencer) GetNextBatch(...) {
    s.mu.Lock()
    defer s.mu.Unlock()
    // ... existing code ...
}

func (s *BasedSequencer) createBatchFromQueue(...) {
    // Caller must hold s.mu
}

2. Memory Leak in Pending Forced Inclusion Txs

Location: sequencers/single/sequencer.go:31-35, 358

Issue: The pendingForcedInclusionTxs slice can grow unbounded if forced transactions consistently exceed maxBytes. There's no limit or eviction policy.

type Sequencer struct {
    // ...
    pendingForcedInclusionTxs []pendingForcedInclusionTx  // Can grow unbounded
}

Scenario: If a malicious actor posts many large (but < 2MB) transactions to DA, they could exhaust sequencer memory.

Recommendation:

  • Add a maximum pending queue size (e.g., 1000 transactions or 100MB total)
  • Implement FIFO eviction when limit reached
  • Log warnings when approaching limits
  • Consider persisting pending txs to disk for crash recovery
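
A minimal sketch of the bounded-queue idea above, assuming a FIFO eviction policy; the type and constant names (pendingForcedTxQueue, maxPendingForcedTxs, maxPendingForcedBytes) are illustrative, not identifiers from this PR:

```go
// Sketch only: a size-bounded FIFO for pending forced-inclusion txs.
const (
	maxPendingForcedTxs   = 1000              // assumed cap on queued txs
	maxPendingForcedBytes = 100 * 1024 * 1024 // assumed cap on total queued bytes
)

type pendingForcedTxQueue struct {
	txs        [][]byte
	totalBytes uint64
}

// push appends a tx and evicts the oldest entries (FIFO) once either limit is hit.
// A single tx larger than maxPendingForcedBytes is dropped outright by the same loop.
func (q *pendingForcedTxQueue) push(tx []byte) {
	q.txs = append(q.txs, tx)
	q.totalBytes += uint64(len(tx))
	for len(q.txs) > maxPendingForcedTxs || q.totalBytes > maxPendingForcedBytes {
		q.totalBytes -= uint64(len(q.txs[0]))
		q.txs = q.txs[1:]
	}
}
```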

3. Epoch Boundary Edge Case

Location: types/epoch.go:36-50

Issue: When daHeight < daStartHeight, the function returns boundaries that may not make sense:

if daHeight < daStartHeight {
    return daStartHeight, daStartHeight + daEpochSize - 1
}

This returns a future epoch when the height is in the past. This could cause issues if a node syncs from before daStartHeight.

Recommendation:

if daHeight < daStartHeight {
    // Return an empty/invalid epoch indicator
    return 0, 0  // Or return error
}

4. Missing Error Context in Syncer Verification

Location: block/internal/syncing/syncer.go:528-534

Issue: When forced inclusion verification fails, the error handling doesn't properly halt the node or transition to based sequencer mode as mentioned in the PR description.

if err := s.verifyForcedInclusionTxs(currentState, data); err != nil {
    s.logger.Error().Err(err).Uint64("height", nextHeight).Msg("forced inclusion verification failed")
    if errors.Is(err, errMaliciousProposer) {
        s.cache.RemoveHeaderDAIncluded(headerHash)
        return err  // But what happens next?
    }
}

Missing:

  • Node should halt or enter safe mode
  • Should trigger alerts/monitoring
  • Should provide recovery instructions
  • Follow-up actions mentioned in comments but not implemented

Recommendation: Add a critical error handler that:

  1. Stops accepting new blocks from the malicious sequencer
  2. Triggers monitoring alerts
  3. Provides clear operator guidance on next steps
  4. Optionally auto-restart as based sequencer (future work per PR comments)
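
A sketch of what that handler could look like, reusing the names from the snippet above; s.metrics.MaliciousProposerDetected and s.haltSequencerSync are assumed placeholders rather than existing APIs in this PR:

```go
// Sketch only: escalate errMaliciousProposer instead of merely logging it.
if err := s.verifyForcedInclusionTxs(currentState, data); err != nil {
	if errors.Is(err, errMaliciousProposer) {
		s.cache.RemoveHeaderDAIncluded(headerHash)
		s.metrics.MaliciousProposerDetected.Inc() // assumed counter wired to alerting
		s.logger.Error().
			Uint64("height", nextHeight).
			Msg("forced inclusion missing: rejecting future sequencer blocks; operator action required")
		s.haltSequencerSync() // assumed: stop accepting sequencer blocks (later: switch to based mode)
		return err
	}
	return fmt.Errorf("forced inclusion verification failed at height %d: %w", nextHeight, err)
}
```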

High Priority Issues (Should Fix)

5. Hash Collision Risk in Syncer Verification

Location: block/internal/syncing/syncer.go:691-702

Issue: Using hashTx (presumably SHA256) of transaction bytes as map key is theoretically safe, but the code doesn't show the hashTx implementation. If it's a weak hash, collisions could allow censorship.

Recommendation:

  • Show/verify hashTx implementation uses strong cryptographic hash
  • Consider using string(tx) directly as key if memory permits (no collision risk)
  • Add comment explaining collision resistance properties
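
For reference, a SHA-256-based hashTx would look like the sketch below; this is an assumption about what the implementation could be, not the PR's actual code:

```go
import "crypto/sha256"

// hashTx returns a collision-resistant key for a raw transaction.
func hashTx(tx []byte) [32]byte {
	return sha256.Sum256(tx)
}

// Usage in verification: seen[hashTx(tx)] = struct{}{} gives O(1) lookups
// without the memory cost of keying the map by string(tx).
```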

6. No Handling of Gas Limits

Location: sequencers/common/size_validation.go:3

Issue: TODO comment indicates gas validation is missing:

// TODO(@julienrbrt): technically we may need to check for block gas as well

Risk: A transaction could be small in bytes but consume excessive gas, causing block production failures after forced inclusion.

Recommendation:

  • Extend Executor interface with a ValidateTx(tx []byte) error method
  • Call this validation before adding forced txs to batches
  • Size validation alone is insufficient for EVM chains
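
One possible shape for such a check, shown as a separate optional interface so existing executors keep compiling; the TxValidator name and ValidateTx signature are assumptions, not part of the current Executor interface:

```go
import "context"

// Sketch only: an optional executor-side pre-check for forced txs.
type TxValidator interface {
	// ValidateTx performs cheap stateless checks (decodability, size, gas estimate)
	// and returns an error if the tx should not be force-included in a batch.
	ValidateTx(ctx context.Context, tx []byte) error
}

// In the sequencer, before appending a forced tx to a batch:
//
//	if v, ok := exec.(TxValidator); ok {
//	    if err := v.ValidateTx(ctx, tx); err != nil {
//	        logger.Warn().Err(err).Msg("dropping invalid forced tx")
//	        continue
//	    }
//	}
```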

7. DA Height Synchronization

Location: sequencers/single/sequencer.go:133-161

Issue: DA height updates happen inside GetNextBatch, but the logic for advancing height is complex and could get out of sync between sequencer and executor.

if forcedEvent.EndDaHeight > currentDAHeight {
    c.SetDAHeight(forcedEvent.EndDaHeight)
} else if forcedEvent.StartDaHeight > currentDAHeight {
    c.SetDAHeight(forcedEvent.StartDaHeight)
}

Concern: Why check both EndDaHeight and StartDaHeight? What if they're both <= currentDAHeight?

Recommendation:

  • Always use EndDaHeight if event was successfully processed
  • Document the invariants that must hold
  • Add assertion/validation that DA height never decreases
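
A sketch of the simplified update, assuming the invariant EndDaHeight >= StartDaHeight holds for any processed event; the variable names mirror the snippet above:

```go
// Sketch only: once the epoch's forced-inclusion event has been fully processed,
// advance to its end height and never move backwards.
if forcedEvent.EndDaHeight > currentDAHeight {
	c.SetDAHeight(forcedEvent.EndDaHeight)
}
// Invariant: DA height is monotonically non-decreasing, so a separate
// StartDaHeight check is unnecessary.
```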

8. Epoch Fetching Granularity

Location: block/internal/da/forced_inclusion_retriever.go:109-125

Issue: The code fetches each intermediate DA height sequentially:

for epochHeight := epochStart + 1; epochHeight < epochEnd; epochHeight++ {
    result := r.client.RetrieveForcedInclusion(ctx, epochHeight)
    // ...
}

Performance: For large epochs (e.g., 100 blocks), this makes 100 sequential DA calls.

Recommendation:

  • Implement parallel fetching with goroutines + error group
  • Set reasonable concurrency limit (e.g., 10 concurrent requests)
  • This was mentioned in PR comments as future work but should be prioritized
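
A sketch of the parallel fetch using golang.org/x/sync/errgroup; the method name fetchEpochParallel and the result shape (result.Data) are assumptions based on the snippet above, and per-height error handling is elided:

```go
import (
	"context"

	"golang.org/x/sync/errgroup"
)

// Sketch only: fetch the intermediate heights of an epoch concurrently with a
// bounded worker pool. Assumes r.client.RetrieveForcedInclusion is safe for
// concurrent use.
func (r *ForcedInclusionRetriever) fetchEpochParallel(ctx context.Context, epochStart, epochEnd uint64) ([][]byte, error) {
	if epochEnd <= epochStart+1 {
		return nil, nil // no intermediate heights to fetch
	}

	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(10) // assumed concurrency cap

	results := make([][][]byte, epochEnd-epochStart-1)
	for h := epochStart + 1; h < epochEnd; h++ {
		idx, height := h-epochStart-1, h // per-iteration copies for the closure
		g.Go(func() error {
			result := r.client.RetrieveForcedInclusion(ctx, height)
			results[idx] = result.Data // each goroutine writes a distinct index
			return nil                 // propagate a retrieval error here if the client reports one
		})
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}

	// Flatten in DA-height order so the output stays deterministic.
	var txs [][]byte
	for _, blobs := range results {
		txs = append(txs, blobs...)
	}
	return txs, nil
}
```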

Medium Priority Issues (Nice to Have)

9. Configuration Validation

Location: pkg/config/config.go:248-251

Issue: Validation only checks if based sequencer requires aggregator mode, but doesn't validate:

  • ForcedInclusionNamespace format/validity
  • DAEpochForcedInclusion reasonable bounds
  • Consistency between config and genesis

Recommendation:

func (c *Config) Validate() error {
    // ... existing checks ...
    
    if c.Node.BasedSequencer && c.DA.ForcedInclusionNamespace == "" {
        return fmt.Errorf("based sequencer requires forced_inclusion_namespace to be configured")
    }
    
    // Validate namespace format if using Celestia
    if c.DA.ForcedInclusionNamespace != "" {
        if err := validateNamespaceFormat(c.DA.ForcedInclusionNamespace); err != nil {
            return fmt.Errorf("invalid forced_inclusion_namespace: %w", err)
        }
    }
    
    return nil
}

10. Logging Consistency

Issue: Some critical operations use .Debug() when they should use .Info() or .Warn():

Examples:

  • forced_inclusion_retriever.go:88: DA height from future should be .Warn() not .Debug()
  • sequencers/single/sequencer.go:165: Retrieved forced txs should be .Info() if count > 0

Recommendation: Review and adjust log levels:

  • .Debug() - verbose details for development
  • .Info() - important state transitions, successful operations
  • .Warn() - recoverable errors, unusual situations
  • .Error() - errors requiring attention

11. Test Coverage Gaps

Missing Test Scenarios:

  1. Forced inclusion retriever:

    • What happens when epoch end is from future but epoch start isn't?
    • Partial epoch processing (break early due to future height)
    • Empty blobs in DA (zero-length transactions)
  2. Single sequencer:

    • Concurrent GetNextBatch calls (if allowed)
    • Queue prepend after multiple batches trimmed
    • Pending forced txs that never fit (always too large)
  3. Based sequencer:

    • Queue growing over multiple epochs
    • Forced txs arriving faster than batch processing
    • Configuration not having forced inclusion namespace
  4. Syncer verification:

    • Malicious sequencer including some but not all forced txs
    • Forced tx appearing in wrong order
    • Forced tx modified by sequencer

Recommendation: Add integration tests covering these scenarios before merge.

12. Documentation Completeness

Issues:

  1. ADR is excellent but code comments are sometimes sparse
  2. Interface methods lack godoc comments (e.g., ForcedInclusionRetriever)
  3. No operator guide for:
    • Setting appropriate DAEpochForcedInclusion values
    • Monitoring forced inclusion namespace
    • Responding to malicious sequencer detection

Recommendation:

  • Add comprehensive godoc to all exported types/methods
  • Create operator documentation in /docs folder
  • Add troubleshooting guide for common issues

Architecture & Design Feedback

Strengths

  1. Epoch-based fetching: Excellent design choice to minimize DA queries while maintaining timely inclusion. The deterministic epoch calculation prevents race conditions.

  2. Two-tier size validation: Using both absolute DA limits (2MB) and per-batch limits (MaxBytes) provides defense in depth against DoS.

  3. Separation of concerns: DA retriever, sequencer, and syncer are cleanly separated with well-defined interfaces.

  4. Backward compatibility: Optional forced inclusion (enabled only when namespace configured) allows gradual rollout.

  5. Transaction preservation: Pending queue for oversized forced txs ensures no valid transactions are lost.

Design Questions

  1. Why separate forced inclusion namespace?

    • ADR mentions reviewer questioned this
    • Separation makes sense to avoid filtering logic in ev-node
    • But increases DA costs for users (two namespace submissions)
    • Consider documenting cost-benefit analysis
  2. Why no forced tx ordering guarantees?

    • Forced txs from same epoch are unordered in the block
    • Could matter for DeFi applications (front-running)
    • Consider: should forced txs maintain DA submission order?
  3. Based sequencer simplifications:

    • Based sequencer ignores SubmitBatchTxs entirely
    • This is correct, but should there be a warning/error to the user?
    • Current implementation silently drops the transactions
  4. Genesis parameter for epoch size:

    • DAEpochForcedInclusion is in genesis (consensus parameter)
    • Can't be changed without hard fork
    • Is this the right trade-off vs config parameter?
    • Consider: should different chains have flexibility?

Security Analysis

Positive Security Properties

  1. Censorship resistance: Users have guaranteed bypass mechanism via DA
  2. Malicious sequencer detection: Full nodes verify and reject invalid blocks
  3. No timing attacks: Deterministic epoch boundaries prevent manipulation
  4. Blob size limits: Prevents resource exhaustion attacks

Security Concerns

  1. DOS via pending queue:

    • As mentioned in critical issues, unbounded pending queue is a vector
    • Mitigation: Add queue size limits
  2. DA spam:

    • Anyone can post to forced inclusion namespace
    • Sequencer must process all (bounded by size checks)
    • Mitigation: DA layer's native spam protection + blob size validation
    • Consider: Should there be application-level rate limiting?
  3. Forced tx validity:

    • Sequencer doesn't validate forced tx semantics
    • Invalid txs are passed to execution client
    • Execution client must reject malformed txs
    • Risk: What if execution client has vulnerability that crashes on malformed tx?
    • Recommendation: Add basic sanity checks (non-empty, minimum size, etc.)
  4. Epoch boundary manipulation:

    • Malicious DA could report inconsistent heights
    • Mitigated by: ev-node validates against multiple sources
    • But worth documenting this assumption

Performance Considerations

Current Performance Characteristics

  1. DA query frequency: Every DAEpochForcedInclusion blocks

    • Example: epoch=10, DA block time=15s = query every 150s
    • Reasonable for most use cases
  2. Batch construction overhead:

    • Sequential forced tx processing: O(n) where n = forced txs
    • Size validation: O(n)
    • Overall acceptable for expected volumes
  3. Memory usage:

    • Based sequencer txQueue: O(total forced tx size)
    • Single sequencer pending queue: O(oversized tx count)
    • Syncer verification map: O(block tx count)

Performance Optimization Opportunities

  1. Parallel epoch fetching: Already mentioned, high impact for large epochs

  2. Batch queue optimization:

    • Current implementation has O(n) prepend operation
    • Could use circular buffer for O(1) operations
    • Trade-off: complexity vs performance gain (probably not worth it unless proven bottleneck)
  3. Caching forced inclusion results:

    • Could cache epoch results to avoid refetch on restart
    • Needs cache invalidation logic
    • Benefit: Faster node startup
  4. Size calculation optimization:

    • Currently calculates total size by iterating transactions
    • Could maintain running total
    • Micro-optimization, low priority

Testing Assessment

Test Coverage

Well-tested components:

  • ✅ Epoch calculations (types/epoch_test.go) - comprehensive
  • ✅ Forced inclusion retriever - good coverage of happy and error paths
  • ✅ Size validation - edge cases covered
  • ✅ Single sequencer - batch handling tested

Under-tested components:

  • ⚠️ Based sequencer integration scenarios
  • ⚠️ Syncer malicious sequencer detection
  • ⚠️ Concurrent access patterns
  • ⚠️ Error recovery paths

Test Quality

Strengths:

  • Table-driven tests for epoch calculations
  • Good use of mocks for DA interface
  • Clear test naming conventions

Improvements needed:

  • Add fuzzing for epoch boundary calculations
  • Property-based testing for size validation
  • Stress tests for queue management
  • Integration tests with real DA (LocalDA for CI)
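
A minimal sketch of the suggested fuzzing for the epoch boundary math; CalculateEpochBoundaries stands in for whatever helper types/epoch.go actually exports, so the name and signature are assumptions:

```go
import "testing"

// Sketch only: fuzz basic invariants of the epoch boundary calculation.
func FuzzEpochBoundaries(f *testing.F) {
	f.Add(uint64(100), uint64(10), uint64(50))
	f.Fuzz(func(t *testing.T, daHeight, daStartHeight, daEpochSize uint64) {
		if daEpochSize == 0 || daHeight < daStartHeight {
			t.Skip()
		}
		start, end := CalculateEpochBoundaries(daHeight, daStartHeight, daEpochSize)
		if start > daHeight || daHeight > end {
			t.Fatalf("height %d falls outside its epoch [%d, %d]", daHeight, start, end)
		}
		if end-start+1 != daEpochSize {
			t.Fatalf("epoch [%d, %d] does not span %d heights", start, end, daEpochSize)
		}
	})
}
```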

Code Quality

Go Best Practices

Followed:

  • ✅ Proper error wrapping with %w
  • ✅ Context propagation for cancellation
  • ✅ Structured logging with zerolog
  • ✅ Interface-based design for testability
  • ✅ No naked returns
  • ✅ Consistent naming conventions

Could improve:

  • ⚠️ Some long functions (e.g., processForcedInclusionTxs 89 lines)
  • ⚠️ Magic numbers (e.g., 2 * 1024 * 1024 should be constant)
  • ⚠️ Missing godoc comments on exported functions
  • ⚠️ Could use more helper functions to reduce complexity

Metrics & Observability

Current state:

  • Basic metrics exist for sequencer
  • Logging is comprehensive
  • Missing: forced inclusion specific metrics

Recommendations:

  • Add metrics:
    • forced_inclusion_txs_total (counter)
    • forced_inclusion_txs_pending (gauge)
    • forced_inclusion_epoch_fetch_duration (histogram)
    • forced_inclusion_verification_failures_total (counter)
  • Add tracing spans for forced inclusion path
  • Add alerts for malicious sequencer detection
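
A sketch of the suggested metrics declared with prometheus/client_golang; ev-node's actual metrics plumbing may differ, so treat the names and registration as proposals:

```go
import "github.com/prometheus/client_golang/prometheus"

// Sketch only: proposed forced-inclusion metrics, registered via prometheus.MustRegister.
var (
	forcedInclusionTxsTotal = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "forced_inclusion_txs_total",
		Help: "Total forced inclusion transactions fetched from DA.",
	})
	forcedInclusionTxsPending = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "forced_inclusion_txs_pending",
		Help: "Forced inclusion transactions queued but not yet included in a batch.",
	})
	forcedInclusionEpochFetchDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "forced_inclusion_epoch_fetch_duration",
		Help:    "Time in seconds spent fetching one forced inclusion epoch from DA.",
		Buckets: prometheus.DefBuckets,
	})
	forcedInclusionVerificationFailuresTotal = prometheus.NewCounter(prometheus.CounterOpts{
		Name: "forced_inclusion_verification_failures_total",
		Help: "Blocks rejected because forced inclusion verification failed.",
	})
)
```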

Migration & Deployment

Breaking Changes

  1. Sequencer initialization: Requires new ForcedInclusionRetriever parameter
  2. Genesis file: New da_epoch_forced_inclusion field (defaults to 50)
  3. Config: New forced_inclusion_namespace field

Migration path:

  • ✅ All changes are backward compatible (optional features)
  • ✅ Defaults allow existing deployments to work unchanged
  • ✅ Can enable forced inclusion per-network

Deployment Recommendations

  1. Rollout strategy:

    • Deploy to testnet first with small epoch size (e.g., 5)
    • Monitor DA query patterns and performance
    • Gradually increase epoch size based on metrics
    • Deploy to mainnet with well-tested epoch size
  2. Configuration guidance:

    • For high-throughput chains: larger epoch (50-100)
    • For low-latency needs: smaller epoch (5-10)
    • Document trade-offs in operator guide
  3. Monitoring setup:

    • Alert on malicious sequencer detection
    • Monitor pending queue sizes
    • Track forced inclusion namespace usage
    • Dashboard for DA fetch latency

Recommendations Summary

Must Fix Before Merge (P0)

  1. ✅ Add mutex protection for BasedSequencer.txQueue (concurrency safety)
  2. ✅ Implement bounded pending queue for forced inclusion txs (memory leak)
  3. ✅ Fix epoch boundary calculation for heights before daStartHeight
  4. ✅ Improve error handling in syncer verification (malicious sequencer detection)

Should Fix Before Merge (P1)

  1. Verify hashTx implementation uses strong cryptographic hash
  2. Add ValidateTx to executor interface for gas validation (or document limitation)
  3. Simplify DA height update logic and document invariants
  4. Add critical missing test cases (see Test Coverage Gaps)

Can Be Follow-up Work (P2)

  1. Parallel epoch fetching for performance
  2. Enhanced configuration validation
  3. Operator documentation and runbooks
  4. Metrics and observability improvements
  5. Consistent logging level review

Final Assessment

This is a well-architected and important feature that significantly improves ev-node's censorship resistance and decentralization properties. The ADR is excellent, the code is generally high quality, and the test coverage is good.

However, there are critical concurrency safety and resource management issues that must be addressed before merge. The unbounded pending queue and unprotected shared state in the based sequencer are production risks.

Recommendation: Request changes to address P0 issues. Once those are resolved and tested, this PR will be ready to merge.

The team has done great work here, and with the critical fixes, this will be a solid foundation for forced inclusion and based sequencer capabilities.


Reviewed Files

Core implementation:

  • docs/adr/adr-019-forced-inclusion-mechanism.md
  • types/epoch.go
  • block/internal/da/forced_inclusion_retriever.go
  • sequencers/based/sequencer.go ⚠️ (concurrency issues)
  • sequencers/single/sequencer.go ⚠️ (memory leak risk)
  • sequencers/common/size_validation.go
  • block/internal/syncing/syncer.go ⚠️ (error handling)
  • block/internal/executing/executor.go
  • pkg/config/config.go
  • pkg/genesis/genesis.go
  • core/sequencer/sequencing.go

Test files reviewed:

  • types/epoch_test.go
  • block/internal/da/forced_inclusion_retriever_test.go
  • Multiple sequencer and syncer test files ✅

Total files changed: 47 files (+4351/-574 lines)
Review completion: Comprehensive review of core implementation and architecture


@julienrbrt julienrbrt changed the title from "[WIP] feat: forced inclusion for executor" to "feat: forced inclusion" on Nov 5, 2025
Contributor

@alpe alpe left a comment

Very nice start!
Can you elaborate why you decided on a separate namespace for the force inclusion TX? The consumers have to read both ns anyway to stay up to date


event.StartDaHeight = epochHeight
event.Txs = append(event.Txs, result.Data...)
}
Contributor

We need to prepare for malicious content. Let's exit the loop early when a tx size threshold is reached. This can be a multiple of common.DefaultMaxBlobSize used by the executor.

Member Author

Makes sense for the height check, yes! However, I was thinking of doing no other checks and letting the execution client deal with gibberish data (this is why I added that as a requirement in the execution interface description).

Contributor

If we want to keep raw TX data in the namespace, there is not much we can do here to validate, indeed. A size check is an easy win but more would require extending the executor interface for a checkTX.

Member Author

I agree, and this actually may be required to avoid congestion issues and losing txs.

@julienrbrt
Member Author

Can you elaborate why you decided on a separate namespace for the force inclusion TX? The consumers have to read both ns anyway to stay up to date

This was a suggestion. Personally I think it makes sense, as we are filtering what comes up in that namespace at the fetching level directly in ev-node. What is posted in the forced inclusion namespace is handled directly by the execution client; ev-node only passes down bytes.

@julienrbrt julienrbrt marked this pull request as ready for review November 6, 2025 20:46
@julienrbrt julienrbrt marked this pull request as draft November 6, 2025 20:47
@github-actions
Contributor

github-actions bot commented Nov 10, 2025

PR Preview Action v1.6.3

🚀 View preview at
https://evstack.github.io/docs-preview/pr-2797/

Built to branch main at 2025-11-24 13:31 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@codecov

codecov bot commented Nov 10, 2025

Codecov Report

❌ Patch coverage is 76.37131% with 112 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@c161967). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| sequencers/single/sequencer.go | 72.38% | 28 Missing and 9 partials ⚠️ |
| block/internal/da/forced_inclusion_retriever.go | 79.51% | 12 Missing and 5 partials ⚠️ |
| block/internal/executing/executor.go | 43.33% | 11 Missing and 6 partials ⚠️ |
| sequencers/based/sequencer.go | 81.17% | 11 Missing and 5 partials ⚠️ |
| block/internal/syncing/syncer.go | 80.00% | 8 Missing and 1 partial ⚠️ |
| block/components.go | 0.00% | 6 Missing and 1 partial ⚠️ |
| core/sequencer/dummy.go | 0.00% | 3 Missing ⚠️ |
| block/public.go | 75.00% | 2 Missing ⚠️ |
| pkg/config/config.go | 81.81% | 1 Missing and 1 partial ⚠️ |
| pkg/genesis/genesis.go | 75.00% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2797   +/-   ##
=======================================
  Coverage        ?   65.44%           
=======================================
  Files           ?       85           
  Lines           ?     7777           
  Branches        ?        0           
=======================================
  Hits            ?     5090           
  Misses          ?     2121           
  Partials        ?      566           
| Flag | Coverage Δ |
| --- | --- |
| combined | 65.44% <76.37%> (?) |

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

@julienrbrt julienrbrt marked this pull request as ready for review November 10, 2025 16:14
@github-actions
Contributor

github-actions bot commented Nov 10, 2025

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Nov 24, 2025, 1:30 PM |

@julienrbrt
Member Author

List of improvements to do in follow-ups:

  1. Improve DA fetching by parallelizing epoch fetching
  2. Simplify DA requests after [EPIC] Remove DA Interface #2796: fetch the DA latest height instead of checking epoch boundaries
  3. Solve edge case where proposer misses blocks and comes back online with forced included blocks published

@julienrbrt julienrbrt marked this pull request as draft November 10, 2025 16:19
@julienrbrt
Member Author

julienrbrt commented Nov 11, 2025

We discussed the above in the standup (#2797 (comment)), and a few ideas came up.

1-2. When making the call async, we need to make sure the executor and full node stay in sync on an epoch. This can be done easily by keeping an epoch a few blocks behind the actual DA height.

  • We need to make sure all heights of that epoch are available when we fetch the epoch (there is already code for this)
  • We need to scale that block window based on an average fetching time (the higher the DA epoch is, the higher the window is)
3. We can re-use some code from [WIP] HA failover #2814 to automate node restarting (syncing -> based sequencer)
    • When the sequencer comes back online and missed an epoch, it needs to sync up to the head of the DA layer
    • Based sequencers must check the forced inclusion transaction namespace for a synced checkpoint from the DA layer, and restart as a sync node if one is found (@julienrbrt: I picked this solution, otherwise it would need to fetch 2 namespaces instead of 1; the alternative is to have the sequencer fetch the header namespace only at the end of the epoch)

@julienrbrt julienrbrt marked this pull request as ready for review November 11, 2025 16:29
@julienrbrt julienrbrt marked this pull request as draft November 11, 2025 16:58
Contributor

@alpe alpe left a comment

Thanks for answering all my questions and comments.
There is still the TODO in the code to store unprocessed direct TXs when the max block size is reached.



julienrbrt added a commit that referenced this pull request Nov 13, 2025
we decided to remove the sequencer go.mod, as ev-node can directly provide
the sequencer implementation (sequencers/single was already depending on
ev-node anyway)

this means no go.mod needs to be added for the new based sequencers in
#2797
@julienrbrt julienrbrt marked this pull request as ready for review November 13, 2025 10:58
@julienrbrt
Member Author

Once this PR is merged, we should directly follow up with:

In the meantime, I have disabled the feature so it can be merged (0d790ef)

@julienrbrt
Member Author

FYI the upgrade test will fail until tastora is updated.

Users can submit transactions in two ways:

### Systems Affected
1. **Normal Path**: Submit to sequencer's mempool/RPC (fast, low cost)
Contributor

@damiannolan damiannolan Nov 19, 2025

Is the mempool not used app-side for ABCI? Does ev-node have a mempool? Or does "sequencer's mempool/RPC" here refer to the sequencer node as a single entity, even if it's running the app out-of-process as is the case with the EVM stack?

From what I understand, the reth/evm mempool is used for evm and the sequencer queries the pending txs pool/queue in GetTxs

Contributor

It is the execution layer's mempool. You are correct.

Comment on lines +531 to +541
### Full Node Verification Flow

```
1. Receive block from DA or P2P
2. Before applying block:
a. Fetch forced inclusion txs from DA at block's DA height
b. Build map of transactions in block
c. Verify all forced txs are in block
d. If missing: reject block, flag malicious proposer
3. Apply block if verification passes
```
Contributor

@damiannolan damiannolan Nov 19, 2025

This makes sense! I think my mental model was assuming that ev-node did not need to be run with ev-reth for full nodes. But on reflection I think I was incorrect or misunderstood.

I assume ev-node must always be run even for evm stack full nodes but with --evnode.node.aggregator=false.

Member Author

@julienrbrt julienrbrt Nov 19, 2025

Yes, a full node runs the whole stack. Light nodes, on the other hand, just fetch headers.

- Only at epoch boundaries
- Scan epoch range for forced transactions
3. Get batch from mempool queue
4. Prepend forced txs to batch
Contributor

So if we wanted to zk-prove forced inclusion txs, we could query the forced inclusion namespace at each epoch and prepend them to the txs list that we compare with the execution client's state transition function 🤔

Contributor

I don't believe you would need to check the forced inclusion namespace, since the txs will be included in a block at some point. If you want to verify that the txs on the namespace were included, then you would need to follow it.

Contributor

I see, thanks! Does that mean that the ev-node sequencer will fetch txs from the FI namespace and then repost them in a SignedData payload to the data namespace?

Contributor

Correct, that is how I understand it. @julienrbrt, correct?

Member Author

Yes, this is correct. This is how we do the verification as well on the sync node side.

github-merge-queue bot pushed a commit that referenced this pull request Nov 21, 2025
Rename `evm-single` to `evm` and `grpc-single` to `evgrpc` for clarity.

ref: #2797 (comment)
@julienrbrt julienrbrt changed the base branch from main to julien/extract-fi November 21, 2025 12:15
github-merge-queue bot pushed a commit that referenced this pull request Nov 21, 2025
Extract some logic from #2797.
Those refactors were done to ease force inclusion integration but they
can be extracted to be merged sooner
Base automatically changed from julien/extract-fi to main November 21, 2025 13:14
- **Censorship**: Mitigated by forced inclusion verification
- **DA Spam**: Limited by DA layer's native spam protection and two-tier blob size limits
- **Block Withholding**: Full nodes can fetch and verify from DA independently
- **Oversized Batches**: Prevented by strict size validation at multiple levels
Contributor

If a batch within an epoch is too big, do we spread it out over many blocks?

Member Author

2. Before applying block:
a. Fetch forced inclusion txs from DA at block's DA height
b. Build map of transactions in block
c. Verify all forced txs are in block
Contributor

This can be done after the fact, right? Like if the block with forced inclusion gets created but the full node doesn't have the data, then it won't block waiting for the data?

Member Author

@julienrbrt julienrbrt Nov 24, 2025

The idea was to do it at the same time. We just need to ensure the data is already there; that is why there's a window introduced in #2842.

In this PR it is blocking (however the feature cannot be turned on until #2842 so that there's no issue merging this :p)

Comment on lines +510 to +513
1. Timer triggers GetNextBatch
2. Fetch forced inclusion txs from DA (via DA Retriever)
- Only at epoch boundaries
- Scan epoch range for forced transactions
Contributor

What sort of latency does this introduce?

Member Author

Negligible, if not none, as after #2842, it will be fetched async. Currently (in this PR), it is blocking.
