@shanicky
Contributor

Summary
This pull request introduces incremental partition assignment/unassignment and batch seek functionality to the simulated Kafka consumer in madsim-rdkafka, enhancing its fidelity to real Kafka client behavior. These changes enable more granular control over partition management and offset manipulation, supporting advanced simulation and client use cases.

Details

  • Key change 1: Added incremental_assign and incremental_unassign APIs to allow adding or removing specific partitions without affecting the full assignment, closely mirroring librdkafka semantics.
  • Key change 2: Implemented seek_partitions to atomically update offsets for multiple partitions in a batch, with strict validation to reject unsupported offset types (such as Offset::Stored and Offset::OffsetTail).
  • Key change 3: Updated internal state management to ensure thread safety and consistency, including clearing buffered messages on seek and maintaining idempotency on assignment changes.
  • Extensive new tests validate these APIs, including edge cases like unsupported offsets and non-existent partitions.
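The incremental semantics described above can be sketched on a simplified model. This is not the madsim-rdkafka code: `TopicPartition` and `Assignment` are hypothetical stand-ins, and ignoring unknown partitions on unassign is an assumption of this sketch rather than confirmed behavior of the PR.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a (topic, partition) pair; not the madsim-rdkafka type.
type TopicPartition = (String, i32);

/// Simplified model of a consumer's current assignment.
#[derive(Default)]
struct Assignment {
    partitions: Vec<TopicPartition>,
}

impl Assignment {
    /// Add partitions that are not yet assigned; existing ones are left untouched.
    fn incremental_assign(&mut self, to_add: &[TopicPartition]) {
        let mut seen: HashSet<TopicPartition> = self.partitions.iter().cloned().collect();
        for tp in to_add {
            // `insert` returns false for duplicates, which keeps the call idempotent.
            if seen.insert(tp.clone()) {
                self.partitions.push(tp.clone());
            }
        }
    }

    /// Remove only the listed partitions. Unknown partitions are ignored here,
    /// which is an assumption of this sketch.
    fn incremental_unassign(&mut self, to_remove: &[TopicPartition]) {
        let remove: HashSet<&TopicPartition> = to_remove.iter().collect();
        self.partitions.retain(|tp| !remove.contains(tp));
    }
}
```

Calling `incremental_assign` twice with the same partition leaves a single entry, and `incremental_unassign` leaves every partition it was not asked to remove in place.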

Architecture impact
The changes shift partition assignment from a bulk, all-or-nothing operation to a more flexible, incremental model. This enables workflows such as sticky rebalancing and dynamic partition allocation, and improves simulation realism for consumer group scenarios. Offset management is now more precise, with atomic batch operations and buffer invalidation ensuring correctness.

Technical highlights

  • Partition assignment/unassignment leverages set operations for efficient detection and modification, ensuring idempotency and safe removal.
  • Batch seek requires every requested partition to be currently assigned and rejects Offset::Invalid, Offset::Stored, and Offset::OffsetTail, failing fast on invalid inputs.
  • Mutex-protected state ensures thread safety across assignment and message buffer updates.
  • The public consumer API is extended to expose these new capabilities, maintaining ergonomic client usage.
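A minimal sketch of the fail-fast offset check, assuming a local `Offset` enum that mirrors the variant names used in this PR. The enum is redefined here so the example stands alone, and `check_seekable` and `validate_batch` are hypothetical helper names. The error strings follow the snippet under review.

```rust
/// Local mirror of the offset kinds named in this PR; not imported from rdkafka.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Offset {
    Beginning,
    End,
    Stored,
    Invalid,
    Offset(i64),
    OffsetTail(i64),
}

/// Reject the offset kinds that the batch seek does not support.
fn check_seekable(topic: &str, partition: i32, offset: Offset) -> Result<(), String> {
    match offset {
        Offset::Invalid => Err(format!("invalid offset for {}:{}", topic, partition)),
        Offset::Stored => Err(format!(
            "stored offset is not supported for {}:{}",
            topic, partition
        )),
        Offset::OffsetTail(_) => Err(format!(
            "offset tail is not supported for {}:{}",
            topic, partition
        )),
        Offset::Beginning | Offset::End | Offset::Offset(_) => Ok(()),
    }
}

/// Validate a whole batch, stopping at the first unsupported offset.
fn validate_batch(requests: &[(&str, i32, Offset)]) -> Result<(), String> {
    requests
        .iter()
        .try_for_each(|(topic, partition, offset)| check_seekable(topic, *partition, *offset))
}
```

Listing the accepted variants explicitly instead of using a `_` arm means a new variant added later would cause a compile error here rather than being silently accepted.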

Testing strategy
Unit and integration tests exercise all new APIs and cover error scenarios, ensuring correctness under concurrent and edge conditions. Manual verification of polling behavior after seek and assignment changes is recommended for confidence in buffer state management.

Shanicky Chen and others added 3 commits July 7, 2025 03:07
…set, and wrapper exposure

Signed-off-by: Shanicky Chen <[email protected]>
…rts and type hints, clean up unused import

Signed-off-by: Shanicky Chen <[email protected]>
Enhance simulated Kafka consumer with advanced partition management
and offset seeking, improving feature parity and usability.

- Add incremental_assign and incremental_unassign methods to BaseConsumer
- Implement seek_partitions for atomic offset updates on assignments
- Expose new methods in Consumer API for both sync and async usage
- Introduce comprehensive tests for assignment, unassignment, and seeking
- Increase simulation fidelity for complex Kafka consumer workflows

Signed-off-by: Shanicky Chen <[email protected]>
@shanicky shanicky marked this pull request as ready for review November 5, 2025 07:47
Refactor consumer partition assignment and seek logic for correctness.
This prevents partial state mutations on invalid seek requests, ensuring
atomicity and internal consistency.

- Validate all partitions in seek before mutating state
- Only clear buffer and update offsets after successful validation
- Replace HashSet deduplication with iterator-based checks
- Add tests verifying buffer and offsets remain unchanged on seek error
- Improve code readability and robustness for edge cases

Signed-off-by: Shanicky Chen <[email protected]>
Refactor the closure used for checking topic-partition assignment to a
single-expression inline style for improved readability and idiomatic
Rust usage. No logic or behavior is changed.

- Replace multi-line closure block with inline closure in `.any()` call
- Improve code clarity and stylistic consistency
- Align with Rust code style and lint best practices
- Change is isolated to one function, with no API or logic impact

Signed-off-by: Shanicky Chen <[email protected]>
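For illustration, the two closure styles this commit message contrasts look roughly like this. `Elem` and both function names are hypothetical; the actual function touched by the commit is not shown in this conversation.

```rust
/// Hypothetical stand-in for an assigned partition entry.
struct Elem {
    topic: String,
    partition: i32,
}

/// Multi-line closure block inside `.any()`.
fn is_assigned_block(list: &[Elem], topic: &str, partition: i32) -> bool {
    list.iter().any(|elem| {
        elem.topic == topic && elem.partition == partition
    })
}

/// Single-expression inline closure; same behavior as the block form.
fn is_assigned_inline(list: &[Elem], topic: &str, partition: i32) -> bool {
    list.iter()
        .any(|elem| elem.topic == topic && elem.partition == partition)
}
```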
@BugenZhao BugenZhao requested a review from Copilot November 12, 2025 09:24
Copilot finished reviewing on behalf of BugenZhao November 12, 2025 09:29
Copilot AI left a comment

Pull Request Overview

This pull request adds incremental partition assignment/unassignment and batch seek functionality to the simulated Kafka consumer. The changes enable more granular partition management and multi-partition offset manipulation, improving simulation fidelity.

Key Changes

  • Added incremental_assign and incremental_unassign methods for modifying partition assignments without replacing the entire assignment
  • Implemented seek_partitions for atomically updating offsets across multiple partitions with validation
  • Extended test coverage with comprehensive tests for new APIs and edge cases

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| madsim-rdkafka/src/sim/topic_partition_list.rs | Added elements() method to retrieve all partition list elements |
| madsim-rdkafka/src/sim/consumer.rs | Implemented incremental assign/unassign and batch seek APIs with validation, plus comprehensive test suite |

Comment on lines 191 to 241
{
    let current_tpl = self.tpl.lock();
    for requested in &topic_partition_list.list {
        match requested.offset {
            Offset::Invalid => {
                return Err(KafkaError::Seek(format!(
                    "invalid offset for {}:{}",
                    requested.topic, requested.partition
                )));
            }
            Offset::Stored => {
                return Err(KafkaError::Seek(format!(
                    "stored offset is not supported for {}:{}",
                    requested.topic, requested.partition
                )));
            }
            Offset::OffsetTail(_) => {
                return Err(KafkaError::Seek(format!(
                    "offset tail is not supported for {}:{}",
                    requested.topic, requested.partition
                )));
            }
            _ => {}
        }

        current_tpl
            .list
            .iter()
            .find(|elem| {
                elem.topic == requested.topic && elem.partition == requested.partition
            })
            .ok_or_else(|| {
                KafkaError::Seek(format!(
                    "partition {}:{} is not currently assigned",
                    requested.topic, requested.partition
                ))
            })?;
    }
}

self.msgs.lock().clear();
let mut current_tpl = self.tpl.lock();

for requested in &topic_partition_list.list {
    let current = current_tpl
        .list
        .iter_mut()
        .find(|elem| elem.topic == requested.topic && elem.partition == requested.partition)
        .expect("partition must exist after validation");
    current.offset = requested.offset;
}

Copilot AI Nov 12, 2025

Race condition between validation and offset update. The lock is released after validation (line 229) and reacquired at line 232. Between these operations, another thread could call incremental_unassign to remove a partition that was just validated, causing the expect at line 239 to panic.

Consider holding a single lock for the entire validation and update operation, or re-validate after reacquiring the lock. The buffer clear at line 231 should also happen atomically with the offset updates to maintain consistency.
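The single-lock option suggested in this comment could look roughly like the sketch below. It is a simplified model, not the PR's code: `Consumer`, `Elem`, the plain `i64` offsets, and the use of `std::sync::Mutex` (with `unwrap` on lock) are all assumptions made so the example compiles on its own.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for an assigned partition and its offset.
struct Elem {
    topic: String,
    partition: i32,
    offset: i64,
}

/// Simplified consumer state; field names follow the snippet under review.
struct Consumer {
    tpl: Mutex<Vec<Elem>>,
    msgs: Mutex<Vec<String>>,
}

impl Consumer {
    fn seek_partitions(&self, requests: &[(&str, i32, i64)]) -> Result<(), String> {
        // One guard covers both validation and mutation, so no other thread
        // can unassign a partition between the two steps.
        let mut tpl = self.tpl.lock().unwrap();

        for (topic, partition, _) in requests {
            if !tpl.iter().any(|e| e.topic == *topic && e.partition == *partition) {
                return Err(format!(
                    "partition {}:{} is not currently assigned",
                    topic, partition
                ));
            }
        }

        // Still holding `tpl`: clear the buffer and apply the offsets together.
        self.msgs.lock().unwrap().clear();
        for (topic, partition, offset) in requests {
            if let Some(e) = tpl
                .iter_mut()
                .find(|e| e.topic == *topic && e.partition == *partition)
            {
                e.offset = *offset;
            }
        }
        Ok(())
    }
}
```

Because `msgs` is locked while `tpl` is held, any other code path that takes both locks would need to acquire them in the same order to avoid deadlock.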

Refactor BaseConsumer::assign to acquire the tpl mutex only once.
This change improves thread safety and efficiency by performing
validation, message clearing, and offset updates atomically.

- Acquire tpl lock once for all assignment operations
- Move message clearing and offset updates into the critical section
- Reduce risk of race conditions and inconsistent state
- Simplify code for better readability and maintainability

Signed-off-by: Shanicky Chen <[email protected]>
@shanicky shanicky requested a review from Copilot November 12, 2025 09:36
Copilot finished reviewing on behalf of shanicky November 12, 2025 09:40
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.


@kwannoel kwannoel self-requested a review November 12, 2025 09:53
Refactor partition management in BaseConsumer for performance and
correctness. Replace linear scans with HashSet/HashMap-based lookups,
and clarify message buffering semantics.

- Use HashSet to deduplicate assignments and unassignments efficiently
- Employ HashMap for constant-time partition offset updates in seeking
- Improve documentation on buffered message behavior for assign/unassign
- Ensure idempotence and robustness in incremental_* operations
- Reduce code complexity and potential for subtle bugs

Signed-off-by: Shanicky Chen <[email protected]>
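As a rough illustration of the HashMap-based lookup this commit describes, the sketch below keys the requested offsets by (topic, partition) and applies them in a single pass over the current assignment. `Elem`, `apply_offsets`, and the plain `i64` offsets are hypothetical; the commit's actual data structures are not shown in this conversation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an assigned partition and its offset.
struct Elem {
    topic: String,
    partition: i32,
    offset: i64,
}

/// Apply requested offsets in one pass over the current assignment.
/// Returns the number of partitions that were updated.
fn apply_offsets(current: &mut [Elem], requests: &[(&str, i32, i64)]) -> usize {
    // Key by (topic, partition); a later duplicate request overwrites an earlier one.
    let wanted: HashMap<(&str, i32), i64> = requests
        .iter()
        .map(|(topic, partition, offset)| ((*topic, *partition), *offset))
        .collect();

    let mut updated = 0;
    for elem in current.iter_mut() {
        // Average constant-time lookup instead of scanning the request list.
        let hit = wanted.get(&(elem.topic.as_str(), elem.partition)).copied();
        if let Some(offset) = hit {
            elem.offset = offset;
            updated += 1;
        }
    }
    updated
}
```

With n assigned partitions and m requests this does O(n + m) work on average, compared with O(n × m) for a nested linear scan.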
@shanicky shanicky requested a review from Copilot November 12, 2025 10:50
Copilot finished reviewing on behalf of shanicky November 12, 2025 10:51
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


Refactor partition offset update logic to validate all requested
partitions before mutating internal state in BaseConsumer.

- Separate validation and mutation into two distinct phases
- Avoid clearing message queue if any partition assignment is invalid
- Use collected indices for efficient offset updates after validation
- Prevent partial state changes on error for safety and correctness
- Improve maintainability by consolidating redundant loops

Signed-off-by: Shanicky Chen <[email protected]>
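The two-phase approach with collected indices might be sketched as follows. This is a simplified model under stated assumptions: `Elem`, `seek_all`, the `Vec<String>` buffer, and plain `i64` offsets are hypothetical stand-ins, and locking is left out to keep the focus on the validate-then-mutate split.

```rust
/// Hypothetical stand-in for an assigned partition and its offset.
struct Elem {
    topic: String,
    partition: i32,
    offset: i64,
}

fn seek_all(
    current: &mut [Elem],
    buffer: &mut Vec<String>,
    requests: &[(&str, i32, i64)],
) -> Result<(), String> {
    // Phase 1: validate and remember where each partition lives. Nothing is mutated yet.
    let mut updates: Vec<(usize, i64)> = Vec::with_capacity(requests.len());
    for (topic, partition, offset) in requests {
        let idx = current
            .iter()
            .position(|e| e.topic == *topic && e.partition == *partition)
            .ok_or_else(|| {
                format!("partition {}:{} is not currently assigned", topic, partition)
            })?;
        updates.push((idx, *offset));
    }

    // Phase 2: every request passed validation, so it is now safe to mutate.
    buffer.clear();
    for (idx, offset) in updates {
        current[idx].offset = offset;
    }
    Ok(())
}
```

If any request names an unassigned partition, the function returns before `buffer.clear()` runs, so both the buffered messages and the stored offsets stay as they were.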
@shanicky shanicky requested a review from Copilot November 12, 2025 11:02
Copilot finished reviewing on behalf of shanicky November 12, 2025 11:03
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


Contributor

@BugenZhao BugenZhao left a comment

LGTM

@BugenZhao
Contributor

Is it ready to merge now?

@shanicky
Contributor Author

> Is it ready to merge now?

Yes

@BugenZhao BugenZhao merged commit 8c5a805 into madsim-rs:main Nov 25, 2025
18 checks passed