KAFKA-19683: Remove more dead tests and rewrote 3 tests in TaskManagerTest [2/N] #20544

shashankhs11 · 2025-09-16T23:50:28Z

Changes made

Added an overloaded setUpTaskManager() method. I created it
temporarily so that the CI pipelines keep passing while I work on the
failing tests incrementally.
Rewrote 3 tests to use the stateUpdater thread.

shashankhs11 · 2025-09-16T23:52:54Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

            .withInputPartitions(taskId00Partitions).build();
        final TasksRegistry tasks = mock(TasksRegistry.class);
-        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


These added lines are all temporary. Once all the tests are rewritten, we can do this once in setUp().

Honestly, these changes to setUpTaskManager are quite confusing and I don't understand why you did it.

I agree, it is definitely a bit confusing 😅

The reason I did this is that I wanted to identify all the tests that would fail after we removed the stateUpdaterEnabled flag. I thought the safest way to rewrite these tests incrementally would be to add another overloaded method without the flag, so we don’t break the CI checks in the meantime. This temporarily adds a lot of unnecessary code, but my plan is to clean it up once all the tests are updated.

Do you think this approach makes sense? I would really appreciate your thoughts, and I’m open to any suggestions.

It is confusing. Maybe you want to rename it more explicitly (setUpTaskManagerWithStateUpdater or setUpTaskManagerWithoutStateUpdater)?
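
The renaming suggestion can be sketched roughly as below. This is a minimal, self-contained illustration and not the actual TaskManagerTest code: the class NamedSetupSketch, the Config holder, and the private setUp worker are hypothetical stand-ins, and the real setUpTaskManager takes further arguments (processing mode, tasks registry) that are left out here.

```java
// Hypothetical stand-in for the test fixture; not the real TaskManagerTest.
final class NamedSetupSketch {

    // Records which features a (pretend) TaskManager was configured with.
    static final class Config {
        final boolean stateUpdaterEnabled;
        final boolean processingThreadsEnabled;

        Config(final boolean stateUpdaterEnabled, final boolean processingThreadsEnabled) {
            this.stateUpdaterEnabled = stateUpdaterEnabled;
            this.processingThreadsEnabled = processingThreadsEnabled;
        }
    }

    // Single private worker that still takes both flags.
    private static Config setUp(final boolean stateUpdaterEnabled,
                                final boolean processingThreadsEnabled) {
        return new Config(stateUpdaterEnabled, processingThreadsEnabled);
    }

    // Explicit name: the call site states that the state updater path is used.
    static Config setUpTaskManagerWithStateUpdater(final boolean processingThreadsEnabled) {
        return setUp(true, processingThreadsEnabled);
    }

    // Explicit name: the call site states that the legacy path is used.
    static Config setUpTaskManagerWithoutStateUpdater(final boolean processingThreadsEnabled) {
        return setUp(false, processingThreadsEnabled);
    }

    public static void main(final String[] args) {
        final Config migrated = setUpTaskManagerWithStateUpdater(false);
        final Config legacy = setUpTaskManagerWithoutStateUpdater(false);
        System.out.println("migrated uses state updater: " + migrated.stateUpdaterEnabled);
        System.out.println("legacy uses state updater: " + legacy.stateUpdaterEnabled);
    }
}
```

With names like these, a reader can tell from the call site alone which tests have already been migrated, and the "without" helper can be deleted in one step once no test calls it.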

shashankhs11 · 2025-09-17T00:18:31Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

-    public void shouldCommitNonCorruptedTasksOnTaskCorruptedException() {
-        final ProcessorStateManager stateManager = mock(ProcessorStateManager.class);
-
-        final StateMachineTask corruptedTask = new StateMachineTask(taskId00, taskId00Partitions, true, stateManager);
-        final StateMachineTask nonCorruptedTask = new StateMachineTask(taskId01, taskId01Partitions, true, stateManager);
+    public void shouldNotCommitCorruptedTasksOnTaskCorruptedException() {


I renamed this test from shouldCommitNonCorruptedTasksOnTaskCorruptedException. Based on my understanding, the commit logic happens at the StreamThread level, and only the exception propagation happens in TaskManager, via checkStateUpdater. So I decided to omit the check for the commit logic and rewrite the test.

Hence I propose renaming it to shouldNotCommitCorruptedTasksOnTaskCorruptedException.

Please correct me if I am wrong or if I misunderstood!

No, I don't think I agree with this

The key for this test is that non-corrupted tasks are still committed as usual, and the offsets for the corrupted tasks are reset.

    assertTrue(nonCorruptedTask.commitPrepared);
    assertThat(nonCorruptedTask.partitionsForOffsetReset, equalTo(Collections.emptySet()));
    assertThat(corruptedTask.partitionsForOffsetReset, equalTo(taskId00Partitions));
    // check that we should not commit empty map either
    verify(consumer, never()).commitSync(emptyMap());
    verify(stateManager).markChangelogAsCorrupted(taskId00Partitions);

This is still a valid test!

But maybe we can skip the handleAssignment / complete restoration part if we immediately mock a RUNNING task?

I rewrote this test again in 6df4e79
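
The contract the reviewer describes (non-corrupted tasks still get committed, corrupted tasks get their offsets reset) can be restated as a toy model. This is only a sketch of the expected outcome, not how TaskManager is implemented: CorruptionContractSketch, ToyTask, and handleCorruption are hypothetical names, and the partition strings are made up. The two fields mirror the commitPrepared and partitionsForOffsetReset fields used in the assertions above.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the behavior under test; not Kafka's TaskManager.
final class CorruptionContractSketch {

    // Hypothetical task that starts out already RUNNING (no restoration phase).
    static final class ToyTask {
        final String id;
        final Set<String> inputPartitions;
        boolean commitPrepared = false;
        Set<String> partitionsForOffsetReset = Collections.emptySet();

        ToyTask(final String id, final Set<String> inputPartitions) {
            this.id = id;
            this.inputPartitions = inputPartitions;
        }
    }

    // Assumed contract: healthy tasks are prepared for commit,
    // corrupted tasks have all of their input partitions marked for offset reset.
    static void handleCorruption(final List<ToyTask> runningTasks, final Set<String> corruptedIds) {
        for (final ToyTask task : runningTasks) {
            if (corruptedIds.contains(task.id)) {
                task.partitionsForOffsetReset = new HashSet<>(task.inputPartitions);
            } else {
                task.commitPrepared = true;
            }
        }
    }

    public static void main(final String[] args) {
        final ToyTask corrupted = new ToyTask("0_0", Set.of("topic1-0"));
        final ToyTask healthy = new ToyTask("0_1", Set.of("topic1-1"));

        handleCorruption(List.of(corrupted, healthy), Set.of("0_0"));

        System.out.println("healthy commitPrepared: " + healthy.commitPrepared);
        System.out.println("corrupted reset partitions: " + corrupted.partitionsForOffsetReset);
    }
}
```

Starting both toy tasks in a running state corresponds to the suggestion to skip the assignment and restoration steps, so the test exercises only the corruption handling.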

shashankhs11 · 2025-09-17T00:29:26Z

I rewrote only 3 tests for now. I wanted to ensure that my approach is correct before proceeding further.
@lucasbru -- tagging for review

Copilot

Pull Request Overview

This PR removes dead tests and rewrites 3 existing tests in TaskManagerTest to use the stateUpdater thread pattern. An additional overloaded setUpTaskManager() method was temporarily created to pass CI pipelines while working on failing tests incrementally.

Removed 3 dead tests that were no longer needed
Rewrote 3 tests to use stateUpdater thread instead of direct task manipulation
Added temporary overloaded setup method for incremental CI fixes

Copilot · 2025-09-18T08:24:35Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

    @BeforeEach
    public void setUp() {
-        taskManager = setUpTaskManager(StreamsConfigUtils.ProcessingMode.AT_LEAST_ONCE, null, false);
+        taskManager = setUpTaskManager(StreamsConfigUtils.ProcessingMode.AT_LEAST_ONCE, null, false, false);


The method call now has two boolean parameters without clear meaning. Consider using named parameters or method overloading to make the intent clearer. The current call setUpTaskManager(..., false, false) is ambiguous about what each boolean controls.

Copilot · 2025-09-18T08:24:35Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

            .withInputPartitions(taskId00Partitions).build();
        final TasksRegistry tasks = mock(TasksRegistry.class);
-        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.
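
The positional ambiguity described in this comment can be reproduced in isolation. The sketch below is a hypothetical model, not the Kafka test code: OverloadAmbiguitySketch is an invented class, the mode is passed as a plain string, and the assumption that the 3-argument overload always enables the state updater is taken from the PR description rather than from the actual implementation.

```java
// Hypothetical model of the two overloads discussed in the review; not Kafka code.
final class OverloadAmbiguitySketch {

    // Old 4-argument shape: (mode, tasks, stateUpdaterEnabled, processingThreadsEnabled).
    static String setUpTaskManager(final String mode,
                                   final Object tasks,
                                   final boolean stateUpdaterEnabled,
                                   final boolean processingThreadsEnabled) {
        return "stateUpdater=" + stateUpdaterEnabled
            + ", processingThreads=" + processingThreadsEnabled;
    }

    // New 3-argument shape (assumption: the state updater is always on),
    // so the third argument now means processingThreadsEnabled.
    static String setUpTaskManager(final String mode,
                                   final Object tasks,
                                   final boolean processingThreadsEnabled) {
        return setUpTaskManager(mode, tasks, true, processingThreadsEnabled);
    }

    public static void main(final String[] args) {
        // The same literal `true` in position three means different things
        // depending on how many arguments follow it.
        System.out.println(setUpTaskManager("AT_LEAST_ONCE", null, true, false));
        System.out.println(setUpTaskManager("AT_LEAST_ONCE", null, true));
    }
}
```

The first call prints `stateUpdater=true, processingThreads=false` and the second prints `stateUpdater=true, processingThreads=true`: dropping the last argument from an existing call compiles cleanly but silently turns the remaining `true` into the processing-threads flag.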

Copilot · 2025-09-18T08:24:35Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

            .withInputPartitions(taskId01Partitions).build();
        final TasksRegistry tasks = mock(TasksRegistry.class);
-        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.

Copilot · 2025-09-18T08:24:36Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

    public void shouldLockActiveOnHandleAssignmentWithProcessingThreads() {
        final TasksRegistry tasks = mock(TasksRegistry.class);
-        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.

Copilot · 2025-09-18T08:24:36Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

            .withInputPartitions(taskId01Partitions).build();
        final TasksRegistry tasks = mock(TasksRegistry.class);
-        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.

Copilot · 2025-09-18T08:24:36Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

            .withInputPartitions(taskId01Partitions).build();
        final TasksRegistry tasks = mock(TasksRegistry.class);
-        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.

lucasbru

Thanks. I left some comments!

lucasbru · 2025-09-18T09:41:18Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

+
+        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, false);
+
+        assertTrue(taskManager.checkStateUpdater(time.milliseconds(), noOpResetter));


For my understanding - why do we actually need to call checkStateUpdater here?

You're right! I think it's not actually required for this specific test case, but I included it more as a safety check to ensure that punctuation happens only when the system is "ready". We can safely omit the line.

lucasbru · 2025-09-18T09:41:46Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

    }

    @Test
    public void shouldPunctuateActiveTasks() {


This test seems to be testing what it should. The question is just whether it can be simplified (see below).

We can safely omit this line

assertTrue(taskManager.checkStateUpdater(time.milliseconds(), noOpResetter));

Let's omit it then

lucasbru · 2025-09-18T09:46:57Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

-    public void shouldCommitNonCorruptedTasksOnTaskCorruptedException() {
-        final ProcessorStateManager stateManager = mock(ProcessorStateManager.class);
-
-        final StateMachineTask corruptedTask = new StateMachineTask(taskId00, taskId00Partitions, true, stateManager);
-        final StateMachineTask nonCorruptedTask = new StateMachineTask(taskId01, taskId01Partitions, true, stateManager);
+    public void shouldNotCommitCorruptedTasksOnTaskCorruptedException() {


No, I don't think I agree with this

The key for this test is that non-corrupted tasks are still committed as usual, and the offsets for the corrupted tasks are reset.

    assertTrue(nonCorruptedTask.commitPrepared);
    assertThat(nonCorruptedTask.partitionsForOffsetReset, equalTo(Collections.emptySet()));
    assertThat(corruptedTask.partitionsForOffsetReset, equalTo(taskId00Partitions));
    // check that we should not commit empty map either
    verify(consumer, never()).commitSync(emptyMap());
    verify(stateManager).markChangelogAsCorrupted(taskId00Partitions);

This is still a valid test!

But maybe we can skip the handleAssignment / complete restoration part if we immediately mock a RUNNING task?

lucasbru · 2025-09-18T09:47:47Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

            .withInputPartitions(taskId00Partitions).build();
        final TasksRegistry tasks = mock(TasksRegistry.class);
-        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+        final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Honestly, these changes to setUpTaskManager are quite confusing and I don't understand why you did it.

lucasbru · 2025-09-30T08:50:31Z

@shashankhs11 let me know when you need another review here

shashankhs11 · 2025-10-04T00:15:51Z

@lucasbru I have made the changes as suggested. Tagging for review

github-actions bot added the triage (PRs from the community), streams, and tests (Test fixes (including flaky tests)) labels on Sep 16, 2025

github-actions bot removed the triage (PRs from the community) label on Sep 17, 2025

lucasbru self-assigned this Sep 18, 2025

lucasbru requested a review from Copilot September 18, 2025 08:21

shashankhs11 requested a review from lucasbru September 19, 2025 17:58

shashankhs11 added 4 commits October 3, 2025 17:05

step2: more cleanup and rewrote 3 tests

9dd75df

cleanup unnecessary comments

9daf9cd

rewrite to test same as previous

06b65dc

explicit function renaming

85d1863

shashankhs11 force-pushed the KAFKA-19683-2 branch from 6df4e79 to 85d1863 Compare October 4, 2025 00:08

fix indentation

03483b6
