KAFKA-19683: Remove more dead tests and rewrote 3 tests in TaskManagerTest [2/N] #20544

shashankhs11 · 2025-09-16T23:50:28Z

Changes made

- Additional setUpTaskManager() overloaded method -- created temporarily to pass the CI pipelines so that I can work on the failing tests incrementally
- Rewrote 3 tests to use the stateUpdater thread

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>

shashankhs11 · 2025-09-16T23:52:54Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

 .withInputPartitions(taskId00Partitions).build();
 final TasksRegistry tasks = mock(TasksRegistry.class);
- final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+ final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


These lines added are all temporary. Once we rewrite all the tests, we can do this once in the setUp()

Honestly, these changes to setUpTaskManager are quite confusing, and I don't understand why you made them.

I agree, it is definitely a bit confusing 😅

The reason I did this is that I wanted to identify all the tests that would fail after we removed the stateUpdaterEnabled flag. I thought the safest way to rewrite these tests incrementally would be to add another overloaded method without the flag, so we don't break the CI checks in the meantime. This temporarily adds a lot of unnecessary code, but my plan is to clean it up once all the tests are updated.

Do you think this approach makes sense? I would really appreciate your thoughts, and I'm open to any suggestions.

It is confusing. Maybe you want to rename it more explicitly (setUpTaskManagerWithStateUpdater or setUpTaskManagerWithoutStateUpdater)?
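
To make the renaming suggestion concrete, here is a minimal sketch of how explicitly named helpers could wrap a flag-based setup method. Everything in it is a hypothetical stand-in: `SetupSketch`, `Config`, and the single-value `ProcessingMode` enum are invented for illustration and are not the real TaskManagerTest code; only the two helper names come from the comment above.

```java
// Hypothetical stand-ins; none of these types are the real Kafka Streams test classes.
final class SetupSketch {
    enum ProcessingMode { AT_LEAST_ONCE }

    // Records which features a setup call switched on.
    static final class Config {
        final ProcessingMode mode;
        final boolean stateUpdaterEnabled;
        final boolean processingThreadsEnabled;

        Config(ProcessingMode mode, boolean stateUpdaterEnabled, boolean processingThreadsEnabled) {
            this.mode = mode;
            this.stateUpdaterEnabled = stateUpdaterEnabled;
            this.processingThreadsEnabled = processingThreadsEnabled;
        }
    }

    // Flag-based helper: a call site reads as setUpTaskManager(mode, true, false),
    // which does not say what either boolean controls.
    static Config setUpTaskManager(ProcessingMode mode,
                                   boolean stateUpdaterEnabled,
                                   boolean processingThreadsEnabled) {
        return new Config(mode, stateUpdaterEnabled, processingThreadsEnabled);
    }

    // Explicitly named wrappers: the state-updater choice is visible in the method name.
    static Config setUpTaskManagerWithStateUpdater(ProcessingMode mode,
                                                   boolean processingThreadsEnabled) {
        return setUpTaskManager(mode, true, processingThreadsEnabled);
    }

    static Config setUpTaskManagerWithoutStateUpdater(ProcessingMode mode,
                                                      boolean processingThreadsEnabled) {
        return setUpTaskManager(mode, false, processingThreadsEnabled);
    }

    public static void main(String[] args) {
        Config c = setUpTaskManagerWithStateUpdater(ProcessingMode.AT_LEAST_ONCE, false);
        System.out.println("stateUpdater=" + c.stateUpdaterEnabled
            + ", processingThreads=" + c.processingThreadsEnabled);
    }
}
```

With wrappers like these, the temporary "without" variant can be deleted in one step once no test calls it any more.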

shashankhs11 · 2025-09-17T00:18:31Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

- public void shouldCommitNonCorruptedTasksOnTaskCorruptedException() {
- final ProcessorStateManager stateManager = mock(ProcessorStateManager.class);
-
- final StateMachineTask corruptedTask = new StateMachineTask(taskId00, taskId00Partitions, true, stateManager);
- final StateMachineTask nonCorruptedTask = new StateMachineTask(taskId01, taskId01Partitions, true, stateManager);
+ public void shouldNotCommitCorruptedTasksOnTaskCorruptedException() {


I renamed this test from shouldCommitNonCorruptedTasksOnTaskCorruptedException. Based on my understanding, the commit logic happens at the StreamThread level, and only the exception propagation happens in TaskManager, via checkStateUpdater. So I decided to omit the check for the commit logic and rewrite the test.

Hence I propose renaming it to shouldNotCommitCorruptedTasksOnTaskCorruptedException.

Please correct me if I am wrong or if I misunderstood!

No, I don't think I agree with this.

The key for this test is that non-corrupted tasks are still committed as usual, and the offsets for the corrupted tasks are reset.

 assertTrue(nonCorruptedTask.commitPrepared);
 assertThat(nonCorruptedTask.partitionsForOffsetReset, equalTo(Collections.emptySet()));
 assertThat(corruptedTask.partitionsForOffsetReset, equalTo(taskId00Partitions));
 // check that we should not commit empty map either
 verify(consumer, never()).commitSync(emptyMap());
 verify(stateManager).markChangelogAsCorrupted(taskId00Partitions);

This is still a valid test!

But maybe we can skip the handleAssignment / complete-restoration part if we immediately mock a RUNNING task?

I rewrote this test again in 6df4e79
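
The property the reviewer wants the test to keep can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `CorruptionSketch`, `FakeTask`, and `handleCorruption` are hypothetical and do not reproduce the real TaskManager logic or the Mockito-based test; the field names `commitPrepared` and `partitionsForOffsetReset` are borrowed from the quoted assertions.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;

// Toy model of the behaviour under test; not the real Kafka Streams classes.
final class CorruptionSketch {
    static final class FakeTask {
        final String id;
        final Set<String> partitions;
        boolean commitPrepared = false;
        Set<String> partitionsForOffsetReset = Set.of();

        FakeTask(String id, Set<String> partitions) {
            this.id = id;
            this.partitions = partitions;
        }
    }

    // Hypothetical handler: healthy tasks are prepared for commit as usual,
    // corrupted tasks only get their input partitions marked for offset reset.
    static void handleCorruption(Collection<FakeTask> runningTasks, Set<String> corruptedIds) {
        for (FakeTask task : runningTasks) {
            if (corruptedIds.contains(task.id)) {
                task.partitionsForOffsetReset = task.partitions;
            } else {
                task.commitPrepared = true;
            }
        }
    }

    public static void main(String[] args) {
        // Both tasks start out as already running, so no assignment or restoration step is modelled.
        FakeTask corrupted = new FakeTask("0_0", Set.of("topic-0"));
        FakeTask nonCorrupted = new FakeTask("0_1", Set.of("topic-1"));

        handleCorruption(List.of(corrupted, nonCorrupted), Set.of("0_0"));

        System.out.println("nonCorrupted.commitPrepared=" + nonCorrupted.commitPrepared);
        System.out.println("corrupted.partitionsForOffsetReset=" + corrupted.partitionsForOffsetReset);
    }
}
```

The model mirrors the first three quoted assertions; the consumer and state-manager verifications have no counterpart here.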

shashankhs11 · 2025-09-17T00:29:26Z

I rewrote only 3 tests for now. I wanted to ensure that my approach is correct before proceeding further.
@lucasbru -- tagging for review

Copilot

Pull Request Overview

This PR removes dead tests and rewrites 3 existing tests in TaskManagerTest to use the stateUpdater thread pattern. An additional overloaded setUpTaskManager() method was temporarily created to pass CI pipelines while working on failing tests incrementally.

Removed 3 dead tests that were no longer needed
Rewrote 3 tests to use stateUpdater thread instead of direct task manipulation
Added temporary overloaded setup method for incremental CI fixes

Copilot · 2025-09-18T08:24:35Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

 @BeforeEach
 public void setUp() {
- taskManager = setUpTaskManager(StreamsConfigUtils.ProcessingMode.AT_LEAST_ONCE, null, false);
+ taskManager = setUpTaskManager(StreamsConfigUtils.ProcessingMode.AT_LEAST_ONCE, null, false, false);


The method call now has two boolean parameters without clear meaning. Consider using named parameters or method overloading to make the intent clearer. The current call setUpTaskManager(..., false, false) is ambiguous about what each boolean controls.

Copilot · 2025-09-18T08:24:35Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

 .withInputPartitions(taskId00Partitions).build();
 final TasksRegistry tasks = mock(TasksRegistry.class);
- final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+ final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.
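
The pitfall described in this comment can be shown with a reduced sketch. It assumes, as the comment states, that the trailing boolean of the shorter overload means processingThreadsEnabled, and it further assumes that the shorter overload always enables the state updater; `OverloadSketch` and `setUp` are hypothetical names, and the real helper takes more arguments.

```java
// Reduced illustration of two overloads whose boolean arguments shift meaning.
final class OverloadSketch {
    // Older shape: first boolean is stateUpdaterEnabled, second is processingThreadsEnabled.
    static String setUp(boolean stateUpdaterEnabled, boolean processingThreadsEnabled) {
        return "stateUpdater=" + stateUpdaterEnabled
            + ", processingThreads=" + processingThreadsEnabled;
    }

    // Newer shape (assumption): the state updater is always on,
    // so the only boolean left is processingThreadsEnabled.
    static String setUp(boolean processingThreadsEnabled) {
        return setUp(true, processingThreadsEnabled);
    }

    public static void main(String[] args) {
        String original = setUp(true, false);   // state updater on, processing threads off
        String droppedLastArg = setUp(true);    // compiles, but now turns processing threads on
        String droppedFirstArg = setUp(false);  // keeps the original configuration

        System.out.println(original.equals(droppedLastArg));
        System.out.println(original.equals(droppedFirstArg));
    }
}
```

Because both migrations compile, the compiler cannot catch the wrong one; only calls where both flags were equal, such as the `true, true` calls in this diff, are unaffected.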

Copilot · 2025-09-18T08:24:35Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

 .withInputPartitions(taskId01Partitions).build();
 final TasksRegistry tasks = mock(TasksRegistry.class);
- final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+ final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.

Copilot · 2025-09-18T08:24:36Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

 public void shouldLockActiveOnHandleAssignmentWithProcessingThreads() {
 final TasksRegistry tasks = mock(TasksRegistry.class);
- final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+ final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.

Copilot · 2025-09-18T08:24:36Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

 .withInputPartitions(taskId01Partitions).build();
 final TasksRegistry tasks = mock(TasksRegistry.class);
- final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+ final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.

Copilot · 2025-09-18T08:24:36Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

 .withInputPartitions(taskId01Partitions).build();
 final TasksRegistry tasks = mock(TasksRegistry.class);
- final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+ final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Multiple test methods are calling the 3-parameter setUpTaskManager method with true for the third parameter, but this creates ambiguity about which overloaded method is being called. The new 3-parameter method expects processingThreadsEnabled while the old 4-parameter method expects stateUpdaterEnabled as the third parameter. This could lead to confusion and potential bugs when the temporary method is removed.

lucasbru

Thanks. I left some comments!

lucasbru · 2025-09-18T09:41:18Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

+
+ final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, false);
+
+ assertTrue(taskManager.checkStateUpdater(time.milliseconds(), noOpResetter));


For my understanding - why do we actually need to call checkStateUpdater here?

You're right! I think it's not actually required for this specific test case; I included it as a safety check to ensure that punctuation happens only when the system is "ready". We can safely omit the line.

lucasbru · 2025-09-18T09:46:57Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

- public void shouldCommitNonCorruptedTasksOnTaskCorruptedException() {
- final ProcessorStateManager stateManager = mock(ProcessorStateManager.class);
-
- final StateMachineTask corruptedTask = new StateMachineTask(taskId00, taskId00Partitions, true, stateManager);
- final StateMachineTask nonCorruptedTask = new StateMachineTask(taskId01, taskId01Partitions, true, stateManager);
+ public void shouldNotCommitCorruptedTasksOnTaskCorruptedException() {


No, I don't think I agree with this.

The key for this test is that non-corrupted tasks are still committed as usual, and the offsets for the corrupted tasks are reset.

 assertTrue(nonCorruptedTask.commitPrepared);
 assertThat(nonCorruptedTask.partitionsForOffsetReset, equalTo(Collections.emptySet()));
 assertThat(corruptedTask.partitionsForOffsetReset, equalTo(taskId00Partitions));
 // check that we should not commit empty map either
 verify(consumer, never()).commitSync(emptyMap());
 verify(stateManager).markChangelogAsCorrupted(taskId00Partitions);

This is still a valid test!

But maybe we can skip the handleAssignment / complete-restoration part if we immediately mock a RUNNING task?

lucasbru · 2025-09-18T09:47:47Z

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

 .withInputPartitions(taskId00Partitions).build();
 final TasksRegistry tasks = mock(TasksRegistry.class);
- final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true, true);
+ final TaskManager taskManager = setUpTaskManager(ProcessingMode.AT_LEAST_ONCE, tasks, true);


Honestly, these changes to setUpTaskManager are quite confusing, and I don't understand why you made them.

lucasbru · 2025-09-30T08:50:31Z

@shashankhs11 let me know when you need another review here

shashankhs11 · 2025-10-04T00:15:51Z

@lucasbru I have made the changes as suggested. Tagging for review

Copilot

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

lucasbru

LGTM, thanks!

lucasbru

Ah, still got two comments

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java

lucasbru

Okay. The test "shouldCommitNonCorruptedTasksOnTaskCorruptedException" doesn't seem very thorough, but I see that it was the case before as well, so I think this is good to go.

shashankhs11 · 2025-10-08T20:34:24Z

Awesome! Thanks a lot for your time and patience, Lucas.
I have worked on more tests and will be making the next PR as soon as this PR has been merged. Does that sound good?

> The test "shouldCommitNonCorruptedTasksOnTaskCorruptedException" doesn't seem very thorough

I've made a note of this. Maybe, we can come back to this at the end?

lucasbru · 2025-10-10T08:57:15Z

We seem to have problems with CI and Java 25. Can you try rebasing on latest trunk please?

shashankhs11 · 2025-10-10T09:45:33Z

> Can you try rebasing on latest trunk please?

Done!

github-actions bot added the triage, streams, and tests labels Sep 16, 2025

github-actions bot removed the triage label Sep 17, 2025

lucasbru self-assigned this Sep 18, 2025

lucasbru requested a review from Copilot September 18, 2025 08:21

Copilot AI reviewed Sep 18, 2025

lucasbru reviewed Sep 18, 2025

shashankhs11 requested a review from lucasbru September 19, 2025 17:58

shashankhs11 force-pushed the KAFKA-19683-2 branch from 6df4e79 to 85d1863 Compare October 4, 2025 00:08

lucasbru requested a review from Copilot October 8, 2025 12:36

Copilot AI reviewed Oct 8, 2025

lucasbru approved these changes Oct 8, 2025

lucasbru requested changes Oct 8, 2025

lucasbru added the ci-approved label Oct 8, 2025

shashankhs11 force-pushed the KAFKA-19683-2 branch from 03483b6 to 54ac38b Compare October 8, 2025 14:27

shashankhs11 requested a review from lucasbru October 8, 2025 14:47

lucasbru approved these changes Oct 8, 2025

shashankhs11 added 5 commits October 10, 2025 02:43

step2: more cleanup and rewrote 3 tests

974a18c

cleanup unnecessary comments

c447bbe

rewrite to test same as previous

17a6383

explicit function renaming

6514c49

fix indentation

f62530a

added back test for stateUpdater init

e4b7e3b

shashankhs11 force-pushed the KAFKA-19683-2 branch from 54ac38b to e4b7e3b Compare October 10, 2025 09:44

lucasbru merged commit 59f51fb into apache:trunk Oct 10, 2025
20 checks passed

shashankhs11 deleted the KAFKA-19683-2 branch October 11, 2025 13:36

lucasbru added the streams-thread-refactoring label Oct 23, 2025

