@patriknw
Contributor

  • when rescaling, the revision is bumped and all processes from the previous revision are stopped
  • a keep-alive message that is still in flight can then trigger a new start of a process belonging to the previous revision, because the revision check used local read consistency
  • change the revision check to use the same consistency as the ShardedDaemonProcessCoordinator
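
The race described above can be illustrated with a toy model. This is only a sketch in plain Scala with no Akka APIs: `Replica`, `KeepAlive`, `readLocal`, `readMajority` and `shouldStart` are hypothetical names, and it assumes the coordinator's consistency behaves like a majority-style read, which this PR does not state.

```scala
// Toy model only: plain Scala, no Akka APIs. All names here are hypothetical.
object RevisionCheckSketch {
  // Each node holds its own replica of the current revision number.
  final case class Replica(node: String, revision: Long)

  // A keep-alive asking for a process of a given revision to be (re)started.
  final case class KeepAlive(processId: Int, revision: Long)

  // Local read: only this node's replica is consulted, and it may lag behind.
  def readLocal(self: Replica): Long = self.revision

  // Simplified majority read (an assumption, not the actual implementation):
  // consult this node plus enough peers to form a majority, take the highest revision.
  def readMajority(self: Replica, peers: Seq[Replica]): Long = {
    val majority = (peers.size + 1) / 2 + 1
    (self +: peers.take(majority - 1)).map(_.revision).max
  }

  // Start the process only if the keep-alive matches the revision that was read.
  def shouldStart(msg: KeepAlive, currentRevision: Long): Boolean =
    msg.revision == currentRevision

  def main(args: Array[String]): Unit = {
    // Rescaling bumped the revision to 2 on a majority (b, c); node a has not caught up.
    val a = Replica("a", 1L)
    val b = Replica("b", 2L)
    val c = Replica("c", 2L)
    // Keep-alive sent before the rescale, still in flight, arrives at node a.
    val inFlight = KeepAlive(processId = 0, revision = 1L)

    // Local read: the stale keep-alive matches node a's old revision.
    println(shouldStart(inFlight, readLocal(a)))
    // Majority read: the bumped revision is seen and the keep-alive is ignored.
    println(shouldStart(inFlight, readMajority(a, Seq(b, c))))
  }
}
```

In this model the local read lets the stale keep-alive start a process from the previous revision, while the majority-style read sees the bumped revision and rejects it.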

@patriknw patriknw added the bug label Nov 25, 2025
@patriknw patriknw added this to the 2.10.13 milestone Nov 25, 2025
@patriknw patriknw force-pushed the wip-daemon-start-patriknw branch from 4e6dd6b to 9e7c9e9 Compare November 25, 2025 12:13
def apply(id: Int): Behavior[Command] = Behaviors.setup { ctx =>
  ctx.log.info("Started [{}]", id)
  val snitchRouter = ctx.spawn(Routers.group(SnitchServiceKey), "router")
  snitchRouter ! ProcessActorEvent(id, "Started")
Contributor Author

This is not reliable, sometimes dead letters. I have to find a better way to collect these events.

Contributor

Why would it drop messages?

Contributor

@aludwiko aludwiko left a comment

LGTM

Contributor

@octonato octonato left a comment

LGTM, with a question...

Contributor

@johanandren johanandren left a comment

Fix looks good.

No good idea why the group router would drop messages.

@patriknw patriknw marked this pull request as draft November 25, 2025 12:48
@patriknw
Contributor Author

draft until the test is solid

@patriknw patriknw marked this pull request as ready for review December 1, 2025 15:55
@patriknw
Contributor Author

patriknw commented Dec 1, 2025

I'll improve the test in a separate follow-up PR. We have confirmed this in running systems.

@patriknw patriknw merged commit 1c76943 into main Dec 1, 2025
9 checks passed
@patriknw patriknw deleted the wip-daemon-start-patriknw branch December 1, 2025 15:55