When used for recovery, pin query to journal dispatcher #880

Merged (5 commits) on Apr 20, 2021


ignasi35
Contributor

Recovering an entity should not be clogged by the load on the read-side.

When using a query for recovery, it should run on the journal dispatcher, not the read dispatcher.

Member

@patriknw patriknw left a comment

agreed, good to run those on the journal dispatcher

@ignasi35 ignasi35 marked this pull request as ready for review April 16, 2021 09:38
@ignasi35
Contributor Author

Note this targets release-0.x. Needs forward-port to master.

@ignasi35
Contributor Author

ignasi35 commented Apr 16, 2021

Travis can't download the Jabba installer script. 🤦🏼‍♂️
I've raised #883 to install the JDKs in travis using apt.

@@ -88,6 +92,8 @@ trait CassandraRecovery extends CassandraTagRecovery with TaggedPreparedStatemen
  someReadConsistency,
  someReadRetryPolicy,
  extractor = Extractors.persistentRepr(eventDeserializer, serialization))
+ // run the query on the journal dispatcher (not the queries dispatcher)
+ .withAttributes(ActorAttributes.dispatcher(sessionSettings.pluginDispatcher))
Member

We have to be sure that this isn't just adding an outer asynchronous boundary. It might be clearer and safer to pass the dispatcher in as a parameter to queries.eventsByPersistenceId?

@ignasi35 ignasi35 force-pushed the pin-query-to-journal-dispatcher branch from cc4abdb to 12bbf34 Compare April 16, 2021 12:02
@ignasi35 ignasi35 changed the base branch from release-0.x to inherit-scala-travis April 16, 2021 12:02
@ignasi35
Contributor Author

I've had to rebase this on top of #881 and target the branch of #881 to get travis to work.

@ignasi35
Contributor Author

The job that runs tests on JDK11 seems to be slow enough that the 50min timeout in travis fails the job.

@@ -569,7 +569,8 @@ class CassandraReadJournal(system: ExtendedActorSystem, cfg: Config)
  queryPluginConfig.fetchSize,
  None,
  s"currentEventsByPersistenceId-$persistenceId",
- extractor = Extractors.persistentRepr(eventsByPersistenceIdDeserializer, serialization))
+ extractor = Extractors.persistentRepr(eventsByPersistenceIdDeserializer, serialization),
+ dispatcher = queryPluginConfig.pluginDispatcher)
Member

Since the user part of a stream run through the public API queries will run on the default dispatcher, this means an async boundary is introduced in every app that uses the query side. Is that really something we want to do?

Contributor Author

I'm not sure I understand the previous comment.

Member

@johanandren this is not changing that; the dispatcher parameter is delegated down to https://github.com/akka/akka-persistence-cassandra/pull/880/files#diff-279ff4d092baf64b28404c59938d5a8f4053dffe99b8679826391980f66a6f24R647

We must have a separate dispatcher at the inner stage because some operations are blocking.

Member

Yes, so if we run that inner EventsByPersistenceIdStage and mapAsync stage on an internal dispatcher, the user flow running on the default dispatcher will always introduce an async boundary, since the dispatchers differ.

If that's unavoidable I guess that is fine, as long as it is understood that this is the decision.
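
The boundary discussed in this thread can be observed with a small standalone stream. The sketch below is not code from this PR. It assumes Akka 2.6 and Scala 2.13 on the classpath, and the dispatcher id `demo-journal-dispatcher` and the object name are made up for illustration. It records thread names to show that operators placed before `withAttributes` run on the named dispatcher, while operators appended afterwards run on the default one.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorAttributes
import akka.stream.scaladsl.{ Sink, Source }
import com.typesafe.config.ConfigFactory

import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.jdk.CollectionConverters._

object DispatcherBoundaryCheck {
  // Hypothetical dispatcher id, standing in for the journal plugin dispatcher.
  val JournalDispatcher = "demo-journal-dispatcher"

  private val config = ConfigFactory.parseString(s"""
    $JournalDispatcher {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor.fixed-pool-size = 2
    }
    """).withFallback(ConfigFactory.load())

  /** Returns the thread names seen by the inner and the outer `map`. */
  def threadNames(): (Set[String], Set[String]) = {
    implicit val system: ActorSystem = ActorSystem("demo", config)
    val inner = new ConcurrentLinkedQueue[String]()
    val outer = new ConcurrentLinkedQueue[String]()
    try {
      val done = Source(1 to 5)
        // "inner" part: covered by the dispatcher attribute set below
        .map { n => inner.add(Thread.currentThread().getName); n }
        .withAttributes(ActorAttributes.dispatcher(JournalDispatcher))
        // "outer" part: appended afterwards, so it is not covered
        .map { n => outer.add(Thread.currentThread().getName); n }
        .runWith(Sink.ignore)
      Await.result(done, 10.seconds)
      (inner.asScala.toSet, outer.asScala.toSet)
    } finally Await.result(system.terminate(), 10.seconds)
  }
}
```

Calling `DispatcherBoundaryCheck.threadNames()` and printing both sets is a quick manual check of which dispatcher each part of a stream ends up on.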

Member

That's intended, because otherwise we would propagate the blocking out to user responsibility (and it would end up on the default-dispatcher). That doesn't change by this PR, or did I miss something?

See #870

Member

That same problem goes for this entire PR though. So it wouldn't be more "blindly" than the current changes?

Contributor Author

I tried to keep things as small as possible.

Contributor Author

I only extracted a hardcoded dispatcher name and set the appropriate value in 5 places (CassandraJournal, EventsByTagMigration and CassandraRecovery (x3)).

I got the impression you were suggesting a full review of the whole codebase. It's different orders of magnitude.

Member

We should trust that Akka Streams does the right thing, so additional automated tests shouldn't be needed here. But since we know there were problems with setting the attributes, a manual verification with println is what I was suggesting. I can do that before approving this PR.

Member

Actually, after reviewing this in a bit more detail, I see how tricky it is: because of the futureSource bug, setting the dispatcher needs to be repeated in lots of places.

Perhaps it's even preferable to set the dispatcher where you'd expect to set it (outermost), so that this is fixed once that is fixed in Akka?

@ignasi35 ignasi35 force-pushed the pin-query-to-journal-dispatcher branch from 2343699 to d76e066 Compare April 20, 2021 09:36
@ignasi35 ignasi35 changed the base branch from inherit-scala-travis to migrate-installer-to-apt April 20, 2021 09:37
- extractor = Extractors.sequenceNumber(eventDeserializer, serialization))
+ extractor = Extractors.sequenceNumber(eventDeserializer, serialization),
+ // run the query on the journal dispatcher (not the queries dispatcher)
+ dispatcher = sessionSettings.pluginDispatcher)
  .map(_.sequenceNr)
  .runWith(Sink.headOption)
Member

Here the .map and the sink run on the default dispatcher, so one async boundary is added that we probably do not want, since it is an entirely internal stream.
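
One possible way to keep an entirely internal stream on a single dispatcher is to set the attribute on the complete runnable graph rather than on the source alone. The sketch below is an illustration under assumptions, not the change made in this PR: it assumes Akka 2.6 and Scala 2.13, and `Row`, the dispatcher id `demo-journal-dispatcher` and the object name are hypothetical.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorAttributes
import akka.stream.scaladsl.{ Keep, Sink, Source }
import com.typesafe.config.ConfigFactory

import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.jdk.CollectionConverters._

object InternalStreamOnOneDispatcher {
  // Hypothetical dispatcher id and row type, for illustration only.
  val JournalDispatcher = "demo-journal-dispatcher"
  final case class Row(sequenceNr: Long)

  private val config = ConfigFactory.parseString(s"""
    $JournalDispatcher {
      type = Dispatcher
      executor = "thread-pool-executor"
      thread-pool-executor.fixed-pool-size = 2
    }
    """).withFallback(ConfigFactory.load())

  /** Returns the first sequence number and the thread names seen by `map`. */
  def firstSequenceNr(): (Option[Long], Set[String]) = {
    implicit val system: ActorSystem = ActorSystem("demo", config)
    val threads = new ConcurrentLinkedQueue[String]()
    try {
      val result = Source(List(Row(42L), Row(43L)))
        .map { row => threads.add(Thread.currentThread().getName); row.sequenceNr }
        .toMat(Sink.headOption)(Keep.right)
        // attribute on the whole graph: source, map and sink share the dispatcher
        .withAttributes(ActorAttributes.dispatcher(JournalDispatcher))
        .run()
      (Await.result(result, 10.seconds), threads.asScala.toSet)
    } finally Await.result(system.terminate(), 10.seconds)
  }
}
```

Because no part of this graph asks for a different dispatcher, nothing here forces a hand-over between dispatchers. Whether the same holds for a source built with futureSource is exactly what the reviewers flag as needing manual verification.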

Contributor Author

I just did the tests @patriknw mentioned above and came back to comment this exact thing:

I did check that all instances of EventsByPersistenceIdStage run on the assigned dispatcher, though.

- extractor = Extractors.taggedPersistentRepr(eventDeserializer, serialization))
+ extractor = Extractors.taggedPersistentRepr(eventDeserializer, serialization),
+ // run the query on the journal dispatcher (not the queries dispatcher)
+ dispatcher = sessionSettings.pluginDispatcher)
  .mapAsync(1)(sendMissingTagWrite(tp, tagWrites.get))
Member

Here mapAsync and the "outer" map and runForeach run on the default dispatcher. We probably do not want that, since it is an entirely internal stream.

This is one case where the bug in the futureSource operator, which does not propagate attributes like it should, means the dispatcher needs to be set on both the complete "inner" stream and the "outer" stream.

Member

This one can be very important: it could cause a performance regression. We should fix that here or in a follow-up before releasing.

@patriknw
Member

patriknw commented Apr 20, 2021

Actually after reviewing this a bit more in detail I see how tricky it is because of the futureSource bug, setting the dispatcher needs to be repeated in lots of places.

Ok, we have an urgent need to fix this now and release. So let's follow up on the outer dispatcher things.

Created issue #886 for follow up.

Member

@johanandren johanandren left a comment

LGTM, given that we follow up on the async boundary introduction, especially for the completely internal streams (recovery etc.) that could live entirely on the same dispatcher/stream island and have now gotten an async boundary and two actors.

@patriknw
Member

I agree: for the internal usage (recovery) we have to fix it before releasing. It could cause a performance regression otherwise.

@ignasi35 ignasi35 merged commit 30e08e0 into migrate-installer-to-apt Apr 20, 2021
@ignasi35 ignasi35 deleted the pin-query-to-journal-dispatcher branch April 20, 2021 13:11
@ignasi35
Contributor Author

I'll forward port this first PR to master. #886 can then start on master or release-0.x.
