Handle events out of timestamp order #706
Comments
Guys, I'm experiencing the same issue with TagWriter: "Expected events to be ordered by seqNr". FYI, I'm using Cassandra as the journal, with the akka-persistence plugin for it.
As stated in the first sentence of this ticket, "Cassandra and this plugin rely on clocks to be in sync". This is something Cassandra itself needs as well. Clocks can never be perfectly in sync, but a skew within something like a second shouldn't be a problem. Then the question is whether it's possible to fail over a persistent actor from one node to another within such a short clock skew. @ffakenz When do you see this problem? Is it in rolling update scenarios? Do you use Cluster Sharding? Which version of Akka? Which version of the Cassandra plugin?
A friendly reminder that this is an open-source project and that kind of demand can't be made here. A more friendly tone would be appreciated. We try to do our best to help all community users, but if you need full professional support or immediate priority on critical issues you should become a Lightbend customer.
Hey Patrik, thank you so much for your reply.
So you are saying that several different actors persist events with the same tag and then some of them are not seen by the eventsByTag query? Is it stuck or are the events just not showing up? Is the eventsByTag query restarted when this happens? Would be great if you can describe some more about how to trigger the problematic scenario.
The scenario is the following:
The "Expected events to be ordered by seqNr" message also contains the persistenceId that it is having trouble with. Can you see if that corresponding actor was moved from one node to another by Cluster Sharding (rebalance)? It would be interesting if you can share the events in the The messages table has a
BTW, do you use clock synchronization of the nodes, with NTP or similar?
Patrik, thanks for the quick reply, and my apologies for the delay in my response; I wanted to create a minimal project to reproduce the bug (https://github.com/ffakenz/Example/). The problem is not occurring during a Cluster Sharding rebalance. On the other hand, we are not using any clock synchronization of the nodes, neither NTP nor anything similar. The issue happens when you have the following combination:
I was able to overcome this issue by avoiding the use of persistAll (in the provided example there is only one event, but my real production code does try to persist multiple). Looking forward to your comments, and I hope we can get some more insight into this :) Once again, thanks a lot for the help and support!
Thanks for taking the time to create an example. I will take a look at that tomorrow.
@ffakenz Thanks for a very well structured example. I have found that the problem is in your code. Two PersonActors are started with the same persistenceId. If you add a log in the constructor of PersonActor
you will see
That is also detected by the plugin with this warning:
The problem is in your entityId/shardedId in the messages. You must make sure that a given entityId always resolves to the same value in extractShardId. Easiest would be to use On a related note you should also add
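The requirement that a given entityId always resolves to the same shard can be illustrated with a small sketch. This is not Akka API: `ShardIds` and `shardIdFor` are hypothetical names, and the sketch assumes the shard id is derived from nothing but the entityId and a fixed number of shards. A function like this would be called from whatever is passed as extractShardId.

```java
// Hypothetical helper (not part of Akka): derives the shard id purely from the
// entityId, so the same entityId can never be routed to two different shards.
final class ShardIds {
    private ShardIds() {}

    static String shardIdFor(String entityId, int numberOfShards) {
        if (numberOfShards <= 0) {
            throw new IllegalArgumentException("numberOfShards must be positive");
        }
        // floorMod keeps the result in [0, numberOfShards) even when hashCode() is negative.
        return String.valueOf(Math.floorMod(entityId.hashCode(), numberOfShards));
    }
}
```

Because the result depends only on its arguments, every message that carries the same entityId lands on the same shard, which avoids starting two actors with the same persistenceId.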
Patrik, thank you so much for taking the time and the patience to run the code!
You're welcome!
Hi Akka team! I would like to leave a ping on this issue as our team has faced it lately. We are using Cassandra persistence. Oddly, there is a part of the documentation that describes how to circumvent this issue, and we have it configured to Is there any chance
Cassandra and this plugin rely on clocks to be in sync otherwise the following can happen:
T2 can be < T1
Events by tag relies on the timestamp UUID always increasing so that restarting from an offset won't miss events.
Events by tag also enforces that events for the same actor are delivered in seqNr order. This may or may not be important for processing done via events by tag.
In this scenario, both cannot be achieved. Either the events are delivered out of timeuuid order or out of seqNr order.
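The conflict can be made concrete with a small sketch. It assumes two events from the same persistent actor, the first written on a node whose clock is ahead and the second on a node whose clock is behind, so that T2 < T1. `SkewDemo` and its members are hypothetical names used only for illustration, and plain millisecond timestamps stand in for timeuuids.

```java
// Illustrative only: shows that ordering by timestamp breaks seqNr order under clock skew.
final class SkewDemo {
    // One persisted event: its sequence number and the wall-clock time (ms)
    // that the writing node would embed in the event's timeuuid.
    static final class Event {
        final long seqNr;
        final long timestampMs;

        Event(long seqNr, long timestampMs) {
            this.seqNr = seqNr;
            this.timestampMs = timestampMs;
        }
    }

    // Returns the seqNrs in the order a timestamp-ordered query would deliver them.
    static long[] seqNrsInTimestampOrder(Event[] events) {
        Event[] sorted = events.clone();
        java.util.Arrays.sort(sorted, (a, b) -> Long.compare(a.timestampMs, b.timestampMs));
        long[] result = new long[sorted.length];
        for (int i = 0; i < sorted.length; i++) {
            result[i] = sorted[i].seqNr;
        }
        return result;
    }

    // True when every value is greater than the one before it.
    static boolean isStrictlyIncreasing(long[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i] <= values[i - 1]) {
                return false;
            }
        }
        return true;
    }
}
```

With events `(seqNr 1, T1 = 2000)` and `(seqNr 2, T2 = 1500)`, timestamp order delivers seqNr 2 before seqNr 1, so only one of the two orderings can be preserved.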
Right now TagWriter will fail to persist the event, because it cannot generate a tag pid sequence nr when it sees the events out of seqNr order as they are sent from the journal.
On a single node, the UUIDs utility we use to generate type 1 UUIDs ensures that even if a clock goes backwards the time UUIDs only ever increase.
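The single-node guarantee, that generated times only ever increase even if the clock goes backwards, can be sketched as follows. This is not the UUIDs utility the plugin uses: `MonotonicTimestamps` is a hypothetical class, and it works on plain long timestamps rather than full type 1 UUIDs.

```java
// Illustrative only: a per-node generator whose output never goes backwards.
final class MonotonicTimestamps {
    // Last value handed out; starts below any real clock reading.
    private long last = Long.MIN_VALUE;

    // Returns the wall-clock reading if it is ahead of everything issued so far,
    // otherwise the previous value plus one.
    synchronized long next(long wallClock) {
        last = Math.max(wallClock, last + 1);
        return last;
    }
}
```

The guarantee is local to one generator instance. Two nodes with skewed clocks each have their own `last` value, which is why the scenario in this ticket can still occur across nodes.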
We could:
- The plugin could generate a different timeuuid for the events by tag query rather than use the one in the messages table
- writes to the tag_views table should be idempotent based on data in the messages table

This is a very rare edge case, so I'm reluctant to add a lot of complexity for it. The simplest solution would be to add a flag that allows these events to be put in the tag_views table, but then not guarantee that events are delivered in seqNr order.