Thoughts on Eventuate’s future
Eventuate is an event sourcing library with a programming model similar to that of Akka Persistence. It additionally supports lazy event replication over multiple datacenters and event collaboration with causal consistency. Eventuate has been successfully used in several distributed applications at the Red Bull Media House (RBMH) and has gained significant interest in the open source community. As we are getting closer to version 1.0, we have also started to think about the direction and feature set of a potential successor, Eventuate 2.0, codename Calliope.
Calliope will include and extend the feature set of Eventuate 1.0, but its API and implementation will be based on Akka Streams. Events will be stored in Apache Kafka for event collaboration between processes and in pluggable archives for long-term persistence. A strong focus is on ease of use and better modularity. Calliope also aims to integrate well with applications that already use plain Apache Kafka for event stream processing.
The following subsections cover some goals of Calliope in more detail. Nevertheless, they should be regarded as preliminary and incomplete and should mainly serve as a basis for further discussions. The initial system model of Calliope will be described in a separate document.
Calliope features shall be provided by several smaller modules instead of a single feature-rich core, as is currently the case for Eventuate. In particular, scalable event storage, event replication, event stream re-sequencing and de-duplication, tracking of causality metadata, causal reliable broadcast, event sourcing, event stream processing and operation-based CRDTs shall become separate modules. Users shall be able to pick only those features they need in their project. Some motivating examples for better modularity are:
- Many users only want to do event sourcing without event replication over multiple datacenters.
- Other users are primarily interested in Eventuate’s causal reliable broadcast but don’t want to use Eventuate’s event-sourcing APIs.
- Further users only want to use Eventuate’s operation-based CRDTs but without the overhead of Eventuate’s event-sourcing protocol.
At the moment, event storage order in Eventuate must be consistent with the potential causality of events. This has the advantage that consumers can directly read events in correct causal order, but it severely limits write scalability. Not all consumers need to read events in correct causal order: some may only need a per-writer order, others no order at all. This should be configurable on a per-consumer basis instead of being a system-wide default. Storage plugins shall have the option to store events in any order, and Calliope shall provide generic re-sequencers that re-order event streams based on client-generated sequence numbers and other causality metadata, if needed.
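To make the idea of a generic re-sequencer concrete, here is a minimal sketch as a plain Akka Streams stage. The `Durable` envelope and the `resequence` flow are hypothetical names, not part of any existing Eventuate or Calliope API; the sketch assumes gap-free, client-generated per-writer sequence numbers and buffers out-of-order events until they can be emitted in per-writer order.

```scala
import akka.NotUsed
import akka.stream.scaladsl.Flow

object Resequencer {
  // Hypothetical event envelope carrying a client-generated, per-writer sequence number.
  case class Durable[A](writerId: String, sequenceNr: Long, event: A)

  // Minimal sketch of a per-writer re-sequencer: out-of-order events are buffered and
  // emitted strictly in sequence-number order (assumes gap-free per-writer numbering).
  def resequence[A]: Flow[Durable[A], Durable[A], NotUsed] =
    Flow[Durable[A]].statefulMapConcat { () =>
      var next   = Map.empty[String, Long].withDefaultValue(1L)              // next expected number per writer
      var buffer = Map.empty[String, Map[Long, Durable[A]]].withDefaultValue(Map.empty[Long, Durable[A]])

      durable => {
        val w = durable.writerId
        buffer = buffer.updated(w, buffer(w) + (durable.sequenceNr -> durable))
        var emitted = Vector.empty[Durable[A]]
        // Emit all consecutive events for this writer that are now available.
        while (buffer(w).contains(next(w))) {
          emitted :+= buffer(w)(next(w))
          buffer = buffer.updated(w, buffer(w) - next(w))
          next = next.updated(w, next(w) + 1)
        }
        emitted
      }
    }
}
```

A causal re-sequencer would work analogously, releasing an event only after all of its causal predecessors have been emitted.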
One of the main reasons Eventuate doesn’t have a Kafka storage backend yet is that storage order cannot be guaranteed with the current version of Kafka (see also these tweets for some details). By giving up storage order restrictions, a Kafka integration becomes straightforward, and writes to a single event log (= Kafka topic) can even be scaled horizontally by writing to multiple topic partitions. A Kafka storage backend is also a feature that has been repeatedly requested by Eventuate users.
In addition to a Kafka storage backend, Calliope will also provide a lightweight storage backend based on LevelDB, LMDB or something comparable, in order to allow applications to scale down event storage in environments with limited resources.
The API and implementation of Calliope will be based on Akka Streams. Applications implement Calliope features by composing ready-to-use Akka Streams stages. Hence, applications will not only benefit from built-in back-pressure but also from a wider range of system integration options, for example, using Alpakka components for connectivity. Calliope will also use Akka Streams as the implementation basis for its event sourcing and event replication modules.
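As an illustration of the intended programming model, the following sketch composes an event-sourced projection as an ordinary Akka Streams graph. `DurableEvent` and `eventLogSource` are hypothetical placeholders for a Calliope API that does not exist yet; only the Akka Streams calls are real.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Sink, Source}

object StreamBasedEventSourcingSketch extends App {
  implicit val system: ActorSystem = ActorSystem("calliope-sketch")
  implicit val materializer: ActorMaterializer = ActorMaterializer()

  // Hypothetical event envelope; the real Calliope API is not defined yet.
  case class DurableEvent(payload: Any, emitterId: String, sequenceNr: Long)

  // Placeholder for a Calliope-provided source stage reading an event log.
  def eventLogSource(logId: String): Source[DurableEvent, NotUsed] =
    Source.empty // stand-in; a real stage would stream stored and live events

  // Application logic is just an ordinary Flow over event envelopes.
  val projection: Flow[DurableEvent, String, NotUsed] =
    Flow[DurableEvent].map(e => s"${e.emitterId}:${e.sequenceNr} -> ${e.payload}")

  eventLogSource("example-log").via(projection).runWith(Sink.foreach(println))
}
```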
To support Kafka as a storage backend, Reactive Kafka can be reused as a basis for producing and consuming event streams to and from Kafka. Calliope-specific stream stages additionally support, for example, the management of causality metadata in producer streams and event re-sequencing in consumer streams. Other stream stages might help applications to implement a causal reliable broadcast across multiple datacenters when used in combination with Calliope’s stream-based lazy event replication.
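The sketch below shows the kind of composition this implies, using the real akka-stream-kafka (Reactive Kafka) API for transport. The commented-out `attachCausalityMetadata` and `resequence` stages stand for hypothetical Calliope flows, and the topic name and broker address are assumptions.

```scala
import akka.actor.ActorSystem
import akka.kafka.scaladsl.{Consumer, Producer}
import akka.kafka.{ConsumerSettings, ProducerSettings, Subscriptions}
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import org.apache.kafka.clients.producer.ProducerRecord
import org.apache.kafka.common.serialization.{StringDeserializer, StringSerializer}

object KafkaTransportSketch extends App {
  implicit val system: ActorSystem = ActorSystem("calliope-kafka-sketch")
  implicit val materializer: ActorMaterializer = ActorMaterializer()

  val producerSettings =
    ProducerSettings(system, new StringSerializer, new StringSerializer)
      .withBootstrapServers("localhost:9092")

  val consumerSettings =
    ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
      .withBootstrapServers("localhost:9092")
      .withGroupId("calliope-example")

  // Producer side: application events are written to an event-log topic. A Calliope
  // stage attaching causality metadata would be inserted before the sink (hypothetical).
  Source(List("event-1", "event-2"))
    // .via(attachCausalityMetadata)   // hypothetical Calliope producer stage
    .map(event => new ProducerRecord[String, String]("example-log", event))
    .runWith(Producer.plainSink(producerSettings))

  // Consumer side: events arrive in per-partition order only; a Calliope re-sequencing
  // stage (like the resequence sketch above) would restore the order a consumer needs.
  Consumer
    .plainSource(consumerSettings, Subscriptions.topics("example-log"))
    // .via(resequence)                // hypothetical Calliope consumer stage
    .map(_.value)
    .runWith(Sink.foreach(println))
}
```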
Calliope doesn’t aim to compete with other stream processing solutions such as Spark Streaming or Kafka Streams. Calliope’s primary focus is on event sourcing and event collaboration with causal consistency up to global scale. Applications that require more advanced stream processing techniques (windowing, joining, ...) shall do that with 3rd-party stream processing solutions for which Calliope and/or other integration libraries provide connectors. We believe that Apache Kafka and the Reactive Streams specification are a good technical basis for better interoperability with other systems.
Instead of providing a proprietary, backend-storage-agnostic multi-datacenter replication solution, as in Eventuate 1.0, Calliope should reuse existing replication solutions such as Kafka Multi-Datacenter Replication. Calliope will additionally allow applications to concurrently consume events from both local and replicated topics and causally re-order them on consumer request. Calliope will also use Kafka Streams for keeping track of causality metadata snapshots for efficient recovery.
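A minimal sketch of that consumer-side merge, assuming a replication setup that mirrors a remote datacenter’s topic into the local cluster under a `dc2.` prefix; the commented-out `causalOrder` stage is again a hypothetical Calliope flow.

```scala
import akka.actor.ActorSystem
import akka.kafka.scaladsl.Consumer
import akka.kafka.{ConsumerSettings, Subscriptions}
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Sink
import org.apache.kafka.common.serialization.StringDeserializer

object MultiDcConsumerSketch extends App {
  implicit val system: ActorSystem = ActorSystem("calliope-multidc-sketch")
  implicit val materializer: ActorMaterializer = ActorMaterializer()

  val settings =
    ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
      .withBootstrapServers("localhost:9092")
      .withGroupId("calliope-multidc")

  // Local topic plus the same topic mirrored from a remote datacenter
  // (the "dc2." prefix is an assumption about the replication setup).
  val local      = Consumer.plainSource(settings, Subscriptions.topics("orders"))
  val replicated = Consumer.plainSource(settings, Subscriptions.topics("dc2.orders"))

  local
    .merge(replicated)              // interleaving across the two sources is arbitrary ...
    // .via(causalOrder)            // ... a hypothetical Calliope stage restores causal order
    .map(_.value)
    .runWith(Sink.foreach(println))
}
```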
At the moment, Eventuate doesn’t allow two or more processes to share a local event log. Instead, they must connect their local event logs to a replicated event log before being able to exchange events across process boundaries. Calliope won’t have this restriction and will allow processes to share a local event log. This can significantly reduce event replication overhead within a single datacenter, for example.
Calliope will not only allow applications to track potential causality with vector clocks but also support tracking of explicit causality, i.e. application-defined causality. This is a consequence of analyzing existing Eventuate applications, where we found that causality relationships need to be tracked only between a very limited number of events. Compared to tracking potential causality, this significantly reduces the amount of metadata that needs to be transmitted and stored with events.
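To illustrate the difference in metadata size, here are two hypothetical event envelopes: one carries a full vector clock, the other only the identifiers of the events that the application explicitly declares as causal predecessors.

```scala
// Hypothetical event envelopes, for illustration only.

// Potential causality: the vector clock has one entry per writer, so it grows with
// the number of writers in the system, and every event carries the full clock.
case class VectorClockEvent(
  payload: Any,
  vectorTimestamp: Map[String, Long]   // writerId -> logical time
)

// Explicit causality: the application declares only the events this event actually
// depends on, typically a very small set (often empty or a single id).
case class ExplicitCausalityEvent(
  payload: Any,
  eventId: String,
  dependsOn: Set[String]               // ids of explicitly declared causal predecessors
)
```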
Applications that want to continue using Eventuate’s actor API for event sourcing and event stream processing should use Calliope’s backwards compatibility module that will be implemented on top of the streaming API. This module will also provide stream converters to support interoperability between Eventuate and Calliope event logs.