AEP 2: Drop Elasticsearch as Manager API database in Apiman 3 (keep for metrics, gateway, etc) #1365
msavy
started this conversation in
Enhancement proposals
Proposal
In short, we propose to drop support for Elasticsearch as a general purpose database in Apiman, specifically the Apiman Manager API. Instead we will support only SQL/relational databases for this area.
Background
Way back at the start of Apiman, Eric and I were given a requirement that the whole platform should be able to run using Elasticsearch as a persistence layer (manager API, gateway API, gateway components, etc). This had the benefit of simplifying deployment significantly, especially for quickstart scenarios.
Back in the Elasticsearch 1.x days ES advertised itself as being usable as a general-purpose database, as long as you were willing to maintain the various mappings necessary yourself. It actually worked quite well, albeit we had to write a lot of custom code to make ES do exactly what we wanted without pulling in vast amounts of code.
However, in 2021 things are very different:
First and foremost, we've been spending an increasingly large amount of effort on supporting Elasticsearch and maintaining mappings and queries. This is because Elasticsearch has regularly shipped significant, incompatible changes between versions, including very significant changes to the way indexes and queries work. With every release, a major rework is all but guaranteed, and it consumes a very considerable amount of development time.
Elasticsearch has become increasingly specialised for metrics and search use cases, and they've dropped the claim that it works well as a general-purpose database. It's increasingly like trying to bang a square peg into a round hole.
Naturally, users want to use the latest version of Elasticsearch, but rolling forwards causes all kinds of trouble for users on older versions (usually it doesn't work, and they must manually export-import or do some kind of reindex action). This is really not a good user experience and requires maintenance of complex code in Apiman to do stuff that comes "for free" with normal databases.
For metrics things are simpler, as we don't have a complex schema arrangement. So, we can cope with that just fine.
There are some other technical concerns too. ES doesn't have proper support for transactions, and instead more complex code is required to try to work with ES to ensure a safe state is left in the case of a failure (even then, there are likely some gaps here).
In short, ES has been considerably slowing down the evolution and development of the Apiman Manager backend for a long period of time.
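To illustrate the transaction point above, here is a minimal sketch of what a relational database gives us without custom code. It uses Python's built-in sqlite3 module purely for convenience; the `apis` and `audit_log` tables and the `publish_api` helper are hypothetical and are not Apiman's actual schema or code. The idea is simply that two related writes either both land or neither does.

```python
import sqlite3


def publish_api(conn, org_id, api_name, fail_midway=False):
    """Insert an API row and its audit entry atomically (hypothetical schema)."""
    # Using the connection as a context manager commits on success and
    # rolls back if an exception escapes the block.
    with conn:
        conn.execute(
            "INSERT INTO apis (org_id, name) VALUES (?, ?)",
            (org_id, api_name),
        )
        if fail_midway:
            # Simulate a crash between the two related writes.
            raise RuntimeError("simulated failure between the two writes")
        conn.execute(
            "INSERT INTO audit_log (org_id, entry) VALUES (?, ?)",
            (org_id, f"published {api_name}"),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE apis (org_id TEXT, name TEXT)")
conn.execute("CREATE TABLE audit_log (org_id TEXT, entry TEXT)")
conn.commit()

# First call succeeds: both rows are written.
publish_api(conn, "acme", "orders")

# Second call fails midway: the half-finished insert is rolled back.
try:
    publish_api(conn, "acme", "billing", fail_midway=True)
except RuntimeError:
    pass

api_count = conn.execute("SELECT COUNT(*) FROM apis").fetchone()[0]
audit_count = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
```

After the failed call, the `billing` row is not left behind without its audit entry. With ES, getting the same guarantee means writing and maintaining compensating logic by hand.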
Why RDBMS is better for this aspect of the system
With almost ubiquitous access to docker/containers (even Windows!), it's no longer difficult to stand up a system that is composed of multiple different datastores (for example, using docker compose). And for simple demo purposes a relational database is probably just fine.
For those who aren't aware, the Apiman Manager API and the Apiman Gateway API (and other gateway components) are completely decoupled. The Manager API is essentially a fairly typical CRUD-like application with some fancy bells and whistles for handling some of our dynamic content and allowing it to run on lots of different platforms. It isn't performance critical/constrained, and is really the ideal use-case for a relational database: it's read-dominant; we do a bunch of different queries that are tabular in nature; the data is mostly structured in a fairly regimented way that is amenable to SQL storage (with modest exceptions).
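As a rough illustration of what "tabular in nature" means here, the sketch below counts published APIs per organization with a single join. The table and column names are made up for the example and do not reflect the real Manager schema; sqlite3 is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified schema and sample data for illustration only.
conn.executescript("""
CREATE TABLE organizations (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE apis (
    id INTEGER PRIMARY KEY,
    org_id TEXT REFERENCES organizations(id),
    name TEXT,
    status TEXT
);
INSERT INTO organizations VALUES ('acme', 'Acme Corp'), ('globex', 'Globex');
INSERT INTO apis (org_id, name, status) VALUES
    ('acme', 'orders', 'Published'),
    ('acme', 'billing', 'Created'),
    ('globex', 'weather', 'Published');
""")

# A typical read-dominant, tabular query: join, filter, group, sort.
rows = conn.execute("""
SELECT o.name, COUNT(a.id)
FROM organizations o
JOIN apis a ON a.org_id = o.id
WHERE a.status = 'Published'
GROUP BY o.name
ORDER BY o.name
""").fetchall()

for org_name, published in rows:
    print(f"{org_name}: {published} published API(s)")
```

Queries of this shape are a natural fit for SQL, whereas in ES they tend to need hand-maintained mappings and custom query code.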
Notably, changing this aspect of the system will NOT impact the gateway's performance (that is a separate concern).
Migration
We will ensure you can migrate where possible by using the Apiman Export-Import tool.
Making it easier to start and try Apiman with a more 'production-grade' setup
In the near future we plan to ship some Docker Compose definitions that will allow you to bootstrap a system with a setup that better approximates a production setup. We will try to bake in some best practices into those definitions to reduce the number of places that people.
This will also mean we don't need to support 'Elasticsearch for everything' any more, as we can easily ship multiple persistent stores if we decide that is a better experience (noting that you can already use a relational database for everything in Apiman if you want).
Conclusion
Thoughts?
Leave any thoughts and feedback in this thread. Thanks!