Do you have plans to support real time indexing? #10
Comments
Please note that a major Trinity update is in the works. It should be pushed to GH sometime next week, along with benchmarks comparing Lucene and Trinity. You can implement a real-time indexing scheme pretty easily, by creating an […] When whatever you use to back your real-time index source (which is a proxy of sorts to that in-memory backing store) […], you can just flush it as e.g. a Lucene or Google segment, re-create the index source collection to include that new segment, reset the in-memory index source, and atomically replace the index collection (just a pointer swap). This is just one way to do it, and if it sounds complicated, it’s because I failed to describe it properly -- it is pretty trivial in practice really.
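The flush-and-swap idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Trinity's API: `RealTimeIndex`, `Postings`, `Collection`, `add`, `flush`, and `search` are hypothetical names, a "segment" is just an immutable in-memory map rather than a Lucene or Google segment on disk, and a `std::mutex` guards the pointer swap for simplicity instead of a lock-free exchange.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; none of these are Trinity classes.
using DocId = std::uint32_t;
using Postings = std::map<std::string, std::vector<DocId>>;
using Collection = std::vector<std::shared_ptr<const Postings>>;

class RealTimeIndex {
public:
    // Index a document into the in-memory source; it is searchable right away.
    void add(DocId id, const std::vector<std::string>& terms) {
        std::lock_guard<std::mutex> lock(mu_);
        for (const auto& t : terms) mem_[t].push_back(id);
    }

    // Turn the in-memory source into a read-only segment, build a new
    // collection that includes it, and swap the collection pointer.
    void flush() {
        std::lock_guard<std::mutex> lock(mu_);
        assert(segments_ != nullptr);
        if (mem_.empty()) return;
        std::shared_ptr<const Postings> segment =
            std::make_shared<Postings>(std::move(mem_));
        auto next = std::make_shared<Collection>(*segments_);  // copy old list
        next->push_back(std::move(segment));
        segments_ = std::move(next);  // the "pointer swap"
        mem_.clear();                 // reset the in-memory source
    }

    // Search the in-memory source plus a snapshot of the segment collection.
    std::vector<DocId> search(const std::string& term) const {
        std::vector<DocId> hits;
        std::shared_ptr<const Collection> snapshot;
        {
            std::lock_guard<std::mutex> lock(mu_);
            snapshot = segments_;
            append(hits, mem_, term);
        }
        // Segments are immutable, so they can be searched without the lock.
        for (const auto& seg : *snapshot) append(hits, *seg, term);
        return hits;
    }

    std::size_t segment_count() const {
        std::lock_guard<std::mutex> lock(mu_);
        return segments_->size();
    }

private:
    static void append(std::vector<DocId>& out, const Postings& p,
                       const std::string& term) {
        auto it = p.find(term);
        if (it != p.end())
            out.insert(out.end(), it->second.begin(), it->second.end());
    }

    mutable std::mutex mu_;
    Postings mem_;  // mutable in-memory source
    std::shared_ptr<const Collection> segments_ = std::make_shared<Collection>();
};
```

A search that started before a flush keeps its old snapshot alive through the `shared_ptr`, so the old collection is only freed once the last reader is done with it.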
Real-time indexing requires concurrent access to SegmentIndexSource, since updates and retrieval happen at the same time. Additionally, a document should be findable immediately after it has been inserted, which means the so-called
You shouldn't really use a SegmentIndexSource; that is for read-only segments. Instead, you should subclass IndexSource and create your own. I should probably bundle a simple implementation as an example of how this could work. If you can wait for a while until I get this new major release into shape and push it to GH, I'll add a reference implementation of such an IndexSource.
@yingfeng I am sorry, it has taken longer than I expected to find some free time for those examples -- I am still working on adding more features (a major release was pushed to GH some days ago). I will get to those examples soon after that.
It's not that difficult to support such a feature: providing two in-memory segments is enough.
When one in-memory segment is full, flush it to disk while the other in-memory segment keeps accepting ingested data. It requires a lock-less design to support higher concurrency, which is not that complicated using std::atomic semantics.
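To make the two-segment idea concrete, here is a minimal sketch of the handoff using `std::atomic`. It rests on simplifying assumptions that are not part of the original suggestion: exactly one ingest thread and one flusher thread, a "segment" that is just a vector of document IDs, and a "disk" that is just another vector. `DoubleBuffer`, `try_add`, and `try_flush` are hypothetical names, and searching the in-memory segments is left out entirely.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: one ingest thread calls try_add(), one flusher
// thread calls try_flush(). Not safe for multiple writers or flushers.
class DoubleBuffer {
public:
    using Segment = std::vector<std::uint32_t>;

    explicit DoubleBuffer(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Ingest thread only. Returns false when the active segment is full
    // and the other one has not been flushed yet.
    bool try_add(std::uint32_t doc) {
        if (segments_[active_].size() == capacity_) {
            // Acquire pairs with the flusher's release store, so the cleared
            // state of the other segment is visible before we reuse it.
            if (pending_.load(std::memory_order_acquire) != kNone) return false;
            // Release publishes the full segment's contents to the flusher.
            pending_.store(active_, std::memory_order_release);
            active_ = 1 - active_;
        }
        segments_[active_].push_back(doc);
        return true;
    }

    // Flusher thread only. Returns true if a full segment was flushed.
    bool try_flush(std::vector<Segment>& disk) {
        const int idx = pending_.load(std::memory_order_acquire);
        if (idx == kNone) return false;
        disk.push_back(segments_[idx]);  // stand-in for writing a segment file
        segments_[idx].clear();
        pending_.store(kNone, std::memory_order_release);  // hand it back
        return true;
    }

private:
    static constexpr int kNone = -1;
    const std::size_t capacity_;
    std::array<Segment, 2> segments_;
    int active_ = 0;                   // touched by the ingest thread only
    std::atomic<int> pending_{kNone};  // index of the segment awaiting flush
};
```

The only shared state is `pending_`; each segment is owned by exactly one thread at a time, which is what lets the handoff work without a lock. Supporting several concurrent writers, or letting searches read the active segment while it is being filled, would need more than this.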