
Store records in off-heap memory #528

Open
JohannesLichtenberger opened this issue Sep 5, 2022 · 16 comments

Comments

@JohannesLichtenberger (Member)

In order to keep GC work to a minimum (even with low-latency garbage collectors such as Shenandoah), we could try to store the records/nodes off-heap, for instance using the foreign memory API, and compare performance.
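For illustration, a minimal sketch (not SirixDB code) of storing a serialized record off-heap with the Java Foreign Function & Memory API (`java.lang.foreign`, finalized in JDK 21+); the record contents are assumed:

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class OffHeapRecordExample {
  public static void main(String[] args) {
    byte[] serializedRecord = "some serialized node".getBytes();

    // Confined arena: the memory is freed deterministically when the arena
    // closes, so the serialized record never contributes to GC pressure.
    try (Arena arena = Arena.ofConfined()) {
      MemorySegment segment = arena.allocate(serializedRecord.length);
      MemorySegment.copy(serializedRecord, 0, segment, ValueLayout.JAVA_BYTE, 0,
          serializedRecord.length);

      // Read a single byte back directly from off-heap memory.
      byte first = segment.get(ValueLayout.JAVA_BYTE, 0);
      System.out.println("first byte: " + first);
    } // off-heap memory released here
  }
}
```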

@JohannesLichtenberger (Member Author) commented Nov 14, 2022

We should probably store the trx intent log in a Chronicle Map instead of a simple HashMap.
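A minimal sketch of what that could look like; the key/value types, entry count, and average value size are assumptions for illustration, not SirixDB's actual intent-log layout:

```java
import net.openhft.chronicle.map.ChronicleMap;

final class IntentLogExample {
  public static void main(String[] args) {
    // Off-heap map from page key to serialized page fragment, replacing an
    // on-heap HashMap so the log entries don't add to GC pressure.
    try (ChronicleMap<Long, byte[]> intentLog = ChronicleMap
        .of(Long.class, byte[].class)
        .name("trx-intent-log")
        .entries(1_000_000)        // assumed number of modified pages per trx
        .averageValueSize(4096)    // assumed average serialized fragment size
        .create()) {
      intentLog.put(42L, new byte[]{1, 2, 3});
      byte[] fragment = intentLog.get(42L);
      System.out.println(fragment.length);
    }
  }
}
```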

@abhinax4991 commented Apr 30, 2023

Hey @JohannesLichtenberger, good day! I would like to work on this; can you please assign it to me?

@JohannesLichtenberger (Member Author) commented May 2, 2023

@abhin-dynamify I wonder whether serialization and deserialization of pages (and page fragments) will be an issue, and whether it ends up faster or even slower. The same goes for the Caffeine caches (the lightweight buffer manager)... that said, with the Caffeine caches I've seen severe GC-related performance issues when the maximum sizes are too large (especially with ZGC, possibly because it is not yet generational).
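As an aside, one way to keep a Caffeine cache's heap footprint bounded is to cap it by weight rather than by a large entry count; a hedged sketch, with the key/value types and the weight cap being assumptions rather than SirixDB's actual buffer-manager configuration:

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

final class PageCacheExample {
  public static void main(String[] args) {
    Cache<Long, byte[]> pageCache = Caffeine.newBuilder()
        .maximumWeight(256L * 1024 * 1024)                    // cap at ~256 MiB of cached pages
        .weigher((Long key, byte[] page) -> page.length)      // weigh entries by serialized size
        .recordStats()
        .build();

    pageCache.put(1L, new byte[4096]);
    System.out.println(pageCache.stats());
  }
}
```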

@JohannesLichtenberger (Member Author)

We can try and compare performance in a separate branch 👍

@JohannesLichtenberger (Member Author)

@abhin-dynamify I hope it makes sense. Are you working on this?

@abhinax4991

> @abhin-dynamify I hope it makes sense. Are you working on this?

Yeah, I am working on it.

@JohannesLichtenberger (Member Author)

@abhin-dynamify did you have time to look into this?

@Sung-Heon

@JohannesLichtenberger Can I try this?

@JohannesLichtenberger (Member Author)

In the KeyValueLeafPage we store the slots as an array of byte arrays (byte[][]). Maybe we could instead use a single MemorySegment, which may have to grow.

@JohannesLichtenberger (Member Author)

@Sung-Heon still interested?

@Sung-Heon

Yes~!

@JohannesLichtenberger (Member Author)

We currently have way too high an allocation rate, I think (2.7 GB/s with a single read-only trx). Can you try to replace the slots byte[][] array in KeyValueLeafPage with a single MemorySegment?

@JohannesLichtenberger (Member Author)

We probably need an indirection array with offsets/lengths at the start of the page, though. Furthermore, the pages are variable-sized, so the MemorySegment might have to be replaced with a bigger one, copying all data. Also, variable-sized data such as Strings might be reassigned and end up bigger than before... it's getting a bit tricky.
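A rough sketch of that layout (a hypothetical class, not the actual KeyValueLeafPage, with an assumed fixed slot count): a (offset, length) indirection table of ints at the start of a single MemorySegment, slot data appended after it, and reallocation with copying when the segment runs out of space.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class SlottedPageSketch {
  private static final int MAX_SLOTS = 1024;
  private static final int HEADER_BYTES = MAX_SLOTS * 2 * Integer.BYTES; // offset + length per slot

  private final Arena arena = Arena.ofShared();
  private MemorySegment segment = arena.allocate(HEADER_BYTES + 4096);
  private long freeOffset = HEADER_BYTES;

  void setSlot(int slotNumber, byte[] data) {
    if (freeOffset + data.length > segment.byteSize()) {
      grow(freeOffset + data.length);
    }
    MemorySegment.copy(data, 0, segment, ValueLayout.JAVA_BYTE, freeOffset, data.length);
    segment.setAtIndex(ValueLayout.JAVA_INT, slotNumber * 2L, (int) freeOffset);
    segment.setAtIndex(ValueLayout.JAVA_INT, slotNumber * 2L + 1, data.length);
    freeOffset += data.length;
    // Rewriting a slot with larger data simply appends and leaves a hole;
    // a real implementation would need compaction or a free-space map.
  }

  byte[] getSlot(int slotNumber) {
    int offset = segment.getAtIndex(ValueLayout.JAVA_INT, slotNumber * 2L);
    int length = segment.getAtIndex(ValueLayout.JAVA_INT, slotNumber * 2L + 1);
    return segment.asSlice(offset, length).toArray(ValueLayout.JAVA_BYTE);
  }

  private void grow(long required) {
    long newSize = Math.max(segment.byteSize() * 2, required);
    MemorySegment bigger = arena.allocate(newSize);
    MemorySegment.copy(segment, 0, bigger, 0, segment.byteSize());
    // Note: the old segment stays allocated until the arena closes; a real
    // implementation would have to manage the lifetime of replaced segments.
    segment = bigger;
  }
}
```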

@JohannesLichtenberger (Member Author)

You can probably join the Discord channel.

@JohannesLichtenberger (Member Author) commented Aug 20, 2024

@Sung-Heon do you have experience with this, or with database architecture in general? It's of utmost importance to make this change in order to reduce allocation pressure; otherwise I'd assign it to myself.

The records should perhaps be backed by MemorySegments, too. Once a new record/node is created, it should be serialized to its backing MemorySegment, which in turn should be set as the slot data within the large page MemorySegment. We can probably also read most things directly from the MemorySegment...
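A hedged sketch of that idea (the record layout and slot offset are made up for illustration, not SirixDB's actual node format): write a record's fields straight into a slot of the page MemorySegment and read a field back without materializing an intermediate byte[].

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class RecordInSegmentSketch {
  public static void main(String[] args) {
    try (Arena arena = Arena.ofConfined()) {
      MemorySegment pageSegment = arena.allocate(4096);
      long slotOffset = 128; // assumed offset taken from the page's indirection table

      // "Serialize" a record: a node key (long) followed by a parent key (long).
      pageSegment.set(ValueLayout.JAVA_LONG, slotOffset, 42L);
      pageSegment.set(ValueLayout.JAVA_LONG, slotOffset + Long.BYTES, 7L);

      // Read a field directly from off-heap memory, no intermediate byte[].
      long nodeKey = pageSegment.get(ValueLayout.JAVA_LONG, slotOffset);
      System.out.println("nodeKey: " + nodeKey);
    }
  }
}
```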
