Parallel output in tpchgen-cli (Nx faster, where N is number of cores) #58
OK, I need to stop for today, but this is looking promising.
I plan to merge this PR tomorrow unless anyone else would like time to comment on it.
I will take a look at it tonight! Thanks @alamb
LGTM; awesome stuff !!!
Co-authored-by: @clflushopt <172141496+clflushopt@users.noreply.github.com>
BTW, with the other improvements on main, after merging it takes less than 5 seconds on my laptop to generate TPC-H SF 10 🚀
time target/release/tpchgen-cli -s 10 --output-dir=/tmp/tpchdbgen-rs
real 0m4.895s
user 0m36.096s
sys 0m2.910s
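One way to read the `time` output above is to divide total CPU time (user + sys) by wall-clock time (real), which gives the average number of cores kept busy during the run. The helper below is only an illustrative sketch; `average_busy_cores` is a hypothetical name and is not part of tpchgen-cli.

```rust
/// Average number of busy cores implied by `time` output:
/// (user + sys) CPU seconds divided by wall-clock seconds.
/// Hypothetical helper for illustration, not part of tpchgen-cli.
fn average_busy_cores(real_s: f64, user_s: f64, sys_s: f64) -> f64 {
    (user_s + sys_s) / real_s
}
```

With the numbers above (real 4.895s, user 36.096s, sys 2.910s) this comes out to roughly 8 cores busy on average over the run.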
Features:
-v and RUST_LOG
Performance 🌶️
In single-threaded mode, the generator produces about 250MB/sec per core, which works out to 4GB/sec across all 16 cores of my laptop
This scales linearly with
Here are performance measurements for SF=100 on my Mac M3 (with 16 cores):
time target/release/tpchgen-cli -s 100 --output-dir=/tmp/tpchdbgen-rs
It turns out this can fully saturate the disk bandwidth on my Mac M3 laptop, which caps out at 2GB/sec
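The interaction between generator speed and disk speed can be written down as a simple bottleneck model: the achievable rate is the smaller of the aggregate generation rate and the disk's write limit. This is a back-of-the-envelope sketch that assumes perfectly linear scaling across cores and 1GB = 1000MB; `effective_mb_per_sec` is a hypothetical helper, not something in the PR.

```rust
/// Rough bottleneck model: output rate is limited by whichever is slower,
/// the combined generators or the disk.
/// Assumes linear scaling across cores (an idealization).
/// Hypothetical helper for illustration, not part of tpchgen-cli.
fn effective_mb_per_sec(per_core_mb_s: f64, cores: u32, disk_mb_s: f64) -> f64 {
    let generated = per_core_mb_s * f64::from(cores);
    generated.min(disk_mb_s)
}
```

Plugging in 250MB/sec per core, 16 cores, and a 2000MB/sec disk gives 2000MB/sec, so under these assumptions the disk is the limit rather than the generator.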
If I hard-code the generator to throw the data away rather than writing it, it keeps all the cores busy and generates data at a blistering 4GB/sec
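For readers who want to try a similar "generate and discard" experiment, here is a minimal sketch using only the Rust standard library. It is not the code from this PR: `generate_part` is a hypothetical stand-in for a real table generator, and the row format is made up. Each partition runs on its own scoped thread and writes into `std::io::sink()`, so only generation cost is exercised.

```rust
use std::io::{self, Write};
use std::thread;

/// Hypothetical stand-in for a table generator: writes `rows` pipe-delimited
/// lines for partition `part` into `out` and returns the bytes written.
fn generate_part(part: usize, rows: usize, out: &mut impl Write) -> io::Result<u64> {
    let mut bytes = 0u64;
    for i in 0..rows {
        let line = format!("{}|row-{}|\n", part, i);
        out.write_all(line.as_bytes())?;
        bytes += line.len() as u64;
    }
    Ok(bytes)
}

/// Generates `parts` partitions on separate threads, discarding the output,
/// and returns the total number of bytes produced.
fn generate_discarding(parts: usize, rows_per_part: usize) -> u64 {
    thread::scope(|s| {
        // Spawn one worker per partition; each writes to its own sink.
        let handles: Vec<_> = (0..parts)
            .map(|p| {
                s.spawn(move || {
                    generate_part(p, rows_per_part, &mut io::sink())
                        .expect("writing to a sink does not fail")
                })
            })
            .collect();
        // Wait for all workers and add up their byte counts.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}
```

Timing a call to `generate_discarding` with `std::time::Instant` and dividing the returned byte count by the elapsed seconds gives a throughput figure that excludes disk I/O, which is the kind of measurement described above.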