(POC): Remove some hashmap lookups in distribution lookup #59
Thanks @scsmithr -- this is looking great. Let me know when it is ready for a review
@@ -8,7 +8,6 @@ authors = ["clflushopt", "alamb"]
 # See ../ARCHITECTURE.md for more details
 [dependencies]
 chrono = "0.4.40"
-indexmap = "2.7.1"
👍
adverbs: remove_dist("adverbs")?,
nouns: remove_dist("nouns")?,
verbs: remove_dist("verbs")?,
auxiliaries: remove_dist("auxillaries")?, // P.S: The correct spelling is `auxiliaries` which is what we use.
😆
I started going through this PR -- the idea is really nice. I will try to bash out the conversion for the remaining distributions
Yeah, my bad for the slowness here. Feel free to take it over. My only real concern is that we lose flexibility in which distributions we can use. But that's my programmer brain talking -- in reality we only care about that one distribution file.
No worries -- this is a great step. I also have some ideas about how to avoid hash maps entirely, but I'll potentially do that as a follow-on PR. Right now Copilot and I are going to apply your pattern to all the remaining distributions :)
Yeah, I agree -- the TPCH data generator hasn't changed for at least 14 years. I don't think flexibility is particularly important
Partially addresses #56
Also removes IndexMap
WIP. The timings are down, but there may be alternative approaches here that I want to explore.
This approach also removes some flexibility in what we allow in a distribution, but I don't think that matters for this.
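For readers skimming the thread, here is a minimal sketch of the pattern as I understand it from the review snippet: parse into a map once, then move each expected entry into a named struct field so later lookups are plain field accesses. The `Distribution` and `Distributions` types and the `try_from_map` function are hypothetical stand-ins, not the actual code in this PR; only the `remove_dist("...")?` shape comes from the diff.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parsed distribution: (value, weight) pairs.
#[derive(Debug, Clone, PartialEq)]
struct Distribution {
    entries: Vec<(String, i32)>,
}

/// Distributions held as struct fields, so callers read `dists.nouns`
/// instead of hashing a name on every lookup.
#[derive(Debug)]
struct Distributions {
    adverbs: Distribution,
    nouns: Distribution,
    verbs: Distribution,
}

impl Distributions {
    /// Consumes the map built while parsing and moves each expected entry
    /// into its field. A missing name is reported as an error.
    fn try_from_map(mut parsed: HashMap<String, Distribution>) -> Result<Self, String> {
        let mut remove_dist = |name: &str| {
            parsed
                .remove(name)
                .ok_or_else(|| format!("missing distribution: {name}"))
        };
        Ok(Self {
            adverbs: remove_dist("adverbs")?,
            nouns: remove_dist("nouns")?,
            verbs: remove_dist("verbs")?,
        })
    }
}
```

The trade-off is the one noted above: the set of distributions is fixed at compile time, so a distribution file with extra or renamed entries would need a code change.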
Timings
Run at a tiny scale factor (SF) to capture the initialization time, not actual data generation.
main: 2.851s, 2.694s, 2.764s
this branch: 1.860s, 1.867s, 1.864s