Id assignment perf #186
Codecov Report
@@ Coverage Diff @@
##           master    #186    +/-  ##
==========================================
+ Coverage   89.91%   90.41%   +0.49%
==========================================
  Files         125      124      -1
  Lines        5812     5864     +52
==========================================
+ Hits         5226     5302     +76
+ Misses        586      562     -24
Looks good to me. I had a go at using You can make it slightly faster (1.25ms to 1.15ms) with
08b43dd to 7c1946c
The deepCopy() improvements reduce the copy of 200 objects plus common definitions from ~2ms (after the ID changes) to about 500µs, which is probably good enough for most of our use cases.
The downside is that there are now two paths for adding things to a document (via add() or via deepCopy()), and both need to be maintained if we add any more elements or parameters. I think that's reasonable, as this should only happen rarely (when the standard changes).
The variant unpacking is a bit messy. I thought of a couple of alternatives:
- Write a visitor: still messy, as you'd either need it to be a friend of all the element attorneys and Document, or do something nasty by passing internals around by reference.
- Write a different copyElements function that returns a struct with element vectors rather than variants, or takes references to the document element vectors: adds more boilerplate than just having a big conditional (as you still need the original copyElement interface for doing deepCopyTo), and benchmarked performance was identical.
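For readers unfamiliar with the trade-off, here is a rough sketch of what "a big conditional" over a variant can look like. It is not the code from this PR: `AudioObject` and `AudioContent` are simplified stand-in structs, and `ElementVariant`, `ElementVectors` and `unpack` are hypothetical names chosen for illustration. It assumes C++17 for `std::variant`.

```cpp
#include <memory>
#include <variant>
#include <vector>

// Simplified stand-ins for two element types (assumption, not the real classes).
struct AudioObject { int id; };
struct AudioContent { int id; };

// Hypothetical variant holding a pointer to any supported element type.
using ElementVariant = std::variant<std::shared_ptr<AudioObject>,
                                    std::shared_ptr<AudioContent>>;

// Hypothetical per-type containers, loosely mirroring a document's element vectors.
struct ElementVectors {
  std::vector<std::shared_ptr<AudioObject>> objects;
  std::vector<std::shared_ptr<AudioContent>> contents;
};

// The "big conditional": test each alternative in turn and push the held
// pointer into the matching container. One branch is needed per element type.
void unpack(ElementVariant const& element, ElementVectors& out) {
  if (auto object = std::get_if<std::shared_ptr<AudioObject>>(&element)) {
    out.objects.push_back(*object);
  } else if (auto content =
                 std::get_if<std::shared_ptr<AudioContent>>(&element)) {
    out.contents.push_back(*content);
  }
}
```

Because everything lives in one function that already has access to the containers, no friend declarations are needed, which is the advantage over the visitor alternative described above.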
Skips all the checks for ID clashes by simply copying the elements, assigning the new document as parent, and pushing them into the relevant containers. This should be fine, as the checks have already been done by the original document and we're starting from nothing.
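The difference between the checked and unchecked paths can be sketched as follows. This is a toy model under stated assumptions, not the library's implementation: `Element` and `Document` here are hypothetical, heavily simplified types with a single element container and integer IDs.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical, much simplified element: an ID plus a back-pointer to its owner.
struct Element {
  explicit Element(std::uint32_t elementId) : id(elementId) {}
  std::uint32_t id;
  const void* parent = nullptr;
};

// Hypothetical, much simplified document with one element container.
class Document {
 public:
  // Checked path: scan for an ID clash before every insertion.
  void add(std::shared_ptr<Element> element) {
    for (auto const& existing : elements_) {
      if (existing->id == element->id) {
        throw std::runtime_error("duplicate id");
      }
    }
    element->parent = this;
    elements_.push_back(std::move(element));
  }

  // Unchecked path: the source was validated by add() and the target starts
  // empty, so each element is cloned, re-parented and pushed without a scan.
  std::shared_ptr<Document> deepCopy() const {
    auto copy = std::make_shared<Document>();
    copy->elements_.reserve(elements_.size());
    for (auto const& element : elements_) {
      auto cloned = std::make_shared<Element>(*element);
      cloned->parent = copy.get();
      copy->elements_.push_back(std::move(cloned));
    }
    return copy;
  }

  std::vector<std::shared_ptr<Element>> const& elements() const {
    return elements_;
  }

 private:
  std::vector<std::shared_ptr<Element>> elements_;
};
```

In this toy model the checked path costs a scan per insertion, while the copy is a single pass over the elements. The price is that any invariant enforced in `add()` must also hold by construction in `deepCopy()`.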
Speeds up the deepCopy() and common definitions copy benchmarks by about 30% here
Uses the kitchen sink XML that features all ADM elements to generate a document, then deep copies that document and checks that the XML written from the document and from its copy match (they may differ from the original).
7c1946c to d752080
ID assignment was pretty slow for large documents
On my machine, adding 200 simple objects to a document took ~70ms, primarily due to the repeated document->lookup() calls. This adds a slightly more complex search for a free id, which reduces the cost of adding 200 objects to about 1.5ms.
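As an illustration of the general idea (not the search implemented in this PR), one way to avoid a lookup per candidate ID is to gather the ID values already in use, sort them once, and walk forward to the first gap. `firstFreeId` and the plain integer ID values are hypothetical simplifications of the library's ID types.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the smallest value >= start that does not
// appear in usedIds. Takes the vector by value so it can sort its own copy.
// Overflow at the top of the ID range is not handled in this sketch.
std::uint32_t firstFreeId(std::vector<std::uint32_t> usedIds,
                          std::uint32_t start) {
  std::sort(usedIds.begin(), usedIds.end());
  std::uint32_t candidate = start;
  // Jump to the first used value that is not below start, then advance the
  // candidate past each consecutive used value until a gap appears.
  auto it = std::lower_bound(usedIds.begin(), usedIds.end(), start);
  for (; it != usedIds.end(); ++it) {
    if (*it == candidate) {
      ++candidate;  // candidate is taken, try the next value
    } else if (*it > candidate) {
      break;  // gap found: candidate is free
    }
    // *it < candidate only for duplicate entries, which are skipped
  }
  return candidate;
}
```

For example, with used values `{0x1001, 0x1002, 0x1004}` and a start of `0x1001`, this returns `0x1003`. The cost is one sort plus one linear pass per assignment, rather than one document-wide lookup per candidate.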
A 64-object document (more realistic) went from 2ms to 200µs, which makes it more reasonable in an SADM context.
This should improve copying of medium/large docs (unless they're primarily common defs, which skipped the lookup anyway)
Probably slower for small documents, but it should still be fast, as it is then operating on small vectors.
Could improve further by keeping element vectors sorted on id, but that's a more intrusive change that might affect behaviour as elements would always get written out sorted by ID. (flat_map from boost or C++23 would work, or writing a less generic equivalent isn't too tricky)
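A "less generic equivalent" could be as small as the following sketch, which keeps a vector ordered on insertion using a binary search. `insertSorted` is a hypothetical name, and plain integers stand in for elements keyed by ID.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: inserts id into an already sorted vector, keeping it
// sorted. Returns false (and leaves the vector unchanged) if id is present.
bool insertSorted(std::vector<std::uint32_t>& ids, std::uint32_t id) {
  // Binary search for the first entry that is not less than id.
  auto pos = std::lower_bound(ids.begin(), ids.end(), id);
  if (pos != ids.end() && *pos == id) {
    return false;  // clash detected in O(log n) comparisons
  }
  ids.insert(pos, id);  // still O(n) element moves in the worst case
  return true;
}
```

The clash check becomes logarithmic, but iteration order now follows ID order rather than insertion order, which is the behaviour change noted above.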