
Fix excessive memory usage in bulk_load due to Vec over-capacity#220

Merged
urschrei merged 3 commits into georust:master from NathanHowell:fix/bulk-load-overcapacity
Feb 25, 2026

Conversation

@NathanHowell
Contributor

@NathanHowell NathanHowell commented Feb 25, 2026

I came across rust-lang/rust-clippy#15753 while investigating excessive heap usage in one of my applications. The application was using approximately 40 GB of RSS to process a 3 GB OSM extract.

The OMT bulk loading algorithm employs Vec::split_off to partition elements into slabs. However, split_off leaves the original Vec with its full capacity even though it now holds only slab_size elements. Consequently, the first slab at each partitioning level inherits this over-capacity, and it propagates through the recursive partitioning process.
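The capacity-retention behavior of split_off is easy to observe in isolation. A minimal sketch, independent of rstar's internals:

```rust
fn main() {
    // Simulate an upstream buffer with a large allocation.
    let mut v: Vec<u64> = Vec::with_capacity(1_000_000);
    v.extend(0..1_000_000);

    // Keep only a small "slab" in the original Vec.
    let rest = v.split_off(1_000);

    // `v` now holds 1_000 elements but still owns the original allocation.
    assert_eq!(v.len(), 1_000);
    assert!(v.capacity() >= 1_000_000);

    // shrink_to_fit releases the excess capacity.
    v.shrink_to_fit();
    assert!(v.capacity() < 1_000_000);

    drop(rest);
}
```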

At the leaf level, Rust's in-place collect optimization (triggered when size_of::<T>() == size_of::<RTreeNode<T>>()) reuses the allocation, permanently storing the over-sized buffer in the final tree node. For large inputs (e.g. 18.6M elements) this produces a geometric pattern of halving-size, doubling-count allocations, wasting approximately 20 GB of memory.
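The in-place collect behavior can also be demonstrated standalone. The mapped type here is u64 rather than rstar's RTreeNode, but the size-equality condition that triggers the optimization is the same (a sketch, not rstar code):

```rust
fn main() {
    // An over-allocated source Vec, as produced by the partitioning above.
    let mut v: Vec<u64> = Vec::with_capacity(1_000_000);
    v.extend(0..10u64);

    // Source and destination element types have the same size, so
    // collect() reuses the source allocation in place...
    let w: Vec<u64> = v.into_iter().map(|x| x + 1).collect();

    // ...and the over-sized capacity survives into the result.
    assert_eq!(w.len(), 10);
    assert!(w.capacity() >= 1_000_000);
}
```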

Fixed by calling shrink_to_fit() in two places:

  • ClusterGroupIterator: after split_off, before returning the slab

  • bulk_load_recursive: at the leaf level, before the into_iter().collect()

  • I agree to follow the project's code of conduct.

  • I added an entry to rstar/CHANGELOG.md if knowledge of this change could be valuable to users.

Copy link
Member

@urschrei urschrei left a comment


Thank you! This is a good catch and looks like a solid PR – the 2x headroom allocation seems reasonable, and shrink_to_fit doesn't have any perf impact as far as I can see. I'm happy. Looks like you've got a formatting error, but that's a trivial fix.

@NathanHowell
Contributor Author

@urschrei k, should be clean now.

btw the rstar-demo crate fails to build:

$ cargo build -p rstar-demo
   Compiling parry3d v0.25.3
error[E0308]: mismatched types
   --> /Users/nhowell/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/parry3d-0.25.3/src/transformation/mesh_intersection/mesh_intersection.rs:537:38
    |
537 |     match point_set.nearest_neighbor(&point_to_insert) {
    |                     ---------------- ^^^^^^^^^^^^^^^^ expected `TreePoint`, found `&TreePoint`
    |                     |
    |                     arguments to this method are incorrect
    |
note: method defined here
   --> /Users/nhowell/source/georust/rstar/rstar/src/rtree.rs:940:12
    |
940 |     pub fn nearest_neighbor(&self, query_point: <T::Envelope as Envelope>::Point) -> Option<&T> {
    |            ^^^^^^^^^^^^^^^^
help: consider removing the borrow
    |
537 -     match point_set.nearest_neighbor(&point_to_insert) {
537 +     match point_set.nearest_neighbor(point_to_insert) {
    |

For more information about this error, try `rustc --explain E0308`.
error: could not compile `parry3d` (lib) due to 1 previous error

Instead of split_off + mem::replace + shrink_to_fit, invert the
partition so slab elements end up at the tail and drain them into
a new Vec with exact capacity. This eliminates the need for
shrink_to_fit on every non-last slab while keeping the code simpler.
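A minimal sketch of that drain-from-the-tail idea (a hypothetical helper, not rstar's actual code): once partitioning leaves the slab's elements at the tail, draining them into a fresh Vec yields exact capacity, because Drain is an exact-size iterator and collect() allocates precisely that many elements.

```rust
/// Move the last `slab_size` elements into a new Vec with exact capacity,
/// leaving `elements` (and its allocation) to be reused for later slabs.
fn take_slab<T>(elements: &mut Vec<T>, slab_size: usize) -> Vec<T> {
    let start = elements.len() - slab_size;
    // Drain reports an exact length, so collect() allocates exactly
    // `slab_size` slots for the new Vec -- no over-capacity to shrink.
    elements.drain(start..).collect()
}

fn main() {
    let mut elements: Vec<u32> = (0..100).collect();
    let slab = take_slab(&mut elements, 10);
    assert_eq!(slab, (90..100).collect::<Vec<u32>>());
    assert_eq!(slab.capacity(), 10);
    assert_eq!(elements.len(), 90);
}
```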
@NathanHowell NathanHowell force-pushed the fix/bulk-load-overcapacity branch from f45d858 to e258ca7 on February 25, 2026 at 19:23
@urschrei urschrei added this pull request to the merge queue Feb 25, 2026
Merged via the queue into georust:master with commit f93071f Feb 25, 2026
6 checks passed
@NathanHowell NathanHowell deleted the fix/bulk-load-overcapacity branch February 25, 2026 19:50
