Very high memory usage with large sets #26

e271828- · 2021-05-30T05:34:37Z

Hello, while importing a 115MB list of CIDRs into a pysubnettree set, I noticed that the final memory structure was ~10x the size of the list of strings.

This is counterintuitive: I'd have expected it to be radically smaller. What am I missing?

rsmmr · 2021-05-31T07:42:53Z

Well, I'd expect some overhead for sure, it's putting it all items into a tree structure that needs management. I'm not sure how much overhead I would expect honestly but I can see it being substantial given that the original string representation is about the most compact way possible. Either way, there's not much we can do about it I'm afraid if it's really just coming out of the underlying patricia tree. It could be a memory leak instead, then it'd be a bug obviously; do you know if everything gets cleaned up ok when the data structure is destroyed?

e271828- · 2021-07-10T12:56:33Z

Found another solution, so can't report back on cleanup. However, not sure I agree with you on string representation: an IPv4 address is a 32 bit number. str('123.123.123.123') = 15*8 = 120 bits. There were likely some adjacent ranges in my large set. Thus I would have expected a final memory size of 1/2 or so even with 2x overhead.

Was also assuming space-saving extensions for adjacencies occurred as a side effect of the trie at add time, but that does not appear to be the case. Probably something like an OET would be a better fit for larger lists.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Very high memory usage with large sets #26

Very high memory usage with large sets #26

e271828- commented May 30, 2021

rsmmr commented May 31, 2021

e271828- commented Jul 10, 2021 •

edited

Loading

Very high memory usage with large sets #26

Very high memory usage with large sets #26

Comments

e271828- commented May 30, 2021

rsmmr commented May 31, 2021

e271828- commented Jul 10, 2021 • edited Loading

e271828- commented Jul 10, 2021 •

edited

Loading