Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very high memory usage with large sets #26

Open
e271828- opened this issue May 30, 2021 · 2 comments
Open

Very high memory usage with large sets #26

e271828- opened this issue May 30, 2021 · 2 comments

Comments

@e271828-
Copy link

Hello, while importing a 115MB list of CIDRs into a pysubnettree set, I noticed that the final memory structure was ~10x the size of the list of strings.

This is counterintuitive: I'd have expected it to be radically smaller. What am I missing?

@rsmmr
Copy link
Member

rsmmr commented May 31, 2021

Well, I'd expect some overhead for sure, it's putting it all items into a tree structure that needs management. I'm not sure how much overhead I would expect honestly but I can see it being substantial given that the original string representation is about the most compact way possible. Either way, there's not much we can do about it I'm afraid if it's really just coming out of the underlying patricia tree. It could be a memory leak instead, then it'd be a bug obviously; do you know if everything gets cleaned up ok when the data structure is destroyed?

@e271828-
Copy link
Author

e271828- commented Jul 10, 2021

Found another solution, so can't report back on cleanup. However, not sure I agree with you on string representation: an IPv4 address is a 32 bit number. str('123.123.123.123') = 15*8 = 120 bits. There were likely some adjacent ranges in my large set. Thus I would have expected a final memory size of 1/2 or so even with 2x overhead.

Was also assuming space-saving extensions for adjacencies occurred as a side effect of the trie at add time, but that does not appear to be the case. Probably something like an OET would be a better fit for larger lists.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants