Slowness in series subtraction #12

Open
bluescarni opened this issue Oct 4, 2015 · 2 comments
@bluescarni
Owner

The current routine for series addition/subtraction results in extreme slowness under particular circumstances. For instance, when one subtracts two identical series, the complexity is quadratic in the number of terms, whereas it should be linear.

This is related to an implementation detail inside the hash_set class, but it can be worked around by re-coding the series addition/subtraction routines.
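
A minimal reproducer sketch. The header name, the environment object, the polynomial template parameters and the math::pow call are assumptions about Piranha's interface at the time, not code taken from the issue:

```cpp
#include <iostream>
#include "piranha.hpp" // assumed top-level header

using namespace piranha;

int main()
{
    environment env; // assumed runtime-initialisation object
    // A univariate polynomial with a couple of thousand terms.
    polynomial<integer, monomial<short>> x{"x"};
    auto s = math::pow(x + 1, 2000);
    // Subtracting a series from itself: every inserted term cancels an
    // existing one, and the cost becomes quadratic in the number of terms.
    auto r = s - s;
    std::cout << r.size() << '\n'; // 0
}
```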

@bluescarni bluescarni added the bug label Oct 4, 2015
@bluescarni bluescarni self-assigned this Oct 4, 2015
@Sumith1896
Contributor

This would be an interesting fix.

@bluescarni
Owner Author

It looks like the bad behaviour shows up only in debug mode, due to this line:

https://github.com/bluescarni/piranha/blob/master/src/series.hpp#L1186

The problem is when we get the begin() iterator in that line:

m_container.begin();

What happens here is that, when we subtract a series from itself, we first create a copy of it (to initialise the return value of the operation) and then remove terms from the beginning of the copy until no terms are left (this is the general subtraction algorithm: it is implemented as a negated copy).
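
As a sketch of that pattern, with a std::unordered_map standing in for the term container (this is not Piranha's actual code):

```cpp
#include <string>
#include <unordered_map>

// key (e.g. a monomial) -> coefficient
using series = std::unordered_map<std::string, long>;

// Generic subtraction as a negated copy: start from a copy of lhs, then
// insert the negation of each term of rhs. A term whose coefficient
// cancels to zero is erased from the result.
series subtract(const series &lhs, const series &rhs)
{
    series retval{lhs};
    for (const auto &t : rhs) {
        const auto it = retval.find(t.first);
        if (it == retval.end()) {
            retval.emplace(t.first, -t.second);
        } else if ((it->second -= t.second) == 0) {
            retval.erase(it); // exact cancellation: drop the term
        }
    }
    return retval;
}

int main()
{
    series s{{"x", 1}, {"x**2", 2}, {"x**3", 3}};
    auto r = subtract(s, s); // every term cancels: r is empty
}
```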

The key thing is that, for performance reasons, Piranha's hash set is implemented in such a way that the cost of asking for the begin() iterator is proportional to the number of empty buckets before the first non-empty bucket. So, as we remove more and more terms from the head of the series, each successive call to begin() becomes more expensive, as it needs to traverse an increasing number of empty buckets before finding the first non-empty one.

This is normally not a problem, as in normal operations the number of non-empty buckets in a series is roughly proportional to the total number of buckets. But it does become a problem in corner cases like this one, in which the number of non-empty buckets eventually drops to zero.
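
A self-contained sketch of the mechanism (a toy bucket array, not Piranha's actual hash_set):

```cpp
#include <cstddef>
#include <forward_list>
#include <iostream>
#include <vector>

// Toy bucket array using the same begin() strategy: no bookkeeping of
// the first occupied bucket, so every call scans from bucket 0.
struct toy_set {
    std::vector<std::forward_list<int>> buckets{1024};
    // Cost proportional to the number of empty buckets preceding the
    // first non-empty one.
    std::size_t first_occupied() const
    {
        std::size_t i = 0;
        while (i < buckets.size() && buckets[i].empty()) {
            ++i;
        }
        return i;
    }
};

int main()
{
    toy_set s;
    for (std::size_t i = 0; i < s.buckets.size(); ++i) {
        s.buckets[i].push_front(static_cast<int>(i));
    }
    // Mimic the subtraction corner case: erase from the head until the
    // set is empty, asking for "begin()" before each erasure.
    std::size_t scanned = 0;
    for (auto i = s.first_occupied(); i != s.buckets.size(); i = s.first_occupied()) {
        scanned += i + 1;     // buckets traversed by this begin() call
        s.buckets[i].clear(); // O(1) erasure of the head term
    }
    std::cout << scanned << '\n'; // ~ n*(n+1)/2 for n terms: quadratic
}
```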

This is one of the key differences between Piranha's hash set and std::unordered_set. The latter provides a series of guarantees on the constant complexity of operations such as traversal, erasure, etc., which Piranha's hash set does not. See this post for more information:

http://bannalia.blogspot.de/2013/10/implementation-of-c-unordered.html

I am not inclined to switch hash_set to an implementation which provides such guarantees, as they have a non-negligible cost, both in terms of performance and code complexity, and also with respect to thread safety.

In the next few days I'll experiment with a new add/sub implementation that avoids this problem in debug mode, and I will review the code to understand whether corner cases like this can actually arise in practice.
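
For reference, one way to avoid the repeated begin() calls is the usual single-pass erase pattern. This is a sketch against a std-like interface, assuming an erase() that, like std::unordered_set::erase, returns the iterator following the erased element; whether Piranha's hash_set offers this is a separate question:

```cpp
// begin() is requested only once; afterwards we follow the iterator
// returned by erase(), so the leading empty buckets are never re-scanned.
template <typename Set, typename Pred>
void erase_if(Set &s, Pred pred)
{
    for (auto it = s.begin(); it != s.end();) {
        if (pred(*it)) {
            it = s.erase(it);
        } else {
            ++it;
        }
    }
}
```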
