Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve memory allocation for SWAP* insertion ranges #657

Open
jcoupey opened this issue Jan 25, 2022 · 6 comments
Open

Improve memory allocation for SWAP* insertion ranges #657

jcoupey opened this issue Jan 25, 2022 · 6 comments

Comments

@jcoupey
Copy link
Collaborator

jcoupey commented Jan 25, 2022

In the SWAP* operator, we have to check validity for the addition of a range that consist of a portion of a route with the addition of a single job that is either appended at the front or the back of that range.

When sketching the implementation in #580, I couldn't come up with anything more efficient than just copying the initial range to account for both situations with the added job before and after:

if (s_rank < insertion_rank) {
std::copy(s_route.begin() + s_rank + 1,
s_route.begin() + insertion_rank,
std::back_inserter(insert.range));
insert.range.push_back(job_rank);
insert.first_rank = s_rank;
insert.last_rank = insertion_rank;
} else {
insert.range.push_back(job_rank);
std::copy(s_route.begin() + insertion_rank,
s_route.begin() + s_rank,
std::back_inserter(insert.range));
insert.first_rank = insertion_rank;
insert.last_rank = s_rank + 1;
}

This uses std::copy for the initial range and a push_back either before or after copying and is of course very expensive, especially since we have quite a lot of swap choices options to go through. I suspect this is responsible for most of the computing time devoted to this local search operator.

Now the initial need is "simply" to pass iterators for the candidate range to the various is_valid_for_* validity check functions down the line for tests so it feels wrong to have to copy everything for that purpose. On the other hand, I have not found a more efficient way to describe that range and avoid the copies.

Happy to receive any idea or suggestion on that one, maybe the new range library can help here?

@jcoupey
Copy link
Collaborator Author

jcoupey commented Feb 2, 2022

The same kind of same-range-except-begin-or-end situation happens in IntraRelocate:

if (t_rank < s_rank) {
_moved_jobs[0] = s_route[s_rank];
std::copy(s_route.begin() + t_rank,
s_route.begin() + s_rank,
_moved_jobs.begin() + 1);
} else {
std::copy(s_route.begin() + s_rank + 1,
s_route.begin() + t_rank + 1,
_moved_jobs.begin());
_moved_jobs.back() = s_route[s_rank];
}

and in various flavors across the Intra* operators. Based on some naïve tests, this accounts for the most part of the computing time related to those operators...

@jcoupey
Copy link
Collaborator Author

jcoupey commented Apr 12, 2022

The more I think about this the more I'm convinced this would be a big step forward to easily further reduce computing times. Taking a look at how ranges work, it seems we could achieve the desired effect using views::concat from the range-v3 implementation. This library has been prefiguring the standard for ranges in C++20 but for some reason it looks like concat didn't make it to the standard.

On the other hand, we don't really need to concat any arbitrary views, simply being able to iterate over an existing vector + a single value before or after. Probably we can come up with some custom structure owning a reference to the vector and the value and offering the required iteration interface.

@kkarbowiak
Copy link
Contributor

@jcoupey do you think this is still relevant and worth pursuing?

Copying some values into a std::vector should be very fast, especially for primitive types whare it can boil down to a memcpy operation. It might be easier for the compiler to optimise if we used std::vector::insert insted of std::copy. The only somewhat expensive operation is memory allocation. This could be mitigated in several ways. We could reuse the vector, so once some memory is allocated for one copy, it will be reused for subsequent ones (unless more memory is needed). The vector might need to be a static or thread-local variable. We could also preallocate some memory to minimise the need for subsequent reallocations.

On the other hand I did prepare a small class that behaves like a view allowing iteration over a vector with an additional element in front or at back, but am yet to integrate it with the rest of the code.

@jcoupey
Copy link
Collaborator Author

jcoupey commented Apr 25, 2024

do you think this is still relevant and worth pursuing?

@kkarbowiak I don't have a definitive answer here. Probably the impact is less higher than when I filed this ticket due to various changes (e.g. local search operator pruning).
Yet this still feels like it could clearly be improved. If we can get even a small efficiency gain here this could still make a significant difference overall as those situations are in very hot part of the codebase.

Thanks for the ideas, if you have anything that can be tested I'd be happy to do some benchmarking/profiling.

@kkarbowiak
Copy link
Contributor

do you think this is still relevant and worth pursuing?

@kkarbowiak I don't have a definitive answer here. Probably the impact is less higher than when I filed this ticket due to various changes (e.g. local search operator pruning). Yet this still feels like it could clearly be improved. If we can get even a small efficiency gain here this could still make a significant difference overall as those situations are in very hot part of the codebase.

Thanks for the ideas, if you have anything that can be tested I'd be happy to do some benchmarking/profiling.

Understood. Please have a look at #1106. This is a draft of a solution that should work. If it looks promising I will turn it into a proper PR.

@jcoupey jcoupey changed the title Avoid copies of SWAP* insertion ranges Improve memory allocation for SWAP* insertion ranges Sep 13, 2024
@jcoupey
Copy link
Collaborator Author

jcoupey commented Sep 13, 2024

Adjusting the title here as the initial idea proved to be irrelevant based on work in #1106. Options mentioned in #657 (comment) are still worth exploring though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants