-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Splines improve benchmark #496
base: main
Are you sure you want to change the base?
Conversation
blegouix
commented
Jun 18, 2024
- Capability to sweep on parameters determined at compile-time: Uniform/NonUniform, Degree, ExecutionSpace and in the futur solver Ginkgo/Lapack.
benchmarks/splines.cpp
Outdated
static void characteristics_advection(benchmark::State& state) | ||
{ | ||
// Preallocate 6 unitary benchs for each combination of uniform/non-uniform and spline degree we may want to benchmark (those are determined at compile-time, that's why we need to build explicitely 6 variants of the bench even if we call only one of them) | ||
std::array<std::function<void(benchmark::State&)>, 6> benchs; | ||
benchs[0] = characteristics_advection_unitary<std::false_type, 3>; | ||
benchs[1] = characteristics_advection_unitary<std::false_type, 4>; | ||
benchs[2] = characteristics_advection_unitary<std::false_type, 5>; | ||
benchs[3] = characteristics_advection_unitary<std::true_type, 3>; | ||
benchs[4] = characteristics_advection_unitary<std::true_type, 4>; | ||
benchs[5] = characteristics_advection_unitary<std::true_type, 5>; | ||
|
||
// Run the desired bench | ||
benchs[state.range(0) * 3 + state.range(1) - 3](state); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tpadioleau any opinion about this ? It is hacky but helps a lot to make the benchmarks flexible. I did not find yet any way to write it as "nested loops", maybe there is one still.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can imagine an improvement of TypeSeq with something like:
template<std::size_t N, typename List>
struct TypeByIndex;
template<std::typename Head, typename... Tail>
struct TypeByIndex<0, ddc::detail::TypeSeq<Head, Tail...>> {
using type = Head;
};
template<std::size_t N, typename Head, typename... Tail>
struct TypeByIndex<N, ddc::detail::TypeSeq<Head, Tail...>> {
using type = typename TypeAt<N-1, TypeList<Tail...>>::type;
};
And upon this a nested parameter pack expansions on TypeSeq<Kokkos::DefaultHostExecutionSpace, Kokkos::DefaultExecutionSpace>
, TypeSeq<std::false_type, std::true_type>
and TypeSeq<3, 4, 5>
should be feasible, but it may be a bit too complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tpadioleau any opinion about this ? It is hacky but helps a lot to make the benchmarks flexible. I did not find yet any way to write it as "nested loops", maybe there is one still.
Don't you think you need different tests for Ginkgo and Lapack ? They have different parameters ? cols_per_chunk_ref
and preconditionner_max_block_size_ref
do not make sense for Lapack right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to select a benchmark based on a runtime parameter could we use a string and store the different benchmarks in a std::map<std::string, benchmark> my_benchmarks
? you would just call my_benchmarks.at("host")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tpadioleau any opinion about this ? It is hacky but helps a lot to make the benchmarks flexible. I did not find yet any way to write it as "nested loops", maybe there is one still.
Don't you think you need different tests for Ginkgo and Lapack ? They have different parameters ?
cols_per_chunk_ref
andpreconditionner_max_block_size_ref
do not make sense for Lapack right ?
Atm the choice of Backend still has to be manually changed by editing the file. cols_per_chunk_ref
and preconditionner_max_block_size_ref
do not make sense for Lapack indeed, and this is weird to have them passed as optional arguments of SplineBuilder. But this is the current implementation, we could do something better with a kwArgs struct (similar to what we have in the FFT) but this requires more developments.
About the benchmark, the current way to use it is to comment/uncomment the benchmarks we want to run so we just have to not run the preconditioner or cols_per_chunk benchmark with Lapack.
If you want to select a benchmark based on a runtime parameter could we use a string and store the different benchmarks in a
std::map<std::string, benchmark> my_benchmarks
? you would just callmy_benchmarks.at("host")
Making a more interfaced benchmark system would be great but requires development. And I think it would not impact the way benchmarks are performed (this would be a software overlay build upon it). Maybe gtest already provides something to do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(my question was more about instanciating all the possible variations of the class we would like to benchmark, combining explicitly compile-time parameters like ExecutionSpace, Degree and Uniformity)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I think I get the problem, would this google benchmark feature https://github.com/google/benchmark/blob/main/docs/user_guide.md#templated-benchmarks-that-take-arguments help you ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so because I think it does not allow to "sweep" on those template parameters (i.e. degree={3,4,5}) so it would be at least less conveniant than my solution (if I understand well the capabilities of this feature).
…imulation/ddc into splines-improve-benchmark