I test Profile-Guided Optimization (PGO) on different kinds of software - the current results are available here (along with a lot of other PGO-related information). Since PGO has improved performance for many projects, including compilers like Rustc, GCC, and Clang, I think optimizing Sage with PGO is worth trying. I ran some benchmarks and want to share my results.
Test environment
Fedora 39
Linux kernel 6.6.9
AMD Ryzen 9 5900X
48 GiB RAM
SSD Samsung 980 Pro 2 TiB
Compiler - Rustc 1.75
Sage version: the latest main branch at the time of writing, commit 72b536f61ebb6332c57cf57fab9fe53b1e878c1d
Disabled Turbo boost (for more stable results across runs)
Benchmarks
For benchmarking, I use the project's built-in benchmarks, run with the cargo bench command. For the PGO training phase, I run the same benchmarks under cargo-pgo with cargo pgo bench. For the PGO optimization phase, I use cargo pgo optimize bench.
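For anyone who wants to reproduce this, the two phases can be sketched roughly as below. This is a minimal sketch, not the exact script used for the report: the run helper, the pgo_workflow function, and the DRY_RUN switch are hypothetical names added for illustration, and the two setup commands are the usual cargo-pgo prerequisites rather than something stated above. By default the script only prints the commands; set DRY_RUN=0 inside a Sage checkout to actually execute them.

```shell
#!/bin/sh
# Sketch of the PGO workflow with cargo-pgo (assumed setup, see lead-in).
# Commands are only printed unless DRY_RUN=0 is set.

run() {
    # Echo the command; execute it only when DRY_RUN is explicitly 0.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

pgo_workflow() {
    # One-time setup: cargo-pgo and the LLVM tools it needs to merge profiles.
    run cargo install cargo-pgo || return 1
    run rustup component add llvm-tools-preview || return 1
    # Training phase: build instrumented benchmarks and run them to collect profiles.
    run cargo pgo bench || return 1
    # Optimization phase: rebuild using the collected profiles and run the benchmarks again.
    run cargo pgo optimize bench || return 1
}

pgo_workflow
```

Comparing the output of the last step against a plain cargo bench run on the same machine gives the PGO vs Release numbers.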
Results
According to the tests, PGO makes things faster in Sage.
I should note that enabling Link-Time Optimization (LTO) is generally a good idea too, since it works well together with PGO. I also ran some benchmarks comparing LTO to Release: https://gist.github.com/zamazan4ik/6be63330d2c97b510fdfc6b7aa7988c5 . However, in some cases there are performance regressions that need to be investigated. LTO was enabled with codegen-units = 1 and lto = "fat" for the corresponding profiles in the Cargo.toml file.
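If you want to try the same LTO settings without editing Cargo.toml, Cargo's environment overrides for profile settings are one option. Note that this is a swapped-in alternative, not how the report was produced (the settings were put in Cargo.toml). The sketch below assumes the bench profile is the relevant one, and lto_env is a hypothetical helper name. It only prints the resulting command line.

```shell
#!/bin/sh
# Sketch: fat LTO and a single codegen unit for the bench profile via
# Cargo's environment overrides (assumption: bench is the profile that matters).

lto_env() {
    # Equivalent to lto = "fat" and codegen-units = 1 under [profile.bench].
    echo "CARGO_PROFILE_BENCH_LTO=fat"
    echo "CARGO_PROFILE_BENCH_CODEGEN_UNITS=1"
}

# Print the command line that would run the benchmarks with LTO enabled.
echo "env $(lto_env | tr '\n' ' ')cargo bench"
```

Running the printed command in a checkout and comparing against a plain cargo bench should show the same kind of LTO vs Release difference as in the gist.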
Further steps
I can suggest the following action points:
Perform more PGO benchmarks on Sage. If they show improvements, add a note to the documentation about the possible performance gains from building Sage with PGO.
Provide an easier way (e.g. a build option) to build Sage with PGO. This can be helpful for end users and maintainers, since they will be able to optimize Sage for their own workloads.
Testing Post-Link Optimization techniques (like LLVM BOLT) would be interesting too (Clang and Rustc already use BOLT as an addition to PGO) but I recommend starting from the usual PGO.
Here are some examples of how PGO optimization is integrated into other projects:
Please treat the issue just as a benchmark report - it's not an actual error, crash, or something like that. I don't know how much you care about performance in Sage so I don't know how important these improvements are for the project. I hope we can use these benchmarks at least as an additional data point about PGO efficiency for compilers.
Hello, thank you for the highly detailed writeup, I really appreciate the effort you put into writing this! These are really interesting results -- I'm very curious why there seem to be performance regressions with link-time optimization. This is a great collection of resources for PGO in other projects as well! I will investigate adding PGO to Sage and also look into why LTO might cause performance to suffer. Thanks again, fantastic issue!