Finally, the Version 1.0 release is here! The software has been stable and ready for production use for quite some time, and after about half a year in beta, we are confident that the current version deserves to mark the first major release of Kernel Tuner.
Version 1.0 integrates a lot of new functionality, including blazing fast search space construction, support for tuning HIP kernels on AMD GPUs, new functionality for mixed precision and accuracy tuning, experimental support for tuning OpenACC programs, a conda package installer for Kernel Tuner, and many more changes and additions.
I would like to thank everyone involved in the development of Kernel Tuner over the past years! Special thanks to the Kernel Tuner developer team for their continued support of the project!
From the Changelog
- HIP backend to support tuning HIP kernels on AMD GPUs
- Experimental features for mixed-precision and accuracy tuning
- Experimental features for OpenACC tuning
- Major speedup due to new parser and using revamped python-constraint for searchspace building
- Implemented ability to use PySMT and ATF for searchspace building
- Added Poetry for dependency and build management
- Switched from `setup.py` and `setup.cfg` to `pyproject.toml` for centralized metadata, added relevant tests
- Updated GitHub Action workflows to use Poetry
- Updated dependencies, most notably NumPy is no longer version-locked as scikit-opt is no longer a dependency
- Documentation now uses `pyproject.toml` metadata, minor fixes and changes to be compatible with updated dependencies
- Set up Nox for testing on all supported Python versions in isolated environments
- Added linting information, VS Code settings and recommendations
- Discontinued use of `OrderedDict`, as all dictionaries in the Python versions used are already ordered
- Dropped Python 3.7 support
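
To illustrate the `OrderedDict` and searchspace items in the changelog, here is a minimal sketch in plain Python. The parameter names, values, and the `restriction` function are hypothetical, and the brute-force enumeration is only a stand-in for illustration; it is not how Kernel Tuner constructs search spaces internally.

```python
from itertools import product

# Hypothetical tunable parameters; names and values are illustrative only.
tune_params = {
    "block_size_x": [32, 64, 128],
    "block_size_y": [1, 2, 4],
    "tile_size": [1, 2],
}

# Plain dicts preserve insertion order (a language guarantee since Python 3.7),
# so no OrderedDict is needed to keep parameter names aligned with value tuples.
param_names = list(tune_params)


def restriction(config):
    """Illustrative constraint: allow at most 128 threads per block."""
    return config["block_size_x"] * config["block_size_y"] <= 128


# Brute-force enumeration of the Cartesian product, filtered by the constraint.
# This is an assumption-laden stand-in, not Kernel Tuner's solver-based approach.
search_space = []
for values in product(*tune_params.values()):
    config = dict(zip(param_names, values))
    if restriction(config):
        search_space.append(config)
```

Because the dictionary keeps its insertion order, each tuple produced by `product` maps back to the parameter names without extra bookkeeping. Enumerating and filtering the full product scales poorly with the number of parameters, which is the kind of cost a constraint-solver-based construction aims to avoid.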
Merged Pull Requests
- HIP Backend by @MiloLurati in #199
- Accuracy tuning by @stijnh in #189
- Fix issue where HIP backend fails due to invalid arguments type by @stijnh in #216
- Searchspace improvements and project meta modernization by @fjwillemsen in #214
- Minor bugfix by @isazi in #219
- OpenACC support by @isazi in #197
- Fixed broken tests as per issue #217 by @fjwillemsen in #220
- Fix snap_to_nearest on non-numeric parameters by @stijnh in #221
- expand documentation on backends by @benvanwerkhoven in #213
- Add support for passing cupy arrays to "C" lang by @bouweandela in #226
- improve code quality of cache file related functions by @benvanwerkhoven in #240
- New readme by @benvanwerkhoven in #231
New Contributors
- @MiloLurati made their first contribution in #199
- @dependabot made their first contribution in #222
- @bouweandela made their first contribution in #226
Full Changelog: 0.4.5...1.0