Releases: foundation-model-stack/fms-hf-tuning
v2.3.1
Summary of changes in this release
New feature updates around data handling and preprocessing:
- Enable loading of Parquet and Arrow Dataset files.
- Dataset mixing via sampling probabilities in data config.
- New additional_data_handlers argument in the train function, used to register extra data handlers with the data preprocessor.
- Support multiple files, directories, pattern-based paths, HF Dataset IDs, and their combinations via data_config.
- New support for both multi-turn and single-turn chat interactions.
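To make the dataset mixing item above concrete, here is a minimal sketch of mixing by sampling probabilities, using only the Python standard library. It is an illustration of the general idea, not the fms-hf-tuning implementation: the function name `mix_datasets`, its parameters, and the in-memory list-of-records layout are all hypothetical, and the actual data config schema is not shown in these notes.

```python
import random


def mix_datasets(datasets, probabilities, num_samples, seed=42):
    """Draw num_samples records, picking the source dataset for each draw
    according to its sampling probability.

    Hypothetical helper for illustration only; not part of fms-hf-tuning.
    datasets:      dict mapping dataset name -> non-empty list of records
    probabilities: dict mapping dataset name -> sampling probability
    """
    names = list(datasets)
    if set(names) != set(probabilities):
        raise ValueError("datasets and probabilities must have the same keys")
    weights = [probabilities[name] for name in names]
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("sampling probabilities must sum to 1.0")
    if any(len(datasets[name]) == 0 for name in names):
        raise ValueError("every dataset must contain at least one record")

    rng = random.Random(seed)              # seeded for reproducible mixes
    cursors = {name: 0 for name in names}  # next record index per dataset
    mixed = []
    for _ in range(num_samples):
        # Choose which dataset this sample comes from.
        name = rng.choices(names, weights=weights, k=1)[0]
        records = datasets[name]
        # Walk through the chosen dataset in order, wrapping around at the end.
        mixed.append(records[cursors[name] % len(records)])
        cursors[name] += 1
    return mixed


if __name__ == "__main__":
    data = {
        "chat": [{"text": f"chat-{i}"} for i in range(3)],
        "code": [{"text": f"code-{i}"} for i in range(3)],
    }
    for record in mix_datasets(data, {"chat": 0.8, "code": 0.2}, num_samples=5):
        print(record["text"])
```

Wrapping around with a per-dataset cursor means a small dataset with a high probability is oversampled rather than exhausted; whether the library oversamples or stops early is not stated in these notes.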
New tracker:
- New MLFlow tracker
Additional Changes
- Refactor test artifacts into tests/artifacts, adding new data types, datasets, and predefined data configs for new unit tests.
- Resolve issues with deprecated training arguments.
Full list of Changes
- feat: Add support to handle Parquet Dataset files via data config by @Abhishek-TAMU in #401
- test: add arrow datasets and arrow unit tests by @willmj in #403
- feat: Perform dataset mixing via sampling probabilities in data config by @dushyantbehl in #408
- feat: Expose additional data handlers as an argument in train by @dushyantbehl in #409
- fix: Move deprecated positional arguments from SFTTrainer to SFTConfig by @Luka-D in #399
- fix: update dataclass objects directly instead of creating new variables by @kmehant in #418
- test: Add unit tests to test multiple files in single dataset by @Abhishek-TAMU in #412
- feat: Add multi and single turn chat support by @dushyantbehl in #415
- feat: Integrate MLflow tracker by @dushyantbehl in #425
- feat: Handle passing of multiple files, multiple folders, path with patterns, HF Dataset and combination by @Abhishek-TAMU in #424
- docs: Add documentation for data preprocessor release by @dushyantbehl in #423
Full Changelog: v2.2.0...v2.3.1
v2.3.0
v2.3.0-rc.1
What's Changed
- feat: Add support to handle Parquet Dataset files via data config by @Abhishek-TAMU in #401
- test: add arrow datasets and arrow unit tests by @willmj in #403
- feat: Perform dataset mixing via sampling probabilities in data config by @dushyantbehl in #408
- feat: Expose additional data handlers as an argument in train by @dushyantbehl in #409
- fix: Move deprecated positional arguments from SFTTrainer to SFTConfig by @Luka-D in #399
- fix: update dataclass objects directly instead of creating new variables by @kmehant in #418
- test: Add unit tests to test multiple files in single dataset by @Abhishek-TAMU in #412
- feat: Add multi and single turn chat support by @dushyantbehl in #415
- feat: Integrate MLflow tracker by @dushyantbehl in #425
- feat: Handle passing of multiple files, multiple folders, path with patterns, HF Dataset and combination by @Abhishek-TAMU in #424
Full Changelog: v2.2.1...v2.3.0
v2.2.1
Foundational Updates
- Added a new data preprocessor framework as the base for future enhancements, while maintaining full compatibility with existing features.
Additional Changes
- Added a Data Preprocessor ADR.
- Moved test datasets from tests/data to tests/artifacts/testdata.
Full list of Changes
- docs: Data Preprocessor ADR by @dushyantbehl in #374
- fix: bad name in generic tracker ADR by @dushyantbehl in #394
- fix: Move test datasets to tests/artifacts/testdata instead of tests/data by @dushyantbehl in #398
- feat: DataProcessor v1 by @dushyantbehl @Abhishek-TAMU @willmj in #381
- chore: Release v2.2.0 by @Abhishek-TAMU in #404
- fix: Limit trl version to <0.12 by @Abhishek-TAMU in #406
- chore: Release v2.2.0 after limiting TRL version by @Abhishek-TAMU in #407
Full Changelog: v2.1.2...v2.2.1
v2.2.0
v2.2.0-rc.1
What's Changed
- docs: Data Preprocessor ADR by @dushyantbehl in #374
- fix: bad name in generic tracker ADR by @dushyantbehl in #394
- fix: Move test datasets to tests/artifacts/testdata instead of tests/data by @dushyantbehl in #398
- feat: DataProcessor v1 by @dushyantbehl @Abhishek-TAMU @willmj in #381
Full Changelog: v2.1.2-rc.1...v2.2.0-rc.1
v2.1.2
v2.1.2-rc.1
What's Changed
- build(deps): set lower limit for transformers to 4.45 for granite 3.0 by @willmj in #387
- docs: Update supported models by @aluu317 in #389
Full Changelog: v2.1.1...v2.1.2-rc.1
v2.1.1
What's Changed
Dependency changes
- Pull in new versions of fms-acceleration-peft and fms-acceleration-foak, which include fixes for AutoGPTQ and gradient accumulation hooks and add a granite GPTQ model.
- build(deps): set transformers below 4.46, waiting on fixes by @anhuong in #384
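As a rough illustration of what an upper bound such as "transformers below 4.46" means in practice, the sketch below compares version strings using only the standard library. The helper names `release_tuple` and `is_below` are hypothetical; the project itself presumably expresses this limit in its packaging metadata rather than in code, and a real check would normally rely on a dedicated version-parsing library.

```python
import re


def release_tuple(version):
    """Return the leading numeric components of a version string,
    e.g. '4.45.2' -> (4, 45, 2). Suffixes such as '.dev0' are ignored."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"not a recognisable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_below(version, upper_bound):
    """True if version is strictly below upper_bound (hypothetical helper)."""
    current = release_tuple(version)
    bound = release_tuple(upper_bound)
    # Pad with zeros so that '4.46' and '4.46.0' compare as equal.
    width = max(len(current), len(bound))
    current += (0,) * (width - len(current))
    bound += (0,) * (width - len(bound))
    return current < bound


if __name__ == "__main__":
    for candidate in ("4.45.2", "4.46.0", "4.46.1"):
        print(candidate, is_below(candidate, "4.46"))
```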
Additional changes
- docs: Update Supported Models List in README by @tharapalanivel in #382
Full Changelog: v2.1.0...v2.1.1
v2.1.1-rc.2
deps: set transformers below 4.46, waiting on fixes (#384)
Signed-off-by: Anh Uong