[Compression] Update sparsity calculation lifecycle when fetching the compressor #1332

Merged: 14 commits merged into main from update_compression_saving on Apr 14, 2025

Conversation

dsikka (Collaborator) commented Apr 8, 2025

Summary

1. No longer always infer sparsity

  • If the model contains existing sparsity that should be accounted for during compression, the user must now indicate this by providing a sparsity config. This is not a common case and is primarily something we've seen in research workflows, so it does not need to be the default pathway.
  • If no sparsity config is provided, sparsity is only checked when a recipe with sparsity was applied OR the user explicitly opts in to a sparsity compressor by setting the skip_compression_stats parameter to False.
  • With these changes, global_sparsity no longer dictates whether a sparse compressor is applied.

2. Rename skip_compression_stats to skip_sparsity_compression_stats and set it to True by default (a usage sketch follows this list)

3. Rename sparsity_config to sparsity_metadata_config for clarity

4. Add additional log outputs to make it clearer which steps are running

5. Fix log outputs so they are no longer cut off

Follow-ups:

  • Saving arguments still need to be passed through properly: apart from save_compressed, they are currently not forwarded, and there is no clear API defining what the arguments are or how they should be passed, both when calling save_pretrained directly and when providing an output_dir to oneshot (see the sketch below).

Testing

  • Tested a sparse-only model
  • Tested a model with both sparsity and quantization
  • Tested a model with existing sparsity
  • Tested a model with only quantization

For all cases, the model compresses to the expected format.

github-actions bot commented Apr 8, 2025

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite; please add the label only once the PR is code complete and local testing has been performed.

dsikka requested a review from rahul-tuli on April 8, 2025 01:39
rahul-tuli (Collaborator) left a comment


Nice, the functionality seems to be all there! LGTM pending a small nit and some cleanup (docstrings and such).

dsikka added the ready label (When a PR is ready for review) on Apr 8, 2025
dsikka marked this pull request as ready for review April 8, 2025 14:16
dsikka marked this pull request as draft April 8, 2025 21:18
dsikka marked this pull request as ready for review April 9, 2025 01:18
dsikka marked this pull request as draft April 9, 2025 02:46
dsikka marked this pull request as ready for review April 9, 2025 03:51
…etune_with_tokenizer.py

Co-authored-by: Brian Dellabetta <[email protected]>
dsikka merged commit 517a3ef into main on Apr 14, 2025
8 checks passed
dsikka deleted the update_compression_saving branch April 14, 2025 18:29