[Compression] Update sparsity calculation lifecycle when fetching the compressor #1332

Merged: 14 commits merged into main from update_compression_saving on Apr 14, 2025

Conversation

dsikka (Collaborator) commented Apr 8, 2025

Summary

1. No longer always infer sparsity

  • If the model contains existing sparsity that should be accounted for during compression, the user must now indicate this by providing a sparsity config. This is not a common case and is primarily something we've seen in research workflows, so it does not need to be the default pathway.
  • If no sparsity config is provided, sparsity is only checked when a recipe with sparsity was applied OR the user explicitly opts in to a sparsity compressor by setting the skip_compression_stats parameter to False.
  • With these changes, global_sparsity no longer dictates whether a sparse compressor is applied.

2. Rename skip_compression_stats to skip_sparsity_compression_stats and set it to True by default (a usage sketch follows this list)

3. Rename sparsity_config to sparsity_metadata_config for clarity

4. Add additional log outputs to make it clearer which steps are running

5. Fix log outputs so they are no longer cut off

Follow-ups:

  • Saving arguments still need to be passed through properly: apart from save_compressed, they are currently not forwarded, and there is no clear API defining what the arguments are or how they should be passed, both when calling save_pretrained directly and when providing an output_dir to oneshot (see the sketch below).

Testing

  • Tested a sparse-only model
  • Tested a model with both sparsity and quantization
  • Tested a model with existing sparsity
  • Tested a model with only quantization

For all cases, the model compresses to the expected format.

github-actions bot commented Apr 8, 2025

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite; please add the label only once the PR is code complete and local testing has been performed.

dsikka requested a review from rahul-tuli on April 8, 2025 01:39
rahul-tuli (Collaborator) left a comment


Nice, the functionality seems to be all there! LGTM pending a small nit and some cleanup (docstrings and such).

dsikka added the ready label (When a PR is ready for review) on Apr 8, 2025
dsikka marked this pull request as ready for review April 8, 2025 14:16
dsikka marked this pull request as draft April 8, 2025 21:18
dsikka marked this pull request as ready for review April 9, 2025 01:18
dsikka marked this pull request as draft April 9, 2025 02:46
dsikka marked this pull request as ready for review April 9, 2025 03:51
…etune_with_tokenizer.py

Co-authored-by: Brian Dellabetta <[email protected]>
dsikka merged commit 517a3ef into main on Apr 14, 2025
8 checks passed
dsikka deleted the update_compression_saving branch April 14, 2025 18:29