feat: Allow loading and serializing with tensorizer #2

Draft
wants to merge 7 commits into
base: master


@sangstar sangstar commented Dec 19, 2024

This draft PR aims to integrate tensorizer into exllamav2's model loading machinery unintrusively, and to add tests that check correctness and guard against regressions.

This PR is still at the draft stage. The tests are the most recently updated part; the core logic still needs to be made less intrusive and less smelly.

Still to add:

  • Add comments to the test file for clarity
  • Make the way tensorizer is exposed in the config machinery less intrusive
  • Decide whether to route all tensorizer I/O through io_handler or retire it altogether
  • Reconsider how tensorizer's configurable options are packaged: either a dedicated arguments class, or fields packed directly into the config class
  • Decide whether .state_dict should be a public attribute
  • Add a wrapper so that TensorDeserializer arguments can be passed through to TensorDeserializer calls
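
For the last two open questions (a dedicated arguments class, and a wrapper that forwards TensorDeserializer arguments), one possible shape is sketched below. This is only an illustration, not what the PR implements: `TensorizerArgs`, `open_deserializer`, and `fake_deserializer` are hypothetical names, and the deserializer is passed in as a `factory` callable so the sketch runs without tensorizer installed. In real use the factory would presumably be `tensorizer.TensorDeserializer`, and the `device` and `lazy_load` keys are assumed to match keyword arguments it accepts.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class TensorizerArgs:
    """Hypothetical container for tensorizer options (not part of exllamav2)."""

    uri: str
    # Extra keyword arguments forwarded verbatim to the deserializer.
    deserializer_kwargs: Dict[str, Any] = field(default_factory=dict)


def open_deserializer(args: TensorizerArgs, factory: Callable[..., Any]) -> Any:
    """Call factory(uri, **deserializer_kwargs) and return the result.

    `factory` is a parameter (assumption: TensorDeserializer in real use)
    so that this sketch has no dependency on tensorizer itself.
    """
    return factory(args.uri, **args.deserializer_kwargs)


def fake_deserializer(uri: str, **kwargs: Any) -> Dict[str, Any]:
    # Stand-in for the real deserializer: records what it was called with.
    return {"uri": uri, **kwargs}


args = TensorizerArgs("model.tensors", {"device": "cpu", "lazy_load": True})
result = open_deserializer(args, fake_deserializer)
```

Keeping the forwarded arguments in their own small class means the existing config class only needs to hold one optional field, which may help with the "less intrusive" goal above.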
