Conversation

@gpetters-amd
Contributor

There are still two outstanding issues I'd like some comments on, but otherwise this should be basically done.

huggingface_hub.snapshot_download(
repo_id=self.hf_model_name, cache_dir=cache_dir
)
# TODO: Convert to gguf, delete cache
Contributor Author

The method sharktank recommends for generating the .gguf file is a CLI tool from llama.cpp. Is that still the best way to extract it, or do we now have a way to do it with sharktank directly?
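For reference, a hedged sketch of the llama.cpp route: its repo ships a conversion script (named convert_hf_to_gguf.py in recent checkouts; older versions call it convert-hf-to-gguf.py or convert.py) that takes a local Hugging Face snapshot directory. The paths, script name, and flags below are assumptions about a typical llama.cpp checkout, not part of this PR:

```shell
# Assumes a llama.cpp checkout and a completed snapshot_download into $CACHE_DIR.
# The snapshot subdirectory layout and the script's flags vary by version;
# check `python llama.cpp/convert_hf_to_gguf.py --help` in your checkout.
python llama.cpp/convert_hf_to_gguf.py "$CACHE_DIR" \
    --outfile model.gguf \
    --outtype f16
```

Shelling out to this from the download step (and deleting the cache afterward, per the TODO) would work, but it adds llama.cpp as a tooling dependency, which is presumably why a sharktank-native path would be preferable.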

model = PagedLlamaModelV1(dataset.root_theta, llama_config)

fxb = FxProgramsBuilder(model)
self.torch_ir = export(fxb)
Contributor Author

Not sure why, but this is producing an empty module. Any idea what I'm missing?
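One common cause of an empty module here is calling export(fxb) before any program has been registered on the builder: export only emits functions captured via the builder's export_program decorator. A hedged sketch of that pattern, assuming the iree-turbine/shark-turbine AOT API — the function name and example_tokens input are placeholders, and model is the PagedLlamaModelV1 from the snippet above:

```python
from shark_turbine.aot import FxProgramsBuilder, export

fxb = FxProgramsBuilder(model)

# Each decorated function becomes one exported program. Without at least
# one of these registrations, export(fxb) produces an empty module.
@fxb.export_program(args=(example_tokens,))  # example_tokens: placeholder input
def prefill(module, tokens):
    return module(tokens)

torch_ir = export(fxb)
```

If the builder in the PR is constructed but never has a program decorated onto it, that would match the symptom described.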
