Hugging Face Transformer Deployment Tutorial #49
Conversation
Resolved review comments (outdated) on:
Quick_Deploy/HuggingFaceTransformers/base_text_classification_model.py (2 threads)
Quick_Deploy/HuggingFaceTransformers/base_text_generation_model.py
… add README, restructure repo
Resolved review comments (outdated) on:
Quick_Deploy/HuggingFaceTransformers/text_generation/config.pbtxt (2 threads)
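The contents of the config.pbtxt under review are not visible in this conversation, but a minimal Triton model configuration for a Python-backend, string-in/string-out text generation model typically looks something like the sketch below. The tensor names, dimensions, and instance group are assumptions for illustration, not the PR's actual file.

```
name: "text_generation"
backend: "python"
max_batch_size: 0

input [
  {
    name: "text_input"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "text_output"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]

instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
```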
All generation scripts were removed and replaced with static files. This new tutorial covers deploying Falcon-7B, Persimmon-8B, and Mistral-7B. Down the road, these models may get their own READMEs in a "Popular Models Guide" folder. cc @jbkyang-nvi
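As a rough illustration of what the static model files mentioned above contain, here is a minimal sketch of a Triton Python-backend model.py wrapping a Hugging Face text generation pipeline. The tensor names, model id, and generation parameters are assumptions for illustration, not the code under review.

```python
# Hypothetical sketch of a Triton Python-backend model for text generation.
# "text_input"/"text_output" and the Falcon model id are assumed names,
# not necessarily those used in the tutorial's files.
import numpy as np
import triton_python_backend_utils as pb_utils
from transformers import pipeline


class TritonPythonModel:
    def initialize(self, args):
        # Load the Hugging Face pipeline once per model instance
        # (device_map="auto" requires the accelerate package).
        self.generator = pipeline(
            "text-generation", model="tiiuae/falcon-7b", device_map="auto"
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            # Decode the incoming string (BYTES) tensor.
            text = (
                pb_utils.get_input_tensor_by_name(request, "text_input")
                .as_numpy()[0]
                .decode("utf-8")
            )
            generated = self.generator(text, max_new_tokens=64)[0]["generated_text"]
            out_tensor = pb_utils.Tensor(
                "text_output", np.array([generated.encode("utf-8")], dtype=object)
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_tensor]))
        return responses

    def finalize(self):
        self.generator = None
```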
Great tutorial overall! Only minor comments 🚀
@nnshah1 I preemptively removed Mistral from the tutorial. I can always revert it if necessary.
Incorporated feedback from Dora on how to gather performance metrics and how to load cached models, and added comments.
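For context, gathering performance metrics with Perf Analyzer and reusing a locally cached Hugging Face model usually amounts to something like the commands below; the container image tag, mount paths, and model name are placeholders, not the tutorial's exact instructions.

```bash
# Hypothetical example: mount the local Hugging Face cache into the container
# so weights are not re-downloaded, then serve the model repository.
# (Image tag is a placeholder; the tutorial may use a custom image with
# transformers installed.)
docker run --gpus all --rm \
  -v ${HOME}/.cache/huggingface:/root/.cache/huggingface \
  -v ${PWD}/model_repository:/models \
  nvcr.io/nvidia/tritonserver:23.10-py3 \
  tritonserver --model-repository=/models

# In another shell (e.g. the Triton SDK container), profile the model.
perf_analyzer -m text_generation --concurrency-range 1:4 --string-data "Hello"
```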
CC @nv-braf @matthewkotila in case there is any feedback regarding the PA/MA section.
PA stuff LGTM 👍
Tutorials showing how Hugging Face Transformers can be quickly deployed in Triton.
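To make the deployment flow concrete, a client call against such a deployment might look like the following sketch; it assumes the hypothetical text_input/text_output tensor names used in the sketches above and is not taken from the tutorial itself.

```python
# Hypothetical client sketch querying the deployed text_generation model
# over HTTP with the Triton Python client.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Triton represents TYPE_STRING tensors as BYTES; use a numpy object array.
text = np.array(["Explain the Triton Inference Server in one sentence."], dtype=object)
inp = httpclient.InferInput("text_input", [1], "BYTES")
inp.set_data_from_numpy(text)

result = client.infer(model_name="text_generation", inputs=[inp])
print(result.as_numpy("text_output")[0].decode("utf-8"))
```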