TensorRT-LLM Requests #632
Comments
Please add CohereAI!! CohereForAI/c4ai-command-r-plus
Llama 3 would be great (both 8B and 70B): #1470 Maybe quantized to 8 or even 4 bit.
Currently Llama 3 throws a bunch of errors when converting to TensorRT-LLM. Any idea about support for Llama 3?
Phi-3-mini should be amazing! Such a small 3.8B model could run quantized on many GPUs, with as little as 4GB VRAM.
+1 for Phi-3
+1 for Command R Plus! CohereForAI/c4ai-command-r-plus
Hello @ncomly-nvidia, I am a student interested in the project! Are there any recent good-first-issue feature requests under Features & Optimizations? 🤣
+1 for OpenBMB/MiniCPM-V-2
Any news on support for the Jetson platform? Thanks in advance.
Requesting support for Meta's m4t v2 model, like how Whisper support is provided.
How is it going for Jetson AGX? It would be nice if everything is compatible before the Jetson Thor launch.
LLaMa 3.2 multimodal vision models anytime soon?
cc @laikhtewari for vis.
Congrats Nvidia: https://www.jetson-ai-lab.com/tensorrt_llm.html
You can refer to the v0.12-jetson branch.
Hi all, this issue will track the feature requests you've made to TensorRT-LLM and provide a place to see what TRT-LLM is currently working on.
Last update: Jan 14th, 2024
🚀 = in development
Models
Decoder Only
Encoder / Encoder-Decoder
Multi-Modal
Other
Features & Optimizations
implementation done - documentation in progress
KV Cache
Quantization
Sampling
frequency_penalty - Support for frequency_penalty #275
repetition & presence penalties - Support for combining repetition_penalty, presence_penalty #274
Workflow
Front-ends
Integrations
Usage / Installation
Platform Support