Issues: NVIDIA/TensorRT-LLM
#783 [Issue Template] Short one-line summary of the issue #270
Opened Jan 1, 2024 by juney-nvidia. Status: Open.
#2489 Issues with installing on Windows
Labels: bug (Something isn't working). Opened Nov 23, 2024 by PyroGenesis. 1 of 4 tasks.
#2488 In streaming output mode, some Chinese characters are decoded as garbled characters
Opened Nov 23, 2024 by HongfengDu.
#2487 int4 not faster than fp16 and fp8
Labels: bug (Something isn't working). Opened Nov 22, 2024 by ShuaiShao93. 4 tasks.
#2486 Inconsistency with penaltyKernels.cu
Labels: bug (Something isn't working). Opened Nov 22, 2024 by buddhapuneeth. 2 of 4 tasks.
#2483 QwenVL build failed
Labels: bug (Something isn't working). Opened Nov 22, 2024 by Wonder-donbury. 2 of 4 tasks.
#2482 Medusa performance degrades with batch size larger than 1
Labels: performance issue (Issue about performance number). Opened Nov 22, 2024 by SoundProvider.
#2481 How to install TensorRT-LLM on Python 3.11?
Labels: installation; question (Further information is requested). Opened Nov 22, 2024 by janelu9.
#2480 Can't uniquely locate model_spec module
Labels: installation; triaged (Issue has been triaged by maintainers). Opened Nov 21, 2024 by weizhi-wang.
#2479 error: make -C docker release_build : Command 'git submodule update --init --recursive' returned non-zero exit status 128
Labels: installation; triaged (Issue has been triaged by maintainers). Opened Nov 21, 2024 by xddun. 1 of 4 tasks.
#2475 undefined reference to `__libc_single_threaded'
Labels: bug (Something isn't working); installation; triaged (Issue has been triaged by maintainers). Opened Nov 21, 2024 by hoangvictor. 1 of 4 tasks.
#2472 Does TensorRT-LLM support serving a 4-bit quantized unsloth Llama model?
Labels: quantization (Issue about lower bit quantization, including int8, int4, fp8); question (Further information is requested); triaged (Issue has been triaged by maintainers). Opened Nov 20, 2024 by jayakommuru.
#2471 Error: convert_checkpoint in TensorRT-LLM for Llama3.2 3B when tested on multiple versions
Labels: Investigating; not a bug (Some known limitation, but not a bug); triaged (Issue has been triaged by maintainers). Opened Nov 20, 2024 by DeekshithaDPrakash. 2 of 4 tasks.
#2469 Building trtllm is very slow and raises an error
Labels: bug (Something isn't working); triaged (Issue has been triaged by maintainers). Opened Nov 20, 2024 by anaivebird. 2 of 4 tasks.
#2467 Error in convert_checkpoint in TensorRT-LLM 0.13.0 for Llama3.2 3B
Labels: Investigating; triaged (Issue has been triaged by maintainers). Opened Nov 19, 2024 by yspch2022.
#2466 Performance issue with batching
Labels: bug (Something isn't working); performance issue (Issue about performance number). Opened Nov 19, 2024 by ShuaiShao93. 1 of 4 tasks.
#2465 Upgrade transformers to 4.45.2
Labels: question (Further information is requested); triaged (Issue has been triaged by maintainers). Opened Nov 19, 2024 by cupertank.
#2463 Error when converting the DeepSeek-V2-Lite model
Labels: bug (Something isn't working); triaged (Issue has been triaged by maintainers). Opened Nov 19, 2024 by WhatGhost. 2 of 4 tasks.
#2462 How to make sure enable_kv_cache_reuse is working correctly?
Labels: bug (Something isn't working). Opened Nov 19, 2024 by chwma0. 2 of 4 tasks.
#2459 Is enableBlockReuse available for multimodal models?
Labels: question (Further information is requested); triaged (Issue has been triaged by maintainers). Opened Nov 19, 2024 by YSF-A.
#2458 Is it possible to load a quantized model from Hugging Face?
Labels: quantization (Issue about lower bit quantization, including int8, int4, fp8); question (Further information is requested); triaged (Issue has been triaged by maintainers). Opened Nov 19, 2024 by pei0033.
#2457 Windows C++ Executor on v0.10.0
Labels: bug (Something isn't working); triaged (Issue has been triaged by maintainers). Opened Nov 19, 2024 by rifkybujana. 2 of 4 tasks.
#2456 KV cache re-use impact on average sequence latency?
Labels: performance issue (Issue about performance number); question (Further information is requested); triaged (Issue has been triaged by maintainers). Opened Nov 18, 2024 by mkserge.