[V1] Feedback Thread #12568
Comments
👍 I have not done a proper benchmark, but V1 feels superior: higher throughput and lower latency (TTFT). I have encountered a possible higher memory consumption issue, but overall I am very pleased with the vLLM community's hard work on V1.
Does anyone know about this bug with n > 1? Thanks
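(For context: n is the number of completions sampled per prompt via vLLM's SamplingParams. A minimal sketch of the kind of call in question, with a placeholder model name:)

```python
from vllm import LLM, SamplingParams

# Request n > 1 completions for a single prompt (model is a placeholder).
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(n=2, temperature=0.8, max_tokens=32)

outputs = llm.generate(["The capital of France is"], params)
for completion in outputs[0].outputs:  # one entry per sampled completion
    print(completion.text)
```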
Logging is in progress. Current main has a lot more logging, and we will maintain compatibility with V0. Thanks!
Quick feedback [VLLM_USE_V1=1]:
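(VLLM_USE_V1=1 is the environment variable that opts into the V1 engine. A minimal sketch of enabling it, assuming the flag is read before the engine is constructed; the model name is a placeholder:)

```python
import os

# Opt into the V1 engine before importing vllm, since the flag is read
# when the engine is set up.
os.environ["VLLM_USE_V1"] = "1"

from vllm import LLM

llm = LLM(model="facebook/opt-125m")  # placeholder model
print(llm.generate(["Hello, my name is"])[0].outputs[0].text)
```

For the server, the equivalent is prefixing the launch command, e.g. `VLLM_USE_V1=1 vllm serve <model>`.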
Thanks, both are in progress.
Are logprobs outputs (and specifically prompt logprobs with echo=True) expected to be working with the current V1 (0.7.0)?
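(For reference, this is the kind of request being asked about, sketched against a vLLM OpenAI-compatible server; the URL, API key, and model name are placeholders:)

```python
from openai import OpenAI

# Assumes a vLLM OpenAI-compatible server is already running locally.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.completions.create(
    model="facebook/opt-125m",   # placeholder model
    prompt="The quick brown fox",
    max_tokens=8,
    logprobs=1,                  # per-token logprobs
    echo=True,                   # echo the prompt so prompt logprobs are returned too
)
print(resp.choices[0].logprobs.token_logprobs)
```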
Maybe there is a better place to discuss this, but the implementation for models that use more than one extra modality is quite non-intuitive.
Still in progress |
Please leave comments here about your usage of V1: does it work? Does it not work? Which features do you need in order to adopt it? Any bugs?
For bug reports, please file them separately and link the issues here.
For in-depth discussion, please feel free to join #sig-v1 in the vLLM Slack workspace.