Describe the issue
Can I run `python -m vllm.entrypoints.openai.api_server` to load MInference capabilities in vLLM?

I'm seconding this: vLLM is a self-hosted LLM inference engine, but it does support model serving over an OpenAI-compatible API, which is what @jueming0312 is asking about. If this is a goal of the project, I would suggest publishing a package that bundles the vLLM server code with the MInference patch.
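For reference, a minimal sketch of what that patch step could look like for an offline vLLM engine is below, following the `MInference("vllm", ...)` patching pattern from MInference's examples. The model name and engine arguments are illustrative placeholders; the OpenAI API server constructs its engine internally, so an equivalent patch would have to be applied inside that code path rather than from the command line.

```python
# Sketch only: patch a vLLM engine with MInference before generating.
# The MInference("vllm", ...) call follows the project's offline-inference
# example; the model name and engine arguments below are illustrative.
from vllm import LLM, SamplingParams
from minference import MInference

model_name = "gradientai/Llama-3-8B-Instruct-262k"  # example long-context model

# Build the vLLM engine as usual.
llm = LLM(model_name, max_num_seqs=1, enable_chunked_prefill=False)

# Apply the MInference patch to the engine's attention implementation.
minference_patch = MInference("vllm", model_name)
llm = minference_patch(llm)

# Offline generation goes through the patched engine. Serving via
# `python -m vllm.entrypoints.openai.api_server` would need the same patch
# applied where the server builds its engine, hence the suggestion above to
# bundle the server code with the MInference patch.
prompts = ["Summarize the following document: ..."]
outputs = llm.generate(prompts, SamplingParams(temperature=0.0, max_tokens=256))
print(outputs[0].outputs[0].text)
```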