Docker Image #217
Closed
ArrichM started this conversation in Show and tell
Replies: 2 comments
-
Hello @ArrichM Thanks in advance!
-
Done https://github.com/sgl-project/sglang?tab=readme-ov-file#method-3-using-docker
-
Thank you so much for the great work you have done with sglang; it has so far been a really great experience for our use cases.
I created a Docker image to run an sglang server with flashinfer installed. The image is built on top of the official vllm image. To run it, use:
docker run --runtime nvidia --gpus all -v ~/.cache/huggingface:/root/.cache/huggingface -p 8000:8000 --ipc=host arrichm/sglang:latest --model-path mistralai/Mistral-7B-v0.1 --host 0.0.0.0
Here:
--model-path: specifies the model to serve.
--host 0.0.0.0: makes the endpoint accessible from outside the container.
If it is not already installed, you need to install the NVIDIA Container Toolkit on the host before running.
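Before starting the sglang container, it can help to confirm the NVIDIA Container Toolkit is working. A minimal sanity check, assuming `docker` and an NVIDIA driver are already set up on the host, is to run `nvidia-smi` inside a throwaway container:

```shell
# Should print the host's GPU table from inside a container;
# if this fails, the NVIDIA Container Toolkit is not set up correctly.
docker run --rm --runtime nvidia --gpus all ubuntu nvidia-smi
```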
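Once the container is up, you can sanity-check the server from the host. This is a sketch that assumes the image exposes sglang's native /generate endpoint on the mapped port 8000; adjust the path and payload if your sglang version differs:

```shell
# Send a simple completion request to the running sglang server.
curl -s http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{
        "text": "The capital of France is",
        "sampling_params": {"max_new_tokens": 16, "temperature": 0}
      }'
```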