Questions about the benchmark in the blog #1124

Closed. Answered by zhyncs.
jtmer asked this question in Q&A
# Launch each server (one at a time, since both bind a port and claim the GPU):
python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --enable-torch-compile --disable-radix-cache
python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3.1-8B-Instruct --disable-log-requests

# Run the benchmark against the corresponding backend:
python3 -m sglang.bench_serving --backend sglang --dataset-name random --num-prompts 6000 --random-input 256 --random-output 512
python3 -m sglang.bench_serving --backend vllm --dataset-name random --num-prompts 6000 --random-input 256 --random-output 512

Then compare the output throughput reported by the two benchmark runs.
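To compare the two runs without eyeballing the logs, you can pull the throughput figure out of each benchmark's summary. A minimal sketch follows; the exact label printed by sglang.bench_serving is an assumption here, so adjust the pattern to match what your version actually emits, and the numbers shown are placeholders, not real results.

```python
import re

# Hypothetical excerpts from each run's summary; substitute the real lines
# captured from your sglang.bench_serving output.
sglang_summary = "Output token throughput (tok/s): 12000.50"
vllm_summary = "Output token throughput (tok/s): 9500.25"

def output_throughput(summary: str) -> float:
    """Extract the output token throughput (tok/s) from a summary line.

    The label text is an assumption about the bench_serving output format.
    """
    match = re.search(r"Output token throughput \(tok/s\):\s*([\d.]+)", summary)
    if match is None:
        raise ValueError("throughput line not found in summary")
    return float(match.group(1))

sgl = output_throughput(sglang_summary)
vllm = output_throughput(vllm_summary)
print(f"sglang: {sgl:.2f} tok/s, vllm: {vllm:.2f} tok/s, ratio: {sgl / vllm:.2f}x")
```

Running each benchmark a few times and comparing medians gives a more stable ratio than a single run.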

Answer selected by jtmer