What is data parallelism and how can I use it to speed up my processing? #756

Closed Answered by Ying1123
w013nad asked this question in Q&A
The workload in your benchmark scripts is too lightweight, and the scripts have bottlenecks that prevent them from correctly measuring the time cost of large batch sizes. I suggest using the built-in benchmark scripts in sglang instead. To show the ideal speedup from data parallelism, the workload must be heavy enough to fully saturate the server.

You can try this benchmark command.

python3 -m sglang.bench_serving --backend sglang --num-p 500 --disable-stream --dataset-name random --random-input 4096 --random-output 256 --random-range-ratio 1.0 --host localhost --port 30000
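To get a feel for how heavy this workload is, here is a rough back-of-the-envelope sketch in Python. It assumes that a range ratio of 1.0 means every request uses the full requested input and output lengths; the helper name `total_tokens` is made up for illustration and is not part of sglang.

```python
def total_tokens(num_prompts: int, input_len: int, output_len: int) -> dict:
    """Token counts for a workload where every request has fixed lengths.

    Assumption: with a range ratio of 1.0, each request uses exactly
    input_len prompt tokens and output_len generated tokens.
    """
    prompt_tokens = num_prompts * input_len
    generated_tokens = num_prompts * output_len
    return {
        "input": prompt_tokens,
        "output": generated_tokens,
        "total": prompt_tokens + generated_tokens,
    }


# Values taken from the benchmark command above.
workload = total_tokens(num_prompts=500, input_len=4096, output_len=256)
print(workload)
```

Under that assumption the run pushes roughly 2 million prompt tokens through the server, which is the kind of load needed before extra data-parallel replicas have anything to do.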

With tp=1, dp=1 (python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3-8B), I got

Request throughput…
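Once request throughput has been measured for each configuration, the ideal case for data parallelism is that throughput grows linearly with dp. A minimal sketch for comparing a measured run against that ideal follows; the function name and the numbers are placeholders for illustration, not results from this thread.

```python
def scaling_report(baseline_rps: float, measured_rps: float, dp: int) -> tuple:
    """Compare a dp>1 run against the dp=1 baseline.

    Returns (speedup, efficiency), where efficiency is 1.0 when the
    measured throughput is exactly dp times the baseline.
    """
    if baseline_rps <= 0 or dp < 1:
        raise ValueError("baseline_rps must be positive and dp must be >= 1")
    speedup = measured_rps / baseline_rps
    efficiency = speedup / dp
    return speedup, efficiency


# Placeholder throughputs in requests per second (hypothetical values).
speedup, efficiency = scaling_report(baseline_rps=10.0, measured_rps=18.0, dp=2)
print(f"speedup: {speedup:.2f}x, scaling efficiency: {efficiency:.0%}")
```

An efficiency well below 100% on a light workload usually means the server was not saturated, which is the situation described above.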

Answer selected by merrymercy