Memory management for large batch sizes #187

Closed Answered by hnyls2002
pj-ml asked this question in Q&A
@pj-ml, could you try lowering the value of --mem-fraction-static? As the batch size increases, the backend may need more GPU memory for temporary allocations, so reserving a smaller static fraction leaves more room for them.

https://github.com/sgl-project/sglang/?tab=readme-ov-file#additional-arguments
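
To make the trade-off concrete, here is a minimal sketch, not taken from the sglang code base. `static_split` and `build_launch_cmd` are hypothetical helpers, the 80 GiB card is an assumed example, and the split treats the fraction as a plain share of total GPU memory, which simplifies whatever accounting the backend actually does. The only sglang-specific parts are the `sglang.launch_server` module and the `--model-path`, `--port`, and `--mem-fraction-static` flags described in the README linked above.

```python
import sys


def static_split(total_gib: float, mem_fraction_static: float) -> tuple[float, float]:
    """Return (static_gib, headroom_gib) for a given static fraction.

    Simplified illustration only: the fraction is applied to total GPU
    memory and framework overhead is ignored.
    """
    if not 0.0 < mem_fraction_static < 1.0:
        raise ValueError("mem_fraction_static must be between 0 and 1")
    static_gib = total_gib * mem_fraction_static
    return static_gib, total_gib - static_gib


def build_launch_cmd(model_path: str, mem_fraction_static: float, port: int = 30000) -> list[str]:
    """Build an argv list for the sglang server launcher (hypothetical helper)."""
    return [
        sys.executable, "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--port", str(port),
        "--mem-fraction-static", str(mem_fraction_static),
    ]


if __name__ == "__main__":
    # Assumed 80 GiB card: a lower fraction leaves more headroom for temporary buffers.
    for frac in (0.9, 0.8, 0.7):
        static_gib, headroom_gib = static_split(80.0, frac)
        print(f"fraction={frac}: static={static_gib:.1f} GiB, headroom={headroom_gib:.1f} GiB")
    # Placeholder model path; pass the list to subprocess.run to start the server.
    print(" ".join(build_launch_cmd("<your-model-path>", 0.7)))
```

A reasonable approach is to step the fraction down in small increments until the out-of-memory error disappears at the target batch size, since a lower value also shrinks the memory available for static allocations.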

Answer selected by pj-ml