Phenomenon description: In some cases, code completion lags and returns no completion for a long time, sometimes more than 10 seconds.
Known information: CPU usage of the llama-server process can reach 100%, while average GPU utilization is around 20% and peaks at no more than 40%.
Machine configuration: Single GPU, NVIDIA V100*8, 64C/256G
Are there any deployment configuration options that would improve performance?
Thanks!