Commit

Slightly reduce sleep time when batching queries
This can give a small speedup for free, since batched queries usually
all arrive within 0.5s.

Signed-off-by: Haifeng Qian <[email protected]>
odelalleau authored and haifengqian committed Dec 3, 2024
1 parent 2f2b0c8 commit 3409659
Showing 1 changed file with 1 addition and 1 deletion.
@@ -193,7 +193,7 @@ def chat_completion(self, data):

MegatronGenerate.inputs.append(conversation)
MegatronGenerate.tasks.put(None) # The tasks queue is only used as a "counter"
-time.sleep(1)
+time.sleep(0.5) # process one batch every 0.5s
else:
queryid = 0
end_strings = ['<|endoftext|>', special_tokens['turn_start'], special_tokens['label_start']]
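
The batching idea behind this change can be sketched in isolation. The following is a minimal illustration, not the repository's implementation: `BatchCollector`, `submit`, `collect_batch`, and `window_s` are hypothetical names introduced here. It assumes the same general pattern the diff hints at: requests are appended to a shared list, a queue serves only as a counter, and the consumer sleeps for a fixed window so that requests arriving close together end up in one batch.

```python
import queue
import threading
import time


class BatchCollector:
    """Hypothetical sketch of sleep-window batching (not the repository's code)."""

    def __init__(self, window_s=0.5):
        self.window_s = window_s      # how long to wait for more requests
        self.inputs = []              # pending requests
        self.tasks = queue.Queue()    # used only as a counter of pending requests
        self.lock = threading.Lock()

    def submit(self, item):
        # Append the request and bump the counter atomically, so the list
        # length and the queue size never disagree while the lock is free.
        with self.lock:
            self.inputs.append(item)
            self.tasks.put(None)

    def collect_batch(self):
        self.tasks.get()              # block until at least one request exists
        time.sleep(self.window_s)     # give the rest of the batch time to arrive
        with self.lock:
            batch, self.inputs = self.inputs, []
            # One counter entry was consumed above; drain the others.
            for _ in range(len(batch) - 1):
                self.tasks.get_nowait()
        return batch


if __name__ == "__main__":
    collector = BatchCollector(window_s=0.5)
    for text in ("first", "second", "third"):
        collector.submit(text)
    start = time.monotonic()
    batch = collector.collect_batch()
    elapsed = time.monotonic() - start
    print(f"collected {len(batch)} requests after {elapsed:.2f}s")
```

Under these assumptions, every batch pays the full window as added latency, so halving the window halves that fixed cost as long as the requests of a batch still arrive within it.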
