Commit 049e123

ukoxyz authored and copybara-github committed
Supports dynamic number of samples in continuous batching.
Prefill is done once, followed by multiple inserts. An aux field stores finished samples, and postprocessing runs after all samples for a request have been received. Batching resources are released as each sample finishes.

Typically, num_live_batches should be set to num_slots // prefill_batch_size or larger.

Since prefill produces the first token, the method is required to provide a new function to resample initial tokens.

PiperOrigin-RevId: 671886877
Change-Id: Id4ddec1f99e8e13d755bbeb5343aab8ca12f688e
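The bookkeeping described in the message can be sketched roughly as follows. This is an illustrative sketch only, not the code in this commit: `PendingRequest`, `SlotPool`, `serve`, and `min_live_batches` are hypothetical names, decoding is stubbed out with a callback, and samples are processed sequentially rather than in a real batched decode loop.

```python
from dataclasses import dataclass, field


@dataclass
class PendingRequest:
    """Hypothetical per-request state; `finished` plays the role of the aux field."""
    num_samples: int
    finished: dict = field(default_factory=dict)

    def record(self, sample_idx, tokens):
        """Store one finished sample; return True once all samples have arrived."""
        self.finished[sample_idx] = tokens
        return len(self.finished) == self.num_samples


class SlotPool:
    """Hypothetical pool of decode slots shared by all requests."""

    def __init__(self, num_slots):
        self.free = list(range(num_slots))

    def acquire(self):
        return self.free.pop()

    def release(self, slot):
        self.free.append(slot)


def serve(request, pool, decode_sample, postprocess):
    # Assumes the single prefill has already run; each sample gets its own insert.
    slots = {i: pool.acquire() for i in range(request.num_samples)}
    for i, slot in slots.items():
        tokens = decode_sample(i)
        pool.release(slot)  # free the slot as soon as this sample is done
        if request.record(i, tokens):
            # Postprocess only after every sample of the request is in.
            ordered = [request.finished[k] for k in sorted(request.finished)]
            return postprocess(ordered)
    return None


def min_live_batches(num_slots, prefill_batch_size):
    # Lower bound suggested in the commit message.
    return num_slots // prefill_batch_size
```

In this sketch a request with more samples than free slots would fail in `acquire`; a real scheduler would presumably queue the remaining inserts instead.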
1 parent 138be5f commit 049e123

3 files changed: +187 -53 lines changed
