Supports dynamic number of samples in continuous batching.
Prefill is done once, followed by multiple inserts.
Uses an aux field to store finished samples, and performs postprocessing once all samples for a request have been received.
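A minimal sketch of the bookkeeping described above, not the code in this change. `RequestAux`, `add_finished`, and `postprocess` are hypothetical names, and the token lists stand in for whatever a finished sample actually holds.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RequestAux:
    """Hypothetical per-request aux state; the real field layout may differ."""
    num_samples: int
    # Finished samples keyed by sample index, since they may finish out of order.
    finished: Dict[int, List[int]] = field(default_factory=dict)

    def add_finished(self, sample_idx: int, tokens: List[int]) -> bool:
        """Stores one finished sample; returns True once all samples are in."""
        self.finished[sample_idx] = tokens
        return len(self.finished) == self.num_samples


def postprocess(aux: RequestAux) -> List[List[int]]:
    """Runs only after every sample has arrived; here it just restores order."""
    return [aux.finished[i] for i in range(aux.num_samples)]
```

Keying by sample index keeps the final result ordered even when samples complete at different decode steps.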
Releases batching resources after each sample is done. Typically, num_live_batches should be set to num_slots // prefill_batch_size or larger.
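The sizing rule of thumb above, written out as arithmetic. The helper name `min_live_batches` is hypothetical, and clamping the result to at least 1 is an assumption added here for the case where num_slots is smaller than prefill_batch_size.

```python
def min_live_batches(num_slots: int, prefill_batch_size: int) -> int:
    """Returns the suggested lower bound for num_live_batches."""
    if num_slots <= 0 or prefill_batch_size <= 0:
        raise ValueError("num_slots and prefill_batch_size must be positive")
    # Floor division, as in the rule of thumb; the max(1, ...) clamp is an
    # assumption of this sketch, not something stated in the change.
    return max(1, num_slots // prefill_batch_size)
```

For example, 64 slots with a prefill batch size of 4 gives a suggested minimum of 16 live batches.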
Since prefill produces the first token, the method must provide a new function to resample initial tokens.
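One way such a resampling function could look, as a sketch only. It assumes the first-position logits from prefill are available and that each sample draws its own first token by temperature sampling; `resample_initial_tokens` and its parameters are hypothetical and do not reflect the actual interface required by this change.

```python
import math
import random
from typing import List, Sequence


def resample_initial_tokens(
    first_logits: Sequence[float],
    num_samples: int,
    temperature: float = 1.0,
    seed: int = 0,
) -> List[int]:
    """Draws one first-token id per sample from the prefill logits."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = random.Random(seed)
    scaled = [x / temperature for x in first_logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    # random.choices normalizes the weights and samples with replacement.
    return rng.choices(range(len(weights)), weights=weights, k=num_samples)
```

Sampling from the shared prefill logits lets several samples start from different first tokens without rerunning prefill.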
PiperOrigin-RevId: 671886877
Change-Id: Id4ddec1f99e8e13d755bbeb5343aab8ca12f688e