tflite
```bash
# --bs is the batch size
# --beam-width > 0 enables beam search
tensorflow_asr tflite \
    --config-path=/path/to/config.yml.j2 \
    --h5=/path/to/weight.h5 \
    --bs=1 \
    --beam-width=0 \
    --output=/path/to/output.tflite

## See other params
tensorflow_asr tflite --help
```

The input of each TFLite model depends on the model's parameters and configs.
The `inputs`, `inputs_length` and `previous_tokens` are the same as shown below for all models:
```python
schemas.PredictInput(
    inputs=tf.TensorSpec([batch_size, None], dtype=tf.float32),
    inputs_length=tf.TensorSpec([batch_size], dtype=tf.int32),
    previous_tokens=tf.TensorSpec.from_tensor(self.get_initial_tokens(batch_size)),
    previous_encoder_states=tf.TensorSpec.from_tensor(self.get_initial_encoder_states(batch_size)),
    previous_decoder_states=tf.TensorSpec.from_tensor(self.get_initial_decoder_states(batch_size)),
)
```

For models that don't have encoder states or decoder states, the default values for `previous_encoder_states` and `previous_decoder_states` are `tf.zeros([], dtype=self.dtype)` tensors. This is only for TFLite conversion, because TFLite does not allow `None` values in `input_signature`. However, the outputs `next_encoder_states` and `next_decoder_states` are still `None`, so we can simply ignore them.
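For instance, once a model has been converted, you can inspect the exact shapes and dtypes it expects (a minimal sketch; the model path is a placeholder for your own output file):

```python
import tensorflow as tf

# Load the converted model; replace the path with your own --output file
interpreter = tf.lite.Interpreter(model_path="/path/to/output.tflite")
interpreter.allocate_tensors()

# Print each input tensor's name, shape and dtype
for detail in interpreter.get_input_details():
    print(detail["name"], detail["shape"], detail["dtype"])
```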
```python
schemas.PredictOutputWithTranscript(
    transcript=self.tokenizer.detokenize(outputs.tokens),
    tokens=outputs.tokens,
    next_tokens=outputs.next_tokens,
    next_encoder_states=outputs.next_encoder_states,
    next_decoder_states=outputs.next_decoder_states,
)
```

This is for supporting streaming inference.
Each output corresponds to one input chunk of the audio signal.
For the next chunk, we overwrite `previous_tokens`, `previous_encoder_states` and `previous_decoder_states` with `next_tokens`, `next_encoder_states` and `next_decoder_states`, and continue until the end of the audio signal, as sketched below.
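As a rough illustration of that loop (a minimal sketch, not the project's actual helper code: the input/output tensor names mirror the schemas above, `audio_chunks` is an assumed iterable of signal chunks, and the initial state shapes are placeholders):

```python
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="/path/to/output.tflite")
runner = interpreter.get_signature_runner()  # uses the model's only signature

# Placeholder initial values; the real shapes and dtypes are model-dependent,
# so in practice read them from interpreter.get_input_details()
previous_tokens = np.zeros([1, 1], dtype=np.int32)
previous_encoder_states = np.zeros([], dtype=np.float32)
previous_decoder_states = np.zeros([], dtype=np.float32)

for chunk in audio_chunks:  # each chunk: float32 signal of shape [1, chunk_length]
    outputs = runner(
        inputs=chunk,
        inputs_length=np.array([chunk.shape[1]], dtype=np.int32),
        previous_tokens=previous_tokens,
        previous_encoder_states=previous_encoder_states,
        previous_decoder_states=previous_decoder_states,
    )
    print(outputs["transcript"])  # partial transcript for this chunk
    # Carry the states over to the next chunk
    previous_tokens = outputs["next_tokens"]
    previous_encoder_states = outputs["next_encoder_states"]
    previous_decoder_states = outputs["next_decoder_states"]
```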
See `examples/inferences/tflite.py` for more details.