This repository has been archived by the owner on Oct 13, 2022. It is now read-only.

Bug in computing encoder padding mask #240

Open
csukuangfj opened this issue Aug 2, 2021 · 1 comment

Comments

@csukuangfj (Collaborator) commented Aug 2, 2021

The bug happens only when --concatenate-cuts=True.

See the problematic code below (line 692):

```python
for idx in range(supervision_segments.size(0)):
    # Note: TorchScript doesn't allow to unpack tensors as tuples
    sequence_idx = supervision_segments[idx, 0].item()
    start_frame = supervision_segments[idx, 1].item()
    num_frames = supervision_segments[idx, 2].item()
    lengths[sequence_idx] = start_frame + num_frames
```

When --concatenate-cuts=True, several utterances may be concatenated into one sequence, so lengths[sequence_idx] may correspond to multiple utterances. If the sequence with that sequence_idx contains at least two utterances, later utterances overwrite the value of lengths[sequence_idx] set by earlier ones, so the final value is wrong whenever the last-processed utterance of a sequence is not the one that ends latest.
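
To make the overwrite concrete, here is a minimal, self-contained sketch; the tensor values and the max-based fix are my own illustration under the assumption that a sequence's length should be the largest end frame (start_frame + num_frames) over its supervision segments, not code from this repository:

```python
import torch

# Columns: sequence_idx, start_frame, num_frames.
# One sequence (sequence_idx == 0) built from two concatenated utterances.
# The row for the later-ending utterance is processed first, so the buggy
# loop ends up keeping the shorter end frame.
supervision_segments = torch.tensor(
    [
        [0, 80, 100],  # second utterance: frames 80..180
        [0, 0, 80],    # first utterance: frames 0..80
    ],
    dtype=torch.int32,
)

# Buggy version: each row overwrites the previous value for its sequence.
lengths = torch.zeros(1, dtype=torch.int64)
for idx in range(supervision_segments.size(0)):
    sequence_idx = supervision_segments[idx, 0].item()
    start_frame = supervision_segments[idx, 1].item()
    num_frames = supervision_segments[idx, 2].item()
    lengths[sequence_idx] = start_frame + num_frames
print(lengths)  # tensor([80]) -- but the sequence actually has 180 frames

# One possible fix: keep the maximum end frame seen for each sequence.
lengths = torch.zeros(1, dtype=torch.int64)
for idx in range(supervision_segments.size(0)):
    sequence_idx = supervision_segments[idx, 0].item()
    start_frame = supervision_segments[idx, 1].item()
    num_frames = supervision_segments[idx, 2].item()
    lengths[sequence_idx] = max(lengths[sequence_idx].item(),
                                start_frame + num_frames)
print(lengths)  # tensor([180])
```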

@csukuangfj (Collaborator, Author) commented

I found this bug while writing tests for encoder_padding_mask. Liyong and I disabled --concatenate-cuts during training,
so it is not a problem for us.
