Confusion about the ABX error rate #9

zhixhan · 2024-09-26T03:19:05Z

Thanks for your amazing work.

I evaluated the released xcodec model on the LibriSpeech test-clean set using the ABX error rate metric. I ran the evaluation with the continuous representations before RVQ and after RVQ, but got 9.9% and 13.2% for within-speaker ABX and across-speaker ABX respectively, which are much higher than the values reported in the paper. However, I get results consistent with the paper, 3.6% and 4.7%, for SpeechTokenizer when evaluating it in the same way.

Could you please give me some suggestions? Thank you so much!
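For readers unfamiliar with the metric, here is a minimal sketch of what an ABX error rate measures. It is a simplification under stated assumptions, not the evaluation tool used for the numbers above: real ABX toolkits typically compare frame sequences with DTW and average scores per context and speaker, whereas this sketch mean-pools each token and averages over all triples. The function names and the `items` layout are hypothetical.

```python
import math
from itertools import permutations

def mean_pool(frames):
    """Average a list of frame vectors into a single vector."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]

def angular_distance(u, v):
    """Angle between two non-zero vectors, in radians."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def abx_error(items, mode="within"):
    """Simplified ABX error over all valid (A, B, X) triples.

    items: list of dicts with keys 'label', 'speaker', 'frames'.
    A and X share a label, B has a different label, A and B share a speaker.
    mode='within': X has the same speaker as A and B.
    mode='across': X has a different speaker.
    """
    pooled = [mean_pool(it["frames"]) for it in items]
    errors, total = 0.0, 0
    for a, b, x in permutations(range(len(items)), 3):
        A, B, X = items[a], items[b], items[x]
        if A["label"] != X["label"] or B["label"] == X["label"]:
            continue
        if A["speaker"] != B["speaker"]:
            continue
        same_speaker = X["speaker"] == A["speaker"]
        if (mode == "within") != same_speaker:
            continue
        d_ax = angular_distance(pooled[a], pooled[x])
        d_bx = angular_distance(pooled[b], pooled[x])
        if d_ax > d_bx:
            errors += 1.0      # X judged closer to the wrong category
        elif d_ax == d_bx:
            errors += 0.5      # ties count as half an error
        total += 1
    return errors / total if total else float("nan")

# Toy data: label "a" lies near [1, 0], label "b" near [0, 1].
toy_items = [
    {"label": "a", "speaker": "s1", "frames": [[1.0, 0.0], [0.9, 0.1]]},
    {"label": "a", "speaker": "s1", "frames": [[1.0, 0.1]]},
    {"label": "b", "speaker": "s1", "frames": [[0.0, 1.0]]},
    {"label": "a", "speaker": "s2", "frames": [[0.9, 0.0]]},
    {"label": "b", "speaker": "s2", "frames": [[0.1, 1.0]]},
]
print(abx_error(toy_items, "within"), abx_error(toy_items, "across"))
```

A lower value means the representation separates the categories better; the across-speaker score is usually the harder of the two because X comes from a different speaker than A and B.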

zhenye234 · 2024-09-26T13:46:55Z

Could you please specify the version of the xcodec model?

zhixhan · 2024-09-29T02:29:33Z

> Could you please specify the version of the xcodec model?

Thank you for your reply. I tested with the model named xcodec_hubert_librispeech.

zhenye234 · 2024-09-29T14:45:36Z

Maybe you can try the continuous representation here

xcodec/models/soundstream_semantic.py

Line 114 in 60cf204

o_semantic = self.decoder_semantic(quantized_semantic )

zhixhan · 2024-09-30T03:01:09Z

> Maybe you can try the continuous representation here
>
> xcodec/models/soundstream_semantic.py
>
> Line 114 in 60cf204
>
> o_semantic = self.decoder_semantic(quantized_semantic )

Thank you for your reply! I tested the XCodec model with the o_semantic representation and got ABX error rates of 4.4% and 5.5%, which still differ slightly from the results reported in your paper (3.3% and 4.3%).

When I extracted the o_semantic representation with the SoundStream.forward method, I got the error "e_acoustic and e_semantic have different shape in dim2" at https://github.com/zhenye234/xcodec/blob/main/models/soundstream_semantic.py#L102. I therefore added the same pad operation as in the encode method. I don't think this is the cause of the inconsistent results, and I didn't make any other changes to the source code. Do you have any other suggestions? Thanks again for your reply.
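One way to see how a mismatch along dim 2 can arise is to count frames on each branch. The sketch below is only an illustration under assumptions: it uses the default HuBERT-base convolutional front-end layout (kernels and strides as in the Hugging Face `HubertConfig` defaults), a hypothetical acoustic encoder that emits `ceil(num_samples / 320)` frames, and a hypothetical padding of 160 samples per side. None of these values are confirmed here for the xcodec checkpoint.

```python
def conv_stack_frames(num_samples, kernels, strides):
    """Output length of a stack of unpadded 1-D convolutions."""
    length = num_samples
    for k, s in zip(kernels, strides):
        length = (length - k) // s + 1
    return length

# HuBERT-base feature-extractor layout (assumed to match the semantic model).
HUBERT_KERNELS = (10, 3, 3, 3, 3, 2, 2)
HUBERT_STRIDES = (5, 2, 2, 2, 2, 2, 2)

def semantic_frames(num_samples, pad=0):
    """Frames from the semantic branch after padding `pad` samples per side."""
    return conv_stack_frames(num_samples + 2 * pad, HUBERT_KERNELS, HUBERT_STRIDES)

def acoustic_frames(num_samples, hop=320):
    """Hypothetical acoustic branch: ceil(num_samples / hop) frames."""
    return -(-num_samples // hop)

n = 16000  # one second at 16 kHz
print(semantic_frames(n), semantic_frames(n, pad=160), acoustic_frames(n))
```

Under these assumptions the unpadded semantic branch is one frame short of the acoustic branch for a one-second input, and padding the waveform restores equal lengths, so a pad applied in one code path but not the other would be enough to trigger the shape error.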
