Predict on CPU #15

patriotyk · 2023-08-08T14:59:39Z

As I understand from the code, CUDA support is hardcoded. So I changed the device to cpu and replaced model.cuda() with model.cpu(). But when I run predict I get a strange error:

RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (256, 256) at dimension 2 of input [1, 2, 160]

I don't know whether this is a problem with the CPU or something else.

m-mandel · 2023-08-08T20:53:26Z

Hi,
Thank you for trying our code!
This might be because of the dimensions of your input. If I recall correctly, we assume that the wav file the path directs to is a single-channel (mono) wav file. Could it be that your input is stereo instead of mono?

Also, the audio file should not be too short. I think it should be at least 1 second long.
Let me know if this helps, and if there is anything else that would help me reproduce the bug myself.
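
A quick way to check both conditions before running prediction is sketched below. It uses only Python's standard `wave` module. The helper name `check_wav` is hypothetical and not part of this repository, and the 1-second threshold is just the rough guess from above, not a documented limit.

```python
import wave


def check_wav(path, min_seconds=1.0):
    """Return (ok, reason) for a PCM WAV file.

    Hypothetical helper: min_seconds=1.0 is an assumed threshold,
    not a limit documented by the repository.
    """
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        # Duration in seconds = number of frames / frames per second.
        duration = wf.getnframes() / wf.getframerate()
    if channels != 1:
        return False, f"expected mono, got {channels} channels"
    if duration < min_seconds:
        return False, f"too short: {duration:.2f}s"
    return True, "ok"
```

If the check reports more than one channel, the file would need to be downmixed to mono with an audio tool of your choice before it is passed to predict.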

Best,
M

patriotyk · 2023-08-09T06:59:57Z

Yes, you are right, my input was stereo, thank you. Now it works, but the output is much worse than the original.

m-mandel · 2023-08-10T05:27:47Z

Which checkpoint were you using? What are the source and target sample rates?

patriotyk · 2023-08-10T09:04:38Z

I use this checkpoint https://drive.google.com/drive/folders/1JK9VqgfQsWEPOFUkp9Y5OR62G9i3disf

The source is 12 kHz and the output file was generated at 16 kHz. As I understand it, I ran the wrong command:

python predict.py dset=4-16 experiment=aero_4-16_512_256

but it should be

python predict.py dset=12-48 experiment=aero_12-48_512_256

But this repository only has 4-16 experiments, and no 12-48 experiment files. Did you forget to add them? Or should I create them manually?

m-mandel · 2023-08-10T17:25:41Z

Yes, you are right - you need to modify the configuration file. If I recall correctly, the only things you need to change are the sampling rates.
From:

lr_sr: 4000 # low resolution sample rate, added to support BWE. Should be included in training cfg
hr_sr: 16000 # high resolution sample rate. Should be included in training cfg

to:

lr_sr: 12000 # low resolution sample rate, added to support BWE. Should be included in training cfg
hr_sr: 48000 # high resolution sample rate. Should be included in training cfg
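
Both configurations upsample by the same factor (16000 / 4000 = 48000 / 12000 = 4), so feeding a 12 kHz file to the 4-16 setup is an easy mistake to miss. A minimal sketch of a sanity check follows, using only the standard `wave` module. `check_sample_rate` is a hypothetical helper, not something in this repository; only the `lr_sr` and `hr_sr` names come from the config above.

```python
import wave


def check_sample_rate(path, lr_sr, hr_sr):
    """Verify a WAV file matches the configured low-resolution rate.

    Hypothetical helper. Raises ValueError on a mismatch, otherwise
    returns the upsampling factor hr_sr / lr_sr.
    """
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
    if rate != lr_sr:
        raise ValueError(f"input is {rate} Hz but lr_sr is {lr_sr} Hz")
    return hr_sr / lr_sr
```

Called on a 12 kHz file with `lr_sr=4000`, this raises instead of silently producing output at the wrong rate.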
