Predict on CPU #15

Open
patriotyk opened this issue Aug 8, 2023 · 5 comments

Comments

@patriotyk

As I understand from the code, CUDA support is hardcoded. So I changed the device to cpu and replaced model.cuda() with model.cpu(). But when I run predict, I get a strange error:

RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (256, 256) at dimension 2 of input [1, 2, 160]

I don't know whether this is a problem with the CPU or something else.

@m-mandel
Collaborator

m-mandel commented Aug 8, 2023

Hi,
Thank you for trying our code!
This might be because of the dimensions of your input. If I recall correctly, we assume that the wav file the path directs to is a single-channel (mono) wav file. Could it be that your input is stereo instead of mono?

Also, the audio file should not be too short. I think it should be at least 1 second long.
Let me know if this helps, and share anything else that would help me reproduce the bug myself.

Best,
M
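
The two conditions mentioned above (mono input, at least about 1 second long) can be checked before running predict. The following is a minimal sketch using only Python's standard `wave` module; the function name `check_wav` and the `min_seconds` default are illustrative assumptions and are not part of the repository, and the `wave` module only reads PCM-encoded WAV files.

```python
import wave


def check_wav(path, min_seconds=1.0):
    """Return a list of problems found in the WAV file (empty if none).

    Hypothetical helper: the 1-second threshold follows the rough guidance
    in this thread, not a documented limit of the model.
    """
    problems = []
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()    # 1 = mono, 2 = stereo
        sample_rate = wf.getframerate()  # samples per second
        n_frames = wf.getnframes()       # samples per channel
    duration = n_frames / sample_rate
    if channels != 1:
        problems.append(f"expected mono, got {channels} channels")
    if duration < min_seconds:
        problems.append(
            f"expected at least {min_seconds:.1f} s, got {duration:.2f} s"
        )
    return problems
```

An empty list means the file passes both checks; otherwise each entry describes one issue to fix, for example by downmixing to mono in an audio editor before prediction.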

@patriotyk
Author

Yes, you are right, my input was stereo, thank you. Now it works, but the output is much worse than the original.

@m-mandel
Collaborator

Which ckpt were you using? What are the source and target sample rates?

@patriotyk
Author

I use this checkpoint https://drive.google.com/drive/folders/1JK9VqgfQsWEPOFUkp9Y5OR62G9i3disf

The source is 12 kHz and the output file is generated at 16 kHz. As I understand it, I ran the wrong command:

python predict.py dset=4-16 experiment=aero_4-16_512_256 

but it should be

python predict.py dset=12-48 experiment=aero_12-48_512_256

but this repository only has 4-16 experiments and no 12-48 experiment files. Did you forget to add them? Or should I create them manually?

@m-mandel
Collaborator

Yes, you are right - you need to modify the configuration file. If I recall correctly, the only things you need to change are the sampling rates.
From:

lr_sr: 4000 # low resolution sample rate, added to support BWE. Should be included in training cfg
hr_sr: 16000 # high resolution sample rate. Should be included in training cfg

to:

lr_sr: 12000 # low resolution sample rate, added to support BWE. Should be included in training cfg
hr_sr: 48000 # high resolution sample rate. Should be included in training cfg
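
The mismatch discussed in this thread (a 12 kHz source run with a 4-16 configuration) can be caught by comparing the input file's sample rate with `lr_sr` before predicting. This is a minimal sketch under stated assumptions: `check_sample_rates` is a hypothetical helper, not a function from the repository, and it reads the rate with Python's standard `wave` module, which only handles PCM-encoded WAV files.

```python
import wave


def check_sample_rates(wav_path, lr_sr, hr_sr):
    """Check a WAV file against the lr_sr / hr_sr config values.

    Hypothetical helper: raises ValueError on a mismatch, otherwise
    returns the upsampling ratio hr_sr / lr_sr.
    """
    with wave.open(wav_path, "rb") as wf:
        file_sr = wf.getframerate()
    if file_sr != lr_sr:
        raise ValueError(
            f"input file is {file_sr} Hz but the config has lr_sr={lr_sr}"
        )
    if hr_sr <= lr_sr:
        raise ValueError(
            f"hr_sr ({hr_sr}) should be greater than lr_sr ({lr_sr})"
        )
    return hr_sr / lr_sr
```

For a 12 kHz input, calling it with `lr_sr=12000, hr_sr=48000` returns the ratio, while calling it with the 4-16 values (`lr_sr=4000, hr_sr=16000`) raises an error instead of silently producing degraded output.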
