CycleVAE-VC implementation with PyTorch

This repository provides an UNOFFICIAL PyTorch implementation of CycleVAE-VC.

You can combine it with your own vocoder to get high-quality converted speech!

Source of the figure: https://arxiv.org/pdf/1907.10185.pdf

The goal of this repository is to provide a VC (voice conversion) model trained on completely non-parallel data. It also provides a many-to-many conversion model.

I modified the model from @patrickltobing's implementation as follows. The original model uses an autoregressive (AR) structure in its ConvRNN network, which takes quite a long time to train. I therefore used an RNN-based model to speed up training.

What's new

2020/06/11 [NEW!] Support ParallelWaveGAN in vocoder branch.
2020/06/02 Support one-to-one conversion model.

Requirements

This repository has been tested on Ubuntu 19.10 with an RTX 2080 Ti in the following environment.

- Python 3.7+
- CUDA 10.2
- cuDNN 7+

Setup

You can set up this repository with the following commands.

$ cd tools
$ make

Afterwards, check that the venv directory has been created under the tools directory.

Usage

Before training the model, place your wav files under a single directory. The scripts assume the following structure, with one subdirectory per speaker in both train and val:

wav
├── train
│   ├── jvs001
│   └── jvs002
└── val
    ├── jvs001
    └── jvs002
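
Before running the later stages, it can help to confirm that the wav directory matches the layout above. The following is a minimal sketch and not part of this repository: the helper names `list_speakers` and `check_layout` are hypothetical, and the check that train and val contain the same speakers is an assumption based on the example tree.

```python
import tempfile
from pathlib import Path


def list_speakers(wav_root):
    """Return {split: sorted speaker directory names} for 'train' and 'val'."""
    root = Path(wav_root)
    speakers = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing directory: {split_dir}")
        speakers[split] = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    return speakers


def check_layout(wav_root):
    """Return the speaker listing and the speakers found in only one split.

    Assumption: every speaker should appear in both train and val.
    """
    speakers = list_speakers(wav_root)
    mismatched = sorted(set(speakers["train"]) ^ set(speakers["val"]))
    return speakers, mismatched


if __name__ == "__main__":
    # Build a throwaway copy of the example tree and check it.
    with tempfile.TemporaryDirectory() as tmp:
        for split in ("train", "val"):
            for spk in ("jvs001", "jvs002"):
                (Path(tmp) / "wav" / split / spk).mkdir(parents=True)
        print(check_layout(Path(tmp) / "wav"))
```

An empty second return value means both splits list the same speakers.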

Step0: path

These scripts are not designed for servers that use Slurm.
If you are using Slurm or have several GPUs, you have to add the required environment variables to path.sh.
To set the environment variables and activate the virtual environment, run
```
. path.sh
```

Step1: set min/max f0

Run the following command to generate the figures:
```
. run.sh --stage 0
```
The figures are generated in the ./figure directory.
If you don't have a speaker config file in ./config/speaker, do the following:
1. Copy ./config/speaker/default.conf to ./config/speaker/<spk_name>.conf
2. Set the speaker-dependent variables there.

The structure of the config file is:
```
<minf0>
<maxf0>
<npow>
```
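
To illustrate the three-line layout, here is a minimal sketch of a parser for such a file. It is not the repository's own loader: the function name `parse_speaker_conf`, the assumption that each line holds a single number, and the sanity check on the F0 range are all illustrative, and the values in the demo are placeholders rather than recommended settings.

```python
from typing import NamedTuple


class SpeakerConf(NamedTuple):
    minf0: float  # lower F0 bound
    maxf0: float  # upper F0 bound
    npow: float   # power threshold


def parse_speaker_conf(text):
    """Parse a config of three lines: <minf0>, <maxf0>, <npow>.

    Assumption: one numeric value per line; blank lines are ignored.
    """
    values = [line.strip() for line in text.splitlines() if line.strip()]
    if len(values) != 3:
        raise ValueError(f"expected 3 values, found {len(values)}")
    minf0, maxf0, npow = (float(v) for v in values)
    if not 0 < minf0 < maxf0:
        raise ValueError(f"invalid F0 range: {minf0}..{maxf0}")
    return SpeakerConf(minf0, maxf0, npow)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(parse_speaker_conf("40\n700\n-25\n"))
```

Rejecting a file where minf0 is not below maxf0 catches the most likely editing mistake, swapping the first two lines.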

Step2: Feature extraction and model training

Run the following command to extract features and train the model:
```
. run.sh --stage 12
```
- Stage 1: Feature extraction
- Stage 2: Training

Flags in the training stage:
- conf_path: Path to the training config file. Default: ./config/vc.conf
- model_name: Name of the saved model. Checkpoints are saved as <model_name>.<num_iter>.pt.
- log_name: Directory where TensorBoard event files are saved.
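
Because checkpoints follow the <model_name>.<num_iter>.pt pattern, the most recent one can be selected by iteration count, for example when choosing a value for the checkpoint flag in the conversion stage. The sketch below is not part of this repository: `latest_checkpoint` and the directory passed to it are hypothetical, and it assumes <num_iter> is a plain integer.

```python
import re
import tempfile
from pathlib import Path


def latest_checkpoint(model_dir, model_name):
    """Return the path of <model_name>.<num_iter>.pt with the largest num_iter.

    Returns None when no matching checkpoint exists.
    Assumption: <num_iter> is a non-negative integer.
    """
    pattern = re.compile(re.escape(model_name) + r"\.(\d+)\.pt")
    best = None
    for path in Path(model_dir).iterdir():
        match = pattern.fullmatch(path.name)
        if match is None:
            continue
        num_iter = int(match.group(1))  # compare numerically, not as text
        if best is None or num_iter > best[0]:
            best = (num_iter, path)
    return best[1] if best else None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("vc.1000.pt", "vc.20000.pt", "vc.5000.pt"):
            (Path(tmp) / name).touch()
        print(latest_checkpoint(tmp, "vc").name)
```

Comparing the iteration counts as integers matters here: a plain string sort would rank vc.5000.pt above vc.20000.pt.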

Step3: Convert voice

Run the following command to convert voices:
```
. run.sh --stage 3
```
Flags in the conversion stage:
- test_dir: Directory where the source wav files are stored.
- exp_dir: Directory where the converted wav files are saved.
- checkpoint: Path to the trained model.
- log_name: Name of the log file.

Results

training steps
sounds
- Demo wav files were obtained from https://voice-statistics.github.io/
- Converted wav files are available in ./for_readme/wav

Features to be implemented in the future

Support gin-config

Acknowledgement

The author would like to thank Patrick Lumban Tobing for his repository.

Author

Someki Masao (@Masao-Someki)

e-mail : [email protected]
