Improved Install Instructions #642

Closed
wants to merge 8 commits into from
141 changes: 141 additions & 0 deletions INSTALL.md
@@ -0,0 +1,141 @@
# Installation and Setup Guide for Real Time Voice Cloning

## Windows (Partially tested)

### 1. Prepare prerequisites
* Install [Python](https://www.python.org) 3.7. Any other version will not work.
* Install [PyTorch](https://pytorch.org/get-started/locally/) (>=1.0.1).
* Install [ffmpeg](https://ffmpeg.org/download.html#get-packages).
* (Highly recommended, as RTVC uses outdated dependencies) Set up a virtual environment with [`venv`](https://docs.python.org/3/library/venv.html) by running `python -m venv .venv`.
* Activate the virtual environment with `.venv\Scripts\activate`.
* Note that if you installed the dependencies inside this virtual environment, you must activate it again before running anything in the toolbox; otherwise Python will report missing dependencies.
* Run `pip install -r requirements.txt` to install the remaining necessary packages. A sketch of the whole sequence follows this list.
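
Assuming the `.venv` name used above and a standard Windows shell, the full sequence is roughly this sketch (PyTorch and ffmpeg are installed separately as described):

```
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
```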

### 2. Download Pretrained Models
Download the latest [here](https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Pretrained-models).
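
The wiki page above is the authoritative source. As a convenience sketch, the same archive that the Ubuntu instructions below fetch can also be downloaded from the command line (assuming `curl`, which ships with recent Windows 10 releases) and then extracted into the repository root:

```
curl -L "https://www.dropbox.com/s/5udq50bkpw2hipy/pretrained.zip?dl=1" -o pretrained.zip
```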

### 3. (Optional) Test Configuration
Before you download any dataset, you can begin by testing your configuration with:

`python demo_cli.py`

If all tests pass, you're good to go.

### 4. (Optional) Download Datasets
For playing with the toolbox alone, I only recommend downloading [`LibriSpeech/train-clean-100`](https://www.openslr.org/resources/12/train-clean-100.tar.gz). Extract the contents as `<datasets_root>/LibriSpeech/train-clean-100` where `<datasets_root>` is a directory of your choosing. Other datasets are supported in the toolbox, see [here](https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Training#datasets). You're free not to download any dataset, but then you will need your own data as audio files or you will have to record it with the toolbox.
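
If you use `LibriSpeech/train-clean-100`, the resulting layout should look roughly like the sketch below (the speaker and chapter IDs are illustrative):

```
<datasets_root>/
  LibriSpeech/
    train-clean-100/
      19/
        198/
          19-198-0000.flac
          ...
```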

### 5. Launch the Toolbox
You can then try the toolbox:

`python demo_toolbox.py -d <datasets_root>`
or
`python demo_toolbox.py`

depending on whether you downloaded any datasets. If you are running an X-server or if you have the error `Aborted (core dumped)`, see [this issue](https://github.com/CorentinJ/Real-Time-Voice-Cloning/issues/11#issuecomment-504733590).

### 6. (Optional) Enable GPU Support
Note: Enabling GPU support is a lot of work. You will want to set this up if you are going to train your own models. Somebody took the time to make [a better guide](https://poorlydocumented.com/2019/11/installing-corentinjs-real-time-voice-cloning-project-on-windows-10-from-scratch/) on how to install everything. I recommend using it.

This command installs additional GPU dependencies and recommended packages: `pip install -r requirements_gpu.txt`

Additionally, you will need to ensure GPU drivers are properly installed and that your CUDA version matches your PyTorch and Tensorflow installations.
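
Once drivers and a CUDA-enabled PyTorch build are installed, a quick sanity check (just a sketch) is to ask PyTorch whether it can see the GPU:

```
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
```

`True` plus a CUDA version string means the GPU is usable; `False` usually points to a driver, CUDA, or PyTorch version mismatch.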

## Ubuntu 20.04 install instructions (tested)
### 1. Add repositories
```
sudo add-apt-repository universe
sudo add-apt-repository ppa:deadsnakes/ppa
```

### 2. Install software
```
snap install ffmpeg
sudo apt install python3.6 python3.6-dev python3 python3-pip git
pip3 install virtualenv
```
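
Before continuing, it may be worth confirming that the interpreter and tools above are actually on the PATH (an optional check):

```
python3.6 --version
ffmpeg -version
pip3 --version
```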

Additional steps are needed to work around bugs with PortAudio and Qt:

* PortAudio bugfix: https://stackoverflow.com/a/60824906

```
sudo apt install libasound2-dev
git clone -b alsapatch https://github.com/gglockner/portaudio
cd portaudio
./configure && make
sudo make install
sudo ldconfig
cd ..
```

* Qt bugfix: https://askubuntu.com/a/1069502

```
sudo apt install --reinstall libxcb-xinerama0
```

### 3. Make a virtual environment and activate it
```
~/.local/bin/virtualenv --python python3.6 rtvc
source rtvc/bin/activate
```

### 4. Download RTVC
```
git clone --depth 1 https://github.com/CorentinJ/Real-Time-Voice-Cloning.git
```

### 5. Install requirements
```
cd Real-Time-Voice-Cloning
pip install torch
pip install -r requirements.txt
pip install webrtcvad
```

### 6. Get pretrained models
```
wget https://www.dropbox.com/s/5udq50bkpw2hipy/pretrained.zip?dl=1 -O pretrained.zip
unzip pretrained.zip
```
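
With the models unpacked, the CLI demo described in the Windows section doubles as an optional smoke test here as well:

```
python demo_cli.py
```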

### 7. Launch toolbox
```
python demo_toolbox.py
```

## macOS (Untested)
### 1. Install prerequisites
- [Python 3.7](https://www.python.org/downloads/mac-osx/)
- homebrew with `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"`
- ffmpeg with `brew install ffmpeg`
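
### 2. Install virtualenv
The later steps assume `virtualenv` is available under `~/.local/bin`, which the prerequisites above do not provide. One way to get it (an assumption, mirroring the Ubuntu instructions) is:

```
pip3 install --user virtualenv
```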

### 3. Make a virtual environment and activate it
```
~/.local/bin/virtualenv --python python3.7 rtvc
source rtvc/bin/activate
```

### 4. Download RTVC
```
git clone --depth 1 https://github.com/CorentinJ/Real-Time-Voice-Cloning.git
```

### 5. Install requirements
```
cd Real-Time-Voice-Cloning
pip install torch
pip install -r requirements.txt
pip install webrtcvad
```

### 6. Get pretrained models
```
curl -L "https://www.dropbox.com/s/5udq50bkpw2hipy/pretrained.zip?dl=1" -o pretrained.zip
unzip pretrained.zip
```

### 7. Launch toolbox
```
python demo_toolbox.py
```
40 changes: 2 additions & 38 deletions README.md
@@ -30,41 +30,5 @@ SV2TTS is a three-stage deep learning framework that allows to create a numerica
**25/06/19:** Experimental support for low-memory GPUs (~2gb) added for the synthesizer. Pass `--low_mem` to `demo_cli.py` or `demo_toolbox.py` to enable it. It adds a big overhead, so it's not recommended if you have enough VRAM.


## Setup

### 1. Install Requirements

**Python 3.6 or 3.7** is needed to run the toolbox.

* Install [PyTorch](https://pytorch.org/get-started/locally/) (>=1.0.1).
* Install [ffmpeg](https://ffmpeg.org/download.html#get-packages).
* Run `pip install -r requirements.txt` to install the remaining necessary packages.

### 2. Download Pretrained Models
Download the latest [here](https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Pretrained-models).

### 3. (Optional) Test Configuration
Before you download any dataset, you can begin by testing your configuration with:

`python demo_cli.py`

If all tests pass, you're good to go.

### 4. (Optional) Download Datasets
For playing with the toolbox alone, I only recommend downloading [`LibriSpeech/train-clean-100`](https://www.openslr.org/resources/12/train-clean-100.tar.gz). Extract the contents as `<datasets_root>/LibriSpeech/train-clean-100` where `<datasets_root>` is a directory of your choosing. Other datasets are supported in the toolbox, see [here](https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Training#datasets). You're free not to download any dataset, but then you will need your own data as audio files or you will have to record it with the toolbox.

### 5. Launch the Toolbox
You can then try the toolbox:

`python demo_toolbox.py -d <datasets_root>`
or
`python demo_toolbox.py`

depending on whether you downloaded any datasets. If you are running an X-server or if you have the error `Aborted (core dumped)`, see [this issue](https://github.com/CorentinJ/Real-Time-Voice-Cloning/issues/11#issuecomment-504733590).

### 6. (Optional) Enable GPU Support
Note: Enabling GPU support is a lot of work. You will want to set this up if you are going to train your own models. Somebody took the time to make [a better guide](https://poorlydocumented.com/2019/11/installing-corentinjs-real-time-voice-cloning-project-on-windows-10-from-scratch/) on how to install everything. I recommend using it.

This command installs additional GPU dependencies and recommended packages: `pip install -r requirements_gpu.txt`

Additionally, you will need to ensure GPU drivers are properly installed and that your CUDA version matches your PyTorch and Tensorflow installations.
## Installation and Setup
See [INSTALL.md](INSTALL.md) for complete installation and setup instructions.