The pipeline in this repo supports multiple voice upload options. You can choose one or more of them, depending on the data you have.
- Short audio clips packed into a single .zip file, whose file structure should be as shown below:
Your-zip-file.zip
├───Character_name_1
├   ├───xxx.wav
├   ├───...
├   ├───yyy.mp3
├   └───zzz.wav
├───Character_name_2
├   ├───xxx.wav
├   ├───...
├   ├───yyy.mp3
├   └───zzz.wav
├───...
├
└───Character_name_n
    ├───xxx.wav
    ├───...
    ├───yyy.mp3
    └───zzz.wav
Note that the format of the audio files does not matter as long as they are audio files.
Quality requirement: each clip should be at least 2 s and at most 10 s long, and contain as little background sound as possible.
Quantity requirement: at least 10 clips per character; 20 or more per character is recommended.
- Long audio files named by character name, each containing the voice of a single character only. Background sound is acceptable, since it will be removed automatically. File name format: {CharacterName}_{random_number}.wav (e.g. Diana_234135.wav, MinatoAqua_234252.wav). The files must be .wav files.
- Long video files named by character name, each containing the voice of a single character only. Background sound is acceptable, since it will be removed automatically. File name format: {CharacterName}_{random_number}.mp4 (e.g. Taffy_332452.mp4, Dingzhen_957315.mp4). The files must be .mp4 files.
Note: CharacterName must consist of English characters only. random_number is compulsory and distinguishes multiple files for the same character; it can be any random integer between 0 and 999999.
- A .txt file containing multiple lines of {CharacterName}|{video_url}, which should be formatted as follows:
Char1|https://xyz.com/video1/
Char2|https://xyz.com/video2/
Char2|https://xyz.com/video3/
Char3|https://xyz.com/video4/
Each video should contain a single speaker only. Video links from bilibili are currently supported; other websites are yet to be tested. Have questions about the data format? Find data samples of all formats here.
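The conventions above can be checked locally before uploading. The following is a minimal sketch, not part of this repo's pipeline: the helper names (`parse_media_name`, `parse_link_list`, `count_clips_per_character`) are hypothetical, and it assumes that "English characters only" means ASCII letters and that `random_number` has at most six digits.

```python
import re
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# Assumption: "English characters only" is read as ASCII letters A-Z / a-z,
# and random_number (0 to 999999) is written with one to six digits.
MEDIA_NAME = re.compile(r"^([A-Za-z]+)_(\d{1,6})\.(wav|mp4)$")


def parse_media_name(filename):
    """Return (character_name, number) for a long audio/video file name, or None."""
    match = MEDIA_NAME.match(filename)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def parse_link_list(text):
    """Parse lines of the form CharacterName|video_url into (name, url) pairs."""
    pairs = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        name, sep, url = line.partition("|")
        if not sep or not name or not url:
            raise ValueError(f"line {line_no}: expected CharacterName|video_url")
        pairs.append((name, url))
    return pairs


def count_clips_per_character(zip_file):
    """Count files stored directly inside each top-level character folder."""
    counts = Counter()
    with zipfile.ZipFile(zip_file) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            # Only count entries shaped like Character_name/clip.ext
            if not info.is_dir() and len(parts) == 2:
                counts[parts[0]] += 1
    return counts
```

For example, `parse_media_name("Diana_234135.wav")` yields the character name and number, while a name without the `_{random_number}` part yields `None`. The counts returned by `count_clips_per_character` can be compared against the minimum of 10 clips per character; clip duration is not checked here.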