This repository provides the official PyTorch implementation for the following paper:
Say Anything with Any Style
Shuai Tan, Bin Ji, Yu Ding, and Ye Pan
In AAAI, 2024.
Given a source image and a style reference clip, SAAS generates stylized talking faces driven by audio. The lip motions are synchronized with the audio, while the speaking style is controlled by the style clip. We also support video-driven style editing by inputting a source video. The pipeline of SAAS is as follows:
We train and test with Python 3.8 and PyTorch. To install the dependencies, run:
conda create -n SAAS python=3.8
conda activate SAAS
- Python packages
pip install -r requirements.txt
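After installing the requirements, a quick sanity check (a minimal sketch, not part of the repository) confirms that PyTorch imports correctly and sees a CUDA GPU:

```python
# Optional environment check; not part of SAAS itself.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Report the GPU that inference will run on.
    print("GPU:", torch.cuda.get_device_name(0))
```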
- Run the demo in the audio-driven setting:
python audio_driven/train_test/inference.py --img_path path/to/image --wav_path path/to/audio --img_3DMM_path path/to/img_3DMM --style_path path/to/style --save_path path/to/save
The result will be stored in save_path.
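Equivalently, the demo can be launched from Python. The sketch below is hypothetical: the file names are placeholders, and only the flags come from the command above.

```python
# Hypothetical wrapper around the audio-driven demo; all paths are placeholders.
import subprocess

subprocess.run(
    [
        "python", "audio_driven/train_test/inference.py",
        "--img_path", "examples/source.jpg",        # cropped source image (placeholder)
        "--wav_path", "examples/speech.wav",        # driving audio (placeholder)
        "--img_3DMM_path", "examples/source_3dmm",  # 3DMM coefficients of the source image (placeholder)
        "--style_path", "examples/style_clip",      # style reference (placeholder)
        "--save_path", "results/audio_driven",      # output location (placeholder)
    ],
    check=True,  # raise if the demo exits with an error
)
```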
- Run the demo in the video-driven setting:
python video_driven/inference.py --img_path path/to/image --wav_path path/to/audio --video_3DMM_path path/to/video_3DMM --style_path path/to/style --save_path path/to/save
The result will be stored in save_path.
The image given by img_path should first be cropped using the script crop_image.py.
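The video-driven call can be wrapped the same way; again a sketch with placeholder paths, assuming the source image has already been cropped with crop_image.py:

```python
# Hypothetical wrapper around the video-driven demo; all paths are placeholders.
import subprocess

subprocess.run(
    [
        "python", "video_driven/inference.py",
        "--img_path", "examples/source_cropped.jpg",  # output of crop_image.py (placeholder)
        "--wav_path", "examples/speech.wav",          # driving audio (placeholder)
        "--video_3DMM_path", "examples/video_3dmm",   # 3DMM coefficients of the source video (placeholder)
        "--style_path", "examples/style_clip",        # style reference (placeholder)
        "--save_path", "results/video_driven",        # output location (placeholder)
    ],
    check=True,
)
```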
- Download the checkpoints for the video-driven setting and put them into ./checkpoints.
- Our audio encoder can be viewed as a combination of SadTalker's audio encoder and our video encoder. To support the audio-driven setting, download the checkpoints of SadTalker's audio encoder and of our video encoder.
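Before running either demo, it can help to confirm that the downloaded checkpoints are in place. The sketch below only checks that ./checkpoints exists and is non-empty, since the expected file names are not listed here:

```python
# Hypothetical pre-flight check for the downloaded checkpoints.
from pathlib import Path

ckpt_dir = Path("./checkpoints")
files = sorted(p.name for p in ckpt_dir.glob("*")) if ckpt_dir.is_dir() else []
if not files:
    raise FileNotFoundError("Put the downloaded checkpoints into ./checkpoints first.")
print("Found checkpoints:", files)
```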
Some code is borrowed from the following projects:
- Learning2Listen
- PIRenderer
- Deep3DFaceRecon_pytorch
- SadTalker
- Style-ERD
- GFPGAN
- FOMM video preprocessing
Thanks for their contributions!
If you find this codebase useful for your research, please cite our paper with the following BibTeX entry:
@inproceedings{tan2024say,
  title={Say Anything with Any Style},
  author={Tan, Shuai and Ji, Bin and Ding, Yu and Pan, Ye},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={5},
  pages={5088--5096},
  year={2024}
}