This repository provides the official PyTorch implementation for the following paper:
Say Anything with Any Style
Shuai Tan, Bin Ji, Yu Ding, and Ye Pan
In AAAI, 2024.
Given a source image and a style reference clip, SAAS generates stylized talking faces driven by audio. The lip motions are synchronized with the audio, while the speaking style is controlled by the style clip. We also support video-driven style editing by inputting a source video. The pipeline of SAAS is as follows:
We train and test with Python 3.8 and PyTorch. To install the dependencies, run:
conda create -n SAAS python=3.8
conda activate SAAS
- Python packages
pip install -r requirements.txt
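After installing the requirements, a quick sanity check (a minimal sketch, not part of the repository) confirms that PyTorch imports correctly and sees a CUDA GPU:

```python
# Optional environment check; not part of SAAS itself.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Report the GPU that inference will run on.
    print("GPU:", torch.cuda.get_device_name(0))
```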
- Run the demo in the audio-driven setting:
python audio_driven/train_test/inference.py --img_path path/to/image --wav_path path/to/audio --img_3DMM_path path/to/img_3DMM --style_path path/to/style --save_path path/to/save
The result will be stored in save_path.
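Equivalently, the demo can be launched from Python. The sketch below is hypothetical: the file names are placeholders, and only the flags come from the command above.

```python
# Hypothetical wrapper around the audio-driven demo; all paths are placeholders.
import subprocess

subprocess.run(
    [
        "python", "audio_driven/train_test/inference.py",
        "--img_path", "examples/source.jpg",        # cropped source image (placeholder)
        "--wav_path", "examples/speech.wav",        # driving audio (placeholder)
        "--img_3DMM_path", "examples/source_3dmm",  # 3DMM coefficients of the source image (placeholder)
        "--style_path", "examples/style_clip",      # style reference (placeholder)
        "--save_path", "results/audio_driven",      # output location (placeholder)
    ],
    check=True,  # raise if the demo exits with an error
)
```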
- Run the demo in the video-driven setting:
python video_driven/inference.py --img_path path/to/image --wav_path path/to/audio --video_3DMM_path path/to/video_3DMM --style_path path/to/style --save_path path/to/save
The result will be stored in save_path.
The image given by img_path should first be cropped using the script crop_image.py.
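The video-driven call can be wrapped the same way; again a sketch with placeholder paths, assuming the source image has already been cropped with crop_image.py:

```python
# Hypothetical wrapper around the video-driven demo; all paths are placeholders.
import subprocess

subprocess.run(
    [
        "python", "video_driven/inference.py",
        "--img_path", "examples/source_cropped.jpg",  # output of crop_image.py (placeholder)
        "--wav_path", "examples/speech.wav",          # driving audio (placeholder)
        "--video_3DMM_path", "examples/video_3dmm",   # 3DMM coefficients of the source video (placeholder)
        "--style_path", "examples/style_clip",        # style reference (placeholder)
        "--save_path", "results/video_driven",        # output location (placeholder)
    ],
    check=True,
)
```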
- Download the checkpoints for the video-driven setting and put them into ./checkpoints.
- Our audio encoder can be viewed as a combination of SadTalker's audio encoder and our video encoder. To support the audio-driven setting, download the checkpoints of SadTalker's audio encoder and of our video encoder.
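Before running either demo, it can help to confirm that the downloaded checkpoints are in place. The sketch below only checks that ./checkpoints exists and is non-empty, since the expected file names are not listed here:

```python
# Hypothetical pre-flight check for the downloaded checkpoints.
from pathlib import Path

ckpt_dir = Path("./checkpoints")
files = sorted(p.name for p in ckpt_dir.glob("*")) if ckpt_dir.is_dir() else []
if not files:
    raise FileNotFoundError("Put the downloaded checkpoints into ./checkpoints first.")
print("Found checkpoints:", files)
```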
Some code is borrowed from the following projects:
- Learning2Listen
- PIRenderer
- Deep3DFaceRecon_pytorch
- SadTalker
- Style-ERD
- GFPGAN
- FOMM video preprocessing
Thanks for their contributions!
If you find this codebase useful for your research, please cite our paper with the following BibTeX entry:
@inproceedings{tan2024say,
  title={Say Anything with Any Style},
  author={Tan, Shuai and Ji, Bin and Ding, Yu and Pan, Ye},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={5},
  pages={5088--5096},
  year={2024}
}