Say Anything with Any Style

This repository provides the official PyTorch implementation for the following paper:
Say Anything with Any Style
Shuai Tan, et al.
In AAAI, 2024.

Given a source image and a style reference clip, SAAS generates stylized talking faces driven by audio. The lip motions are synchronized with the audio, while the speaking styles are controlled by the style clips. We also support video-driven style editing by inputting a source video. The pipeline of our SAAS is as follows:

[Figure: pipeline of SAAS]

Requirements

We train and test with Python 3.8 and PyTorch. To install the dependencies, run:

  • Conda environment

    conda create -n SAAS python=3.8
    conda activate SAAS

  • Python packages

    pip install -r requirements.txt
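
After installation, a quick sanity check of the environment can save a failed run later. The snippet below is a minimal sketch, not part of this repository: the helper name `check_environment` is hypothetical, and it assumes only that Python 3.8 or newer and the `torch` package are expected, as stated above.

```python
import importlib.util
import sys


def check_environment(required=("torch",), min_python=(3, 8)):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration; it is not shipped with SAAS.
    """
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(
            "Python %d.%d+ expected, found %d.%d"
            % (min_python + tuple(sys.version_info[:2]))
        )
    for name in required:
        # find_spec locates a top-level package without importing it
        if importlib.util.find_spec(name) is None:
            problems.append("missing package: " + name)
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks fine"]:
        print(line)
```

Extend `required` with any other package names from requirements.txt that you want to verify.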

Inference

  • Run the demo in the audio-driven setting:

    python audio_driven/train_test/inference.py --img_path path/to/image --wav_path path/to/audio --img_3DMM_path path/to/img_3DMM --style_path path/to/style --save_path path/to/save

    The result will be stored in save_path.

  • Run the demo in the video-driven setting:

    python video_driven/inference.py --img_path path/to/image --wav_path path/to/audio --video_3DMM_path path/to/video_3DMM --style_path path/to/style --save_path path/to/save

    The result will be stored in save_path.

    The image at img_path should first be cropped using the script crop_image.py.

  • Download the checkpoints for the video-driven setting and put them into ./checkpoints.

  • Our audio encoder can be viewed as the combination of SadTalker's audio encoder and our video encoder. To support the audio-driven setting, download the checkpoints of both SadTalker's audio encoder and our video encoder.
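
When running many image, audio, and style combinations, it can be convenient to assemble the two commands above from Python. The following is a minimal sketch under stated assumptions: the helper names `build_command` and `run_inference` are hypothetical and not part of this repository, while the script paths and flag names are copied verbatim from the commands above. It assumes the call is made from the repository root.

```python
import subprocess
import sys

# Script paths and flag names are taken from the commands in this README.
AUDIO_SCRIPT = "audio_driven/train_test/inference.py"
VIDEO_SCRIPT = "video_driven/inference.py"


def build_command(mode, img_path, wav_path, tdmm_path, style_path, save_path):
    """Build the argument list for one inference run (hypothetical helper).

    mode is "audio" or "video"; tdmm_path is passed as --img_3DMM_path
    or --video_3DMM_path respectively.
    """
    if mode == "audio":
        script, tdmm_flag = AUDIO_SCRIPT, "--img_3DMM_path"
    elif mode == "video":
        script, tdmm_flag = VIDEO_SCRIPT, "--video_3DMM_path"
    else:
        raise ValueError("mode must be 'audio' or 'video'")
    return [
        sys.executable, script,
        "--img_path", img_path,
        "--wav_path", wav_path,
        tdmm_flag, tdmm_path,
        "--style_path", style_path,
        "--save_path", save_path,
    ]


def run_inference(*args, **kwargs):
    """Run the command from the repository root; raises on a non-zero exit."""
    subprocess.run(build_command(*args, **kwargs), check=True)
```

Passing the arguments as a list instead of a single shell string avoids quoting problems when paths contain spaces.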

Acknowledgement

Some code is borrowed from the following projects:

Thanks for their contributions!

Citation

If you find this codebase useful for your research, please cite it using the following entry.

@inproceedings{tan2024say,
  title={Say Anything with Any Style},
  author={Tan, Shuai and Ji, Bin and Ding, Yu and Pan, Ye},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={5},
  pages={5088--5096},
  year={2024}
}