A voice application with two modes of operation:
- A notes app for transcribing spoken notes
- A chat app for end-to-end voice conversations with LLMs

- Clone the model repositories to the `models` directory:
  - Speech-to-text: https://huggingface.co/openai/whisper-medium
  - Chat: https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct
  - Text-to-speech: https://huggingface.co/coqui/XTTS-v2
- For the Python client, install the requirements in `client/requirements.txt`. PyAudio additionally needs the PortAudio development libraries (cf. `server/stt/Dockerfile`).
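
The cloning step for the three model repositories listed above could be scripted roughly as follows. This is a minimal sketch, not part of the project: the helper name `clone_commands` and the choice to name each target directory after the last URL segment are assumptions, so check which directory names the server actually expects. The sketch only prints the commands. Note that cloning large weights from Hugging Face generally needs Git LFS, and the Llama repository is gated, so access must be granted to your account first.

```python
from pathlib import Path

# Repository URLs as listed above.
MODEL_REPOS = [
    "https://huggingface.co/openai/whisper-medium",
    "https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct",
    "https://huggingface.co/coqui/XTTS-v2",
]


def clone_commands(models_dir="models"):
    """Return one `git clone` argument list per model repository.

    Assumption: each repository is cloned into a subdirectory of
    `models_dir` named after the last segment of its URL.
    """
    commands = []
    for url in MODEL_REPOS:
        target = Path(models_dir) / url.rsplit("/", 1)[-1]
        commands.append(["git", "clone", url, str(target)])
    return commands


# Print the commands for review instead of running them.
for cmd in clone_commands():
    print(" ".join(cmd))
```

To execute the commands rather than print them, each list can be passed to `subprocess.run(cmd, check=True)`.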

- Server: With Docker Compose installed, execute `./run.sh` in the `server` directory.
- Python client: Run `python -m client.client` (in the `voice_note` directory).
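
Before starting the server and client, it can help to check that the prerequisites are in place. The following is a small sketch under stated assumptions: the function `missing_prerequisites` is hypothetical, the model subdirectory names are guessed from the repository URLs, and `pyaudio` is the import name of the PyAudio package.

```python
import importlib.util
from pathlib import Path

# Assumed subdirectory names, derived from the repository URLs.
EXPECTED_MODEL_DIRS = [
    "whisper-medium",
    "Meta-Llama-3.1-8B-Instruct",
    "XTTS-v2",
]


def missing_prerequisites(models_dir="models", modules=("pyaudio",)):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    # Check that every expected model directory exists.
    for name in EXPECTED_MODEL_DIRS:
        path = Path(models_dir) / name
        if not path.is_dir():
            problems.append(f"missing model directory: {path}")
    # Check that the required Python modules can be located.
    for module in modules:
        if importlib.util.find_spec(module) is None:
            problems.append(f"missing Python module: {module}")
    return problems


for problem in missing_prerequisites():
    print(problem)
```

An empty result only means these basic checks passed; it does not verify that the model files are complete or that PortAudio is configured correctly.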