Jonah-May-OSS/wyoming-whisper-trt

A project that optimizes Wyoming and Whisper for low latency inference using NVIDIA TensorRT

WhisperTRT

This project optimizes OpenAI Whisper with NVIDIA TensorRT and implements the Wyoming Protocol for Home Assistant integration.

When executing the base.en model on an NVIDIA Jetson Orin Nano, WhisperTRT runs roughly 3x faster than PyTorch while consuming only about 60% of the memory.

By default, this uses the base (multilingual) model.

WhisperTRT roughly mimics the API of the original Whisper model, making it easy to use. The Wyoming integration is based on wyoming-faster-whisper, with minimal tweaks to use WhisperTRT instead of faster-whisper.
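
As a rough illustration, transcription is meant to look much like it does with openai-whisper. The function name `load_trt_model` and the result shape below are assumptions based on the upstream whisper_trt project, not a verified API:

```python
# Hypothetical sketch only: load_trt_model and the transcribe() result shape
# are assumed from the upstream whisper_trt project, not a verified API.
status = "unavailable"
try:
    from whisper_trt import load_trt_model

    model = load_trt_model("base.en")  # builds or loads a cached TensorRT engine
    result = model.transcribe("speech.wav")
    print(result["text"])
    status = "ok"
except ImportError:
    # Real inference needs whisper_trt and a CUDA-capable GPU; without them
    # this sketch simply reports that the backend is unavailable.
    print("whisper_trt is not installed")
```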

While WhisperTRT was originally built for and tested on the Jetson Orin Nano, this project is built in Docker on an x86 Ubuntu 24.04 VM with an RTX 4070 Ti.

Check out the performance and usage details below!

Performance

All benchmarks are generated by calling profile_backends.py, processing a 20-second audio clip.

Execution Time

Execution time in seconds to transcribe 20 seconds of speech on the Jetson Orin Nano and an RTX 4070 Ti. See profile_backends.py for details.

| Model | whisper (Jetson) | faster_whisper (Jetson) | whisper_trt (Jetson) | whisper (4070 Ti) | faster_whisper (4070 Ti) | whisper_trt (4070 Ti) |
|---|---|---|---|---|---|---|
| tiny.en | 1.74 sec | 0.85 sec | 0.64 sec | 0.40 sec | 0.35 sec | 0.07 sec |
| base.en | 2.55 sec | Unavailable | 0.86 sec | 0.71 sec | 0.34 sec | 0.10 sec |

Memory Consumption

Memory consumption to transcribe 20 seconds of speech on the Jetson Orin Nano and an RTX 4070 Ti. See profile_backends.py for details.

| Model | whisper (Jetson) | faster_whisper (Jetson) | whisper_trt (Jetson) | whisper (4070 Ti) | faster_whisper (4070 Ti) | whisper_trt (4070 Ti) |
|---|---|---|---|---|---|---|
| tiny.en | 569 MB | 404 MB | 488 MB | 672 MB | 522 MB | 544 MB |
| base.en | 666 MB | Unavailable | 439 MB | 726 MB | 514 MB | 548 MB |
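
As a sanity check, the headline claims can be recomputed from the base.en Jetson columns above (the measured memory ratio comes out closer to two-thirds than to 60%):

```python
# Recompute the headline claims from the base.en Jetson columns above.
audio_seconds = 20.0

# Execution time (seconds): whisper vs. whisper_trt on Jetson.
whisper_time, trt_time = 2.55, 0.86
speedup = whisper_time / trt_time   # ~3x faster
rtf = trt_time / audio_seconds      # real-time factor (lower is better)

# Memory (MB): whisper vs. whisper_trt on Jetson.
whisper_mem, trt_mem = 666, 439
mem_ratio = trt_mem / whisper_mem   # fraction of PyTorch's memory

print(f"speedup ~{speedup:.1f}x, RTF {rtf:.3f}, memory ~{mem_ratio:.0%}")
# → speedup ~3.0x, RTF 0.043, memory ~66%
```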

Usage

NOTE: ARM64 dGPU and iGPU containers may take a while to start on first launch after installation or updates. Because I do not have ARM64 or Jetson hardware, several packages (such as torch and torch2trt) fail to install properly under QEMU/buildx, where CUDA is not detected. If you know how to get around this, please reach out to me.

Pre-requisites:

  1. Install and configure Docker
  2. Install and configure the NVIDIA Container Toolkit

Docker Compose (recommended)

For AMD64 with discrete GPUs:

```yaml
services:
  wyoming-whisper-trt:
    image: captnspdr/wyoming-whisper-trt:latest-amd64
    container_name: wyoming-whisper-trt
    ports:
      - "10300:10300"
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

For ARM64 with discrete GPUs:

```yaml
services:
  wyoming-whisper-trt:
    image: captnspdr/wyoming-whisper-trt:latest-arm64
    container_name: wyoming-whisper-trt
    ports:
      - "10300:10300"
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

For ARM64 with an iGPU like Jetson devices:

```yaml
services:
  wyoming-whisper-trt:
    image: captnspdr/wyoming-whisper-trt:latest-igpu
    container_name: wyoming-whisper-trt
    ports:
      - "10300:10300"
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

Docker (Latest tag on Docker Hub)

  1. Clone this repository
  2. Browse to the repository root folder
  3. Run the following command based on your platform:

For AMD64 with dGPU:

```shell
docker run --gpus all --name wyoming-whisper-trt -d -p 10300:10300 captnspdr/wyoming-whisper-trt:latest-amd64
```

For ARM64 with dGPU:

```shell
docker run --gpus all --name wyoming-whisper-trt -d -p 10300:10300 captnspdr/wyoming-whisper-trt:latest-arm64
```

For ARM64 with iGPU:

```shell
docker run --gpus all --name wyoming-whisper-trt -d -p 10300:10300 captnspdr/wyoming-whisper-trt:latest-igpu
```

Docker (Latest GitHub commit, ARM64 and AMD64 with dGPU only)

  1. Clone this repository
  2. Browse to the repository root folder
  3. Run `docker compose -f docker-compose-github.yaml up -d`

See also:

  • torch2trt - Used to convert the PyTorch model to TensorRT and perform inference.
  • NanoLLM - Large Language Models targeting NVIDIA Jetson. Perfect for combining with ASR!
