Skip to content

NodeJS Bindings for Whisper - the CPU version of OpenAI's Whisper, as initially crafted in C++ by ggerganov.

License

Notifications You must be signed in to change notification settings

ChetanXpro/nodejs-whisper

Repository files navigation

nodejs-whisper

Node.js bindings for OpenAI's Whisper model.

MIT License

Features

  • Automatically convert the audio to WAV format with a 16000 Hz frequency to support the whisper model.
  • Output transcripts to (.txt .srt .vtt .json .wts .lrc)
  • Optimized for CPU (Including Apple Silicon ARM)
  • Timestamp precision to single word
  • Split on word rather than on token (Optional)
  • Translate from source language to english (Optional)
  • Convert audio format to wav to support whisper model

Installation

  1. Install make tools
sudo apt update
sudo apt install build-essential
  1. Install nodejs-whisper with npm
  npm i nodejs-whisper
  1. Download whisper model
  npx nodejs-whisper download
  • NOTE: user may need to install make tool

Windows Installation

  1. Install MinGW-w64 or MSYS2 (which includes make tools)

  2. Install nodejs-whisper with npm

npm i nodejs-whisper
  1. Download whisper model
npx nodejs-whisper download
  • Note: Make sure mingw32-make or make is available in your system PATH.

Usage/Examples

See example/index.ts (can be run with $ npm run test)

import path from 'path'
import { nodewhisper } from 'nodejs-whisper'

// Need to provide exact path to your audio file.
const filePath = path.resolve(__dirname, 'YourAudioFileName')

await nodewhisper(filePath, {
	modelName: 'base.en', //Downloaded models name
	autoDownloadModelName: 'base.en', // (optional) autodownload a model if model is not present
	removeWavFileAfterTranscription: false, // (optional) remove wav file once transcribed
	withCuda: false // (optional) use cuda for faster processing
	logger: console // (optional) Logging instance, defaults to console
	whisperOptions: {
		outputInCsv: false, // get output result in csv file
		outputInJson: false, // get output result in json file
		outputInJsonFull: false, // get output result in json file including more information
		outputInLrc: false, // get output result in lrc file
		outputInSrt: true, // get output result in srt file
		outputInText: false, // get output result in txt file
		outputInVtt: false, // get output result in vtt file
		outputInWords: false, // get output result in wts file for karaoke
		translateToEnglish: false, // translate from source language to english
		wordTimestamps: false, // word-level timestamps
		timestamps_length: 20, // amount of dialogue per timestamp pair
		splitOnWord: true, // split on word rather than on token
	},
})

// Model list
const MODELS_LIST = [
	'tiny',
	'tiny.en',
	'base',
	'base.en',
	'small',
	'small.en',
	'medium',
	'medium.en',
	'large-v1',
	'large',
	'large-v3-turbo',
]

Types

 interface IOptions {
	modelName: string
	removeWavFileAfterTranscription?: boolean
	withCuda?: boolean
	autoDownloadModelName?: string
	whisperOptions?: WhisperOptions
	logger?: Console
}

 interface WhisperOptions {
	outputInCsv?: boolean
	outputInJson?: boolean
	outputInJsonFull?: boolean
	outputInLrc?: boolean
	outputInSrt?: boolean
	outputInText?: boolean
	outputInVtt?: boolean
	outputInWords?: boolean
	translateToEnglish?: boolean
	timestamps_length?: number
	wordTimestamps?: boolean
	splitOnWord?: boolean
}

Run locally

Clone the project

  git clone https://github.com/ChetanXpro/nodejs-whisper

Go to the project directory

  cd nodejs-whisper

Install dependencies

  npm install

Start the server

  npm run dev

Build project

  npm run build

Made with

Feedback

If you have any feedback, please reach out to us at [email protected]

Authors