A speech-to-text & text analysis tool built using Flask, React, HuggingFace and Vosk.
Chatterbox is a web application intended for local, offline use. It is built with a React frontend and a Flask backend, and provides two main features:
- Speech-To-Text Conversion using Vosk Speech Recognition
  - This can be done by uploading .mp3 or .mp4 files
- Text Analysis using multiple NLP methods with HuggingFace Transformers models
  - This can be done by uploading .txt or .pdf files
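The mapping from upload type to feature described above can be sketched in a few lines of Python. This is an illustration only, not Chatterbox's actual routing code: the function name `feature_for_upload` and the feature labels are hypothetical, and only the four file extensions come from this README.

```python
from pathlib import Path

# Extensions come from the feature list above; the labels on the right
# are hypothetical names used only for this sketch.
FEATURE_BY_EXTENSION = {
    ".mp3": "speech-to-text",
    ".mp4": "speech-to-text",
    ".txt": "text-analysis",
    ".pdf": "text-analysis",
}


def feature_for_upload(filename):
    """Return the feature that would handle this file, or None if unsupported."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match on the extension
    return FEATURE_BY_EXTENSION.get(suffix)
```

For example, `feature_for_upload("meeting.MP3")` returns `"speech-to-text"`, while a file such as `image.png` returns `None`.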
Chatterbox has been tested on Windows and Linux (Ubuntu 20.04 LTS).
Chatterbox has also been successfully deployed on a Kubernetes cluster using Minishift.
Ensure the following software is installed:
- FFmpeg
- GitHub Desktop
- Node.js
- Python >3.8
- VS Code
$ sudo apt install git
$ sudo apt install python3 python3-venv python-is-python3
$ sudo apt install ffmpeg
$ sudo snap install node --classic
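After installing the prerequisites, a quick check along these lines can confirm that the command-line tools are reachable. This is a minimal sketch: the command names (`ffmpeg`, `git`, `node`, `npm`) are assumptions about how each tool appears on your PATH, the Python requirement is read as "3.8 or newer" (also an assumption), and GUI applications such as GitHub Desktop and VS Code are not checked.

```python
import shutil
import sys

# Assumed executable names for the prerequisites listed above.
REQUIRED_COMMANDS = ["ffmpeg", "git", "node", "npm"]


def missing_commands(commands=REQUIRED_COMMANDS):
    """Return the commands that cannot be found on PATH."""
    return [name for name in commands if shutil.which(name) is None]


def python_is_new_enough(minimum=(3, 8)):
    """Check the running interpreter against a (major, minor) minimum."""
    return sys.version_info[:2] >= minimum
```

Calling `missing_commands()` returns an empty list when every tool is found; any names it returns still need to be installed or added to PATH.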
Download the pytorch_model.bin file for each model and move each file to its respective backend/models folder:
- twitter-roberta-base-sentiment (sentiment analysis): backend/models/roberta-SA
- bart-large-cnn (summary): backend/models/bart-summary
- all-MiniLM-L6-v2 (topic modelling): backend/models/all-MiniLM-L6-v2
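To confirm the model files ended up in the right place, a small check like the one below can help. It is a sketch that assumes it is run from the repository root; the folder names and the pytorch_model.bin filename are taken from the list above, while the function name `missing_model_files` is hypothetical.

```python
from pathlib import Path

# Folder names as listed above, relative to backend/models.
MODEL_FOLDERS = ["roberta-SA", "bart-summary", "all-MiniLM-L6-v2"]


def missing_model_files(models_dir="backend/models", folders=MODEL_FOLDERS):
    """Return the model folders that do not yet contain pytorch_model.bin."""
    root = Path(models_dir)
    return [
        folder
        for folder in folders
        if not (root / folder / "pytorch_model.bin").is_file()
    ]
```

An empty list means all three weight files are in place; otherwise the returned folder names show which downloads are still missing.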
cd backend
python -m venv .venv
// ensure (.venv) is showing in your command prompt; if it is not, run the matching activate command below from the parent directory of .venv (backend)
.venv\scripts\activate // For Windows
$ source .venv/bin/activate // For Linux
pip install -r requirements.txt
pip install torch
// some modules may still need to be installed individually; if an import fails, pip install the missing module
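One way to check for missing modules before starting the backend is sketched below, run with the virtual environment active. The import names in `EXPECTED_MODULES` are assumptions based on the tools named in this README (Flask, Vosk, HuggingFace Transformers, PyTorch); requirements.txt remains the authoritative list.

```python
import importlib.util

# Assumed top-level import names; adjust to match requirements.txt.
EXPECTED_MODULES = ["flask", "vosk", "transformers", "torch"]


def missing_modules(names=EXPECTED_MODULES):
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Each name returned by `missing_modules()` can then be installed individually with pip. Note that a package's pip name and import name sometimes differ.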
cd frontend
npm install
cd backend
flask run
cd frontend
npm start
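The two start commands above normally run in separate terminals. As a rough sketch, they could also be started together from one Python script. This assumes the script is run from the repository root on Linux with the backend's virtual environment already active, so that `flask` and `npm` are both on PATH; on Windows, `npm` may need a different invocation. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(root="."):
    """Return (command, working directory) pairs for the backend and frontend."""
    root = Path(root)
    return [
        (["flask", "run"], root / "backend"),
        (["npm", "start"], root / "frontend"),
    ]


def launch(root="."):
    """Start both processes and wait; Ctrl+C stops them both."""
    procs = [subprocess.Popen(cmd, cwd=cwd) for cmd, cwd in build_commands(root)]
    try:
        for proc in procs:
            proc.wait()
    except KeyboardInterrupt:
        pass
    finally:
        for proc in procs:
            if proc.poll() is None:  # still running
                proc.terminate()
```

Calling `launch()` from the repository root starts the Flask backend and the React development server side by side.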
Individual documentation can be found for the following components:
- Creating a shell script for Chatterbox quickstart (for Linux), located in the /ubuntu bash launcher folder
- Audio Processing (Vosk), located in /References/Audio_Processing.md
- Natural Language Processing (NLP), located in /References/NLP.md
- React Frontend, located in /References/React.md