This Streamlit application demonstrates the capabilities of OpenAI's Whisper ASR (Automatic Speech Recognition) system. Users can record audio or upload audio files in several formats and receive transcriptions generated by Whisper in real time. Transcriptions can be saved as text files and downloaded for further use.
- Record audio directly from the app's interface
- Upload audio files in .mp3, .mp4, .wav, or .m4a format
- Display transcriptions in real time
- Save and download transcriptions as text files
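The upload, transcribe, and save steps listed above can be sketched in a few helper functions. This is a minimal sketch, not the app's actual code: it assumes the official `openai` Python package (version 1.0 or later) and the hosted `whisper-1` model, and the names `is_supported`, `transcript_filename`, and `transcribe` are hypothetical.

```python
from pathlib import Path

# Extensions taken from the feature list above.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".wav", ".m4a"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one the app accepts."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def transcript_filename(audio_filename: str) -> str:
    """Name the downloadable text file after the audio file."""
    return Path(audio_filename).with_suffix(".txt").name


def transcribe(audio_path: str, client) -> str:
    """Send one audio file to Whisper and return the transcript text.

    Assumption: `client` is an `openai.OpenAI()` instance, which reads
    OPENAI_API_KEY from the environment.
    """
    if not is_supported(audio_path):
        raise ValueError(f"Unsupported audio format: {audio_path}")
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1",  # assumed hosted Whisper model name
            file=audio_file,
        )
    return result.text
```

Passing the client in as a parameter keeps the helper easy to test with a stub. In a Streamlit page, the returned text could then be offered through `st.download_button`, using `transcript_filename(...)` as the `file_name` argument.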
Follow these steps to install and run the application:
- Clone the repository
    git clone https://github.com/your-repo-url/streamlit-whisper-transcription.git
    cd streamlit-whisper-transcription
- Create a virtual environment and install dependencies
    python -m venv venv
    source venv/bin/activate  # For Windows, use 'venv\Scripts\activate'
    pip install -r requirements.txt
- Create a .env file in the project's root directory and add your OpenAI API key:
    OPENAI_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
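To illustrate how the key in the .env file might reach the app, here is a standard-library-only sketch. The project may well use a package such as python-dotenv instead; the README does not say, so the functions `load_env_file` and `get_api_key` below are hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict:
    """Parse KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path: str = ".env") -> str:
    """Return OPENAI_API_KEY from the environment, else from the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or the environment.")
    return key
```

Failing early with a clear message when the key is missing tends to be easier to debug than an authentication error raised later by the API call.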
Run the Streamlit app with the following command:
streamlit run app.py
Open your web browser and navigate to http://localhost:8501 to access the application.
Build the Docker image:
docker build -t streamlit-whisper-transcription .
Run the Docker image:
docker run -p 8501:8501 streamlit-whisper-transcription
Open your web browser and navigate to http://localhost:8501 to access the application.
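If the page does not load, a quick check that something is listening on port 8501 can help tell a container or port-mapping problem from a browser problem. The helper below is a hypothetical sketch using only the standard library; it tests TCP reachability, not whether the Streamlit app itself is healthy.

```python
import socket


def is_port_open(host: str = "localhost", port: int = 8501, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    status = "reachable" if is_port_open() else "not reachable"
    print(f"http://localhost:8501 is {status}")
```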
This app demonstrates the potential of OpenAI's Whisper ASR system for audio transcription. Transcription accuracy depends on factors such as the quality of the audio file, the language spoken, and background noise. The app is intended for demonstration purposes only, not for use in production environments.
If you'd like to contribute to this project, please feel free to submit a pull request or create an issue for any bugs or feature requests. Your contributions are welcome and appreciated!