This repository provides a set of tools and scripts for transcribing YouTube videos, extracting timestamps, generating subtitles, and clipping audio based on the subtitles. It aims to automate the process of extracting valuable information from YouTube videos and making it easily accessible.
- Automatic speech recognition to transcribe audio from YouTube videos.
- Extraction of timestamps from the transcribed text.
- Generation of subtitles in SRT format.
- Searching for specific words in the subtitles and extracting matching subtitles.
- Conversion of subtitles to CSV for further analysis.
- Clipping audio files based on subtitle timestamps.
- Powered by AI.
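
To illustrate the subtitle-generation feature, here is a minimal sketch of how timestamped segments could be turned into SRT text. The function names and the `(start_seconds, end_seconds, text)` tuple layout are assumptions made for this example; the scripts in this repository may structure their data differently.

```python
def format_srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    The tuple layout is a hypothetical input format for this sketch.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # SRT entries are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(segments_to_srt([(0.0, 1.5, "Hello"), (2.0, 3.0, "world")]))
```

Each SRT entry consists of a running index, a `start --> end` line with millisecond precision, and the subtitle text, followed by a blank line.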
To use the tools and scripts in this repository, you need the following:
- Python 3.7 or higher
- `torch` and `transformers` libraries for speech recognition
- `pydub` library for audio processing
- `pandas` library for data manipulation
- `youtube-dl` library for downloading YouTube videos
Make sure you have these dependencies installed before running the scripts.
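
One way to check the requirements before running anything is a small script like the one below. It is only a sketch and not part of this repository: the helper name `missing_modules` is made up for the example, and it assumes that `youtube-dl` is imported under the module name `youtube_dl`.

```python
import importlib.util
import sys

# Import names of the required packages (youtube-dl installs as "youtube_dl").
REQUIRED_MODULES = ["torch", "transformers", "pydub", "pandas", "youtube_dl"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 7):
        print("Python 3.7 or higher is required.")
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only confirms that each module can be located; it does not import the packages or verify their versions.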
- Clone the repository to your local machine:

      git clone https://github.com/your-username/Sariqat-al-Lahzat.git

- Install the dependencies:

      pip install torch transformers pydub pandas youtube-dl
- Run the scripts in the repository to transcribe videos, generate subtitles, search for specific words, and clip audio. Provide the input files and parameters that each script requires.
- Customize and extend the scripts to suit your needs.
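
As a starting point for customization, the sketch below parses SRT text, finds the subtitles that contain a given word, and writes the matches as CSV. All function names are hypothetical and do not refer to scripts in this repository. To keep the example free of third-party dependencies it uses Python's standard `csv` module in place of `pandas`.

```python
import csv
import io
import re

# Matches the "start --> end" line of an SRT entry.
TIMESTAMP_LINE = re.compile(
    r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})"
)


def parse_srt(srt_text):
    """Parse SRT text into a list of dicts with index, start, end, text."""
    entries = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3 or not lines[0].strip().isdigit():
            continue  # skip malformed entries
        match = TIMESTAMP_LINE.search(lines[1])
        if not match:
            continue
        entries.append({
            "index": int(lines[0].strip()),
            "start": match.group(1),
            "end": match.group(2),
            "text": " ".join(line.strip() for line in lines[2:]),
        })
    return entries


def search_subtitles(entries, word):
    """Return entries whose text contains the word (whole word, any case)."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [entry for entry in entries if pattern.search(entry["text"])]


def entries_to_csv(entries):
    """Serialize entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["index", "start", "end", "text"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = (
        "1\n00:00:00,000 --> 00:00:01,500\nHello world\n\n"
        "2\n00:00:02,000 --> 00:00:03,000\nGoodbye\n"
    )
    print(entries_to_csv(search_subtitles(parse_srt(sample), "hello")))
```

Because SRT timestamps contain a comma, the CSV writer quotes the `start` and `end` fields. The matched `start` and `end` values could also serve as the boundaries for clipping audio.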