Overview The Deep Audio Classifier project leverages TensorFlow to create an advanced audio classification system. This system employs convolutional and recurrent neural networks (CNNs and RNNs) to accurately categorize audio samples into predefined classes. The project showcases the potential of deep learning in audio processing, achieving remarkable precision across diverse datasets.
Features Data Preprocessing: Converts voice recordings into wave format using TensorFlow, ensuring standardized input. Model Training: Utilizes CNNs with multiple layers for robust feature extraction and pattern recognition. High Precision: Achieves precision exceeding 95%, surpassing industry benchmarks. Versatile Applications: Includes a specialized classifier for counting bird sounds, demonstrating the model's adaptability. Detailed Results: Outputs results in a comprehensive CSV file for further analysis. Project Structure data/: Directory containing the audio datasets. preprocessing/: Scripts for converting audio recordings to wave format and extracting features. models/: Definitions of the CNN and RNN architectures used in the project. training/: Scripts for training the models, including hyperparameter tuning. evaluation/: Scripts for evaluating model performance using various metrics. deployment/: Scripts for deploying the trained models and running inference. results/: Directory where results, including the CSV file, are stored.
Dataset Link: https://www.kaggle.com/datasets/kenjee/z-by-hp-unlocked-challenge-3-signal-processing
Google Collab File: https://colab.research.google.com/drive/1HtfEXgsOfQA1KI-Tr6lnl_P74BPMxbN-#scrollTo=KjSttlTDJMHg