A multimodal AI system for evaluating communication skills using audio, video, and LLM-based feedback.
- Audio analysis (transcription, speaking rate, pauses, fluency metrics)
- Video analysis (eye contact, posture, gesture, alignment)
- Fusion layer combining multimodal features
- LLM-powered structured feedback (via Groq API)
- Validated output schema for frontend integration
- Python 3.9–3.10
- Groq API SDK (requires `GROQ_API_KEY`)
- Core libraries: `librosa`, `parselmouth`, `pydub`, `soundfile`, `nltk`, `numpy`, `pydantic`, `joblib`, `groq`, `mediapipe`, `opencv-python`
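Since `pydantic` is listed for the validated output schema, the feedback payload might look something like the sketch below. The class and field names here are hypothetical placeholders, not the repo's actual schema:

```python
from typing import List
from pydantic import BaseModel, Field

class FeedbackReport(BaseModel):
    # hypothetical fields -- the real schema lives in the repo
    overall_score: float = Field(ge=0, le=10)
    fluency_score: float = Field(ge=0, le=10)
    eye_contact_score: float = Field(ge=0, le=10)
    strengths: List[str]
    improvements: List[str]

# pydantic rejects out-of-range or missing fields, so the frontend
# only ever receives a well-formed report
report = FeedbackReport(
    overall_score=7.5,
    fluency_score=8.0,
    eye_contact_score=6.5,
    strengths=["clear articulation"],
    improvements=["maintain steadier eye contact"],
)
```

Validating the LLM's JSON against a schema like this is what makes the output safe to hand directly to a frontend.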
```bash
git clone https://github.com/GadiMahi/truthsense-ml.git
cd truthsense-ml
python3 -m venv .venv
source .venv/bin/activate   # macOS/Linux
.venv\Scripts\activate      # Windows
pip install -r requirements.txt
```

| Mahi Gadi | Utkarsh Malaiya |
| --------- | --------------- |
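The fusion layer that combines audio and video features can be sketched as a weighted average of normalized per-modality scores. Function names, score keys, and the 50/50 weighting below are illustrative assumptions, not the repo's implementation:

```python
from statistics import mean

def fuse_scores(audio: dict, video: dict, w_audio: float = 0.5) -> dict:
    """Combine per-modality scores (each normalized to [0, 1]) into one report."""
    a, v = mean(audio.values()), mean(video.values())
    return {
        "audio_score": a,
        "video_score": v,
        "overall": w_audio * a + (1.0 - w_audio) * v,
    }

scores = fuse_scores(
    {"fluency": 0.8, "pace": 0.6},          # from the audio branch
    {"eye_contact": 0.9, "posture": 0.7},   # from the video branch
)
```

A fused summary like this is the kind of compact signal that can then be passed to the LLM for structured feedback generation.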
Made with ❤ by GDSC-VIT
