Voice-TimeLogger-Agent is an AI-powered system that automates work-hour tracking for consultants using voice messages. A consultant records a quick voice message after each meeting, and the system automatically extracts the key details (customer name, start and end times, total hours, and notes) and stores them in Google Sheets.
- Voice Processing - Convert voice recordings to text using the OpenAI Whisper API
- AI Data Extraction - Extract structured meeting data from transcribed text using an LLM
- Google Sheets Integration - Automatically store meeting details in Google Sheets
- API Endpoint - Simple REST API for uploading voice recordings
- Notifications - Optional email or Slack notifications for new meeting entries
- n8n Workflow - Visual automation workflow for processing audio files
- Dockerized Deployment - Easy deployment using Docker and Docker Compose
- FastAPI - Modern, fast API framework for Python
- OpenAI Whisper - State-of-the-art speech recognition model
- GPT Models - Advanced language model for data extraction
- Google Sheets API - For structured data storage
- n8n - Workflow automation platform
- Docker - Container platform for deployment
- API Service - Handles audio file uploads and processing
- Speech Service - Transcribes audio using OpenAI Whisper API
- Extraction Service - Extracts structured data using GPT models
- Storage Service - Stores meeting data in Google Sheets
- Notification Service - Sends notifications via email or Slack
- n8n Workflow - Provides visual workflow management and automation
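The components above form a linear pipeline: transcribe, extract, store, then optionally notify. The following is a minimal sketch of that flow, not the project's actual code. The function `process_recording`, the `PipelineResult` type, and the callable parameters are hypothetical names chosen for illustration; each callable stands in for one of the services listed above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PipelineResult:
    """Outcome of one pass through the (hypothetical) pipeline."""
    transcript: str
    meeting: dict
    notified: bool


def process_recording(
    audio: bytes,
    transcribe: Callable[[bytes], str],                 # stands in for the Speech Service
    extract: Callable[[str], dict],                     # stands in for the Extraction Service
    store: Callable[[dict], None],                      # stands in for the Storage Service
    notify: Optional[Callable[[dict], None]] = None,    # stands in for the Notification Service
) -> PipelineResult:
    """Run the stages in the order the architecture list describes."""
    transcript = transcribe(audio)
    meeting = extract(transcript)
    store(meeting)
    if notify is not None:
        # Notifications are optional, so this stage is skipped when no notifier is given.
        notify(meeting)
    return PipelineResult(transcript=transcript, meeting=meeting, notified=notify is not None)
```

Passing the stages in as callables keeps the sketch independent of any real OpenAI or Google client, so each stage can be replaced with a stub when testing.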
- Docker and Docker Compose
- OpenAI API key
- Google Cloud Platform account with Google Sheets API enabled
- Google service account JSON key file with Google Sheets access
- Clone the repository:
git clone https://github.com/username/voice-timelogger-agent.git
cd voice-timelogger-agent
- Configure environment variables:
cp .env.example .env
# Edit .env with your API keys and configuration
- Add Google service account credentials:
mkdir -p credentials
# Copy your Google service account JSON file to credentials/google-service-account.json
- Run the setup script:
chmod +x setup.sh
./setup.sh
- Start the services:
docker-compose up -d
The API will be available at http://localhost:8000 and n8n at http://localhost:5678.
For local development and testing, follow the Installation steps above.
To deploy on AWS EC2:
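- Prepare your local environment (steps 1-3 from Installation: clone the repository, configure environment variables, and add the Google service account credentials)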
- Make the deployment script executable:
chmod +x deploy-to-ec2.sh
- Run the deployment script:
./deploy-to-ec2.sh ./path/to/your-key.pem ec2-user@your-ec2-ip
For more detailed deployment instructions, see Deployment Documentation.
Upload an audio recording directly to the API:
curl -X POST "http://localhost:8000/api/v1/speech/upload" \
-H "Content-Type: multipart/form-data" \
-F "file=@/path/to/recording.mp3" \
-F "notify=true"
- Access n8n at http://localhost:5678
- Import the workflow from n8n_workflow/voice-timelogger-workflow.json
- Configure the webhook URL with credentials
- Upload audio through the webhook URL:
curl -X POST "http://localhost:5678/webhook/YOUR_WEBHOOK_PATH" \
-F "file=@/path/to/recording.mp3" \
-F "notify=true"
The application provides the following REST API endpoints:
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/speech/upload | POST | Upload and process an audio recording |
| /api/v1/speech/transcribe | POST | Transcribe an audio file without extraction |
| /api/v1/speech/process | POST | Process an audio file without storage |
| /api/v1/meetings/extract | POST | Extract meeting data from text |
| /health | GET | Health check endpoint |
For a complete API reference, see the Postman Collection.
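For callers that prefer Python over curl, the upload request can be sketched with the standard library alone. This is an illustration under assumptions: the helper name `build_upload_request` is hypothetical, the form field names `file` and `notify` are taken from the curl example in this README, and the response format is not documented here, so the sketch only builds the request and leaves sending and parsing to the caller.

```python
import urllib.request
import uuid


def build_upload_request(
    base_url: str, filename: str, audio: bytes, notify: bool = True
) -> urllib.request.Request:
    """Build a multipart/form-data POST for /api/v1/speech/upload (hypothetical helper)."""
    boundary = uuid.uuid4().hex
    notify_value = "true" if notify else "false"

    # First part: the audio file, mirroring curl's -F "file=@...".
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")

    # Second part: the notify flag, mirroring curl's -F "notify=true",
    # followed by the closing boundary.
    notify_part = (
        f"\r\n--{boundary}\r\n"
        'Content-Disposition: form-data; name="notify"\r\n\r\n'
        f"{notify_value}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")

    return urllib.request.Request(
        url=f"{base_url}/api/v1/speech/upload",
        data=file_header + audio + notify_part,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the services running locally, the request could be sent with `urllib.request.urlopen(build_upload_request("http://localhost:8000", "recording.mp3", audio_bytes))`, where `audio_bytes` is the content of the recording read from disk.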
- The consultant records a voice message after a meeting, saying something like: "I had a meeting with Acme Corporation today from 2:00 PM to 3:30 PM. We discussed their new product launch strategy."
- The voice message is uploaded to the API or the n8n webhook.
- The system transcribes the audio using the OpenAI Whisper API.
- An AI model extracts structured data from the transcription:
{
  "customer_name": "Acme Corporation",
  "meeting_date": "2025-04-06",
  "start_time": "2:00 PM",
  "end_time": "3:30 PM",
  "total_hours": "1h 30m",
  "notes": "We discussed their new product launch strategy."
}
- The extracted data is stored in Google Sheets.
- An optional notification is sent via email or Slack.
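The extracted JSON maps naturally onto one spreadsheet row. The sketch below shows one way to validate the payload and flatten it for storage. The class name `MeetingRecord`, its methods, and the column order are assumptions made for illustration; only the six field names come from the example above.

```python
import json
from dataclasses import astuple, dataclass, fields
from typing import List


@dataclass(frozen=True)
class MeetingRecord:
    """Hypothetical container for the six fields shown in the example JSON."""
    customer_name: str
    meeting_date: str
    start_time: str
    end_time: str
    total_hours: str
    notes: str

    @classmethod
    def from_json(cls, payload: str) -> "MeetingRecord":
        data = json.loads(payload)
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        names = [f.name for f in fields(cls)]
        missing = [name for name in names if name not in data]
        if missing:
            raise ValueError(f"missing fields: {', '.join(missing)}")
        # Unknown keys are ignored; known values are coerced to strings.
        return cls(**{name: str(data[name]) for name in names})

    def to_row(self) -> List[str]:
        """Flatten to a list in field order (the column order is an assumption)."""
        return list(astuple(self))
```

Rejecting incomplete payloads before the storage step keeps partially extracted meetings out of the sheet, which matters because LLM output is not guaranteed to contain every field.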
The AI-powered extraction system can handle various formats and expressions:
- Different time formats: "2pm", "14:00", "2 o'clock"
- Relative dates: "yesterday", "today", "last Tuesday"
- Calculation of total hours when not explicitly stated
- Extracting client names from contextual mentions
- Separating meeting notes from factual information
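In this project the language model handles these variations. As a rough illustration of what the normalization involves, for example as a deterministic check on the model's output, the time formats, relative dates, and total-hours calculation can be sketched in plain Python. All function names here are hypothetical, and the sketch makes two assumptions: a bare "2 o'clock" is ambiguous and is read as 02:00, and an end time earlier than the start time is treated as crossing midnight.

```python
import re
from datetime import date, timedelta
from typing import Tuple

_TIME_RE = re.compile(r"^\s*(\d{1,2})(?::(\d{2}))?\s*(am|pm|o'clock)?\s*$", re.IGNORECASE)
_WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def parse_time(text: str) -> Tuple[int, int]:
    """Parse strings such as '2pm', '2:00 PM', '14:00', or "2 o'clock" into (hour, minute)."""
    match = _TIME_RE.match(text)
    if not match:
        raise ValueError(f"unrecognized time: {text!r}")
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    suffix = (match.group(3) or "").lower()
    if suffix == "pm" and hour < 12:
        hour += 12
    elif suffix == "am" and hour == 12:
        hour = 0
    if not (0 <= hour < 24 and 0 <= minute < 60):
        raise ValueError(f"time out of range: {text!r}")
    return hour, minute


def total_hours(start: str, end: str) -> str:
    """Compute the duration between two times, formatted like '1h 30m'."""
    start_h, start_m = parse_time(start)
    end_h, end_m = parse_time(end)
    minutes = (end_h * 60 + end_m) - (start_h * 60 + start_m)
    if minutes < 0:
        minutes += 24 * 60  # assumption: the meeting ran past midnight
    return f"{minutes // 60}h {minutes % 60}m"


def resolve_date(text: str, today: date) -> date:
    """Resolve 'today', 'yesterday', or 'last <weekday>' relative to `today`."""
    key = text.strip().lower()
    if key == "today":
        return today
    if key == "yesterday":
        return today - timedelta(days=1)
    if key.startswith("last ") and key[5:] in _WEEKDAYS:
        target = _WEEKDAYS.index(key[5:])
        # Step back to the most recent matching weekday strictly before today.
        days_back = (today.weekday() - target) % 7 or 7
        return today - timedelta(days=days_back)
    raise ValueError(f"unrecognized date: {text!r}")
```

For the example recording above, `total_hours("2:00 PM", "3:30 PM")` gives `"1h 30m"`, matching the `total_hours` field in the extracted JSON.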
The application is configured using environment variables. See .env.example for a complete list of available options:
# API settings
DEBUG=false
API_HOST=0.0.0.0
API_PORT=8000
# OpenAI API settings
OPENAI_API_KEY=your-openai-api-key-here
DEFAULT_LLM_MODEL=gpt-4o-mini
# Google Sheets settings
GOOGLE_CREDENTIALS_FILE=./credentials/google-service-account.json
GOOGLE_SPREADSHEET_ID=your-google-spreadsheet-id-here
# Notification settings
ENABLE_EMAIL_NOTIFICATIONS=false
ENABLE_SLACK_NOTIFICATIONS=false
NOTIFICATIONS_DEFAULT=false
# Additional notification settings for email and Slack
...
The n8n workflow provided in this repository automates the process of:
- Receiving audio files through a webhook
- Processing them through the API
- Storing the extracted meeting data in Google Sheets
- Sending email notifications for new meetings
See n8n Workflow Documentation for detailed instructions.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.


