Voice-TimeLogger-Agent

AI-powered automation system for consultant time tracking

Key Features • Architecture • Getting Started • Deployment • Usage • Contributing • License

Voice-TimeLogger-Agent is an AI-powered system that automates time tracking for consultants. A consultant records a short voice message after each meeting, and the system extracts the key details (customer name, start and end times, total hours, and notes) and stores them in Google Sheets.

Key Features

Voice Processing - Convert voice recordings to text using the OpenAI Whisper API
AI Data Extraction - Extract structured meeting data from the transcribed text using an LLM
Google Sheets Integration - Automatically store meeting details in Google Sheets
API Endpoint - Simple REST API for uploading voice recordings
Notifications - Optional email or Slack notifications for new meeting entries
n8n Workflow - Visual automation workflow for processing audio files
Dockerized Deployment - Easy deployment using Docker and Docker Compose

Architecture

Technology Stack

FastAPI - Modern, fast API framework for Python
OpenAI Whisper - State-of-the-art speech recognition model
GPT Models - Language models used for structured data extraction
Google Sheets API - For structured data storage
n8n - Workflow automation platform
Docker - Container platform for deployment

Component Overview

API Service - Handles audio file uploads and processing
Speech Service - Transcribes audio using the OpenAI Whisper API
Extraction Service - Extracts structured data using GPT models
Storage Service - Stores meeting data in Google Sheets
Notification Service - Sends notifications via email or Slack
n8n Workflow - Provides visual workflow management and automation
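
The way these components hand data to one another can be sketched in a few lines of Python. This is an illustration only, not the project's actual code: `MeetingRecord` and `process_recording` are hypothetical names, and each service is represented by a plain callable so the flow can be read (and tested) without any external API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MeetingRecord:
    # Field names follow the JSON example in "How It Works".
    customer_name: str
    meeting_date: str
    start_time: str
    end_time: str
    total_hours: str
    notes: str


def process_recording(
    audio: bytes,
    transcribe: Callable[[bytes], str],            # stands in for the Speech Service
    extract: Callable[[str], MeetingRecord],       # stands in for the Extraction Service
    store: Callable[[MeetingRecord], None],        # stands in for the Storage Service
    notify: Optional[Callable[[MeetingRecord], None]] = None,  # Notification Service
) -> MeetingRecord:
    """Run one recording through the pipeline: transcribe, extract, store, notify."""
    text = transcribe(audio)
    record = extract(text)
    store(record)
    if notify is not None:  # notifications are optional
        notify(record)
    return record
```

Passing the services in as callables keeps each step replaceable, for example swapping the real speech service for a stub in tests.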

Getting Started

Prerequisites

Docker and Docker Compose
OpenAI API key
Google Cloud Platform account with Google Sheets API enabled
Google service account JSON key file with Google Sheets access

Installation

Clone the repository:

git clone https://github.com/OthmanMohammad/Voice-TimeLogger-Agent.git
cd Voice-TimeLogger-Agent

Configure environment variables:

cp .env.example .env
# Edit .env with your API keys and configuration

Add Google service account credentials:

mkdir -p credentials
# Copy your Google service account JSON file to credentials/google-service-account.json

Run the setup script:

chmod +x setup.sh
./setup.sh

Start the services:

docker-compose up -d

The API will be available at http://localhost:8000 and n8n at http://localhost:5678.

Deployment

Local Deployment

For local development and testing, follow the Installation steps above.

EC2 Deployment

To deploy on AWS EC2:

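Prepare your local environment by completing the first three Installation steps (clone the repository, configure environment variables, and add the Google service account credentials)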
Make the deployment script executable:

chmod +x deploy-to-ec2.sh

Run the deployment script:

./deploy-to-ec2.sh ./path/to/your-key.pem ec2-user@your-ec2-ip

For more detailed deployment instructions, see Deployment Documentation.

Usage

Direct API Usage

Upload an audio recording directly to the API:

curl -X POST "http://localhost:8000/api/v1/speech/upload" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/path/to/recording.mp3" \
  -F "notify=true"
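
The same upload can be made from Python. The sketch below uses only the standard library and mirrors the curl call above (a `file` part plus a `notify` field); `build_multipart` and `upload_recording` are hypothetical helper names, and the response is returned as raw bytes because its format is not described here.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Dict, Optional, Tuple

API_URL = "http://localhost:8000/api/v1/speech/upload"


def build_multipart(
    fields: Dict[str, str], filename: str, data: bytes, boundary: Optional[str] = None
) -> Tuple[bytes, str]:
    """Encode text fields plus one file part named "file" as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: {file_type}\r\n\r\n'.encode()
        + data
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def upload_recording(path: str, notify: bool = True, url: str = API_URL) -> bytes:
    """POST an audio file to the upload endpoint and return the raw response body."""
    audio = Path(path)
    body, content_type = build_multipart(
        {"notify": str(notify).lower()}, audio.name, audio.read_bytes()
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the services running, `upload_recording("/path/to/recording.mp3")` sends the request; a third-party HTTP client would shorten this considerably if adding a dependency is acceptable.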

n8n Workflow

Access n8n at http://localhost:5678
Import the workflow from n8n_workflow/voice-timelogger-workflow.json
Configure the webhook URL with credentials
Upload audio through the webhook URL:

curl -X POST "http://localhost:5678/webhook/YOUR_WEBHOOK_PATH" \
  -F "file=@/path/to/recording.mp3" \
  -F "notify=true"

API Endpoints

The application provides the following REST API endpoints:

Endpoint	Method	Description
`/api/v1/speech/upload`	POST	Upload and process an audio recording
`/api/v1/speech/transcribe`	POST	Transcribe an audio file without extraction
`/api/v1/speech/process`	POST	Process an audio file without storage
`/api/v1/meetings/extract`	POST	Extract meeting data from text
`/health`	GET	Health check endpoint

For a complete API reference, see the Postman Collection.
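
A quick way to confirm the API is up before calling the other endpoints is to poll `/health`. The following is a minimal sketch using the standard library; it assumes only that a healthy service answers with HTTP 200, since the response body is not documented here. `is_healthy` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def is_healthy(base_url="http://localhost:8000", timeout=2.0,
               opener=urllib.request.urlopen) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with opener(f"{base_url}/health", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(base_url="http://localhost:8000", attempts=10,
                       delay=1.0, check=is_healthy) -> bool:
    """Poll the health endpoint until it succeeds or the attempts run out."""
    for _ in range(attempts):
        if check(base_url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_until_healthy()` after `docker-compose up -d` is one way to avoid sending uploads before the container has finished starting.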

How It Works

Audio Processing Workflow

The consultant records a voice message after a meeting, saying something like:

"I had a meeting with Acme Corporation today from 2:00 PM to 3:30 PM. We discussed their new product launch strategy."

The voice message is uploaded to the API or n8n webhook.
The system transcribes the audio using the OpenAI Whisper API.

An AI model extracts structured data from the transcription:

{
  "customer_name": "Acme Corporation",
  "meeting_date": "2025-04-06",
  "start_time": "2:00 PM",
  "end_time": "3:30 PM",
  "total_hours": "1h 30m",
  "notes": "We discussed their new product launch strategy."
}

The extracted data is stored in Google Sheets.
An optional notification is sent via email or Slack.
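
Because the structured data comes from a language model, it can be worth checking a record before it is written to the sheet. The sketch below validates the JSON shape shown above; `validate_record` is a hypothetical helper, and the date and time formats it enforces are assumptions taken from that single example rather than a documented schema.

```python
import json
from datetime import date, datetime

REQUIRED_FIELDS = (
    "customer_name", "meeting_date", "start_time",
    "end_time", "total_hours", "notes",
)


def validate_record(raw: str) -> dict:
    """Parse extracted JSON and check it has the expected fields and formats.

    Raises ValueError if a field is missing or a date/time cannot be parsed.
    """
    record = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    # Assumed formats: ISO date ("2025-04-06") and 12-hour times ("2:00 PM").
    date.fromisoformat(record["meeting_date"])
    for key in ("start_time", "end_time"):
        datetime.strptime(record[key], "%I:%M %p")
    return record
```

Rejecting a malformed record at this point keeps partial or misparsed rows out of the spreadsheet.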

Data Extraction Capabilities

The AI-powered extraction system can handle various formats and expressions:

Different time formats: "2pm", "14:00", "2 o'clock"
Relative dates: "yesterday", "today", "last Tuesday"
Calculation of total hours when not explicitly stated
Extracting client names from contextual mentions
Separating meeting notes from factual information
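
In the application these cases are handled by the language model. The deterministic sketch below only illustrates the kind of normalization involved and could serve as a cross-check on model output. All function names are hypothetical, "last Tuesday" is assumed to mean the most recent Tuesday strictly before today, and a bare "2 o'clock" is left as 02:00 because the AM/PM ambiguity can only be resolved from context.

```python
import re
from datetime import date, datetime, time, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def parse_time(text: str) -> time:
    """Parse expressions such as "2pm", "2:30 PM", "14:00", or "2 o'clock"."""
    match = re.fullmatch(
        r"\s*(\d{1,2})(?::(\d{2}))?\s*(am|pm|o'clock)?\s*", text.lower()
    )
    if not match:
        raise ValueError(f"unrecognized time: {text!r}")
    hour, minute, suffix = int(match.group(1)), int(match.group(2) or 0), match.group(3)
    if suffix == "pm" and hour < 12:
        hour += 12
    elif suffix == "am" and hour == 12:
        hour = 0
    return time(hour, minute)


def resolve_date(text: str, today: date) -> date:
    """Resolve "today", "yesterday", or "last <weekday>" relative to `today`."""
    text = text.lower().strip()
    if text == "today":
        return today
    if text == "yesterday":
        return today - timedelta(days=1)
    match = re.fullmatch(r"last (\w+)", text)
    if match and match.group(1) in WEEKDAYS:
        target = WEEKDAYS.index(match.group(1))
        # Days back to the most recent such weekday, never zero.
        days_back = (today.weekday() - target) % 7 or 7
        return today - timedelta(days=days_back)
    raise ValueError(f"unrecognized date: {text!r}")


def total_hours(start: time, end: time) -> str:
    """Compute the duration as "<h>h <mm>m" when it is not stated explicitly."""
    begin = datetime.combine(date.min, start)
    finish = datetime.combine(date.min, end)
    if finish < begin:  # meeting ran past midnight
        finish += timedelta(days=1)
    minutes = int((finish - begin).total_seconds() // 60)
    return f"{minutes // 60}h {minutes % 60:02d}m"
```

For the example message in the workflow above, `total_hours(parse_time("2:00 PM"), parse_time("3:30 PM"))` returns `"1h 30m"`.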

Configuration

The application is configured using environment variables. The main options are shown below; see .env.example for the complete list:

# API settings
DEBUG=false
API_HOST=0.0.0.0
API_PORT=8000

# OpenAI API settings
OPENAI_API_KEY=your-openai-api-key-here
DEFAULT_LLM_MODEL=gpt-4o-mini

# Google Sheets settings
GOOGLE_CREDENTIALS_FILE=./credentials/google-service-account.json
GOOGLE_SPREADSHEET_ID=your-google-spreadsheet-id-here

# Notification settings
ENABLE_EMAIL_NOTIFICATIONS=false
ENABLE_SLACK_NOTIFICATIONS=false
NOTIFICATIONS_DEFAULT=false

# Additional notification settings for email and Slack
...
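
For reference, here is one way such variables could be read into a typed settings object in Python. This is a sketch, not the project's actual configuration code: `Settings` and `load_settings` are hypothetical names, the defaults simply repeat the values shown above, and the elided email and Slack options are deliberately left out.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings from environment variables."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    debug: bool
    api_host: str
    api_port: int
    openai_api_key: str
    default_llm_model: str
    google_credentials_file: str
    google_spreadsheet_id: str
    enable_email_notifications: bool
    enable_slack_notifications: bool
    notifications_default: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on missing secrets."""

    def required(name: str) -> str:
        value = env.get(name)
        if not value:
            raise RuntimeError(f"{name} must be set")
        return value

    return Settings(
        debug=_as_bool(env.get("DEBUG", "false")),
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "8000")),
        openai_api_key=required("OPENAI_API_KEY"),
        default_llm_model=env.get("DEFAULT_LLM_MODEL", "gpt-4o-mini"),
        google_credentials_file=env.get(
            "GOOGLE_CREDENTIALS_FILE", "./credentials/google-service-account.json"
        ),
        google_spreadsheet_id=required("GOOGLE_SPREADSHEET_ID"),
        enable_email_notifications=_as_bool(env.get("ENABLE_EMAIL_NOTIFICATIONS", "false")),
        enable_slack_notifications=_as_bool(env.get("ENABLE_SLACK_NOTIFICATIONS", "false")),
        notifications_default=_as_bool(env.get("NOTIFICATIONS_DEFAULT", "false")),
    )
```

Treating the OpenAI key and spreadsheet ID as required surfaces a misconfigured `.env` at startup rather than on the first upload.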

n8n Workflow

The n8n workflow provided in this repository automates the process of:

Receiving audio files through a webhook
Processing them through the API
Storing the extracted meeting data in Google Sheets
Sending email notifications for new meetings

See n8n Workflow Documentation for detailed instructions.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

OthmanMohammad/Voice-TimeLogger-Agent
