🎤 Voice Coding Assistant

Transform your voice into code with AI-powered development assistance


A sophisticated voice-controlled coding assistant that leverages OpenAI's GPT models to help developers create, analyze, and manage code through natural speech interaction. Simply speak your requirements, and watch as complete projects come to life!

✨ Features

  • 🎙️ Voice-to-Code: Convert speech directly into functional code
  • 🔊 Text-to-Speech: AI responses spoken aloud with OpenAI's streaming TTS
  • 🏗️ Hybrid Architecture: Both Simple REST API and True MCP Protocol support
  • 🔌 True MCP Server: Full JSON-RPC 2.0 compliant Model Context Protocol implementation
  • 🎯 Model Selection: Users can choose from multiple OpenAI models (gpt-4o-mini, gpt-4o, gpt-3.5-turbo) based on their cost/performance needs
  • 📁 Smart Project Organization: Automatically creates organized project folders in ai_projects/
  • 🗂️ Custom Locations: Users can specify project locations - "Create in my_workspace folder"
  • 🛡️ Rate Limiting: Prevents abuse and controls API costs by limiting requests per user/IP
  • 🛠️ Multi-Tool Support: File creation, code analysis, command execution
  • 🧠 Chain-of-Thought Reasoning: Multi-step planning for complex tasks

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • OpenAI API key
  • Microphone (optional - works in text mode too)

Installation

  1. Clone the repository

    git clone https://github.com/your-username/voice-coding-assistant.git
    cd voice-coding-assistant
  2. Install dependencies

    pip install -r requirements.txt
  3. Set up environment variables (CLI only)

    cp .env.example .env
    # Edit .env and add your OpenAI API key (for CLI mode only)
  4. Run the assistant (CLI mode)

    python main.py
  5. 🔥 Run Hybrid Mode (Recommended)

    python hybrid_server.py

     Starts both the Simple REST API (port 8000) and the MCP Server (port 8001)

  6. Or run individually:

    # Simple REST API only
    uvicorn server:app --reload
    
    # MCP Server only  
    python mcp_server.py
  7. 🧪 Test the APIs

    python test_apis.py

🎯 Usage Examples

CLI Mode

Run python main.py for interactive voice/text coding assistance.

Simple REST API

Send a POST request to /api/ask:

📋 REST API curl example:
curl -X POST "http://127.0.0.1:8000/api/ask" \
   -H "Content-Type: application/json" \
   -d '{
      "user_input": "Create a Python function to add two numbers",
      "api_key": "sk-...your-openai-key...",
      "model": "gpt-4o-mini",
      "context": {}
   }'

🎯 Model Selection: Users can specify which OpenAI model to use by including a "model" parameter. Defaults to "gpt-4o-mini" if not specified.
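
For reference, here is a minimal Python sketch of the same call. It assumes the requests library is installed and that the server is running locally on port 8000, exactly as in the curl example above:

# Minimal sketch: call the Simple REST API from Python. Assumes `requests` is installed
# and the server started by `python hybrid_server.py` is listening on port 8000.
import requests

payload = {
    "user_input": "Create a Python function to add two numbers",
    "api_key": "sk-...your-openai-key...",   # each request carries its own OpenAI key
    "model": "gpt-4o-mini",                  # optional; defaults to gpt-4o-mini
    "context": {},
}

response = requests.post("http://127.0.0.1:8000/api/ask", json=payload, timeout=120)
response.raise_for_status()
print(response.json())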

True MCP (Model Context Protocol) Server

The project includes a real MCP-compliant server following the JSON-RPC 2.0 protocol:

🔌 1. Initialize the MCP connection
curl -X POST "http://127.0.0.1:8001/mcp/rpc" \
   -H "Content-Type: application/json" \
   -d '{
      "jsonrpc": "2.0",
      "id": 1,
      "method": "initialize",
      "params": {
         "protocolVersion": "2024-11-05",
         "capabilities": {"tools": {}},
         "clientInfo": {"name": "my-client", "version": "1.0.0"}
      }
   }'
🛠️ 2. List available tools
curl -X POST "http://127.0.0.1:8001/mcp/rpc" \
   -H "Content-Type: application/json" \
   -d '{
      "jsonrpc": "2.0",
      "id": 2,
      "method": "tools/list"
   }'
⚡ 3. Call a tool
curl -X POST "http://127.0.0.1:8001/mcp/rpc" \
   -H "Content-Type: application/json" \
   -d '{
      "jsonrpc": "2.0",
      "id": 3,
      "method": "tools/call",
      "params": {
         "name": "create_file",
         "arguments": {
            "file_path": "hello.py",
            "content": "print(\"Hello MCP!\")",
            "api_key": "sk-...your-key..."
         }
      }
   }'
🤖 4. Use the AI assistant
curl -X POST "http://127.0.0.1:8001/mcp/rpc" \
   -H "Content-Type: application/json" \
   -d '{
      "jsonrpc": "2.0",
      "id": 4,
      "method": "assistant/ask",
      "params": {
         "user_input": "Create a todo app",
         "api_key": "sk-...your-key...",
         "model": "gpt-4o-mini",
         "context": {}
      }
   }'

🎯 Model Selection: Both APIs support user-selectable models via the "model" parameter. This allows API key owners to control cost and performance trade-offs.

Note: Each request must include a valid OpenAI API key. The simple API is rate-limited (10/minute), while MCP allows 30/minute for more complex workflows.
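
The same handshake, discovery, and tool-call flow can be scripted from Python. The sketch below is a minimal illustration assuming the requests library and the MCP server listening locally on port 8001, mirroring the curl examples above:

# Minimal sketch of the MCP JSON-RPC flow shown above (initialize, list tools, call a tool).
# Assumes the server from `python mcp_server.py` is listening on port 8001.
import requests

MCP_URL = "http://127.0.0.1:8001/mcp/rpc"

def rpc(method, params=None, request_id=1):
    """Send one JSON-RPC 2.0 request and return the parsed response."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    resp = requests.post(MCP_URL, json=body, timeout=120)
    resp.raise_for_status()
    return resp.json()

# 1. Handshake
print(rpc("initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {"tools": {}},
    "clientInfo": {"name": "my-client", "version": "1.0.0"},
}, request_id=1))

# 2. Discover the available tools
print(rpc("tools/list", request_id=2))

# 3. Create a file through the create_file tool
print(rpc("tools/call", {
    "name": "create_file",
    "arguments": {
        "file_path": "hello.py",
        "content": 'print("Hello MCP!")',
        "api_key": "sk-...your-key...",
    },
}, request_id=3))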

Create a Todo App

🎤 "Create a todo app with HTML, CSS, and JavaScript"

Result: Complete todo application in ai_projects/todo_app/ folder with:

  • index.html - Responsive HTML structure
  • style.css - Modern CSS styling
  • script.js - Interactive JavaScript functionality

Create with Custom Location

🎤 "Create a calculator app in my_projects folder"

Result: Calculator application in my_projects/calculator_app/ folder

Analyze Code

🎤 "Analyze the main.py file"

Result: Detailed code analysis with metrics and insights

Build a Calculator

🎤 "Create a calculator app"

Result: Functional calculator in ai_projects/calculator_app/ folder

📂 Project Structure

voice-coding-assistant/
├── main.py              # Main application entry point (CLI)
├── assistant_core.py    # Core assistant logic (shared by CLI and APIs)
├── server.py            # Simple REST API server
├── mcp_server.py        # True MCP-compliant JSON-RPC server
├── hybrid_server.py     # Runs both APIs simultaneously
├── tools.py             # Tool functions (file ops, analysis)
├── test_apis.py         # Example usage for both APIs
├── requirements.txt     # Python dependencies
├── .env.example         # Environment variables template
├── README.md            # This file
└── ai_projects/         # Organized AI-generated projects
    ├── todo_app/
    ├── calculator_app/
    ├── web_app/
    ├── game_app/
    └── python_project/

🛠️ Available Tools

| Tool | Description | Example Use |
|---|---|---|
| create_file | Create new files with content | Building HTML, CSS, JS files |
| read_file | Read existing file contents | Code review and analysis |
| write_file | Update existing files | Modifying configurations |
| analyze_code | Analyze code structure | Getting code metrics |
| run_command | Execute system commands | Git operations, builds |
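
As an illustration, any of these tools can be invoked through the MCP tools/call method shown earlier. The argument name "file_path" below is an assumption (only create_file's arguments are documented above); call tools/list first to discover each tool's actual input schema:

# Illustrative only: invoking analyze_code through the MCP tools/call method.
# The "file_path" argument name is an assumption; check tools/list for the real schema.
import requests

response = requests.post("http://127.0.0.1:8001/mcp/rpc", json={
    "jsonrpc": "2.0",
    "id": 5,
    "method": "tools/call",
    "params": {
        "name": "analyze_code",
        "arguments": {
            "file_path": "main.py",          # assumed parameter name
            "api_key": "sk-...your-key...",
        },
    },
}, timeout=120)
print(response.json())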

🎨 Smart Project Detection & Custom Locations

The assistant automatically detects project types and creates organized folders in ai_projects/:

| Project Type | Keywords | Default Folder | Custom Location Example |
|---|---|---|---|
| 📝 Todo Apps | todo, task, checklist | ai_projects/todo_app/ | my_workspace/todo_app/ |
| 🧮 Calculator | calc, calculator, math | ai_projects/calculator_app/ | desktop/tools/calculator_app/ |
| 🌤️ Weather Apps | weather, forecast | ai_projects/weather_app/ | projects/weather_app/ |
| 💼 Portfolio | portfolio, resume, cv | ai_projects/portfolio_app/ | websites/portfolio_app/ |
| 🛒 E-commerce | shop, store, cart | ai_projects/ecommerce_app/ | business/ecommerce_app/ |
| 🎮 Games | game, puzzle, play | ai_projects/game_app/ | my_games/game_app/ |
| 🌐 Web Apps | General HTML/CSS/JS | ai_projects/web_app/ | webdev/web_app/ |
| 🐍 Python | Python files | ai_projects/python_project/ | scripts/python_project/ |

🗂️ Custom Location Examples

Users can specify custom locations using natural language:

🎤 "Create a todo app in my_projects folder"
📁 Result: my_projects/todo_app/

🎤 "Put the calculator in desktop/tools"
📁 Result: desktop/tools/calculator_app/

🎤 "Save in location: custom_workspace"
📁 Result: custom_workspace/[detected_project_type]/

🎤 "Create in directory: user_apps"
📁 Result: user_apps/[detected_project_type]/

🎯 Location Detection Patterns

The system recognizes various ways users specify custom locations (a rough pattern-matching sketch follows this list):

  • "Create ... in [folder]"
  • "Put ... in [folder]"
  • "Save in location: [folder]"
  • "Create in directory: [folder]"
  • "Location: [folder]"
  • "Folder: [folder]"
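
The sketch below is a rough illustration of how phrasings like these could be mapped to a target folder with regular expressions. It is not the project's actual parser; the patterns and function name are assumptions for demonstration only.

# Rough illustration only (not the project's actual parser): matching the phrasings above.
import re
from typing import Optional

LOCATION_PATTERNS = [
    r"(?:location|folder|directory):\s*([\w./-]+)",   # "Save in location: custom_workspace"
    r"\bin (?:the )?([\w./-]+) folder\b",             # "Create a todo app in my_projects folder"
    r"\b(?:in|into) ([\w./-]+)\s*$",                  # "Put the calculator in desktop/tools"
]

def extract_location(command: str) -> Optional[str]:
    """Return the first folder-like match, or None to fall back to ai_projects/."""
    for pattern in LOCATION_PATTERNS:
        match = re.search(pattern, command, flags=re.IGNORECASE)
        if match:
            return match.group(1)
    return None

print(extract_location("Create a todo app in my_projects folder"))  # my_projects
print(extract_location("Put the calculator in desktop/tools"))      # desktop/tools
print(extract_location("Save in location: custom_workspace"))       # custom_workspace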

🏗️ Hybrid Architecture Overview

🎤 Voice Coding Assistant - Hybrid Architecture
├── 📡 Simple REST API (Port 8000)     ├── 🔌 MCP Server (Port 8001)
│   ├── POST /api/ask                  │   ├── JSON-RPC 2.0 Protocol
│   ├── Rate Limit: 10/min             │   ├── Rate Limit: 30/min
│   └── Perfect for MVPs               │   └── Standards Compliant
│                                      │
└── 🧠 Shared Core Logic (assistant_core.py)
    ├── OpenAI GPT Integration
    ├── Chain-of-Thought Reasoning
    ├── Tool Execution Engine
    └── Project Organization

🎯 Why Hybrid Approach?

  • 🚀 Speed: Simple REST for quick integrations and testing
  • 📏 Standards: MCP compliance for future-proof AI ecosystem integration
  • 🔄 Flexibility: Developers choose what fits their workflow
  • 💡 Innovation: Best of both worlds without compromise

🔧 Configuration

API Comparison

| Feature | Simple REST API | True MCP Server |
|---|---|---|
| Protocol | HTTP REST | JSON-RPC 2.0 |
| Port | 8000 | 8001 |
| Endpoint | /api/ask | /mcp/rpc |
| Rate Limit | 10/minute | 30/minute |
| Initialization | None required | MCP handshake required |
| Tool Discovery | Not available | tools/list method |
| Compliance | Simple & fast | MCP standard compliant |
| Use Case | Quick integration | Standard MCP clients |

Environment Variables

For CLI mode, create a .env file with:

# Required for CLI
OPENAI_API_KEY=your_openai_api_key_here

For API mode, each request must include an api_key field with a valid OpenAI API key. The .env file is not required for API usage.
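
The difference between the two modes comes down to where the key lives. The CLI presumably reads OPENAI_API_KEY from .env via python-dotenv (an assumption based on the .env.example template; check requirements.txt), while API callers attach the key to every request body:

# Sketch of the two key-handling modes; python-dotenv usage is an assumption.
import os

from dotenv import load_dotenv  # assumption: python-dotenv is among the dependencies

# CLI mode: read the key once from .env
load_dotenv()
cli_key = os.getenv("OPENAI_API_KEY")

# API mode: the key travels with every request, so no .env is needed on the server
request_body = {
    "user_input": "Analyze the main.py file",
    "api_key": "sk-...your-key...",
    "model": "gpt-4o-mini",
    "context": {},
}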

Supported Models

  • gpt-4o-mini (default) - Fast and cost-effective, optimal for most coding tasks
  • gpt-4o - More capable for complex architectural decisions and advanced coding
  • gpt-3.5-turbo - Budget-friendly option for simple tasks
  • gpt-4 - Legacy model, more expensive but highly capable

🎯 Model Selection: Users can choose models based on their specific needs:

  • Cost-conscious: Use gpt-4o-mini for routine coding tasks
  • Performance-critical: Use gpt-4o for complex problem-solving
  • Budget-limited: Use gpt-3.5-turbo for simple file generation

Since users provide their own API keys, they control the cost/performance trade-off!

🧠 How It Works

The assistant uses a structured reasoning approach (a simplified sketch follows the list):

  1. 🔥 START: Processes your voice/text input
  2. 🧠 PLAN: Creates multi-step execution plan
  3. 🛠️ TOOL: Executes necessary tools (file creation, analysis)
  4. 👁️ OBSERVE: Reviews tool outputs
  5. 🤖 OUTPUT: Provides final response
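
The loop can be pictured with a short, self-contained sketch. Everything here (the PlanStep class, the hard-coded plan, the fake create_file tool) is illustrative only; the real implementation lives in assistant_core.py and asks the OpenAI model to produce the plan:

# Simplified, illustrative sketch of the START -> PLAN -> TOOL -> OBSERVE -> OUTPUT loop.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class PlanStep:
    thought: str
    tool: str = ""
    arguments: Dict[str, str] = field(default_factory=dict)

def plan(user_input: str) -> List[PlanStep]:
    # PLAN: in the real assistant these steps come from the model, not from code.
    return [
        PlanStep("User wants a small greeting script"),
        PlanStep("Write the file", tool="create_file",
                 arguments={"file_path": "hello.py", "content": 'print("hi")'}),
    ]

TOOLS = {"create_file": lambda file_path, content: f"created {file_path}"}  # fake tool

def handle(user_input: str) -> str:
    print("START:", user_input)                                    # START: incoming input
    observations = []
    for step in plan(user_input):                                  # PLAN
        if step.tool:
            observations.append(TOOLS[step.tool](**step.arguments))  # TOOL + OBSERVE
    return "Done: " + ", ".join(observations)                      # OUTPUT: final response

print(handle("Create a hello script"))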

Example Workflow

🎤 Input: "Create a landing page for a coffee shop"

🧠 Planning: "User wants a coffee shop landing page"
🧠 Planning: "I'll create HTML structure with header, menu, contact"
🧠 Planning: "Add CSS for warm, coffee-themed styling"
🧠 Planning: "Include JavaScript for interactive menu"

🛠️ Tool: create_file(index.html, [HTML content])
🛠️ Tool: create_file(style.css, [CSS content])
🛠️ Tool: create_file(script.js, [JS content])

🤖 Output: "Created a complete coffee shop landing page!"

Made with ❤️ by Dhrumil Bhut

โญ Star this repo if you find it helpful!
