Transform your voice into code with AI-powered development assistance
A sophisticated voice-controlled coding assistant that leverages OpenAI's GPT models to help developers create, analyze, and manage code through natural speech interaction. Simply speak your requirements, and watch as complete projects come to life!
- Voice-to-Code: Convert speech directly into functional code
- Text-to-Speech: AI responses spoken aloud with OpenAI's streaming TTS
- Hybrid Architecture: Both Simple REST API and True MCP Protocol support
- True MCP Server: Full JSON-RPC 2.0 compliant Model Context Protocol implementation
- Model Selection: Users can choose from multiple OpenAI models (gpt-4o-mini, gpt-4o, gpt-3.5-turbo) based on their cost/performance needs
- Smart Project Organization: Automatically creates organized project folders in ai_projects/
- Custom Locations: Users can specify project locations, e.g. "Create in my_workspace folder"
- Rate Limiting: Prevents abuse and controls API costs by limiting requests per user/IP
- Multi-Tool Support: File creation, code analysis, command execution
- Chain-of-Thought Reasoning: Multi-step planning for complex tasks
- Python 3.8+
- OpenAI API key
- Microphone (optional - works in text mode too)
- Clone the repository
  git clone https://github.com/your-username/voice-coding-assistant.git
  cd voice-coding-assistant
- Install dependencies
  pip install -r requirements.txt
- Set up environment variables (CLI only)
  cp .env.example .env
  # Edit .env and add your OpenAI API key (for CLI mode only)
- Run the assistant (CLI mode)
  python main.py
- Run Hybrid Mode (Recommended)
  python hybrid_server.py
  Starts both the Simple REST API (port 8000) and the MCP Server (port 8001).
- Or run individually:
  # Simple REST API only
  uvicorn server:app --reload
  # MCP Server only
  python mcp_server.py
- Test the APIs
  python test_apis.py
Run python main.py for interactive voice/text coding assistance.
Send a POST request to /api/ask:
curl -X POST "http://127.0.0.1:8000/api/ask" \
-H "Content-Type: application/json" \
-d '{
"user_input": "Create a Python function to add two numbers",
"api_key": "sk-...your-openai-key...",
"model": "gpt-4o-mini",
"context": {}
}'
Model Selection: Users can specify which OpenAI model to use by including a "model" parameter. Defaults to "gpt-4o-mini" if not specified.
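The same request can be sent from Python. The following is a minimal sketch using only the standard library. The helper names `build_ask_request` and `ask` are hypothetical, and since this README does not document the response format, the reply is simply decoded as JSON.

```python
import json
import urllib.request

# Simple REST API endpoint, as used in the curl example above
API_URL = "http://127.0.0.1:8000/api/ask"


def build_ask_request(user_input, api_key, model="gpt-4o-mini", context=None):
    """Build (but do not send) a POST request for /api/ask.

    Hypothetical helper: the field names mirror the curl example.
    """
    payload = {
        "user_input": user_input,
        "api_key": api_key,
        "model": model,
        "context": context if context is not None else {},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_input, api_key, **kwargs):
    """Send the request and decode the reply (assumes the server answers with JSON)."""
    request = build_ask_request(user_input, api_key, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running on port 8000, `ask("Create a Python function to add two numbers", "sk-...your-openai-key...")` sends the same payload as the curl command.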
The project includes a real MCP-compliant server following the JSON-RPC 2.0 protocol:
1. Initialize the MCP connection
curl -X POST "http://127.0.0.1:8001/mcp/rpc" \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "initialize",
"params": {
"protocolVersion": "2024-11-05",
"capabilities": {"tools": {}},
"clientInfo": {"name": "my-client", "version": "1.0.0"}
}
}'
2. List available tools
curl -X POST "http://127.0.0.1:8001/mcp/rpc" \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 2,
"method": "tools/list"
}'
3. Call a tool
curl -X POST "http://127.0.0.1:8001/mcp/rpc" \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 3,
"method": "tools/call",
"params": {
"name": "create_file",
"arguments": {
"file_path": "hello.py",
"content": "print(\"Hello MCP!\")",
"api_key": "sk-...your-key..."
}
}
}'
4. Use the AI assistant
curl -X POST "http://127.0.0.1:8001/mcp/rpc" \
-H "Content-Type: application/json" \
-d '{
"jsonrpc": "2.0",
"id": 4,
"method": "assistant/ask",
"params": {
"user_input": "Create a todo app",
"api_key": "sk-...your-key...",
"model": "gpt-4o-mini",
"context": {}
}
}'
Model Selection: Both APIs support user-selectable models via the "model" parameter. This allows API key owners to control cost and performance trade-offs.
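The four MCP calls share the same JSON-RPC 2.0 envelope, so a client can build them with one helper. This is a sketch only: `rpc_request`, `mcp_session_messages`, and `unwrap_rpc_response` are hypothetical names, and the parameter values are copied from the curl examples rather than from the server code.

```python
import json


def rpc_request(request_id, method, params=None):
    """Return a JSON-RPC 2.0 request envelope as a dict."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def mcp_session_messages(api_key, model="gpt-4o-mini"):
    """Build the four example requests in order (values taken from the curl examples)."""
    return [
        rpc_request(1, "initialize", {
            "protocolVersion": "2024-11-05",
            "capabilities": {"tools": {}},
            "clientInfo": {"name": "my-client", "version": "1.0.0"},
        }),
        rpc_request(2, "tools/list"),
        rpc_request(3, "tools/call", {
            "name": "create_file",
            "arguments": {
                "file_path": "hello.py",
                "content": 'print("Hello MCP!")',
                "api_key": api_key,
            },
        }),
        rpc_request(4, "assistant/ask", {
            "user_input": "Create a todo app",
            "api_key": api_key,
            "model": model,
            "context": {},
        }),
    ]


def unwrap_rpc_response(response):
    """Return the "result" member, or raise if the server reported an "error"."""
    if "error" in response:
        error = response["error"]
        raise RuntimeError(f"JSON-RPC error {error.get('code')}: {error.get('message')}")
    return response["result"]
```

Each message can be serialized with `json.dumps` and POSTed to `/mcp/rpc` on port 8001.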
Note: Each request must include a valid OpenAI API key. The simple API is rate-limited (10/minute), while MCP allows 30/minute for more complex workflows.
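This README does not say how the limits are enforced, so the following is only a sketch of the idea: a sliding-window limiter keyed by client (for example an IP address). The class name `SlidingWindowRateLimiter` and the injectable `clock` parameter are assumptions made for illustration, not the project's actual implementation.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` per client within the last `window_seconds`."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id):
        """Record the request and return True, or return False if the client is over the limit."""
        now = self._clock()
        hits = self._hits[client_id]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Limits quoted in this README: 10/minute for the simple API, 30/minute for MCP.
rest_limiter = SlidingWindowRateLimiter(max_requests=10)
mcp_limiter = SlidingWindowRateLimiter(max_requests=30)
```

A server would call `rest_limiter.allow(client_ip)` before handling a request and reject the request (typically with HTTP 429) when it returns `False`.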
"Create a todo app with HTML, CSS, and JavaScript"
Result: Complete todo application in ai_projects/todo_app/ folder with:
- index.html - Responsive HTML structure
- style.css - Modern CSS styling
- script.js - Interactive JavaScript functionality
"Create a calculator app in my_projects folder"
Result: Calculator application in my_projects/calculator_app/ folder
"Analyze the main.py file"
Result: Detailed code analysis with metrics and insights
"Create a calculator app"
Result: Functional calculator in ai_projects/calculator_app/ folder
voice-coding-assistant/
├── main.py              # Main application entry point (CLI)
├── assistant_core.py    # Core assistant logic (shared by CLI and APIs)
├── server.py            # Simple REST API server
├── mcp_server.py        # True MCP-compliant JSON-RPC server
├── hybrid_server.py     # Runs both APIs simultaneously
├── tools.py             # Tool functions (file ops, analysis)
├── test_apis.py         # Example usage for both APIs
├── requirements.txt     # Python dependencies
├── .env.example         # Environment variables template
├── README.md            # This file
└── ai_projects/         # Organized AI-generated projects
    ├── todo_app/
    ├── calculator_app/
    ├── web_app/
    ├── game_app/
    └── python_project/
| Tool | Description | Example Use |
|---|---|---|
| create_file | Create new files with content | Building HTML, CSS, JS files |
| read_file | Read existing file contents | Code review and analysis |
| write_file | Update existing files | Modifying configurations |
| analyze_code | Analyze code structure | Getting code metrics |
| run_command | Execute system commands | Git operations, builds |
The assistant automatically detects project types and creates organized folders in ai_projects/:
| Project Type | Keywords | Default Folder | Custom Location Example |
|---|---|---|---|
| Todo Apps | todo, task, checklist | ai_projects/todo_app/ | my_workspace/todo_app/ |
| Calculator | calc, calculator, math | ai_projects/calculator_app/ | desktop/tools/calculator_app/ |
| Weather Apps | weather, forecast | ai_projects/weather_app/ | projects/weather_app/ |
| Portfolio | portfolio, resume, cv | ai_projects/portfolio_app/ | websites/portfolio_app/ |
| E-commerce | shop, store, cart | ai_projects/ecommerce_app/ | business/ecommerce_app/ |
| Games | game, puzzle, play | ai_projects/game_app/ | my_games/game_app/ |
| Web Apps | General HTML/CSS/JS | ai_projects/web_app/ | webdev/web_app/ |
| Python | Python files | ai_projects/python_project/ | scripts/python_project/ |
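Keyword-based detection like the table describes can be sketched as a lookup over the words in the request. This is an illustration, not the project's code: the function names are hypothetical, the keywords for the last two rows (`html`, `css`, `js`, `python`, ...) are assumptions because the table only describes them loosely, and whole-word matching with first-match-wins ordering is a design choice made here.

```python
import re

# (folder, keywords) in the same order as the table; the first match wins.
PROJECT_TYPES = [
    ("todo_app", ("todo", "task", "checklist")),
    ("calculator_app", ("calc", "calculator", "math")),
    ("weather_app", ("weather", "forecast")),
    ("portfolio_app", ("portfolio", "resume", "cv")),
    ("ecommerce_app", ("shop", "store", "cart")),
    ("game_app", ("game", "puzzle", "play")),
    # Assumed keywords: the table only says "General HTML/CSS/JS" and "Python files".
    ("web_app", ("html", "css", "javascript", "js")),
    ("python_project", ("python", "py")),
]


def detect_project_type(user_input):
    """Return the folder name for the first project type whose keyword appears as a whole word."""
    words = set(re.findall(r"[a-z0-9]+", user_input.lower()))
    for folder, keywords in PROJECT_TYPES:
        if words.intersection(keywords):
            return folder
    return None  # what the assistant does when nothing matches is not documented


def default_project_path(user_input, base="ai_projects"):
    """Return the default folder for a request, e.g. "ai_projects/todo_app/"."""
    folder = detect_project_type(user_input)
    return None if folder is None else f"{base}/{folder}/"
```

Whole-word matching avoids false hits such as "display" matching `play`, at the cost of missing plurals like "tasks".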
Users can specify custom locations using natural language:
"Create a todo app in my_projects folder"
Result: my_projects/todo_app/
"Put the calculator in desktop/tools"
Result: desktop/tools/calculator_app/
"Save in location: custom_workspace"
Result: custom_workspace/[detected_project_type]/
"Create in directory: user_apps"
Result: user_apps/[detected_project_type]/
The system recognizes various ways users specify custom locations:
- "Create ... in [folder]"
- "Put ... in [folder]"
- "Save in location: [folder]"
- "Create in directory: [folder]"
- "Location: [folder]"
- "Folder: [folder]"
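One way to recognize these phrasings is a pair of regular expressions: one for the labelled forms ("Location: ...", "Folder: ...", "directory: ...") and one for "Create/Put/Save ... in [folder]". The sketch below is an assumption about how such parsing could work, not the project's actual parser, and `extract_custom_location` is a hypothetical name.

```python
import re

# "Save in location: x", "Create in directory: x", "Location: x", "Folder: x"
_LABELLED = re.compile(
    r"\b(?:location|directory|folder)\s*:\s*([\w./-]+)", re.IGNORECASE
)

# "Create ... in x", "Put ... in x", optionally followed by "folder" or "directory"
_IN_FOLDER = re.compile(
    r"\b(?:create|put|save)\b.*?\bin\s+([\w./-]+?)(?:\s+(?:folder|directory))?\s*$",
    re.IGNORECASE,
)


def extract_custom_location(user_input):
    """Return the folder named in the request, or None if no custom location is given."""
    text = user_input.strip()
    for pattern in (_LABELLED, _IN_FOLDER):
        match = pattern.search(text)
        if match:
            # Drop a trailing slash or sentence-ending period.
            return match.group(1).rstrip("./")
    return None
```

This simple version expects the folder at the end of the sentence and would misread a request such as "Create a game in Python", so a real parser needs extra checks.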
Voice Coding Assistant - Hybrid Architecture
├── Simple REST API (Port 8000)
│   ├── POST /api/ask
│   ├── Rate Limit: 10/min
│   └── Perfect for MVPs
├── MCP Server (Port 8001)
│   ├── JSON-RPC 2.0 Protocol
│   ├── Rate Limit: 30/min
│   └── Standards Compliant
└── Shared Core Logic (assistant_core.py)
    ├── OpenAI GPT Integration
    ├── Chain-of-Thought Reasoning
    ├── Tool Execution Engine
    └── Project Organization
- Speed: Simple REST for quick integrations and testing
- Standards: MCP compliance for future-proof AI ecosystem integration
- Flexibility: Developers choose what fits their workflow
- Innovation: Best of both worlds without compromise
| Feature | Simple REST API | True MCP Server |
|---|---|---|
| Protocol | HTTP REST | JSON-RPC 2.0 |
| Port | 8000 | 8001 |
| Endpoint | /api/ask | /mcp/rpc |
| Rate Limit | 10/minute | 30/minute |
| Initialization | None required | MCP handshake required |
| Tool Discovery | Not available | tools/list method |
| Compliance | Simple & fast | MCP standard compliant |
| Use Case | Quick integration | Standard MCP clients |
For CLI mode, create a .env file with:
# Required for CLI
OPENAI_API_KEY=your_openai_api_key_here
For API mode, each request must include an api_key field with a valid OpenAI API key. The .env file is not required for API usage.
- gpt-4o-mini (default) - Fast and cost-effective, optimal for most coding tasks
- gpt-4o - More capable for complex architectural decisions and advanced coding
- gpt-3.5-turbo - Budget-friendly option for simple tasks
- gpt-4 - Legacy model, more expensive but highly capable
Model Selection: Users can choose models based on their specific needs:
- Cost-conscious: Use gpt-4o-mini for routine coding tasks
- Performance-critical: Use gpt-4o for complex problem-solving
- Budget-limited: Use gpt-3.5-turbo for simple file generation
Since users provide their own API keys, they control the cost/performance trade-off!
The assistant uses a structured reasoning approach:
- START: Processes your voice/text input
- PLAN: Creates multi-step execution plan
- TOOL: Executes necessary tools (file creation, analysis)
- OBSERVE: Reviews tool outputs
- OUTPUT: Provides final response
Input: "Create a landing page for a coffee shop"
Planning: "User wants a coffee shop landing page"
Planning: "I'll create HTML structure with header, menu, contact"
Planning: "Add CSS for warm, coffee-themed styling"
Planning: "Include JavaScript for interactive menu"
Tool: create_file(index.html, [HTML content])
Tool: create_file(style.css, [CSS content])
Tool: create_file(script.js, [JS content])
Output: "Created a complete coffee shop landing page!"
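The START, PLAN, TOOL, OBSERVE, OUTPUT cycle can be sketched as a small driver loop. Everything here is illustrative: the step dictionaries, `run_reasoning_loop`, `scripted_model`, and the in-memory `create_file` tool are assumptions, and a scripted stand-in replaces the real OpenAI call so the loop can be followed without an API key.

```python
def run_reasoning_loop(next_step, tools, user_input, max_steps=20):
    """Ask the model for one step at a time until it emits an OUTPUT step.

    `next_step(history)` stands in for the model call and must return a dict
    such as {"step": "plan", "content": "..."} or
    {"step": "tool", "name": "create_file", "arguments": {...}}.
    """
    history = [{"step": "start", "content": user_input}]
    for _ in range(max_steps):
        step = next_step(history)
        history.append(step)
        if step["step"] == "tool":
            # Run the requested tool and feed its result back as an OBSERVE step.
            result = tools[step["name"]](**step["arguments"])
            history.append({"step": "observe", "content": result})
        elif step["step"] == "output":
            return step["content"], history
    raise RuntimeError("model produced no OUTPUT step within max_steps")


def scripted_model(script):
    """Return a fake model that replays a fixed list of steps (for demonstration only)."""
    steps = iter(script)
    return lambda history: next(steps)


def make_memory_tools():
    """Return (tools, files): a create_file tool that writes into a dict instead of the disk."""
    files = {}

    def create_file(file_path, content):
        files[file_path] = content
        return f"created {file_path}"

    return {"create_file": create_file}, files
```

Replaying the coffee shop example means scripting one `plan` step per planning line, one `tool` step per `create_file` call, and a final `output` step. The returned history then contains an `observe` entry after each tool call.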
Made with ❤️ by Dhrumil Bhut
⭐ Star this repo if you find it helpful!