Hawaii-built, localhost-first AI agent platform powered by Microsoft Agent Framework
OpenChatCi is a local AI agent runtime and UI that connects modern agent frameworks to the browser through the AG-UI protocol. Run AI agents directly on localhost with a modern interface, streaming responses, and tool integrations.
- Chat with AI agents via AG-UI protocol (SSE streaming)
- Rich message rendering: Markdown, code blocks, math (KaTeX), Mermaid diagrams
- LLM reasoning visualization with collapsible thinking blocks
- Web search with inline citation links
- Voice input via microphone with Whisper transcription
- Text-to-Speech playback and download via ElevenLabs
- Multimodal image analysis (file attachment, drag-and-drop, URL)
- Image generation, editing, and Canvas mask editor via Azure OpenAI gpt-image-1.5
- Weather tools with rich card widgets (Open-Meteo, no API key)
- Coding tools (file read/write, shell execution, file search)
- Prompt Templates: save, manage, and insert reusable prompts from "+" menu and message actions
- Agent Skills: portable domain knowledge packages with progressive disclosure
- MCP Integration: connect external tools via Model Context Protocol (Claude Desktop-compatible config)
- MCP Apps: interactive UI rendered in sandboxed iframes for MCP tools with `_meta.ui` resources
- RAG Pipeline: PDF ingestion with ChromaDB vector search, Azure OpenAI embedding, and source citations
- Batch Processing: async job queue via Core MCP Server with real-time MCP Apps dashboard
- Multi-model switching: switch between OpenAI models mid-conversation with per-model reasoning effort and context window limits
- Session management: save, search, pin, archive, fork, rename
- Background Responses: long-running agent timeout prevention with stream resumption
- Context window consumption display with warning levels
- Per-turn token usage display
- OpenAI-compatible API: expose the agent as a `/v1/responses` endpoint for external apps via the OpenAI SDK
- HTTPS/TLS support for LAN access with Secure Context (mkcert recommended)
- Multilingual chat with browser auto-translation suppressed
- Three layout scenarios: Chat, Popup, Sidebar
```bash
pip install openchatci
openchatci init
# Edit .env and set AZURE_OPENAI_ENDPOINT
az login
openchatci
```

Open http://localhost:8000/chat
| Tool | Version | Install |
|---|---|---|
| Node.js | 22+ | https://nodejs.org/ |
| pnpm | 10+ | npm install -g pnpm |
| Python | 3.12+ | https://www.python.org/ |
| uv | 0.9+ | https://docs.astral.sh/uv/ |
| Azure CLI | 2.x | https://learn.microsoft.com/cli/azure/install-azure-cli |
The backend authenticates to Azure OpenAI via AzureCliCredential.
You must log in before starting.
```bash
az login
```

Select the subscription if needed:

```bash
az account set --subscription <subscription-id>
```

Windows (PowerShell):
```powershell
cd backend
copy .env.sample .env
# Edit .env and set your Azure OpenAI endpoint
notepad .env
uv sync --prerelease=allow
```

macOS / Linux:
```bash
cd backend
cp .env.sample .env
# Edit .env and set your Azure OpenAI endpoint
nano .env
uv sync --prerelease=allow
```

.env configuration (required):
```env
AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/
AZURE_OPENAI_MODELS=gpt-4o
```
```bash
cd frontend
pnpm install
```

Open two terminals:
Terminal 1 -- Backend:
```bash
cd backend
uv run uvicorn app.main:app --reload --app-dir src
```

Backend starts at http://localhost:8000
Terminal 2 -- Frontend:
```bash
cd frontend
pnpm dev
```

Frontend dev server starts at http://localhost:5173 (API requests are proxied to the backend)
```bash
cd frontend
pnpm build
cd ../backend
uv run uvicorn app.main:app --app-dir src
```

The backend serves both the frontend build artifacts and the API at http://localhost:8000
| Command | Description |
|---|---|
| `openchatci` | Start the server |
| `openchatci init` | Generate .env from template |
| `openchatci init --force` | Overwrite existing .env |
| `openchatci --host 0.0.0.0` | Bind to all interfaces |
| `openchatci --port 9000` | Use custom port |
| `openchatci --skip-auth-check` | Skip Azure CLI login check |
| `openchatci --ssl-certfile cert.pem --ssl-keyfile key.pem` | Enable HTTPS (LAN access) |
| `openchatci --version` | Show version |
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | React 19 + TypeScript + Vite | UI framework |
| Frontend | Tailwind CSS + shadcn/ui | Styling + Components |
| Frontend | Biome | Format + Lint |
| Backend | FastAPI + Python 3.12+ | API server |
| Backend | Microsoft Agent Framework | Agent execution + Tool control |
| Backend | Ruff | Format + Lint |
| Package | uv | Python dependency management |
| Package | pnpm | Node.js dependency management |
Save and reuse prompt templates from the chat interface:
```env
TEMPLATES_DIR=.templates
```
- Click + button > Use template to open the management modal
- Create, edit, delete templates with name, category, and body
- Insert to Chat pastes the template into the input (editable before send)
- Click the FileText icon on any user message to save it as a template
Templates are stored as individual JSON files in the configured directory.
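For illustration, a stored template file might look like the following (the exact JSON schema is an assumption; the source only states that each template is an individual JSON file with a name, category, and body):

```json
{
  "name": "Summarize article",
  "category": "Writing",
  "body": "Summarize the following article in three bullet points:\n\n"
}
```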
Generate and edit images via Azure OpenAI gpt-image-1.5:
```env
IMAGE_DEPLOYMENT_NAME=gpt-image-1.5
```
- generate_image: create images from text prompts with configurable size, quality, format, background, and count (1-4)
- edit_image: modify existing session images using text prompts (prompt-based)
- Canvas Mask Editor: click the Edit button on any generated image to open a full-screen mask editor
- Draw over areas to edit with brush tools (S/M/L), eraser, undo/redo
- Enter a prompt and click Generate -- the agent edits only the masked region
- Generated images displayed inline in chat with click-to-open full-size
- Images stored in session upload directory and persist across reloads
The agent automatically uses these tools when users request image creation or editing. No opt-in flag needed -- the feature activates when IMAGE_DEPLOYMENT_NAME is set.
Enable AI-powered file operations and shell execution:
```env
CODING_ENABLED=true
CODING_WORKSPACE_DIR=C:\path\to\workspace
```
Enable on-demand TTS for messages via ElevenLabs:
```env
ELEVENLABS_API_KEY=your-api-key
TTS_MODEL_ID=eleven_multilingual_v2
TTS_VOICE_ID=your-voice-id
```
Speaker button plays audio, download button saves MP3 file. Audio is cached to avoid duplicate API calls.
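The caching behavior can be sketched as follows (the in-memory cache and hash-based keying are assumptions for illustration, not OpenChatCi's actual implementation):

```python
import hashlib

# Hypothetical in-memory cache: hash of message text -> MP3 bytes
_audio_cache: dict[str, bytes] = {}

def get_tts_audio(text: str, synthesize) -> bytes:
    """Return cached audio for `text`, calling the TTS backend only on a miss."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _audio_cache:
        _audio_cache[key] = synthesize(text)  # e.g. one ElevenLabs API call
    return _audio_cache[key]
```

Playing or downloading the same message twice then costs only one API call.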
Extend the agent with domain knowledge packages (Agent Skills specification):
```env
SKILLS_DIR=.skills
```
Place SKILL.md files in subdirectories. The agent discovers and loads skills on demand:
.skills/
my-skill/
├── SKILL.md # Required: instructions + metadata
├── scripts/ # Optional: executable code
├── references/ # Optional: documentation
└── assets/ # Optional: templates, resources
Skills use progressive disclosure to minimize context window consumption (~100 tokens per skill when idle).
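A minimal SKILL.md might look like this (the frontmatter-plus-instructions shape follows the Agent Skills convention; the concrete content is illustrative):

```markdown
---
name: release-notes
description: Use when the user asks to draft release notes from a changelog.
---

# Release Notes Skill

1. Read the changelog the user provides.
2. Group entries into Features, Fixes, and Breaking Changes.
3. Keep each bullet under one sentence.
```

Only the frontmatter is kept in context while the skill is idle; the body loads when the skill becomes relevant.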
Connect external tools and services via Model Context Protocol using the Claude Desktop-compatible configuration format:
```env
MCP_CONFIG_FILE=mcp_servers.json
```
Create an `mcp_servers.json` file (see `backend/mcp_servers.sample.json`):
```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/workspace"]
    },
    "remote-api": {
      "url": "https://api.example.com/mcp",
      "headers": { "Authorization": "Bearer token" }
    }
  }
}
```

- stdio servers (with `command`): OpenChatCi spawns the process and communicates via stdin/stdout
- HTTP/SSE servers (with `url`): OpenChatCi connects to a running remote server
- MCP tools appear alongside built-in tools (Weather, Coding, Image Generation)
- Tool calls display with categorized icons: built-in tools have dedicated icons, Skills tools show BookOpen/FileText, MCP tools show Plug
- Server lifecycle managed automatically (startup/shutdown with zombie process prevention)
- Reuse your existing Claude Desktop / Claude Code / Cursor MCP configurations
MCP tools that declare a `_meta.ui` resource automatically render interactive UI within chat messages. The HTML View runs in a secure double-iframe sandbox with CSP enforcement.
```env
# Optional: change the sandbox proxy port (default 8081)
# MCP_APPS_SANDBOX_PORT=8081
```
- Automatic discovery: UI-enabled MCP tools detected at server startup
- Double-iframe sandbox: Views run on a separate origin with no access to host DOM, cookies, or storage
- CSP enforcement: external resources blocked by default; servers declare required domains via metadata
- View-to-Server proxying: all View interactions proxied through the Host (auditable)
- Display modes: inline (in chat) and fullscreen
- Session persistence: View HTML stored as files for reload restoration
- Progressive enhancement: tools work as text-only when UI is unavailable or unsupported
No configuration needed -- MCP Apps activates when MCP tools have `_meta.ui.resourceUri` in their definitions. The sandbox proxy starts automatically alongside MCP servers.
Upload PDF documents and ask questions about their content using vector similarity search:
```env
CHROMA_DIR=.chroma
RAG_COLLECTION_NAME=default
RAG_TOP_K=5
EMBEDDING_DEPLOYMENT_NAME=text-embedding-3-small
RAG_CHUNK_SIZE=800
RAG_CHUNK_OVERLAP=200
```
- Click + button > Attach PDF to upload a document
- Ask the agent: "Please ingest this document"
- The batch job processes: PDF parsing > chunking > embedding > ChromaDB storage
- Ask questions: "What does the document say about X?"
- The agent searches the knowledge base and responds with source citations (filename, page)
- ChromaDB PersistentClient: file-based vector storage (`.chroma/` directory)
- Azure OpenAI Embedding: `text-embedding-3-small` for consistent multilingual quality
- Overlap chunking: configurable chunk size (800 chars) and overlap (200 chars)
- Metadata filtering: source filename, page number, chunk index for precise citation
- Deduplication: re-ingesting the same file overwrites existing chunks automatically
- PDF file cards: PDFs display as file icon cards (not image thumbnails) in chat
Requires the Batch Processing MCP Server to be configured (see below).
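The overlap-chunking scheme described above can be sketched in a few lines (a simplified character-based version; the actual pipeline may split on token or sentence boundaries):

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks where each chunk repeats the last
    `overlap` characters of the previous one, so content at a chunk boundary
    is never lost from both chunks."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

With the defaults, a 2,000-character page yields three 800-character chunks, each sharing 200 characters with its neighbor.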
Run long-running tasks (RAG ingestion, data pipelines) as background batch jobs with a real-time monitoring dashboard:
- Add `"batch"` to your `mcp_servers.json` (see `backend/mcp_servers.sample.json`)
- Start the server -- the batch MCP server launches automatically
- Ask the agent: "Submit a sleep job for 60 seconds"
- A real-time dashboard appears inline showing progress, status, and controls
```json
{
  "mcpServers": {
    "batch": {
      "command": "uv",
      "args": ["run", "python", "-m", "app.mcp_batch.server"],
      "env": { "BATCH_JOBS_DIR": ".jobs" }
    }
  }
}
```

- Conversation-based management: submit, monitor, cancel, delete jobs via chat
- MCP Apps dashboard: auto-refreshing progress bars, cancel/delete with confirmation dialogs
- File-based persistence: each job stored as a JSON file (crash-resilient)
- Extensible job types: sample sleep job + RAG Ingestion Pipeline (`rag-ingest`)
- Cooperative cancellation: jobs check the cancel flag at each progress checkpoint
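Cooperative cancellation can be sketched as follows (the job-file schema -- `cancel`, `status`, and `progress` keys -- is illustrative, not OpenChatCi's actual format):

```python
import json
import time
from pathlib import Path

def run_sleep_job(job_file: Path, total_seconds: float, checkpoints: int = 10) -> str:
    """Sleep in slices, re-reading the job file at each checkpoint so a cancel
    request written by the dashboard (or chat) takes effect at the next one."""
    for step in range(1, checkpoints + 1):
        state = json.loads(job_file.read_text())
        if state.get("cancel"):
            state["status"] = "cancelled"
            job_file.write_text(json.dumps(state))
            return "cancelled"
        time.sleep(total_seconds / checkpoints)
        state["progress"] = step / checkpoints
        job_file.write_text(json.dumps(state))
    state["status"] = "done"
    job_file.write_text(json.dumps(state))
    return "done"
```

Because the job only checks the flag between slices, cancellation is bounded by the checkpoint interval rather than being instantaneous.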
For long-running agent operations (e.g., o3/o4-mini reasoning models), enable Background Responses to prevent timeouts:
- Click the BG toggle button (left of the context window indicator)
- ChatInput border turns blue when active
- Continuation tokens are auto-saved to session for page reload resumption
No environment variable needed -- toggle on/off per session via the UI.
Expose the agent as an OpenAI-compatible endpoint for external applications:
```env
API_KEY=sk-openchatci-your-secret-key-here
```
Any app using the OpenAI SDK can consume the agent by pointing `base_url` at it:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-openchatci-your-secret-key-here",
)

# Non-streaming
response = client.responses.create(
    model="openchatci",
    input="What is the weather in Tokyo?",
)

# Streaming
stream = client.responses.create(
    model="openchatci",
    input="Explain quantum computing.",
    stream=True,
)
for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
```

- All agent Tools (Weather, Coding, Image Generation) and Skills are available
- Multi-turn conversations via `previous_response_id`
- API sessions appear in the chat sidebar with an API badge
- Streaming (SSE) and non-streaming response modes
- For HTTPS/LAN access, see OpenAI API Setup Guide
Access OpenChatCi from other devices on your home network (phones, tablets, other PCs). HTTPS enables browser Secure Context for voice input and clipboard on non-localhost origins.
```env
APP_HOST=0.0.0.0
APP_SSL_CERTFILE=.certs/cert.pem
APP_SSL_KEYFILE=.certs/key.pem
```
Setup:
- Install mkcert and run `mkcert -install`
- Issue a certificate: `mkcert -cert-file .certs/cert.pem -key-file .certs/key.pem <your-ip> localhost 127.0.0.1`
- Set the env vars above in `.env`
- Allow ports through the firewall (8000 for production, 5173 for dev mode)
- Install the CA certificate (`rootCA.pem`) on each client device
Access from LAN: https://<your-ip>:8000
When SSL is not configured, the server runs in HTTP mode as usual (no breaking change).
Switch between OpenAI-family models mid-conversation:
```env
AZURE_OPENAI_MODELS=gpt-4o,o3,gpt-4.1-mini
```
- Model selector dropdown appears above the chat input (hidden when only one model configured)
- Per-session model selection persisted across page reloads
- Regenerate with different model: click the chevron on the Regenerate button to choose a model
- Per-message model label: each assistant message shows which model generated it
- All models share the same Tools, Skills, and MCP integrations
Per-model reasoning effort (only listed models send the parameter):
```env
REASONING_EFFORT=o3:high,o4-mini:medium
```
Per-model context window limits:
```env
MODEL_MAX_CONTEXT_TOKENS=gpt-4o:128000,o3:200000,gpt-4.1-mini:1047576
```
The progress bar above the chat input shows context window consumption rate. Colors change at 80% (amber) and 95% (red). When multiple models are configured, the display updates automatically when switching models.
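The per-model limit parsing and warning thresholds can be sketched as follows (the parsing rules and exact threshold comparisons are assumptions for illustration, not OpenChatCi's code):

```python
def parse_model_limits(value: str) -> dict[str, int]:
    """Parse comma-separated 'model:limit' pairs, as in MODEL_MAX_CONTEXT_TOKENS."""
    limits = {}
    for pair in value.split(","):
        model, _, limit = pair.strip().partition(":")
        limits[model] = int(limit)
    return limits

def warning_level(used_tokens: int, max_tokens: int) -> str:
    """Map consumption ratio to the indicator color: amber at 80%, red at 95%."""
    ratio = used_tokens / max_tokens
    if ratio >= 0.95:
        return "red"
    if ratio >= 0.80:
        return "amber"
    return "normal"
```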
Enable Microsoft Agent Framework DevUI for debugging:
```env
DEVUI_ENABLED=true
DEVUI_PORT=8080
```
Access at http://localhost:8080
- Windows 10/11
- macOS (Intel / Apple Silicon)
- Linux (Ubuntu, Debian, etc.)






