Skip to content

Super lightweight Ollama + Qwen Code alternative to run Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.

License

Notifications You must be signed in to change notification settings

KolosalAI/kolosal-cli

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Kolosal CLI

npm version License Node.js Version

AI-powered command-line workflow tool for developers

InstallationQuick StartFeaturesDocumentationContributing

Kolosal CLI is a powerful command-line AI workflow tool that enhances your development workflow with advanced code understanding, automated tasks, and intelligent assistance. Based on Qwen Code models, it provides seamless integration with your existing development environment.

Features

  • Code Understanding & Editing - Query and edit large codebases beyond traditional context window limits
  • Workflow Automation - Automate operational tasks like handling pull requests and complex rebases
  • Enhanced Parser - Optimized specifically for code-oriented models
  • Vision Model Support - Automatic detection and multimodal analysis of images in your input
  • Flexible Model Support - Works with OpenAI-compatible APIs and Hugging Face models

Installation

Quick Install (Recommended)

Install KolosalCode with a single command:

# macOS or Linux
curl -fsSL https://raw.githubusercontent.com/KolosalAI/kolosal-cli/main/install.sh | bash

# Linux (with sudo)
curl -fsSL https://raw.githubusercontent.com/KolosalAI/kolosal-cli/main/install.sh | sudo bash

The installer automatically detects your operating system and installs the appropriate package.

Alternative Installation Methods

Install from npm

Requires Node.js version 20 or higher.

npm install -g @kolosal-ai/kolosal-ai@latest
kolosal --version
Manual Download

macOS:

# Download and install the .pkg
curl -LO https://github.com/KolosalAI/kolosal-cli/releases/download/v0.1.0-pre/KolosalCode-macos-signed.pkg
sudo installer -pkg KolosalCode-macos-signed.pkg -target /

Linux (Debian/Ubuntu):

# Download and install the .deb
wget https://github.com/KolosalAI/kolosal-cli/releases/download/v0.1.0-pre/kolosal-code_0.1.2_amd64.deb
sudo dpkg -i kolosal-code_0.1.2_amd64.deb
sudo apt-get install -f  # Fix dependencies if needed
Build from source
git clone https://github.com/KolosalAI/kolosal-cli.git
cd kolosal-cli
npm install
npm run build
npm link  # Optional: install globally

See CONTRIBUTING.md for detailed build instructions.

Verify Installation

kolosal --version
kolosal --help

For detailed installation instructions and troubleshooting, see docs/INSTALLATION.md.

Homebrew (Coming Soon)

brew install kolosal-ai

Quick Start

# Start Kolosal CLI
kolosal

# Example commands
> Explain this codebase structure
> Help me refactor this function
> Generate unit tests for this module

Authentication Options

1. OpenAI Compatible API

Set environment variables or create a .env file:

export OPENAI_API_KEY="your_api_key_here"
export OPENAI_BASE_URL="your_api_endpoint" 
export OPENAI_MODEL="your_model_choice"

2. Hugging Face LLM Models

Configure Hugging Face models for local or API-based inference:

export HF_TOKEN="your_huggingface_token"
export HF_MODEL="your_model_name"

Note: Kolosal CLI may issue multiple API calls per cycle, which can result in higher token usage. We're actively optimizing API efficiency.

Configuration

Session Management

Configure token limits in .kolosal/settings.json:

{
  "sessionTokenLimit": 32000
}

Vision Models

Kolosal CLI automatically detects images and switches to vision-capable models. Configure the behavior:

{
  "experimental": {
    "vlmSwitchMode": "once",  // "once", "session", "persist"
    "visionModelPreview": true
  }
}

Usage Examples

Code Analysis & Understanding

> Explain this codebase architecture
> What are the key dependencies and how do they interact?
> Find all API endpoints and their authentication methods
> Generate a dependency graph for this module

Code Development & Refactoring

> Refactor this function to improve readability and performance
> Convert this class to use dependency injection
> Generate unit tests for the authentication module
> Add error handling to all database operations

Workflow Automation

> Analyze git commits from the last 7 days, grouped by feature
> Create a changelog from recent commits
> Find and remove all console.log statements
> Check for potential SQL injection vulnerabilities

Commands & Shortcuts

Session Commands

  • /help - Display available commands
  • /clear - Clear conversation history
  • /compress - Compress history to save tokens
  • /stats - Show current session information
  • /exit or /quit - Exit Kolosal CLI

Keyboard Shortcuts

  • Ctrl+C - Cancel current operation
  • Ctrl+D - Exit (on empty line)
  • Up/Down - Navigate command history

Development & Contributing

See CONTRIBUTING.md to learn how to contribute to the project.

Distribution

Windows Package

Build a single executable and installer:

npm run bundle               # builds bundle/gemini.js
npm run build:win:exe        # creates dist/win/kolosal.exe
iscc installer/kolosal.iss   # creates installer

macOS Package

Build a standalone .pkg installer with bundled Node.js:

npm run build:mac:pkg

Creates dist/mac/KolosalCode-macos-signed.pkg - a self-contained package that requires no Node.js installation.

Linux Packages

Build Linux packages (.deb, .rpm, and .tar.gz):

# Build all formats
npm run build:linux

# Build specific format
npm run build:linux:deb      # Debian/Ubuntu package
npm run build:linux:rpm      # Red Hat/Fedora package  
npm run build:linux:tar      # Universal tarball

# Or use the convenience script
bash scripts/clean-build-linux.sh [all|deb|rpm|tar]

Creates packages in dist/linux/ with bundled Node.js - no separate runtime installation needed.

See docs/LINUX-PACKAGING.md for detailed Linux packaging documentation.

Acknowledgments

Kolosal CLI is built upon and adapted from the Qwen Code project, which is licensed under the Apache License, Version 2.0. We acknowledge and appreciate the excellent work of the Qwen development team.

This project also incorporates concepts and approaches from Google Gemini CLI. We appreciate the foundational work that made this project possible.

License

Licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

This product includes software developed from the Qwen Code project, which is also licensed under the Apache License, Version 2.0.

CLI Embedded API (experimental)

The embedded API is enabled by default. It starts alongside the CLI and listens on 127.0.0.1:38080 unless configured otherwise.

Configure via settings.json or environment variables.

Start a local HTTP API alongside the CLI using environment variables:

  • KOLOSAL_CLI_API=true — enable the API server
  • KOLOSAL_CLI_API_PORT=38080 — port to listen on (default 38080)
  • KOLOSAL_CLI_API_HOST=127.0.0.1 — host binding (default 127.0.0.1)
  • KOLOSAL_CLI_API_TOKEN=secret — if set, requests must include Authorization: Bearer

Endpoints:

  • GET /healthz — health check
  • POST /v1/generate — generate content

Request (JSON):

{ "input": "Your prompt here", "stream": false, "prompt_id": "optional-id", "history": [] // optional: conversation history for multi-turn }

Response (JSON):

{ "output": "model response text", "prompt_id": "...", "messages": [...], // transcript of this turn "history": [...] // updated conversation history (pass this back for next turn) }

Streaming: Set stream=true to receive Server-Sent Events (SSE). Events are sent in real-time as they occur:

  • event: content — data: incremental text chunk
  • event: assistant — data: {"type":"assistant","content":"full assistant message"}
  • event: tool_call — data: {"type":"tool_call","name":"toolName","arguments":{...}}
  • event: tool_result — data: {"type":"tool_result","name":"toolName","ok":true,"responseText":"..."}
  • event: history — data: [...] (updated conversation history)
  • event: done — data: true
  • event: error — data: {"message":"..."}

Alternatively, configure via settings.json (used by the CLI):

{ "api": { "enabled": true, "host": "127.0.0.1", "port": 38080, "token": "your-token", "corsEnabled": true } }

Example requests

Plain JSON response:

curl -s \
  -H 'Content-Type: application/json' \
  -X POST http://127.0.0.1:38080/v1/generate \
  -d '{"input":"Summarize the project structure.","stream":false}'

Server-Sent Events (SSE) streaming response:

curl -N -s \
  -H 'Content-Type: application/json' \
  -X POST http://127.0.0.1:38080/v1/generate \
  -d '{"input":"Summarize the project structure","stream":true}'

Multi-turn conversation (maintaining context):

# First turn
RESPONSE=$(curl -s \
  -H 'Content-Type: application/json' \
  -X POST http://127.0.0.1:38080/v1/generate \
  -d '{"input":"What is 2+2?","stream":false}')

# Extract history from response
HISTORY=$(echo $RESPONSE | jq -c '.history')

# Second turn - pass history to maintain context
curl -s \
  -H 'Content-Type: application/json' \
  -X POST http://127.0.0.1:38080/v1/generate \
  -d "{\"input\":\"Now multiply that by 3\",\"stream\":false,\"history\":$HISTORY}" | jq