
Animus

Local-first agentic coding tool powered by local LLMs.



What is Animus

Animus is a local-first agentic coding assistant that runs entirely on your machine. It uses local GGUF models via llama-cpp-python — no API keys, no data leaving your machine. Inspired by claw-code but designed from the ground up for local models, Animus adapts its behavior to the capability of the model you load: a 7-tier system scales planner complexity, grammar enforcement, tool availability, and turn budget to match what the model can reliably handle. Small models get tight GBNF grammar constraints and a decomposing planner; large models get full tool access and free-form generation. The result is a tool that works well across the full spectrum of local hardware.


Quick Start

Install

# Clone the repository
git clone https://github.com/crussella0129/Animus.git
cd Animus

# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate

# Install Animus and its dependencies
pip install -e ".[dev]"

# Install llama-cpp-python (choose one):
pip install llama-cpp-python                          # CPU only
pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124  # CUDA 12.4

Download a Model

Animus works with any GGUF model. A few recommended starting points:

# ~4GB, good for most machines (Small tier)
huggingface-cli download Qwen/Qwen2.5-Coder-7B-Instruct-GGUF \
    qwen2.5-coder-7b-instruct-q4_k_m.gguf --local-dir ~/models

# ~8GB, better quality (Medium tier)
huggingface-cli download Qwen/Qwen2.5-Coder-14B-Instruct-GGUF \
    qwen2.5-coder-14b-instruct-q4_k_m.gguf --local-dir ~/models

Run One-Shot (Single Prompt)

animus "Explain the structure of this project" --model ~/models/qwen2.5-coder-7b-instruct-q4_k_m.gguf

Run the REPL

animus --model ~/models/qwen2.5-coder-7b-instruct-q4_k_m.gguf

Once in the REPL, type your prompt and press Enter. Use /help to see all commands.


CLI Reference

animus [PROMPT] [OPTIONS]

Arguments:
  PROMPT    One-shot prompt. Omit to enter the interactive REPL.

Options:
  -m, --model TEXT         Path to a GGUF model file, or model name.
  -p, --permission TEXT    Permission mode: read-only, standard, full, prompt.
                           Default: standard.
  -w, --workspace TEXT     Workspace root directory. Default: current directory.
  -c, --config TEXT        Path to an additional config YAML file (local tier).
  --help                   Show this message and exit.

Model Tiers

Animus detects the tier automatically from the parameter count embedded in the model's GGUF metadata. The tier controls planner behavior, grammar enforcement, max turns, and which tools are available.

| Tier   | Params     | Planner       | Grammar Mode | Max Turns | Tools Available | Example Models                             |
|--------|------------|---------------|--------------|-----------|-----------------|--------------------------------------------|
| Nano   | < 4B       | Yes (2 steps) | full         | 6         | 4               | Qwen2.5-Coder-1.5B, Phi-3-mini             |
| Small  | 4 – 13B    | Yes (3 steps) | first_turn   | 15        | 6               | Qwen2.5-Coder-7B, Mistral-7B               |
| Medium | 13 – 30B   | No            | off          | 20        | 8               | Qwen2.5-Coder-14B, DeepSeek-Coder-V2-Lite  |
| Large  | 30 – 70B   | No            | off          | 15        | 10              | Qwen2.5-Coder-32B, CodeLlama-34B           |
| XL     | 70 – 200B  | No            | off          | 25        | 10              | Qwen2.5-Coder-72B, Llama-3.1-70B           |
| Ultra  | > 200B     | No            | off          | 30        | 11              | Llama-3.1-405B, DeepSeek-V3                |
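As an illustrative sketch (not the project's actual code — the function name and boundary handling are assumptions, with thresholds taken from the table above), mapping a parameter count to a tier might look like:

```python
def detect_tier(param_count_billions: float) -> str:
    """Map a model's parameter count (in billions) to a tier name.

    Thresholds follow the tier table; boundary handling is an assumption.
    """
    if param_count_billions < 4:
        return "nano"
    if param_count_billions < 13:
        return "small"
    if param_count_billions < 30:
        return "medium"
    if param_count_billions < 70:
        return "large"
    if param_count_billions < 200:
        return "xl"
    return "ultra"

print(detect_tier(7))    # small  — a 7B model lands in the Small tier
print(detect_tier(32))   # large  — a 32B model lands in the Large tier
```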

Planner — For Nano and Small tiers, tasks are decomposed into sub-steps before execution. Each step has its own scoped tool list and turn budget.

Grammar Mode — GBNF grammar constraints are applied to force structured JSON output from the model. full = every turn; first_turn = only the first generation; off = free-form.
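What grammar enforcement buys you is a guarantee that each constrained turn parses as a structured tool call. The shape below is illustrative (the exact field names are an assumption, not the project's confirmed wire format):

```python
import json

# The kind of output a grammar-constrained turn guarantees: well-formed
# JSON naming a tool and its arguments (field names here are illustrative).
raw = '{"tool": "read_file", "args": {"path": "src/main.py"}}'

call = json.loads(raw)          # parses cleanly when the grammar is enforced
print(call["tool"])             # read_file
print(call["args"]["path"])     # src/main.py
```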


Tools

All tools are confined to the workspace boundary. Attempting to access paths outside the workspace is blocked at the security layer.

| Tool        | Permission | Min Tier | Description                                                               |
|-------------|------------|----------|---------------------------------------------------------------------------|
| read_file   | READ       | Nano     | Read a file with optional line offset and limit. Returns numbered lines.  |
| write_file  | WRITE      | Nano     | Write (overwrite) a file. Creates parent directories as needed.           |
| edit_file   | WRITE      | Small    | Replace an exact string in a file. Fails on ambiguous matches.            |
| list_dir    | READ       | Nano     | List directory contents with type indicators (trailing / for dirs).      |
| glob_search | READ       | Nano     | Find files matching a glob pattern. Returns up to 100 results.            |
| grep_search | READ       | Small    | Search file contents with a regex. Returns up to 50 file:line: matches.   |
| bash        | EXECUTE    | Medium   | Run a shell command in the workspace. Injection patterns are blocked.     |
| git         | WRITE      | Medium   | Run git subcommands (status, diff, add, commit, etc.). Network ops blocked. |
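The boundary check itself can be sketched in a few lines (a hedged illustration — `is_inside_workspace` is a hypothetical helper, not the project's actual API):

```python
from pathlib import Path

def is_inside_workspace(workspace_root: str, candidate: str) -> bool:
    """Resolve the candidate path (collapsing '..' segments and symlinks)
    and check that it stays under the workspace root."""
    root = Path(workspace_root).resolve()
    target = (root / candidate).resolve()
    return target == root or root in target.parents

print(is_inside_workspace("/tmp/project", "src/main.py"))        # True
print(is_inside_workspace("/tmp/project", "../../etc/passwd"))   # False
```

Resolving before comparing is the important part: a naive string-prefix check would let `../` traversal or symlinks escape the workspace.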

Slash Commands

Slash commands are intercepted in the REPL before input reaches the model.

| Command               | Description                                                              |
|-----------------------|--------------------------------------------------------------------------|
| /help                 | List all available slash commands.                                       |
| /status               | Show session stats: message count, token estimate, tier, context, mode.  |
| /compact              | Manually compact session history to free context window space.           |
| /clear                | Clear session history and start a fresh conversation.                    |
| /cost                 | Show token usage for the current session (input, output, total).         |
| /model [name]         | Show current model info, or request a model switch.                      |
| /permissions [mode]   | Show or change the permission mode (read-only, standard, full, prompt).  |
| /session              | Show the current session ID and creation timestamp.                      |
| /diff                 | Run git diff in the workspace root and display the result.               |
| /config [key] [value] | Show or set a config value (e.g. /config model.context_length 32768).    |
| /plan                 | Show whether the planner is active for the current model tier.           |
| /tier                 | Show the detected model tier and parameter count.                        |

Permission Modes

| Mode      | Allows                                   | Use When                                            |
|-----------|------------------------------------------|-----------------------------------------------------|
| read-only | READ tools only                          | You only want the model to read and analyze code.   |
| standard  | READ + WRITE tools                       | Normal coding sessions. Default mode.               |
| full      | READ + WRITE + EXECUTE (bash, git)       | You trust the model to run shell commands.          |
| prompt    | READ always; prompts before WRITE/EXEC   | Reserved for future interactive approval workflow.  |
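Conceptually, gating reduces to comparing a tool's required level against the ceiling its mode allows. A minimal sketch (names and the treatment of `prompt` as read-only are assumptions, since the interactive approval flow is not shown):

```python
from enum import IntEnum

class PermissionLevel(IntEnum):
    READ = 1
    WRITE = 2
    EXECUTE = 3

# Highest level each mode permits (illustrative mapping; "prompt" is
# treated as READ-only here because approval prompting is out of scope).
MODE_CEILING = {
    "read-only": PermissionLevel.READ,
    "standard": PermissionLevel.WRITE,
    "full": PermissionLevel.EXECUTE,
    "prompt": PermissionLevel.READ,
}

def is_allowed(mode: str, required: PermissionLevel) -> bool:
    """A tool call passes when its required level fits under the mode's ceiling."""
    return required <= MODE_CEILING[mode]

print(is_allowed("standard", PermissionLevel.WRITE))    # True
print(is_allowed("standard", PermissionLevel.EXECUTE))  # False
```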

Set the mode at launch:

animus --model ~/models/model.gguf --permission full

Or change it mid-session:

> /permissions read-only
Permission mode set to: read-only

Configuration

Animus uses a three-tier YAML config system. Each tier overrides the previous via deep merge:

| Tier    | Location                   | Purpose                                       |
|---------|----------------------------|-----------------------------------------------|
| User    | ~/.animus/config.yaml      | Your personal defaults across all projects.   |
| Project | .animus/config.yaml        | Project-specific settings. Commit this.       |
| Local   | .animus/config.local.yaml  | Machine-local overrides. Git-ignored.         |
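Deep merge means nested keys override individually rather than replacing whole sections. A minimal sketch of the idea (a hypothetical helper, not the project's actual implementation):

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where override wins key-by-key,
    recursing into nested dicts instead of replacing them wholesale."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

user = {"model": {"temperature": 0.7, "gpu_layers": -1}}
project = {"model": {"temperature": 0.2}}
print(deep_merge(user, project))
# {'model': {'temperature': 0.2, 'gpu_layers': -1}}
```

Note that the project config changes only `temperature`; the user-level `gpu_layers` survives the merge.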

Example Config

# ~/.animus/config.yaml

model:
  provider: native            # Only "native" (llama-cpp-python) is supported today.
  model_path: ""              # Absolute path to your .gguf file.
  temperature: 0.7            # Sampling temperature (0.0 – 2.0).
  max_tokens: 2048            # Maximum tokens per generation.
  context_length: 4096        # Model context window size (tokens).
  gpu_layers: -1              # GPU layers to offload. -1 = all, 0 = CPU only.
  size_tier: auto             # Override tier detection: auto/nano/small/medium/large/xl/ultra.

agent:
  permission_mode: standard   # read-only | standard | full | prompt
  max_turns: 20               # Hard cap on agentic loop iterations per turn.
  system_prompt: "You are Animus, a local AI coding assistant with tool use."

session:
  auto_save: true             # Save session to .animus/sessions/ on exit.
  auto_compact: true          # Automatically compact when nearing context limit.
  compact_threshold: 0.7      # Compact when session fills this fraction of context.
  preserve_recent: 4          # Messages kept verbatim after compaction.

Config Precedence (highest wins)

CLI --config flag  >  .animus/config.local.yaml  >  .animus/config.yaml  >  ~/.animus/config.yaml

You can also point to any YAML file as the local (highest-priority) config tier:

animus --config /path/to/overrides.yaml --model ~/models/model.gguf

Session Management

Sessions are automatically saved to .animus/sessions/ when you exit the REPL (if auto_save: true). Each session is a JSON file named session-<uuid>.json and contains the full message history plus token usage counters.

Auto-compaction activates when the conversation grows past compact_threshold of the model's context window. The compactor builds a structured plain-text summary of older messages (tools used, files referenced, user requests, assistant actions) and replaces them with a single system message, preserving the preserve_recent most-recent exchanges verbatim. The session ID and creation timestamp are retained.
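With the defaults shown earlier (`compact_threshold: 0.7`, `context_length: 4096`), the trigger condition reduces to simple arithmetic. A sketch (the real token accounting lives in the compactor; this function is hypothetical):

```python
def should_compact(session_tokens: int, context_length: int,
                   threshold: float = 0.7) -> bool:
    """Trigger compaction once the session fills `threshold` of the context window."""
    return session_tokens > threshold * context_length

print(should_compact(2500, 4096))  # False: 2500 < 0.7 * 4096 ≈ 2867
print(should_compact(3000, 4096))  # True
```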

Trigger compaction manually with /compact, or clear the session entirely with /clear.


Architecture

ConversationRuntime         ReAct loop: prompt → generate → tool calls → results → repeat
  ├─ Provider (protocol)    Abstracts LLM backends. NativeProvider wraps llama-cpp-python.
  ├─ ToolRunner             Dispatches tool calls to registered ToolSpec handlers.
  ├─ PermissionPolicy       Authorizes tool calls based on PermissionLevel vs. mode.
  ├─ Workspace              Enforces file boundary: all paths resolved and checked.
  ├─ Session                Append-only conversation history with JSON persistence.
  └─ Compactor              Summarizes old messages to reclaim context window space.

Tier system                 Auto-detects model size → scales planner, grammar, turn budget.
Planner (Nano/Small)        Decomposes tasks into scoped sub-steps before execution.
GBNF grammar                Enforces structured JSON tool-call output for small models.
Deny lists                  Hardcoded blocks: injection patterns, destructive shell commands.

The core loop lives in src/core/runtime.py. The Provider protocol (src/providers/base.py) makes it straightforward to add new LLM backends (HTTP APIs, cloud providers, etc.) without touching the runtime. Tool handlers are pure functions — (args: dict, workspace: Workspace) -> ToolResult — registered declaratively with a JSON schema, permission level, and minimum tier. Adding a new tool does not require modifying any other module.

Security is defense-in-depth: workspace boundary checks in every tool handler, injection pattern rejection in the shell tool, a deny-list for destructive commands, and permission gating at the runtime level.


Development

Setup

git clone https://github.com/crussella0129/Animus.git
cd Animus
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Run Tests

pytest

All tests mock the LLM provider — no model download required.

Adding a New Tool

  1. Write a handler function in the appropriate file under src/tools/:

    def handle_my_tool(args: dict[str, Any], workspace: Workspace) -> ToolResult:
        value = args.get("my_param", "")
        # ... do work ...
        return ToolResult(output="result")
  2. Register it in src/main.py inside the ToolRunner setup block:

    tool_runner.register(ToolSpec(
        name="my_tool",
        description="What my tool does.",
        input_schema={
            "type": "object",
            "properties": {
                "my_param": {"type": "string", "description": "..."},
            },
            "required": ["my_param"],
        },
        permission=PermissionLevel.READ,
        min_tier=Tier.NANO,
        handler=lambda args: handle_my_tool(args, ws),
    ))
  3. Add tests in tests/tools/test_my_tool.py.

Adding a New Provider

Implement the Provider protocol in src/providers/:

from src.providers.base import Provider, ProviderResponse, ModelCapabilities

class MyProvider(Provider):
    def generate(self, messages, tools=None, grammar=None, stream=False) -> ProviderResponse:
        ...

    def capabilities(self) -> ModelCapabilities:
        ...

Then instantiate it in main.py in place of NativeProvider.

Project Structure

src/
  main.py               CLI entry point — wires all modules together
  core/
    config.py           Three-tier YAML config (User < Project < Local)
    runtime.py          ConversationRuntime — the ReAct agentic loop
    session.py          Append-only conversation history + JSON persistence
    compactor.py        Session summarization to reclaim context space
    tiers.py            7-tier model system + TierConfig constants
    planner.py          Task decomposer for Nano/Small tiers
  providers/
    base.py             Provider protocol (abstract base class)
    native.py           NativeProvider wrapping llama-cpp-python
  tools/
    registry.py         ToolRunner + ToolSpec declarative tool system
    filesystem.py       read_file, write_file, edit_file, list_dir
    search.py           glob_search, grep_search
    shell.py            bash — with injection blocking and timeout
    git.py              git — with network-op blocking
  security/
    workspace.py        Workspace boundary enforcement
    permissions.py      PermissionLevel + PermissionPolicy
    deny_lists.py       Injection patterns + blocked command list
  grammar/
    gbnf.py             GBNF grammar builder for structured tool-call output
  cli/
    repl.py             Interactive REPL loop
    commands.py         Slash command registry and parser
    render.py           Rich console rendering helpers

License

MIT License. See LICENSE for details.
