LinuxAgent

A Linux ops CLI where LLM-generated commands must pass deterministic policy and human approval before execution.

Full Simplified Chinese manual · Full English manual · Why substring matching is not safety · v4.1.0 release notes · Share real usage feedback


LinuxAgent is not a free-form shell chatbot and not an autonomous remediator. It lets an LLM propose Linux operations, but execution stays behind deterministic policy checks, Human-in-the-Loop confirmation, SSH safety guards, output redaction, and a hash-chained audit log.

The core project is intentionally narrow: command planning, policy, HITL, audit, SSH guardrails, and a small set of practical providers. Extra providers, runbooks, and integrations are treated as extension points rather than the center of the product.

Why It Exists

LLM command agents usually fail at the exact point operators care about: trust.

LinuxAgent's default stance is different:

| Principle | What LinuxAgent does |
| --- | --- |
| The model is not trusted | First-time LLM-generated commands require confirmation |
| Safety is policy, not substring matching | Commands are tokenized and evaluated by a capability-based policy engine |
| Production output may contain secrets | Tool output is guarded and redacted before LLM-facing analysis |
| SSH must not silently trust hosts | Remote execution uses known-host verification and shell-syntax guards |
| Every approval should be reviewable | HITL decisions are written to a 0o600 hash-chained audit log |
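
To illustrate the second row, here is a minimal sketch of why tokenizing a command gives a different answer than substring matching. It is not LinuxAgent's policy engine: the capability labels fs.delete and fs.delete.recursive and both function names are hypothetical, and only the rm program is handled.

```python
import shlex


def substring_flags(command: str) -> bool:
    """Naive check: flag any command text that contains 'rm -rf'."""
    return "rm -rf" in command


def tokenized_capabilities(command: str) -> set[str]:
    """Derive hypothetical capability labels from the parsed argv."""
    argv = shlex.split(command)
    if not argv or argv[0] != "rm":
        return set()
    caps = {"fs.delete"}
    args = argv[1:]
    # Collect single-dash flags such as -r, -f, -rf into one string.
    short_flags = "".join(
        a[1:] for a in args if a.startswith("-") and not a.startswith("--")
    )
    if "r" in short_flags or "R" in short_flags or "--recursive" in args:
        caps.add("fs.delete.recursive")
    return caps


# Quoted text trips the substring check even though only echo would run.
print(substring_flags("echo 'rm -rf /'"), sorted(tokenized_capabilities("echo 'rm -rf /'")))
# Split flags slip past the substring check but not the tokenized one.
print(substring_flags("rm -r -f /srv/data"), sorted(tokenized_capabilities("rm -r -f /srv/data")))
```

The point of the sketch is the mismatch in both directions: the substring check has a false positive on quoted text and a false negative on split flags.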

v4.1 Security Depth

LinuxAgent v4.1 turns the command safety boundary into a measurable subsystem, not just a set of claims in the README:

| Layer | What changed |
| --- | --- |
| Red-team proof | 24 adversarial command-agent cases run in CI with make red-team |
| Shell structure | Pipelines, subshells, command substitution, redirects, and nested shell execution are analyzed before execution |
| LOLBin coverage | Network-to-shell pipelines, find -exec, xargs, awk system(), editor escapes, and interpreter inline execution are classified deterministically |
| Fuzzing and benchmark | Hypothesis parser fuzzing plus P50/P95/P99 policy latency reporting |
| Audit depth | Optional HTTP sink forwards hash-chained entries while local append remains the source of truth |
| Observability | Telemetry can export local JSONL, console JSON, OTLP HTTP JSON, or be disabled explicitly |
| Sandbox roadmap | Landlock design documents capability probes, fallback order, compatibility limits, and implementation slices |
| Agent integration | linuxagent mcp exposes read-only policy classify and audit verify tools over stdio MCP |
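
As a rough illustration of the "Shell structure" and "LOLBin coverage" rows, the sketch below splits a command line into pipeline stages with the standard-library shlex lexer and flags a download stage that feeds a shell. The function names and the two program sets are assumptions made for this example. A real analyzer needs a proper shell grammar: this one cannot tell a quoted "|" from a pipe operator and does not follow sudo, absolute paths, or subshells.

```python
import shlex

# Hypothetical program sets chosen for illustration only.
NETWORK_TOOLS = {"curl", "wget"}
SHELLS = {"sh", "bash", "zsh", "dash"}


def split_pipeline(command: str) -> list[list[str]]:
    """Split a command line into pipeline stages of argv tokens."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True  # keep URLs and paths as single tokens
    stages: list[list[str]] = [[]]
    for token in lexer:
        if token in ("|", "|&"):
            stages.append([])
        else:
            stages[-1].append(token)
    return stages


def is_network_to_shell(command: str) -> bool:
    """True when a download stage is followed by a shell stage."""
    seen_network = False
    for stage in split_pipeline(command):
        program = stage[0] if stage else ""
        if program in NETWORK_TOOLS:
            seen_network = True
        elif seen_network and program in SHELLS:
            return True
    return False


print(is_network_to_shell("curl -fsSL https://example.com/install.sh | sh"))
print(is_network_to_shell("curl -s https://example.com | grep title"))
```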

One-Minute Start

git clone https://github.com/Eilen6316/LinuxAgent.git
cd LinuxAgent
./scripts/bootstrap.sh
source .venv/bin/activate

Then edit ./config.yaml and set a provider. The core paths are OpenAI, DeepSeek, local Ollama/OpenAI-compatible models, and Anthropic:

api:
  provider: deepseek
  api_key: "replace-me"

For local Ollama:

api:
  provider: ollama
  base_url: http://127.0.0.1:11434/v1
  model: llama3.1
  api_key: ""
  token_parameter: max_tokens

Other OpenAI-compatible relays and provider shortcuts are documented in the Provider Compatibility Matrix. They are useful, but they are not the project's core safety story.

Validate and start:

linuxagent check
linuxagent

Try a read-only request:

check the Linux version

When the first LLM-generated command appears, choose "Yes" to run it once, "Yes, don't ask again" to approve matching commands in this conversation and the same /resume thread only, or "No" to refuse. For direct operator-authored command mode, prefix the command with !, for example !uname -a.

config.yaml must be owned by the current user and have mode 600 (chmod 600 config.yaml); secrets are not loaded from .env.
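
The ownership and mode requirement can be checked with a few lines of standard-library code. This is a sketch of such a check, not the one linuxagent check performs; the function name is hypothetical, and treating exactly 0600 as the only accepted mode is an assumption drawn from the sentence above.

```python
import os
import stat
from pathlib import Path


def config_permission_problems(path: Path) -> list[str]:
    """Return reasons the file fails an owner-only check (empty means OK)."""
    info = path.stat()
    problems = []
    if info.st_uid != os.getuid():
        problems.append("not owned by the current user")
    mode = stat.S_IMODE(info.st_mode)  # permission bits only
    if mode != 0o600:
        problems.append(f"mode is {mode:04o}, expected 0600")
    return problems


config = Path("config.yaml")
if config.exists():
    print(config_permission_problems(config) or "permissions look fine")
```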

What a Turn Looks Like

you: find services listening on port 8080

parse_intent  -> LLM proposes: ss -tlnp sport = :8080
safety_check  -> CONFIRM (LLM_FIRST_RUN)
confirm       -> operator approves in terminal
execute       -> asyncio subprocess, no shell=True
analyze       -> concise operator summary
audit.log     -> hash-chained JSONL decision record
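
The execute step matters because running an argv list directly means shell metacharacters are never interpreted. The following sketch shows that property with asyncio.create_subprocess_exec; run_argv is a hypothetical helper, not LinuxAgent's executor, and it uses the current Python interpreter as a stand-in for a real command.

```python
import asyncio
import shlex
import sys


async def run_argv(argv: list[str], timeout: float = 10.0) -> tuple[int, str]:
    """Run argv directly (no shell) and return (exit_code, combined output)."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdin=asyncio.subprocess.DEVNULL,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    try:
        out, _ = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()
        raise
    return proc.returncode or 0, out.decode(errors="replace")


# ';' and 'echo' arrive as literal arguments, so no second command runs.
argv = [sys.executable, "-c", "import sys; print(sys.argv[1:])",
        *shlex.split("a ; echo pwned")]
code, output = asyncio.run(run_argv(argv))
print(code, output.strip())
```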

For ordinary conversation, LinuxAgent first asks an LLM-owned intent router for DIRECT_ANSWER, COMMAND_PLAN, or CLARIFY. Direct answers do not create a command plan and therefore do not show the confirmation panel. Operational methods are not hard-coded in Python; successful command patterns are learned in the local learner memory after sensitive values are redacted. Deterministic safety policy data lives in YAML, while Python code only loads, validates, and applies those policies.
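
Redacting sensitive values before a command pattern is stored could look roughly like the sketch below. The two regular expressions and the redact function are illustrative assumptions; they cover only key=value pairs and a couple of long flags, far fewer formats than a production redactor would need.

```python
import re

# Hypothetical patterns: key=value / key: value pairs and --password/--token flags.
_KEY_VALUE = re.compile(r"(?i)\b(password|passwd|token|api[_-]?key|secret)(\s*[=:]\s*)\S+")
_FLAG_VALUE = re.compile(r"(?i)(--(?:password|token)\s+)\S+")


def redact(text: str) -> str:
    """Replace secret-looking values with a fixed marker, keeping the key."""
    text = _KEY_VALUE.sub(r"\1\2[REDACTED]", text)
    return _FLAG_VALUE.sub(r"\1[REDACTED]", text)


print(redact("export API_KEY=abc123"))
print(redact("mysql --password hunter2 -e 'select 1'"))
```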

Each CLI launch starts with an empty conversation context. Saved sessions are loaded only when the operator asks for them with /resume; enter the number shown or use the interactive picker to resume a saved session. If the selected session stopped at a HITL confirmation, LinuxAgent reloads the local checkpoint and reopens the confirmation flow. Use /new to reset context inside a running CLI session and /tools to see the available slash and tool entry points. Typing / opens the slash-command completion menu.

Command confirmations use an arrow-key menu offering "Yes", "Yes, don't ask again", and "No" when conversation permissions are allowed. "Yes, don't ask again" is scoped to the current conversation thread, including that thread when it is resumed with /resume. New conversations do not inherit it, and destructive or never_whitelist policy matches still ask every time.

Input beginning with ! is direct command mode: LinuxAgent executes the operator-authored command, streams stdout/stderr live, and records both the !<command> input and the system result in the active conversation context. It does not ask the LLM to explain or generate a reply for that turn.

Core Capabilities

| Capability | Why it matters |
| --- | --- |
| Capability-based policy engine | Produces SAFE / CONFIRM / BLOCK, risk scores, capabilities, and matched rules |
| YAML policy defaults | Command policy data is loaded from configs/policy.default.yaml, not Python rule tables |
| Structured CommandPlan | LLM output must validate as JSON before any policy or execution path |
| Structured file patches | Script/code/config edits use transactional FilePatchPlan apply, unified-diff validation, path policy, and HITL review |
| Read-only workspace tools | Planner can inspect allowed files with read_file, list_dir, and search_files before proposing patches |
| AI-owned intent routing | Conversation vs operation vs clarification is decided by prompts/intent_router.md, not Python keyword rules |
| Explicit resume control | New sessions do not inherit previous chats unless /resume is used; pending HITL checkpoints resume there too |
| Direct ! command mode | Runs operator-authored commands without an AI reply and adds command/output to current context |
| Sandbox metadata boundary | Commands carry a selected sandbox profile into audit and telemetry; default noop records metadata only |
| YAML runbooks | Common ops procedures are injected as planner guidance, not pre-LLM hard routes |
| Learner memory | Successful command patterns are persisted locally after secret redaction |
| LangGraph HITL | Confirmation uses interrupt() and checkpointing rather than inline input() |
| SSH cluster guard | Batch confirmation, remote shell metacharacter blocking, remote profile audit |
| Output protection | Command results are redacted and bounded before model-facing analysis |
| Hash-chained audit | linuxagent audit verify detects local audit-log tampering |
| Reproducible release | constraints.txt, wheel verification, and packaged config/prompt/runbook checks |
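
The "Structured CommandPlan" row means free-form model text never reaches the policy engine. A minimal sketch of that gate follows; the JSON shape (a commands list of objects with command and purpose strings) and the names PlannedCommand and parse_command_plan are hypothetical, since the actual CommandPlan schema is not shown here.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PlannedCommand:
    command: str
    purpose: str


def parse_command_plan(raw: str) -> list[PlannedCommand]:
    """Parse model output; raise ValueError unless it has the expected shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"plan is not valid JSON: {exc}") from exc
    commands = data.get("commands") if isinstance(data, dict) else None
    if not isinstance(commands, list) or not commands:
        raise ValueError("plan must contain a non-empty 'commands' list")
    plan = []
    for item in commands:
        if not isinstance(item, dict):
            raise ValueError("each command must be an object")
        command, purpose = item.get("command"), item.get("purpose")
        if not isinstance(command, str) or not command.strip():
            raise ValueError("each command needs a non-empty 'command' string")
        if not isinstance(purpose, str):
            raise ValueError("each command needs a 'purpose' string")
        plan.append(PlannedCommand(command, purpose))
    return plan


raw = '{"commands": [{"command": "ss -tlnp", "purpose": "list listeners"}]}'
print(parse_command_plan(raw))
```

Anything that fails this step would be rejected or sent back for repair before policy evaluation is even attempted.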

File Changes

Requests such as "create a shell script", "update this Python file", or "edit this config" do not bypass the safety model. LinuxAgent asks the planner for a structured FilePatchPlan, then validates and previews the unified diff before writing anything. The plan carries a structured request_intent field (create, update, or unknown) instead of relying on Python keyword matching.

The planner can first inspect the environment with read-only tools:

  • read_file(path, offset, limit) reads a bounded window from an allowed file.
  • list_dir(path) lists an allowed directory.
  • search_files(pattern, root) searches literal text under an allowed root; regex metacharacters are treated as ordinary text.
  • get_system_info, search_logs, and safety-gated execute_command provide system context when needed.
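
The first and third tools above can be sketched as follows, assuming Python 3.9 or later. The allowed_roots parameter, the 0-based offset, and the "number: text" output format are assumptions for this example; the real tools may differ in all three.

```python
from pathlib import Path


def read_file(path: str, offset: int, limit: int, allowed_roots: list[Path]) -> list[str]:
    """Return up to `limit` line-numbered lines after skipping `offset` lines."""
    target = Path(path).resolve()
    if not any(target.is_relative_to(root.resolve()) for root in allowed_roots):
        raise PermissionError(f"{target} is outside the allowed roots")
    window = []
    with target.open(encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if number <= offset:
                continue
            if len(window) >= limit:
                break
            window.append(f"{number}: {line.rstrip()}")
    return window


def search_files(pattern: str, root: Path) -> list[str]:
    """Literal search: regex metacharacters in `pattern` are ordinary text."""
    hits = []
    for file in sorted(root.rglob("*")):
        if not file.is_file():
            continue
        text = file.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern in line:  # substring test, never re.search
                hits.append(f"{file.name}:{number}")
    return hits
```

Resolving the path before the root check is what stops ../ traversal and symlinked targets from escaping an allowed root.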

Tool calls run through the tool sandbox runtime before output reaches the model:

  • Each tool carries explicit permissions (read_files, write_files, execute_commands, system_inspect, network_access, and hitl mode).
  • Workspace and log roots are checked.
  • Per-tool timeouts and output limits are applied, and oversized output is marked as truncated.
  • Tool errors are returned as structured model-visible events, while telemetry records allowed, denied, timeout, or truncated.
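
A stripped-down version of such a runtime wrapper might look like this. ToolPermissions, run_tool, the event dictionary, and the "error" status are hypothetical names for this sketch, only three of the permission flags are modeled, and timeouts are left out to keep it short.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolPermissions:
    read_files: bool = False
    execute_commands: bool = False
    network_access: bool = False


def run_tool(name: str, func: Callable[[], str], granted: ToolPermissions,
             needs: str, max_chars: int = 200) -> dict:
    """Run one tool call and return a structured, model-visible event."""
    if not getattr(granted, needs):
        return {"tool": name, "status": "denied", "output": ""}
    try:
        output = func()
    except Exception as exc:  # tool errors become events, not crashes
        return {"tool": name, "status": "error", "output": str(exc)}
    if len(output) > max_chars:
        return {"tool": name, "status": "truncated", "output": output[:max_chars]}
    return {"tool": name, "status": "allowed", "output": output}


reader = ToolPermissions(read_files=True)
print(run_tool("read_file", lambda: "x" * 500, reader, "read_files")["status"])
print(run_tool("execute_command", lambda: "ok", reader, "execute_commands")["status"])
```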

The terminal shows observable tool activity such as LinuxAgent is reading ... / LinuxAgent is listing ..., then prints concise evidence from completed workspace tool calls, such as the first line-numbered read_file snippets or search_files matches. If the planner concludes that no file change is needed, the final answer includes the cited evidence so the operator can see which file lines or search results supported the judgment. Patch confirmation shows per-file stats, compact + / - diff snippets, high-risk path warnings, permission changes, large-diff pagination, and per-file acceptance for multi-file patches. Full diffs are not shown twice; extra review prompts appear only when hidden pages exist.

Command confirmation also shows planned sandbox context: requested profile, runner, enforcement state, cwd, allowed roots, network policy, and fallback reason when the configured runner cannot enforce isolation.

After approval, patch application runs as a transaction. LinuxAgent validates target paths before reading file content, rejects symlink path components, hardlinks, directories, device files, FIFOs, sockets, oversized targets, and non-UTF-8 text, then writes through a temporary file and atomic replace. Existing targets are backed up under a local .linuxagent-patch-* sandbox directory and rolled back automatically if a later file or permission change fails. Audit metadata records changed files, permission changes, backup path hashes, rollback outcome, and the sandbox root.
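
The write-through-temporary-file step can be sketched with the standard library as below. atomic_write_text and its backup_dir parameter are hypothetical; the sketch checks only the final path component for symlinks and handles a single file, so it omits the hardlink, size, and encoding checks and the multi-file rollback described above.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def atomic_write_text(target: Path, text: str, backup_dir: Path) -> Optional[Path]:
    """Replace target atomically; return the backup path if one was made."""
    if target.is_symlink():
        raise ValueError(f"refusing to write through symlink: {target}")
    if target.exists() and not target.is_file():
        raise ValueError(f"not a regular file: {target}")
    backup = None
    if target.exists():
        backup_dir.mkdir(parents=True, exist_ok=True)
        backup = backup_dir / target.name
        shutil.copy2(target, backup)  # keep content and metadata for rollback
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=".patch-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        if backup is not None:
            shutil.copymode(backup, tmp_name)  # preserve the original mode
        os.replace(tmp_name, target)  # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return backup
```

Creating the temporary file in the target's own directory keeps the final os.replace on one filesystem, which is what makes the swap atomic.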

By default, file patch reads and writes are limited to the current workspace and /tmp through file_patch.allow_roots. Sensitive roots such as /etc and SSH key directories are highlighted as high risk, and permission changes such as 0755 for generated scripts appear explicitly in the confirmation panel. Automatic patch repair defaults to two rounds and can be tuned with file_patch.max_repair_attempts (0 disables automatic patch repair). Failed command-plan repair is separately capped by command_plan.max_repair_attempts (0 disables failed-command replanning).

Sandbox Status

LinuxAgent local command execution now goes through a sandbox runner boundary. The default remains sandbox.enabled: false with runner: noop, which preserves compatibility while recording sandbox metadata only. runner: local applies process lifecycle controls such as clean environment, closed stdin, timeout, process-group cleanup, resource limits, output limits, and configured cwd roots, but it does not claim filesystem or network isolation for safe profiles. runner: bubblewrap is optional and capability-probed; if bwrap is missing or cannot enforce the requested profile or network policy, safe profiles fail closed while explicit passthrough profiles remain auditable passthrough. The planned Landlock backend is documented in docs/design/sandbox-landlock.md, including capability probing, fallback order, and the compatibility test matrix.
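
The fail-closed rule for bubblewrap can be expressed as a small decision function. This is an illustrative sketch only: select_runner, SandboxUnavailable, and the "passthrough" return value are hypothetical names, and checking for the bwrap binary with shutil.which stands in for the fuller capability probe the paragraph describes.

```python
import shutil
from typing import Optional


class SandboxUnavailable(RuntimeError):
    """Raised instead of silently running a safe profile without isolation."""


def select_runner(requested: str, profile_is_safe: bool, bwrap_path: Optional[str]) -> str:
    """Pick the effective runner, failing closed for safe profiles."""
    if requested in ("noop", "local"):
        return requested  # neither claims filesystem or network isolation
    if requested != "bubblewrap":
        raise ValueError(f"unknown runner: {requested}")
    if bwrap_path:
        return "bubblewrap"
    if profile_is_safe:
        raise SandboxUnavailable("bwrap not found; refusing to spawn")
    return "passthrough"  # explicit passthrough profiles stay auditable


try:
    print(select_runner("bubblewrap", True, shutil.which("bwrap")))
except SandboxUnavailable as exc:
    print(f"refusing to run: {exc}")
```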

Safety Model

| Operation | Default behavior |
| --- | --- |
| User-authored read-only command | May run when policy returns SAFE |
| First LLM-generated command | CONFIRM |
| Conversation-approved LLM command | May skip repeat confirmation only in the same conversation thread, including /resume of that thread |
| Destructive command | CONFIRM every time; never conversation-whitelisted |
| Command targeting root or sensitive paths | BLOCK when matched by policy |
| SSH batch across two or more hosts | Explicit batch confirmation with target hosts and remote profiles |
| Non-TTY confirmation request | Auto-deny |
| Unknown SSH host | Reject by default |
| Default sandbox runner | Records profile metadata only; no process isolation |
| Enabled safe sandbox profile unavailable | Fail closed before spawning |
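
Several rows of this table combine into one decision. The sketch below encodes that combination under stated assumptions: the function and class names are hypothetical, "matching commands" is simplified to exact string equality, and the RUN / ASK / DENY outcomes are labels invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    # Commands the operator approved with "Yes, don't ask again" in this thread.
    approved: set[str] = field(default_factory=set)


def decide(policy_level: str, llm_generated: bool, destructive: bool,
           command: str, conversation: Conversation, is_tty: bool) -> str:
    """Return 'RUN', 'ASK', or 'DENY' for one command."""
    if policy_level == "BLOCK":
        return "DENY"
    if destructive:
        needs_ask = True  # never conversation-whitelisted
    elif llm_generated:
        needs_ask = command not in conversation.approved
    else:
        needs_ask = policy_level != "SAFE"
    if not needs_ask:
        return "RUN"
    return "ASK" if is_tty else "DENY"  # non-TTY confirmation auto-denies


thread = Conversation()
print(decide("SAFE", True, False, "uname -a", thread, True))
thread.approved.add("uname -a")
print(decide("SAFE", True, False, "uname -a", thread, True))
```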

MCP Server Prototype

linuxagent mcp starts a local stdio MCP server with read-only tools for policy classification and audit hash-chain verification. It intentionally does not expose command execution, file patch application, SSH fan-out, or secrets. The threat model and future execution boundary are documented in docs/design/mcp-server.md.
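
For readers unfamiliar with hash-chain verification, here is a minimal sketch of the idea. The entry layout (prev, record, hash), the all-zero genesis value, and SHA-256 over canonical JSON are assumptions for illustration and are not the format linuxagent audit verify reads.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical value used before the first entry


def _digest(prev: str, record: dict) -> str:
    body = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev + body).encode()).hexdigest()


def append_entry(chain: list[dict], record: dict) -> dict:
    """Append a record whose hash covers the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev, "record": record, "hash": _digest(prev, record)}
    chain.append(entry)
    return entry


def verify(chain: list[dict]) -> bool:
    """Recompute every link; any edited or removed entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


log: list[dict] = []
append_entry(log, {"command": "ss -tlnp", "decision": "yes"})
append_entry(log, {"command": "systemctl restart nginx", "decision": "no"})
print(verify(log))
log[0]["record"]["decision"] = "no"  # simulate tampering
print(verify(log))
```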

LinuxAgent is not an autonomous remediator. The current default noop sandbox runner is also not a command sandbox; it is intended for controlled operator-in-the-loop use. See Production Readiness and Threat Model.

SSH execution is not protected by local OS sandboxing. Configure cluster hosts with least-privilege users, pre-registered known_hosts, a remote working directory, and explicit sudo allowlists when sudo is required.
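
Because remote commands end up in a remote shell, one conservative guard is to refuse any command text that contains shell metacharacters. The character set and function name below are assumptions for this sketch, not the list LinuxAgent's SSH guard uses.

```python
# Hypothetical deny set: separators, pipes, substitution, redirects, grouping.
REMOTE_METACHARACTERS = frozenset(";|&$`<>(){}\\\n")


def remote_command_allowed(command: str) -> bool:
    """True only when the command contains none of the guarded characters."""
    return not any(ch in REMOTE_METACHARACTERS for ch in command)


print(remote_command_allowed("systemctl status nginx"))
print(remote_command_allowed("uptime; cat /etc/shadow"))
```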

Built-In Runbooks

LinuxAgent v4 ships with eleven YAML runbooks for common diagnostics:

| Runbook area | Examples |
| --- | --- |
| Disk and filesystem | df, top directories, journal usage |
| Ports and networking | listeners, port ownership, connectivity checks |
| Services and logs | systemd status, recent unit logs, error search |
| System health, OS, load, and memory | overall host health, OS release, CPU pressure, memory pressure, OOM clues |
| Containers, packages, and certificates | container status, installed packages, certificate expiry |

Runbooks no longer perform natural-language hard matching before LLM planning. They are loaded, policy-validated, and supplied to the planner as advisory examples. The planner may use, adapt, or ignore that guidance based on the actual request. If it produces a multi-step plan inspired by a runbook, every step still goes through normal policy, HITL, audit, and analysis flow.

Quality Gate

Current documented baseline from make test on 2026-05-11:

| Gate | Status |
| --- | --- |
| Unit tests | 677 passing |
| Optional provider compatibility | covered by make optional-anthropic when the extra is installed |
| Sandbox boundary suite | covered by make sandbox |
| Red-team policy suite | adversarial command corpus |
| Policy benchmark | P50/P95/P99 policy latency |
| Harness scenarios | scenario-driven HITL / runbook / cluster / sandbox coverage |
| Integration smoke tests | 10 passing |
| Coverage | 86.73% (--cov-fail-under=80) |
| Static checks | ruff, mypy, bandit, project code-rule checks |
| Build verification | wheel + sdist + packaged data install check |

Useful commands:

make test
make sandbox
make lint
make type
make security
make red-team
make benchmark
make harness
make verify-build

Install Paths

| Path | Use when |
| --- | --- |
| ./scripts/bootstrap.sh | You are working from a source checkout |
| pip install -c constraints.txt https://github.com/Eilen6316/LinuxAgent/releases/download/v4.1.0/linuxagent-4.1.0-py3-none-any.whl | You want the published GitHub Release wheel |
| pip install linuxagent | You want the PyPI package after the release is published |
| pip install -e ".[dev]" | You are developing or running the full local gate |
| pip install -e ".[anthropic]" | You need the optional Anthropic provider |

Documentation

| Document | Purpose |
| --- | --- |
| Documentation index | All long-form docs in one place |
| docs/zh/README.md | Full Chinese manual |
| docs/en/README.md | Full English manual |
| Quick Start | Installation and first run |
| Provider Matrix | Provider setup paths and compatibility status |
| Operator Safety Model | Plain-language safety boundaries for users |
| Runbook Authoring | How to contribute safe YAML runbooks |
| Roadmap | Maintainer priorities and good first issue areas |
| Migration Guide | v3 to v4 breaking changes |
| Threat Model | Assets, trust boundaries, and mitigations |
| Production Readiness | Where LinuxAgent is and is not appropriate |
| Security Policy | Vulnerability reporting and supported versions |
| Contributing | Contribution workflow and review expectations |
| Changelog | Release history |
| Why substring matching is not safety | Technical argument behind LinuxAgent's command safety model |
| Real user feedback | Tell us what happened on a real server, VM, container, or homelab machine |

Mirrors and Community

| Link | Notes |
| --- | --- |
| GitHub | Primary repository |
| GitCode | Mirror |
| Gitee | Mirror |
| QQ Group 281392454 | Community |
| CSDN intro | Project article |

License

MIT
