All notable user-facing changes to Swival.
- `/audit` now accepts an `--all` flag that skips Phase 2 triage and sends every file in scope straight to deep review. Useful when you have already narrowed scope to a subtree you want exhaustively reviewed and do not want triage second-guessing which files are worth a closer look. The flag is recorded with the run, so a bare `/audit --resume` picks up an `--all` run without needing the flag again.
- Server-side context overflow is now recoverable. When the local tiktoken estimate under-counts against the model's real tokenizer, the agent used to give up once the no-tools clamp was also rejected. It now progressively truncates the prompt at tighter targets (50%, 25%, 10% of the context window) and retries each one before declaring the turn lost.
- A goal-driven mode has been added: a structured spin on the Ralph-style "keep prompting until it's done" loop. Set an objective with `/goal <objective>` in the REPL and the agent doesn't get to declare victory and walk away after one turn. The original objective is fed back to the model after every answer, and the loop only ends when the agent itself signals the goal is complete after a real evidence-based audit, declares a blocker, or hits the optional token budget. This makes it practical to point Swival at ambitious, long-running tasks like refactors, audits, or end-to-end fixes, and let it grind for hours without giving up halfway. `/goal pause`, `/goal resume`, `/goal replace`, and `/goal clear` give you full control.
- First-run setup now writes a `[profiles.default]` block to the generated config, so the freshly created file lines up with the profile structure used everywhere else.
- The history file is automatically trimmed when it grows past its maximum capacity.
- The system prompt has been optimized for efficiency, and small models may enjoy a significant reduction in token usage.
- `/audit` gained additional constraints to keep it focused on security issues. `/audit` compatibility with models such as Xiaomi MiMo was also improved.
- `/audit` now accepts multiple focus paths in a single invocation (`/audit src/auth/ src/api/`).
- Other minor improvements to `/audit` to reduce false positives while exploring more bug classes.
- `--logout` has been added to delete locally cached ChatGPT OAuth tokens and exit, so users can sign out without hand-deleting files under `~/.config/litellm/chatgpt/`.
- `/audit` no longer asks the LLM for JSON. Intermediate phase responses now use a simple structured-text format (`@@ name @@` blocks with `key: value` lines), which models emit far more reliably across long prompts than nested JSON.
- `/audit` phase 1 (file profiling) is now dramatically faster on large repositories. File contents are read through a single `git cat-file --batch` process instead of one subprocess per file, cutting the per-file overhead by an order of magnitude on multi-thousand-file scans.
- A `--debug` option has been added to `/audit`. When enabled, a real-time JSONL log is written to `.swival/audit/debug.jsonl` capturing every LLM request and response, parse outcomes, repair attempts, and per-phase metrics, which makes it tractable to diagnose model misbehavior on large audits.
- Another `/audit` improvement: it is now considerably more verbose during phase 3, surfacing per-file progress instead of presenting one long silent batch.
- Phase 5 audit reports no longer occasionally contain raw tool-call JSON (`{"cmd": "ls"}`) or conversational preamble like "I'll inspect the patch...".
- `/audit` prompt cache hit rates have been improved: the bug-class taxonomy and finding metadata interpolated into phase 3 system prompts have been moved into user messages so the system prefix stays static across calls, and per-phase cache statistics are now logged when `--debug` is on.
- GPT-5.5 is now recognized by the ChatGPT provider. Older LiteLLM releases that don't yet know about the model are patched at runtime so context-length queries and Responses API routing work out of the box.
- Emergency truncation has been added as a last-resort compaction stage.
- Prompt caching now works for tool-less LLM calls such as `/audit`. Previously, cache control breakpoints were only injected when tool schemas were present.
- `/audit` Phase 2 triage now places the repository profile in the system prompt instead of repeating it in every user message, improving prompt cache hit rates and reducing costs.
- `/audit` Phase 3b finding expansions now run sequentially with per-item error handling instead of in parallel, so a single failed expansion no longer kills the entire batch.
- D language (`.d`) files are now recognized as source code by `/audit`.
- LiteLLM has been updated to add support for the Mythos provider.
- `top_p` is no longer sent to the provider by default, letting each provider use its own default. The `--top-p` flag is still available to override it explicitly.
- A `--user-agent` option has been added to set a custom `User-Agent` header on LLM API requests. The generic and llama.cpp providers now send `Swival/<version>` by default, and OpenRouter forwards the header when set. This can also be configured via `user_agent` in config files.
- `/audit` path scoping no longer silently skips the target directory when the argument is missing a trailing slash.
- Provider-specific workarounds have been added for Kimi K2.6.
- When a file is too large for the LLM's context window during an audit, the audit now progressively truncates it and retries instead of failing outright.
- Audit LLM calls no longer force a fixed `temperature` and `top_p`, letting providers that reject custom sampling parameters (such as Anthropic) work without errors.
- `/audit` can now be used in one-shot mode, not just the REPL.
- Security audit LLM calls now retry automatically on transient failures (rate limits, timeouts, server errors) with exponential backoff.
- Audit patch generation no longer crashes on files containing non-UTF-8 bytes.
- A built-in `/audit` command has been added for deep security audits over committed Git-tracked code. It scans source and config files for vulnerabilities using the session's LLM, produces a structured report with severity ratings, and can optionally generate a patch. Supports Python, JavaScript, TypeScript, Go, Rust, C/C++, Zig, and many other languages.
- Subagents now inherit the parent session's proactive context compaction setting, so long-running subagent tasks get the same graduated summarization as the main loop.
- When a subagent hits a context overflow, it now recovers partial results from the last real assistant message instead of failing outright. Recap-only messages are skipped so the recovered text reflects actual work.
- Proactive context compaction is now enabled in subagents, giving them the same graduated summarization as the main loop.
"context size exceeded"errors from llama.cpp are now recognized as context overflow, triggering compaction instead of failing the turn.
- Hugging Face models that don't support chat completions now fall back to plain text generation in non-tool turns, and models that exist on the Hub but have no live Inference Provider deployment now fail with a clearer error explaining how to run them instead.
- All the underscore-prefixed internal keys are now stripped from outbound messages.
- Added a quick shell command (`!!`) to the REPL, allowing users to run shell commands without LLM involvement.
- Added an inline `@` trigger for tab-completing file paths mid-prompt in the REPL.
- Fixed Gemini 3 multi-turn tool calling failures by preserving `thought_signature` in current-turn tool calls.
- Custom commands (`!`) now support inlining the content of non-executable text files directly into the prompt.
- JSONL traces now use relative workspace paths instead of absolute paths to reduce sensitive-data leakage.
- `fetch_url` now allows connecting to `localhost`, `127.0.0.1`, and `::1`. Agents frequently run a local server and then need to test or inspect it, and the previous blanket loopback block made that workflow awkward. Other private, link-local, and reserved addresses are still blocked.
- MCP tool names are now stored separately from the tool schema rather than as an internal `_mcp_original_name` field. This fixes Gemini rejecting MCP tool schemas that contained an unrecognized property.
- `--command-middleware` adds a hook point before every `run_command` and `run_shell_command` call. The middleware receives a JSON payload on stdin and can pass the command through unchanged, rewrite it, or block it with a reason. Rewritten commands are still validated against Swival's own command policy, so the middleware cannot bypass allowlists or `--commands none`.
- When command, MCP, or A2A output exceeds the inline limit and spills to a temp file, the first 50 lines (up to 2 KB) are now included directly in the tool result. The model can usually continue without a follow-up `read_file` call.
- `--report` now works in REPL mode and produces a full-session report on exit.
- HuggingFace-compatible agent trace export (format: `agent-traces`) has been implemented.
- `AGENTS.md` files are loaded from all ancestor directories up to the project root, not just the project root itself.
- Custom commands whose name contains a slash are now resolved relative to the config directory, making it easier to organize commands in subdirectories.
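The `--command-middleware` hook described a few entries up lends itself to small policy scripts. A minimal sketch in Python; the payload and response field names (`argv`, `action`, `reason`) are assumptions for illustration, not Swival's documented schema:

```python
import json

def decide(payload: dict) -> dict:
    """Hypothetical middleware policy: allow, rewrite, or block a command."""
    argv = payload.get("argv", [])
    if argv[:1] == ["curl"]:
        # Block with a reason that can be surfaced back to the agent.
        return {"action": "block", "reason": "network access is not allowed"}
    if argv[:2] == ["git", "push"]:
        # Rewrite instead of blocking: force a dry run.
        return {"action": "rewrite", "argv": argv + ["--dry-run"]}
    return {"action": "allow"}

# Wire-up as the middleware executable (assumed contract):
#   import sys; print(json.dumps(decide(json.load(sys.stdin))))
```

Note that rewritten commands are still checked against Swival's own command policy, so a script like this can only narrow permissions, never widen them.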
- Slash commands (`/`) and custom commands (`!`) can now be used in one-shot (non-REPL) mode. Because one-shot input may come from untrusted sources, command dispatch is disabled by default; pass `--oneshot-commands` to opt in.
- Skill directory scanning depth has been reduced from 5 to 3 to avoid descending into vendored or generated trees.
- Swival now auto-detects the project root by walking up to the nearest `.git` directory or `swival.toml`, so launching from a subdirectory keeps file tools and project-scoped behavior anchored to the repository root.
- `edit_file` now accepts an optional `line_number` parameter so targeted replacements can disambiguate repeated matches using the line numbers returned by `read_file`.
- ChatGPT provider handling now tolerates empty Responses API payloads instead of failing the turn.
- `LITELLM_LOCAL_MODEL_COST_MAP` is now enabled unconditionally to avoid unnecessary remote pricing lookups for local-model providers.
- Native support for llama.cpp has been added.
- Shell-command execution is now only exposed in unrestricted command modes: `run_shell_command` is hidden outside `--commands all`/`--yolo`, while `run_command` remains available for argv-style execution in `--commands ask` and allowlist modes.
- Profiles that omit `max_output_tokens` no longer crash or override provider defaults. Swival now preserves an unset output cap instead of substituting a large context-derived value.
- Swival now automatically falls back to plain chat when a provider or model does not support function calling, including OpenRouter's tool-unsupported responses.
- Command execution has been split into two tools: `run_command` now takes an argv array, while `run_shell_command` takes a shell string and is only exposed in unrestricted command modes. This avoids the old union-type schema that weaker models often mangled.
- Tool-call repair has been tightened for small models, making malformed arguments more likely to be repaired into valid tool calls.
- REPL `/profile` switching now correctly inherits top-level config values: profiles that omit keys like `api_key` pick them up from the config file rather than from the previously active profile.
- Malformed tool-call repair now handles file path parameters: glob metacharacters (`*`, `?`, `[]`) are stripped from path and directory fields whose schema description does not indicate a glob or pattern value, and common field-name aliases (`path`, `file`, `filename`) are mapped to the correct schema name. This helps small models in particular.
- A new `/profile` REPL command can list available profiles, switch to a different LLM profile mid-session, and revert to the startup profile (or baseline config) with `/profile -`. `/status` now shows the active profile.
- TAB completion has been added to the REPL for slash commands, custom `!` commands, directory-path arguments for `/add-dir` and `/add-dir-ro`, and `$skill` mentions.
- `/init` now includes commit and pull request style guidance in generated `AGENTS.md` files, derived from recent git history and any PR template.
- Interactive command approval mode has been added: `--commands ask` prompts the user before every shell command execution. Approvals can be scoped per command bucket and persisted to `.swival/approved_buckets`, denied, or allowed once. High-risk commands and inline code execution (`bash -c`, `python -c`, `node -e`, etc.) are flagged with extra warnings.
- Untrusted external content labeling has been added: output from `fetch_url`, MCP servers, and A2A agents is now wrapped with a deterministic `[UNTRUSTED EXTERNAL CONTENT]` header before the model sees it, instructing the model to treat it as data only. The label is baked into spill files so it survives later `read_file` access.
- JSON reports now include a `security` section that tracks command policy blocks, approvals, and untrusted input ingestion events.
- Bedrock provider now forwards the AWS profile to the reviewer session.
- Special tokens in user, system, and tool messages are now escaped by inserting zero-width spaces at token boundaries, preventing the tokenizer from misinterpreting literal text as control tokens.
- Tool descriptions have been removed from the system prompt, freeing up context space (models already receive tool schemas via the function-calling API).
- Internal litellm fields and `reasoning_content` are now stripped from assistant messages before they are sent back to the provider, fixing compatibility issues.
- A `/status` REPL command has been added to show the current session state (provider, model, profile, token usage, active tools, and configuration).
- Bedrock provider now suggests the `aws sso login` command when authentication fails.
- LM Studio provider now sets `LITELLM_LOCAL_MODEL_COST_MAP` to avoid unnecessary remote lookups for model pricing.
- Onboarding has been improved.
- Subagents are now auto-enabled when the context window is 100K tokens or larger.
- An interactive onboarding wizard has been added: on first run with no config file, Swival guides the user through provider selection, API key entry, and config file creation. Re-running onboarding merges new provider settings into an existing config file instead of overwriting it.
- Common malformed tool calls from weaker models are now automatically repaired before reaching dispatch: orphaned tool-call references, missing required fields, and broken JSON are patched up so the agent loop can continue.
- Named LLM profiles have been added: `[profiles.NAME]` tables can be defined in config files to bundle provider, model, API key, and other LLM settings under a short name. Use `--profile NAME` to select one at runtime, or set `active_profile` in config for a default. `--list-profiles` prints all available profiles.
- Provider error messages now include the model ID for easier debugging.
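For illustration, a config sketch using named profiles. Only the `[profiles.NAME]` table shape, `active_profile`, and the flag names come from the entry above; the individual keys and values are hypothetical:

```toml
# ~/.config/swival/config.toml (illustrative values)
active_profile = "daily"

[profiles.daily]
provider = "openrouter"   # key names are assumptions
model = "qwen/qwen3-coder"
api_key = "sk-or-..."

[profiles.local]
provider = "lm_studio"
model = "qwen3-8b"
```

With a layout like this, `--profile local` would switch to the local model for one run while `daily` remains the default.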
- Minimax-specific transient errors are now caught and retried.
- Filesystem access controls have been decoupled from `--yolo`: `--files` (`all`, `some`, `none`) controls file access independently, and `--commands` (`all`, `none`, or a comma-separated whitelist) controls which shell commands the agent may run. `--yolo` is now shorthand for `--files all --commands all`.
- AWS Bedrock has been added as a provider.
- A `/simplify` REPL command has been added: it runs a review pass over recently changed code, checking for reuse opportunities, quality issues, and inefficiencies, then fixes any problems found.
- REPL answers are now rendered as Markdown on TTYs.
- Project-level MCP configuration has been moved from `.mcp.json` to `.swival/mcp.json`.
- Parallel subagents have been added: `spawn_subagent` launches an independent agent loop in a background thread to work on a task concurrently, and `check_subagents` polls, collects results, or cancels running subagents. Up to 4 subagents can run in parallel. Each gets its own thinking, todo, snapshot, and file-tracker state. Subagents have access to all file and search tools but cannot spawn their own subagents.
- The todo list is now session-scoped and purely in-memory. It no longer persists to `.swival/todo.md` or uses file locking. Concurrent sessions get fully independent todo lists with no cross-session interference.
- A `/remember <text>` REPL command has been added to persist a project fact to `AGENTS.md` under `## Conventions`. The live system prompt is updated immediately so the agent sees the new fact without restarting.
- `read_file` on a missing `MEMORY.md` now returns a helpful hint explaining its purpose instead of a generic "file not found" error.
- Prompt caching has been added. When a provider supports it, the system
prompt is cached on the first request and reused for subsequent calls,
reducing costs and latency. Can be disabled with
`--no-cache-prompts`.
- An `outline` tool has been added: it shows the structural skeleton of one or more files (classes, functions, top-level declarations) with line numbers, without bodies. Useful for navigating unfamiliar code.
- A `/copy` REPL command has been added to copy the last assistant response to the clipboard.
- When using LM Studio, the max context length is now always queried from the server instead of relying on a hardcoded default.
- When Swival is launched on a TTY with no task, it now enters REPL mode directly.
- Filesystem built-in tools now expand `~` in paths, so home-directory paths work consistently across file reads, writes, edits, deletes, and searches.
- Small models now use the `fetch_url` tool more consistently.
- Homebrew installation support has been added.
- `Session.ask()` now rolls back conversation history on failure, so a failed turn doesn't corrupt a long-lived Python session.
- Public Python API exceptions have been formalized: `ContextOverflowError` and `LifecycleError` are now exported.
- The persistent todo list is now safer across concurrent sessions and processes: writes use file locking and merge on-disk changes instead of clobbering them.
- SIGTERM now shuts Swival down cleanly with exit code 143, preserves continue-here state, and closes MCP/A2A managers during teardown.
- Generic lifecycle hooks have been added: user-configured commands run at
startup and exit, with Git and project metadata passed via
`SWIVAL_*` environment variables. Startup hooks run before memory and continue-here loading so they can hydrate `.swival/` from remote storage; exit hooks run after all artifacts are written. Configurable via `swival.toml` or `~/.config/swival/config.toml`.
- Custom command arguments are now passed as a single string: `!command a b c` calls the script with `$2="a b c"` instead of spreading each word as a separate argv entry.
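A custom command script therefore only needs to read that one parameter. A minimal sketch (the script name and prompt text are invented; what `$1` carries is not specified in this entry):

```python
import sys

def build_prompt(args: str) -> str:
    """Hypothetical custom command, e.g. ~/.config/swival/commands/summarize.

    Whatever this prints becomes the next user message in the session.
    """
    return f"Summarize the following topic for the team: {args}"

# Invoked as `!summarize release notes`: per the entry above, the whole
# argument string arrives as one parameter ($2, i.e. sys.argv[2] here).
if len(sys.argv) > 2:
    print(build_prompt(sys.argv[2]))
```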
- Outbound LLM filter: a new `--llm-filter` flag (and `llm_filter` config key) runs a user-defined script before every provider call. The script receives messages as JSON on stdin and can redact content or block the request entirely. It fails closed: script errors or rejections prevent the request from being sent. It runs before secret encryption, so filters see human-readable text. Configurable from the CLI, `swival.toml`, or `~/.config/swival/config.toml`.
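A redaction filter of this kind can be sketched as follows. The exact stdin/stdout contract assumed here (a JSON list of role/content messages, echoing the modified list to approve the request) is an illustration; only the fail-closed behavior is documented above:

```python
import json
import re

# One example secret pattern: AWS-style access key IDs.
AWS_KEY = re.compile(r"AKIA[0-9A-Z]{16}")

def redact(messages: list) -> list:
    """Replace matching secrets in message content before the provider call."""
    for msg in messages:
        if isinstance(msg.get("content"), str):
            msg["content"] = AWS_KEY.sub("[REDACTED]", msg["content"])
    return messages

# Wire-up as the actual filter executable (assumed contract):
#   import sys; print(json.dumps(redact(json.load(sys.stdin))))
```

Because the filter fails closed, a script like this should exit nonzero only when it genuinely wants to block the request.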
- Custom commands have been added: executable scripts placed in
`~/.config/swival/commands/` can be invoked from the REPL with `!name`, and their output is injected into the conversation as the next user message.
- `/init` workflow discovery is now platform-aware: it detects the current OS and architecture and only extracts commands that apply to the host platform.
- `/init` now discovers workflow files and validates the generated instructions by writing them out and checking the result.
- Transient LLM errors (rate limits, timeouts, server errors) are now retried automatically with exponential backoff.
- An interaction-policy system prompt has been added to distinguish REPL and autonomous modes, giving the model clearer behavioral guidance for each.
- The changelog has been updated.
- Last-resort compaction has been added: when the context window is too small for tool schemas, all tool definitions are dropped and the system prompt is truncated so the conversation can continue as plain chat.
- Command provider now supports tool calling via a `<swival:call>` XML convention, allowing external command-based backends to invoke tools.
- Data-URI inlined images are now stripped after HTML-to-markdown conversion to avoid bloating context with base64 blobs.
- Markdown comments (`<!-- ... -->`) are now trimmed from skill and agent instruction files.
- OpenRouter requests now include `referer` and `title` headers.
- The `grep` tool now supports a `context_lines` parameter to show surrounding lines before and after each match.
- `/new` has been added as a synonym for `/clear` in the REPL.
- `reasoning_effort` set to `"default"` is now skipped instead of being sent to the provider.
- Secrets encryption has been added: credential tokens in LLM messages can be transparently encrypted before being sent to the provider and decrypted on return, preventing accidental leakage through hosted APIs.
- The `--sanitize-thinking` CLI flag has been fixed (it was accepted but ignored in 0.1.29).
- `read_multiple_files` now accepts a plain string in addition to an array, for resilience with models that pass a single filename as a string.
- Command provider has been added for shelling out to external programs as the LLM backend: the conversation is passed as a plain-text transcript on stdin, and the response is read from stdout.
- Leaked reasoning tags (`<think>`, `</think>`) from models with bogus templates can now be stripped. This can be controlled with `sanitize_thinking` in config or `--sanitize-thinking`.
- Race conditions when multiple A2A contexts run concurrently have been fixed by isolating per-context temporary files (`cmd_output`) and adding file locks.
- A SQLite cross-thread error when `--serve` and `--cache` are combined has been fixed.
- Support for vision has been added: a new `view_image` tool allows the agent to use vision-enabled models to examine images.
- Skill scanning now skips dot directories.
- Skills can now be loaded from `.agents/skills/` and `~/.agents/skills/` directories.
- Global agent instructions via `~/.agents/AGENTS.md` have been added.
- Documentation has been improved with web browsing options, lightpanda MCP server usage, and chrome-devtools-mcp examples.
- Google Gemini provider has been switched to use the OpenAI-compatible endpoint.
- Built-in help output has been grouped by purpose.
- Documentation and examples have been improved.
- Native Google Gemini API support has been added.
- A2A streaming (`SendStreamingMessage`) has been added: real-time SSE delivery of status updates, tool lifecycle events, and incremental text.
- `CancelTask` support has been added: per-task cancel flags are checked between tool calls and at each turn boundary.
- A2A server hardening has been added: sliding-window rate limiting, request size validation, a concurrency semaphore, and active-context protection against LRU eviction.
- Read access to external skill directories has been auto-granted and supporting files are now listed on skill activation.
- A2A server mode (`--serve`) has been added: a Swival Session can be exposed as an A2A endpoint, with context-keyed multi-turn sessions, bearer auth, and TTL-based cleanup.
- Customizable A2A server agent card has been added: `--serve-name`, `--serve-description`, and `[[serve_skills]]` in `swival.toml` control how the agent advertises itself.
- A `/tools` REPL command has been added to list available tools.
- A2A (Agent-to-Agent) support has been added: remote agents can be connected via `[a2a_servers.*]` in `swival.toml` or `--a2a-config`, with tools exposed as `a2a__<agent>__<skill>`.
- Budgeted memory injection has been added. `--memory-full` can be used for legacy full injection.
- Support for reading questions from stdin when piped has been added.
- A `--self-review` option has been added: the agent reviews its own work before finishing.
- Reviewer feedback visibility has been improved and expected actions have been made more explicit.
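As a sketch of the A2A wiring, only the `[a2a_servers.*]` table name and the `a2a__<agent>__<skill>` tool naming come from the entries above; the keys inside the table are assumptions:

```toml
# swival.toml (illustrative; key names are guesses)
[a2a_servers.researcher]
url = "http://localhost:8001"
```

A remote skill named `search` on that agent would then surface to the model as the tool `a2a__researcher__search`.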
- Informational stderr from the reviewer is now shown as warnings instead of being silently discarded.
- The default number of review rounds has been bumped up to 15.
- A cache miss cascade caused by dropped `tool_call` fields in cached responses has been fixed.
- An optional SQLite LLM response cache (`--cache`) has been added for faster repeated queries, with system-prompt-independent cache keys.
- A deadlock when a shell command backgrounds a child process has been fixed.
- The `todo` tool accepting JSON-encoded array strings instead of proper lists has been fixed.
- The project-local skills directory has been moved from `skills/` to `.swival/skills/`.
- Spurious "shadowed by itself" warnings when `--skills-dir` pointed to the same directory as the project-local skills location have been fixed.
- `$skill-name` mention syntax has been added: `$deploy` can be typed in a message to automatically activate a skill without the model needing to call `use_skill`.
- The skill catalog in the system prompt has been reworked with file paths, trigger rules, and progressive disclosure guidance.
- Auto-injected skills now use assistant+tool message pairs so compaction can shrink or drop them under context pressure.
- Auto-activated skills are now recorded in JSON reports.
- A `/learn` command has been added for interactive skill discovery.
- A `read_multiple_files` tool has been added for reading several files in a single call.
- A continue-here feature has been added: session state is saved on interruption (Ctrl+C, max turns, compaction failure) and resumed on next start.
- The `todo` tool has been made to accept multiple tasks in one call.
- The `grep` tool has been extended with additional options.
- Context overflow detection for non-standard exception types has been fixed.
- A `--reasoning-effort` option has been added.
- Session memories that persist across runs have been added.
- GPT-5.4 has been added to the built-in model list.
- Markdown formatting for agent responses has been added.
- Spinner and progress display have been improved.
- Todo list UI has been improved.
- All CLI options are now listed in `--help` and sorted alphabetically.
- Colored diff output has been added to the `edit_file` tool.
- `write_file` has been made to coerce JSON content into a string instead of erroring.
- ChatGPT has been added as a provider (direct OpenAI API).
- AgentFS sandbox support has been integrated with auto-session IDs, diff hints, and strict read mode.
- "Did you mean?" suggestions for mistyped tool command names have been added.
- MCP servers have been made to inherit the parent process environment variables.
- Generic OpenAI-compatible provider has been added for any server that speaks the OpenAI API.
- A snapshot tool has been added for proactive context collapse, with `/snapshot` and `/restore` REPL commands.
- An `--extra-body` option has been added to pass arbitrary JSON to the LLM request (useful for disabling thinking, etc.).
- OpenRouter documentation and setup instructions have been added.
- MCP (Model Context Protocol) server support has been added. Servers are configured in `swival.toml` or `.mcp.json`; tools are exposed as `mcp__<server>__<tool>`.
- Configurable size limits for MCP tool output (`MCP_INLINE_LIMIT`, `MCP_FILE_LIMIT`) have been added.
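For reference, `.mcp.json` files conventionally follow the `mcpServers` layout used across MCP clients; a sketch, where the server name and command are examples rather than Swival defaults:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "."]
    }
  }
}
```

The tools this server exposes would then appear to the model as `mcp__filesystem__<tool>`.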
- Reviewer mode (`--reviewer-mode`) has been added: an LLM-as-judge loop that automatically evaluates agent output, with `--objective`, `--verify`, and `--review-prompt` options.
- `--max-review-rounds` has been added to cap review iterations.
- Graduated context compaction has been introduced: `compact_messages` -> `drop_middle_turns` -> `aggressive_drop_turns`, replacing the previous all-or-nothing approach.
- `/continue` is now suggested when the agent hits the max turn limit.
- Clamping and retry messages have been improved.
- The `grep` and `list_files` tools have been made to accept file paths in addition to directories.
- `grep` tool output has been improved.
- Whether the model supports vision is now reported.
- Global instructions via `~/.config/swival/AGENTS.md` have been added.
- `--no-instructions` behavior has been clarified.
- Configuration file support (`swival.toml` and `~/.config/swival/config.toml`) has been added.
- `--add-dir-ro` has been added for read-only additional directories (renamed from `--allow-dir`).
- Common command syntax mistakes in yolo mode are now auto-corrected.
- The instructions file has been switched from `ZOK.md` to `AGENT.md`.
- The `think` tool has been redesigned with numbered thoughts, revisions, and branches.
- A CI pipeline has been added.
- A `Makefile` with common development commands has been added.
- Trash/undo handling has been fixed.
- The error reported when the model sends a file size with units has been improved.
- A `todo` tool has been added: a persistent checklist in `.swival/todo.md` that survives context compaction, with periodic reminders and duplicate detection.
- An `/init` command has been added for bootstrapping `AGENT.md`.
- A public Python API (`swival.Session`, `swival.run()`) has been exposed.
- A loading spinner during LLM calls has been added.
- The unused `notes` tool has been removed.
- OpenRouter has been added as a provider.
- A `delete_file` tool has been added.
- `move_file`/`rename_file` tools have been added.
- External reviewer support for automated evaluation has been added.
- Read-before-write is now required: the agent must read a file before editing or overwriting it (can be disabled with `--no-read-guard`).
- Final output is now printed even when `--report` is enabled.
- Default values for `temperature` and `top_p` have been removed (the provider decides).
- The package has been renamed from `swival-agent` to `swival`.
- A `--version` flag has been added.
- Recursive skill discovery has been deepened.
- Skill activation events have been included in reports.
- `--report` has been added for JSON session reports.
- `--history` has been added to replay previous sessions.
- The thinking tool has been revamped.
- Absolute paths in yolo mode are now allowed.
- Full shell expansion in yolo mode has been added.
- Default max turn limit has been increased.
- A `--seed` option has been added for deterministic output.
- Initial release. Core agent loop with tool-use, LM Studio and HuggingFace providers, file read/write/edit, `grep`, `list_files`, `run_command`, a thinking tool, a skills system, and REPL mode.