Releases: NousResearch/hermes-agent

Hermes Agent v0.7.0 (v2026.4.3)

03 Apr 18:15
v2026.4.3
abf1e98



Release Date: April 3, 2026

The resilience release — pluggable memory providers, credential pool rotation, Camoufox anti-detection browser, inline diff previews, gateway hardening across race conditions and approval routing, and deep security fixes across 168 PRs and 46 resolved issues.


✨ Highlights

  • Pluggable Memory Provider Interface — Memory is now an extensible plugin system. Third-party memory backends (Honcho, vector stores, custom DBs) implement a simple provider ABC and register via the plugin system. Built-in memory is the default provider. Honcho integration restored to full parity as the reference plugin with profile-scoped host/peer resolution. (#4623, #4616, #4355)

  • Same-Provider Credential Pools — Configure multiple API keys for the same provider with automatic rotation. Thread-safe least_used strategy distributes load across keys, and 401 failures trigger automatic rotation to the next credential. Set up via the setup wizard or credential_pool config. (#4188, #4300, #4361)

  • Camoufox Anti-Detection Browser Backend — New local browser backend using Camoufox for stealth browsing. Persistent sessions with VNC URL discovery for visual debugging, configurable SSRF bypass for local backends, auto-install via hermes tools. (#4008, #4419, #4292)

  • Inline Diff Previews — File write and patch operations now show inline diffs in the tool activity feed, giving you visual confirmation of what changed before the agent moves on. (#4411, #4423)

  • API Server Session Continuity & Tool Streaming — The API server (Open WebUI integration) now streams tool progress events in real-time and supports X-Hermes-Session-Id headers for persistent sessions across requests. Sessions persist to the shared SessionDB. (#4092, #4478, #4802)

  • ACP: Client-Provided MCP Servers — Editor integrations (VS Code, Zed, JetBrains) can now register their own MCP servers, which Hermes picks up as additional agent tools. Your editor's MCP ecosystem flows directly into the agent. (#4705)

  • Gateway Hardening — Major stability pass across race conditions, photo media delivery, flood control, stuck sessions, approval routing, and compression death spirals. The gateway is substantially more reliable in production. (#4727, #4750, #4798, #4557)

  • Security: Secret Exfiltration Blocking — Browser URLs and LLM responses are now scanned for secret patterns, blocking exfiltration attempts via URL encoding, base64, or prompt injection. Credential directory protections expanded to .docker, .azure, .config/gh. execute_code sandbox output is redacted. (#4483, #4360, #4305, #4327)
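As an illustration of the exfiltration checks above, a minimal scanner might look for known secret shapes in a URL and in its decoded variants. Everything here (the pattern set, function names, decoding heuristics) is a hypothetical sketch, not Hermes' actual implementation:

```python
import base64
import re
import urllib.parse

# Illustrative secret shapes only; a real scanner would carry a much
# larger, maintained pattern set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),   # OpenAI-style API keys
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # GitHub personal access tokens
]

def _decoded_views(url: str) -> list[str]:
    """Return the URL plus decoded variants a secret might be hidden in."""
    views = [url, urllib.parse.unquote(url)]
    # Try base64url-decoding each long token; skip anything that isn't base64.
    for token in re.findall(r"[A-Za-z0-9_-]{16,}", url):
        try:
            padded = token + "=" * (-len(token) % 4)
            views.append(base64.urlsafe_b64decode(padded).decode("utf-8", "ignore"))
        except Exception:
            pass
    return views

def url_leaks_secret(url: str) -> bool:
    """True if any view of the URL contains a known secret pattern."""
    return any(p.search(view) for p in SECRET_PATTERNS for view in _decoded_views(url))
```

A URL that carries a key verbatim, percent-encoded, or base64-encoded would all trip the same check.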


🏗️ Core Agent & Architecture

Provider & Model Support

  • Same-provider credential pools — configure multiple API keys with automatic least_used rotation and 401 failover (#4188, #4300)
  • Credential pool preserved through smart routing — pool state survives fallback provider switches and defers eager fallback on 429 (#4361)
  • Per-turn primary runtime restoration — after fallback provider use, the agent automatically restores the primary provider on the next turn with transport recovery (#4624)
  • developer role for GPT-5 and Codex models — uses OpenAI's recommended system message role for newer models (#4498)
  • Google model operational guidance — Gemini and Gemma models get provider-specific prompting guidance (#4641)
  • Anthropic long-context tier 429 handling — automatically reduces context to 200k when hitting tier limits (#4747)
  • URL-based auth for third-party Anthropic endpoints + CI test fixes (#4148)
  • Bearer auth for MiniMax Anthropic endpoints (#4028)
  • Fireworks context length detection (#4158)
  • Standard DashScope international endpoint for Alibaba provider (#4133, closes #3912)
  • Custom providers context_length honored in hygiene compression (#4085)
  • Non-sk-ant keys treated as regular API keys, not OAuth tokens (#4093)
  • claude-sonnet-4.6 added to OpenRouter and Nous model lists (#4157)
  • Qwen 3.6 Plus Preview added to model lists (#4376)
  • MiniMax M2.7 added to hermes model picker and OpenCode (#4208)
  • Auto-detect models from server probe in custom endpoint setup (#4218)
  • Config.yaml single source of truth for endpoint URLs — no more env var vs config.yaml conflicts (#4165)
  • Setup wizard no longer overwrites custom endpoint config (#4180, closes #4172)
  • Unified setup wizard provider selection with hermes model — single code path for both flows (#4200)
  • Root-level provider config no longer overrides model.provider (#4329)
  • Rate-limit pairing rejection messages to prevent spam (#4081)
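The least_used rotation described above can be sketched as a small thread-safe pool. Class and method names (CredentialPool, acquire, mark_failed) are illustrative assumptions, not the real credential_pool API:

```python
import threading

class CredentialPool:
    """Sketch of least_used credential rotation: hand out the key with the
    fewest uses so far, and retire keys that fail (e.g. on HTTP 401)."""

    def __init__(self, api_keys):
        self._lock = threading.Lock()           # safe across agent threads
        self._uses = {key: 0 for key in api_keys}
        self._failed = set()

    def acquire(self) -> str:
        with self._lock:
            live = {k: n for k, n in self._uses.items() if k not in self._failed}
            if not live:
                raise RuntimeError("all credentials in the pool have failed")
            key = min(live, key=live.get)       # least_used: fewest uses wins
            self._uses[key] += 1
            return key

    def mark_failed(self, key: str) -> None:
        with self._lock:
            self._failed.add(key)               # e.g. after a 401 response
```

Under this scheme two keys alternate under load, and a 401 on one key simply removes it from rotation.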

Agent Loop & Conversation

  • Preserve Anthropic thinking block signatures across tool-use turns (#4626)
  • Classify think-only empty responses before retrying — prevents infinite retry loops on models that produce thinking blocks without content (#4645)
  • Prevent compression death spiral from API disconnects — stops the loop where compression triggers, fails, compresses again (#4750, closes #2153)
  • Persist compressed context to gateway session after mid-run compression (#4095)
  • Context-exceeded error messages now include actionable guidance (#4155, closes #4061)
  • Strip orphaned think/reasoning tags from user-facing responses (#4311, closes #4285)
  • Harden Codex responses preflight and stream error handling (#4313)
  • Deterministic call_id fallbacks instead of random UUIDs for prompt cache consistency (#3991)
  • Context pressure warning spam prevented after compression (#4012)
  • AsyncOpenAI created lazily in trajectory compressor to avoid closed event loop errors (#4013)
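The deterministic call_id fallback above (#3991) can be illustrated by hashing the stable parts of a tool call instead of minting a UUID; the exact fields Hermes hashes are not specified here, so this is a sketch with assumed names:

```python
import hashlib
import json

def fallback_call_id(tool_name: str, arguments: dict, turn_index: int) -> str:
    """Derive a stable call_id from the call itself, so retries and replays
    produce identical transcripts and keep provider prompt caches warm.
    The field choice (tool, args, turn) is illustrative."""
    payload = json.dumps(
        {"tool": tool_name, "args": arguments, "turn": turn_index},
        sort_keys=True,                 # canonical key order => stable hash
    )
    return "call_" + hashlib.sha256(payload.encode()).hexdigest()[:24]
```

The same call always yields the same id, while any change to the arguments or position yields a new one.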

Memory & Sessions

  • Pluggable memory provider interface — ABC-based plugin system for custom memory backends with profile isolation (#4623)
  • Honcho full integration parity restored as reference memory provider plugin (#4355) — @erosika
  • Honcho profile-scoped host and peer resolution (#4616)
  • Memory flush state persisted to prevent redundant re-flushes on gateway restart (#4481)
  • Memory provider tools routed through sequential execution path (#4803)

Hermes Agent v0.6.0 (v2026.3.30)

30 Mar 15:30
v2026.3.30
e08778f



Release Date: March 30, 2026

The multi-instance release — Profiles for running isolated agent instances, MCP server mode, Docker container, fallback provider chains, two new messaging platforms (Feishu/Lark and WeCom), Telegram webhook mode, Slack multi-workspace OAuth, 95 PRs and 16 resolved issues in 2 days.


✨ Highlights

  • Profiles — Multi-Instance Hermes — Run multiple isolated Hermes instances from the same installation. Each profile gets its own config, memory, sessions, skills, and gateway service. Create with hermes profile create, switch with hermes -p <name>, export/import for sharing. Full token-lock isolation prevents two profiles from using the same bot credential. (#3681)

  • MCP Server Mode — Expose Hermes conversations and sessions to any MCP-compatible client (Claude Desktop, Cursor, VS Code, etc.) via hermes mcp serve. Browse conversations, read messages, search across sessions, and manage attachments — all through the Model Context Protocol. Supports both stdio and Streamable HTTP transports. (#3795)

  • Docker Container — Official Dockerfile for running Hermes Agent in a container. Supports both CLI and gateway modes with volume-mounted config. (#3668, closes #850)

  • Ordered Fallback Provider Chain — Configure multiple inference providers with automatic failover. When your primary provider returns errors or is unreachable, Hermes automatically tries the next provider in the chain. Configure via fallback_providers in config.yaml. (#3813, closes #1734)

  • Feishu/Lark Platform Support — Full gateway adapter for Feishu (飞书) and Lark with event subscriptions, message cards, group chat, image/file attachments, and interactive card callbacks. (#3799, #3817, closes #1788)

  • WeCom (Enterprise WeChat) Platform Support — New gateway adapter for WeCom (企业微信) with text/image/voice messages, group chats, and callback verification. (#3847)

  • Slack Multi-Workspace OAuth — Connect a single Hermes gateway to multiple Slack workspaces via OAuth token file. Each workspace gets its own bot token, resolved dynamically per incoming event. (#3903)

  • Telegram Webhook Mode & Group Controls — Run the Telegram adapter in webhook mode as an alternative to polling — faster response times and better for production deployments behind a reverse proxy. New group mention gating controls when the bot responds: always, only when @mentioned, or via regex triggers. (#3880, #3870)

  • Exa Search Backend — Add Exa as an alternative web search and content extraction backend alongside Firecrawl and DuckDuckGo. Set EXA_API_KEY and configure as preferred backend. (#3648)

  • Skills & Credentials on Remote Backends — Mount skill directories and credential files into Modal and Docker containers, so remote terminal sessions have access to the same skills and secrets as local execution. (#3890, #3671, closes #3665, #3433)


🏗️ Core Agent & Architecture

Provider & Model Support

  • Ordered fallback provider chain — automatic failover across multiple configured providers (#3813)
  • Fix api_mode on provider switch — switching providers via hermes model now correctly clears stale api_mode instead of hardcoding chat_completions, fixing 404s for providers with Anthropic-compatible endpoints (#3726, #3857, closes #3685)
  • Stop silent OpenRouter fallback — when no provider is configured, Hermes now raises a clear error instead of silently routing to OpenRouter (#3807, #3862)
  • Gemini 3.1 preview models — added to OpenRouter and Nous Portal catalogs (#3803, closes #3753)
  • Gemini direct API context length — full context length resolution for direct Google AI endpoints (#3876)
  • gpt-5.4-mini added to Codex fallback catalog (#3855)
  • Curated model lists preferred over live API probe when the probe returns fewer models (#3856, #3867)
  • User-friendly 429 rate limit messages with Retry-After countdown (#3809)
  • Auxiliary client placeholder key for local servers without auth requirements (#3842)
  • INFO-level logging for auxiliary provider resolution (#3866)
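The ordered fallback chain above can be sketched as a loop over configured providers; `call` here stands in for the real per-provider client, so this is an assumption-laden sketch rather than Hermes' code:

```python
def call_with_fallback(prompt, providers, call):
    """Try each provider in configured order; fail over on any error and
    raise only when the whole chain is exhausted."""
    errors = []
    for provider in providers:
        try:
            return call(provider, prompt)
        except Exception as exc:        # e.g. 5xx, timeout, unreachable host
            errors.append(f"{provider}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

With `fallback_providers: [primary, backup]`, an unreachable primary transparently hands the same prompt to the backup.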

Agent Loop & Conversation

  • Subagent status reporting — reports completed status when summary exists instead of generic failure (#3829)
  • Session log file updated during compression — prevents stale file references after context compression (#3835)
  • Omit empty tools param — sends no tools parameter when empty instead of None, fixing compatibility with strict providers (#3820)

Profiles & Multi-Instance

  • Profiles system — hermes profile create/list/switch/delete/export/import/rename. Each profile gets isolated HERMES_HOME, gateway service, CLI wrapper. Token locks prevent credential collisions. Tab completion for profile names. (#3681)
  • Profile-aware display paths — all user-facing ~/.hermes paths replaced with display_hermes_home() to show the correct profile directory (#3623)
  • Lazy display_hermes_home imports — prevents ImportError during hermes update when modules cache stale bytecode (#3776)
  • HERMES_HOME for protected paths — .env write-deny path now respects HERMES_HOME instead of hardcoded ~/.hermes (#3840)

📱 Messaging Platforms (Gateway)

New Platforms

  • Feishu/Lark — Full adapter with event subscriptions, message cards, group chat, image/file attachments, interactive card callbacks (#3799, #3817)
  • WeCom (Enterprise WeChat) — Text/image/voice messages, group chats, callback verification (#3847)

Telegram

  • Webhook mode — run as webhook endpoint instead of polling for production deployments (#3880)
  • Group mention gating & regex triggers — configurable bot response behavior in groups: always, @mention-only, or regex-matched (#3870)
  • Gracefully handle deleted reply targets — no more crashes when the message being replied to was deleted (#3858, closes #3229)

Discord

  • Message processing reactions — adds a reaction emoji while processing and removes it when done, giving visual feedback in channels (#3871)
  • DISCORD_IGNORE_NO_MENTION — skip messages that @mention other users/bots but not Hermes (#3640)
  • Clean up deferred "thinking..." — properly removes the "thinking..." indicator after slash commands complete (#3674, closes #3595)

Slack

  • Multi-workspace OAuth — connect to multiple Slack workspaces from a single gateway via OAuth token file (#3903)

WhatsApp

  • Persistent aiohttp session — reuse HTTP sessions across requests instead of creating new ones per message (#3818)
  • LID↔phone alias resolution — correctly match Linked ID and phone number formats in allowlists (#3830)
  • Skip reply prefix in bot mode — cleaner message formatting when running as a WhatsApp bot (#3931)

Hermes Agent v0.5.0 (v2026.3.28)

28 Mar 20:12
v2026.3.28
558cc14



Release Date: March 28, 2026

The hardening release — Hugging Face provider, /model command overhaul, Telegram Private Chat Topics, native Modal SDK, plugin lifecycle hooks, tool-use enforcement for GPT models, Nix flake, 50+ security and reliability fixes, and a comprehensive supply chain audit.


✨ Highlights

  • Nous Portal now supports 400+ models — The Nous Research inference portal has expanded dramatically, giving Hermes Agent users access to over 400 models through a single provider endpoint

  • Hugging Face as a first-class inference provider — Full integration with HF Inference API including curated agentic model picker that maps to OpenRouter analogues, live /models endpoint probe, and setup wizard flow (#3419, #3440)

  • Telegram Private Chat Topics — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat (#3163)

  • Native Modal SDK backend — Replaced swe-rex dependency with native Modal SDK (Sandbox.create.aio + exec.aio), eliminating tunnels and simplifying the Modal terminal backend (#3538)

  • Plugin lifecycle hooks activated — pre_llm_call, post_llm_call, on_session_start, and on_session_end hooks now fire in the agent loop and CLI/gateway, completing the plugin hook system (#3542)

  • Improved OpenAI Model Reliability — Added GPT_TOOL_USE_GUIDANCE to prevent GPT models from describing intended actions instead of making tool calls, plus automatic stripping of stale budget warnings from conversation history that caused models to avoid tools across turns (#3528)

  • Nix flake — Full uv2nix build, NixOS module with persistent container mode, auto-generated config keys from Python source, and suffix PATHs for agent-friendliness (#20, #3274, #3061) by @alt-glitch

  • Supply chain hardening — Removed compromised litellm dependency, pinned all dependency version ranges, regenerated uv.lock with hashes, added CI workflow scanning PRs for supply chain attack patterns, and bumped deps to fix CVEs (#2796, #2810, #2812, #2816, #3073)

  • Anthropic output limits fix — Replaced hardcoded 16K max_tokens with per-model native output limits (128K for Opus 4.6, 64K for Sonnet 4.6), fixing "Response truncated" and thinking-budget exhaustion on direct Anthropic API (#3426, #3444)


🏗️ Core Agent & Architecture

New Provider: Hugging Face

  • First-class Hugging Face Inference API integration with auth, setup wizard, and model picker (#3419)
  • Curated model list mapping OpenRouter agentic defaults to HF equivalents — providers with 8+ curated models skip live /models probe for speed (#3440)
  • Added glm-5-turbo to Z.AI provider model list (#3095)

Provider & Model Improvements

  • /model command overhaul — extracted shared switch_model() pipeline for CLI and gateway, custom endpoint support, provider-aware routing (#2795, #2799)
  • Removed /model slash command from CLI and gateway in favor of hermes model subcommand (#3080)
  • Preserve custom provider instead of silently remapping to openrouter (#2792)
  • Read root-level provider and base_url from config.yaml into model config (#3112)
  • Align Nous Portal model slugs with OpenRouter naming (#3253)
  • Fix Alibaba provider default endpoint and model list (#3484)
  • Allow MiniMax users to override /v1/anthropic auto-correction (#3553)
  • Migrate OAuth token refresh to platform.claude.com with fallback (#3246)

Agent Loop & Conversation

  • Improved OpenAI model reliability — GPT_TOOL_USE_GUIDANCE prevents GPT models from describing actions instead of calling tools + automatic budget warning stripping from history (#3528)
  • Surface lifecycle events — All retry, fallback, and compression events now surface to the user as formatted messages (#3153)
  • Anthropic output limits — Per-model native output limits instead of hardcoded 16K max_tokens (#3426)
  • Thinking-budget exhaustion detection — Skip useless continuation retries when model uses all output tokens on reasoning (#3444)
  • Always prefer streaming for API calls to prevent hung subagents (#3120)
  • Restore safe non-streaming fallback after stream failures (#3020)
  • Give subagents independent iteration budgets (#3004)
  • Update api_key in _try_activate_fallback for subagent auth (#3103)
  • Graceful return on max retries instead of crashing thread (untagged commit)
  • Count compression restarts toward retry limit (#3070)
  • Include tool tokens in preflight estimate, guard context probe persistence (#3164)
  • Update context compressor limits after fallback activation (#3305)
  • Validate empty user messages to prevent Anthropic API 400 errors (#3322)
  • GLM reasoning-only and max-length handling (#3010)
  • Increase API timeout default from 900s to 1800s for slow-thinking models (#3431)
  • Send max_tokens for Claude/OpenRouter + retry SSE connection errors (#3497)
  • Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode (#2701) by @ctlst

Streaming & Reasoning

  • Persist reasoning across gateway session turns with new schema v6 columns (reasoning, reasoning_details, codex_reasoning_items) (#2974)
  • Detect and kill stale SSE connections (untagged commit)
  • Fix stale stream detector race causing spurious RemoteProtocolError (untagged commit)
  • Skip duplicate callback for <think>-extracted reasoning during streaming (#3116)
  • Preserve reasoning fields in rewrite_transcript (#3311)
  • Preserve Gemini thought signatures in streamed tool calls (#2997)
  • Ensure first delta is fired during reasoning updates (untagged commit)

Session & Memory

  • Session search recent sessions mode — Omit query to browse recent sessions with titles, previews, and timestamps (#2533)
  • Session config surfacing on /new, /reset, and auto-reset (#3321)
  • Third-party session isolation — --source flag for isolating sessions by origin (#3255)
  • Add /resume CLI handler, session log truncation guard, reopen_session API (#3315)
  • Clear compressor summary and turn counter on /clear and /new (#3102)
  • Surface silent SessionDB failures that cause session data loss (#2999)
  • Session search fallback preview on summarization failure (#3478)
  • Prevent stale memory overwrites by flush agent (#2687)

Context Compression

  • Replace dead summary_target_tokens with ratio-based scaling (#2554)
  • Expose compression.target_ratio, protect_last_n, and threshold in...

Hermes Agent v0.4.0 (v2026.3.23)

24 Mar 05:34
v2026.3.23
8416bc2



Release Date: March 23, 2026

The platform expansion release — OpenAI-compatible API server, 6 new messaging adapters, 4 new inference providers, MCP server management with OAuth 2.1, @ context references, gateway prompt caching, streaming enabled by default, and a sweeping reliability pass with 200+ bug fixes.


✨ Highlights

  • OpenAI-compatible API server — Expose Hermes as a /v1/chat/completions endpoint with a new /api/jobs REST API for cron job management, hardened with input limits, field whitelists, SQLite-backed response persistence, and CORS origin protection (#1756, #2450, #2456, #2451, #2472)

  • 6 new messaging platform adapters — Signal, DingTalk, SMS (Twilio), Mattermost, Matrix, and Webhook adapters join Telegram, Discord, and WhatsApp. Gateway auto-reconnects failed platforms with exponential backoff (#2206, #1685, #1688, #1683, #2166, #2584)

  • @ context references — Claude Code-style @file and @url context injection with tab completions in the CLI (#2343, #2482)

  • 4 new inference providers — GitHub Copilot (OAuth + token validation), Alibaba Cloud / DashScope, Kilo Code, and OpenCode Zen/Go (#1924, #1879 by @mchzimm, #1673, #1666, #1650)

  • MCP server management CLI — hermes mcp commands for installing, configuring, and authenticating MCP servers with full OAuth 2.1 PKCE flow (#2465)

  • Gateway prompt caching — Cache AIAgent instances per session, preserving Anthropic prompt cache across turns for dramatic cost reduction on long conversations (#2282, #2284, #2361)

  • Context compression overhaul — Structured summaries with iterative updates, token-budget tail protection, configurable summary endpoint, and fallback model support (#2323, #1727, #2224)

  • Streaming enabled by default — CLI streaming on by default with proper spinner/tool progress display during streaming mode, plus extensive linebreak and concatenation fixes (#2340, #2161, #2258)
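The OAuth 2.1 PKCE flow behind the MCP management highlight rests on a spec-defined primitive (RFC 7636): the client keeps a random code_verifier and sends only its S256 challenge in the authorization request. A self-contained sketch (the function name is an assumption; the construction is the standard one):

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Generate an RFC 7636 verifier/challenge pair.

    The verifier is a high-entropy random string the client keeps private;
    the challenge is base64url(SHA-256(verifier)) with padding stripped,
    and is all the authorization server ever sees up front."""
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

At token exchange the client reveals the verifier, and the server recomputes the challenge to confirm the same party started and finished the flow.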


🖥️ CLI & User Experience

New Commands & Interactions

  • @ context completions — Tab-completable @file/@url references that inject file content or web pages into the conversation (#2482, #2343)
  • /statusbar — Toggle a persistent config bar showing model + provider info in the prompt (#2240, #1917)
  • /queue — Queue prompts for the agent without interrupting the current run (#2191, #2469)
  • /permission — Switch approval mode dynamically during a session (#2207)
  • /browser — Interactive browser sessions from the CLI (#2273, #1814)
  • /cost — Live pricing and usage tracking in gateway mode (#2180)
  • /approve and /deny — Replaced bare text approval in gateway with explicit commands (#2002)

Streaming & Display

  • Streaming enabled by default in CLI (#2340)
  • Show spinners and tool progress during streaming mode (#2161)
  • Show reasoning/thinking blocks when show_reasoning enabled (#2118)
  • Context pressure warnings for CLI and gateway (#2159)
  • Fix: streaming chunks concatenated without whitespace (#2258)
  • Fix: iteration boundary linebreak prevents stream concatenation (#2413)
  • Fix: defer streaming linebreak to prevent blank line stacking (#2473)
  • Fix: suppress spinner animation in non-TTY environments (#2216)
  • Fix: display provider and endpoint in API error messages (#2266)
  • Fix: resolve garbled ANSI escape codes in status printouts (#2448)
  • Fix: update gold ANSI color to true-color format (#2246)
  • Fix: normalize toolset labels and use skin colors in banner (#1912)

CLI Polish

  • Fix: prevent 'Press ENTER to continue...' on exit (#2555)
  • Fix: flush stdout during agent loop to prevent macOS display freeze (#1654)
  • Fix: show human-readable error when hermes setup hits permissions error (#2196)
  • Fix: /stop command crash + UnboundLocalError in streaming media delivery (#2463)
  • Fix: allow custom/local endpoints without API key (#2556)
  • Fix: Kitty keyboard protocol Shift+Enter for Ghostty/WezTerm (attempted + reverted due to prompt_toolkit crash) (#2345, #2349)

Configuration

  • ${ENV_VAR} substitution in config.yaml (#2684)
  • Real-time config reload — config.yaml changes apply without restart (#2210)
  • custom_models.yaml for user-managed model additions (#2214)
  • Priority-based context file selection + CLAUDE.md support (#2301)
  • Merge nested YAML sections instead of replacing on config update (#2213)
  • Fix: config.yaml provider key overrides env var silently (#2272)
  • Fix: log warning instead of silently swallowing config.yaml errors (#2683)
  • Fix: disabled toolsets re-enable themselves after hermes tools (#2268)
  • Fix: platform default toolsets silently override tool deselection (#2624)
  • Fix: honor bare YAML approvals.mode: off (#2620)
  • Fix: hermes update use .[all] extras with fallback (#1728)
  • Fix: hermes update prompt before resetting working tree on stash conflicts (#2390)
  • Fix: use git pull --rebase in update/install to avoid divergent branch error (#2274)
  • Fix: add zprofile fallback and create zshrc on fresh macOS installs (#2320)
  • Fix: remove ANTHROPIC_BASE_URL env var to avoid collisions (#1675)
  • Fix: don't ask IMAP password if already in keyring or env (#2212)
  • Fix: OpenCode Zen/Go show OpenRouter models instead of their own (#2277)
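The ${ENV_VAR} substitution listed above can be sketched with a small regex expander; how the real loader treats undefined variables is an assumption here (this version raises):

```python
import os
import re

_VAR = re.compile(r"\$\{([A-Z_][A-Z0-9_]*)\}")

def expand_env(value: str) -> str:
    """Expand ${NAME} references in a config string from the environment.
    Raising on an undefined variable is a design choice assumed for the
    sketch; a real loader might warn or leave the reference intact."""
    def sub(match):
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"config references undefined env var {name}")
        return os.environ[name]
    return _VAR.sub(sub, value)
```

Applied to each string value after YAML parsing, this lets config.yaml reference secrets without storing them on disk.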

🏗️ Core Agent & Architecture

New Providers

  • GitHub Copilot — Full OAuth auth, API routing, token validation, and 400k context. (#1924, #1896, #1879 ...

Hermes Agent v0.3.0 (v2026.3.17)

17 Mar 07:56
6ebb816



Release Date: March 17, 2026

The streaming, plugins, and provider release — unified real-time token delivery, first-class plugin architecture, rebuilt provider system with Vercel AI Gateway, native Anthropic provider, smart approvals, live Chrome CDP browser connect, ACP IDE integration, Honcho memory, voice mode, persistent shell, and 50+ bug fixes across every platform.


✨ Highlights

  • Unified Streaming Infrastructure — Real-time token-by-token delivery in CLI and all gateway platforms. Responses stream as they're generated instead of arriving as a block. (#1538)

  • First-Class Plugin Architecture — Drop Python files into ~/.hermes/plugins/ to extend Hermes with custom tools, commands, and hooks. No forking required. (#1544, #1555)

  • Native Anthropic Provider — Direct Anthropic API calls with Claude Code credential auto-discovery, OAuth PKCE flows, and native prompt caching. No OpenRouter middleman needed. (#1097)

  • Smart Approvals + /stop Command — Codex-inspired approval system that learns which commands are safe and remembers your preferences. /stop kills the current agent run immediately. (#1543)

  • Honcho Memory Integration — Async memory writes, configurable recall modes, session title integration, and multi-user isolation in gateway mode. By @erosika. (#736)

  • Voice Mode — Push-to-talk in CLI, voice notes in Telegram/Discord, Discord voice channel support, and local Whisper transcription via faster-whisper. (#1299, #1185, #1429)

  • Concurrent Tool Execution — Multiple independent tool calls now run in parallel via ThreadPoolExecutor, significantly reducing latency for multi-tool turns. (#1152)

  • PII Redaction — When privacy.redact_pii is enabled, personally identifiable information is automatically scrubbed before sending context to LLM providers. (#1542)

  • /browser connect via CDP — Attach browser tools to a live Chrome instance through Chrome DevTools Protocol. Debug, inspect, and interact with pages you already have open. (#1549)

  • Vercel AI Gateway Provider — Route Hermes through Vercel's AI Gateway for access to their model catalog and infrastructure. (#1628)

  • Centralized Provider Router — Rebuilt provider system with call_llm API, unified /model command, auto-detect provider on model switch, and direct endpoint overrides for auxiliary/delegation clients. (#1003, #1506, #1375)

  • ACP Server (IDE Integration) — VS Code, Zed, and JetBrains can now connect to Hermes as an agent backend, with full slash command support. (#1254, #1532)

  • Persistent Shell Mode — Local and SSH terminal backends can maintain shell state across tool calls — cd, env vars, and aliases persist. By @alt-glitch. (#1067, #1483)

  • Agentic On-Policy Distillation (OPD) — New RL training environment for distilling agent policies, expanding the Atropos training ecosystem. (#1149)
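The concurrent tool execution highlight above maps naturally onto a thread pool; this sketch (function name and max_workers are assumptions) submits independent calls and collects results in submission order:

```python
from concurrent.futures import ThreadPoolExecutor

def run_tools_parallel(tool_calls):
    """Run independent tool calls concurrently.

    `tool_calls` is a list of (function, args) pairs. Results come back in
    submission order, so the transcript stays deterministic even though the
    calls overlap in time."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(fn, *args) for fn, args in tool_calls]
        return [f.result() for f in futures]
```

Three tool calls that each block on I/O for 50 ms finish in roughly 50 ms total instead of 150 ms serially.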


🏗️ Core Agent & Architecture

Provider & Model Support

  • Centralized provider router with call_llm API and unified /model command — switch models and providers seamlessly (#1003)
  • Vercel AI Gateway provider support (#1628)
  • Auto-detect provider when switching models via /model (#1506)
  • Direct endpoint overrides for auxiliary and delegation clients — point vision/subagent calls at specific endpoints (#1375)
  • Native Anthropic auxiliary vision — use Claude's native vision API instead of routing through OpenAI-compatible endpoints (#1377)
  • Anthropic OAuth flow improvements — auto-run claude setup-token, reauthentication, PKCE state persistence, identity fingerprinting (#1132, #1360, #1396, #1597)
  • Fix adaptive thinking without budget_tokens for Claude 4.6 models — by @ASRagab (#1128)
  • Fix Anthropic cache markers through adapter — by @brandtcormorant (#1216)
  • Retry Anthropic 429/529 errors and surface details to users — by @0xbyt4 (#1585)
  • Fix Anthropic adapter max_tokens, fallback crash, proxy base_url — by @0xbyt4 (#1121)
  • Fix DeepSeek V3 parser dropping multiple parallel tool calls — by @mr-emmett-one (#1365, #1300)
  • Accept unlisted models with warning instead of rejecting (#1047, #1102)
  • Skip reasoning params for unsupported OpenRouter models (#1485)
  • MiniMax Anthropic API compatibility fix (#1623)
  • Custom endpoint /models verification and /v1 base URL suggestion (#1480)
  • Resolve delegation providers from custom_providers config (#1328)
  • Kimi model additions and User-Agent fix (#1039)
  • Strip call_id/response_item_id for Mistral compatibility (#1058)

Agent Loop & Conversation

  • Anthropic Context Editing API support (#1147)
  • Improved context compaction handoff summaries — compressor now preserves more actionable state (#1273)
  • Sync session_id after mid-run context compression (#1160)
  • Session hygiene threshold tuned to 50% for more proactive compression (#1096, #1161)
  • Include session ID in system prompt via --pass-session-id flag (#1040)
  • Prevent closed OpenAI client reuse across retries (#1391)
  • Sanitize chat payloads and provider precedence (#1253)
  • Handle dict tool call arguments from Codex and local backends (#1393, #1440)

Memory & Sessions

  • Improve memory prioritization — user preferences and corrections weighted above procedural knowledge (#1548)
  • Tighter memory and session recall guidance in system prompts (#1329)
  • Persist CLI token counts to session DB for /insights (#1498)
  • Keep Honcho recall out of the cached system prefix (#1201)
  • Correct seed_ai_identity to use session.add_messages() (#1475)
  • Isolate Honcho session routing for multi-user gateway (#1500)

📱 Messaging Platforms (Gateway)

Gateway Core

  • System gateway service mode — run as a system-level systemd service, not just user-level (#1371)
  • Gateway install scope prompts — choose user vs system scope during setup (#1374)
  • Reasoning hot reload — change reasoning settings without restarting the gateway (#1275)
  • Default group sessions to per-user isolation — no more shared state across users in group chats (#1495, #1417)
  • Harden gateway restart recovery (#1310)
  • Cancel active...

Hermes Agent v0.2.0 (v2026.3.12)

12 Mar 10:07
a370ab8



Release Date: March 12, 2026

First tagged release since v0.1.0 (the initial pre-public foundation). In just over two weeks, Hermes Agent went from a small internal project to a full-featured AI agent platform — thanks to an explosion of community contributions. This release covers 216 merged pull requests from 63 contributors, resolving 119 issues.


✨ Highlights

  • Multi-Platform Messaging Gateway — Telegram, Discord, Slack, WhatsApp, Signal, Email (IMAP/SMTP), and Home Assistant platforms with unified session management, media attachments, and per-platform tool configuration.

  • MCP (Model Context Protocol) Client — Native MCP support with stdio and HTTP transports, reconnection, resource/prompt discovery, and sampling (server-initiated LLM requests). (#291 by @0xbyt4, #301, #753)

  • Skills Ecosystem — 70+ bundled and optional skills across 15+ categories with a Skills Hub for community discovery, per-platform enable/disable, conditional activation based on tool availability, and prerequisite validation. (#743@teyrebaz33, #785@teyrebaz33)

  • Centralized Provider Router — Unified call_llm()/async_call_llm() API replaces scattered provider logic across vision, summarization, compression, and trajectory saving. All auxiliary consumers route through a single code path with automatic credential resolution. (#1003)

  • ACP Server — VS Code, Zed, and JetBrains editor integration via the Agent Communication Protocol standard. (#949)

  • CLI Skin/Theme Engine — Data-driven visual customization: banners, spinners, colors, branding. 7 built-in skins + custom YAML skins.

  • Git Worktree Isolation — hermes -w launches isolated agent sessions in git worktrees for safe parallel work on the same repo. (#654)

  • Filesystem Checkpoints & Rollback — Automatic snapshots before destructive operations with /rollback to restore. (#824)

  • 3,289 Tests — From near-zero test coverage to a comprehensive test suite covering agent, gateway, tools, cron, and CLI.


🏗️ Core Agent & Architecture

Provider & Model Support

  • Centralized provider router with resolve_provider_client() + call_llm() API (#1003)
  • Nous Portal as first-class provider in setup (#644)
  • OpenAI Codex (Responses API) with ChatGPT subscription support (#43) — @grp06
  • Codex OAuth vision support + multimodal content adapter
  • Validate /model against live API instead of hardcoded lists
  • Self-hosted Firecrawl support (#460) — @caentzminger
  • Kimi Code API support (#635) — @christomitov
  • MiniMax model ID update (#473) — @tars90percent
  • OpenRouter provider routing configuration (provider_preferences)
  • Nous credential refresh on 401 errors (#571, #269) — @rewbs
  • z.ai/GLM, Kimi/Moonshot, MiniMax, Azure OpenAI as first-class providers
  • Unified /model and /provider into single view
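
One way to read the centralized router (#1003) is a registry-plus-dispatch shape. This is a hedged sketch: the registry, the ProviderClient class, and the signatures are illustrative assumptions, not the actual Hermes implementation of resolve_provider_client()/call_llm():

```python
# Hypothetical sketch of a centralized provider router; names and signatures
# are assumptions modeled on the release notes, not Hermes source.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProviderClient:
    name: str
    send: Callable[[list[dict]], str]

_REGISTRY: dict[str, ProviderClient] = {}

def register_provider(client: ProviderClient) -> None:
    _REGISTRY[client.name] = client

def resolve_provider_client(provider: str) -> ProviderClient:
    if provider not in _REGISTRY:
        raise ValueError(f"unknown provider: {provider}")
    return _REGISTRY[provider]

def call_llm(provider: str, messages: list[dict]) -> str:
    """Single entry point all auxiliary consumers route through."""
    return resolve_provider_client(provider).send(messages)

# usage: a trivial echo provider
register_provider(ProviderClient("echo", lambda msgs: msgs[-1]["content"]))
print(call_llm("echo", [{"role": "user", "content": "hi"}]))  # → hi
```

The point of the single code path is that vision, summarization, compression, and trajectory saving all hit the same resolution logic instead of re-implementing it.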

Agent Loop & Conversation

  • Simple fallback model for provider resilience (#740)
  • Shared iteration budget across parent + subagent delegation
  • Iteration budget pressure via tool result injection
  • Configurable subagent provider/model with full credential resolution
  • Handle 413 payload-too-large via compression instead of aborting (#153) — @tekelala
  • Retry with rebuilt payload after compression (#616) — @tripledoublev
  • Auto-compress pathologically large gateway sessions (#628)
  • Tool call repair middleware — auto-lowercases tool names and handles invalid tool calls
  • Reasoning effort configuration and /reasoning command (#921)
  • Detect and block file re-read/search loops after context compression (#705) — @0xbyt4
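
The 413 handling above (#153, #616) can be sketched as compress-then-retry. The exception type and compress_history helper below are stand-ins for illustration, not Hermes code:

```python
class PayloadTooLarge(Exception):
    """Stand-in for an HTTP 413 payload-too-large error from the provider."""

def compress_history(messages: list[str]) -> list[str]:
    # Illustrative: keep the first and last messages, summarize the middle.
    if len(messages) <= 2:
        return messages
    summary = f"[summary of {len(messages) - 2} earlier messages]"
    return [messages[0], summary, messages[-1]]

def send_with_compression(send, messages, max_retries: int = 2):
    """Retry with a rebuilt, compressed payload instead of aborting on 413."""
    for _ in range(max_retries + 1):
        try:
            return send(messages)
        except PayloadTooLarge:
            compressed = compress_history(messages)
            if compressed == messages:  # nothing left to shrink
                raise
            messages = compressed
    raise PayloadTooLarge("still too large after retries")

# usage: a fake provider that rejects payloads longer than 3 messages
def fake_send(msgs):
    if len(msgs) > 3:
        raise PayloadTooLarge()
    return "ok"

print(send_with_compression(fake_send, ["sys", "a", "b", "c", "d"]))  # → ok
```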

Session & Memory

  • Session naming with unique titles, auto-lineage, rich listing, and resume by name (#720)
  • Interactive session browser with search filtering (#733)
  • Display previous messages when resuming a session (#734)
  • Honcho AI-native cross-session user modeling (#38) — @erosika
  • Proactive async memory flush on session expiry
  • Smart context length probing with persistent caching + banner display
  • /resume command for switching to named sessions in gateway
  • Session reset policy for messaging platforms
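
The context-length probing with persistent caching could work roughly like the largest-first probe below; the candidate ladder, cache shape, and function names are assumptions for illustration:

```python
# Hedged sketch: probe a model's usable context length once, persist the
# answer, and serve it from cache on later runs. Not Hermes source.
import json
import os
import tempfile

def probe_context_length(accepts, candidates=(4096, 8192, 32768, 131072)) -> int:
    """Return the largest candidate context length the model accepts."""
    for n in sorted(candidates, reverse=True):
        if accepts(n):
            return n
    raise RuntimeError("no probed context length accepted")

def cached_context_length(model: str, accepts, cache_path: str) -> int:
    """Probe once, then reuse the persisted answer on subsequent calls."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
    if model not in cache:
        cache[model] = probe_context_length(accepts)
        with open(cache_path, "w") as f:
            json.dump(cache, f)
    return cache[model]

# usage: pretend the model rejects anything above 32k tokens
path = os.path.join(tempfile.mkdtemp(), "ctx.json")
print(cached_context_length("demo-model", lambda n: n <= 32768, path))  # → 32768
```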

📱 Messaging Platforms (Gateway)

Telegram

  • Native file attachments: send_document + send_video
  • Document file processing for PDF, text, and Office files — @tekelala
  • Forum topic session isolation (#766) — @spanishflu-est1918
  • Browser screenshot sharing via MEDIA: protocol (#657)
  • Location support for find-nearby skill
  • TTS voice message accumulation fix (#176) — @Bartok9
  • Improved error handling and logging (#763) — @aydnOktay
  • Italic regex newline fix + 43 format tests (#204) — @0xbyt4
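
The MEDIA: screenshot sharing above (#657) suggests a simple line protocol: output lines prefixed with MEDIA: carry an attachment path. The parser below is a guess at that shape, not the actual gateway code:

```python
# Hypothetical MEDIA: line protocol parser; the prefix convention is taken
# from the release notes, everything else is an assumption.
def split_media_lines(output: str) -> tuple[str, list[str]]:
    """Separate MEDIA: attachment paths from the plain text of a tool reply."""
    text, media = [], []
    for line in output.splitlines():
        if line.startswith("MEDIA:"):
            media.append(line[len("MEDIA:"):].strip())
        else:
            text.append(line)
    return "\n".join(text), media

reply = "Here is the page:\nMEDIA: /tmp/screenshot.png"
body, attachments = split_media_lines(reply)
print(attachments)  # → ['/tmp/screenshot.png']
```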

Discord

  • Channel topic included in session context (#248) — @Bartok9
  • DISCORD_ALLOW_BOTS config for bot message filtering (#758)
  • Document and video support (#784)
  • Improved error handling and logging (#761) — @aydnOktay

Slack

  • Fix 404 on app_mention events + document/video support (#784)
  • Structured logging replacing print statements — @aydnOktay

WhatsApp

Signal

  • Full Signal messenger gateway via signal-cli-rest-api (#405)
  • Media URL support in message events (#871)

Email (IMAP/SMTP)

  • New email gateway platform — @0xbyt4

Home Assistant

  • REST tools + WebSocket gateway integration (#184) — @0xbyt4
  • Service discovery and enhanced setup
  • Toolset mapping fix (#538) — @Himess

Gateway Core

  • Expose subagent tool calls and thinking to users (#186) — @cutepawss
  • Configurable background process watcher notifications (#840)
  • edit_message() for Telegram/Discord/Slack with fallback
  • /compress, /usage, /update slash commands
  • Eliminated 3x SQLite message duplication in gateway sessions (#873)
  • Stabilize system prompt across gateway turns for cache hits (#754)
  • MCP server shutdown on gateway exit (#796) — @0xbyt4
  • Pass session_db to AIAgent, fixing session_search error (#108) — @Bartok9
  • Persist transcript changes in /retry, /undo; fix /reset attribute (#217) — @Farukest
  • UTF-8 encoding fix preventing Windows crashes (#369) — @ch3ronsa
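
The edit_message() fallback above likely reduces to try-edit-else-send; the platform interface here is invented for illustration and is not the Hermes gateway API:

```python
# Hedged sketch of edit-with-fallback: platforms that cannot edit a message
# (or fail mid-edit) fall back to sending a fresh one.
class EditUnsupported(Exception):
    pass

def edit_or_send(platform, chat_id, message_id, text):
    """Edit in place where the platform allows it, otherwise send anew."""
    try:
        return platform.edit_message(chat_id, message_id, text)
    except (AttributeError, EditUnsupported):
        return platform.send_message(chat_id, text)

class NoEditPlatform:
    """A platform lacking edit support entirely."""
    def send_message(self, chat_id, text):
        return f"sent:{text}"

print(edit_or_send(NoEditPlatform(), 1, 42, "updated"))  # → sent:updated
```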

🖥️ CLI & User Experience

Interactive CLI

  • Data-driven skin/theme engine — 7 built-in skins (default, ares, mono, slate, poseidon, sisyphus, charizard) + custom YAML skins
  • /personality command with custom personality + disable support (#773) — @teyrebaz33
  • User-defined quick commands that bypass the agent loop (#746) — @teyrebaz33
  • /reasoning command for effort level and display toggle (#921)
  • /verbose slash command to toggle debug at runtime ([#94](https://github.com/Nou...