Releases: NousResearch/hermes-agent

Hermes Agent v0.7.0 (v2026.4.3)

03 Apr 18:15
v2026.4.3
abf1e98



Release Date: April 3, 2026

The resilience release — pluggable memory providers, credential pool rotation, Camoufox anti-detection browser, inline diff previews, gateway hardening across race conditions and approval routing, and deep security fixes across 168 PRs and 46 resolved issues.


✨ Highlights

  • Pluggable Memory Provider Interface — Memory is now an extensible plugin system. Third-party memory backends (Honcho, vector stores, custom DBs) implement a simple provider ABC and register via the plugin system. Built-in memory is the default provider. Honcho integration restored to full parity as the reference plugin with profile-scoped host/peer resolution. (#4623, #4616, #4355)

  • Same-Provider Credential Pools — Configure multiple API keys for the same provider with automatic rotation. Thread-safe least_used strategy distributes load across keys, and 401 failures trigger automatic rotation to the next credential. Set up via the setup wizard or credential_pool config. (#4188, #4300, #4361)

  • Camoufox Anti-Detection Browser Backend — New local browser backend using Camoufox for stealth browsing. Persistent sessions with VNC URL discovery for visual debugging, configurable SSRF bypass for local backends, auto-install via hermes tools. (#4008, #4419, #4292)

  • Inline Diff Previews — File write and patch operations now show inline diffs in the tool activity feed, giving you visual confirmation of what changed before the agent moves on. (#4411, #4423)

  • API Server Session Continuity & Tool Streaming — The API server (Open WebUI integration) now streams tool progress events in real-time and supports X-Hermes-Session-Id headers for persistent sessions across requests. Sessions persist to the shared SessionDB. (#4092, #4478, #4802)

  • ACP: Client-Provided MCP Servers — Editor integrations (VS Code, Zed, JetBrains) can now register their own MCP servers, which Hermes picks up as additional agent tools. Your editor's MCP ecosystem flows directly into the agent. (#4705)

  • Gateway Hardening — Major stability pass across race conditions, photo media delivery, flood control, stuck sessions, approval routing, and compression death spirals. The gateway is substantially more reliable in production. (#4727, #4750, #4798, #4557)

  • Security: Secret Exfiltration Blocking — Browser URLs and LLM responses are now scanned for secret patterns, blocking exfiltration attempts via URL encoding, base64, or prompt injection. Credential directory protections expanded to .docker, .azure, .config/gh. execute_code sandbox output is redacted. (#4483, #4360, #4305, #4327)
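As an illustration of the exfiltration checks above, a minimal scanner might look for known secret shapes in a URL and in its decoded variants. Everything here (the pattern set, function names, decoding heuristics) is a hypothetical sketch, not Hermes' actual implementation:

```python
import base64
import re
import urllib.parse

# Illustrative secret shapes only; a real scanner would carry a much
# larger, maintained pattern set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),   # OpenAI-style API keys
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # GitHub personal access tokens
]

def _decoded_views(url: str) -> list[str]:
    """Return the URL plus decoded variants a secret might be hidden in."""
    views = [url, urllib.parse.unquote(url)]
    # Try base64url-decoding each long token; skip anything that isn't base64.
    for token in re.findall(r"[A-Za-z0-9_-]{16,}", url):
        try:
            padded = token + "=" * (-len(token) % 4)
            views.append(base64.urlsafe_b64decode(padded).decode("utf-8", "ignore"))
        except Exception:
            pass
    return views

def url_leaks_secret(url: str) -> bool:
    """True if any view of the URL contains a known secret pattern."""
    return any(p.search(view) for p in SECRET_PATTERNS for view in _decoded_views(url))
```

A URL that carries a key verbatim, percent-encoded, or base64-encoded would all trip the same check.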


🏗️ Core Agent & Architecture

Provider & Model Support

  • Same-provider credential pools — configure multiple API keys with automatic least_used rotation and 401 failover (#4188, #4300)
  • Credential pool preserved through smart routing — pool state survives fallback provider switches and defers eager fallback on 429 (#4361)
  • Per-turn primary runtime restoration — after fallback provider use, the agent automatically restores the primary provider on the next turn with transport recovery (#4624)
  • developer role for GPT-5 and Codex models — uses OpenAI's recommended system message role for newer models (#4498)
  • Google model operational guidance — Gemini and Gemma models get provider-specific prompting guidance (#4641)
  • Anthropic long-context tier 429 handling — automatically reduces context to 200k when hitting tier limits (#4747)
  • URL-based auth for third-party Anthropic endpoints + CI test fixes (#4148)
  • Bearer auth for MiniMax Anthropic endpoints (#4028)
  • Fireworks context length detection (#4158)
  • Standard DashScope international endpoint for Alibaba provider (#4133, closes #3912)
  • Custom providers context_length honored in hygiene compression (#4085)
  • Non-sk-ant keys treated as regular API keys, not OAuth tokens (#4093)
  • claude-sonnet-4.6 added to OpenRouter and Nous model lists (#4157)
  • Qwen 3.6 Plus Preview added to model lists (#4376)
  • MiniMax M2.7 added to hermes model picker and OpenCode (#4208)
  • Auto-detect models from server probe in custom endpoint setup (#4218)
  • Config.yaml single source of truth for endpoint URLs — no more env var vs config.yaml conflicts (#4165)
  • Setup wizard no longer overwrites custom endpoint config (#4180, closes #4172)
  • Unified setup wizard provider selection with hermes model — single code path for both flows (#4200)
  • Root-level provider config no longer overrides model.provider (#4329)
  • Rate-limit pairing rejection messages to prevent spam (#4081)
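The least_used rotation described above can be sketched as a small thread-safe pool. Class and method names (CredentialPool, acquire, mark_failed) are illustrative assumptions, not the real credential_pool API:

```python
import threading

class CredentialPool:
    """Sketch of least_used credential rotation: hand out the key with the
    fewest uses so far, and retire keys that fail (e.g. on HTTP 401)."""

    def __init__(self, api_keys):
        self._lock = threading.Lock()           # safe across agent threads
        self._uses = {key: 0 for key in api_keys}
        self._failed = set()

    def acquire(self) -> str:
        with self._lock:
            live = {k: n for k, n in self._uses.items() if k not in self._failed}
            if not live:
                raise RuntimeError("all credentials in the pool have failed")
            key = min(live, key=live.get)       # least_used: fewest uses wins
            self._uses[key] += 1
            return key

    def mark_failed(self, key: str) -> None:
        with self._lock:
            self._failed.add(key)               # e.g. after a 401 response
```

Under this scheme two keys alternate under load, and a 401 on one key simply removes it from rotation.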

Agent Loop & Conversation

  • Preserve Anthropic thinking block signatures across tool-use turns (#4626)
  • Classify think-only empty responses before retrying — prevents infinite retry loops on models that produce thinking blocks without content (#4645)
  • Prevent compression death spiral from API disconnects — stops the loop where compression triggers, fails, compresses again (#4750, closes #2153)
  • Persist compressed context to gateway session after mid-run compression (#4095)
  • Context-exceeded error messages now include actionable guidance (#4155, closes #4061)
  • Strip orphaned think/reasoning tags from user-facing responses (#4311, closes #4285)
  • Harden Codex responses preflight and stream error handling (#4313)
  • Deterministic call_id fallbacks instead of random UUIDs for prompt cache consistency (#3991)
  • Context pressure warning spam prevented after compression (#4012)
  • AsyncOpenAI created lazily in trajectory compressor to avoid closed event loop errors (#4013)
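The deterministic call_id fallback above (#3991) can be illustrated by hashing the stable parts of a tool call instead of minting a UUID; the exact fields Hermes hashes are not specified here, so this is a sketch with assumed names:

```python
import hashlib
import json

def fallback_call_id(tool_name: str, arguments: dict, turn_index: int) -> str:
    """Derive a stable call_id from the call itself, so retries and replays
    produce identical transcripts and keep provider prompt caches warm.
    The field choice (tool, args, turn) is illustrative."""
    payload = json.dumps(
        {"tool": tool_name, "args": arguments, "turn": turn_index},
        sort_keys=True,                 # canonical key order => stable hash
    )
    return "call_" + hashlib.sha256(payload.encode()).hexdigest()[:24]
```

The same call always yields the same id, while any change to the arguments or position yields a new one.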

Memory & Sessions

  • Pluggable memory provider interface — ABC-based plugin system for custom memory backends with profile isolation (#4623)
  • Honcho full integration parity restored as reference memory provider plugin (#4355) — @erosika
  • Honcho profile-scoped host and peer resolution (#4616)
  • Memory flush state persisted to prevent redundant re-flushes on gateway restart (#4481)
  • Memory provider tools routed through sequential execution path (#4803)

Hermes Agent v0.6.0 (v2026.3.30)

30 Mar 15:30
v2026.3.30
e08778f



Release Date: March 30, 2026

The multi-instance release — Profiles for running isolated agent instances, MCP server mode, Docker container, fallback provider chains, two new messaging platforms (Feishu/Lark and WeCom), Telegram webhook mode, Slack multi-workspace OAuth, 95 PRs and 16 resolved issues in 2 days.


✨ Highlights

  • Profiles — Multi-Instance Hermes — Run multiple isolated Hermes instances from the same installation. Each profile gets its own config, memory, sessions, skills, and gateway service. Create with hermes profile create, switch with hermes -p <name>, export/import for sharing. Full token-lock isolation prevents two profiles from using the same bot credential. (#3681)

  • MCP Server Mode — Expose Hermes conversations and sessions to any MCP-compatible client (Claude Desktop, Cursor, VS Code, etc.) via hermes mcp serve. Browse conversations, read messages, search across sessions, and manage attachments — all through the Model Context Protocol. Supports both stdio and Streamable HTTP transports. (#3795)

  • Docker Container — Official Dockerfile for running Hermes Agent in a container. Supports both CLI and gateway modes with volume-mounted config. (#3668, closes #850)

  • Ordered Fallback Provider Chain — Configure multiple inference providers with automatic failover. When your primary provider returns errors or is unreachable, Hermes automatically tries the next provider in the chain. Configure via fallback_providers in config.yaml. (#3813, closes #1734)

  • Feishu/Lark Platform Support — Full gateway adapter for Feishu (飞书) and Lark with event subscriptions, message cards, group chat, image/file attachments, and interactive card callbacks. (#3799, #3817, closes #1788)

  • WeCom (Enterprise WeChat) Platform Support — New gateway adapter for WeCom (企业微信) with text/image/voice messages, group chats, and callback verification. (#3847)

  • Slack Multi-Workspace OAuth — Connect a single Hermes gateway to multiple Slack workspaces via OAuth token file. Each workspace gets its own bot token, resolved dynamically per incoming event. (#3903)

  • Telegram Webhook Mode & Group Controls — Run the Telegram adapter in webhook mode as an alternative to polling — faster response times and better for production deployments behind a reverse proxy. New group mention gating controls when the bot responds: always, only when @mentioned, or via regex triggers. (#3880, #3870)

  • Exa Search Backend — Add Exa as an alternative web search and content extraction backend alongside Firecrawl and DuckDuckGo. Set EXA_API_KEY and configure as preferred backend. (#3648)

  • Skills & Credentials on Remote Backends — Mount skill directories and credential files into Modal and Docker containers, so remote terminal sessions have access to the same skills and secrets as local execution. (#3890, #3671, closes #3665, #3433)


🏗️ Core Agent & Architecture

Provider & Model Support

  • Ordered fallback provider chain — automatic failover across multiple configured providers (#3813)
  • Fix api_mode on provider switch — switching providers via hermes model now correctly clears stale api_mode instead of hardcoding chat_completions, fixing 404s for providers with Anthropic-compatible endpoints (#3726, #3857, closes #3685)
  • Stop silent OpenRouter fallback — when no provider is configured, Hermes now raises a clear error instead of silently routing to OpenRouter (#3807, #3862)
  • Gemini 3.1 preview models — added to OpenRouter and Nous Portal catalogs (#3803, closes #3753)
  • Gemini direct API context length — full context length resolution for direct Google AI endpoints (#3876)
  • gpt-5.4-mini added to Codex fallback catalog (#3855)
  • Curated model lists preferred over live API probe when the probe returns fewer models (#3856, #3867)
  • User-friendly 429 rate limit messages with Retry-After countdown (#3809)
  • Auxiliary client placeholder key for local servers without auth requirements (#3842)
  • INFO-level logging for auxiliary provider resolution (#3866)
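The ordered fallback chain above can be sketched as a loop over configured providers; `call` here stands in for the real per-provider client, so this is an assumption-laden sketch rather than Hermes' code:

```python
def call_with_fallback(prompt, providers, call):
    """Try each provider in configured order; fail over on any error and
    raise only when the whole chain is exhausted."""
    errors = []
    for provider in providers:
        try:
            return call(provider, prompt)
        except Exception as exc:        # e.g. 5xx, timeout, unreachable host
            errors.append(f"{provider}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

With `fallback_providers: [primary, backup]`, an unreachable primary transparently hands the same prompt to the backup.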

Agent Loop & Conversation

  • Subagent status reporting — reports completed status when summary exists instead of generic failure (#3829)
  • Session log file updated during compression — prevents stale file references after context compression (#3835)
  • Omit empty tools param — sends no tools parameter when empty instead of None, fixing compatibility with strict providers (#3820)

Profiles & Multi-Instance

  • Profiles system — hermes profile create/list/switch/delete/export/import/rename. Each profile gets isolated HERMES_HOME, gateway service, CLI wrapper. Token locks prevent credential collisions. Tab completion for profile names. (#3681)
  • Profile-aware display paths — all user-facing ~/.hermes paths replaced with display_hermes_home() to show the correct profile directory (#3623)
  • Lazy display_hermes_home imports — prevents ImportError during hermes update when modules cache stale bytecode (#3776)
  • HERMES_HOME for protected paths — .env write-deny path now respects HERMES_HOME instead of hardcoded ~/.hermes (#3840)

📱 Messaging Platforms (Gateway)

New Platforms

  • Feishu/Lark — Full adapter with event subscriptions, message cards, group chat, image/file attachments, interactive card callbacks (#3799, #3817)
  • WeCom (Enterprise WeChat) — Text/image/voice messages, group chats, callback verification (#3847)

Telegram

  • Webhook mode — run as webhook endpoint instead of polling for production deployments (#3880)
  • Group mention gating & regex triggers — configurable bot response behavior in groups: always, @mention-only, or regex-matched (#3870)
  • Gracefully handle deleted reply targets — no more crashes when the message being replied to was deleted (#3858, closes #3229)

Discord

  • Message processing reactions — adds a reaction emoji while processing and removes it when done, giving visual feedback in channels (#3871)
  • DISCORD_IGNORE_NO_MENTION — skip messages that @mention other users/bots but not Hermes (#3640)
  • Clean up deferred "thinking..." — properly removes the "thinking..." indicator after slash commands complete (#3674, closes #3595)

Slack

  • Multi-workspace OAuth — connect to multiple Slack workspaces from a single gateway via OAuth token file (#3903)

WhatsApp

  • Persistent aiohttp session — reuse HTTP sessions across requests instead of creating new ones per message (#3818)
  • LID↔phone alias resolution — correctly match Linked ID and phone number formats in allowlists (#3830)
  • Skip reply prefix in bot mode — cleaner message formatting when running as a WhatsApp bot (#3931)

Hermes Agent v0.5.0 (v2026.3.28)

28 Mar 20:12
v2026.3.28
558cc14



Release Date: March 28, 2026

The hardening release — Hugging Face provider, /model command overhaul, Telegram Private Chat Topics, native Modal SDK, plugin lifecycle hooks, tool-use enforcement for GPT models, Nix flake, 50+ security and reliability fixes, and a comprehensive supply chain audit.


✨ Highlights

  • Nous Portal now supports 400+ models — The Nous Research inference portal has expanded dramatically, giving Hermes Agent users access to over 400 models through a single provider endpoint

  • Hugging Face as a first-class inference provider — Full integration with HF Inference API including curated agentic model picker that maps to OpenRouter analogues, live /models endpoint probe, and setup wizard flow (#3419, #3440)

  • Telegram Private Chat Topics — Project-based conversations with functional skill binding per topic, enabling isolated workflows within a single Telegram chat (#3163)

  • Native Modal SDK backend — Replaced swe-rex dependency with native Modal SDK (Sandbox.create.aio + exec.aio), eliminating tunnels and simplifying the Modal terminal backend (#3538)

  • Plugin lifecycle hooks activated — pre_llm_call, post_llm_call, on_session_start, and on_session_end hooks now fire in the agent loop and CLI/gateway, completing the plugin hook system (#3542)

  • Improved OpenAI Model Reliability — Added GPT_TOOL_USE_GUIDANCE to prevent GPT models from describing intended actions instead of making tool calls, plus automatic stripping of stale budget warnings from conversation history that caused models to avoid tools across turns (#3528)

  • Nix flake — Full uv2nix build, NixOS module with persistent container mode, auto-generated config keys from Python source, and suffix PATHs for agent-friendliness (#20, #3274, #3061) by @alt-glitch

  • Supply chain hardening — Removed compromised litellm dependency, pinned all dependency version ranges, regenerated uv.lock with hashes, added CI workflow scanning PRs for supply chain attack patterns, and bumped deps to fix CVEs (#2796, #2810, #2812, #2816, #3073)

  • Anthropic output limits fix — Replaced hardcoded 16K max_tokens with per-model native output limits (128K for Opus 4.6, 64K for Sonnet 4.6), fixing "Response truncated" and thinking-budget exhaustion on direct Anthropic API (#3426, #3444)


🏗️ Core Agent & Architecture

New Provider: Hugging Face

  • First-class Hugging Face Inference API integration with auth, setup wizard, and model picker (#3419)
  • Curated model list mapping OpenRouter agentic defaults to HF equivalents — providers with 8+ curated models skip live /models probe for speed (#3440)
  • Added glm-5-turbo to Z.AI provider model list (#3095)

Provider & Model Improvements

  • /model command overhaul — extracted shared switch_model() pipeline for CLI and gateway, custom endpoint support, provider-aware routing (#2795, #2799)
  • Removed /model slash command from CLI and gateway in favor of hermes model subcommand (#3080)
  • Preserve custom provider instead of silently remapping to openrouter (#2792)
  • Read root-level provider and base_url from config.yaml into model config (#3112)
  • Align Nous Portal model slugs with OpenRouter naming (#3253)
  • Fix Alibaba provider default endpoint and model list (#3484)
  • Allow MiniMax users to override /v1/anthropic auto-correction (#3553)
  • Migrate OAuth token refresh to platform.claude.com with fallback (#3246)

Agent Loop & Conversation

  • Improved OpenAI model reliability — GPT_TOOL_USE_GUIDANCE prevents GPT models from describing actions instead of calling tools + automatic budget warning stripping from history (#3528)
  • Surface lifecycle events — All retry, fallback, and compression events now surface to the user as formatted messages (#3153)
  • Anthropic output limits — Per-model native output limits instead of hardcoded 16K max_tokens (#3426)
  • Thinking-budget exhaustion detection — Skip useless continuation retries when model uses all output tokens on reasoning (#3444)
  • Always prefer streaming for API calls to prevent hung subagents (#3120)
  • Restore safe non-streaming fallback after stream failures (#3020)
  • Give subagents independent iteration budgets (#3004)
  • Update api_key in _try_activate_fallback for subagent auth (#3103)
  • Graceful return on max retries instead of crashing thread (untagged commit)
  • Count compression restarts toward retry limit (#3070)
  • Include tool tokens in preflight estimate, guard context probe persistence (#3164)
  • Update context compressor limits after fallback activation (#3305)
  • Validate empty user messages to prevent Anthropic API 400 errors (#3322)
  • GLM reasoning-only and max-length handling (#3010)
  • Increase API timeout default from 900s to 1800s for slow-thinking models (#3431)
  • Send max_tokens for Claude/OpenRouter + retry SSE connection errors (#3497)
  • Prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode (#2701) by @ctlst

Streaming & Reasoning

  • Persist reasoning across gateway session turns with new schema v6 columns (reasoning, reasoning_details, codex_reasoning_items) (#2974)
  • Detect and kill stale SSE connections (untagged commit)
  • Fix stale stream detector race causing spurious RemoteProtocolError (untagged commit)
  • Skip duplicate callback for <think>-extracted reasoning during streaming (#3116)
  • Preserve reasoning fields in rewrite_transcript (#3311)
  • Preserve Gemini thought signatures in streamed tool calls (#2997)
  • Ensure first delta is fired during reasoning updates (untagged commit)

Session & Memory

  • Session search recent sessions mode — Omit query to browse recent sessions with titles, previews, and timestamps (#2533)
  • Session config surfacing on /new, /reset, and auto-reset (#3321)
  • Third-party session isolation — --source flag for isolating sessions by origin (#3255)
  • Add /resume CLI handler, session log truncation guard, reopen_session API (#3315)
  • Clear compressor summary and turn counter on /clear and /new (#3102)
  • Surface silent SessionDB failures that cause session data loss (#2999)
  • Session search fallback preview on summarization failure (#3478)
  • Prevent stale memory overwrites by flush agent (#2687)

Context Compression

  • Replace dead summary_target_tokens with ratio-based scaling (#2554)
  • Expose compression.target_ratio, protect_last_n, and threshold in...

Hermes Agent v0.4.0 (v2026.3.23)

24 Mar 05:34
v2026.3.23
8416bc2



Release Date: March 23, 2026

The platform expansion release — OpenAI-compatible API server, 6 new messaging adapters, 4 new inference providers, MCP server management with OAuth 2.1, @ context references, gateway prompt caching, streaming enabled by default, and a sweeping reliability pass with 200+ bug fixes.


✨ Highlights

  • OpenAI-compatible API server — Expose Hermes as a /v1/chat/completions endpoint with a new /api/jobs REST API for cron job management, hardened with input limits, field whitelists, SQLite-backed response persistence, and CORS origin protection (#1756, #2450, #2456, #2451, #2472)

  • 6 new messaging platform adapters — Signal, DingTalk, SMS (Twilio), Mattermost, Matrix, and Webhook adapters join Telegram, Discord, and WhatsApp. Gateway auto-reconnects failed platforms with exponential backoff (#2206, #1685, #1688, #1683, #2166, #2584)

  • @ context references — Claude Code-style @file and @url context injection with tab completions in the CLI (#2343, #2482)

  • 4 new inference providers — GitHub Copilot (OAuth + token validation), Alibaba Cloud / DashScope, Kilo Code, and OpenCode Zen/Go (#1924, #1879 by @mchzimm, #1673, #1666, #1650)

  • MCP server management CLI — hermes mcp commands for installing, configuring, and authenticating MCP servers with full OAuth 2.1 PKCE flow (#2465)

  • Gateway prompt caching — Cache AIAgent instances per session, preserving Anthropic prompt cache across turns for dramatic cost reduction on long conversations (#2282, #2284, #2361)

  • Context compression overhaul — Structured summaries with iterative updates, token-budget tail protection, configurable summary endpoint, and fallback model support (#2323, #1727, #2224)

  • Streaming enabled by default — CLI streaming on by default with proper spinner/tool progress display during streaming mode, plus extensive linebreak and concatenation fixes (#2340, #2161, #2258)
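The OAuth 2.1 PKCE flow behind the MCP management highlight rests on a spec-defined primitive (RFC 7636): the client keeps a random code_verifier and sends only its S256 challenge in the authorization request. A self-contained sketch (the function name is an assumption; the construction is the standard one):

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Generate an RFC 7636 verifier/challenge pair.

    The verifier is a high-entropy random string the client keeps private;
    the challenge is base64url(SHA-256(verifier)) with padding stripped,
    and is all the authorization server ever sees up front."""
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

At token exchange the client reveals the verifier, and the server recomputes the challenge to confirm the same party started and finished the flow.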


🖥️ CLI & User Experience

New Commands & Interactions

  • @ context completions — Tab-completable @file/@url references that inject file content or web pages into the conversation (#2482, #2343)
  • /statusbar — Toggle a persistent config bar showing model + provider info in the prompt (#2240, #1917)
  • /queue — Queue prompts for the agent without interrupting the current run (#2191, #2469)
  • /permission — Switch approval mode dynamically during a session (#2207)
  • /browser — Interactive browser sessions from the CLI (#2273, #1814)
  • /cost — Live pricing and usage tracking in gateway mode (#2180)
  • /approve and /deny — Replaced bare text approval in gateway with explicit commands (#2002)

Streaming & Display

  • Streaming enabled by default in CLI (#2340)
  • Show spinners and tool progress during streaming mode (#2161)
  • Show reasoning/thinking blocks when show_reasoning enabled (#2118)
  • Context pressure warnings for CLI and gateway (#2159)
  • Fix: streaming chunks concatenated without whitespace (#2258)
  • Fix: iteration boundary linebreak prevents stream concatenation (#2413)
  • Fix: defer streaming linebreak to prevent blank line stacking (#2473)
  • Fix: suppress spinner animation in non-TTY environments (#2216)
  • Fix: display provider and endpoint in API error messages (#2266)
  • Fix: resolve garbled ANSI escape codes in status printouts (#2448)
  • Fix: update gold ANSI color to true-color format (#2246)
  • Fix: normalize toolset labels and use skin colors in banner (#1912)

CLI Polish

  • Fix: prevent 'Press ENTER to continue...' on exit (#2555)
  • Fix: flush stdout during agent loop to prevent macOS display freeze (#1654)
  • Fix: show human-readable error when hermes setup hits permissions error (#2196)
  • Fix: /stop command crash + UnboundLocalError in streaming media delivery (#2463)
  • Fix: allow custom/local endpoints without API key (#2556)
  • Fix: Kitty keyboard protocol Shift+Enter for Ghostty/WezTerm (attempted + reverted due to prompt_toolkit crash) (#2345, #2349)

Configuration

  • ${ENV_VAR} substitution in config.yaml (#2684)
  • Real-time config reload — config.yaml changes apply without restart (#2210)
  • custom_models.yaml for user-managed model additions (#2214)
  • Priority-based context file selection + CLAUDE.md support (#2301)
  • Merge nested YAML sections instead of replacing on config update (#2213)
  • Fix: config.yaml provider key overrides env var silently (#2272)
  • Fix: log warning instead of silently swallowing config.yaml errors (#2683)
  • Fix: disabled toolsets re-enable themselves after hermes tools (#2268)
  • Fix: platform default toolsets silently override tool deselection (#2624)
  • Fix: honor bare YAML approvals.mode: off (#2620)
  • Fix: hermes update use .[all] extras with fallback (#1728)
  • Fix: hermes update prompt before resetting working tree on stash conflicts (#2390)
  • Fix: use git pull --rebase in update/install to avoid divergent branch error (#2274)
  • Fix: add zprofile fallback and create zshrc on fresh macOS installs (#2320)
  • Fix: remove ANTHROPIC_BASE_URL env var to avoid collisions (#1675)
  • Fix: don't ask IMAP password if already in keyring or env (#2212)
  • Fix: OpenCode Zen/Go show OpenRouter models instead of their own (#2277)
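The ${ENV_VAR} substitution listed above can be sketched with a small regex expander; how the real loader treats undefined variables is an assumption here (this version raises):

```python
import os
import re

_VAR = re.compile(r"\$\{([A-Z_][A-Z0-9_]*)\}")

def expand_env(value: str) -> str:
    """Expand ${NAME} references in a config string from the environment.
    Raising on an undefined variable is a design choice assumed for the
    sketch; a real loader might warn or leave the reference intact."""
    def sub(match):
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"config references undefined env var {name}")
        return os.environ[name]
    return _VAR.sub(sub, value)
```

Applied to each string value after YAML parsing, this lets config.yaml reference secrets without storing them on disk.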

🏗️ Core Agent & Architecture

New Providers

  • GitHub Copilot — Full OAuth auth, API routing, token validation, and 400k context. (#1924, #1896, #1879 ...

Hermes Agent v0.3.0 (v2026.3.17)

17 Mar 07:56
6ebb816



Release Date: March 17, 2026

The streaming, plugins, and provider release — unified real-time token delivery, first-class plugin architecture, rebuilt provider system with Vercel AI Gateway, native Anthropic provider, smart approvals, live Chrome CDP browser connect, ACP IDE integration, Honcho memory, voice mode, persistent shell, and 50+ bug fixes across every platform.


✨ Highlights

  • Unified Streaming Infrastructure — Real-time token-by-token delivery in CLI and all gateway platforms. Responses stream as they're generated instead of arriving as a block. (#1538)

  • First-Class Plugin Architecture — Drop Python files into ~/.hermes/plugins/ to extend Hermes with custom tools, commands, and hooks. No forking required. (#1544, #1555)

  • Native Anthropic Provider — Direct Anthropic API calls with Claude Code credential auto-discovery, OAuth PKCE flows, and native prompt caching. No OpenRouter middleman needed. (#1097)

  • Smart Approvals + /stop Command — Codex-inspired approval system that learns which commands are safe and remembers your preferences. /stop kills the current agent run immediately. (#1543)

  • Honcho Memory Integration — Async memory writes, configurable recall modes, session title integration, and multi-user isolation in gateway mode. By @erosika. (#736)

  • Voice Mode — Push-to-talk in CLI, voice notes in Telegram/Discord, Discord voice channel support, and local Whisper transcription via faster-whisper. (#1299, #1185, #1429)

  • Concurrent Tool Execution — Multiple independent tool calls now run in parallel via ThreadPoolExecutor, significantly reducing latency for multi-tool turns. (#1152)

  • PII Redaction — When privacy.redact_pii is enabled, personally identifiable information is automatically scrubbed before sending context to LLM providers. (#1542)

  • /browser connect via CDP — Attach browser tools to a live Chrome instance through Chrome DevTools Protocol. Debug, inspect, and interact with pages you already have open. (#1549)

  • Vercel AI Gateway Provider — Route Hermes through Vercel's AI Gateway for access to their model catalog and infrastructure. (#1628)

  • Centralized Provider Router — Rebuilt provider system with call_llm API, unified /model command, auto-detect provider on model switch, and direct endpoint overrides for auxiliary/delegation clients. (#1003, #1506, #1375)

  • ACP Server (IDE Integration) — VS Code, Zed, and JetBrains can now connect to Hermes as an agent backend, with full slash command support. (#1254, #1532)

  • Persistent Shell Mode — Local and SSH terminal backends can maintain shell state across tool calls — cd, env vars, and aliases persist. By @alt-glitch. (#1067, #1483)

  • Agentic On-Policy Distillation (OPD) — New RL training environment for distilling agent policies, expanding the Atropos training ecosystem. (#1149)
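The concurrent tool execution highlight above maps naturally onto a thread pool; this sketch (function name and max_workers are assumptions) submits independent calls and collects results in submission order:

```python
from concurrent.futures import ThreadPoolExecutor

def run_tools_parallel(tool_calls):
    """Run independent tool calls concurrently.

    `tool_calls` is a list of (function, args) pairs. Results come back in
    submission order, so the transcript stays deterministic even though the
    calls overlap in time."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(fn, *args) for fn, args in tool_calls]
        return [f.result() for f in futures]
```

Three tool calls that each block on I/O for 50 ms finish in roughly 50 ms total instead of 150 ms serially.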


🏗️ Core Agent & Architecture

Provider & Model Support

  • Centralized provider router with call_llm API and unified /model command — switch models and providers seamlessly (#1003)
  • Vercel AI Gateway provider support (#1628)
  • Auto-detect provider when switching models via /model (#1506)
  • Direct endpoint overrides for auxiliary and delegation clients — point vision/subagent calls at specific endpoints (#1375)
  • Native Anthropic auxiliary vision — use Claude's native vision API instead of routing through OpenAI-compatible endpoints (#1377)
  • Anthropic OAuth flow improvements — auto-run claude setup-token, reauthentication, PKCE state persistence, identity fingerprinting (#1132, #1360, #1396, #1597)
  • Fix adaptive thinking without budget_tokens for Claude 4.6 models — by @ASRagab (#1128)
  • Fix Anthropic cache markers through adapter — by @brandtcormorant (#1216)
  • Retry Anthropic 429/529 errors and surface details to users — by @0xbyt4 (#1585)
  • Fix Anthropic adapter max_tokens, fallback crash, proxy base_url — by @0xbyt4 (#1121)
  • Fix DeepSeek V3 parser dropping multiple parallel tool calls — by @mr-emmett-one (#1365, #1300)
  • Accept unlisted models with warning instead of rejecting (#1047, #1102)
  • Skip reasoning params for unsupported OpenRouter models (#1485)
  • MiniMax Anthropic API compatibility fix (#1623)
  • Custom endpoint /models verification and /v1 base URL suggestion (#1480)
  • Resolve delegation providers from custom_providers config (#1328)
  • Kimi model additions and User-Agent fix (#1039)
  • Strip call_id/response_item_id for Mistral compatibility (#1058)

Agent Loop & Conversation

  • Anthropic Context Editing API support (#1147)
  • Improved context compaction handoff summaries — compressor now preserves more actionable state (#1273)
  • Sync session_id after mid-run context compression (#1160)
  • Session hygiene threshold tuned to 50% for more proactive compression (#1096, #1161)
  • Include session ID in system prompt via --pass-session-id flag (#1040)
  • Prevent closed OpenAI client reuse across retries (#1391)
  • Sanitize chat payloads and provider precedence (#1253)
  • Handle dict tool call arguments from Codex and local backends (#1393, #1440)

Memory & Sessions

  • Improve memory prioritization — user preferences and corrections weighted above procedural knowledge (#1548)
  • Tighter memory and session recall guidance in system prompts (#1329)
  • Persist CLI token counts to session DB for /insights (#1498)
  • Keep Honcho recall out of the cached system prefix (#1201)
  • Correct seed_ai_identity to use session.add_messages() (#1475)
  • Isolate Honcho session routing for multi-user gateway (#1500)

📱 Messaging Platforms (Gateway)

Gateway Core

  • System gateway service mode — run as a system-level systemd service, not just user-level (#1371)
  • Gateway install scope prompts — choose user vs system scope during setup (#1374)
  • Reasoning hot reload — change reasoning settings without restarting the gateway (#1275)
  • Default group sessions to per-user isolation — no more shared state across users in group chats (#1495, #1417)
  • Harden gateway restart recovery (#1310)
  • Cancel active...

Hermes Agent v0.2.0 (v2026.3.12)

12 Mar 10:07
a370ab8



Release Date: March 12, 2026

First tagged release since v0.1.0 (the initial pre-public foundation). In just over two weeks, Hermes Agent went from a small internal project to a full-featured AI agent platform — thanks to an explosion of community contributions. This release covers 216 merged pull requests from 63 contributors, resolving 119 issues.


✨ Highlights

  • Multi-Platform Messaging Gateway — Telegram, Discord, Slack, WhatsApp, Signal, Email (IMAP/SMTP), and Home Assistant platforms with unified session management, media attachments, and per-platform tool configuration.

  • MCP (Model Context Protocol) Client — Native MCP support with stdio and HTTP transports, reconnection, resource/prompt discovery, and sampling (server-initiated LLM requests). (#291 by @0xbyt4, #301, #753)

  • Skills Ecosystem — 70+ bundled and optional skills across 15+ categories with a Skills Hub for community discovery, per-platform enable/disable, conditional activation based on tool availability, and prerequisite validation. (#743@teyrebaz33, #785@teyrebaz33)

  • Centralized Provider Router — Unified call_llm()/async_call_llm() API replaces scattered provider logic across vision, summarization, compression, and trajectory saving. All auxiliary consumers route through a single code path with automatic credential resolution. (#1003)

  • ACP Server — VS Code, Zed, and JetBrains editor integration via the Agent Communication Protocol standard. (#949)

  • CLI Skin/Theme Engine — Data-driven visual customization: banners, spinners, colors, branding. 7 built-in skins + custom YAML skins.

  • Git Worktree Isolation — hermes -w launches isolated agent sessions in git worktrees for safe parallel work on the same repo. (#654)

  • Filesystem Checkpoints & Rollback — Automatic snapshots before destructive operations with /rollback to restore. (#824)

  • 3,289 Tests — From near-zero test coverage to a comprehensive test suite covering agent, gateway, tools, cron, and CLI.


🏗️ Core Agent & Architecture

Provider & Model Support

  • Centralized provider router with resolve_provider_client() + call_llm() API (#1003)
  • Nous Portal as first-class provider in setup (#644)
  • OpenAI Codex (Responses API) with ChatGPT subscription support (#43) — @grp06
  • Codex OAuth vision support + multimodal content adapter
  • Validate /model against live API instead of hardcoded lists
  • Self-hosted Firecrawl support (#460) — @caentzminger
  • Kimi Code API support (#635) — @christomitov
  • MiniMax model ID update (#473) — @tars90percent
  • OpenRouter provider routing configuration (provider_preferences)
  • Nous credential refresh on 401 errors (#571, #269) — @rewbs
  • z.ai/GLM, Kimi/Moonshot, MiniMax, Azure OpenAI as first-class providers
  • Unified /model and /provider into single view
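
One way to read the centralized router (#1003) is a registry-plus-dispatch shape. This is a hedged sketch: the registry, the ProviderClient class, and the signatures are illustrative assumptions, not the actual Hermes implementation of resolve_provider_client()/call_llm():

```python
# Hypothetical sketch of a centralized provider router; names and signatures
# are assumptions modeled on the release notes, not Hermes source.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProviderClient:
    name: str
    send: Callable[[list[dict]], str]

_REGISTRY: dict[str, ProviderClient] = {}

def register_provider(client: ProviderClient) -> None:
    _REGISTRY[client.name] = client

def resolve_provider_client(provider: str) -> ProviderClient:
    if provider not in _REGISTRY:
        raise ValueError(f"unknown provider: {provider}")
    return _REGISTRY[provider]

def call_llm(provider: str, messages: list[dict]) -> str:
    """Single entry point all auxiliary consumers route through."""
    return resolve_provider_client(provider).send(messages)

# usage: a trivial echo provider
register_provider(ProviderClient("echo", lambda msgs: msgs[-1]["content"]))
print(call_llm("echo", [{"role": "user", "content": "hi"}]))  # → hi
```

The point of the single code path is that vision, summarization, compression, and trajectory saving all hit the same resolution logic instead of re-implementing it.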

Agent Loop & Conversation

  • Simple fallback model for provider resilience (#740)
  • Shared iteration budget across parent + subagent delegation
  • Iteration budget pressure via tool result injection
  • Configurable subagent provider/model with full credential resolution
  • Handle 413 payload-too-large via compression instead of aborting (#153) — @tekelala
  • Retry with rebuilt payload after compression (#616) — @tripledoublev
  • Auto-compress pathologically large gateway sessions (#628)
  • Tool call repair middleware — auto-lowercases tool names and handles invalid tool calls
  • Reasoning effort configuration and /reasoning command (#921)
  • Detect and block file re-read/search loops after context compression (#705) — @0xbyt4
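
The 413 handling above (#153, #616) can be sketched as compress-then-retry. The exception type and compress_history helper below are stand-ins for illustration, not Hermes code:

```python
class PayloadTooLarge(Exception):
    """Stand-in for an HTTP 413 payload-too-large error from the provider."""

def compress_history(messages: list[str]) -> list[str]:
    # Illustrative: keep the first and last messages, summarize the middle.
    if len(messages) <= 2:
        return messages
    summary = f"[summary of {len(messages) - 2} earlier messages]"
    return [messages[0], summary, messages[-1]]

def send_with_compression(send, messages, max_retries: int = 2):
    """Retry with a rebuilt, compressed payload instead of aborting on 413."""
    for _ in range(max_retries + 1):
        try:
            return send(messages)
        except PayloadTooLarge:
            compressed = compress_history(messages)
            if compressed == messages:  # nothing left to shrink
                raise
            messages = compressed
    raise PayloadTooLarge("still too large after retries")

# usage: a fake provider that rejects payloads longer than 3 messages
def fake_send(msgs):
    if len(msgs) > 3:
        raise PayloadTooLarge()
    return "ok"

print(send_with_compression(fake_send, ["sys", "a", "b", "c", "d"]))  # → ok
```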

Session & Memory

  • Session naming with unique titles, auto-lineage, rich listing, and resume by name (#720)
  • Interactive session browser with search filtering (#733)
  • Display previous messages when resuming a session (#734)
  • Honcho AI-native cross-session user modeling (#38) — @erosika
  • Proactive async memory flush on session expiry
  • Smart context length probing with persistent caching + banner display
  • /resume command for switching to named sessions in gateway
  • Session reset policy for messaging platforms
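
The context-length probing with persistent caching could work roughly like the largest-first probe below; the candidate ladder, cache shape, and function names are assumptions for illustration:

```python
# Hedged sketch: probe a model's usable context length once, persist the
# answer, and serve it from cache on later runs. Not Hermes source.
import json
import os
import tempfile

def probe_context_length(accepts, candidates=(4096, 8192, 32768, 131072)) -> int:
    """Return the largest candidate context length the model accepts."""
    for n in sorted(candidates, reverse=True):
        if accepts(n):
            return n
    raise RuntimeError("no probed context length accepted")

def cached_context_length(model: str, accepts, cache_path: str) -> int:
    """Probe once, then reuse the persisted answer on subsequent calls."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
    if model not in cache:
        cache[model] = probe_context_length(accepts)
        with open(cache_path, "w") as f:
            json.dump(cache, f)
    return cache[model]

# usage: pretend the model rejects anything above 32k tokens
path = os.path.join(tempfile.mkdtemp(), "ctx.json")
print(cached_context_length("demo-model", lambda n: n <= 32768, path))  # → 32768
```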

📱 Messaging Platforms (Gateway)

Telegram

  • Native file attachments: send_document + send_video
  • Document file processing for PDF, text, and Office files — @tekelala
  • Forum topic session isolation (#766) — @spanishflu-est1918
  • Browser screenshot sharing via MEDIA: protocol (#657)
  • Location support for find-nearby skill
  • TTS voice message accumulation fix (#176) — @Bartok9
  • Improved error handling and logging (#763) — @aydnOktay
  • Italic regex newline fix + 43 format tests (#204) — @0xbyt4
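
The MEDIA: screenshot sharing above (#657) suggests a simple line protocol: output lines prefixed with MEDIA: carry an attachment path. The parser below is a guess at that shape, not the actual gateway code:

```python
# Hypothetical MEDIA: line protocol parser; the prefix convention is taken
# from the release notes, everything else is an assumption.
def split_media_lines(output: str) -> tuple[str, list[str]]:
    """Separate MEDIA: attachment paths from the plain text of a tool reply."""
    text, media = [], []
    for line in output.splitlines():
        if line.startswith("MEDIA:"):
            media.append(line[len("MEDIA:"):].strip())
        else:
            text.append(line)
    return "\n".join(text), media

reply = "Here is the page:\nMEDIA: /tmp/screenshot.png"
body, attachments = split_media_lines(reply)
print(attachments)  # → ['/tmp/screenshot.png']
```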

Discord

  • Channel topic included in session context (#248) — @Bartok9
  • DISCORD_ALLOW_BOTS config for bot message filtering (#758)
  • Document and video support (#784)
  • Improved error handling and logging (#761) — @aydnOktay

Slack

  • Fix 404 on app_mention events + document/video support (#784)
  • Structured logging replacing print statements — @aydnOktay

WhatsApp

Signal

  • Full Signal messenger gateway via signal-cli-rest-api (#405)
  • Media URL support in message events (#871)

Email (IMAP/SMTP)

  • New email gateway platform — @0xbyt4

Home Assistant

  • REST tools + WebSocket gateway integration (#184) — @0xbyt4
  • Service discovery and enhanced setup
  • Toolset mapping fix (#538) — @Himess

Gateway Core

  • Expose subagent tool calls and thinking to users (#186) — @cutepawss
  • Configurable background process watcher notifications (#840)
  • edit_message() for Telegram/Discord/Slack with fallback
  • /compress, /usage, /update slash commands
  • Eliminated 3x SQLite message duplication in gateway sessions (#873)
  • Stabilize system prompt across gateway turns for cache hits (#754)
  • MCP server shutdown on gateway exit (#796) — @0xbyt4
  • Pass session_db to AIAgent, fixing session_search error (#108) — @Bartok9
  • Persist transcript changes in /retry, /undo; fix /reset attribute (#217) — @Farukest
  • UTF-8 encoding fix preventing Windows crashes (#369) — @ch3ronsa
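
The edit_message() fallback above likely reduces to try-edit-else-send; the platform interface here is invented for illustration and is not the Hermes gateway API:

```python
# Hedged sketch of edit-with-fallback: platforms that cannot edit a message
# (or fail mid-edit) fall back to sending a fresh one.
class EditUnsupported(Exception):
    pass

def edit_or_send(platform, chat_id, message_id, text):
    """Edit in place where the platform allows it, otherwise send anew."""
    try:
        return platform.edit_message(chat_id, message_id, text)
    except (AttributeError, EditUnsupported):
        return platform.send_message(chat_id, text)

class NoEditPlatform:
    """A platform lacking edit support entirely."""
    def send_message(self, chat_id, text):
        return f"sent:{text}"

print(edit_or_send(NoEditPlatform(), 1, 42, "updated"))  # → sent:updated
```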

🖥️ CLI & User Experience

Interactive CLI

  • Data-driven skin/theme engine — 7 built-in skins (default, ares, mono, slate, poseidon, sisyphus, charizard) + custom YAML skins
  • /personality command with custom personality + disable support (#773) — @teyrebaz33
  • User-defined quick commands that bypass the agent loop (#746) — @teyrebaz33
  • /reasoning command for effort level and display toggle (#921)
  • /verbose slash command to toggle debug at runtime ([#94](https://github.com/Nou...