
chore: promote staging to staging-promote/e74214dc-23104855330 (2026-03-15 07:18 UTC) #1197

Merged
henrypark133 merged 41 commits into main from staging-promote/e0f393bf-23105705354
Mar 17, 2026
Conversation


ironclaw-ci bot commented Mar 15, 2026

Auto-promotion from staging CI

Batch range: 15ab156d62632e173d9a10933b775cece6ea66a5..e0f393bf04ffc29d9de4108c6725b3380b83536b
Promotion branch: staging-promote/e0f393bf-23105705354
Base: staging-promote/e74214dc-23104855330
Triggered by: Staging CI batch at 2026-03-15 07:18 UTC

Commits in this batch (10):

Current commits in this promotion (29)

Current base: main
Current head: staging-promote/e0f393bf-23105705354
Current range: origin/main..origin/staging-promote/e0f393bf-23105705354

Auto-updated by staging promotion metadata workflow

Waiting for gates:

  • Tests: pending
  • E2E: pending
  • Claude Code review: pending (will post comments on this PR)

Auto-created by staging-ci workflow


* fix(auth): avoid false success and block chat while auth pending

* fix(web): clear stale auth UI on failure and add setup regression test

* Update src/agent/thread_ops.rs

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* fix(fmt): place auth activation comment on separate line

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com>
@github-actions github-actions bot added scope: agent Agent core (agent loop, router, scheduler) scope: channel/web Web gateway channel size: M 50-199 changed lines risk: medium Business logic, config, or moderate-risk modules contributor: core 20+ merged PRs labels Mar 15, 2026

claude bot commented Mar 15, 2026

Code review

Found 2 issues:

  1. [MEDIUM:75] Missing SSE broadcast when auth token submission fails in chat_auth_token_handler

When a token is submitted and result.activated == false, the handler returns ActionResponse::fail() but does not broadcast an AuthCompleted SSE event. This is asymmetric with the success case (which clears auth mode via SSE) and with extensions_setup_submit_handler (which broadcasts with success: result.activated). The frontend's authFlowPending flag may not be cleared if the UI relies on the SSE event rather than the HTTP response.

https://github.com/anthropics/ironclaw/blob/c27b740bdd35c1cc0efbe660d33e3e46467a757e/src/channels/web/server.rs#L1163-L1180
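The symmetric behavior the reviewer describes can be sketched as follows. This is a hedged, simplified reconstruction: `AuthCompleted` and `ActionResponse` names come from the review comment, and a plain `Vec` stands in for the real SSE broadcast bus.

```rust
// Minimal sketch: broadcast AuthCompleted on both the success and the
// failure path, so the frontend can clear its pending flag from the SSE
// stream alone (mirroring extensions_setup_submit_handler).
#[derive(Debug, PartialEq)]
enum SseEvent {
    AuthCompleted { success: bool },
}

struct ActionResponse {
    ok: bool,
}

impl ActionResponse {
    fn ok() -> Self {
        Self { ok: true }
    }
    fn fail() -> Self {
        Self { ok: false }
    }
}

fn handle_token_submission(activated: bool, sse: &mut Vec<SseEvent>) -> ActionResponse {
    // Broadcast unconditionally, not only on success:
    sse.push(SseEvent::AuthCompleted { success: activated });
    if activated {
        ActionResponse::ok()
    } else {
        ActionResponse::fail()
    }
}

fn main() {
    let mut sse = Vec::new();
    let resp = handle_token_submission(false, &mut sse);
    assert!(!resp.ok);
    // The failure path now emits the event the frontend relies on:
    assert_eq!(sse, vec![SseEvent::AuthCompleted { success: false }]);
}
```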

  2. [MEDIUM:65] Race condition: JavaScript authFlowPending flag can become permanently stale

If an auth flow terminates unexpectedly (network timeout, error, browser close) without reaching handleAuthCompleted(), the authFlowPending global flag remains true, permanently blocking chat input. The flag has no timeout and no recovery mechanism. Example: user submits token → network timeout → browser shows "Complete auth step" but can't retry.

https://github.com/anthropics/ironclaw/blob/c27b740bdd35c1cc0efbe660d33e3e46467a757e/src/channels/web/static/app.js#L490-L495

ilblackdragon and others added 19 commits March 15, 2026 20:38
…igured (#1194)

When a tunnel provider (ngrok, cloudflare, tailscale, etc.) or static
TUNNEL_URL is configured, external traffic arrives through the tunnel,
so binding 0.0.0.0 is unnecessary attack surface. The webhook server
now defaults to 127.0.0.1 when a tunnel is active. Explicit HTTP_HOST
still overrides the default in all cases.
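The precedence described above (explicit HTTP_HOST wins; otherwise a configured tunnel narrows the default to loopback) can be sketched as a small pure function. `resolve_bind_host` is a hypothetical name, not the actual one in the webhook server.

```rust
// Hedged sketch of the bind-address default: an explicit HTTP_HOST always
// wins; otherwise a configured tunnel narrows 0.0.0.0 down to loopback.
fn resolve_bind_host(explicit_host: Option<&str>, tunnel_active: bool) -> String {
    match explicit_host {
        Some(h) if !h.is_empty() => h.to_string(),
        _ if tunnel_active => "127.0.0.1".to_string(), // tunnel carries external traffic
        _ => "0.0.0.0".to_string(),                    // no tunnel: listen on all interfaces
    }
}

fn main() {
    assert_eq!(resolve_bind_host(None, true), "127.0.0.1");
    assert_eq!(resolve_bind_host(None, false), "0.0.0.0");
    // Explicit HTTP_HOST overrides in all cases, tunnel or not:
    assert_eq!(resolve_bind_host(Some("192.168.1.5"), true), "192.168.1.5");
}
```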

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Adversarial safety tests for regex, Unicode, and control char edge cases
…1200) (#1204)

Resolve compilation errors in Feishu/Lark WASM channel
…#1195)

The `__internal_job_monitor` metadata key that bypassed the entire
agent pipeline (hooks, safety checks, LLM processing) was spoofable
by external channels — WASM channel plugins could inject arbitrary
metadata including this key, causing attacker-controlled content to be
forwarded directly as assistant responses.

Replace the metadata-based check with a dedicated `is_internal` field
on `IncomingMessage` that can only be set via `into_internal()` by
trusted in-process code. Both the field and setter are `pub(crate)` to
prevent external crates from spoofing the flag. Also remove
`notify_metadata` forwarding (the monitor only needs channel/user/thread
routing) and the unused `__job_monitor_job_id` metadata key.
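The hardening pattern above can be sketched like this. The struct and method names are reconstructed from the commit message; in the real codebase both the field and the setter are `pub(crate)`, which is what prevents external channel plugins from setting the flag.

```rust
// Sketch of the is_internal hardening: the flag is a real field set only by
// into_internal(), replacing the spoofable __internal_job_monitor metadata key.
#[derive(Debug)]
struct IncomingMessage {
    text: String,
    is_internal: bool, // pub(crate) in the real code; never derived from metadata
}

impl IncomingMessage {
    fn new(text: impl Into<String>) -> Self {
        Self { text: text.into(), is_internal: false }
    }

    // pub(crate) in the real code: only trusted in-process callers can mark
    // a message internal.
    fn into_internal(mut self) -> Self {
        self.is_internal = true;
        self
    }
}

fn main() {
    let external = IncomingMessage::new("from a WASM channel plugin");
    let internal = IncomingMessage::new("job monitor notification").into_internal();
    assert!(!external.is_internal); // channel-supplied metadata can no longer flip this
    assert!(internal.is_internal);
}
```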

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mention MiniMax as built-in provider in READMEs
…1210)

* refactor(setup): extract init logic from wizard into owning modules

Move database, LLM model discovery, and secrets initialization logic
out of the setup wizard and into their owning modules, following the
CLAUDE.md principle that module-specific initialization must live in
the owning module as a public factory function.

Database (src/db/mod.rs, src/config/database.rs):
- Add DatabaseConfig::from_postgres_url() and from_libsql_path()
- Add connect_without_migrations() for connectivity testing
- Add validate_postgres() returning structured PgDiagnostic results

LLM (src/llm/models.rs — new file):
- Extract 8 model-fetching functions from wizard.rs (~380 lines)
- fetch_anthropic_models, fetch_openai_models, fetch_ollama_models,
  fetch_openai_compatible_models, build_nearai_model_fetch_config,
  and OpenAI sorting/filtering helpers

Secrets (src/secrets/mod.rs):
- Add resolve_master_key() unifying env var + keychain resolution
- Add crypto_from_hex() convenience wrapper

Wizard restructuring (src/setup/wizard.rs):
- Replace cfg-gated db_pool/db_backend fields with generic
  db: Option<Arc<dyn Database>> + db_handles: Option<DatabaseHandles>
- Delete 6 backend-specific methods (reconnect_postgres/libsql,
  test_database_connection_postgres/libsql, run_migrations_postgres/
  libsql, create_postgres/libsql_secrets_store)
- Simplify persist_settings, try_load_existing_settings,
  persist_session_to_db, init_secrets_context to backend-agnostic
  implementations using the new module factories
- Eliminate all references to deadpool_postgres, PoolConfig,
  LibSqlBackend, Store::from_pool, refinery::embed_migrations

Net: -878 lines from wizard, +395 lines in owning modules, +378 new.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(settings): add wizard re-run regression tests

Add 10 tests covering settings preservation during wizard re-runs:
- provider_only rerun preserves channels/embeddings/heartbeat
- channels_only rerun preserves provider/model/embeddings
- quick mode rerun preserves prior channels and heartbeat
- full rerun same provider preserves model through merge
- full rerun different provider clears model through merge
- incremental persist doesn't clobber prior steps
- switching DB backend allows fresh connection settings
- merge preserves true booleans when overlay has default false
- embeddings survive rerun that skips step 5

These cover the scenarios where re-running the wizard would
previously risk resetting models, providers, or channel settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(setup): eliminate cfg(feature) gates from wizard methods

Replace compile-time #[cfg(feature)] dispatch in the wizard with
runtime dispatch via DatabaseBackend enum and cfg!() macro constants.

- Merge step_database_postgres + step_database_libsql into step_database
  using runtime backend selection
- Rewrite auto_setup_database without feature gates
- Remove cfg(feature = "postgres") from mask_password_in_url (pure fn)
- Remove cfg(feature = "postgres") from test_mask_password_in_url

Only one internal #[cfg(feature = "postgres")] remains: guarding the
call to db::validate_postgres() which is itself feature-gated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(db): fold PG validation into connect_without_migrations

Move PostgreSQL prerequisite validation (version >= 15, pgvector)
from the wizard into connect_without_migrations() in the db module.
The validation now returns DatabaseError directly with user-facing
messages, eliminating the PgDiagnostic enum and the last
#[cfg(feature)] gate from the wizard.

The wizard's test_database_connection() is now a 5-line method that
calls the db module factory and stores the result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review comments [skip-regression-check]

- Use .as_ref().map() to avoid partial move of db_config.libsql_path
  (gemini-code-assist)
- Default to available backend when DATABASE_BACKEND is invalid, not
  unconditionally to Postgres which may not be compiled (Copilot)
- Match DatabaseBackend::Postgres explicitly instead of _ => wildcard
  in connect_with_handles, connect_without_migrations, and
  create_secrets_store to avoid silently routing LibSql configs through
  the Postgres path when libsql feature is disabled (Copilot)
- Upgrade Ollama connection failure log from info to warn with the
  base URL for better visibility in wizard UX (Copilot)
- Clarify crypto_from_hex doc: SecretsCrypto validates key length,
  not hex encoding (Copilot)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address zmanian's PR review feedback [skip-regression-check]

- Update src/setup/README.md to reflect Arc<dyn Database> flow
- Remove stale "Test PostgreSQL connection" doc comment
- Replace unwrap_or(0) in validate_postgres with descriptive error
- Add NearAiConfig::for_model_discovery() constructor
- Narrow pub to pub(crate) for internal model helpers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address Copilot review comments (quick-mode postgres gate, empty env vars) [skip-regression-check]

- Gate DATABASE_URL auto-detection on POSTGRES_AVAILABLE in quick mode
  so libsql-only builds don't attempt a postgres connection
- Match empty-env-var filtering in key source detection to align with
  resolve_master_key() behavior
- Filter empty strings to None in DatabaseConfig::from_libsql_path()
  for turso_url/turso_token

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…1166)

* fix: Telegram bot token validation fails intermittently (HTTP 404)

* fix: code style

* fix

* fix

* fix

* review fix
* feat: add Codex auth.json token reuse for LLM authentication

When LLM_USE_CODEX_AUTH=true, IronClaw reads the Codex CLI's auth.json
(default ~/.codex/auth.json) and extracts the API key or OAuth access
token. This lets IronClaw piggyback on a Codex login without
implementing its own OAuth flow.

New env vars:
  - LLM_USE_CODEX_AUTH: enable Codex auth fallback (default: false)
  - CODEX_AUTH_PATH: override path to auth.json

* fix: handle ChatGPT auth mode correctly

Switch base_url to chatgpt.com/backend-api/codex when auth.json
contains ChatGPT OAuth tokens. The access_token is a JWT that only
works against the private ChatGPT backend, not the public OpenAI API.

Refactored codex_auth.rs to return CodexCredentials (token +
is_chatgpt_mode) instead of just a string key.

* fix: Codex auth takes highest priority over secrets store

When LLM_USE_CODEX_AUTH=true, Codex credentials are now loaded before
checking env vars or the secrets store overlay. Previously the secrets
store key (injected during onboarding) would shadow the Codex token.

* feat: Responses API provider for ChatGPT backend

- New CodexChatGptProvider speaks the Responses API protocol
- Auto-detects model from /models endpoint (gpt-4o -> gpt-5.2-codex)
- Adds store=false (required by ChatGPT backend)
- Error handling with timeout for HTTP 400 responses
- Message format translation: Chat Completions -> Responses API
- SSE response parsing for text, tool calls, and usage stats
- 7 unit tests for message conversion and SSE parsing

* fix: SSE parser uses item_id instead of call_id for tool call deltas

The Responses API sends function_call_arguments.delta events with
item_id (e.g. fc_...) not call_id (e.g. call_...). The parser now
keys pending tool calls by item_id from output_item.added and
tracks call_id separately for result matching.
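The keying change can be sketched as below. The parser and event names are illustrative stand-ins for the real SSE handling, but the id flow matches the commit: `output_item.added` carries both ids, while `function_call_arguments.delta` carries only `item_id`.

```rust
use std::collections::HashMap;

// Sketch: pending tool calls are keyed by the Responses API item_id (fc_...),
// with call_id (call_...) stored alongside for later result matching.
#[derive(Debug)]
struct PendingToolCall {
    call_id: String,
    arguments: String,
}

#[derive(Default)]
struct SseParser {
    pending: HashMap<String, PendingToolCall>,
}

impl SseParser {
    // response.output_item.added: register under item_id, remember call_id.
    fn on_item_added(&mut self, item_id: &str, call_id: &str) {
        self.pending.insert(
            item_id.to_string(),
            PendingToolCall { call_id: call_id.to_string(), arguments: String::new() },
        );
    }

    // function_call_arguments.delta only carries item_id, not call_id.
    fn on_arguments_delta(&mut self, item_id: &str, delta: &str) {
        if let Some(call) = self.pending.get_mut(item_id) {
            call.arguments.push_str(delta);
        }
    }
}

fn main() {
    let mut p = SseParser::default();
    p.on_item_added("fc_1", "call_abc");
    p.on_arguments_delta("fc_1", "{\"path\":");
    p.on_arguments_delta("fc_1", "\"/tmp\"}");
    let call = &p.pending["fc_1"];
    assert_eq!(call.call_id, "call_abc");
    assert_eq!(call.arguments, "{\"path\":\"/tmp\"}");
}
```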

* fix: strip empty string values from tool call arguments

gpt-5.2-codex fills optional tool parameters with empty strings
(e.g. timestamp: ""), which IronClaw's tool validation rejects.
Strip them before passing to tool execution.
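The stripping step can be sketched as follows. Real tool arguments are JSON objects; a flat string map stands in for serde_json here so the sketch stays dependency-free.

```rust
use std::collections::BTreeMap;

// Simplified sketch: optional parameters the model filled with "" are dropped
// before tool validation sees them.
fn strip_empty_args(args: BTreeMap<String, String>) -> BTreeMap<String, String> {
    args.into_iter().filter(|(_, v)| !v.is_empty()).collect()
}

fn main() {
    let mut args = BTreeMap::new();
    args.insert("query".to_string(), "status".to_string());
    args.insert("timestamp".to_string(), String::new()); // model-filled placeholder
    let cleaned = strip_empty_args(args);
    assert!(cleaned.contains_key("query"));
    assert!(!cleaned.contains_key("timestamp"));
}
```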

* fix: prevent apiKey mode fallback to ChatGPT token

When auth_mode is explicitly 'apiKey' but the key is missing/empty,
do not fall through to check for a ChatGPT access_token. This prevents
returning credentials with is_chatgpt_mode: true and routing to the
wrong LLM provider.

* refactor: reuse single reqwest::Client across model discovery and LLM calls

Create Client once in with_auto_model, pass &Client to
fetch_default_model, and move it into the provider struct.
Eliminates the redundant Client::new() that wasted a connection pool.

* fix: bump client_version to 1.0.0 to unlock gpt-5.3-codex and gpt-5.4

The /models endpoint gates newer models behind client_version.
Version 0.1.0 only returns up to gpt-5.2-codex, while 1.0.0+
also returns gpt-5.3-codex and gpt-5.4.

* feat: user-configured LLM_MODEL takes priority over auto-detection

Fetch the full model list from /models endpoint. If LLM_MODEL is set,
validate it against the supported list and warn with available models
if not found. If LLM_MODEL is not set, auto-detect the highest-priority
model. Also bumps client_version to 1.0.0 to unlock gpt-5.3/5.4.

* fix: add 10s timeout to model discovery HTTP request

Prevents startup from blocking indefinitely if chatgpt.com
is slow or unreachable. Uses reqwest per-request timeout.

* docs: add private API warning for ChatGPT backend endpoint

The chatgpt.com/backend-api/codex endpoint is private and
undocumented. Add warning in module docs and a runtime log
on first use to inform users of potential ToS implications.

* feat: implement OAuth 401 token refresh for Codex ChatGPT provider

On HTTP 401, if a refresh_token is available, the provider now
automatically refreshes the access token via auth.openai.com/oauth/token
(same protocol as Codex CLI) and retries the request once. Refreshed
tokens are persisted back to auth.json.

Changes:
- codex_auth: read refresh_token, add refresh_access_token() and
  persist_refreshed_tokens()
- codex_chatgpt: RwLock for api_key, 401 detection + retry in
  send_request, send_http_request helper
- config/llm: thread refresh_token/auth_path through RegistryProviderConfig
- llm/mod: pass refresh params to with_auto_model

* refactor: lazy model detection via OnceCell, remove block_in_place

Model is no longer resolved during provider construction. Instead,
resolve_model() uses tokio::sync::OnceCell to lazily fetch from
/models on the first LLM call. This eliminates the block_in_place
+ block_on workaround in create_codex_chatgpt_from_registry.

- with_auto_model (async) -> with_lazy_model (sync constructor)
- resolve_model() added with OnceCell-based lazy init
- build_request_body takes model as parameter
- model_name() returns resolved or configured_model as fallback
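The lazy-resolution lifecycle can be sketched with std's `OnceLock` standing in for `tokio::sync::OnceCell` (the commit uses the async variant; this sync analogue shows the same shape). `fetch_from_models_endpoint` is a stand-in for the real HTTP call.

```rust
use std::sync::OnceLock;

// Sketch: the model is not resolved at construction; the first call to
// resolve_model() triggers the fetch, and model_name() falls back to the
// configured model until then.
struct Provider {
    configured_model: Option<String>,
    resolved: OnceLock<String>,
}

fn fetch_from_models_endpoint() -> String {
    "gpt-5.3-codex".to_string() // stand-in for the /models HTTP call
}

impl Provider {
    fn resolve_model(&self) -> &str {
        self.resolved.get_or_init(|| {
            self.configured_model
                .clone()
                .unwrap_or_else(fetch_from_models_endpoint)
        })
    }

    // Mirrors model_name(): resolved if the first call happened, else configured.
    fn model_name(&self) -> &str {
        self.resolved
            .get()
            .map(String::as_str)
            .or(self.configured_model.as_deref())
            .unwrap_or("unknown")
    }
}

fn main() {
    let p = Provider { configured_model: None, resolved: OnceLock::new() };
    assert_eq!(p.model_name(), "unknown");          // nothing fetched at construction
    assert_eq!(p.resolve_model(), "gpt-5.3-codex"); // first call triggers the fetch
    assert_eq!(p.model_name(), "gpt-5.3-codex");    // later calls see the cached value
}
```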

* feat: support multimodal content (images) in Codex ChatGPT provider

message_to_input_items now checks content_parts for user messages.
ContentPart::Text maps to input_text and ContentPart::ImageUrl maps
to input_image, matching the Responses API format used by Codex CLI.
Falls back to plain text when content_parts is empty.

Also updates client_version to 0.111.0 for /models endpoint.

Adds test: test_message_conversion_user_with_image

* refactor: move codex_auth module from src/ to src/llm/

codex_auth is only used by the LLM layer (codex_chatgpt provider
and config/llm). Moving it under src/llm/ reflects its actual scope.

- Remove pub mod codex_auth from lib.rs
- Add pub mod codex_auth to llm/mod.rs
- Update imports: super::codex_auth, crate::llm::codex_auth

* Fix codex provider style issues

* Use SecretString throughout codex auth refresh flow

* Use SecretString for codex access tokens

* Reuse provider client for codex token refresh

* Stream Codex SSE responses incrementally

* Fix Windows clippy and SQLite test linkage

* Trigger checks after regression skip label

* Tighten codex auth module handling
…1029)

* feat(heartbeat): fire_at time-of-day scheduling with IANA timezone support

- HEARTBEAT_FIRE_AT=HH:MM — fire heartbeat at a specific time of day instead
  of on a rolling interval; format is 24h HH:MM (e.g. "14:00")
- HEARTBEAT_TIMEZONE=Region/City — IANA timezone name for fire_at (e.g.
  "Pacific/Auckland", "America/New_York"). Defaults to UTC.
- When fire_at is set, interval_secs is ignored
- Config also readable from settings.toml [heartbeat] section
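The HH:MM parsing and next-fire arithmetic can be sketched with plain minute counts; the IANA timezone handling (chrono-tz) is out of scope for this stub.

```rust
// Sketch of HEARTBEAT_FIRE_AT=HH:MM parsing: 24h clock, both fields validated.
fn parse_fire_at(s: &str) -> Option<(u32, u32)> {
    let (h, m) = s.split_once(':')?;
    let (h, m): (u32, u32) = (h.parse().ok()?, m.parse().ok()?);
    if h < 24 && m < 60 { Some((h, m)) } else { None }
}

/// Minutes from `now` (minutes since local midnight) until the next fire;
/// always in 1..=1440, matching the bounded-duration test in the commit.
fn minutes_until_next_fire(now: u32, fire_at: (u32, u32)) -> u32 {
    let target = fire_at.0 * 60 + fire_at.1;
    let diff = (target + 1440 - now) % 1440;
    if diff == 0 { 1440 } else { diff } // exactly at target: fire tomorrow
}

fn main() {
    assert_eq!(parse_fire_at("14:00"), Some((14, 0)));
    assert_eq!(parse_fire_at("25:00"), None);
    assert_eq!(minutes_until_next_fire(13 * 60, (14, 0)), 60);
    assert_eq!(minutes_until_next_fire(14 * 60, (14, 0)), 1440);
}
```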

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(heartbeat): wire fire_at + timezone into HeartbeatConfig runner

Missed file from heartbeat scheduling commit. HeartbeatConfig struct in
agent/heartbeat.rs now carries fire_at: Option<NaiveTime> and timezone: Tz
so the runner can schedule against a fixed time of day.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add chrono-tz dependency for heartbeat fire_at timezone support

The chrono-tz crate was used in the heartbeat fire_at commits but
its Cargo.toml entry was lost during rebase conflict resolution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: rustfmt fix for chained method call

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(heartbeat): add fire_at scheduling and DST safety tests

- test_default_config_has_no_fire_at: interval-based default unchanged
- test_with_fire_at_builder: builder sets time and timezone
- test_duration_until_next_fire_is_bounded: result always 1s–24h
- test_duration_until_next_fire_dst_timezone_no_panic: US Eastern DST
- test_resolved_tz_defaults_to_utc: missing timezone falls back to UTC
- test_resolved_tz_parses_iana: IANA string resolves correctly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(heartbeat): restore drift-free interval, add settings.json fallback for fire_at

- Interval path: restore tokio::time::interval (drift-free) instead of
  tokio::time::sleep which drifts by loop body execution time
- fire_at config: fall back to settings.heartbeat.fire_at when
  HEARTBEAT_FIRE_AT env var is not set, consistent with other settings

Addresses Gemini Code Assist review feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: IronClaw <deploy@agentiff.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…1069)

* fix(worker): prevent orphaned tool_results and fix parallel merging

Two fixes for tool result handling in the Worker:

1. Preserve reasoning text from select_tools() in the RespondResult
   content field so it appears in the assistant_with_tool_calls message
   pushed by execute_tool_calls. Without this, the LLM's reasoning
   context was lost when using the select_tools path.

2. Merge consecutive tool_result messages into a single User message
   in rig_adapter's convert_messages(). When parallel tools execute,
   each produces a separate ChatMessage with role: Tool. Without
   merging, these become consecutive User messages which Anthropic
   rejects. Now consecutive tool results are merged into one User
   message with multiple ToolResult content items.

Includes regression tests for both fixes.
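The merge in fix 2 can be sketched as below. Message types are illustrative stand-ins, and the sketch assumes (simplification) that plain User messages never directly precede tool results.

```rust
// Sketch of convert_messages(): consecutive Tool-role messages collapse into
// a single User message with multiple tool results, the shape Anthropic accepts.
#[derive(Debug, PartialEq)]
enum Msg {
    User(Vec<String>), // one or more tool results (simplified)
    Assistant(String),
    Tool(String),
}

fn convert_messages(input: Vec<Msg>) -> Vec<Msg> {
    let mut out: Vec<Msg> = Vec::new();
    for msg in input {
        match msg {
            Msg::Tool(result) => match out.last_mut() {
                // Extend the User message created for the previous tool result.
                Some(Msg::User(results)) => results.push(result),
                _ => out.push(Msg::User(vec![result])),
            },
            other => out.push(other),
        }
    }
    out
}

fn main() {
    let merged = convert_messages(vec![
        Msg::Assistant("calling two tools".into()),
        Msg::Tool("result A".into()),
        Msg::Tool("result B".into()), // parallel tools: second result merges in
    ]);
    assert_eq!(merged.len(), 2); // not two consecutive User messages
    assert_eq!(merged[1], Msg::User(vec!["result A".into(), "result B".into()]));
}
```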

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(worker): use find_map for first non-empty reasoning extraction

The previous code only checked the first ToolSelection's reasoning,
missing cases where the first selection has empty reasoning but
subsequent ones do not. Switch to find_map to get the first non-empty
reasoning across all selections.
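The `find_map` shape of the fix, with an illustrative `ToolSelection` stand-in:

```rust
// Take the first non-empty reasoning across all selections, instead of only
// inspecting selections[0].
struct ToolSelection {
    reasoning: String,
}

fn first_reasoning(selections: &[ToolSelection]) -> Option<&str> {
    selections.iter().find_map(|s| {
        let r = s.reasoning.trim();
        (!r.is_empty()).then_some(r)
    })
}

fn main() {
    let sels = vec![
        ToolSelection { reasoning: String::new() }, // first is empty; old code gave up here
        ToolSelection { reasoning: "need the file list first".to_string() },
    ];
    assert_eq!(first_reasoning(&sels), Some("need the file list first"));
    assert_eq!(first_reasoning(&[]), None);
}
```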

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ead (#1213)

* fix(llm): persist refreshed Anthropic OAuth token after Keychain re-read (#1136)

The Anthropic OAuth provider stored its token as an immutable SecretString.
When a 401 triggered a Keychain re-read, the fresh token was used for a
single retry but never persisted — every subsequent request reused the
expired original token, causing repeated auth failures.

Changes:
- Wrap token in RwLock<SecretString> so it can be updated after refresh
- Persist refreshed token via update_token() on successful retry
- Add 500ms delay before Keychain re-read to give Claude Code time to
  complete its async token refresh write (reduces race window)
- Add regression test verifying token updates persist across reads

Closes #1136
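The mutability fix can be sketched as below, with a plain `String` behind `std::sync::RwLock` standing in for `SecretString`; the Keychain re-read and 401 retry themselves are elided.

```rust
use std::sync::RwLock;

// Sketch: the token lives behind a RwLock so a 401 retry can persist the
// re-read value for all later requests, instead of reusing an immutable
// (and now expired) original.
struct OAuthProvider {
    token: RwLock<String>,
}

impl OAuthProvider {
    fn current_token(&self) -> String {
        self.token.read().unwrap().clone()
    }

    // Called after a successful retry with a freshly re-read Keychain token.
    fn update_token(&self, fresh: String) {
        *self.token.write().unwrap() = fresh;
    }
}

fn main() {
    let p = OAuthProvider { token: RwLock::new("expired-token".to_string()) };
    assert_eq!(p.current_token(), "expired-token");
    p.update_token("fresh-token".to_string());
    // Subsequent requests see the refreshed token, not the original:
    assert_eq!(p.current_token(), "fresh-token");
}
```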

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: fix formatting

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… race errors (#1068)

* fix(jobs): make completed->completed transition idempotent to prevent race errors

Both execution_loop and the worker wrapper in execute() can race to call
mark_completed(). Previously the second call hit "Cannot transition from
completed to completed" and errored the job despite successful completion.

This narrowly allows only the Completed->Completed self-transition as
idempotent (early return with debug log, no duplicate history entry).
All other self-transitions remain rejected to preserve state machine
strictness.
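The narrow carve-out can be sketched as a state-machine guard. The enum and error shapes are illustrative; only the Completed-to-Completed case returns early, and every other self-transition is still rejected.

```rust
// Sketch: Completed -> Completed is idempotent (both execution_loop and
// execute() may race to mark_completed()); all other self-transitions error.
#[derive(Clone, Copy, PartialEq, Debug)]
enum JobState {
    Running,
    Completed,
    Failed,
}

fn transition(current: JobState, next: JobState) -> Result<JobState, String> {
    if current == next {
        if current == JobState::Completed {
            // Second racing caller: no-op, no duplicate history entry.
            return Ok(current);
        }
        return Err(format!("Cannot transition from {:?} to {:?}", current, next));
    }
    Ok(next) // real code validates the full transition table here
}

fn main() {
    assert_eq!(transition(JobState::Completed, JobState::Completed), Ok(JobState::Completed));
    assert!(transition(JobState::Running, JobState::Running).is_err());
    assert_eq!(transition(JobState::Running, JobState::Failed), Ok(JobState::Failed));
}
```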

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix assert! formatting in idempotent completion test

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(sandbox): add retry logic for transient container failures (#1224)

SandboxManager::execute_with_policy() had no retry logic. Transient Docker
errors (daemon temporarily unavailable, container creation race conditions,
container start failures) caused immediate job failure.

Adds up to 2 retries (3 total attempts) with exponential backoff (2s, 4s)
for transient error types only:
- DockerNotAvailable
- ContainerCreationFailed
- ContainerStartFailed

Non-transient errors (Timeout, ExecutionFailed, NetworkBlocked, Config)
are returned immediately without retry.

Container cleanup on retry is safe: ContainerRunner::execute() always
force-removes the container before returning.

Closes #1224
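The retry classification and attempt budget can be sketched as follows (3 attempts total; the 2s/4s backoff sleeps are elided so the sketch runs instantly). Error variants come from the commit message; `ContainerCreationFailed` is omitted here but would also be transient.

```rust
// Sketch: transient Docker errors retry up to 3 total attempts; everything
// else fails immediately.
#[derive(Clone, Copy, Debug, PartialEq)]
enum SandboxError {
    DockerNotAvailable,
    ContainerStartFailed,
    Timeout,
    NetworkBlocked,
}

fn is_transient(e: SandboxError) -> bool {
    matches!(e, SandboxError::DockerNotAvailable | SandboxError::ContainerStartFailed)
}

fn execute_with_retry<F>(mut attempt: F) -> Result<String, SandboxError>
where
    F: FnMut(u32) -> Result<String, SandboxError>,
{
    let mut last = SandboxError::Timeout;
    for n in 0..3 {
        match attempt(n) {
            Ok(out) => return Ok(out),
            Err(e) if is_transient(e) => last = e, // real code sleeps 2s, then 4s
            Err(e) => return Err(e),               // non-transient: no retry
        }
    }
    Err(last)
}

fn main() {
    // Fails twice with a transient error, then succeeds on the third attempt.
    let mut calls = 0;
    let result = execute_with_retry(|n| {
        calls += 1;
        if n < 2 { Err(SandboxError::DockerNotAvailable) } else { Ok("done".to_string()) }
    });
    assert_eq!(result, Ok("done".to_string()));
    assert_eq!(calls, 3);

    // Non-transient errors are returned immediately.
    let mut calls2 = 0;
    let r2 = execute_with_retry(|_| {
        calls2 += 1;
        Err::<String, _>(SandboxError::NetworkBlocked)
    });
    assert_eq!(r2, Err(SandboxError::NetworkBlocked));
    assert_eq!(calls2, 1);
}
```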

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: fix formatting

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…#1119) (#1203)

Unify config resolution with Settings fallback (Phase 2)
Route messages and replies to the correct Telegram forum topic via
message_thread_id. Key behaviors:

- Parse message_thread_id, is_topic_message, is_forum from incoming updates
- Thread agent sessions by "chat_id:topic_id" for forum groups only
  (non-forum reply threads are excluded via is_forum guard)
- Pass message_thread_id through all send methods (text, photo, document)
- Normalize thread_id=1 (General topic) to None for sendMessage/sendPhoto/
  sendDocument since Telegram rejects it, but preserve it for sendChatAction
  where Telegram requires it for typing indicators
- Hoist bot_username workspace read to avoid duplicate WASM host call per
  group message
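The General-topic normalization above reduces to one small pure function:

```rust
// thread_id 1 (the General topic) is dropped for sendMessage/sendPhoto/
// sendDocument, which Telegram rejects it for, but kept for sendChatAction,
// which needs it for topic-scoped typing indicators.
fn normalize_thread_id(thread_id: Option<i64>, for_chat_action: bool) -> Option<i64> {
    match thread_id {
        Some(1) if !for_chat_action => None, // General topic: omit the field
        other => other,
    }
}

fn main() {
    assert_eq!(normalize_thread_id(Some(1), false), None);
    assert_eq!(normalize_thread_id(Some(1), true), Some(1));
    assert_eq!(normalize_thread_id(Some(42), false), Some(42));
    assert_eq!(normalize_thread_id(None, false), None);
}
```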

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…03-16 05:35 UTC) (#1236)

* refactor(setup): extract init logic from wizard into owning modules (#1210)


* fix: Telegram bot token validation fails intermittently (HTTP 404) (#1166)


---------

Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Nick Pismenkov <50764773+nickpismenkov@users.noreply.github.com>
* feat(telegram): verify owner during hot activation

* fix(ci): satisfy no-panics and clippy checks

* fix(web): preserve relay activation status

* fix(telegram): redact setup errors

* fix(telegram): require owner verification code

* fix(telegram): allow code in conversational dm
#1255)

* fix: web/CLI routine mutations do not refresh live event trigger cache

* review fix
Base automatically changed from staging-promote/e74214dc-23104855330 to main March 16, 2026 20:28
…2222

chore: promote staging to staging-promote/946c040f-23134229055 (2026-03-16 15:23 UTC)
…9055

chore: promote staging to staging-promote/57c397bd-23120362128 (2026-03-16 08:20 UTC)
@github-actions github-actions bot added size: XL 500+ changed lines risk: high Safety, secrets, auth, or critical infrastructure and removed size: M 50-199 changed lines risk: medium Business logic, config, or moderate-risk modules labels Mar 16, 2026
@nickpismenkov nickpismenkov force-pushed the staging-promote/e0f393bf-23105705354 branch from 549a495 to e397546 Compare March 16, 2026 20:57
nickpismenkov and others added 6 commits March 16, 2026 14:47
Resolved merge conflicts in 5 files:

1. src/agent/job_monitor.rs - Used is_internal flag approach (HEAD) for safe internal message marking. Removed metadata-based approach which could be spoofed by external channels.

2. src/agent/agent_loop.rs - Used is_internal check (HEAD) for routing internal messages, consistent with security model where is_internal field cannot be spoofed.

3. src/agent/dispatcher.rs - Included notify_metadata in job context (main), needed for job routing through JobMonitorRoute.

4. src/setup/wizard.rs - Added build_nearai_model_fetch_config() function (main) for model selection during setup.

5. src/tools/builtin/job.rs - Used both comments from HEAD (clarifying notify_channel and notify_user logic) while removing metadata field from JobMonitorRoute (consistent with job_monitor.rs).

All conflicts were resolved with a security-first approach: use the is_internal boolean field for internal message marking (it cannot be spoofed) while passing routing metadata through context.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Remove duplicate build_nearai_model_fetch_config() definition from setup/wizard.rs
  (function already exists in llm/models.rs and is imported)
- Add missing cheap_model and smart_routing_cascade fields to LlmConfig
  initializer in build_nearai_model_fetch_config() (llm/models.rs)
- Pass request_timeout_secs to create_registry_provider() call
  (llm/mod.rs:432)

All clippy checks pass with zero warnings (--no-default-features --features libsql).

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
* fix staging CI coverage regressions

* ci: cover all e2e scenarios in staging

* ci: restrict staging PR checks and fix webhook assertions

* ci: keep code style checks on PRs

* ci: preserve e2e PR coverage

* test: stabilize staging e2e coverage

* fix: propagate postgres tls builder errors

chore: promote staging to staging-promote/878a67cd-23166116689 (2026-03-16 22:08 UTC)

chore: promote staging to staging-promote/e0f393bf-23105705354 (2026-03-16 21:11 UTC)
@github-actions bot added the labels `scope: tool/wasm` (WASM tool sandbox), `scope: db/postgres` (PostgreSQL backend), and `scope: db/libsql` (libSQL / Turso backend) on Mar 16, 2026
henrypark133 and others added 10 commits March 16, 2026 16:05
* fix staging CI coverage regressions

* ci: cover all e2e scenarios in staging

* ci: restrict staging PR checks and fix webhook assertions

* ci: keep code style checks on PRs

* ci: preserve e2e PR coverage

* test: stabilize staging e2e coverage

* fix: propagate postgres tls builder errors

* ci: isolate heavy integration tests

* fix: clean up heavy integration CI follow-up
* fix: misleading UI message

* review fixes

* review fixes

* enhance test

chore: promote staging to staging-promote/1f209db0-23170138026 (2026-03-16 23:13 UTC)

chore: promote staging to staging-promote/e0f393bf-23105705354 (2026-03-16 23:06 UTC)
* test(e2e): fix approval waiting regression coverage

* test(e2e): address Copilot review notes
* Fix Telegram auto-verify flow and routing

* Fix CI formatting and clippy follow-ups

* Simplify Telegram waiting state update

* Fix notification fallback scopes

* Fix message metadata routing and zh-CN copy

chore: promote staging to staging-promote/90655277-23176260323 (2026-03-17 03:24 UTC)

chore: promote staging to staging-promote/e0f393bf-23105705354 (2026-03-17 02:56 UTC)
@github-actions bot added the `scope: tool` (Tool infrastructure) label on Mar 17, 2026
@henrypark133 (Collaborator) commented:

Flaky OAuth wildcard callback tests race on OAUTH_CALLBACK_HOST in CI #1280
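A common cause of this kind of flake is parallel tests mutating a process-wide environment variable. The sketch below is illustrative only, not the project's actual test code: `with_callback_host` is a hypothetical helper showing the usual fix of serializing env-var access behind a shared lock.

```rust
// Sketch: Rust's test harness runs #[test] functions on parallel threads by
// default, while std::env::set_var mutates state shared by the whole process.
// Two tests setting OAUTH_CALLBACK_HOST concurrently can observe each other's
// values, which shows up as intermittent CI failures.

use std::env;
use std::sync::Mutex;

// All tests that touch OAUTH_CALLBACK_HOST take this lock first.
static ENV_LOCK: Mutex<()> = Mutex::new(());

// Hypothetical helper: run `f` with OAUTH_CALLBACK_HOST set to `host`,
// holding the lock so no concurrent test can overwrite the variable,
// then restore the previous value.
fn with_callback_host<T>(host: &str, f: impl FnOnce() -> T) -> T {
    let _guard = ENV_LOCK.lock().unwrap();
    let prev = env::var("OAUTH_CALLBACK_HOST").ok();
    // set_var is `unsafe` in newer Rust editions precisely because of this
    // cross-thread hazard; the lock above is what makes the usage sound here.
    unsafe { env::set_var("OAUTH_CALLBACK_HOST", host) };
    let out = f();
    match prev {
        Some(v) => unsafe { env::set_var("OAUTH_CALLBACK_HOST", v) },
        None => unsafe { env::remove_var("OAUTH_CALLBACK_HOST") },
    }
    out
}
```

Alternatives are running the affected tests with `--test-threads=1`, or passing the callback host through explicit configuration instead of the process environment.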

@henrypark133 merged commit deee24c into main on Mar 17, 2026
33 of 35 checks passed
@henrypark133 deleted the staging-promote/e0f393bf-23105705354 branch on March 17, 2026 03:39

Labels

- `contributor: core` (20+ merged PRs)
- `risk: high` (Safety, secrets, auth, or critical infrastructure)
- `scope: agent` (Agent core: agent loop, router, scheduler)
- `scope: channel/cli` (TUI / CLI channel)
- `scope: channel/wasm` (WASM channel runtime)
- `scope: channel/web` (Web gateway channel)
- `scope: channel` (Channel infrastructure)
- `scope: ci` (CI/CD workflows)
- `scope: config` (Configuration)
- `scope: db/libsql` (libSQL / Turso backend)
- `scope: db/postgres` (PostgreSQL backend)
- `scope: db` (Database trait / abstraction)
- `scope: dependencies` (Dependency updates)
- `scope: docs` (Documentation)
- `scope: extensions` (Extension management)
- `scope: llm` (LLM integration)
- `scope: sandbox` (Docker sandbox)
- `scope: secrets` (Secrets management)
- `scope: setup` (Onboarding / setup)
- `scope: tool/builtin` (Built-in tools)
- `scope: tool/wasm` (WASM tool sandbox)
- `scope: tool` (Tool infrastructure)
- `scope: worker` (Container worker)
- `size: XL` (500+ changed lines)
- `staging-promotion`
