
chore: promote staging to main (2026-03-18 23:07 UTC) #1387

Merged
henrypark133 merged 17 commits into main from staging-promote/ec04354c-23271447493
Mar 19, 2026

Conversation

@ironclaw-ci (bot, Contributor) commented Mar 18, 2026

Auto-promotion from staging CI

Batch range: 428303af1128e7f124ad623fc1338393a4d06fcc..ec04354c6b031ff45b10c88592813f9b01564a22
Promotion branch: staging-promote/ec04354c-23271447493
Base: main
Triggered by: Staging CI batch at 2026-03-18 23:07 UTC

Commits in this batch (13):

Current commits in this promotion (11)

Current base: main
Current head: staging-promote/ec04354c-23271447493
Current range: origin/main..origin/staging-promote/ec04354c-23271447493

Auto-updated by staging promotion metadata workflow

Waiting for gates:

  • Tests: pending
  • E2E: pending
  • Claude Code review: pending (will post comments on this PR)

Auto-created by staging-ci workflow

henrypark133 and others added 2 commits March 18, 2026 15:33
…1374)

* fix: full_job routine runs stay running until linked job completion (#1317)

Previously, execute_full_job() returned RunStatus::Ok immediately after
dispatching the job, causing routine runs to be marked as completed before
the linked worker job had actually finished. This meant failure notifications
were never sent and max_concurrent guardrails stopped applying once the run
was prematurely finalized.

Changes:
- execute_full_job() now returns RunStatus::Running instead of Ok
- execute_routine() skips finalization for Running status (leaves run open)
- New sync_dispatched_runs() polls on each cron tick, checks linked job
  state, and finalizes runs when jobs reach terminal states
- New list_dispatched_routine_runs() DB method on both backends
- Deferred notifications are sent when the run is actually finalized
- consecutive_failures is preserved (not reset) while outcome is unknown
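
The status flow in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's code: only `RunStatus::Ok`, `RunStatus::Running`, and `execute_full_job()` are named in the commit message; the `Failed` variant, the `JobState` enum, and `finalize_if_done()` are hypothetical names introduced here.

```rust
/// Hypothetical run status; only `Ok` and `Running` appear in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RunStatus {
    Ok,
    Failed,
    Running,
}

/// Hypothetical state of the linked worker job.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobState {
    InProgress,
    Completed,
    Failed,
}

/// Dispatching a full job no longer reports success; the run stays open.
fn execute_full_job() -> RunStatus {
    RunStatus::Running
}

/// Returns `Some(final_status)` when the run can be finalized, or `None`
/// when it must stay open because the linked job has not finished yet.
fn finalize_if_done(status: RunStatus, job: JobState) -> Option<RunStatus> {
    match (status, job) {
        (RunStatus::Running, JobState::InProgress) => None,
        (RunStatus::Running, JobState::Completed) => Some(RunStatus::Ok),
        (RunStatus::Running, JobState::Failed) => Some(RunStatus::Failed),
        // Runs that were never dispatched keep the status they already have.
        (done, _) => Some(done),
    }
}
```

Returning `None` for an unfinished job is what keeps the run counted against max_concurrent and defers the failure notification until the outcome is known.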

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review feedback (watcher predicate, running_count safety)

- FullJobWatcher: use is_parallel_blocking() instead of is_active() so
  the watcher exits when a job reaches Completed (not terminal but
  finished executing). Fixes infinite-poll for routine jobs.
- Remove running_count decrement from sync_dispatched_runs() — in normal
  flow execute_routine() handles it; sync only runs for crash recovery
  where the counter is already 0.
- Update PR description to match actual FullJobWatcher behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sync only at startup to prevent double-completion race

- Move sync_dispatched_runs() out of cron loop into startup-only path.
  During normal operation FullJobWatcher handles finalization inline;
  running sync on every tick would race with the watcher.
- Update complete_dispatched_run() to properly advance runtime fields
  (last_run_at, next_fire_at, run_count) for crash recovery — in that
  scenario execute_routine() never reached its runtime update.
- Fix stale doc comment on complete_dispatched_run().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use boot_time filter for safe periodic sync of orphaned runs

- Add boot_time field to RoutineEngine, set to Utc::now() at creation.
- sync_dispatched_runs() now filters runs by started_at < boot_time,
  so it only processes orphans from a previous process — never races
  with FullJobWatcher instances from the current process.
- Move sync back into the cron loop (safe with boot_time filter) and
  run it BEFORE check_cron_triggers to avoid picking up freshly
  dispatched runs.
- Fix doc comments to match actual behavior.
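
A minimal sketch of the boot_time filter described above. It assumes a simplified `RoutineRun` record and uses `std::time::SystemTime` in place of the `Utc::now()` timestamp the commit mentions, so that the example stays free of external crates; `orphaned_runs()` is a hypothetical helper name.

```rust
use std::time::{Duration, SystemTime};

/// Simplified stand-in for a dispatched routine run (hypothetical shape).
struct RoutineRun {
    id: u32,
    started_at: SystemTime,
}

/// Only the field relevant to the filter is modeled here.
struct RoutineEngine {
    boot_time: SystemTime,
}

impl RoutineEngine {
    /// Keep only runs started before this process booted. Runs started
    /// afterwards belong to a live watcher and must not be touched.
    fn orphaned_runs<'a>(&self, runs: &'a [RoutineRun]) -> Vec<&'a RoutineRun> {
        runs.iter()
            .filter(|run| run.started_at < self.boot_time)
            .collect()
    }
}
```

Because the comparison is strict and boot_time is fixed at engine creation, a run dispatched by the current process can never pass the filter, which is why the periodic sync no longer races with the watcher.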

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Cache discovery_schema() with OnceLock for routine tools (fixes #1361, #1371)
- Early-return on empty event cache before allocating Vec (fixes #1369)
- Extract batch concurrent count query helper to reduce duplication
- Fix ROUTINE_OK sentinel substring matching
- Migrate crate::safety import to ironclaw_safety per project convention
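
The OnceLock caching in the first bullet might look roughly like this. It is a sketch under assumptions: the real `discovery_schema()` signature and return type are not shown on this page, so a `&'static str` is used here, and `build_schema()` and `BUILD_COUNT` are hypothetical names added only to make the single initialization observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the schema is built (illustration only).
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the expensive schema construction.
fn build_schema() -> String {
    BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
    String::from(r#"{"type":"object"}"#)
}

/// Builds the schema on first use and returns the cached value afterwards.
fn discovery_schema() -> &'static str {
    static SCHEMA: OnceLock<String> = OnceLock::new();
    SCHEMA.get_or_init(build_schema).as_str()
}
```

Every call after the first returns a reference to the same allocation, so repeated tool discovery avoids rebuilding the schema.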

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions (bot) added labels on Mar 18, 2026:

  • size: XL (500+ changed lines)
  • risk: medium (Business logic, config, or moderate-risk modules)
  • contributor: core (20+ merged PRs)
  • scope: agent (Agent core: agent loop, router, scheduler)
  • scope: tool/builtin (Built-in tools)
  • scope: db (Database trait / abstraction)
  • scope: db/postgres (PostgreSQL backend)
ilblackdragon and others added 4 commits March 18, 2026 16:18
* feat(gateway): full settings page polish with all tiers

- Backend: add ActiveConfigSnapshot to expose resolved LLM backend,
  model, and enabled channels via /api/gateway/status
- Add missing Agent settings (daily cost cap, actions/hour, local tools)
- Add Sandbox, Routines, Safety, Skills, and Search setting groups
- Settings import/export (JSON download + file upload)
- Active env defaults shown as placeholders in Inference settings
- Styled confirmation modals replace window.confirm() for remove actions
- Global restart banner persists across settings subtab switches
- Client-side validation with min/max constraints on number inputs
- Accessibility: aria-label on inputs, role=status on save indicators
- Settings search filters rows across current subtab
- Smooth CSS transitions for conditional field visibility (showWhen)
- Tunnel settings in Channels subtab
- Mobile responsive settings layout at 768px breakpoint
- i18n keys for toolbar, search, and import/export in en + zh-CN

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(gateway): polish settings page and remove registered tools debug section

Remove the "Registered Tools" table from the extensions tab (debug info
not useful to end users), clean up associated CSS/i18n/JS. Additional
settings page UI polish: extension card state styling, layout refinements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): address PR review feedback [skip-regression-check]

- Use refreshCurrentSettingsTab() in SSE event handlers to reduce duplication
- Remove unused formatGroupName/formatSettingLabel helpers
- Use i18n keys for MCP Configure/Reconfigure buttons
- Add data-i18n-placeholder to settings search input
- Remove data-i18n from confirm modal button (set dynamically by showConfirmModal)
- Fix cargo fmt in main.rs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): update tests for unified settings tab layout [skip-regression-check]

- Update TABS list: replace extensions/skills with settings
- Add settings_subtab/settings_subpanel selectors to helpers
- Update test_connection, test_skills, test_extensions, test_wasm_lifecycle
  to navigate via Settings > subtab instead of top-level tabs
- Move MCP card tests to use go_to_mcp() helper (MCP is now a separate subtab)
- Remove tools table tests and mock_ext_apis tools= parameter
- Fix CSP violation: replace inline onclick on confirm modal cancel button

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): address second round of PR review feedback [skip-regression-check]

- Use I18n.t() for MCP empty state, export/import toasts, confirm modal
- Fix CLI channel card using wrong channel key ('repl' -> 'cli')
- Fix settings search counting hidden rows as visible
- Add aria-label i18n for settings search input
- Add common.loadFailed i18n key (en + zh-CN)
- Update E2E tests: WASM channel tests use Channels subtab,
  remove tests use custom confirm modal instead of window.confirm

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): fix WASM channel card selector and skills remove confirm [skip-regression-check]

- WASM channel tests: filter by display name to avoid matching built-in
  channel cards in the Channels subtab
- Skills remove test: click confirm modal button instead of using
  window.confirm (skill removal now uses custom confirm modal)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): address third round of PR review feedback [skip-regression-check]

- approval_needed SSE: refresh any active settings subtab, not just
  Extensions — approvals can surface from Channels/MCP setup flows too
- renderCardsSkeleton: remove nested .extensions-list wrapper that
  caused skeleton cards to render constrained inside grid cells

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): fix auth_completed reload test race condition [skip-regression-check]

Use expect_response to deterministically wait for the /api/extensions
reload triggered by handleAuthCompleted → refreshCurrentSettingsTab,
instead of a fixed 600ms sleep that was too short under CI load.
Also remove stale /api/extensions/tools route handler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(e2e): debug auth_completed reload test with function counter [skip-regression-check]

Inject a counter wrapper around refreshCurrentSettingsTab to verify it's
actually called, and wait for the async fetch to complete before
asserting the reload count.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(gateway): localize all settings labels, descriptions, and channel cards [skip-regression-check]

Move 120+ hardcoded strings in settings definitions (INFERENCE_SETTINGS,
AGENT_SETTINGS, NETWORKING_SETTINGS) and channel card labels to i18n
keys. Render functions now resolve labels via I18n.t() so the settings
page translates when switching locales.

Covers: group titles, setting labels/descriptions, built-in channel
names/descriptions, and the "No settings found" empty state.

Both en.js and zh-CN.js updated with all new cfg.* and channels.* keys.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): localize remaining hardcoded UI strings [skip-regression-check]

- Fix export error toast using wrong i18n key (importFailed → exportFailed)
- Replace "Failed to load settings:" with I18n.t('common.loadFailed')
- Localize renderBuiltinChannelCard: "Built-in", "Active", "Inactive"
- Localize settings placeholders: "env: ", "env default", "use env default"
- Localize "✓ Saved" indicator
- Add new i18n keys to both en.js and zh-CN.js

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): confirm modal a11y, Esc/click-outside, search guard [skip-regression-check]

- Add role="dialog", aria-modal="true", aria-labelledby to confirm modal
- Focus confirm button when modal opens
- Close modal on Escape key or overlay click
- Skip settings search on non-settings panels (Extensions/MCP/Skills/Channels)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): boolean tri-state, search reset on subtab switch, stale model suggestions [skip-regression-check]

Address PR review feedback:
- Boolean settings now use a tri-state select (env default / On / Off)
  instead of a checkbox, matching the pattern used by other select settings
  and allowing users to revert to the env default
- Clear search input when switching settings subtabs so stale filters
  don't carry over to the new panel
- Always assign model suggestions (even empty array) so stale IDs from a
  previous successful /v1/models fetch don't persist when the endpoint
  later returns empty

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): auth_completed handler, bedrock_cross_region select, integer-only number inputs [skip-regression-check]

Address PR review feedback:
- auth_completed SSE listener now delegates to handleAuthCompleted(data)
  instead of inlining logic with a bare closeConfigureModal() call, so
  only the matching extension's modal is dismissed
- bedrock_cross_region changed from free text to select with the four
  valid values (us/eu/apac/global), matching backend validation
- Number settings now use step=1 and parseInt() instead of parseFloat(),
  preventing fractional values that the backend (u32/u64) would reject
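
For context, the backend-side constraints that these UI changes mirror could be sketched as below. Both function names are hypothetical and the real validation code is not shown on this page; the sketch only illustrates why fractional input fails for a `u32` field and which four region values are accepted.

```rust
/// Hypothetical parser for an unsigned integer setting.
/// Fractional or negative input is rejected, as it would be for a u32 field.
fn parse_u32_setting(raw: &str) -> Option<u32> {
    raw.trim().parse::<u32>().ok()
}

/// Hypothetical check for the four values named in the commit message.
fn is_valid_cross_region(value: &str) -> bool {
    matches!(value, "us" | "eu" | "apac" | "global")
}
```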

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix: remove debug_assert guards that panic on valid error paths (#1312)

Two debug_assert! calls added in #1312 fire on expected runtime error
paths (not programmer bugs), turning graceful error returns into panics
in debug/test builds:

- state.rs: Completed→Cancelled is a user-facing error handled by
  transition_to() returning Err — not a bug
- execute.rs: empty tool_name from malformed LLM output is handled by
  ToolError::NotFound — not a bug

Removes both asserts; keeps the circuit-breaker assert (genuinely guards
a caller invariant).
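
The distinction drawn above, expected runtime errors versus caller invariants, can be illustrated with a small sketch. `ToolError::NotFound` is named in the commit message; its payload, the `lookup_tool()` function, and the single registered tool are assumptions made for the example.

```rust
/// Simplified error type; the String payload is an assumption.
#[derive(Debug, PartialEq)]
enum ToolError {
    NotFound(String),
}

/// Hypothetical lookup. An empty name from malformed LLM output is an
/// expected runtime input, so it flows through the normal error path.
/// No debug_assert!(!tool_name.is_empty()) is placed here, because that
/// would turn the graceful Err into a panic in debug and test builds.
fn lookup_tool(tool_name: &str) -> Result<&'static str, ToolError> {
    match tool_name {
        "echo" => Ok("echo"),
        other => Err(ToolError::NotFound(other.to_string())),
    }
}
```

A regression test can then assert the specific variant rather than only `is_err()`.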

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: tighten empty tool name test to assert ToolError::NotFound variant

Address review feedback: assert the specific error variant instead of
just is_err() so the regression test actually enforces the expected
error path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: cargo fmt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
chore: sync main and staging
…3042

chore: promote staging to staging-promote/ec04354c-23271447493 (2026-03-19 00:12 UTC)
zmanian and others added 8 commits March 18, 2026 20:37
* feat(telegram): support automatic splitting of large messages

* fix(telegram): strengthen split_message test assertion

Replace word-by-word contains check with assert_eq! on rejoined chunks,
ensuring split_message preserves content exactly.

send_response is still used (lines 745, 753) so it is intentionally kept.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(telegram): add missing split_message tests and document limitations

- Add test for sentence-boundary splitting
- Add test for hard-cut on pathological input (no spaces)
- Add test for multi-byte character safety (emoji)
- Document CJK sentence punctuation limitation
- Document trim behavior at chunk boundaries
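
To make the tested properties concrete, here is a simplified splitter. It is not the project's `split_message`: the signature and byte-based limit are assumptions, it breaks at whitespace rather than sentence boundaries, and it does not trim chunks, so the concatenated chunks equal the input exactly.

```rust
/// Split `text` into chunks of at most `max_bytes` bytes each, preferring
/// to break after whitespace and never cutting inside a UTF-8 character.
fn split_message(text: &str, max_bytes: usize) -> Vec<&str> {
    assert!(max_bytes >= 4, "limit must fit any single UTF-8 character");
    let mut chunks = Vec::new();
    let mut rest = text;
    while rest.len() > max_bytes {
        // Hard-cut position: the largest char boundary within the limit.
        let mut cut = max_bytes;
        while !rest.is_char_boundary(cut) {
            cut -= 1;
        }
        // Prefer to break just after the last whitespace in the window.
        if let Some((i, c)) = rest[..cut]
            .char_indices()
            .rev()
            .find(|(_, c)| c.is_whitespace())
        {
            cut = i + c.len_utf8();
        }
        let (head, tail) = rest.split_at(cut);
        chunks.push(head);
        rest = tail;
    }
    if !rest.is_empty() {
        chunks.push(rest);
    }
    chunks
}
```

Input without spaces falls back to the hard cut, and multi-byte characters such as emoji are kept whole because the cut only ever lands on a char boundary.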

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: re-trigger CI with latest changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Hans <me@hans00.me>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(testing): add FaultInjector framework for StubLlm (#1220)

Adds a configurable fault injection framework for testing retry, failover,
and circuit breaker behavior. The FaultInjector attaches to StubLlm and
provides per-call control over failure type, timing, and sequencing.

Components:
- FaultType: maps to LlmError variants (RequestFailed, RateLimited,
  AuthFailed, InvalidResponse, IoError, ContextLengthExceeded, SessionExpired)
- FaultAction: Succeed, Fail(FaultType), Delay(Duration)
- FaultMode: SequenceOnce (play then succeed), SequenceLoop (repeat forever),
  Random (seeded xorshift64 PRNG for reproducibility)
- FaultInjector: thread-safe (AtomicU32 counter + Mutex RNG)

Integration:
- StubLlm gains optional fault_injector field via with_fault_injector()
- When set, takes precedence over should_fail/error_kind
- Backward compatible: existing StubLlm usage unchanged
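
The two sequence modes can be sketched as follows. The type names come from the description above, but the field layout, the `next_action()` method, and the reduced `FaultType` list are assumptions; the Random mode and the StubLlm wiring are left out.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::time::Duration;

/// Reduced to two of the variants listed in the commit message.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FaultType {
    RateLimited,
    RequestFailed,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum FaultAction {
    Succeed,
    Fail(FaultType),
    Delay(Duration),
}

enum FaultMode {
    /// Play the sequence once, then always succeed.
    SequenceOnce(Vec<FaultAction>),
    /// Repeat the sequence forever.
    SequenceLoop(Vec<FaultAction>),
}

struct FaultInjector {
    mode: FaultMode,
    calls: AtomicU32,
}

impl FaultInjector {
    fn new(mode: FaultMode) -> Self {
        Self { mode, calls: AtomicU32::new(0) }
    }

    /// Hypothetical per-call hook: returns the action for this call.
    fn next_action(&self) -> FaultAction {
        let n = self.calls.fetch_add(1, Ordering::SeqCst) as usize;
        match &self.mode {
            FaultMode::SequenceOnce(seq) => seq.get(n).copied().unwrap_or(FaultAction::Succeed),
            FaultMode::SequenceLoop(seq) if seq.is_empty() => FaultAction::Succeed,
            FaultMode::SequenceLoop(seq) => seq[n % seq.len()],
        }
    }
}
```

The atomic counter lets `next_action()` take `&self`, so one injector can be shared across concurrent calls without a lock around the sequence position.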

Closes #1220

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(testing): address review feedback on FaultInjector

- Remove redundant .abs() in random fault comparison
- Extract check_faults() helper to DRY up StubLlm methods
- Guard xorshift seed=0 (fixed point) by mapping to 1
- Add StubLlm integration test (stub_llm_fault_injector_sequence)
- Remove dead seed field from FaultMode::Random
- Move pub mod fault_injection to top of mod.rs
- Add Debug impl for FaultInjector
- Add empty_sequence_always_succeeds test
- Add random_seed_zero_does_not_always_fail test

* fix(testing): address #1233 review -- seed-0 bug, reset(), Debug derive

- Store seed in FaultMode::Random so reset() can re-init the RNG
- Add reset() method for test reproducibility (re-seeds RNG, zeros counter)
- Strengthen seed=0 regression test to 100 iterations with stricter assertion
- Add reset_restores_random_rng_from_stored_seed test
- Debug impl and empty_sequence test were already present from prior commit
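
The seed handling and reset() behavior might be sketched like this. The `XorShift64` struct and its methods are hypothetical; the shift constants 13, 7, 17 are a common xorshift64 choice and are an assumption about the project's generator.

```rust
/// Minimal xorshift64 generator that remembers its seed for reset().
struct XorShift64 {
    seed: u64,
    state: u64,
}

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // Zero is a fixed point of xorshift (it would yield 0 forever),
        // so it is mapped to 1.
        let seed = if seed == 0 { 1 } else { seed };
        Self { seed, state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }

    /// Restore the initial state so a test can replay the same sequence.
    fn reset(&mut self) {
        self.state = self.seed;
    }
}
```

Storing the (already guarded) seed alongside the state is what allows reset() to reproduce the exact sequence.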

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ci: re-trigger CI with latest changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: trigger new run with skip-regression-check label

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(testing): address PR #1233 review -- error_rate validation and edge cases

- Validate error_rate is in 0.0..=1.0 and not NaN (panics on invalid input)
- Fix error_rate==1.0 edge case: use <= instead of < so 1.0 always fails
- Add regression tests for error_rate validation (NaN, negative, >1.0)
- Add tests for error_rate boundary values (0.0 never fails, 1.0 always fails)
- Add delay action test using tokio::time::pause() for deterministic timing
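
A sketch of the validation and boundary behavior listed above. All three function names are hypothetical, and the explicit `error_rate > 0.0` guard is an assumption about how "0.0 never fails" is reconciled with the `<=` comparison; the actual implementation may differ.

```rust
/// True when the rate is a usable probability. NaN fails the range check
/// because every comparison with NaN is false.
fn is_valid_error_rate(rate: f64) -> bool {
    (0.0..=1.0).contains(&rate)
}

/// Panics on invalid input, mirroring the behavior the commit describes.
fn checked_error_rate(rate: f64) -> f64 {
    assert!(
        is_valid_error_rate(rate),
        "error_rate must be within 0.0..=1.0, got {}",
        rate
    );
    rate
}

/// `sample` is assumed to be drawn from [0.0, 1.0].
/// Using `<=` makes a rate of 1.0 fail on every sample, including 1.0.
fn should_fail(sample: f64, error_rate: f64) -> bool {
    error_rate > 0.0 && sample <= error_rate
}
```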

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(self-repair): wire stuck_threshold, store, and builder (#647)

Wire the previously dead-code fields in DefaultSelfRepair:

- stuck_threshold: detect_stuck_jobs() now filters by duration, only
  reporting jobs stuck longer than the configured threshold
- with_store(): wired in agent_loop.rs from AgentDeps.store for
  tool failure tracking via Database trait
- with_builder(): wired from register_builder_tool() return value
  through AppComponents and AgentDeps for automatic tool rebuilding
- tools: passed alongside builder for hot-reload logging

Remove all #[allow(dead_code)] annotations. Add regression tests for
threshold-based filtering (both above and below threshold).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add missing `builder` field to AgentDeps in gateway workflow harness

After rebase onto staging, AgentDeps gained a `builder` field for
self-repair tool rebuilding. The gateway workflow test harness was
missing this field, causing CI compilation failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: retrigger CI

* fix: force CI refresh after path_routing_tests dedup

* test: add E2E test for stuck job repair and tool rebuild cycle

Tests the full self-repair flow requested in review:
1. Job transitions Pending -> InProgress -> Stuck
2. detect_stuck_jobs() finds it (zero threshold)
3. repair_stuck_job() recovers it back to InProgress
4. A broken tool is repaired via MockBuilder
5. Verify builder was invoked and repair succeeded

Uses a MockBuilder (impl SoftwareBuilder) that returns successful
BuildResult without requiring an LLM or filesystem. Uses libsql
test database for the store (increment_repair_attempts, mark_tool_repaired).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(self-repair): measure stuck_duration from Stuck transition, not started_at

- Use ctx.transitions to find the most recent Stuck transition timestamp
  instead of ctx.started_at (which reflects job start, not stuck time)
- Fix StuckJob.last_activity to use stuck transition timestamp
- Remove misleading "hot-reloaded into registry" log
- Remove stray "// ci fix" comment in memory.rs
- Add regression test: backdated started_at must not inflate stuck_duration
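
The fix above can be illustrated with a simplified context type. Field names follow the commit message (`transitions`, `started_at`), but the struct shapes, the use of plain second counters as timestamps, and the `>=` threshold comparison are assumptions made for this sketch.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobState {
    Pending,
    InProgress,
    Stuck,
}

/// Hypothetical transition record with a timestamp in seconds.
struct Transition {
    to: JobState,
    at_secs: u64,
}

struct JobContext {
    /// Job start time; deliberately ignored by stuck_duration().
    started_at_secs: u64,
    transitions: Vec<Transition>,
}

/// Time since the most recent transition into Stuck, if there is one.
fn stuck_duration(ctx: &JobContext, now_secs: u64) -> Option<Duration> {
    ctx.transitions
        .iter()
        .rev()
        .find(|t| t.to == JobState::Stuck)
        .map(|t| Duration::from_secs(now_secs.saturating_sub(t.at_secs)))
}

/// A job is reported only once it has been stuck for at least `threshold`.
fn is_reportable(ctx: &JobContext, now_secs: u64, threshold: Duration) -> bool {
    stuck_duration(ctx, now_secs).map_or(false, |d| d >= threshold)
}
```

A job that started long ago but became stuck ten seconds ago yields a ten-second duration here, so a backdated start time cannot inflate the result.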

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* ci: re-trigger CI with latest changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add type annotation to Ok(()) in test to resolve E0282

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…tion (#1400)

- Add `builder: None` to AgentDeps initializer in e2e_telegram_message_routing
  test (field added in #712 but test not updated)
- Update go_to_extensions() in test_telegram_hot_activation to navigate via
  settings tab -> extensions subtab (extensions tab was moved to settings)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: navigate telegram E2E tests to channels subtab

wasm_channel extensions (like telegram) are now rendered in the
Settings → Channels subtab, not the Extensions subtab. Update
test_telegram_hot_activation to navigate there and use the correct
card selector. Also mock /api/gateway/status which loadChannelsStatus
fetches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: select telegram card by name, not first card in channels subtab

Built-in channel cards (Web Gateway, HTTP, etc.) render first in the
channels subtab content, so .first matches them instead of the
telegram extension card. Select by has_text="Telegram" to target
the correct card.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: make gateway_status_handler parameterizable in mock helper

Address review feedback: extract default gateway status handler and
accept an optional gateway_status_handler kwarg in mock_extension_lists
for test flexibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…6242

chore: promote staging to staging-promote/b9e5acf6-23283208580 (2026-03-19 15:15 UTC)
…8580

chore: promote staging to staging-promote/3dcccc1e-23280048384 (2026-03-19 06:44 UTC)
…8384

chore: promote staging to staging-promote/ec04354c-23271447493 (2026-03-19 04:37 UTC)
@github-actions (bot) added the scope: tool (Tool infrastructure) label on Mar 19, 2026
CPU-216 and others added 3 commits March 19, 2026 09:35
Bump registry version to pass check-version-bumps.sh after
channels-src/telegram/ changes.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…6661

chore: promote staging to staging-promote/ec04354c-23271447493 (2026-03-19 16:48 UTC)
@github-actions (bot) added the scope: ci (CI/CD workflows) label on Mar 19, 2026
@henrypark133 merged commit e1774e9 into main on Mar 19, 2026
35 checks passed
@henrypark133 deleted the staging-promote/ec04354c-23271447493 branch on March 19, 2026 17:35

Labels

  • contributor: core (20+ merged PRs)
  • risk: medium (Business logic, config, or moderate-risk modules)
  • scope: agent (Agent core: agent loop, router, scheduler)
  • scope: channel/web (Web gateway channel)
  • scope: ci (CI/CD workflows)
  • scope: db/postgres (PostgreSQL backend)
  • scope: db (Database trait / abstraction)
  • scope: tool/builtin (Built-in tools)
  • scope: tool (Tool infrastructure)
  • size: XL (500+ changed lines)
  • staging-promotion


4 participants