chore: promote staging to staging-promote/89203225-23327092672 (2026-03-20 04:32 UTC) #1451
Merged
henrypark133 merged 108 commits into main on Mar 25, 2026
The installer fails on systems with glibc < 2.35 (e.g. Amazon Linux 2023) because only gnu targets are built and there is no static fallback.
- Add x86_64-unknown-linux-musl and aarch64-unknown-linux-musl to the cargo-dist target list so the installer can fall back to statically linked binaries when glibc is too old.
- Switch rig-core from reqwest-tls (OpenSSL) to reqwest-rustls (pure Rust TLS) to avoid a system OpenSSL dependency that breaks musl builds.
Closes #1008
Address review feedback:
- Regenerate Cargo.lock to reflect rig-core reqwest-rustls switch, removing openssl-sys and native-tls from the dependency tree
- Add github-custom-runners entries for musl targets
* fix: restore libSQL vector search with dynamic embedding dimensions (#655) The V9 migration dropped the libsql_vector_idx and changed memory_chunks.embedding from F32_BLOB(1536) to BLOB, but the documented brute-force cosine fallback was never implemented. hybrid_search silently returned empty vector results — search was FTS5-only on libSQL. Add ensure_vector_index() which dynamically creates the vector index with the correct F32_BLOB(N) dimension, inferred from EMBEDDING_DIMENSION / EMBEDDING_MODEL env vars during run_migrations(). Uses _migrations version=0 as a metadata row to track the current dimension (no-op if unchanged, rebuilds table on dimension change). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: move safety comments above multi-line assertions for rustfmt stability Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: remove unnecessary safety comments from test code Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address review comments from PR #1393 [skip-regression-check] - Share model→dimension mapping via config::embeddings::default_dimension_for_model() instead of duplicating the match table (zmanian, Copilot) - Add dimension bounds check (1..=65536) to prevent overflow (zmanian, Copilot) - DROP stale memory_chunks_new before CREATE to handle crashed previous attempts (zmanian, Copilot) - Use plain INSERT instead of INSERT OR IGNORE to surface constraint errors (Copilot) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add missing builder field to AgentDeps in telegram routing test [skip-regression-check] The self-repair builder field was added to AgentDeps in #712 but this test was not updated. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address zmanian's second review on PR #1393 - Add tracing::info when resolve_embedding_dimension returns None (#2) - Document connection scoping for transaction safety (#1) - Document _rowid preservation for FTS5 consistency (#4) - Document precondition that migrations must run first (#5) - Note F32_BLOB dimension enforcement in insert_chunk (#3) - Add unit tests for resolve_embedding_dimension (#6) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
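The dimension resolution and bounds check described in this commit can be pictured with a minimal sketch. Only the function names and the 1..=65536 range come from the commit message. The model table entries, the precedence of an explicit dimension over the model default, and the `vector_column_type` helper are assumptions made for illustration; the real code reads EMBEDDING_DIMENSION / EMBEDDING_MODEL from the environment, whereas this sketch takes the raw values as parameters.

```rust
/// Hypothetical model-to-dimension table. The commit says the real mapping
/// lives in config::embeddings::default_dimension_for_model().
fn default_dimension_for_model(model: &str) -> Option<usize> {
    match model {
        "text-embedding-3-small" | "text-embedding-ada-002" => Some(1536),
        "text-embedding-3-large" => Some(3072),
        _ => None,
    }
}

/// Resolve the embedding dimension from the raw EMBEDDING_DIMENSION and
/// EMBEDDING_MODEL values (passed in here instead of read from the env).
/// Assumption: an explicit dimension wins over the model default.
fn resolve_embedding_dimension(dimension: Option<&str>, model: Option<&str>) -> Option<usize> {
    let explicit = dimension.and_then(|raw| raw.trim().parse::<usize>().ok());
    let dim = explicit.or_else(|| model.and_then(default_dimension_for_model))?;
    // Bounds check from the review feedback: only 1..=65536 is accepted.
    (1..=65_536).contains(&dim).then_some(dim)
}

/// Hypothetical helper: the column type used when (re)creating the index.
fn vector_column_type(dim: usize) -> String {
    format!("F32_BLOB({dim})")
}
```

Returning `None` for an unresolvable or out-of-range dimension matches the commit's note that the caller logs via tracing::info in that case.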
…outines (#769) * feat(db): add list_dispatched_routine_runs to RoutineStore trait Add method to query routine runs with status='running' AND job_id IS NOT NULL, enabling the routine engine to sync completion status from background jobs. Implements for both PostgreSQL and libSQL backends. [skip-regression-check] Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(routines): sync dispatched full-job runs with background job status (#697) Full-job routines were immediately marked Ok on dispatch, so failures/completions were never reflected in the routine run record. Now dispatch returns Running status, and a periodic sync checks linked jobs to update the run when the job completes, fails, or is cancelled. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(routines): fail fast when sandbox unavailable at dispatch time (#697) Thread sandbox_available bool from Docker detection through AgentDeps to RoutineEngine. Full-job routines now fail immediately with a clear error message when sandbox is enabled but Docker is not available, instead of dispatching a job that silently fails. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(startup): notify user when sandbox unavailable (#697) When sandbox is enabled but Docker is not installed or not running, send a user-visible warning through all channels at startup (with a 2s delay to let channels connect). Previously this was only logged via tracing::warn, invisible to TUI/web users. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * style: fix formatting in routine_engine.rs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tests): set sandbox_available=true in test rig for full_job traces Test rig doesn't use real Docker — full_job routines execute via trace replay. Setting sandbox_available=true allows the routine_news_digest trace test to dispatch full_job routines as before. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(routines): address review feedback on sync_dispatched_runs (#697) - Sanitize last_reason from job transitions before using in notifications (truncate to 500 chars, strip control characters) - Treat Submitted as in-progress (can still transition to Failed), only Completed and Accepted are terminal success states - Add test for sanitize_summary Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(tests): add missing sandbox_available field to test constructors Staging added sandbox_available to AgentDeps and RoutineEngine::new. Add the missing field/argument in test files to fix CI compilation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: sanitize job reason in notifications, fix state handling for Submitted/Accepted - Enhance sanitize_summary to strip HTML tags and collapse whitespace, preventing injection via untrusted container job reasons - Use char-boundary-safe truncation to avoid panics on multi-byte strings - Treat Submitted and Accepted as in-progress states (continue polling) rather than terminal success, since they can still transition to Failed - Increase channel-connect delay from 2s to 5s and add debug log for sandbox-unavailable warning delivery Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Replace sandbox_available bool with SandboxReadiness enum Distinguishes DisabledByConfig from DockerUnavailable so full-job routine errors give actionable guidance instead of a generic message. 
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: re-trigger CI with latest changes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: add missing owner_id arg to send_notification call Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: update e2e tests to use SandboxReadiness enum Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: ilblackdragon@gmail.com <ilblackdragon@gmail.com>
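The sanitization steps listed for sanitize_summary (strip HTML tags, strip control characters, collapse whitespace, truncate on a char boundary) could look roughly like the sketch below. The signature and the naive tag stripping are assumptions, not the project's implementation; the commit uses a 500-character limit, which is passed here as `max_chars`.

```rust
/// Sketch of a summary sanitizer for untrusted job reasons.
/// Assumption: tag stripping is naive, so an unmatched '<' swallows the rest
/// of the input. That errs on the side of dropping text.
fn sanitize_summary(raw: &str, max_chars: usize) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    let mut last_was_space = true; // starts true so leading whitespace is dropped

    for ch in raw.chars() {
        match ch {
            '<' => in_tag = true,
            '>' if in_tag => in_tag = false,
            _ if in_tag => {} // skip everything inside a tag
            c if c.is_whitespace() => {
                // Collapse any run of whitespace (including \n and \t) to one space.
                if !last_was_space {
                    out.push(' ');
                    last_was_space = true;
                }
            }
            c if c.is_control() => {} // drop remaining control characters
            c => {
                out.push(c);
                last_was_space = false;
            }
        }
    }

    // Truncate by chars, not bytes, so multi-byte text never splits mid-character.
    out.trim_end().chars().take(max_chars).collect()
}
```

Counting `chars()` rather than slicing by byte index is what keeps the truncation from panicking on multi-byte strings.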
Code review: found 5 issues.
Positive: SandboxReadiness enum well-designed, error handling uses .map_err() with context, UTF-8-safe, both database backends considered, test isolation via ENV_MUTEX. Generated with Claude Code
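For readers unfamiliar with the SandboxReadiness change, a rough sketch of the idea follows. DisabledByConfig and DockerUnavailable are named in the commit; the `Available` variant, the `check_full_job_dispatch` function, and the message wording are assumptions for illustration only.

```rust
/// Sketch of a readiness enum replacing a plain `sandbox_available: bool`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SandboxReadiness {
    /// Assumed variant: sandbox enabled and Docker reachable.
    Available,
    /// Sandbox turned off in configuration.
    DisabledByConfig,
    /// Sandbox enabled, but Docker is not installed or not running.
    DockerUnavailable,
}

/// Hypothetical dispatch guard: fail fast with guidance specific to the cause.
fn check_full_job_dispatch(readiness: SandboxReadiness) -> Result<(), String> {
    match readiness {
        SandboxReadiness::Available => Ok(()),
        SandboxReadiness::DisabledByConfig => Err(
            "full-job routines need the sandbox; enable it in the configuration".to_string(),
        ),
        SandboxReadiness::DockerUnavailable => Err(
            "sandbox is enabled but Docker is not available; install or start Docker".to_string(),
        ),
    }
}
```

With two distinct failure variants, the error text can tell the user which of the two fixes applies, which a single boolean cannot do.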
…rors (#1450)
* fix: f32→f64 precision artifact in temperature causes provider 400 errors
Direct f32-as-f64 preserves the binary representation, producing values like 0.699999988079071 instead of 0.7. Some OpenAI-compatible providers (e.g. Zhipu GLM-5) reject these with a 400 error. Add round_f32_to_f64() that formats to 6 decimal places before parsing back to f64.
* fix: address clippy redundant_closure lint (takeover #1418)
[skip-regression-check]
Co-Authored-By: Boomboomdunce <liweizhu0708@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: use numeric rounding, update doc comment, remove duplicate assertion
[skip-regression-check]
Address review feedback on #1450:
- Replace format!+parse with numeric rounding to avoid allocation
- Update doc comment to only mention temperature (not top_p)
- Remove duplicate assert_eq in test
Co-Authored-By: Boomboomdunce <liweizhu0708@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Boomboomdunce <liweizhu0708@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
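The numeric-rounding variant mentioned in the last commit can be sketched as below. The function name and the six-decimal precision come from the commit messages; the exact body is an assumption.

```rust
/// Widen an f32 to f64 and round to 6 decimal places, so that a value such as
/// 0.7_f32 serializes as 0.7 rather than 0.699999988079071.
/// Sketch only: numeric rounding, no string formatting or allocation.
fn round_f32_to_f64(value: f32) -> f64 {
    const SCALE: f64 = 1_000_000.0;
    (f64::from(value) * SCALE).round() / SCALE
}
```

The artifact appears because the nearest f32 to 0.7 is slightly below 0.7, and widening keeps that exact value. Rounding at six decimals is well within f32's roughly seven significant digits for temperature-sized values.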
* feat: port NPA psychographic profiling system into IronClaw
Port the complete psychographic profiling system from NPA into IronClaw,
including enriched profile schema, conversational onboarding, profile
evolution, and three-tier prompt augmentation.
Personal onboarding moved from wizard Step 9 to first assistant
interaction per maintainer feedback — the First Contact system prompt
block now instructs the LLM to conduct a natural onboarding conversation
that builds the psychographic profile via memory_write.
Changes:
- Enrich profile.rs with 5 new structs, 9-dimension analysis framework,
custom deserializers for backward compatibility, and rendering methods
- Add conversational onboarding engine with one-step-removed questioning
technique, personality framework, and confidence-scored profile generation
- Add profile evolution with confidence gating, analysis metadata tracking,
and weekly update routine
- Replace thin interaction style injection with three-tier system gated on
confidence > 0.6 and profile recency
- Replace wizard Step 9 with First Contact system prompt block that drives
conversational onboarding during the user's first interaction
- Add autonomy progression to SOUL.md seed and personality framework to
AGENTS.md seed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: replace chat-based onboarding with bootstrap greeting and workspace seeds
Remove the interactive onboarding_chat.rs engine in favor of a simpler
bootstrap flow: fresh workspaces get a proactive LLM greeting that
naturally profiles the user. Identity files are now seeded from
src/workspace/seeds/ instead of being hardcoded. Also removes the
identity-file write protection (seeds are now managed), adds routine
advisor integration, and includes an e2e trace for bootstrap greeting.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(safety): sanitize identity file writes via Sanitizer to prevent prompt injection
Identity files (SOUL.md, AGENTS.md, USER.md, IDENTITY.md) are injected into
every system prompt. Rather than hard-blocking writes (which broke onboarding),
scan content through the existing Sanitizer and reject writes with High/Critical
severity injection patterns. Medium/Low warnings are logged but allowed.
Also clarifies AGENTS.md identity file roles (USER.md = user info, IDENTITY.md =
agent identity) and adds IDENTITY.md setup as an explicit bootstrap step.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: update profile_onboarding_completed comment to reflect current wiring
The field is now actively used by the agent loop to suppress BOOTSTRAP.md
injection — remove the stale "not yet wired" TODO.
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(setup): use env_or_override for NEARAI_API_KEY in model fetch config
When the user authenticates via NEAR AI Cloud API key (option 4),
api_key_login() stores the key via set_runtime_env(). But
build_nearai_model_fetch_config() was using std::env::var() which
doesn't check the runtime overlay — so model listing fell back to
session-token auth and re-triggered the interactive NEAR AI
authentication menu.
Switch to env_or_override() which checks both real env vars and the
runtime overlay.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(agent): correct channel/user_id in bootstrap greeting persist call
persist_assistant_response was called with channel="default",
user_id="system" but the assistant thread was created via
get_or_create_assistant_conversation("default", "gateway") which owns
the conversation as user_id="default", channel="gateway". The mismatch
caused ensure_writable_conversation to reject the write with:
WARN Rejected write for unavailable thread id user=system channel=default
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(web): remove all inline event handlers for CSP compliance
The Content-Security-Policy header (added in f48fe95) blocks inline JS
via script-src 'self'. All onclick/onchange attributes in index.html
are replaced with getElementById().addEventListener() calls. Dynamic
inline handlers in app.js (jobs, routines, memory breadcrumb, code
blocks, TEE report) are replaced with data-action attributes and a
single delegated click handler on document.
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(agent): align bootstrap message user/channel and update fixture schema field
- Bootstrap IncomingMessage now uses ("default", "gateway") consistently
with persist and session registration calls
- Update bootstrap_greeting.json fixture: schema_version → version to
match current PROFILE_JSON_SCHEMA
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style: cargo fmt
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(safety): address PR review — expand injection scanning and harden profile sync
- BOOTSTRAP.md: fix target "profile" → "context/profile.json" so the
write hits the correct path and triggers profile sync
- IDENTITY_FILES: add context/assistant-directives.md to the scanned
set since it is also injected into the system prompt
- sync_profile_documents(): scan derived USER.md and assistant-directives
content through Sanitizer before writing, rejecting High/Critical
injection patterns
- profile_evolution_prompt(): wrap recent_messages_summary in <user_data>
delimiters with untrusted-data instruction to mitigate indirect
prompt injection
- routine-advisor skill: update cron examples from 6-field to standard
5-field format for consistency with routine_create tool docs
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* style: cargo fmt
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(setup): detect env-provided LLM keys during quick-mode onboarding
Quick-mode wizard now checks LLM_BACKEND, NEARAI_API_KEY,
ANTHROPIC_API_KEY, and OPENAI_API_KEY env vars to pre-populate
the provider setting, so users aren't re-prompted for credentials
they already supplied. Also teaches setup_nearai() to recognize
NEARAI_API_KEY from env (previously only checked session tokens).
Includes web UI cleanup (remove duplicate event listeners) and
e2e test response count adjustment.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(test): update routine_create_list to expect 7-field normalized cron
The cron normalizer now always expands to 7-field format, so the
stored schedule is "0 0 9 * * * *" not "0 0 9 * * *".
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
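A minimal sketch of the normalization this test expects is shown below. The 6-to-7 field expansion matches the example in the commit ("0 0 9 * * *" becomes "0 0 9 * * * *"). The handling of 5-field input (prepend a seconds field of 0, append a wildcard year) is an assumption based on a seconds-first, year-last field order.

```rust
/// Sketch: expand a cron expression to the 7-field form
/// `sec min hour day-of-month month day-of-week year`.
/// Assumption: 5-field input gets seconds = 0 and year = *.
fn normalize_cron_expression(expr: &str) -> String {
    // Trim first so every branch, including the passthrough, strips whitespace.
    let trimmed = expr.trim();
    let fields: Vec<&str> = trimmed.split_whitespace().collect();
    match fields.len() {
        5 => format!("0 {} *", fields.join(" ")),
        6 => format!("{} *", fields.join(" ")),
        _ => trimmed.to_string(), // 7+ fields (or malformed input) pass through
    }
}
```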
* feat(setup): skip LLM provider prompts when NEARAI_API_KEY is present
In quick mode, if NEARAI_API_KEY is set in the environment and the
backend was auto-detected as nearai, skip the interactive inference
provider and model selection steps. The API key is persisted to the
secrets store and a default model is set automatically.
Also simplify the static fallback model list for nearai to a single
default entry.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: unify default model, static bootstrap greeting, and web UI cleanup
- Add DEFAULT_MODEL const and default_models() fallback list in
llm/nearai_chat.rs; use from config, wizard, and .env.example so the
default model is defined in one place
- Restore multi-model fallback list in setup wizard (was reduced to 1)
- Move BOOTSTRAP_GREETING to module-level const (out of run() body)
- Replace LLM-based bootstrap with static greeting (persist to DB before
channels start, then broadcast — eliminates startup LLM call and race)
- Fix double env::var read for NEARAI_API_KEY in quick setup path
- Move thread sidebar buttons into threads-section-header (web UI)
- Remove orphaned .thread-sidebar-header CSS and fix double blank line
- Update bootstrap e2e test for static greeting (no LLM trace needed)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(safety): move prompt injection scanning into Workspace write/append
Addresses PR #927 review comments (#1, #3) — identity file write
protection and unsanitized profile fields in system prompt.
Instead of scanning at the tool layer (memory.rs) or the sync layer
(sync_profile_documents), injection scanning now lives in
Workspace::write() and Workspace::append() for all files that are
injected into the system prompt. This ensures every code path that
writes to these files is protected, including future ones.
- Add SYSTEM_PROMPT_FILES const and reject_if_injected() in workspace
- Add WorkspaceError::InjectionRejected variant
- Add map_write_err() in memory.rs to convert InjectionRejected to
ToolError::NotAuthorized
- Remove redundant IDENTITY_FILES/Sanitizer from memory.rs
- Remove redundant sanitizer calls from sync_profile_documents()
- Move sanitization tests to workspace::tests
- Existing integration test (test_memory_write_rejects_injection)
continues to pass through the new path
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
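The idea of scanning at the workspace layer, rather than in each caller, can be sketched as follows. SYSTEM_PROMPT_FILES, reject_if_injected and WorkspaceError::InjectionRejected are named in the commit; the file list shown, the in-memory `Workspace`, and the toy `looks_like_injection` detector (standing in for the real Sanitizer) are assumptions.

```rust
use std::collections::HashMap;

/// Illustrative subset of files injected into the system prompt.
const SYSTEM_PROMPT_FILES: &[&str] = &["SOUL.md", "AGENTS.md", "USER.md", "IDENTITY.md"];

#[derive(Debug, PartialEq)]
enum WorkspaceError {
    InjectionRejected { path: String },
}

/// Toy detector standing in for the project's Sanitizer (assumption).
fn looks_like_injection(content: &str) -> bool {
    content.to_lowercase().contains("ignore previous instructions")
}

/// Reject suspicious content, but only for files that reach the system prompt.
fn reject_if_injected(path: &str, content: &str) -> Result<(), WorkspaceError> {
    let is_prompt_file = SYSTEM_PROMPT_FILES.iter().any(|f| *f == path);
    if is_prompt_file && looks_like_injection(content) {
        return Err(WorkspaceError::InjectionRejected { path: path.to_string() });
    }
    Ok(())
}

/// Hypothetical in-memory workspace: every write goes through the same check.
#[derive(Default)]
struct Workspace {
    files: HashMap<String, String>,
}

impl Workspace {
    fn write(&mut self, path: &str, content: &str) -> Result<(), WorkspaceError> {
        reject_if_injected(path, content)?;
        self.files.insert(path.to_string(), content.to_string());
        Ok(())
    }
}
```

Because the check sits inside `write`, a new caller cannot bypass it by forgetting to sanitize, which is the motivation the commit gives for moving it out of the tool layer.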
* style: cargo fmt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address Copilot review — merge marker order, orphan thread, stale fixture
- merge_profile_section: search for END marker after BEGIN position to
avoid matching a stray END earlier in the file
- Bootstrap phase 2: use get_or_create_session + Thread::with_id instead
of resolve_thread(None) to avoid creating an orphan thread
- setup_nearai: use env_or_override for NEARAI_API_KEY consistency with
runtime overlay
- Delete orphaned bootstrap_greeting.json fixture (no test references it)
- Add test_merge_end_marker_must_follow_begin regression test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
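The marker-order fix in merge_profile_section amounts to searching for the END marker only in the text after BEGIN. A sketch under assumed marker strings and an assumed append-when-missing fallback:

```rust
// Hypothetical marker strings; the real ones are not shown in the commit.
const BEGIN: &str = "<!-- BEGIN:profile -->";
const END: &str = "<!-- END:profile -->";

/// Replace the text between BEGIN and END with `section`.
/// END is searched only after BEGIN, so a stray END earlier in the file
/// cannot be matched by mistake.
fn merge_profile_section(doc: &str, section: &str) -> String {
    if let Some(begin) = doc.find(BEGIN) {
        let body_start = begin + BEGIN.len();
        if let Some(rel_end) = doc[body_start..].find(END) {
            let end = body_start + rel_end;
            return format!("{}\n{}\n{}", &doc[..body_start], section, &doc[end..]);
        }
    }
    // Assumed fallback: no complete marker pair, so append a fresh section.
    format!("{}\n{}\n{}\n{}\n", doc.trim_end(), BEGIN, section, END)
}
```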
* style: cargo fmt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style: fmt agent_loop.rs (CI stable rustfmt)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: lazy-init sanitizer, check profile non-empty before skipping bootstrap
Address Copilot review:
- Use LazyLock<Sanitizer> to avoid rebuilding Aho-Corasick + regexes
on every workspace write
- has_profile check now requires non-empty content, not just file
existence, to prevent empty profile.json from suppressing onboarding
- Add seed_tests integration tests (libsql-backed) verifying:
- Empty profile.json does not suppress BOOTSTRAP.md seeding
- Non-empty profile.json correctly suppresses bootstrap for upgrades
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* style: cargo fmt
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: duplicate language handler, empty LLM_BACKEND, test_rig style
Address Copilot review on PR #927:
- Remove duplicate language-option click listeners (delegated
data-action handler already covers them)
- Guard LLM_BACKEND env prefill against empty string to prevent
suppressing API-key-based auto-detection
- Use destructured local `keep_bootstrap` instead of `self.keep_bootstrap`
in test_rig for consistency after destructure
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: update stale BOOTSTRAP.md write-protection comment [skip-regression-check]
BOOTSTRAP.md is now in SYSTEM_PROMPT_FILES and gets injection scanning
on write. The old comment incorrectly stated it was not write-protected.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: replace debug_assert panics with graceful error returns [skip-regression-check]
debug_assert! in execute_tool_with_safety and JobContext::transition_to
panicked in test builds before the graceful error path could run.
Existing tests (test_cancel_job_completed, test_execute_empty_tool_name_returns_not_found)
already cover these paths — they were the ones failing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address Copilot review — schema label, env var check, path normalization, profile validation
1. Label ANALYSIS_FRAMEWORK and PROFILE_JSON_SCHEMA sections separately
in bootstrap prompt so the LLM knows which blob is the target structure.
2. Wizard quick-mode backend auto-detection now rejects empty env vars
(std::env::var().is_ok_and(|v| !v.is_empty())) to avoid selecting the
wrong backend when e.g. NEARAI_API_KEY="" is set.
3. Normalize the target path before comparing with paths::PROFILE in
memory_write so non-canonical variants like "context//profile.json"
still trigger profile sync.
4. seed_if_empty now requires valid JSON parse of context/profile.json
before treating it as a populated profile. Corrupted content no longer
permanently suppresses bootstrap seeding.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
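Item 2 above hinges on treating an empty environment variable as unset. A small sketch of that check follows. The `is_ok_and` expression is quoted in the commit; the `detect_backend` helper, the candidate order, and the backend names are assumptions, and the lookup is passed in as a closure so the logic can be exercised without touching the process environment.

```rust
use std::env::VarError;

/// True only when the variable exists and is non-empty, so NEARAI_API_KEY=""
/// does not count as a configured key.
fn is_set_and_non_empty(value: Result<String, VarError>) -> bool {
    value.is_ok_and(|v| !v.is_empty())
}

/// Hypothetical auto-detection: pick the first backend whose key is present.
/// Candidate order and backend names are illustrative assumptions.
fn detect_backend(lookup: impl Fn(&str) -> Result<String, VarError>) -> Option<&'static str> {
    const CANDIDATES: [(&str, &str); 3] = [
        ("NEARAI_API_KEY", "nearai"),
        ("ANTHROPIC_API_KEY", "anthropic"),
        ("OPENAI_API_KEY", "openai"),
    ];
    CANDIDATES
        .iter()
        .find(|(var, _)| is_set_and_non_empty(lookup(*var)))
        .map(|(_, backend)| *backend)
}
```

In real use the closure would be something like `|name| std::env::var(name)`; a later commit in this PR swaps that lookup for env_or_override() so runtime overlays are honored as well.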
* style: cargo fmt
* fix: address Copilot review — append scan, profile validation, env_or_override
1. Workspace::append() now scans the combined content (existing + new)
for prompt injection, not just the appended chunk. Prevents split-
injection evasion across multiple appends.
2. seed_if_empty() now deserializes into PsychographicProfile instead of
serde_json::Value for profile validation. Stray/legacy JSON that
doesn't match the expected schema no longer suppresses bootstrap.
3. Wizard quick-mode backend auto-detection now uses env_or_override()
to honor runtime overlays and injected secrets. LLM_BACKEND value
is trimmed before storage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
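Item 1 is easiest to see with a payload split across two appends. The sketch below uses a toy detector in place of the real Sanitizer and a hypothetical `append_checked` function; it only illustrates why the combined content has to be scanned.

```rust
/// Toy detector standing in for the project's Sanitizer (assumption).
fn looks_like_injection(content: &str) -> bool {
    content.to_lowercase().contains("ignore previous instructions")
}

/// Hypothetical append: scan existing + new content together, so a payload
/// split across several appends is still caught.
fn append_checked(existing: &str, addition: &str) -> Result<String, String> {
    let combined = format!("{existing}{addition}");
    if looks_like_injection(&combined) {
        return Err("injection pattern detected in combined content".to_string());
    }
    Ok(combined)
}
```

Each half of "Please ignore prev" + "ious instructions" passes a chunk-only scan, but the second append is rejected once the halves are checked together.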
* test: add bootstrap_onboarding_clears_bootstrap E2E trace test
Exercises the full onboarding flow end-to-end:
1. Bootstrap greeting fires automatically on fresh workspace
2. User converses for 3 turns (name, tools, work style)
3. Agent writes psychographic profile to context/profile.json
4. Profile sync generates USER.md and assistant-directives.md
5. Agent writes IDENTITY.md (chosen persona)
6. Agent clears BOOTSTRAP.md via memory_write(target: "bootstrap")
Verifies:
- BOOTSTRAP.md is non-empty before onboarding, empty after
- bootstrap_completed flag is set
- Profile contains expected user data (name, profession, interests)
- USER.md contains profile-derived content (name, tone, profession)
- Assistant-directives.md references user and communication style
- IDENTITY.md contains agent's chosen persona name
- All memory_write calls succeed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address Copilot review — slash collapse, env_or_override, cron trim [skip-regression-check]
1. memory.rs path normalization now uses the same char-by-char loop as
Workspace::normalize_path() to fully collapse consecutive slashes
(e.g. "context///profile.json" → "context/profile.json").
2. Quick-mode NEARAI_API_KEY check (line 239) now uses env_or_override()
consistently with the backend auto-detection block above it.
3. normalize_cron_expression() trims input before field counting so the
passthrough branch (7+ fields) also strips whitespace.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
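The slash-collapsing loop from item 1 might look like the sketch below. The char-by-char approach and the example path come from the commit; the surrounding trim, the `PROFILE` constant value, and the `targets_profile` helper are assumptions.

```rust
/// Assumed value of paths::PROFILE, taken from the path used in this PR.
const PROFILE: &str = "context/profile.json";

/// Collapse every run of consecutive '/' into a single '/'.
/// Sketch only: the real Workspace::normalize_path() may do more.
fn normalize_path(path: &str) -> String {
    let mut out = String::with_capacity(path.len());
    let mut prev_was_slash = false;
    for ch in path.trim().chars() {
        if ch == '/' {
            if !prev_was_slash {
                out.push('/');
            }
            prev_was_slash = true;
        } else {
            out.push(ch);
            prev_was_slash = false;
        }
    }
    out
}

/// Hypothetical check used to decide whether a write should trigger profile sync.
fn targets_profile(target: &str) -> bool {
    normalize_path(target) == PROFILE
}
```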
---------
Co-authored-by: Jay Zalowitz <jayzalowitz@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ion (#1234) * feat(agent): activate stuck_threshold for time-based stuck job detection (#1223) The stuck_threshold field on DefaultSelfRepair was defined but never used (marked #[allow(dead_code)]). Jobs that got stuck in InProgress without transitioning to Stuck state (e.g., deadlock, unhandled timeout) were never detected by self-repair. Changes: - Add find_stuck_jobs_with_threshold() to ContextManager that detects InProgress jobs running longer than the threshold - Wire stuck_threshold into detect_stuck_jobs() so it uses threshold-based detection alongside explicit Stuck state detection - Remove dead_code annotation from stuck_threshold - Accept InProgress jobs in the stuck job detection filter Configurable via AGENT_STUCK_THRESHOLD_SECS (default: 300s). Closes #1223 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(agent): address PR #1234 review feedback for stuck_threshold - Transition InProgress jobs to Stuck before returning them from detect_stuck_jobs(), so attempt_recovery() (which requires Stuck state) works correctly on threshold-detected jobs - Add detect-and-repair E2E test covering the full InProgress -> Stuck -> recovery -> InProgress cycle - Rename idle_threshold -> elapsed_threshold in find_stuck_jobs_with_threshold for clarity - Add `use std::time::Duration` import and remove fully qualified paths - Update CLAUDE.md to reflect that stuck_threshold is now actively used Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: measure stuck_duration from Stuck transition, handle InProgress→Stuck in repair - Fix stuck_duration computation to use the most recent Stuck transition timestamp instead of started_at, preventing jobs that ran for hours before becoming stuck from immediately exceeding the threshold - Fix last_activity to also use the Stuck transition timestamp - Transition InProgress jobs to Stuck before calling attempt_recovery() in repair_stuck_job(), since attempt_recovery() requires 
JobState::Stuck - Add regression test verifying a recently-stuck job with old started_at is not misdetected as exceeding a 5-minute threshold Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(agent): address Copilot review comments on PR #1234 - Add comment in find_stuck_jobs_with_threshold() noting that started_at is not reset on Stuck->InProgress recovery, which may cause false positives for recovered jobs. Suggests tracking in_progress_since or using the most recent StateTransition as a future improvement. - Fix misleading test comment in stuck_duration_measured_from_stuck_transition test: explicitly Stuck jobs are always returned regardless of threshold. The test verifies stuck_duration is near-zero, not that the job is excluded. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: ilblackdragon@gmail.com <ilblackdragon@gmail.com>
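The threshold-based detection described in these commits can be sketched as a filter over job contexts. The function name, the elapsed_threshold parameter name, and the rule that explicitly Stuck jobs are always returned come from the commit messages; the `Job` struct, the `Completed` state, and passing `now` explicitly are assumptions that keep the example deterministic.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JobState {
    InProgress,
    Stuck,
    Completed,
}

/// Hypothetical job record for illustration.
struct Job {
    id: u32,
    state: JobState,
    started_at: Instant,
}

/// Return ids of jobs that are explicitly Stuck, plus InProgress jobs that
/// have been running longer than `elapsed_threshold`.
fn find_stuck_jobs_with_threshold(
    jobs: &[Job],
    now: Instant,
    elapsed_threshold: Duration,
) -> Vec<u32> {
    jobs.iter()
        .filter(|job| match job.state {
            JobState::Stuck => true, // always returned, regardless of threshold
            JobState::InProgress => {
                now.saturating_duration_since(job.started_at) > elapsed_threshold
            }
            JobState::Completed => false,
        })
        .map(|job| job.id)
        .collect()
}
```

As the last commit notes, measuring from started_at can misfire for jobs that were recovered from Stuck back to InProgress, since started_at is not reset on recovery.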
* fix(security): validate embedding base URLs to prevent SSRF (#1103) User-configurable base URLs (OLLAMA_BASE_URL, EMBEDDING_BASE_URL) were passed directly to reqwest with no validation, allowing SSRF attacks against cloud metadata endpoints, internal services, or file:// URIs. Adds validate_base_url() that rejects: - Non-HTTP(S) schemes (file://, ftp://) - HTTP to non-localhost destinations (prevents credential leakage) - HTTPS to private/loopback/link-local/metadata IPs (169.254.169.254, 10.x, 192.168.x, 172.16-31.x, CGN 100.64/10) - IPv4-mapped IPv6 bypass attempts Validation runs at config resolution time so bad URLs fail at startup. Closes #1103 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(security): add DNS resolution check, ULA blocking, and NEARAI_BASE_URL validation Address review feedback: - Resolve hostnames to IPs and check all resolved addresses against the blocklist (prevents DNS-based SSRF bypass where attacker uses a domain pointing to 169.254.169.254) - Add IPv6 Unique Local Address (fc00::/7) to the blocklist - Validate NEARAI_BASE_URL in llm config (was missing — especially dangerous since bearer tokens are forwarded to the configured URL) - Allow DNS resolution failure gracefully (don't block startup when DNS is temporarily unavailable) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: fix formatting Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(security): add SSRF validation to all base URL chokepoints - Add validate_base_url() in resolve_registry_provider() covering all LLM providers (OpenAI, Anthropic, Ollama, openai_compatible, etc.) 
- Add validate_base_url() for NEARAI_AUTH_URL in LlmConfig::resolve() - Add validate_base_url() for TRANSCRIPTION_BASE_URL in TranscriptionConfig - Add missing SSRF test cases: CGN range, IPv4-mapped IPv6, ULA IPv6, URLs with credentials, empty/invalid URLs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: re-trigger CI with latest changes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * ci: trigger new run with skip-regression-check label Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: ilblackdragon@gmail.com <ilblackdragon@gmail.com>
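The address blocklist behind validate_base_url() can be approximated with the standard library alone. The sketch below covers only the IP classification listed in the commits (private, loopback, link-local including 169.254.169.254, CGN 100.64/10, IPv4-mapped IPv6, ULA fc00::/7). The function names are hypothetical, and URL parsing, scheme checks, the localhost exception for HTTP, and DNS resolution are left out.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Hypothetical helper: is this IPv4 address on the SSRF blocklist?
fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    let [a, b, _, _] = ip.octets();
    ip.is_private()                              // 10/8, 172.16/12, 192.168/16
        || ip.is_loopback()                      // 127/8
        || ip.is_link_local()                    // 169.254/16, incl. 169.254.169.254
        || (a == 100 && (b & 0xC0) == 0x40)      // CGN 100.64.0.0/10
}

/// Hypothetical helper: IPv6 counterpart, unwrapping IPv4-mapped addresses first.
fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4); // ::ffff:a.b.c.d must not bypass the IPv4 rules
    }
    let first = ip.segments()[0];
    ip.is_loopback()                             // ::1
        || (first & 0xfe00) == 0xfc00            // ULA fc00::/7
        || (first & 0xffc0) == 0xfe80            // link-local fe80::/10
}

fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}
```

Per the review-feedback commit, the same check would be applied to every address a hostname resolves to, not just to literal IPs in the URL.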
* fix: prefer execution-local message routing metadata
* test: cover message routing fallback metadata
* refactor: simplify message target resolution
* fix: ignore stale channel defaults for notify user metadata
#1426) * fix: register sandbox jobs in ContextManager for query tool visibility Sandbox jobs created via execute_sandbox() were persisted to the database but never registered in the in-memory ContextManager. Since all query tools (list_jobs, job_status, job_events, cancel_job) only search the ContextManager, sandbox jobs were invisible to the agent despite running successfully in Docker containers. Changes: - Add register_sandbox_job() to ContextManager (pre-determined UUID, starts InProgress, respects max_jobs) - Extract insert_context() helper to deduplicate create_job_for_user and register_sandbox_job - Add update_context_state / update_context_state_async to sync ContextManager state on sandbox job completion/failure - Extend job_monitor with spawn_job_monitor_with_context() and spawn_completion_watcher() so fire-and-forget jobs transition out of InProgress when the container finishes - Make CancelJobTool sandbox-aware (stops container + updates DB) - Wire sandbox deps into CancelJobTool in register_job_tools() - 8 regression tests across context manager and job monitor Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add missing allow_always field in PendingApproval test literal Upstream commit 09e1c97 added the allow_always field to PendingApproval but missed updating the test struct literal, breaking compilation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bedrock uses IAM credentials (instance roles, env vars, SSO) resolved by the AWS SDK at call time, so `provider` is never set during startup. Exclude it from the post-init validation that checks for missing API keys.
Closes #1009
Co-authored-by: brajul <brajul@amazon.com>
Co-authored-by: Illia Polosukhin <ilblackdragon@gmail.com>
* channels/wasm: implement telegram broadcast path for message tool
* channels/wasm: tighten telegram broadcast contract and tests
* fix: resolve merge conflicts with staging for wasm broadcast
- Remove duplicate broadcast() impls from WasmChannel and SharedWasmChannel (staging already has the generic call_on_broadcast path)
- Remove obsolete telegram-specific test helpers and tests that tested the old telegram-only broadcast logic
- Add test_broadcast_delegates_to_call_on_broadcast for the generic path
- Fix missing fallback_deliverable field in job_monitor test SseEvents
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: davidpty <127684147+davidpty@users.noreply.github.com>
Co-authored-by: firat.sertgoz <f@nuff.tech>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(web): add light theme with dark/light/system toggle (#761)
Add three-state theme toggle (dark → light → system) to the Web Gateway:
- Extract 101 hardcoded CSS colors into 30+ CSS custom properties
- Add [data-theme='light'] overrides for all variables
- Add theme toggle button in tab-bar (moon/sun/monitor icons)
- Theme persists via localStorage, defaults to 'system'
- System mode follows OS prefers-color-scheme in real-time
- FOUC prevention via inline script in <head>
- Delayed CSS transition to avoid flash on initial load
- Pure CSS icon switching via data-theme-mode attribute
Closes #761
* fix: address review feedback and code improvements (takeover #853)
- Fix dark-mode readability bug: .stepper-step.failed and .image-preview-remove used --text-on-accent (#09090b) on var(--danger) background, making text unreadable. Changed to --text-on-danger (#fff).
- Restore hover visual feedback on .image-preview-remove:hover using filter: brightness(1.2) instead of redundant var(--danger).
- Use const/let instead of var in theme-init.js for consistency with app.js (per gemini-code-assist review feedback).
Co-Authored-By: CPU-216 <3125034290@stu.cpu.edu.cn>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address CI failures and Copilot review feedback (takeover #853)
- Fix missing `fallback_deliverable` field in job_monitor test constructors (pre-existing staging issue surfaced by merge)
- Validate localStorage theme value against whitelist in both theme-init.js and app.js to prevent broken state from invalid values
- Add matchMedia addEventListener fallback for older Safari/WebKit
- Add i18n keys for theme tooltip and aria-live announcement strings (en + zh-CN) to match existing localization patterns
- Move .sr-only utility from inline <style> to style.css
[skip-regression-check]
Co-Authored-By: CPU-216 <3125034290@stu.cpu.edu.cn>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Gao Zheng <3125034290@stu.cpu.edu.cn>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…1461)
* feat(llm): add OpenAI Codex backend config and OAuth session manager
Add OpenAiCodex as a new LLM backend variant with config for auth endpoint, API base URL, client ID, and session persistence path. The session manager implements OpenAI's device code auth flow (headless-friendly, no browser required on the server) with automatic token refresh, following the same persistence pattern as the existing NEAR AI session manager.
Closes #742
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(llm): add Responses API client and token-refreshing decorator
Native Responses API client for chatgpt.com/backend-api/codex/responses, the endpoint that works with ChatGPT subscription tokens. Handles SSE streaming, text completions, and tool call round-trips.
Token-refreshing decorator wraps the provider to pre-emptively refresh OAuth tokens before API calls and retry once on auth failures. Reports zero cost since billing is through subscription.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(llm): wire OpenAI Codex into provider factory, CLI, and setup wizard
Connect the new provider to the LLM factory, add openai_codex to the CLI --backend flag, and add it as an option in the onboarding wizard.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(llm): address PR #744 review feedback (20 items)
Review fixes for the OpenAI Codex provider PR:
- Remove dead `generate_pkce()` code (device flow gets PKCE from server)
- Fix `refresh_tokens()` to use `.form()` instead of `.json()` per OAuth spec
- Inline codex dispatch into `build_provider_chain()` (single async function, no separate `assemble_provider_chain()` helper; matches main's pattern)
- Remove Clone from `OpenAiCodexSession`, restrict fields to `pub(crate)`
- Propagate HTTP client builder error instead of silent fallback
- Redact device code response body from debug log
- Change `set_model()` in TokenRefreshingProvider to delegate to inner
- Replace hardcoded `/tmp/` test path with `tempfile::tempdir()`
- Accept `request_timeout_secs` from config instead of hardcoded 300s
- Parse `Retry-After` header on 429 responses (matches nearai_chat.rs pattern)
- Reuse `normalize_schema_strict()` for Codex tool definitions
- Add warning log for dropped image attachments
- Add doc comments on `list_models()` and `include` field
- Add `OPENAI_CODEX_API_URL` to `.env.example`
- Fix codex error message in `create_llm_provider()` for clarity
- Revert unrelated `.worktrees` addition to `.gitignore`
- Update `src/llm/CLAUDE.md` with Codex provider docs
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address review feedback and harden OpenAI Codex provider (takeover #744)
Security:
- Add SSRF validation (validate_base_url) on OPENAI_CODEX_AUTH_URL and OPENAI_CODEX_API_URL, matching the pattern used by all other base URL configs (regression test for #1103 included)
Correctness:
- Add missing cache_write_multiplier() and cache_read_discount() trait delegation in TokenRefreshingProvider
- Cap device-code polling backoff at 60s to prevent unbounded interval growth on repeated 429 responses
- Default expires_in to 3600s when server returns 0, preventing immediately-expired sessions
- Fix pre-existing SseEvent::JobResult missing fallback_deliverable field in job_monitor.rs tests
Cleanup:
- Extract duplicated make_test_jwt() and test_codex_config() into shared codex_test_helpers module
Co-Authored-By: Sanjeev-S <Sanjeev-S@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: address PR review feedback on OpenAI Codex provider (#1461)
- Login command now resolves OPENAI_CODEX_* env overrides even when LLM_BACKEND isn't set to openai_codex (Copilot review)
- Setup wizard "Keep current provider?" for codex no longer re-triggers device code login; mirrors Bedrock's keep-and-return pattern (Copilot)
- Revert provider init log from info back to debug (Copilot)
- Add warning log when token expires_in=0, before defaulting to 3600s (Gemini review)
Co-Authored-By: Sanjeev-S <Sanjeev-S@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Sanjeev Suresh <Sanjeev-S@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
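Two of the correctness fixes listed above (the 60s cap on device-code polling backoff and the 3600s default when the server reports expires_in=0) might look roughly like this. It is a sketch under assumptions: the helper names `next_poll_interval` and `effective_expires_in` are hypothetical, and the 5-second growth step is assumed for illustration, since the commit message gives only the cap.

```rust
use std::time::Duration;

/// Upper bound on the device-code polling interval (from the commit text).
const MAX_POLL_INTERVAL: Duration = Duration::from_secs(60);
/// Fallback token lifetime when the server reports 0 (from the commit text).
const DEFAULT_EXPIRES_IN_SECS: u64 = 3600;
/// Growth step per throttled response. ASSUMPTION: not stated in the source.
const BACKOFF_STEP: Duration = Duration::from_secs(5);

/// Interval to use after a throttled (429) poll: grow by a fixed step,
/// but never beyond the cap, so repeated 429s cannot grow it without bound.
fn next_poll_interval(current: Duration) -> Duration {
    (current + BACKOFF_STEP).min(MAX_POLL_INTERVAL)
}

/// Treats a reported lifetime of 0 as "unknown" and substitutes the default,
/// so a freshly issued session is not considered expired immediately.
fn effective_expires_in(reported_secs: u64) -> u64 {
    if reported_secs == 0 {
        DEFAULT_EXPIRES_IN_SECS
    } else {
        reported_secs
    }
}
```

Clamping with `min` keeps the interval monotonic up to the cap, which is simpler to reason about than resetting the interval after it overshoots.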
* Use live owner tool scope for autonomous runs
* Address autonomous tool scope review feedback
* Normalize routine context paths again
* Fix CI approval flows and stale fixtures
* Backfill approval thread mapping across channels
…env (#1448)
* fix(setup): remove redundant LLM vars and API keys from bootstrap .env
Only true chicken-and-egg vars belong in ~/.ironclaw/.env: things needed to connect to the DB or decrypt secrets (DATABASE_BACKEND, DATABASE_URL, LIBSQL_PATH, SECRETS_MASTER_KEY, ONBOARD_COMPLETED).
LLM settings (LLM_BACKEND, LLM_BASE_URL, OLLAMA_BASE_URL, model name, provider-specific URLs) are persisted to the DB via persist_settings() and loaded by Config::from_db_with_toml() after connection. API keys are stored encrypted in the secrets DB and injected via inject_llm_keys_from_secrets(). Writing them as plaintext to .env was redundant and a security regression.
Also fixes for_model_discovery() and build_nearai_model_fetch_config() to use env_or_override() instead of std::env::var(), so they can read NEARAI_API_KEY from the thread-safe overlay during the onboarding wizard (where inject_single_var() sets the key after the user enters it).
Also fixes incorrect secret names in README (anthropic_api_key → llm_anthropic_api_key, openai_api_key → llm_openai_api_key).
Supersedes #266
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add missing fallback_deliverable field to job_monitor tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* docs: address review comments on bootstrap .env and README
- Update write_bootstrap_env() docstring to reflect current behavior (no LLM vars, no credentials)
- Fix Layer 1 .env examples in README to remove LLM_BACKEND/LLM_BASE_URL
- Fix legacy secret name in README example (anthropic_api_key → llm_anthropic_api_key)
- Document channel/sandbox vars in bootstrap vars list
- Add cleanup comment in test explaining empty-value-as-unset behavior
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(webhooks): add public webhook trigger endpoint for routines
Add POST /api/webhooks/{path} endpoint that matches incoming webhooks
against routines with Trigger::Webhook, validates secrets using
constant-time comparison (subtle crate), and fires the matched routine
through the message pipeline.
Closes #651
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(webhooks): address PR review feedback - access control, targeted query, rate limiting
[skip-regression-check]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(ci): add missing webhook_rate_limiter field and fix formatting
Add the webhook_rate_limiter field to the GatewayState initializer in
gateway_workflow_harness.rs and fix rustfmt formatting for the webhook
tuple in types.rs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(security): require webhook secret, add rate limiting, improve tests
Extract validate_webhook_secret() from the handler so the security-critical
secret validation logic (mandatory secret, constant-time comparison) is
directly testable without mocking the database layer. Improves the error
message for misconfigured routines to guide users toward the fix.
Replaces the previous unit tests (which only tested Rust pattern matching
and status code constants) with tests that exercise the actual validation
function against all rejection paths: missing secret (403), non-webhook
trigger (403), wrong secret (401), empty secret (401), and different-length
secret (401).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
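The rejection paths enumerated in the commit above can be illustrated with a small sketch. The name validate_webhook_secret() and Trigger::Webhook come from the commit text, but the signature, the `secret` field, the `Manual` variant, and the use of bare status codes are assumptions made for illustration. The source uses the subtle crate for the comparison; a hand-written constant-time loop stands in here so the example depends only on the standard library.

```rust
/// Hypothetical trigger type; only the `Webhook` variant name is taken from the source.
enum Trigger {
    Webhook { secret: Option<String> },
    Manual,
}

/// Compares two byte strings without short-circuiting on the first mismatch.
/// The early return on length reveals only that the lengths differ.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y; // accumulate differences; never exit the loop early
    }
    diff == 0
}

/// Returns Ok(()) when the request may proceed, or Err(status) otherwise:
/// 403 when the routine has no secret or is not a webhook trigger,
/// 401 when the provided secret does not match (wrong, empty, or different length).
fn validate_webhook_secret(trigger: &Trigger, provided: &str) -> Result<(), u16> {
    let expected = match trigger {
        Trigger::Webhook { secret: Some(s) } => s,
        Trigger::Webhook { secret: None } => return Err(403),
        Trigger::Manual => return Err(403),
    };
    if constant_time_eq(expected.as_bytes(), provided.as_bytes()) {
        Ok(())
    } else {
        Err(401)
    }
}
```

Keeping the function free of any database or HTTP types is what makes each rejection path directly testable, which is the motivation the commit message gives for extracting it from the handler.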
* Route webhook triggers through RoutineEngine instead of chat pipeline
Adds fire_webhook() to RoutineEngine and updates the webhook handler
to use it. This ensures webhook-triggered routines get proper run
tracking, guardrail enforcement (cooldown + max_concurrent),
notifications, and FullJob dispatch support.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
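The guardrail enforcement mentioned above (cooldown + max_concurrent) could be expressed along these lines. This is only a sketch of the general idea: the `Guardrails` and `RunState` types and the `may_fire` function are hypothetical and are not taken from RoutineEngine, whose real fire_webhook() also handles run tracking, notifications, and dispatch.

```rust
use std::time::{Duration, Instant};

/// Hypothetical per-routine limits.
struct Guardrails {
    cooldown: Duration,
    max_concurrent: usize,
}

/// Hypothetical snapshot of a routine's recent activity.
struct RunState {
    last_fired: Option<Instant>,
    running: usize,
}

/// A routine may fire only if its cooldown has elapsed since the last run
/// AND it has not reached its concurrency limit.
fn may_fire(g: &Guardrails, s: &RunState, now: Instant) -> bool {
    let cooled_down = match s.last_fired {
        Some(t) => now.duration_since(t) >= g.cooldown,
        None => true, // never fired before
    };
    cooled_down && s.running < g.max_concurrent
}
```

Passing `now` in as a parameter, instead of calling Instant::now() inside, keeps the check deterministic in tests.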
* style: fix formatting in webhook handler
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: ilblackdragon@gmail.com <ilblackdragon@gmail.com>
* Expand AGENTS.md with repo guidance for coding agents
* Format AGENTS deeper docs as a multiline list
* Move scoping guidance to change-discipline section
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…1280) (#1468)
Replace `unwrap_or_else(|e| e.into_inner())` with `expect("env mutex poisoned")` in bind_rejects_wildcard_ipv4 and bind_rejects_wildcard_ipv6 tests to match the ENV_MUTEX pattern used in oauth_defaults.rs. The old pattern silently recovered from a poisoned mutex, potentially allowing concurrent env var access when a prior test panicked while holding the lock.
[skip-regression-check]
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
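The difference between the two locking patterns can be seen in a small standalone sketch. The helper names `poisoned_mutex` and `lock_lenient` are hypothetical and the mutex guards a plain counter, not the process environment; the point is only that `into_inner()` hands back the guard from a poisoned lock, whereas `lock()` reports the poison as an error that `expect(...)` would turn into a panic.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Builds a mutex and poisons it by panicking in another thread while the lock is held.
fn poisoned_mutex() -> Arc<Mutex<u32>> {
    let m = Arc::new(Mutex::new(0));
    let m2 = Arc::clone(&m);
    // join() returns Err because the thread panicked; that is expected here.
    let _ = thread::spawn(move || {
        let _guard = m2.lock().unwrap();
        panic!("simulated test failure while holding the lock");
    })
    .join();
    m
}

/// The old pattern: recover the guard even if a previous holder panicked.
fn lock_lenient(m: &Mutex<u32>) -> u32 {
    let guard = m.lock().unwrap_or_else(|e| e.into_inner());
    *guard
}
```

With the stricter pattern, `m.lock().expect("env mutex poisoned")` would panic on the same mutex, so a later test fails loudly when an earlier one died mid-critical-section.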
…3006 chore: promote staging to staging-promote/d9358b0f-23426138451 (2026-03-23 13:43 UTC)
…8451 chore: promote staging to staging-promote/abba0831-23415935143 (2026-03-23 07:37 UTC)
…5143 chore: promote staging to staging-promote/fbce9a5f-23403885064 (2026-03-23 01:32 UTC)
…5064 chore: promote staging to staging-promote/1a62febe-23398066063 (2026-03-22 13:21 UTC)
…6063 chore: promote staging to staging-promote/86388958-23397163010 (2026-03-22 07:15 UTC)
…3010 chore: promote staging to staging-promote/b58b4215-23396456254 (2026-03-22 06:14 UTC)
…6254 chore: promote staging to staging-promote/89394ebd-23395764012 (2026-03-22 05:25 UTC)
…4012 chore: promote staging to staging-promote/b97d82db-23390775365 (2026-03-22 04:36 UTC)
…5365 chore: promote staging to staging-promote/9d538136-23389762470 (2026-03-21 23:04 UTC)
…2470 chore: promote staging to staging-promote/8ad7d78a-23387609319 (2026-03-21 22:03 UTC)
…9319 chore: promote staging to staging-promote/62326090-23374571867 (2026-03-21 20:02 UTC)
…1867 chore: promote staging to staging-promote/9964d5da-23372765633 (2026-03-21 07:13 UTC)
…5633 chore: promote staging to staging-promote/0d1a5c21-23372030005 (2026-03-21 05:17 UTC)
…0005 chore: promote staging to staging-promote/e6277a39-23371263100 (2026-03-21 04:30 UTC)
…3100 chore: promote staging to staging-promote/6d847c60-23366109539 (2026-03-21 03:42 UTC)
…9539 chore: promote staging to staging-promote/9603fefd-23364438978 (2026-03-20 23:06 UTC)
…8978 chore: promote staging to staging-promote/d3b69e7b-23359661011 (2026-03-20 22:04 UTC)
…-version-bumps fix: bump registry versions for staging promotion 1451
Auto-promotion from staging CI
Batch range: c4ab382522c86e7e19d55fee760b125fb1970518..455f543ba50d610eb9e181fd41bf4c77615d3af6
Promotion branch: staging-promote/455f543b-23329172268
Base: staging-promote/89203225-23327092672
Triggered by: Staging CI batch at 2026-03-20 04:32 UTC
Commits in this batch (5):
Current commits in this promotion (70)
Current base: main
Current head: staging-promote/455f543b-23329172268
Current range: origin/main..origin/staging-promote/455f543b-23329172268
feat(cli): add `ironclaw hooks list` subcommand (#1023)
Auto-updated by staging promotion metadata workflow
Waiting for gates:
Auto-created by staging-ci workflow