Skip to content

fix: shorten manual cron profile lock scope#1749

Closed
Michaelyklam wants to merge 3 commits intonesquena:masterfrom
Michaelyklam:fix/issue-1574-cron-profile-lock
Closed

fix: shorten manual cron profile lock scope#1749
Michaelyklam wants to merge 3 commits intonesquena:masterfrom
Michaelyklam:fix/issue-1574-cron-profile-lock

Conversation

@Michaelyklam
Copy link
Copy Markdown
Contributor

@Michaelyklam Michaelyklam commented May 6, 2026

Thinking Path

  • Issue follow-up(cron): long cron runs hold the global serialization lock for the entire run_job() execution #1574 pointed at manual cron runs holding the process-wide cron/profile environment lock for the full run_job() duration.
  • That lock is still needed when mutating in-process cron/profile globals, but long jobs should not freeze unrelated WebUI cron/profile operations.
  • The safe split is to keep short parent-process metadata writes under the existing profile context, while running the long job body in an isolated spawned child process pinned to the selected execution profile.
  • A regression test now proves an unrelated cron_profile_context_for_home() can enter while a manual cron job is still running.

What Changed

  • Added _run_cron_job_in_profile_subprocess() and a spawned child-process target for manual cron execution in api/routes.py.
  • Changed _run_cron_tracked() so the long run_job(job) body executes in the child process under the selected execution profile home instead of holding the parent _cron_env_lock.
  • Kept output and run metadata persistence scoped to the owning cron store in the parent process.
  • Added issue follow-up(cron): long cron runs hold the global serialization lock for the entire run_job() execution #1574 regression coverage and updated cron/profile/manual-run tests for the new subprocess execution boundary.

Fixes #1574

Why It Matters

  • Long manual cron runs no longer serialize unrelated cron/profile UI/API work behind the parent process _cron_env_lock.
  • Profile isolation is preserved because the child process still runs run_job() under cron_profile_context_for_home(execution_profile_home).
  • The owning profile still receives output/run metadata even when execution uses a different selected profile.
  • Using a spawned process avoids forking the live WebUI server/thread state while still isolating process-global cron module constants.

Verification

  • RED: tests/test_issue1574_cron_profile_lock.py::test_manual_cron_run_does_not_hold_profile_lock_for_job_duration failed before the fix because the contender profile context stayed blocked while fake run_job() was active.
  • tests/test_cron_manual_run_persistence.py tests/test_issue1574_cron_profile_lock.py tests/test_issue617_cron_profile_selector.py tests/test_cron_run_job_import.py tests/test_scheduled_jobs_profile_isolation.py -q -> 22 passed in 2.07s
  • tests/test_sprint3.py tests/test_sprint6.py tests/test_sprint7.py tests/test_sprint10.py tests/test_sprint13.py tests/test_cron_manual_run_persistence.py tests/test_scheduled_jobs_profile_isolation.py tests/test_issue1574_cron_profile_lock.py tests/test_issue617_cron_profile_selector.py tests/test_cron_run_job_import.py -q -> 115 passed in 8.00s
  • git diff --check -> passed
  • Full suite attempt with config isolation completed once with 2 unrelated session-search failures; both failed tests passed when rerun directly: tests/test_sprint3.py::test_session_search_returns_matches tests/test_sprint4.py::test_session_search_returns_matches -q -> 2 passed in 1.64s.
  • A second full-suite attempt later lost its shared test HTTP server and cascaded into ConnectionRefusedError failures; representative checks passed afterward: tests/test_sprint1.py::test_health tests/test_regressions.py::test_chat_start_returns_stream_id tests/test_issue1680_codex_spark.py::test_openai_codex_group_uses_provider_model_ids_for_spark -q -> 3 passed in 6.09s.

Risks / Follow-ups

  • Manual cron execution now starts a child process. This is intentional for isolation, but reviewers should sanity-check any platform-specific assumptions if non-Linux WebUI deployments need first-class support.
  • No UI changes; no screenshots included.

Model Used

  • Provider: OpenAI Codex
  • Model: gpt-5.5
  • Tool use: Hermes Kanban worker, terminal/git/pytest, file editing, GitHub CLI

@Michaelyklam
Copy link
Copy Markdown
Contributor Author

Closing this duplicate worker PR. The Kanban task for issue #1574 was already completed through PR #1746, which is mergeable and CI-green: #1746

@Michaelyklam Michaelyklam deleted the fix/issue-1574-cron-profile-lock branch May 6, 2026 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

follow-up(cron): long cron runs hold the global serialization lock for the entire run_job() execution

1 participant