feat(runner): fire on_interval on a real timer + thread-safe finding flush by ocervell · Pull Request #1179 · freelabz/secator

ocervell · 2026-06-15T16:19:16Z

Stacked on #1176 (base = feat/mongodb-batch-finding-writes) so that #1176 stays a clean, deployable batch fix. Review and merge #1176 first.

1. `on_interval` on a real timer

Today on_interval only fires after the runner produces an item (__iter__ line 560), so it stalls during quiet periods: a task that bursts and then goes quiet won't flush or update until the next item or on_end. (It is already time-throttled in run_hooks via backend_update_frequency; the problem is that the throttle is only evaluated when an item is produced.)

Add a daemon interval thread that fires on_interval every backend_update_frequency seconds:

- created lazily in __iter__ (not __init__) and nulled in __getstate__, so the runner stays picklable for Celery, mirroring the monitor thread;
- stopped in _finalize before the final on_end flush;
- not started when backend_update_frequency <= 0 (-1 means no time-based backend updates, so flushing relies on the size cap and on_end).
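The lifecycle in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the `Runner` class here is a toy stand-in, and `_interval_stop`, `_interval_loop`, `_start_interval_thread`, `quiet_seconds`, `interval_calls` and `ended` are hypothetical names introduced only for the example.

```python
import threading
import time


class Runner:
    """Toy stand-in for the runner; attribute names are illustrative only."""

    def __init__(self, backend_update_frequency, quiet_seconds=0.3):
        self.backend_update_frequency = backend_update_frequency
        self.quiet_seconds = quiet_seconds
        self.interval_calls = 0
        self.ended = False
        # Not created here: threads and events are not picklable.
        self._interval_thread = None
        self._interval_stop = None

    def on_interval(self):
        self.interval_calls += 1

    def on_end(self):
        self.ended = True

    def _interval_loop(self, stop, period):
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(period):
            self.on_interval()

    def _start_interval_thread(self):
        if self.backend_update_frequency <= 0:
            return  # time-based updates disabled
        self._interval_stop = threading.Event()
        self._interval_thread = threading.Thread(
            target=self._interval_loop,
            args=(self._interval_stop, self.backend_update_frequency),
            daemon=True,
        )
        self._interval_thread.start()

    def _finalize(self):
        # Stop the timer first so the final on_end runs with no concurrent tick.
        if self._interval_thread is not None:
            self._interval_stop.set()
            self._interval_thread.join()
            self._interval_thread = None
            self._interval_stop = None
        self.on_end()

    def __iter__(self):
        self._start_interval_thread()  # lazy: only once iteration begins
        try:
            yield "item"
            time.sleep(self.quiet_seconds)  # quiet period, no items produced
        finally:
            self._finalize()

    def __getstate__(self):
        # Drop the unpicklable thread and event from the pickled state.
        state = self.__dict__.copy()
        state["_interval_thread"] = None
        state["_interval_stop"] = None
        return state
```

With a frequency of 0.05 seconds and a 0.3 second quiet period, `on_interval` keeps firing even though no further item is produced; with a frequency of -1 the thread is never started.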

The per-item on_interval call and the existing throttle are kept: the extra call is harmless, and the throttle deduplicates item-triggered and timer-triggered firings.
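To illustrate how a time throttle can deduplicate the two trigger sources, here is a small sketch. It is not the throttle in run_hooks: `IntervalThrottle`, `should_fire` and the injectable `clock` are hypothetical, and guarding the timestamp with a lock is an assumption made for the example so that two threads cannot both pass the check for the same window.

```python
import threading
import time


class IntervalThrottle:
    """Allow at most one firing per `frequency` seconds (illustrative only)."""

    def __init__(self, frequency, clock=time.monotonic):
        self.frequency = frequency
        self._clock = clock
        self._last = None
        self._lock = threading.Lock()

    def should_fire(self):
        now = self._clock()
        with self._lock:
            if self._last is not None and now - self._last < self.frequency:
                return False  # another trigger already fired in this window
            self._last = now
            return True
```

Both the per-item path and the timer path would call `should_fire()` before running the hook, so whichever arrives second within the same window is skipped.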

2. Thread-safety (your point about the discarded `on_interval` return)

With flushing now possible from the interval thread, the per-runner findings buffer is touched by two threads (appended to by on_item on the main thread, flushed by on_interval on the interval thread):

- guard the buffer with a lock; swap it out under the lock, then bulk_write outside the lock;
- toDict() snapshots its mutable collections (errors/warnings/celery_ids) since the interval thread may read it while the main thread appends.
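The swap-under-lock pattern and the snapshotting described above can be sketched like this. The names `FindingBuffer` and `snapshot_collections` are hypothetical, and `bulk_write` is any callable that accepts a list; this is an illustration of the technique, not the PR's implementation.

```python
import threading


class FindingBuffer:
    """Buffer appended to by one thread and flushed by another."""

    def __init__(self, bulk_write):
        self._bulk_write = bulk_write  # hypothetical callable taking a list
        self._items = []
        self._lock = threading.Lock()

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def flush(self):
        # Hold the lock only long enough to swap the list out.
        with self._lock:
            batch, self._items = self._items, []
        # The slow write happens outside the lock, so appends are not blocked.
        if batch:
            self._bulk_write(batch)
        return len(batch)


def snapshot_collections(errors, warnings, celery_ids):
    """Copy mutable collections so a reader never iterates a live object."""
    return {
        "errors": list(errors),
        "warnings": list(warnings),
        "celery_ids": list(celery_ids),
    }
```

Because each batch is detached from the buffer before it is written, an item is written at most once, and the lock is never held across I/O.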

Context invariant (your point 2): update_finding adds all in-memory context to the item synchronously at on_item, before the yield (it mints item._uuid client-side); the batched flush is DB-write-only and never mutates the item, so nothing is yielded missing context even though the flush is deferred / off-thread.
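A single-threaded sketch of this invariant follows. The `Finding` dataclass, the `run` generator and the document shape passed to `bulk_write` are hypothetical, and the real update_finding may differ; locking is omitted here to keep the focus on ordering (enrich before yield, write later without mutation).

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Finding:
    name: str
    _uuid: str = ""
    _context: dict = field(default_factory=dict)


def update_finding(item, context, buffer):
    """Enrich the item synchronously, then queue it for a later write."""
    item._uuid = str(uuid.uuid4())  # minted client-side, no DB round trip
    item._context.update(context)
    buffer.append(item)
    return item


def flush(buffer, bulk_write):
    """Write-only step: serialize queued items without mutating them."""
    batch = list(buffer)
    buffer.clear()
    bulk_write([
        {"_id": f._uuid, "name": f.name, "context": dict(f._context)}
        for f in batch
    ])


def run(names, context, buffer):
    for name in names:
        # The item is fully enriched before it is yielded.
        yield update_finding(Finding(name), context, buffer)
```

Every yielded item already carries its identifier and context, whether or not `flush` has run yet.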

⚠️ Needs your review + local validation

Concurrency on a core runner — please validate against the local MongoDB repro (counts, chaining, no RuntimeError: changed size during iteration under load). Open question for you: the buffer is lock-guarded and toDict reads are snapshotted, but if you want stronger guarantees on all runner-state reads from the thread we could add a runner-level state lock — flagged rather than assumed.

🤖 Generated with Claude Code

…flush

Builds on the batch-finding-writes PR. Two improvements:

1. on_interval ran only when the runner produced an item, so it stalled during quiet periods. Add a daemon interval thread that fires on_interval every backend_update_frequency seconds. Created lazily in __iter__ (not __init__) and nulled in __getstate__ so the runner stays picklable for Celery, like the monitor thread; stopped in _finalize before the final on_end flush. Disabled when backend_update_frequency <= 0 (-1 = no time-based backend updates).

2. Thread-safety: the findings buffer can now be appended (on_item, main thread) and flushed (on_interval, interval thread) concurrently. Guard the buffer with a lock and swap it out under the lock before the bulk_write (done outside the lock). toDict() now snapshots its mutable collections (errors/warnings/celery_ids) since it may be read from the interval thread.

Context invariant preserved: update_finding adds all in-memory context to the item synchronously at on_item (before the yield); the batched flush is DB-only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-15T16:19:25Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

ocervell wants to merge 1 commit into feat/mongodb-batch-finding-writes from feat/runner-interval-thread
