
Introduce Web Worker foundation with error handling and telemetry #711

Open
arolariu wants to merge 50 commits into main from preview

arolariu (Owner) commented May 6, 2026

This pull request introduces a comprehensive Web Worker foundation for the codebase, including a new shared @/workers module, a developer-only playground for exercising and testing worker behaviors, and end-to-end Playwright tests for real-world coverage. It also adds the comlink dependency, which is used for worker communication. The changes are grouped into core infrastructure, developer tooling, and testing.

Core infrastructure:

  • Added the @/workers module as a shared foundation for Web Worker management, including a detailed README.md with usage patterns, API contracts, lifecycle, error handling, and known limitations.
  • Added the comlink package to both the root and site-specific dependencies for worker RPC support.

Developer tooling:

  • Introduced a dev/staging-only "Worker Playground" route (src/app/playground/workers/page.tsx) and a canned demo worker (playground.worker.ts) that exposes a surface-rich API for exercising and demonstrating all documented worker behaviors. The route is gated to return 404 in production.

Testing:

  • Added a comprehensive Playwright spec (worker-playground.spec.ts) that covers all playground behaviors, including boot, echo, abort, crash/restart, capabilities, and stress scenarios that require a real Worker (e.g., realm isolation, boot latency, event streaming).
  • Added a unit test for the createPortPair utility to verify correct creation and isolation of message channels.

arolariu and others added 30 commits April 27, 2026 20:58
…e and error normalization

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…active island

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add targeted tests for uncovered branches in the Web Workers Foundation:
- telemetryBridge: default logger (console.*) paths and non-Error cause serialization
- workerLifecycle: bootBegin/bootComplete/crash after dispose, setState no-op when state unchanged
- createWorkerHost: concurrent restart lock, dispose-before-boot, subscribe/unsubscribe, onEvent steady-state listener
- mockWorker: simulateCrash after terminate, removeEventListener, dispatchEvent, terminate idempotency
- exposeWorker: non-function API values, error serialization fallbacks, globalThis scope default

Reaches host/ branches 91.5% (≥90%), host/ functions 96.61% (≥95%), global branches 90.37% (≥90%).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…spose check, mid-flight abort, listener cleanup, ignored signal, boot dispose race, ready event filter

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Move all test files out of __tests__/ to colocate with source (project convention)
- Hoist comlink dependency to monorepo root; reference as "*" from website package
- restart(): drain in-flight calls with WorkerCrashError, abort pending bootstrap, swap lifecycle before disposing old (avoids public "disposed" transition)
- restart()/per-call abort: clean up signal listeners on settle to prevent leaks
- Remove unused defaultCallTimeoutMs from public surface (re-add when wired)
- Filter pre-bootstrap "ready" events from the defensive forwarder
- workerEnvelope: align comment with implementation; split long if conditions
- exposeWorker: add getBootstrapCapabilities helper; remove playground's direct bootstrap-message peek
- README: correct production tree-shaking claim, capabilities snapshot timing, rule (1) wording, generic background reference
- Playwright spec: import from project fixtures; assert real boot to "ready" after interaction
- page.tsx: switch to generateMetadata + createMetadata + next-intl convention
- island.tsx: route user-facing strings through next-intl with literal fallbacks
- Tests: remove three tests that didn't actually exercise their targeted branches
…realm refs

Per WHATWG HTML S9.4.5, MessagePorts with active listeners hold strong
cross-realm references. Without explicit close(), the parent's port1
objects survive worker.terminate() until GC. Hoist parent ports to host
scope so tearDownWorker can close them deterministically.

Also documents the pre-terminate listener detach order (HTML S10.2.4
parallel error event delivery) so future refactors don't reverse it.

Refs: cross-review important-1, minor-7
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The pre-terminate inFlight drain in handleCrash() and restart() is
load-bearing because Comlink's requestResponseMessage stores only the
call's resolve callback (GoogleChromeLabs/comlink#601). Without this,
consumer promises hang forever when the worker terminates mid-call.
Document the SAFETY contract so future contributors don't reorder.

Refs: cross-review important-2
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comlink's default throwTransferHandler only round-trips name, message,
and stack. WHATWG HTML S2.7.3 (StructuredSerialize) further normalizes
Error.name to one of seven standard names, so a custom Error subclass
loses its identity over the port. Our plain-object envelope round-trips
those fields losslessly. Document this at both the throw site
(exposeWorker.ts) and the unwrap site (createWorkerHost.ts) so future
contributors don't 'simplify' it back to a real Error throw.

Refs: cross-review important-3
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
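
The envelope idea in the commit message above can be sketched roughly as follows. This is an illustration only, not the code in exposeWorker.ts: the field names follow the `{__workerError, name, message, stack}` shape quoted in the PR summary, while `toEnvelope`, `isEnvelope`, and the use of `true` as the marker value are assumptions made for the example.

```typescript
// Plain-object envelope: every field is a primitive, so it crosses the port
// as ordinary data instead of being rebuilt as an Error on the other side.
interface WorkerErrorEnvelope {
  __workerError: true; // marker value is an assumption
  name: string;
  message: string;
  stack: string | undefined;
}

// Worker side (hypothetical helper): flatten whatever the handler threw.
function toEnvelope(thrown: unknown): WorkerErrorEnvelope {
  if (thrown instanceof Error) {
    return {__workerError: true, name: thrown.name, message: thrown.message, stack: thrown.stack};
  }
  return {__workerError: true, name: "Error", message: String(thrown), stack: undefined};
}

// Host side (hypothetical helper): recognize an envelope before rewrapping it.
function isEnvelope(value: unknown): value is WorkerErrorEnvelope {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return v.__workerError === true && typeof v.name === "string" && typeof v.message === "string";
}
```

Because the custom name travels as a plain string field, the host can branch on it after unwrapping rather than matching on message text.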
…ful teardown

dispose() and lazy-reboot now best-effort invoke proxy[releaseProxy]()
before terminate(), giving the worker a chance to flush before the port
goes away. handleCrash() deliberately skips this because the proxy is
already wedged on a closed port (releaseProxy is itself an RPC and would
hang per Comlink#601). Errors are swallowed so a hung releaseProxy
cannot block teardown — terminate() is the hard backstop.

Refs: cross-review minor-4
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nal.reason fallback

Two WHATWG-spec polish items:

* Remove the explicit eventChannel.port1.start() call in createWorkerHost
  (assigning port.onmessage implicitly starts the port per HTML S9.4.5).
* Remove the eventPort.start() in exposeWorker (the port is send-only on
  the worker side; start() applies only to receivers per HTML S9.4.5).
* Comment the four signal.reason ?? new Error('aborted') fallbacks as
  belt-and-suspenders for non-spec polyfills (signal.reason is always
  defined on an aborted signal per WHATWG DOM).

Refs: cross-review minor-5, minor-6, minor-8
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The optional chain in nav?.gpu already short-circuits to undefined for
missing properties, so the && nav?.gpu !== null clause was dead. Use
loose-equality != null which correctly covers both undefined and null
without the duplicated optional-chain access.

Refs: cross-review minor-12
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
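
A small sketch of the loose-equality check described above. The function name `detectWebGpu` and the narrowed parameter type are hypothetical; the real code lives in workerCapabilities.ts and reads from the navigator object.

```typescript
// Hypothetical helper: `nav` stands in for `navigator`, which may be missing
// entirely in some realms (hence `| undefined`).
function detectWebGpu(nav: {gpu?: unknown} | undefined): boolean {
  // `!= null` is false for both null and undefined, so a single comparison
  // covers "property missing", "nav missing", and "explicitly null".
  return nav?.gpu != null;
}
```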
Adds a never-typed default case so adding a new WorkerEvent kind
without updating the switch becomes a compile-time error, plus a
runtime logger.warn for defense-in-depth against stale worker bundles.

Includes test exercising the unknown-kind branch.

Refs: cross-review minor-11
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
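
The never-typed default case described above might look like the sketch below. `DemoEvent` is a hypothetical two-member stand-in for the real `WorkerEvent` union, and the `warn` callback stands in for the logger.

```typescript
// Hypothetical event union; the real WorkerEvent has more kinds and fields.
type DemoEvent =
  | {kind: "ready"}
  | {kind: "log"; level: "info" | "warn"; msg: string};

function ingestDemoEvent(event: DemoEvent, warn: (msg: string) => void): string {
  switch (event.kind) {
    case "ready":
      return "ignored";
    case "log":
      return `${event.level}: ${event.msg}`;
    default: {
      // Compile-time guard: once a new kind is added to DemoEvent without a
      // matching case, `event` is no longer `never` and this line fails to compile.
      const unreachable: never = event;
      // Runtime guard: a stale worker bundle can still send an unknown kind.
      warn(`unknown event kind: ${JSON.stringify(unreachable)}`);
      return "unknown";
    }
  }
}
```

The runtime branch is only reachable when the payload bypasses the type system, which is exactly the stale-bundle case the commit mentions.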
SECURITY block clarifying that this predicate is the only line of
defense against malformed bootstrap shapes from a worker realm. New
fields on WorkerBootstrap MUST be validated here before they are read
elsewhere; callers must NEVER spread or pass through unvalidated data.

Refs: cross-review minor-13
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Reflect.get preserves prototype lookup semantics and avoids the broad
Record<string, unknown> cast that widened the proxy's static type.

Refs: cross-review minor-10
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
arolariu and others added 20 commits May 6, 2026 00:40
Adds a defaultCallTimeoutMs option (default 30s) to createWorkerHost.
Each proxy call races the wrapped invocation against a setTimeout that
rejects with WorkerTimeoutError(method, elapsedMs) when fired. The timer
starts AFTER ensureReady() so boot latency is not charged to the
consumer's budget. Set to 0 or any non-finite value (Infinity, NaN) to
disable.

NOTE: The timeout rejects the consumer's promise but does NOT abort the
worker-side handler — Comlink has no cancellation protocol. Consumers
must call host.restart() to reclaim a hung worker.

Includes 5 tests: basic timeout firing, disabled-by-zero, disabled-by-
Infinity, no leaked timer on success, and the 30s default budget.

Refs: cross-review enhancement-1
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
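
The race between a call and its timer, as described in the commit above, can be sketched like this. `DemoTimeoutError` and `withCallTimeout` are hypothetical names; the real `WorkerTimeoutError` and the wiring inside createWorkerHost may differ, and treating negative values as "disabled" is an assumption.

```typescript
// Hypothetical stand-in for WorkerTimeoutError(method, elapsedMs).
class DemoTimeoutError extends Error {
  readonly method: string;
  readonly elapsedMs: number;
  constructor(method: string, elapsedMs: number) {
    super(`${method} timed out after ${elapsedMs}ms`);
    this.name = "DemoTimeoutError";
    this.method = method;
    this.elapsedMs = elapsedMs;
  }
}

function withCallTimeout<T>(method: string, call: () => Promise<T>, timeoutMs: number): Promise<T> {
  // 0 and non-finite values (Infinity, NaN) disable the timeout.
  if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) return call();

  const startedAt = Date.now();
  const pending = call();
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      // Rejects the consumer's promise only; the underlying call keeps running.
      reject(new DemoTimeoutError(method, Date.now() - startedAt));
    }, timeoutMs);
    pending.then(
      (value) => {
        clearTimeout(timer); // no leaked timer on success
        resolve(value);
      },
      (error: unknown) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

In the host, a wrapper like this would be invoked only after the readiness check, so that boot latency is not counted against the budget.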
Avoids the dual-Comlink-instance bug when bundler chunk-splitting yields
two distinct Comlink modules — a proxy() marker from one Comlink instance
is unrecognized by the other's wrap(). Consumers can now do:

  import { proxy, transfer, releaseProxy } from '@/workers';

Refs: cross-review refactor-1
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nels

Names the two MessageChannel halves explicitly — parent (host-retained,
must be close()d on teardown) vs transferable (handed to the worker via
postMessage transfer list, detaches automatically after transfer).

The bootstrap handshake in performBoot now reads top-down, and the M1
port-close requirement in tearDownWorker becomes syntactically obvious
because parentRpcPort/parentEventPort capture the named 'parent' fields.

Includes 3 unit tests: shape (duck-typed since jsdom MessagePort doesn't
interop with toBeInstanceOf), bidirectional comms, and per-call freshness.

Refs: cross-review refactor-2
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
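
A rough sketch of the named-halves idea from the commit above. The `parent` and `transferable` field names come from the commit message; the function name `demoCreatePortPair` and the exact return shape are assumptions, not the contents of createPortPair.ts.

```typescript
// Hypothetical sketch: give the two MessageChannel ports role-based names.
function demoCreatePortPair() {
  const {port1, port2} = new MessageChannel();
  return {
    parent: port1, // retained by the host; must be close()d on teardown
    transferable: port2, // handed to the worker via the postMessage transfer list
  };
}
```

Each call creates a fresh channel, so an RPC pair and an event pair never share ports. During teardown the host would call `close()` on each retained `parent` port.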
README additions:
* 'Why we wrap handler errors in a __workerError envelope' section
  explaining the Comlink + WHATWG StructuredSerialize constraint
* 'Per-call timeout' section covering defaultCallTimeoutMs default,
  disable semantics, and the no-cancellation caveat
* 'Lifecycle hooks' section documenting subscribe() unsubscribe MUST
  contract with the canonical React useEffect pattern
* 'Known limitations' expanded with 5 entries covering timeout cancel,
  Error.cause loss, AsyncIterable proxying caveats
* 'Testing' section listing MockWorker fidelity gaps and pointing at
  the Playwright suite for end-to-end coverage

TSDoc additions:
* @example block on createWorkerHost showing typical usage with
  defaultCallTimeoutMs and WorkerCrashError handling
* MUST-contract paragraph on WorkerHost.subscribe() about unsubscribe
* KNOWN FIDELITY GAPS block on MockWorker @fileoverview

Refs: cross-review minor-14, minor-15, minor-16, minor-17, minor-18, minor-19
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the Playground.Workers namespace with sections, labels,
placeholders, actions, status, and metadata keys for all three locales.
Replaces the literal-fallback pattern that previously kept the
playground functional without locale-file edits.

Includes new keys for Phase 7 UX additions: status (idle/pending/
success/error), call-status label + 'no calls yet' empty state, and
new actions (timeoutSlow, emitEvents, throwError, clearLog) wired in
the Phase 7 island refactor.

Refs: cross-review enhancement-5
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tl v8 Map overflow

next-intl's createMessagesDeclaration generates a string-literal union
from messages/en.json. Our messages file is large enough (>200 KB,
~2,500 leaf keys) that TypeScript's recursiveTypeRelatedTo blows past
V8's Map entry limit (2^24) during build-time typecheck and crashes
with 'RangeError: Map maximum size exceeded'.

Compilation itself succeeds; we rely on CI to run a separate tsc step
against a smaller non-Next typecheck context. Documented in:
amannn/next-intl#2296

Refs: cross-review enhancement-2 (unblock playground build)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ress controls

Replaces the inline-styled prototype with production-quality UX:

* @arolariu/components primitives (Card, Button, Badge, Alert) instead
  of inline style props, matching the rest of the site's visual system.
* Discriminated CallState machine (idle/pending/success/error) so each
  branch is rendered explicitly, with the error category surfaced as
  a Badge (timeout/crash/dead/handler/aborted/unknown).
* aria-live='polite' status region announces call outcomes to screen
  readers; aria-busy flips on action buttons during pending.
* Per-button labels routed through next-intl with literal English
  fall-throughs so the page stays functional even with stale messages.
* New event-log size cap (200 entries) prevents stress tests from
  blowing up the DOM.

Worker API extended with two helpers used by Phase 8 stress scenarios:
* ping() — cheap RPC round-trip for boot-latency measurement.
* whatIsWindow() — returns typeof window so the playground can
  visually demonstrate worker realm isolation in production builds.

Stress section adds three buttons:
* Trigger timeout — spins up a transient host with a 100ms budget
  against a 5s sleep so WorkerTimeoutError fires deterministically.
* Emit 5 events — exercises the side-channel event port end-to-end.
* typeof window — surfaces the realm-isolation result.

Refs: cross-review enhancement-2, enhancement-3, enhancement-4
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…y gaps

Six new Playwright scenarios that exercise behaviors MockWorker cannot
fake, giving us production-fidelity coverage for paths that the unit
suite necessarily fakes:

* boot-latency: real worker spins up in measurable time (>1ms, <3s) —
  MockWorker boots synchronously so the unit suite cannot detect a
  regression in actual boot setup.
* realm-isolation: real worker reports typeof window === 'undefined' —
  MockWorker shares the host realm so window IS defined there.
* per-call-timeout: spins up a transient host with a 100ms budget
  against a 5s sleep and asserts WorkerTimeoutError fires.
* event-stream: emitEvents drives 5 entries through the side channel
  (end-to-end MessagePort event delivery, not faked).
* parallel-terminate: rapid restart while a call is in flight settles
  cleanly without hang or unhandled rejection (validates the inFlight
  drain documented in the SAFETY note from Phase 1).
* clear-event-log: empties UI without disturbing host state.

Also fixes the abort scenario assertion: it now reads the typed
'aborted' badge from call-status rather than searching the event log
for a string that didn't match the new payload format. The echo
assertion now matches JSON.stringify output.

Refs: cross-review enhancement-6
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Per PR #699 review feedback (arolariu @ next.config.ts:186): user will
manually ensure i18n typecheck doesn't crash; we should not silently
ship a config that hides real type errors. Reverts d198b24.

Refs: PR #699 review (arolariu, Copilot)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The /playground/workers/ route is dev-only (gated to 404 in production)
and is not user-facing. Routing every label through next-intl just
pollutes the message catalogs (en/ro/fr) with strings nobody will ever
read in production.

Per the user's directive on PR #699, drop the entire Playground.Workers
namespace from all three locale files, remove useTranslations + the
tr() fallback helper from island.tsx, and remove getTranslations from
page.tsx. All UI strings are now hardcoded English.

This also addresses Copilot's three i18n comments by going the other
direction: rather than adding the remaining i18n coverage, we remove
the partial coverage that existed.

Refs: PR #699 review (arolariu user directive, Copilot island.tsx:296,
369, 383)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…elope

The __workerError envelope is intentionally a plain object (see ENVELOPE
comment at the throw site) — Comlink's default throwTransferHandler and
WHATWG StructuredSerialize can't faithfully round-trip a real Error
subclass. Add a narrow eslint-disable on the throw line so 'npm run
lint' passes while preserving the envelope behavior.

Refs: PR #699 review (Copilot exposeWorker.ts:119)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously validateBootstrap only checked that 'capabilities' was a
non-null object. Per the SECURITY note (this is the trust boundary
between worker and host realms), the validator must also strictly
match the WorkerCapabilities shape so untrusted snapshots can't
silently corrupt worker code branches like 'if (caps.hasWebGpu)'.

Adds checks for the two required boolean fields (crossOriginIsolated,
hasWebGpu) and validates the optional numeric fields (hardwareConcurrency,
deviceMemory) when present.

Includes 5 new tests covering the new branches.

Refs: PR #699 review (Copilot workerEnvelope.ts:88)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
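
The stricter capabilities check described above could be sketched as a standalone predicate. The four field names come from the commit message; `DemoCapabilities` and `isCapabilities` are hypothetical names, and the real logic sits inside `validateBootstrap`.

```typescript
// Hypothetical mirror of the WorkerCapabilities shape named in the commit.
interface DemoCapabilities {
  crossOriginIsolated: boolean;
  hasWebGpu: boolean;
  hardwareConcurrency?: number;
  deviceMemory?: number;
}

function isCapabilities(value: unknown): value is DemoCapabilities {
  if (typeof value !== "object" || value === null) return false;
  const caps = value as Record<string, unknown>;
  // Required boolean fields.
  if (typeof caps.crossOriginIsolated !== "boolean") return false;
  if (typeof caps.hasWebGpu !== "boolean") return false;
  // Optional numeric fields: if present, must be a number.
  if (caps.hardwareConcurrency !== undefined && typeof caps.hardwareConcurrency !== "number") return false;
  if (caps.deviceMemory !== undefined && typeof caps.deviceMemory !== "number") return false;
  return true;
}
```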
…test

const [, ] = ... is an empty destructuring pattern that adds no value
and would trip no-empty-pattern lint rules. Replace with a bare
'await Promise.all([...])'.

Refs: PR #699 review (Copilot createWorkerHost.test.ts:507)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pulls in 2 commits from origin/preview (c76cc28, c37d9aa) that
adjusted package versions on the base branch. Resolves the only
conflict (package-lock.json) by taking preview's version wholesale —
this PR adds no new dependencies, so the lockfile delta belongs
entirely to preview.

The package.json change picks up preview's @mermaid-js/layout-elk
downgrade (0.2.1 → 0.1.9). No worker-foundation code or tests are
affected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Taking preview's package-lock.json wholesale during the merge dropped
the comlink dependency that this PR adds (preview didn't have it yet).
Running 'npm install' restored the correct entry.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Casting a string to () => Promise<unknown> made the test intent
opaque — readers had to chase the cast to realize the value was
never going to be called. Cast the WHOLE record instead, keeping
'version' as a real string. Now it's syntactically obvious that the
test exercises the typeof !== 'function' branch.

Refs: PR #699 review (Copilot exposeWorker.test.ts:106)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three interrelated robustness fixes in createWorkerHost.ts, addressing
all three remaining Copilot review comments on this file:

1) dispose() now drains inFlight BEFORE tearDownWorker, mirroring the
   load-bearing Comlink-hangs-on-closed-port pattern from handleCrash()
   and restart(). Without this drain, any in-flight host.api.foo()
   promise would hang forever per GoogleChromeLabs/comlink#601 because
   Comlink tracks only resolve, not reject. Drained calls reject with
   WorkerDeadError carrying the in-flight method names.
   (Copilot @ createWorkerHost.ts:744)

2) performBoot() wraps the synchronous boot setup in two try/catch
   blocks: one around opts.load() (CSP failure, bad URL, factory bug)
   and one around w.postMessage() (DataCloneError on bad payload).
   Either failure now transitions the host to dead and tears down
   instead of leaving it stranded in starting until the 10s bootstrap
   timeout fires. The unawaited ready promise gets .catch() so it
   doesn't surface as an unhandled rejection.
   (Copilot @ createWorkerHost.ts:320)

3) performBoot() now self-validates the constructed bootstrap message
   via validateBootstrap() before postMessage. Catches host-side
   capability-detection bugs locally instead of silently shipping a
   malformed shape to the worker (which would surface as a 10s
   bootstrap timeout). Resolves the doc/usage mismatch on
   workerEnvelope.ts:58 by making the validator actually used on both
   sides as the JSDoc claimed.
   (Copilot @ workerEnvelope.ts:58)

Three new unit tests cover the new branches:
* dispose drains in-flight calls with WorkerDeadError
* opts.load() throwing transitions host to dead
* postMessage() throwing transitions host to dead

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
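
The drain-before-teardown ordering from item 1 above can be illustrated with a small registry that keeps each call's reject callback. `InFlightRegistry` and its method names are hypothetical; the PR tracks in-flight calls inside createWorkerHost.ts and rejects them with `WorkerDeadError`, which is replaced here by a caller-supplied error factory.

```typescript
interface InFlightCall {
  method: string;
  reject: (reason: unknown) => void;
}

// Hypothetical sketch of an in-flight call registry.
class InFlightRegistry {
  private readonly calls = new Set<InFlightCall>();

  // Wrap a pending RPC so its reject callback stays reachable until it settles.
  track<T>(method: string, pending: Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const entry: InFlightCall = {method, reject};
      this.calls.add(entry);
      pending.then(
        (value) => {
          this.calls.delete(entry);
          resolve(value);
        },
        (error: unknown) => {
          this.calls.delete(entry);
          reject(error);
        },
      );
    });
  }

  // Reject every tracked call and forget it. Intended to run BEFORE the
  // worker is terminated, since the transport will never settle these calls.
  drain(makeError: (methods: string[]) => Error): number {
    const entries = Array.from(this.calls);
    const error = makeError(entries.map((entry) => entry.method));
    this.calls.clear();
    entries.forEach((entry) => entry.reject(error));
    return entries.length;
  }
}
```

A `dispose()` built on this would call `drain(...)` first and tear the worker down afterwards, matching the order the commit describes.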
When validateBootstrap fails or w.postMessage throws synchronously, the
'ready' promise was constructed but never explicitly rejected — the
attached .catch(() => {}) prevents the unhandled-rejection warning but
the promise (and its captured rejectBoot/event-port closures) lingers
until GC. Call rejectBoot?.(err) before throwing so the promise settles
deterministically and its .finally clears rejectBoot immediately.

Found in self-review of PR #699 after addressing Copilot's boot-try/catch
comment.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ed-host

useMemo(() => create(), []) caches the SAME host instance across React
Strict Mode's dev-only mount → unmount → remount. The first effect
cleanup calls host.dispose() — which is terminal — and the second
mount inherits the now-dead host so every host.api.* call rejects
with WorkerDeadError.

Switch to useState with a lazy initializer + an effect branch that
detects the disposed-from-strict-mode case and swaps in a fresh host.
In production (no Strict Mode double-mount) this collapses to the
normal create-once-dispose-on-unmount lifecycle.

Found in self-review of PR #699.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…m primitive (#699)

## Summary

Introduces a typed Web Worker foundation in
`sites/arolariu.ro/src/workers/` — a single factory (`createWorkerHost`)
that wraps Comlink, owns lifecycle, normalizes errors across the thread
boundary, bridges telemetry, snapshots capabilities, and propagates
AbortSignal. Ships with a dev-only `/playground/workers/` route that
exercises every documented behavior; that route doubles as the
Playwright integration target.

This PR is intentionally **decoupled from the Local AI Assistant**
(`feat/local-ai-assistant`). The AI feature will be rebased onto this
foundation in a separate follow-up PR.

- Spec:
`docs/superpowers/specs/2026-04-27-web-workers-foundation-design.md`
(gitignored — local workspace doc)
- Plan: `docs/superpowers/plans/2026-04-27-web-workers-foundation.md`
(gitignored)

## Architecture (one paragraph)

Hybrid: **Comlink** is the wire transport, and an in-house factory layer
(`createWorkerHost`) is the only thing consumers touch.
Service-worker-per-feature lifecycle (singleton; `createWorkerPool`
deferred). Two `MessageChannel`s per worker — RPC (Comlink) +
telemetry/events (us). Strict surface-only error model with explicit
`restart()`. SAB-aware via `crossOriginIsolated` runtime check; no
COOP/COEP enabled in this PR.

## What's in

- `src/workers/host/` — `createWorkerHost`, state machine
(`workerLifecycle`), error types (5 classes), telemetry bridge,
capability detection, AbortSignal plumbing (parent-side, mid-flight +
pre-aborted), lazy reboot.
- `src/workers/runtime/` — `expose`, `emitEvent`, `getEventPort` helpers
for worker-side code.
- `src/workers/{index,host/index,runtime/index}.ts` — public + internal
barrels.
- `src/workers/README.md` — one-page consumer guide.
- `src/app/playground/workers/` — dev-only gated route (`notFound()`
outside dev/staging) with interactive UI.
- `src/app/playground/workers/worker-playground.spec.ts` — Playwright
spec covering boot / echo / abort / crash-restart / capabilities (manual
run).
- `comlink ^4.4.2` — the only new dependency.

## Explicitly OUT of scope (deferred)

- AI assistant integration (separate follow-up PR;
`feat/local-ai-assistant` stays untouched).
- CSP changes / `next.config.ts` edits — existing `preview` CSP already
permits same-origin module workers.
- COOP/COEP / SharedArrayBuffer enablement.
- `createWorkerPool` primitive (waiting for a real second consumer).
- Worker-side OpenTelemetry SDK (wire reserves the `span` event shape;
no SDK ships).
- ESLint rule preventing direct `new Worker()` outside the foundation.
- Worker-side AbortSignal forwarding (parent-side rejection works;
worker handler runs to completion — see README "Known limitations").

## Test plan

- [x] `npm run test:unit` passes — 110 unit tests, coverage on
`src/workers/host/` is 97.75% statements / 90.18% branches / 96.62%
functions / 98.45% lines.
- [ ] `npm run lint` passes (manual).
- [ ] `npm run test:e2e:frontend` passes the new
`worker-playground.spec.ts` — 6 scenarios (boot, echo, abort,
crash/restart, capabilities×2) (manual).
- [ ] `npm run build:website` succeeds and emits a separate chunk for
`playground.worker.ts` (manual; verify via build output search).
- [ ] In a production build, `GET /playground/workers/` returns 404 —
`SITE_ENV=PRODUCTION npm run build && npm start`, then curl (manual).
- [ ] In a dev build, the playground loads, all UI sections render,
every button performs the expected lifecycle transition (manual).
- [ ] No `new Worker(` outside the foundation — `grep -r "new Worker("
sites/arolariu.ro/src --exclude-dir=workers` (manual).
- [x] No changes to `next.config.ts` (verified via diff).
- [x] Exactly one new dependency added (`comlink`).

## Notable design decisions

- **Two-channel wire protocol** (RPC + events) — telemetry events can't
backpressure RPC; a bug in event handling can't wedge the call graph.
- **Explicit `restart()`, no auto-respawn** — a stateful worker losing
its state is a user-visible event the consumer must handle, not infra to
silently mask.
- **Lazy reboot is invisible** — when the idle timer fires, the worker
is silently disposed but `state` stays `"ready"`; the next call
lazy-boots transparently. Consumers who want the lifecycle signal
subscribe to `WorkerEvent`s instead of `state`.
- **`AbortSignal` as last argument** — runtime convention, not a
type-system invariant. The host detects via `instanceof`, races the call
body against signal abort, and rejects the consumer's promise. The
worker handler doesn't currently receive the signal — documented
limitation.
- **Error envelope** — worker-side throws plain `{__workerError, name,
message, stack}` (Comlink-friendly); parent-side rewraps as
`WorkerError(cause, method)`. Consumers never need string-matching error
checks.
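
The `AbortSignal` convention from the list above can be sketched as two small helpers. Both names (`splitSignal`, `raceAbort`) are hypothetical, and stripping the signal from the forwarded arguments is an assumption based on the note that the worker handler does not receive it.

```typescript
// Split a trailing AbortSignal off the argument list (a runtime convention,
// not something the type system enforces).
function splitSignal(args: readonly unknown[]): {callArgs: unknown[]; signal: AbortSignal | undefined} {
  const last = args[args.length - 1];
  if (last instanceof AbortSignal) {
    return {callArgs: args.slice(0, -1), signal: last};
  }
  return {callArgs: args.slice(), signal: undefined};
}

// Race the call body against the signal; only the consumer's promise is rejected.
function raceAbort<T>(call: () => Promise<T>, signal: AbortSignal | undefined): Promise<T> {
  if (signal === undefined) return call();
  // Pre-aborted: reject without invoking the call at all.
  if (signal.aborted) return Promise.reject(signal.reason ?? new Error("aborted"));

  const pending = call();
  return new Promise<T>((resolve, reject) => {
    const onAbort = (): void => reject(signal.reason ?? new Error("aborted"));
    signal.addEventListener("abort", onAbort, {once: true});
    // Detach the listener on settle so long-lived signals do not retain closures.
    const detach = (): void => signal.removeEventListener("abort", onAbort);
    pending.then(
      (value) => {
        detach();
        resolve(value);
      },
      (error: unknown) => {
        detach();
        reject(error);
      },
    );
  });
}
```

A proxy interceptor could call `splitSignal` on the incoming arguments, forward `callArgs` over RPC, and wrap the resulting promise with `raceAbort`.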

## What follow-ups will need

After this lands, separate PRs can incrementally add:
1. **AI assistant rebase** — port the existing `feat/local-ai-assistant`
onto this foundation (the headline reason this PR exists).
2. **CSP hardening** — tighter `worker-src 'self' blob:`, explicit
`connect-src` allowlist for external worker fetch hosts, fixing the
no-op `event.origin` check pattern.
3. **Worker pool primitive** — once a second stateless consumer (e.g.,
OCR) materializes.
4. **Worker-side cancel-message** — propagate AbortSignal across the
thread boundary so handlers can honor it cooperatively.
5. **Worker-side OTel SDK** — full W3C trace propagation; the wire
envelope already reserves the `span` event shape for forward
compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces a new Web Worker foundation for sites/arolariu.ro, providing a typed host/runtime handshake (via Comlink), lifecycle management (idle reboot, restart/dispose), and a telemetry side-channel, plus a dev-only playground route and Playwright coverage to validate real-worker behavior.

Changes:

  • Added @/workers module with host/ + runtime/ layers, bootstrap protocol, lifecycle state machine, error taxonomy, and telemetry bridge.
  • Added a dev/staging-only /playground/workers/ route with a demo worker to exercise boot/calls/abort/crash/restart/capabilities/event streaming.
  • Added unit tests for the foundation and a Playwright spec covering production-fidelity worker behaviors; added comlink dependency.

Reviewed changes

Copilot reviewed 29 out of 30 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
sites/arolariu.ro/src/workers/runtime/index.ts Worker-side runtime barrel exports (expose, event helpers).
sites/arolariu.ro/src/workers/runtime/exposeWorker.ts Worker-side bootstrap handler + Comlink exposure + error envelope normalization.
sites/arolariu.ro/src/workers/runtime/exposeWorker.test.ts Unit tests for worker-side expose() bootstrap behavior.
sites/arolariu.ro/src/workers/runtime/emitEvent.ts Best-effort worker→host event emission helper.
sites/arolariu.ro/src/workers/runtime/emitEvent.test.ts Unit tests for emitEvent() behavior and swallow-on-failure contract.
sites/arolariu.ro/src/workers/README.md Worker foundation documentation: rules, lifecycle, errors, limitations, testing notes.
sites/arolariu.ro/src/workers/index.ts Public @/workers barrel (host factory/types/errors + re-export Comlink markers).
sites/arolariu.ro/src/workers/host/workerLifecycle.ts Pure lifecycle state machine + idle timer + subscription support.
sites/arolariu.ro/src/workers/host/workerLifecycle.test.ts Unit tests for lifecycle transitions, idle scheduling, and guards.
sites/arolariu.ro/src/workers/host/workerErrors.ts Worker host error taxonomy (WorkerError, WorkerCrashError, etc.).
sites/arolariu.ro/src/workers/host/workerErrors.test.ts Unit tests for error classes and message/field behavior.
sites/arolariu.ro/src/workers/host/workerEnvelope.ts Wire protocol types (bootstrap + WorkerEvent) + bootstrap validator.
sites/arolariu.ro/src/workers/host/workerEnvelope.test.ts Unit tests for bootstrap validator acceptance/rejection cases.
sites/arolariu.ro/src/workers/host/workerCapabilities.ts Capability snapshot sampling (crossOriginIsolated, WebGPU presence, etc.).
sites/arolariu.ro/src/workers/host/workerCapabilities.test.ts Unit tests for capability sampling behavior under mocked globals.
sites/arolariu.ro/src/workers/host/telemetryBridge.ts Parent-side call wrapping + worker event ingestion into a logger sink.
sites/arolariu.ro/src/workers/host/telemetryBridge.test.ts Unit tests for telemetry wrapping and event ingestion routing.
sites/arolariu.ro/src/workers/host/mockWorker.ts Test-only mock worker implementation to exercise host behavior without a real thread.
sites/arolariu.ro/src/workers/host/index.ts Internal host barrel for @/workers exports.
sites/arolariu.ro/src/workers/host/createWorkerHost.ts Core host implementation: bootstrap handshake, proxy interception, abort/timeout/crash/restart/dispose semantics.
sites/arolariu.ro/src/workers/host/createWorkerHost.test.ts Comprehensive unit tests for host lifecycle, crash handling, restart/dispose edge cases, abort + timeout.
sites/arolariu.ro/src/workers/host/createPortPair.ts Utility for naming MessageChannel halves (parent vs transferable).
sites/arolariu.ro/src/workers/host/createPortPair.test.ts Unit tests for port-pair creation and isolation across calls.
sites/arolariu.ro/src/app/playground/workers/worker-playground.spec.ts Playwright E2E spec validating real-worker behaviors via the playground UI.
sites/arolariu.ro/src/app/playground/workers/playground.worker.ts Playground demo worker exposing a surface-rich API for exercising the foundation.
sites/arolariu.ro/src/app/playground/workers/page.tsx Dev/staging-only route gate + metadata for the worker playground.
sites/arolariu.ro/src/app/playground/workers/island.tsx Client UI to drive worker calls, show state/errors, and display event logs.
sites/arolariu.ro/package.json Adds comlink to site dependencies.
package.json Adds pinned root comlink dependency.
package-lock.json Lockfile updates for comlink installation.
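
For readers unfamiliar with the `createPortPair.ts` entry above, the idea of "naming MessageChannel halves" can be sketched as follows. This is a minimal illustration, not the PR's implementation; the returned field names `parent` and `transferable` are assumptions taken from the file description.

```typescript
// Hypothetical sketch: wrap a MessageChannel so each half has a descriptive name.
// `parent` stays on the creating side; `transferable` is meant to be handed
// to the worker via postMessage's transfer list.
function createPortPair() {
  const {port1, port2} = new MessageChannel();
  return {parent: port1, transferable: port2};
}
```

Each call creates a fresh channel, so ports from different pairs never share messages, which is presumably the isolation property `createPortPair.test.ts` verifies.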

Comment on lines +74 to +82
export function validateBootstrap(message: unknown): message is WorkerBootstrap {
  if (typeof message !== "object" || message === null) {
    return false;
  }
  const m = message as Record<string, unknown>;
  if (m.kind !== "bootstrap") return false;
  if (m.version !== WORKER_PROTOCOL_VERSION) return false;

  // Check for MessagePort by duck typing: require a callable postMessage method.
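
The excerpt above is cut off before the duck-typing check itself. A self-contained sketch of how such a check might be completed is shown below. The `port` field name, the protocol version value `1`, and the `BootstrapSketch` type are assumptions made for illustration; the real `WorkerBootstrap` shape is not visible in this excerpt.

```typescript
// Assumed value; the real constant lives in workerEnvelope.ts.
const WORKER_PROTOCOL_VERSION = 1;

// Hypothetical minimal bootstrap shape for this sketch only.
interface BootstrapSketch {
  kind: "bootstrap";
  version: number;
  port: {postMessage: (message: unknown) => void};
}

function isBootstrapSketch(message: unknown): message is BootstrapSketch {
  if (typeof message !== "object" || message === null) return false;
  const m = message as Record<string, unknown>;
  if (m.kind !== "bootstrap") return false;
  if (m.version !== WORKER_PROTOCOL_VERSION) return false;

  // Duck typing: accept anything object-like with a callable postMessage,
  // so mocks and real MessagePort instances both pass.
  const port = m.port;
  if (typeof port !== "object" || port === null) return false;
  return typeof (port as {postMessage?: unknown}).postMessage === "function";
}
```

Duck typing avoids an `instanceof MessagePort` check, which can be awkward when tests substitute a mock port for a real one.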
Comment on lines +99 to +101
// Optional numeric fields: if present, must be a number.
if (caps.hardwareConcurrency !== undefined && typeof caps.hardwareConcurrency !== "number") return false;
if (caps.deviceMemory !== undefined && typeof caps.deviceMemory !== "number") return false;
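
One detail worth keeping in mind for the check above: `typeof value === "number"` is also true for `NaN` and `Infinity`. If the validator is meant to reject those, a stricter variant could look like the sketch below. `isOptionalFiniteNumber` is a hypothetical helper name, not something defined in the PR.

```typescript
// Hypothetical helper: an optional field passes when it is absent
// or when it is a finite number (NaN and +/-Infinity are rejected).
function isOptionalFiniteNumber(value: unknown): boolean {
  return value === undefined || (typeof value === "number" && Number.isFinite(value));
}
```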
Comment on lines +368 to +392
event.parent.onmessage = (e: MessageEvent): void => {
  const ev = e.data as WorkerEvent;
  if (ev.kind === "ready") {
    if (bootTimeoutId !== null) {
      clearTimeout(bootTimeoutId);
      bootTimeoutId = null;
    }
    // Swap to the steady-state listener that ingests events.
    event.parent.onmessage = (next: MessageEvent): void => {
      const nextEv = next.data as WorkerEvent;
      // I4: Filter stray `ready` events that arrive after bootstrap.
      // Bootstrap-ready is consumed by the boot promise itself; never forward.
      if (nextEv.kind === "ready") return;
      opts.onEvent?.(nextEv);
      bridge.ingestEvent(nextEv);
    };
    resolve();
    return;
  }
  // E: Bootstrap `ready` is handled in the branch above (consumed by the
  // boot promise; never forwarded). Anything else that arrives before
  // the handshake is forwarded defensively to keep behavior parity with
  // the steady-state listener for non-`ready` events.
  opts.onEvent?.(ev);
  bridge.ingestEvent(ev);
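
The two-phase listener in the excerpt above can be reduced to a small, testable sketch. The names `installHandshake`, `PortLike`, and `WorkerEventSketch` are hypothetical; `onReady` stands in for the boot promise's `resolve`, and the boot timeout handling is left out for brevity.

```typescript
// Hypothetical, simplified event and port shapes for this sketch only.
type WorkerEventSketch = {kind: "ready"} | {kind: "log"; msg: string};

interface PortLike {
  onmessage: ((e: {data: WorkerEventSketch}) => void) | null;
}

function installHandshake(port: PortLike, onReady: () => void, onEvent: (ev: WorkerEventSketch) => void): void {
  // Phase 1: boot listener. Waits for the first `ready`.
  port.onmessage = (e) => {
    const ev = e.data;
    if (ev.kind === "ready") {
      // Phase 2: steady-state listener. Later `ready` events are dropped.
      port.onmessage = (next) => {
        if (next.data.kind === "ready") return;
        onEvent(next.data);
      };
      onReady();
      return;
    }
    // Non-`ready` events that arrive before the handshake are still forwarded.
    onEvent(ev);
  };
}
```

Swapping the handler rather than branching on a flag keeps the steady-state path free of boot-only logic, at the cost of having two places that forward events.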
Comment on lines +71 to +83
ingestEvent(event: WorkerEvent): void {
  switch (event.kind) {
    case "ready":
      return; // consumed by workerLifecycle; not forwarded
    case "log":
      logger[event.level](`[worker:${name}] ${event.msg}`, event.attrs);
      return;
    case "metric":
      logger.debug(`[worker:${name}] metric`, {worker: name, name: event.name, value: event.value, unit: event.unit, attrs: event.attrs});
      return;
    case "span":
      logger.debug(`[worker:${name}] span`, {worker: name, name: event.name, startMs: event.startMs, durationMs: event.durationMs, attrs: event.attrs});
      return;
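
A runnable reduction of the routing shown above, for experimenting outside the host. The `LoggerSketch` shape, the four log levels, and the `createIngestSketch` factory are assumptions; only the `ready`, `log`, and `metric` branches from the excerpt are reproduced.

```typescript
// Hypothetical types for this sketch; the real WorkerEvent union is larger.
type LevelSketch = "debug" | "info" | "warn" | "error";
type AttrsSketch = Record<string, unknown>;
type LoggerSketch = Record<LevelSketch, (msg: string, attrs?: AttrsSketch) => void>;
type EventSketch =
  | {kind: "ready"}
  | {kind: "log"; level: LevelSketch; msg: string; attrs?: AttrsSketch}
  | {kind: "metric"; name: string; value: number; unit?: string; attrs?: AttrsSketch};

function createIngestSketch(workerName: string, logger: LoggerSketch): (event: EventSketch) => void {
  return (event) => {
    switch (event.kind) {
      case "ready":
        return; // lifecycle signal; never reaches the logger
      case "log":
        // The event's own level selects the logger method.
        logger[event.level](`[worker:${workerName}] ${event.msg}`, event.attrs);
        return;
      case "metric":
        logger.debug(`[worker:${workerName}] metric`, {
          worker: workerName,
          name: event.name,
          value: event.value,
          unit: event.unit,
          attrs: event.attrs,
        });
        return;
    }
  };
}
```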
Comment on lines +104 to +110
// Include a non-function value (a plain string). Cast the WHOLE record
// so the type system accepts the heterogeneous shape — keeping `version`
// a real string (not a fake function) makes the intent of exercising
// the `typeof value !== "function"` branch obvious.
expose({greet: async () => "hi", version: "1.0.0"} as unknown as Record<string, () => Promise<unknown>>, {
  self: fakeSelf as unknown as DedicatedWorkerGlobalScope,
});

github-actions Bot commented May 6, 2026

❌ Code Hygiene Report: Issues Found

Commit: efcf55f | PR: #711

📑 Table of Contents

Section Status
📊 Code Statistics
🎨 Formatting
🔍 Linting
🧪 Unit Tests

📋 Check Summary

Check Status Duration Summary
📊 Stats 546ms 0 files changed, +0 -0
🎨 Format 1m 19s 11 file(s) need formatting
🔍 Lint 3.1s 3 error(s), 0 warning(s)
🧪 Test 2m 5s All 1147 tests passed

📊 Code Statistics

Changes vs Main Branch

Metric Value
📁 Files Changed 0
➕ Lines Added +0
➖ Lines Deleted -0
🔄 Churn 0
📈 Net Change +0

🔄 Changes Since Previous Commit

Metric Value
Files Changed 30
Lines Added +4256
Lines Deleted -0

📦 Bundle Size Analysis (vs Main)

`sites/arolariu.ro` - +170 kB (28 file(s) changed)
File Main Preview Diff Status
package.json 3.67 kB 3.69 kB +20 B 📝
src/app/playground/workers/island.tsx 0 B 17 kB +17 kB 🆕
src/app/playground/workers/page.tsx 0 B 1.61 kB +1.61 kB 🆕
src/app/playground/workers/playground.worker.ts 0 B 2.88 kB +2.88 kB 🆕
src/app/playground/workers/worker-playground.spec.ts 0 B 7.49 kB +7.49 kB 🆕
src/workers/README.md 0 B 9.27 kB +9.27 kB 🆕
src/workers/host/createPortPair.test.ts 0 B 1.72 kB +1.72 kB 🆕
src/workers/host/createPortPair.ts 0 B 1.73 kB +1.73 kB 🆕
src/workers/host/createWorkerHost.test.ts 0 B 35.8 kB +35.8 kB 🆕
src/workers/host/createWorkerHost.ts 0 B 34.5 kB +34.5 kB 🆕
src/workers/host/index.ts 0 B 580 B +580 B 🆕
src/workers/host/mockWorker.ts 0 B 4.62 kB +4.62 kB 🆕
src/workers/host/telemetryBridge.test.ts 0 B 7.09 kB +7.09 kB 🆕
src/workers/host/telemetryBridge.ts 0 B 3.89 kB +3.89 kB 🆕
src/workers/host/workerCapabilities.test.ts 0 B 2.44 kB +2.44 kB 🆕
src/workers/host/workerCapabilities.ts 0 B 2.05 kB +2.05 kB 🆕
src/workers/host/workerEnvelope.test.ts 0 B 3.93 kB +3.93 kB 🆕
src/workers/host/workerEnvelope.ts 0 B 4.18 kB +4.18 kB 🆕
src/workers/host/workerErrors.test.ts 0 B 2.3 kB +2.3 kB 🆕
src/workers/host/workerErrors.ts 0 B 2.52 kB +2.52 kB 🆕
... ... ... ... 8 more files

Total: 10 MB → 10.2 MB (+170 kB)

`sites/api.arolariu.ro` - no change (0 file(s) changed)

No changes in this folder

Total: 2.14 MB → 2.14 MB (no change)

`sites/docs.arolariu.ro` - no change (0 file(s) changed)

No changes in this folder

Total: 214 kB → 214 kB (no change)

🎨 Formatting

11 file(s) need formatting:

  • sites/arolariu.ro/src/app/playground/workers/island.tsx
  • sites/arolariu.ro/src/workers/host/createWorkerHost.test.ts
  • sites/arolariu.ro/src/workers/host/createWorkerHost.ts
  • sites/arolariu.ro/src/workers/host/index.ts
  • sites/arolariu.ro/src/workers/host/mockWorker.ts
  • sites/arolariu.ro/src/workers/host/telemetryBridge.test.ts
  • sites/arolariu.ro/src/workers/host/telemetryBridge.ts
  • sites/arolariu.ro/src/workers/host/workerErrors.test.ts
  • sites/arolariu.ro/src/workers/index.ts
  • sites/arolariu.ro/src/workers/runtime/exposeWorker.test.ts
  • sites/arolariu.ro/src/workers/runtime/index.ts

🔧 How to Fix

npm run format

🔍 Linting

❌ ESLint found 3 error(s) and 0 warning(s)

> @arolariu/monorepo@0.0.0 lint
> node scripts/lint.ts all


╔════════════════════════════════════════╗
║    arolariu.ro Code Linter Tool        ║
╚════════════════════════════════════════╝


🔎 Running ESLint for: all
⏱️  Running lint on all targets in parallel...

  🧵 Dispatching parallel workers...
     Main process PID: 2929
     Worker pool: min=1, max=3

[11:24:19.195] 🚀 Worker #1 spawned for task "packages"
[11:24:19.195] 🚀 Worker #2 spawned for task "website"
[11:24:19.195] 🚀 Worker #3 spawned for task "cv"


  ⏳ Progress: [░░░░░░░░░░░░░░░░░░░░] 0/3 workers completed
  ⏳ Progress: [███████░░░░░░░░░░░░░] 1/3 workers completed
  ⏳ Progress: [█████████████░░░░░░░] 2/3 workers completed
  ⏳ Progress: [████████████████████] 3/3 workers completed

[11:24:22.047] ❌ Worker #1 finished "packages" in 2.48s
[11:24:22.047] ❌ Worker #2 finished "website" in 2.56s
[11:24:22.047] ❌ Worker #3 finished "cv" in 2.58s

  📊 Worker Timeline
  ──────────────────────────────────────────────────────────────
  packages   │██████████████████████████████████████░░│    2.48s
  website    │████████████████████████████████████████│    2.56s
  cv         │████████████████████████████████████████│    2.58s
  ──────────────────────────────────────────────────────────────
              0s                            2.58s

─────────────────────────────────────────────────

🔍 ESLint config: [@arolariu/packages] [Worker #1]
   [init: 2436ms, work: 0ms, total: 2478ms] [0 files] [168.58 MB]
  ✗ Worker error: You are using an outdated version of the 'jiti' library. Please update to the latest version of 'jiti' to ensure compatibility and access to the latest features.
─────────────────────────────────────────────────

─────────────────────────────────────────────────

🔍 ESLint config: [@arolariu/website] [Worker #2]
   [init: 2534ms, work: 0ms, total: 2565ms] [0 files] [168.59 MB]
  ✗ Worker error: You are using an outdated version of the 'jiti' library. Please update to the latest version of 'jiti' to ensure compatibility and access to the latest features.
─────────────────────────────────────────────────

─────────────────────────────────────────────────

🔍 ESLint config: [@arolariu/cv] [Worker #3]
   [init: 2554ms, work: 0ms, total: 2577ms] [0 files] [167.62 MB]
  ✗ Worker error: You are using an outdated version of the 'jiti' library. Please update to the latest version of 'jiti' to ensure compatibility and access to the latest features.
─────────────────────────────────────────────────

  📊 Resource Usage:
     Total files linted: 0
     Peak memory (max worker): 168.59 MB
     Combined memory (all workers): 504.78 MB

📊 Summary: 3 error(s), 0 warning(s)

❌ Linting completed with errors



🔧 How to Fix

npm run lint

🧪 Unit Tests

✅ All 1147 tests passed in 4.0s


🔗 View Workflow Run | Generated at 2026-05-06T11:28:28.634Z
