feat(invoice-ai): local invoice AI assistant (Layer 1 embedding + Layer 2 slot LLM) #712

Draft
arolariu wants to merge 44 commits into preview from
feat/invoice-ai-assistant

Owner

@arolariu arolariu commented May 6, 2026

Summary

Adds a fully-local Invoice AI assistant: natural-language analytical queries ("top merchants last month?", "cum am cheltuit luna trecută?", "comparer mes dépenses ce mois avec le mois dernier") over invoices in IndexedDB. No network calls — all data stays in the browser.

Architecture (per design doc docs/superpowers/specs/2026-05-06-local-invoice-ai-assistant-design.md):

  • Layer 1 (eager, all devices): Multilingual embedding classifier — Xenova/multilingual-e5-small on Transformers.js (~118 MB WASM). Cosine-ranks the user's question against 300 precomputed seed-phrase embeddings (10 intents × 10 phrasings × 3 locales). Returns top-3 candidates + score.
  • Layer 2 (opt-in, hardware-eligible only): Slot extractor — Qwen2.5-1.5B-Instruct-q4f16_1-MLC on WebLLM MLCEngine (~1 GB WebGPU). JSON-mode + temperature=0 for deterministic slot extraction when the embedding signal is uncertain.
  • Both ride on createWorkerHost<TApi> from PR feat(workers): introduce Web Worker foundation as first-class platform primitive #699.
  • Aggregators are deterministic TypeScript over useInvoicesStore.entities. The LLM never sees invoice data.
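
The Layer 1 ranking step described above can be sketched roughly as follows. This is an illustration only, not the PR's code: `SeedEmbedding`, `Candidate`, and `rankIntents` are hypothetical names, and keeping the best score per intent before taking the top 3 is an assumption about how candidates are deduplicated.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
interface SeedEmbedding {
  intent: string;
  vector: number[];
}

interface Candidate {
  intent: string;
  score: number;
}

/** Cosine similarity of two equal-length vectors (0 when either is all zeros). */
function cosine(a: readonly number[], b: readonly number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

/** Keeps the best score per intent, then returns the k highest-scoring intents. */
function rankIntents(query: readonly number[], seeds: readonly SeedEmbedding[], k = 3): Candidate[] {
  const best = new Map<string, number>();
  for (const seed of seeds) {
    const score = cosine(query, seed.vector);
    const previous = best.get(seed.intent);
    if (previous === undefined || score > previous) best.set(seed.intent, score);
  }
  return Array.from(best.entries())
    .map(([intent, score]) => ({intent, score}))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

Only the user's question needs encoding at query time; the seed vectors are read from the precomputed matrix, which is why the ranking itself stays cheap.
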

What's included

| Layer | Files | Tests |
| --- | --- | --- |
| Hardware eligibility gate | hardwareEligibility.ts | 13 |
| Intent catalog | intents/catalog.ts | (registry-driven) |
| Slot lexicon (en/ro/fr) | intents/slotLexicon.ts | 46 |
| Intent resolver | intents/intentResolver.ts | 8 |
| Seed phrases (300) | intents/seedPhrases.{en,ro,fr}.ts | (data) |
| Aggregator fixtures | aggregators/__fixtures__/*.ts | (data) |
| Shared aggregator helpers | aggregators/shared.ts | 8 |
| 10 aggregators | aggregators/{totalSpend,invoiceCount,...}.ts | 41 |
| Aggregator registry | aggregators/index.ts | (registry-driven) |
| i18n namespace (en/ro/fr) | messages/{en,ro,fr}.json | (data) |
| Answer renderer | renderer/answerRenderer.ts | 5 |
| 4 viz primitives | renderer/viz/*.tsx | 4 |
| Assistant reducer | assistantReducer.ts | 12 |
| Build-time embeddings | scripts/generate.embeddings.ts + seedEmbeddings.json | (script) |
| Layer 1 embedding worker | workers/embedding.{api,implementation,worker}.ts | 3 |
| Layer 2 slot extractor worker | workers/slotExtractor.{api,implementation,worker}.ts | 4 |
| Worker host factories | hosts/{embeddingHost,slotExtractorHost}.ts | (thin) |
| useInvoiceAssistant hook | useInvoiceAssistant.tsx | 2 |
| AssistantMessage component | AssistantMessage.tsx | 3 |
| AssistantPanel component | AssistantPanel.tsx | 4 |
| Wire into GenerativeView | view-invoices/_components/views/GenerativeView.tsx | (188 existing pass) |
| Calibration tool | scripts/calibrate-assistant-embeddings.ts | (manual) |
| Playwright E2E spec | view-invoices/_components/views/generative-view.spec.ts | 4 scenarios |

Total: 38 atomic commits, ~4628 LOC added, 166 new unit tests (all green), 4 Playwright E2E scenarios. Full npx vitest run suite: 1982/1982 passing, coverage 94.66% lines / 96.53% functions / 95.15% statements / 82.81% branches.

Architectural locks (from brainstorm)

  • T1: analytics-only scope (no Q&A over arbitrary invoice content)
  • Architecture (1): intent classifier + slot extractor (not retrieval-augmented)
  • Implementation (B+C fallback): hybrid embedding + tiny LLM, with heuristic fallback when LLM-eligible hardware isn't available
  • L2: multilingual (en/ro/fr)
  • M1: single-shot (no multi-turn dialog)
  • I2: 10 intents
  • P1: replaces stub body of GenerativeView chat tab (settings tab preserved)
  • H1: session-only history (cleared on tab unmount; capped at 50 entries)
  • D2: Layer 1 eager / Layer 2 opt-in CTA

Caveats / followups

  1. Calibration shows significant intent overlap in the seed-phrase corpus (intra-class p10 = 0.85 sits below inter-class p90 = 0.90). Current CONFIDENCE_THRESHOLDS (canonical 0.75 / uncertain 0.55) intentionally err generous; tighten only after a larger seed corpus per locale is added.
  2. CSP: I left next.config.ts unchanged per directive. WebLLM + Transformers.js will require script-src 'wasm-unsafe-eval' + worker-src blob: — manual update needed before production.
  3. Branch coverage (82.81%) is ~7 percentage points below the 90% threshold because some viz/panel/hook branches are exercised end-to-end via Playwright rather than vitest. Not a regression; the threshold is pre-existing and repo-wide.
  4. ESLint config has a pre-existing jiti version issue unrelated to this PR; lint pass deferred until that's resolved at the repo level.
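
For caveat 2, a rough sketch of how the two CSP additions might be merged into an existing policy string. `addCspSources` and `baseCsp` are hypothetical; the actual policy in next.config.ts is not shown in this PR, so the starting value below is an assumption.

```typescript
/** Adds sources to one directive of a CSP header value, creating the directive if it is missing. */
function addCspSources(csp: string, directive: string, sources: readonly string[]): string {
  const parts = csp
    .split(";")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
  let found = false;
  const updated = parts.map((part) => {
    const tokens = part.split(/\s+/);
    if (tokens[0] !== directive) return part;
    found = true;
    const existing = tokens.slice(1);
    // Append only the sources that are not already listed.
    const merged = [...existing, ...sources.filter((source) => !existing.includes(source))];
    return [directive, ...merged].join(" ");
  });
  if (!found) updated.push([directive, ...sources].join(" "));
  return updated.join("; ");
}

// Hypothetical starting policy, for illustration only.
const baseCsp = "default-src 'self'; script-src 'self'";
const assistantCsp = addCspSources(
  addCspSources(baseCsp, "script-src", ["'wasm-unsafe-eval'"]),
  "worker-src",
  ["blob:"],
);
// assistantCsp is "default-src 'self'; script-src 'self' 'wasm-unsafe-eval'; worker-src blob:"
```

A production policy would probably list further worker-src sources alongside blob:, depending on how the worker bundles are served.
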

How to test locally

cd .worktrees/feat/invoice-ai-assistant
npm install                      # if not already
npm run build:components         # build @arolariu/components dist
node scripts/generate.embeddings.ts  # only needed once; downloads 118 MB model
cd sites/arolariu.ro
npx vitest run src/app/domains/invoices/_components/ai/   # 166 unit tests
npm run dev:website              # then visit /domains/invoices/view-invoices

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

arolariu and others added 30 commits May 6, 2026 17:36
Adds the two model-runtime dependencies for the local invoice AI
assistant. @xenova/transformers (Transformers.js) drives the Layer 1
multilingual-e5-small embedding classifier on every device.
@mlc-ai/web-llm drives the opt-in Layer 2 Qwen2.5-1.5B slot extractor
on WebGPU-eligible hardware.

Spec: docs/superpowers/specs/2026-05-06-local-invoice-ai-assistant-design.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The barrel re-exports AssistantPanel which will be added in Phase 10.
Until then this file's import will TS-error — expected and intentional;
future tasks have a specific shape for AssistantPanel.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds the shared type vocabulary used across every layer of the AI
assistant: AssistantLocale (en/ro/fr), IntentId (10 intents from v1
catalog), Timeframe (12 canonical windows), VizHint, and the
CONFIDENCE_THRESHOLDS constant pair.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
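
A minimal sketch of how the CONFIDENCE_THRESHOLDS pair from the commit above could route a Layer 1 score. The values 0.75 / 0.55 come from the PR description; the `Route` names and the exact routing rules are assumptions, not the PR's implementation.

```typescript
// Values taken from the PR description; everything else here is illustrative.
const CONFIDENCE_THRESHOLDS = {canonical: 0.75, uncertain: 0.55} as const;

type Route = "use-layer1" | "ask-layer2" | "heuristic-fallback" | "out-of-scope";

/**
 * Assumed routing: a confident score is accepted directly, a middling score goes to
 * the Layer 2 slot extractor when it is active (otherwise to the lexicon heuristics),
 * and a low score is treated as out of scope.
 */
function routeByConfidence(score: number, layer2Active: boolean): Route {
  if (score >= CONFIDENCE_THRESHOLDS.canonical) return "use-layer1";
  if (score >= CONFIDENCE_THRESHOLDS.uncertain) return layer2Active ? "ask-layer2" : "heuristic-fallback";
  return "out-of-scope";
}
```
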
Layer 1 (embedding model) ships to every device on WASM. This gate
only decides whether to offer the Layer 2 (~1 GB Qwen-1.5B) opt-in
CTA. Returns eligible | ineligible | unknown with machine-readable
reason codes.

Hard gates: workers-unavailable, webgpu-unavailable,
webgpu-adapter-unavailable, storage-quota-too-low.
Soft gates (only when reported): memory-too-low, cpu-too-low.

8 unit tests cover all branches.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
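
The gate logic in the commit above might look roughly like the sketch below, written against an already-probed snapshot so it stays synchronous. The reason codes are the ones listed in the commit message; `HardwareSnapshot`, the threshold values, and the rule for returning `unknown` are assumptions.

```typescript
type Reason =
  | "workers-unavailable"
  | "webgpu-unavailable"
  | "webgpu-adapter-unavailable"
  | "storage-quota-too-low"
  | "memory-too-low"
  | "cpu-too-low";

interface EligibilityResult {
  status: "eligible" | "ineligible" | "unknown";
  reasons: Reason[];
}

// Hypothetical snapshot, filled in beforehand from navigator.gpu.requestAdapter(),
// navigator.storage.estimate(), navigator.deviceMemory and navigator.hardwareConcurrency.
interface HardwareSnapshot {
  hasWorkers: boolean;
  hasWebGpu: boolean;
  hasAdapter: boolean;
  storageQuotaBytes?: number;
  deviceMemoryGb?: number;
  cpuCores?: number;
}

// Assumed thresholds; the real constants live in hardwareEligibility.ts.
const MIN_QUOTA_BYTES = 2 * 1024 ** 3;
const MIN_DEVICE_MEMORY_GB = 4;
const MIN_CPU_CORES = 4;

function checkEligibility(snapshot: HardwareSnapshot | undefined): EligibilityResult {
  // Assumption: no snapshot (for example during server rendering) means "unknown".
  if (snapshot === undefined) return {status: "unknown", reasons: []};
  const reasons: Reason[] = [];

  // Hard gates.
  if (!snapshot.hasWorkers) reasons.push("workers-unavailable");
  if (!snapshot.hasWebGpu) reasons.push("webgpu-unavailable");
  else if (!snapshot.hasAdapter) reasons.push("webgpu-adapter-unavailable");
  if (snapshot.storageQuotaBytes !== undefined && snapshot.storageQuotaBytes < MIN_QUOTA_BYTES) {
    reasons.push("storage-quota-too-low");
  }

  // Soft gates apply only when the browser reports the hint.
  if (snapshot.deviceMemoryGb !== undefined && snapshot.deviceMemoryGb < MIN_DEVICE_MEMORY_GB) reasons.push("memory-too-low");
  if (snapshot.cpuCores !== undefined && snapshot.cpuCores < MIN_CPU_CORES) reasons.push("cpu-too-low");

  return {status: reasons.length === 0 ? "eligible" : "ineligible", reasons};
}
```
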
…nch tests

Addresses code-review findings on the hardware eligibility module:

- Fix silent fallthrough when navigator.gpu exists but requestAdapter
  is not callable (partial polyfill / future API drift). The branch now
  pushes webgpu-adapter-unavailable explicitly.
- Add JSDoc to all 3 exports (HardwareEligibilityReason,
  HardwareEligibilityResult, checkHardwareEligibility) per RFC 1002.
- Extract the 186-char inline navigator type cast into a named
  NavigatorWithHardwareHints interface. Test file gets a NavigatorStub.
- Add 5 new tests covering: requestAdapter throws, requestAdapter
  missing while gpu present, storage.estimate throws, navigator absent,
  and boundary values (deviceMemory === 4, hardwareConcurrency === 4).
- Add rationale comments to the three threshold constants.

Test count: 8 -> 13. Coverage: 94% statements -> ~99% expected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Declarative registry of intent IDs, slot grammar per intent, and viz
hint per intent. The IntentDefinition shape lets the resolver and
renderer dispatch generically without per-intent special cases.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Layer 1 path for slot extraction. Pure regex/keyword tables for
en/ro/fr that translate canonical user phrasings to the discrete
Timeframe enum and topK integer. Diacritic-folding + case-insensitive
matching. topK clamps to [1, 20] with default 5.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
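
A small sketch of the lexicon approach described in the commit above: diacritic folding, case-insensitive keyword matching, and the topK clamp to [1, 20] with default 5. The pattern table and the `Timeframe` names are illustrative assumptions covering two windows only; the real tables are in intents/slotLexicon.ts.

```typescript
/** Lowercases and strips combining marks so "trecută" matches "trecuta". */
function fold(text: string): string {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
}

type Timeframe = "lastMonth" | "thisMonth"; // two of the 12 windows; names assumed

// Hypothetical keyword table, written against already-folded text.
const TIMEFRAME_PATTERNS: ReadonlyArray<readonly [RegExp, Timeframe]> = [
  [/\b(last month|luna trecuta|mois dernier)\b/, "lastMonth"],
  [/\b(this month|luna aceasta|ce mois)\b/, "thisMonth"],
];

function parseTimeframe(question: string): Timeframe | undefined {
  const folded = fold(question);
  for (const [pattern, timeframe] of TIMEFRAME_PATTERNS) {
    if (pattern.test(folded)) return timeframe;
  }
  return undefined;
}

/** Reads "top N" from the question; clamps to [1, 20] and defaults to 5. */
function parseTopK(question: string): number {
  const match = /\btop\s*(\d+)\b/.exec(fold(question));
  if (match === null) return 5;
  const value = Number.parseInt(match[1] ?? "", 10);
  if (Number.isNaN(value)) return 5;
  return Math.min(20, Math.max(1, value));
}
```
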
Trust boundary between the model layer and the deterministic aggregator
layer. Validates intent against the catalog whitelist, normalizes slots
against the canonical Timeframe enum, clamps topK to [1, 20], and
falls back to question-text lexicon parsing when slots arrive empty.
NEVER spreads or passes through unvalidated slot values.

Three-state slot inspection (valid / invalid / absent) ensures that
a present-but-invalid slot rejects with out-of-scope rather than
silently falling through to the default.

8 unit tests cover happy paths and validation rejections.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
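
The three-state slot inspection from the commit above can be illustrated as follows. The shape of `SlotInspection`, the `Resolution` type, and the three `Timeframe` names are assumptions; the point is only that an invalid slot rejects while an absent slot falls back.

```typescript
type SlotInspection<T> = {state: "valid"; value: T} | {state: "invalid"} | {state: "absent"};

const TIMEFRAMES = ["thisMonth", "lastMonth", "thisYear"] as const; // subset; names assumed
type Timeframe = (typeof TIMEFRAMES)[number];

/** Distinguishes "nothing was supplied" from "something was supplied but is not allowed". */
function inspectTimeframe(raw: unknown): SlotInspection<Timeframe> {
  if (raw === undefined || raw === null || raw === "") return {state: "absent"};
  const match = TIMEFRAMES.find((candidate) => candidate === raw);
  return match === undefined ? {state: "invalid"} : {state: "valid", value: match};
}

type Resolution = {kind: "resolved"; timeframe: Timeframe} | {kind: "out-of-scope"};

function resolveTimeframe(raw: unknown, lexiconFallback: Timeframe | undefined, defaultTimeframe: Timeframe): Resolution {
  const inspected = inspectTimeframe(raw);
  switch (inspected.state) {
    case "valid":
      return {kind: "resolved", timeframe: inspected.value};
    case "invalid":
      // Present but invalid: reject instead of silently using the default.
      return {kind: "out-of-scope"};
    case "absent":
      // Absent: fall back to the question-text lexicon, then to the default.
      return {kind: "resolved", timeframe: lexiconFallback ?? defaultTimeframe};
  }
}
```
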
10 canonical phrasings per intent per locale (300 total). Drives the
Layer 1 cosine-similarity classifier. The build-time embeddings
generator (Phase 7) will encode these into a precomputed matrix
shipped with the worker.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…currency)

Three deterministic fixture generators: empty, ~54 EUR invoices over
18 months, and a multi-currency variant that flips alternating
invoices to RON. Used by every aggregator test in Phase 4 to verify
empty-result branches, currency grouping, and date-window edges.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
resolveTimeframeWindow translates the 12 canonical Timeframe values
into UTC Date ranges. filterByTimeframe + filterNotDeleted +
groupByCurrency are the building blocks every aggregator composes.
All time-dependent functions take 'now' as an explicit param so tests
are deterministic.

groupByCurrency tolerates both shapes of paymentInformation.currency
(plain string from fixtures, Currency object from production types).

8 unit tests cover window math, soft-delete filtering, and per-currency
bucketing across the empty / single-currency / multi-currency fixtures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
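
A sketch of the window math described in the commit above, for two of the 12 timeframes. The half-open `[start, end)` convention, the `DateWindow` and `DatedInvoice` shapes, and the timeframe names are assumptions; passing `now` explicitly is what keeps the functions deterministic under test.

```typescript
type Timeframe = "thisMonth" | "lastMonth"; // two of the 12 windows; names assumed

/** Half-open UTC range: start is inclusive, end is exclusive. */
interface DateWindow {
  start: Date;
  end: Date;
}

function resolveTimeframeWindow(timeframe: Timeframe, now: Date): DateWindow {
  const year = now.getUTCFullYear();
  const month = now.getUTCMonth();
  switch (timeframe) {
    case "thisMonth":
      return {start: new Date(Date.UTC(year, month, 1)), end: new Date(Date.UTC(year, month + 1, 1))};
    case "lastMonth":
      // Date.UTC normalizes month -1 to December of the previous year.
      return {start: new Date(Date.UTC(year, month - 1, 1)), end: new Date(Date.UTC(year, month, 1))};
  }
}

// Hypothetical minimal invoice shape with an ISO-8601 timestamp.
interface DatedInvoice {
  createdAt: string;
}

function filterByTimeframe<T extends DatedInvoice>(invoices: readonly T[], timeframe: Timeframe, now: Date): T[] {
  const {start, end} = resolveTimeframeWindow(timeframe, now);
  return invoices.filter((invoice) => {
    const time = Date.parse(invoice.createdAt);
    return time >= start.getTime() && time < end.getTime();
  });
}
```
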
Pure function (invoices, {timeframe, category?}, now) -> TotalSpendResult.
Filters soft-deleted, applies optional category filter, splits multi-currency
results into per-currency buckets. Returns explicit empty marker when no
invoices match.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
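
For illustration, a trimmed-down version of a totalSpend-style aggregator: soft-delete filtering, an optional category filter, per-currency buckets, and an explicit empty marker. The `InvoiceLike` fields and the `{code}` currency object are hypothetical stand-ins for the production types, and the timeframe filter is left out to keep the sketch short.

```typescript
// Currency may arrive as a plain string (fixtures) or as an object (production types).
type CurrencyLike = string | {code: string};

// Hypothetical minimal invoice shape; field names are assumptions.
interface InvoiceLike {
  totalAmount: number;
  currency: CurrencyLike;
  category?: string;
  isSoftDeleted?: boolean;
}

type TotalSpendResult = {kind: "empty"} | {kind: "totals"; byCurrency: Record<string, number>};

function currencyCode(currency: CurrencyLike): string {
  return typeof currency === "string" ? currency : currency.code;
}

function totalSpend(invoices: readonly InvoiceLike[], category?: string): TotalSpendResult {
  const byCurrency: Record<string, number> = {};
  let matched = 0;
  for (const invoice of invoices) {
    if (invoice.isSoftDeleted === true) continue;
    if (category !== undefined && invoice.category !== category) continue;
    const code = currencyCode(invoice.currency);
    // Amounts in different currencies are never summed together.
    byCurrency[code] = (byCurrency[code] ?? 0) + invoice.totalAmount;
    matched += 1;
  }
  return matched === 0 ? {kind: "empty"} : {kind: "totals", byCurrency};
}
```
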
…kdown

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ant filter

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…currency)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Single entry point: runAggregator(intent, invoices, slots, now) returns
a discriminated StructuredAnswer union. An exhaustiveness check via never
makes any future intent added without an aggregator a compile-time error.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
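
The exhaustiveness technique mentioned in the commit above is the standard `never` pattern. A reduced sketch with two intents and a simplified signature (the real runAggregator takes invoices, slots, and now):

```typescript
type IntentId = "totalSpend" | "invoiceCount"; // two of the 10 intents, for brevity

type StructuredAnswer =
  | {intent: "totalSpend"; total: number}
  | {intent: "invoiceCount"; count: number};

/** Only type-checks when every member of the union was handled before the call. */
function assertNever(value: never): never {
  throw new Error(`Unhandled intent: ${String(value)}`);
}

// Simplified signature for illustration: plain amounts instead of invoices and slots.
function runAggregator(intent: IntentId, amounts: readonly number[]): StructuredAnswer {
  switch (intent) {
    case "totalSpend":
      return {intent, total: amounts.reduce((sum, amount) => sum + amount, 0)};
    case "invoiceCount":
      return {intent, count: amounts.length};
    default:
      // Adding a new IntentId without a case above turns this line into a type error.
      return assertNever(intent);
  }
}
```
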
Adds the localized message catalog: panel labels, state messages,
Layer 2 opt-in CTA copy, action buttons, timeframe labels, example
chip labels, and answer templates with ICU plural rules for all 10
intents in en/ro/fr.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…e+viz

Pure dispatch on intent. Returns localized prose by calling injected
next-intl t() with template keys + params. Empty-result branches
produce friendly 'try alternatives' copy. Translator function is
injected so the module is testable without next-intl.

5 unit tests cover populated/empty branches across totalSpend,
topMerchantsByCount, and the spendComparison no-change vs delta
templates.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Four minimal a11y-aware viz components built on @arolariu/components
Card primitives + plain SVG (donut). No chart library dependency.
Each emits a stable data-testid and uses role="img" with aria-label
on visual elements for screen-reader access.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Discriminated State union (10 statuses) + exhaustive Action union.
Two consecutive slotLlmTimeout actions raise shouldRestartSlotHost
(one-shot flag the hook reads + clears via resetSlotHostFlag).
History capped at 50 entries (oldest evicted via slice).

Layer 2 sub-state is independent of the main status: the assistant
can answer questions while the Layer-2 model is downloading.

12 unit tests cover all transitions + cap + flag + reset semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
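
A reduced sketch of the two reducer rules called out in the commit above: the history cap and the one-shot restart flag. Only `slotLlmTimeout`, `shouldRestartSlotHost`, and `resetSlotHostFlag` are names from the commit message; the other action names and the state shape are assumptions, and the real reducer has 10 statuses that are omitted here.

```typescript
interface HistoryEntry {
  question: string;
  answer: string;
}

interface State {
  history: HistoryEntry[];
  consecutiveSlotTimeouts: number;
  shouldRestartSlotHost: boolean;
}

type Action =
  | {type: "answered"; entry: HistoryEntry} // hypothetical action name
  | {type: "slotLlmTimeout"}
  | {type: "slotLlmSuccess"} // hypothetical action name
  | {type: "resetSlotHostFlag"};

const HISTORY_CAP = 50;
const TIMEOUTS_BEFORE_RESTART = 2;

const initialState: State = {history: [], consecutiveSlotTimeouts: 0, shouldRestartSlotHost: false};

function reducer(state: State, action: Action): State {
  switch (action.type) {
    case "answered":
      // slice(-50) keeps the newest 50 entries and evicts the oldest.
      return {...state, history: [...state.history, action.entry].slice(-HISTORY_CAP)};
    case "slotLlmTimeout": {
      const count = state.consecutiveSlotTimeouts + 1;
      return {
        ...state,
        consecutiveSlotTimeouts: count,
        shouldRestartSlotHost: state.shouldRestartSlotHost || count >= TIMEOUTS_BEFORE_RESTART,
      };
    }
    case "slotLlmSuccess":
      return {...state, consecutiveSlotTimeouts: 0};
    case "resetSlotHostFlag":
      return {...state, consecutiveSlotTimeouts: 0, shouldRestartSlotHost: false};
  }
}
```
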
…matrix

Encodes 300 seed phrases (10 intents x 10 phrasings x 3 locales) with
Xenova/multilingual-e5-small. Sub-ms cosine sim at runtime; only the
user's question encodes per classify call (~50 ms).

The committed seedEmbeddings.json is an empty placeholder so the
worker module compiles. Engineer must run:
  node scripts/generate.embeddings.ts
to download the 118 MB model and write the ~460 KB precomputed matrix
before the assistant returns useful classifications. The script is
idempotent and re-runs cheaply once the model is cached.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Wraps Transformers.js multilingual-e5-small in createWorkerHost. Uses
the precomputed seed-phrase matrix for sub-ms cosine ranking. Returns
top intent + score + top-3 candidates.

API surface (embedding.api.ts): EmbeddingWorkerApi with ensureLoaded()
+ classify(). Implementation is module-level singleton (extractor only
loads once per worker lifetime). Tests use vi.hoisted for the
Transformers.js mock so the asyncFn extractor is captured cleanly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
5s default call timeout; 5min idle timeout for lazy reboot.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ess)

Uses WebLLM's MLCEngine in-process inside our createWorkerHost worker.
JSON-mode + temperature=0 for deterministic output. Defensive: rejects
hallucinated intents not in the candidate list.

API surface (slotExtractor.api.ts): SlotExtractorWorkerApi with
ensureLoaded() + extract() + unload(). Implementation loads the
~1 GB Qwen-1.5B model on first ensureLoaded; chat completions enforce
JSON object response_format. 4 unit tests cover not-loaded reject,
valid JSON happy path, hallucinated intent reject, and invalid JSON
reject.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
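
The defensive checks described in the commit above (invalid JSON reject, hallucinated intent reject) can be sketched without any WebLLM calls. `parseSlotResponse`, `ExtractResult`, and the `{intent, slots}` response shape are assumptions about the contract, not the PR's code.

```typescript
type ExtractResult =
  | {ok: true; intent: string; slots: Record<string, unknown>}
  | {ok: false; reason: "invalid-json" | "unknown-intent"};

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

/** Validates raw model output against the Layer 1 candidate list before anything downstream sees it. */
function parseSlotResponse(raw: string, candidates: readonly string[]): ExtractResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return {ok: false, reason: "invalid-json"};
  }
  if (!isPlainObject(parsed)) return {ok: false, reason: "invalid-json"};

  const intent = parsed["intent"];
  // Reject any intent the embedding layer did not propose.
  if (typeof intent !== "string" || !candidates.includes(intent)) {
    return {ok: false, reason: "unknown-intent"};
  }

  const slots = parsed["slots"];
  return {ok: true, intent, slots: isPlainObject(slots) ? slots : {}};
}
```

The slots returned here are still untrusted; in the PR's design they pass through the intent resolver before reaching an aggregator.
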
arolariu and others added 8 commits May 6, 2026 20:59
30s default call timeout (cold-start model load); 10min idle timeout.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eline

Owns: reducer state machine, two worker hosts (Layer 1 eager, Layer 2
lazy on opt-in), Strict-Mode-safe lifecycle (PR #699 pattern), and
the classify -> resolve -> aggregate -> render pipeline. Auto-restarts
the slot host when consecutive timeouts hit the threshold (the reducer
sets a one-shot flag the hook reads + clears via resetSlotHostFlag).

2 unit tests cover the boot transition (capability-check ->
embedding-loading -> embedding-ready) and the end-to-end
classify->resolve->aggregate->render pipeline against an empty corpus.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Renders prose + dispatched viz primitive based on intent. Lifts the
viz extractors to local helpers so the panel doesn't need them.

3 unit tests cover bar-chart, single-stat, and donut viz dispatches.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Renders all 10+ states with role/aria-live attributes; aria-busy on
input during pending; chip clicks re-submit canonical queries;
Layer 2 opt-in CTA in header with download progress + active badge.
Includes aggregator-error alert path.

Updates the public barrel to re-export AssistantPanel + AssistantPanelProps.

4 unit tests cover the workers-unavailable terminal state, the
embedding-ready chips state, the Layer 2 eligible CTA, and the
embedding-loading progress bar.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replaces the chat-tab placeholder body (Card wrapping a stub
MessageList) with a single <AssistantPanel /> mount. The settings tab
remains untouched. The `invoices` prop is preserved for API
compatibility but the assistant reads directly from useInvoicesStore
so no prop wiring is needed.

188 existing view-invoices tests still pass (no regression).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…r Node 24 ESM

Node 24 native TypeScript loader requires explicit .ts extensions for
relative module specifiers. Patched the generator to import
seedPhrases.{en,ro,fr}.ts. Regenerated the matrix: 300 embeddings,
~2.4 MB JSON.

After this commit the embedding worker can classify questions against
the real seed phrases instead of the empty placeholder.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
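
A back-of-envelope check on the matrix size. multilingual-e5-small produces 384-dimensional embeddings, so the raw Float32 payload comes to about 460 KB, while a JSON serialization lands near 2.4 MB if each value takes roughly 21 characters. The 21-character figure is an assumption chosen for illustration, not a measurement of the committed file.

```typescript
const SEED_PHRASES = 10 * 10 * 3; // intents x phrasings x locales
const EMBEDDING_DIM = 384; // output dimension of multilingual-e5-small

const totalValues = SEED_PHRASES * EMBEDDING_DIM; // 115,200 numbers
const float32Bytes = totalValues * 4; // 460,800 bytes as raw Float32

// Assumption: about 21 characters per serialized float, separator included.
const ASSUMED_CHARS_PER_VALUE = 21;
const jsonBytesEstimate = totalValues * ASSUMED_CHARS_PER_VALUE; // 2,419,200 bytes
```
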
…olds

Reports intra-class vs inter-class cosine similarity over the 300
seed phrases. Recommends canonical/uncertain thresholds. Manual script
- not run in CI.

Initial calibration on the regenerated matrix shows significant
overlap between intents (intra mean 0.90, inter mean 0.86 — 10th-pct
intra 0.85 is BELOW 90th-pct inter 0.90). The current
CONFIDENCE_THRESHOLDS (canonical=0.75, uncertain=0.55) intentionally
err generous so Layer-1 catches more cases; tighten only after a
larger seed-phrase corpus has been added per locale.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
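
The overlap test reported in the commit above (10th-percentile intra-class similarity below 90th-percentile inter-class similarity) could be computed along these lines. The linear-interpolation percentile and the function names are assumptions; the calibration script may use a different percentile definition.

```typescript
/** Percentile with linear interpolation between the two closest ranks. */
function percentile(values: readonly number[], p: number): number {
  if (values.length === 0) throw new Error("percentile of an empty list is undefined");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = (p / 100) * (sorted.length - 1);
  const lower = Math.floor(rank);
  const upper = Math.ceil(rank);
  const lowerValue = sorted[lower] ?? 0;
  const upperValue = sorted[upper] ?? lowerValue;
  return lowerValue + (upperValue - lowerValue) * (rank - lower);
}

/**
 * True when the weakest same-intent similarities (p10) fall below the strongest
 * cross-intent similarities (p90), meaning no single threshold separates the classes cleanly.
 */
function classesOverlap(intraClass: readonly number[], interClass: readonly number[]): boolean {
  return percentile(intraClass, 10) < percentile(interClass, 90);
}
```

With the figures quoted in the commit (intra p10 = 0.85, inter p90 = 0.90) the check is true, which is the overlap the message describes.
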
4 critical scenarios: cold-start happy path, out-of-scope -> chip flow,
multilingual (ro), and Strict-Mode tab leave + return clearing history.
Gated to environments with WebGPU; CI may need to skip when running
on headless workers without GPU acceleration.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions Bot commented May 6, 2026

❌ Code Hygiene Report: Issues Found

Commit: 7b5245e | PR: #712

📑 Table of Contents

Section Status
📊 Code Statistics
🎨 Formatting
🔍 Linting
🧪 Unit Tests

📋 Check Summary

Check Status Duration Summary
📊 Stats 673ms 0 files changed, +0 -0
🎨 Format 1m 34s 78 file(s) need formatting
🔍 Lint 3.5s 3 error(s), 0 warning(s)
🧪 Test 2m 28s Tests failed (see output for details)

📊 Code Statistics

Changes vs Main Branch

Metric Value
📁 Files Changed 0
➕ Lines Added +0
➖ Lines Deleted -0
🔄 Churn 0
📈 Net Change +0
🔄 Changes Since Previous Commit
Metric Value
Files Changed 5
Lines Added +181
Lines Deleted -94

📦 Bundle Size Analysis (vs Main)

`sites/arolariu.ro` - +2.79 MB (101 file(s) changed)
File Main Preview Diff Status
messages/en.json 231 kB 236 kB +4.86 kB 📝
messages/fr.json 259 kB 264 kB +5.39 kB 📝
messages/ro.json 255 kB 261 kB +5.34 kB 📝
package.json 3.67 kB 3.76 kB +95 B 📝
src/app/domains/invoices/view-invoices/_components/views/GenerativeView.tsx 6.52 kB 5.64 kB -876 B 📝
.gitignore 0 B 54 B +54 B 🆕
src/app/domains/invoices/_components/ai/AssistantMessage.test.tsx 0 B 1.35 kB +1.35 kB 🆕
src/app/domains/invoices/_components/ai/AssistantMessage.tsx 0 B 3.83 kB +3.83 kB 🆕
src/app/domains/invoices/_components/ai/AssistantPanel.test.tsx 0 B 2.56 kB +2.56 kB 🆕
src/app/domains/invoices/_components/ai/AssistantPanel.tsx 0 B 8.26 kB +8.26 kB 🆕
src/app/domains/invoices/_components/ai/aggregators/__fixtures__/empty.fixtures.ts 0 B 105 B +105 B 🆕
src/app/domains/invoices/_components/ai/aggregators/__fixtures__/multi-currency.fixtures.ts 0 B 967 B +967 B 🆕
src/app/domains/invoices/_components/ai/aggregators/__fixtures__/single-currency.fixtures.ts 0 B 4.13 kB +4.13 kB 🆕
src/app/domains/invoices/_components/ai/aggregators/averageSpendPerVisit.test.ts 0 B 1.31 kB +1.31 kB 🆕
src/app/domains/invoices/_components/ai/aggregators/averageSpendPerVisit.ts 0 B 1.51 kB +1.51 kB 🆕
src/app/domains/invoices/_components/ai/aggregators/categoryBreakdown.test.ts 0 B 1.33 kB +1.33 kB 🆕
src/app/domains/invoices/_components/ai/aggregators/categoryBreakdown.ts 0 B 1.9 kB +1.9 kB 🆕
src/app/domains/invoices/_components/ai/aggregators/index.ts 0 B 3.93 kB +3.93 kB 🆕
src/app/domains/invoices/_components/ai/aggregators/invoiceCount.test.ts 0 B 1.49 kB +1.49 kB 🆕
src/app/domains/invoices/_components/ai/aggregators/invoiceCount.ts 0 B 1.31 kB +1.31 kB 🆕
... ... ... ... 81 more files

Total: 10 MB → 12.8 MB (+2.79 MB)

`sites/api.arolariu.ro` - no change (0 file(s) changed)

No changes in this folder

Total: 2.14 MB → 2.14 MB (no change)

`sites/docs.arolariu.ro` - no change (0 file(s) changed)

No changes in this folder

Total: 214 kB → 214 kB (no change)

🎨 Formatting

78 file(s) need formatting:

View files requiring formatting
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/AssistantMessage.test.tsx
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/AssistantMessage.tsx
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/AssistantPanel.test.tsx
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/AssistantPanel.tsx
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/__fixtures__/empty.fixtures.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/__fixtures__/multi-currency.fixtures.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/__fixtures__/single-currency.fixtures.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/averageSpendPerVisit.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/averageSpendPerVisit.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/categoryBreakdown.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/categoryBreakdown.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/index.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/invoiceCount.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/invoiceCount.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/shared.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/shared.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/spendComparison.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/spendComparison.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topMerchantsByCount.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topMerchantsByCount.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topMerchantsBySpend.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topMerchantsBySpend.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topProductsByCount.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topProductsByCount.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topProductsBySpend.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topProductsBySpend.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topSpendingByCategory.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topSpendingByCategory.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/totalSpend.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/totalSpend.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/assistantReducer.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/assistantReducer.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/hardwareEligibility.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/hardwareEligibility.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/hosts/embeddingHost.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/hosts/slotExtractorHost.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/index.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/catalog.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/intentResolver.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/intentResolver.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/seedPhrases.en.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/seedPhrases.fr.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/seedPhrases.ro.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/slotLexicon.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/slotLexicon.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/answerRenderer.test.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/answerRenderer.ts
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/BarChartHorizontal.test.tsx
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/BarChartHorizontal.tsx
  • sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/ComparisonPair.test.tsx
  • ...and 28 more files

🔧 How to Fix

npm run format

🔍 Linting

❌ ESLint found 3 error(s) and 0 warning(s)

View raw output

> @arolariu/monorepo@0.0.0 lint
> node scripts/lint.ts all


╔════════════════════════════════════════╗
║    arolariu.ro Code Linter Tool        ║
╚════════════════════════════════════════╝


🔎 Running ESLint for: all
⏱️  Running lint on all targets in parallel...

  🧵 Dispatching parallel workers...
     Main process PID: 2893
     Worker pool: min=1, max=3

[17:36:45.938] 🚀 Worker #1 spawned for task "packages"
[17:36:45.938] 🚀 Worker #2 spawned for task "website"
[17:36:45.938] 🚀 Worker #3 spawned for task "cv"


  ⏳ Progress: [░░░░░░░░░░░░░░░░░░░░] 0/3 workers completed
  ⏳ Progress: [███████░░░░░░░░░░░░░] 1/3 workers completed
  ⏳ Progress: [█████████████░░░░░░░] 2/3 workers completed
  ⏳ Progress: [████████████████████] 3/3 workers completed

[17:36:48.888] ❌ Worker #1 finished "packages" in 2.68s
[17:36:48.888] ❌ Worker #2 finished "website" in 2.60s
[17:36:48.888] ❌ Worker #3 finished "cv" in 2.58s

  📊 Worker Timeline
  ──────────────────────────────────────────────────────────────
  packages   │████████████████████████████████████████│    2.68s
  website    │███████████████████████████████████████░│    2.60s
  cv         │██████████████████████████████████████░░│    2.58s
  ──────────────────────────────────────────────────────────────
              0s                            2.68s

─────────────────────────────────────────────────

🔍 ESLint config: [@arolariu/packages] [Worker #1]
   [init: 2656ms, work: 0ms, total: 2682ms] [0 files] [168.05 MB]
  ✗ Worker error: You are using an outdated version of the 'jiti' library. Please update to the latest version of 'jiti' to ensure compatibility and access to the latest features.
─────────────────────────────────────────────────

─────────────────────────────────────────────────

🔍 ESLint config: [@arolariu/website] [Worker #2]
   [init: 2559ms, work: 0ms, total: 2599ms] [0 files] [168.51 MB]
  ✗ Worker error: You are using an outdated version of the 'jiti' library. Please update to the latest version of 'jiti' to ensure compatibility and access to the latest features.
─────────────────────────────────────────────────

─────────────────────────────────────────────────

🔍 ESLint config: [@arolariu/cv] [Worker #3]
   [init: 2521ms, work: 0ms, total: 2579ms] [0 files] [167.88 MB]
  ✗ Worker error: You are using an outdated version of the 'jiti' library. Please update to the latest version of 'jiti' to ensure compatibility and access to the latest features.
─────────────────────────────────────────────────

  📊 Resource Usage:
     Total files linted: 0
     Peak memory (max worker): 168.51 MB
     Combined memory (all workers): 504.45 MB

📊 Summary: 3 error(s), 0 warning(s)

❌ Linting completed with errors



🔧 How to Fix

npm run lint

🧪 Unit Tests

0 of 1147 tests failed

Metric Count
✅ Passed 1147
❌ Failed 0
⏭️ Skipped 0
📝 Todo 0

🔗 View Workflow Run | Generated at 2026-05-07T17:41:13.056Z

Contributor

Copilot AI left a comment


Pull request overview

Adds a fully local “Invoice AI assistant” to the invoices UI: Layer 1 multilingual embedding-based intent classification (Transformers.js) plus an optional Layer 2 slot extractor (WebLLM/WebGPU), feeding deterministic TypeScript aggregators over IndexedDB-backed invoice state and rendering answers with simple visualization primitives.

Changes:

  • Replaces the existing GenerativeView chat stub with the new AssistantPanel UI and adds a Playwright E2E spec for the assistant tab.
  • Introduces Layer 1/Layer 2 worker implementations (embedding classifier + slot extractor), a reducer-driven state machine, and the useInvoiceAssistant hook wiring the pipeline.
  • Adds deterministic aggregators + renderer + i18n strings (en/ro/fr) for 10 analytics intents.

Reviewed changes

Copilot reviewed 74 out of 76 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| sites/arolariu.ro/src/app/domains/invoices/view-invoices/_components/views/GenerativeView.tsx | Wires the chat tab to the new AssistantPanel (replacing the stub chat UI). |
| sites/arolariu.ro/src/app/domains/invoices/view-invoices/_components/views/generative-view.spec.ts | Adds Playwright E2E coverage for assistant tab scenarios (incl. RO locale + history reset). |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/workers/slotExtractor.implementation.ts | Implements Layer 2 slot extraction via WebLLM (Qwen) in a worker. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/workers/slotExtractor.implementation.test.ts | Unit tests for Layer 2 slot extraction behavior and validation. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/workers/slotExtractor.api.ts | Defines the typed RPC contract for the slot extractor worker. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/workers/slot-extractor.worker.ts | Worker entry that exposes the slot extractor implementation. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/workers/embedding.worker.ts | Worker entry that exposes the embedding implementation. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/workers/embedding.implementation.ts | Implements Layer 1 embedding classifier (seed matrix + cosine similarity). |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/workers/embedding.implementation.test.ts | Unit tests for the embedding classifier behavior. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/workers/embedding.api.ts | Defines the typed RPC contract for the embedding worker. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/useInvoiceAssistant.tsx | Hook that owns hosts + reducer + classify/resolve/aggregate/render pipeline. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/useInvoiceAssistant.test.tsx | Hook-level unit tests with host/hardware/store mocks. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/types.ts | Shared assistant types (locales, intents, timeframes, confidence thresholds). |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/SingleStat.tsx | Single-stat visualization primitive for answers. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/SingleStat.test.tsx | Tests for SingleStat. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/Donut.tsx | Donut visualization primitive for category breakdown. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/Donut.test.tsx | Tests for Donut. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/ComparisonPair.tsx | Comparison visualization primitive (two values + delta). |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/ComparisonPair.test.tsx | Tests for ComparisonPair. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/BarChartHorizontal.tsx | Horizontal bar chart visualization primitive. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/viz/BarChartHorizontal.test.tsx | Tests for BarChartHorizontal. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/answerRenderer.ts | Maps structured aggregator output into prose + viz hint + payload. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/renderer/answerRenderer.test.ts | Unit tests for renderer branches. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/slotLexicon.ts | Locale-aware deterministic slot parsing (timeframe/topK). |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/slotLexicon.test.ts | Unit tests for slot lexicon parsing. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/seedPhrases.en.ts | EN seed phrases for embedding classifier. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/seedPhrases.ro.ts | RO seed phrases for embedding classifier. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/seedPhrases.fr.ts | FR seed phrases for embedding classifier. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/intentResolver.ts | Trust-boundary validation/coercion for intent + slots. |
| sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/intentResolver.test.ts | Unit tests for resolver normalization and rejection cases. |
sites/arolariu.ro/src/app/domains/invoices/_components/ai/intents/catalog.ts Intent registry (slots + viz hint) for the 10-intent catalog.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/index.ts Public barrel export for assistant module consumers.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/hosts/embeddingHost.ts Intended to provide Layer 1 worker host factory.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/hosts/slotExtractorHost.ts Intended to provide Layer 2 worker host factory.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/hardwareEligibility.ts WebGPU/storage/memory/CPU gating for offering Layer 2 opt-in.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/hardwareEligibility.test.ts Unit tests for the hardware eligibility gate.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/assistantReducer.ts Reducer/state machine for assistant UX + history + Layer 2 state.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/assistantReducer.test.ts Unit tests for reducer transitions and history capping.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/AssistantPanel.test.tsx Component-level tests for AssistantPanel UI states.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/AssistantMessage.tsx Renders a single assistant history entry + visualization.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/AssistantMessage.test.tsx Tests for AssistantMessage + viz dispatch.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/shared.ts Shared deterministic helpers (time windows, currency grouping, soft-delete filtering).
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/shared.test.ts Tests for shared aggregator helpers.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/index.ts Aggregator registry to dispatch by intent.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/totalSpend.ts Aggregator for total spend.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/totalSpend.test.ts Tests for totalSpend.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/invoiceCount.ts Aggregator for invoice/receipt count.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/invoiceCount.test.ts Tests for invoiceCount.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topSpendingByCategory.ts Aggregator for top spend categories.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topSpendingByCategory.test.ts Tests for topSpendingByCategory.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topMerchantsByCount.ts Aggregator for top merchants by visit count.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topMerchantsByCount.test.ts Tests for topMerchantsByCount.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topMerchantsBySpend.ts Aggregator for top merchants by spend.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topMerchantsBySpend.test.ts Tests for topMerchantsBySpend.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topProductsByCount.ts Aggregator for top products by quantity.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topProductsByCount.test.ts Tests for topProductsByCount.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topProductsBySpend.ts Aggregator for top products by spend.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/topProductsBySpend.test.ts Tests for topProductsBySpend.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/spendComparison.ts Aggregator for timeframe spend comparisons.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/spendComparison.test.ts Tests for spendComparison.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/averageSpendPerVisit.ts Aggregator for average basket size / spend-per-visit.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/averageSpendPerVisit.test.ts Tests for averageSpendPerVisit.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/categoryBreakdown.ts Aggregator for category breakdown (for donut viz).
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/categoryBreakdown.test.ts Tests for categoryBreakdown.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/fixtures/empty.fixtures.ts Empty-corpus fixture for aggregator tests.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/fixtures/single-currency.fixtures.ts Deterministic single-currency invoice corpus fixture.
sites/arolariu.ro/src/app/domains/invoices/_components/ai/aggregators/fixtures/multi-currency.fixtures.ts Deterministic multi-currency corpus fixture (currency split verification).
sites/arolariu.ro/package.json Adds WebLLM + Transformers.js dependencies.
sites/arolariu.ro/messages/en.json Adds InvoiceAssistant translation namespace (EN).
sites/arolariu.ro/messages/ro.json Adds InvoiceAssistant translation namespace (RO).
sites/arolariu.ro/messages/fr.json Adds InvoiceAssistant translation namespace (FR).
scripts/generate.embeddings.ts Build-time embedding matrix generator for seed phrases.
scripts/calibrate-assistant-embeddings.ts Manual calibration tool for confidence thresholds.

@@ -0,0 +1 @@
System.Management.Automation.Internal.Host.InternalHost No newline at end of file
@@ -0,0 +1 @@
System.Management.Automation.Internal.Host.InternalHost No newline at end of file
Comment on lines +26 to +29
async function main(): Promise<void> {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const extractor: any = await pipeline("feature-extraction", "Xenova/multilingual-e5-small");
const allLocales: Array<[Locale, typeof SEED_PHRASES_EN]> = [
Comment on lines +27 to +32
// Reset the module-level extractor for this test by re-importing isn't trivial;
// instead this test relies on a fresh module state OR previous tests not having
// succeeded. Since ensureLoaded is idempotent and module state persists, we
// verify that a fresh impl that hasn't loaded yet rejects.
// To guarantee a clean state, this test runs first AND nothing else has loaded.
await expect(impl.classify({question: "x", locale: "en"})).rejects.toThrow("not loaded");
Comment on lines +76 to +80
useEffect(() => {
if (state.shouldRestartSlotHost && slotHost) {
void (slotHost as unknown as {restart?: () => Promise<void>}).restart?.();
dispatch({type: "resetSlotHostFlag"});
}
Comment on lines +136 to +146
const enableLayer2 = useCallback(async (): Promise<void> => {
if (slotHost) return;
dispatch({type: "layer2OptInClicked"});
const newHost = createSlotExtractorHost();
setSlotHost(newHost);
try {
await newHost.api.ensureLoaded();
dispatch({type: "layer2Loaded"});
} catch (err) {
dispatch({type: "layer2Failed", error: String(err)});
}
Comment on lines +59 to +62
function extractLabel(payload: unknown): string {
const p = payload as {timeframe?: string};
return p.timeframe ?? "";
}
Comment on lines +81 to +85
<ComparisonPair
labelA={first.a.timeframe}
valueA={`${first.a.totalSpend.toFixed(2)} ${first.currency}`}
labelB={first.b.timeframe}
valueB={`${first.b.totalSpend.toFixed(2)} ${first.currency}`}
Comment on lines +19 to +20
<svg viewBox="0 0 120 120" width="120" height="120" role="img" aria-label="Spending breakdown by category">
<circle cx={cx} cy={cy} r={radius} fill="transparent" stroke="#e5e7eb" strokeWidth={stroke} />
Comment on lines +64 to +68
function extractValue(payload: unknown): string {
const p = payload as {buckets?: ReadonlyArray<Record<string, unknown>>; count?: number};
if (typeof p.count === "number") return String(p.count);
const first = p.buckets?.[0];
if (first) return `${first["totalSpend"] ?? first["averageSpend"] ?? ""} ${first["currency"] ?? ""}`;
…y-after-fail

THREE issues fixed in one commit:

1. CRITICAL: Both host files (embeddingHost.ts, slotExtractorHost.ts)
   were committed with corrupt content. Their previous body was a
   PowerShell stringification of $host (the read-only automatic
   variable) — a generator-script footgun where my variable assignment
   was silently ignored and the InternalHost object got serialized.
   Both files now contain the intended createWorkerHost factory
   exports. The Next.js dev server immediately surfaced this with
   "Export createSlotExtractorHost doesn't exist in target module".

2. HIGH: useInvoiceAssistant never disposed slotHost on unmount, so
   every navigation away/back leaked a Worker thread holding the
   ~1 GB Qwen-1.5B engine in memory. Added a dedicated cleanup
   useEffect that calls slotHost.dispose() when the slotHost reference
   changes or the component unmounts.

3. HIGH: enableLayer2 set slotHost before awaiting ensureLoaded. If
   the model load failed, the broken host stayed in state and the
   `if (slotHost) return;` guard permanently blocked retries — the
   user would have to reload the page to try again. The catch block
   now disposes the dead host and clears slotHost so retry works.
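
A minimal, framework-free sketch of fixes 2 and 3, assuming a host that exposes `api.ensureLoaded()` and `dispose()`. The `SlotHost` interface and the `createLayer2Controller` helper are hypothetical stand-ins for illustration; the actual hook keeps `slotHost` in React state and disposes it from a cleanup `useEffect`.

```typescript
// Hypothetical host shape; the real createWorkerHost<TApi> return type is not shown in this thread.
interface SlotHost {
  api: {ensureLoaded(): Promise<void>};
  dispose(): void;
}

type Layer2Event = "layer2OptInClicked" | "layer2Loaded" | "layer2Failed";

// Hypothetical helper: owns at most one slot host and clears it when loading fails.
function createLayer2Controller(createHost: () => SlotHost, onEvent: (event: Layer2Event) => void) {
  let slotHost: SlotHost | null = null;

  return {
    async enableLayer2(): Promise<void> {
      if (slotHost) return; // a live (or still loading) host already exists
      onEvent("layer2OptInClicked");
      const newHost = createHost();
      slotHost = newHost;
      try {
        await newHost.api.ensureLoaded();
        onEvent("layer2Loaded");
      } catch {
        // Dispose the dead host and clear the reference so the guard above allows a retry.
        newHost.dispose();
        if (slotHost === newHost) slotHost = null;
        onEvent("layer2Failed");
      }
    },
    // Mirrors the unmount cleanup: release the worker and drop the reference.
    dispose(): void {
      slotHost?.dispose();
      slotHost = null;
    },
    hasHost(): boolean {
      return slotHost !== null;
    },
  };
}
```

Without the `catch` cleanup, `hasHost()` would stay true after a failed load and every later `enableLayer2()` call would return early.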

Both reviewer findings are cited from the PR #712 review (sonnet-4.6).

Also bumps package-lock.json with @next/swc-darwin-{arm64,x64}
optional binaries (npm install side-effect during the build:components
run earlier).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Owner Author

@arolariu arolariu left a comment

AI Assistant Panel — Code Review

Reviewed at HEAD a8bd811f


Previously Fixed (Turn 0)

Three issues (one CRITICAL, two HIGH) were already corrected in commit a8bd811f:

  1. Corrupt file content: embeddingHost.ts and slotExtractorHost.ts contained the literal string System.Management.Automation.Internal.Host.InternalHost because the generator script used $host, a PowerShell read-only automatic variable. Fixed.
  2. Missing slotExtractorHost cleanup: the host's dispose() was not called on unmount, leaking the worker thread. Fixed.
  3. enableLayer2 catch branch missing dispose() + null-out: a failed load left the host alive and the null-guard permanently blocked retry. Fixed.

New Finding

AssistantPanel.tsx — Submit button not guarded against slot-extracting state

Severity: Medium
File: sites/arolariu.ro/src/app/domains/invoices/_components/ai/AssistantPanel.tsx

The submit button's disabled condition excludes the slot-extracting status:

// The input IS disabled during both states:
disabled={state.status === "classifying" || state.status === "slot-extracting"}

// But the button only checks "classifying":
<Button type="submit" disabled={!draft.trim() || state.status === "classifying"}>

When the score falls in the uncertain band and a slot-LLM host is available, the pipeline dispatches slotExtracting and then awaits slotHost.api.extract(). During that window:

  • The input is disabled (user cannot type), but draft in React state has not been cleared yet — setDraft("") is called only after submitQuestion resolves.
  • The submit button is not disabled, so clicking it fires a second onSubmit.
  • A second submitQuestion(draft) executes, dispatching another questionSubmitted and racing to completion alongside the first call.

Concretely: both pipelines can reach dispatch({type: "answerReady", ...}), each calling appendHistory once, producing a duplicate history entry for the same question. With unlucky timing (first pipeline times out while the second classifies) the state machine can also transition from out-of-scope back to classifying, which is not a valid forward edge.

Fix: Add state.status === "slot-extracting" to the button's disabled condition.
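
One way to keep the input and the button from drifting apart again is to derive both `disabled` values from a single predicate. The sketch below is illustrative only: `isPipelineBusy` and `isSubmitDisabled` are hypothetical helper names, and the `"idle"` member of the status union is an assumption (only `classifying`, `slot-extracting`, and `out-of-scope` are named in this review).

```typescript
// Status names from the review, plus an assumed "idle"; the real union in types.ts may differ.
type AssistantStatus = "idle" | "classifying" | "slot-extracting" | "out-of-scope";

// Hypothetical helper: true while a question is still moving through the pipeline.
function isPipelineBusy(status: AssistantStatus): boolean {
  return status === "classifying" || status === "slot-extracting";
}

// Hypothetical helper: the button is disabled for an empty draft or a busy pipeline.
function isSubmitDisabled(draft: string, status: AssistantStatus): boolean {
  return draft.trim().length === 0 || isPipelineBusy(status);
}
```

With this shape the input would use `disabled={isPipelineBusy(state.status)}` and the button `disabled={isSubmitDisabled(draft, state.status)}`, so a new busy status only has to be added in one place.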


Assessment

Outside the one finding above, the codebase is in good shape. The data pipeline (aggregators → resolver → renderer → viz) is defensively written throughout: groupByCurrency handles both the string and Currency-object shapes correctly, all aggregators guard the empty-result path before accessing buckets[0]!, the reducer is a clean discriminated-union state machine with no unexpected transitions for its happy path, and the hardware eligibility probe correctly distinguishes hard gates (WebGPU, Workers) from soft signals (RAM, CPU).

Owner Author

@arolariu arolariu left a comment

Bug: submit allowed during slot extraction

The button is not disabled when state.status === 'slot-extracting'. During slot extraction, draft is non-empty (cleared only after submitQuestion resolves) and the classifying guard is false, so clicking the button fires a second submitQuestion call in parallel with the first.

Both pipelines race to call appendHistory, producing a duplicate history entry. With worse timing (first times out, second re-classifies) the state machine takes an invalid out-of-scope -> classifying edge.

Fix: disabled={!draft.trim() || state.status === 'classifying' || state.status === 'slot-extracting'}

disabled={state.status === "classifying" || state.status === "slot-extracting"}
aria-busy={state.status === "classifying" || state.status === "slot-extracting"}
/>
<Button type="submit" disabled={!draft.trim() || state.status === "classifying"}>

The button's disabled condition only checked `classifying`, leaving a
window during slot extraction where the user could re-click submit.
Because `draft` is only cleared after `submitQuestion` resolves and
the input's `disabled` doesn't block the button, a second click fires
a parallel pipeline. Both pipelines race to appendHistory (duplicate
entry) and one may take an invalid out-of-scope -> classifying edge
on timeout.

Addresses MEDIUM finding from PR #712 review.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 74 out of 76 changed files in this pull request and generated 5 comments.

Comment on lines +76 to +80
useEffect(() => {
if (state.shouldRestartSlotHost && slotHost) {
void (slotHost as unknown as {restart?: () => Promise<void>}).restart?.();
dispatch({type: "resetSlotHostFlag"});
}
Comment on lines +26 to +30
async function main(): Promise<void> {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const extractor: any = await pipeline("feature-extraction", "Xenova/multilingual-e5-small");
const allLocales: Array<[Locale, typeof SEED_PHRASES_EN]> = [
["en", SEED_PHRASES_EN],
Comment on lines +31 to +36
export type Layer2State =
| Readonly<{status: "ineligible"; reasons: ReadonlyArray<string>}>
| Readonly<{status: "eligible"}>
| Readonly<{status: "downloading"; progress: number}>
| Readonly<{status: "ready"}>
| Readonly<{status: "failed"; error: string}>;
Comment on lines +24 to +33
describe("createEmbeddingImpl", () => {
it("requires ensureLoaded before classify", async () => {
const impl = createEmbeddingImpl();
// Reset the module-level extractor for this test by re-importing isn't trivial;
// instead this test relies on a fresh module state OR previous tests not having
// succeeded. Since ensureLoaded is idempotent and module state persists, we
// verify that a fresh impl that hasn't loaded yet rejects.
// To guarantee a clean state, this test runs first AND nothing else has loaded.
await expect(impl.classify({question: "x", locale: "en"})).rejects.toThrow("not loaded");
});
Comment on lines +20 to +23
if (slots.category) {
const cat = slots.category;
filtered = filtered.filter((inv) => String(inv.category) === cat || (inv.category as unknown as string) === cat);
}
arolariu and others added 4 commits May 7, 2026 19:41
…+ retry button

User reported "Cannot convert undefined or null to object" at module-eval
of @xenova/transformers when opening the AI tab on localhost:3000, plus
the embedding-failed "Try again" button doing nothing.

## Root cause #1: Turbopack worker bundle doesn't honor browser:false

@xenova/transformers v2.17.2 statically imports `fs`, `path`, `url` at
the top of env.js to detect Node, then calls `Object.keys(fs)` via
`isEmpty`. The package's `package.json` has `browser: { fs: false, ... }`
that is supposed to substitute empty stubs in browser bundles, but
Turbopack's worker bundler resolves them to `undefined` instead and
`Object.keys(undefined)` throws at module evaluation.
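
The failure mode can be reproduced in isolation. The sketch below assumes `isEmpty` is a plain `Object.keys(...).length === 0` check, which may not match the library's actual helper; `describeEnvProbe` and the two `fs` stand-ins are hypothetical names used only for this illustration.

```typescript
// Assumed shape of the helper; the actual isEmpty in @xenova/transformers may differ.
function isEmpty(obj: object): boolean {
  return Object.keys(obj).length === 0;
}

// What a browser stub for `fs` is expected to look like: an empty object.
const stubbedFs: object = {};
// What the worker bundle reportedly resolved `fs` to instead.
const unresolvedFs = undefined as unknown as object;

// Hypothetical probe: reports how the environment check behaves for a given `fs` value.
function describeEnvProbe(fs: object): string {
  try {
    return isEmpty(fs) ? "browser" : "node";
  } catch (err) {
    // Object.keys(undefined) throws a TypeError before any comparison happens.
    return err instanceof TypeError ? "TypeError" : "unknown error";
  }
}
```

In the real package the check runs at the top level of the module rather than inside a `try`, which is why the error surfaces during module evaluation.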

## Root cause #2: retryEmbeddingLoad never existed

The "Try again" button was wired to `resetConversation`, but the reducer
explicitly preserves `embedding-failed` status under that action (so it
doesn't lie about the model being ready). The button fired but nothing
visibly transitioned. A real retry must dispose the failed worker host
and create a fresh one.

## Layered fix

1. **next.config.ts**: alias `fs`, `path`, `url`, `sharp`, `onnxruntime-node`
   to a new browser stub `@/lib/empty-module` via `turbopack.resolveAlias`.
   Mirrors the package's `browser: false` mappings explicitly.

2. **lib/empty-module.ts**: tiny no-op stub exporting an empty default
   plus the surface our deps actually touch (`promises`, `sep`, `join`,
   etc.) so the static-import shape is preserved.

3. **embedding.implementation.ts**: switched the static
   `import {pipeline} from "@xenova/transformers"` to a dynamic
   `await import(...)` inside `ensureLoaded()`. Defense in depth:
   even if a future bundler change reintroduces the env-detection
   crash, it now happens at runtime where we can catch and surface
   it as `embedding-failed` state instead of a hard module-eval crash.

4. **Retry button wiring** — new `retryEmbeddingLoad` callback on the
   hook that:
   - Dispatches `retryEmbeddingLoad` action (resets status to
     `capability-check` so the loading UI redraws)
   - Calls `embedHost.dispose()` then `setEmbedHost(createEmbeddingHost())`
   - The lifecycle useEffect picks up the new host and runs the load cycle

   Reducer adds `retryEmbeddingLoad` to the Action union; AssistantPanel
   passes the new callback to the alert button (with data-testid for E2E).
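
The difference between the two actions can be sketched as a tiny reducer. Action and status names follow this commit message; the `"ready"` status, the string-array history, and the "only after a failure" guard are assumptions made to keep the example small, not the shape of the real `assistantReducer.ts`.

```typescript
// Simplified state: the real reducer tracks more statuses and richer history entries.
type Status = "capability-check" | "ready" | "embedding-failed";
type State = Readonly<{status: Status; history: ReadonlyArray<string>}>;
type Action = {type: "resetConversation"} | {type: "retryEmbeddingLoad"};

function reducer(state: State, action: Action): State {
  switch (action.type) {
    case "resetConversation":
      // Clears history but keeps the status, so a failed model stays reported as failed.
      return {...state, history: []};
    case "retryEmbeddingLoad":
      // Assumed guard: only a failed load can be retried; it returns to the loading phase.
      return state.status === "embedding-failed" ? {...state, status: "capability-check"} : state;
  }
}
```

This is why wiring the button to `resetConversation` appeared to do nothing: the status, which drives the alert, never changed.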

All 21 affected tests still pass (reducer 12, hook 2, panel 4, embedding
impl 3).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…uggingface/transformers v4

Per maintainer feedback: drops the next.config Turbopack alias hack +
the empty-module stub. The official @huggingface/transformers v4.2.0
package (the renamed successor of @xenova/transformers v2) doesn't
import Node builtins (fs/path/url) at module evaluation, so Turbopack's
worker bundler doesn't crash on it.

Changes:
- package.json: replace @xenova/transformers@2.17.2 with
  @huggingface/transformers@4.2.0 (workspace = sites/arolariu.ro)
- next.config.ts: remove the turbopack.resolveAlias block for
  fs/path/url/sharp/onnxruntime-node (no longer needed)
- src/lib/empty-module.ts: deleted (no longer referenced)
- embedding.implementation.ts: dynamic import target updated; the
  pipeline() signature and the {data: Float32Array} output shape are
  unchanged across the rename so no logic adjustments needed
- scripts/generate.embeddings.ts: import path updated; matrix
  regenerated (300 embeddings, same Xenova/multilingual-e5-small model
  hosted on HuggingFace Hub — the model name is unchanged across the
  package rename)
- embedding.implementation.test.ts: vi.mock target string updated

Kept the dynamic import inside ensureLoaded() as defense in depth so
any future package-init failure surfaces as embedding-failed state
rather than a hard module-eval crash.
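
A sketch of that defense-in-depth pattern, with the loader injected so the example runs without the package installed. `createLazyModule` is a hypothetical helper; in the worker, `load` would stand in for `() => import("@huggingface/transformers")`, and the `"failed"` result would map to the `embedding-failed` state.

```typescript
type LoadState = "idle" | "ready" | "failed";

// Hypothetical helper: defers the import until first use and reports failures as state.
function createLazyModule<T>(load: () => Promise<T>) {
  let mod: T | null = null;
  let state: LoadState = "idle";

  return {
    // Idempotent: a successful load is cached; a failed load can be attempted again.
    async ensureLoaded(): Promise<LoadState> {
      if (mod !== null) return "ready";
      try {
        mod = await load();
        state = "ready";
      } catch {
        // Package-init errors land here instead of crashing module evaluation.
        state = "failed";
      }
      return state;
    },
    getState: (): LoadState => state,
  };
}
```

The sketch does not deduplicate concurrent `ensureLoaded()` calls; a production version would likely cache the in-flight promise as well.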

All 145 invoice-ai unit tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The /.clerk/ directory is created by @clerk/nextjs during local
development and can contain secrets (publishable + secret keys).
Auto-generated by clerk during npm install.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…igible CTA

User reported three real UX issues:

1. "I always get 'You have no receipts in last quarter. Try all time.'"
   The store is empty, so every classified intent legitimately returns
   the empty-result template. The pipeline is correct; the UX is bad.
   FIX: render an explicit empty-corpus alert when invoices.length === 0
   so the user understands they need to upload receipts before the
   assistant can compute anything. Adds InvoiceAssistant.emptyCorpus
   i18n keys in en/ro/fr.

2. "I don't have any option to download a bigger model" — the user is
   on hardware where checkHardwareEligibility returns ineligible (no
   WebGPU adapter, etc.). The Layer 2 CTA was hidden in favor of a
   tiny "i" badge in the corner, which the user couldn't find.
   FIX: render the Layer 2 button in disabled state with the unavailable
   tooltip + an inline ⓘ marker. Same affordance, just discoverable.

3. "Switching between Chat and Settings tabs resets the worker and I
   have to see the loading model dialog once again."
   Each tab switch unmounted AssistantPanel, which unmounted the hook,
   which disposed the worker hosts; the remount then triggered a fresh
   ~118 MB model load. Brutal UX.
   FIX: lift the worker hosts to MODULE-LEVEL SINGLETONS via lazy
   getters (getEmbedHost, getSlotHost). They survive React mount/unmount
   cycles and only get torn down on full page navigation. The reducer
   state remains per-hook (so conversation history clears on remount,
   matching the H1 architectural lock from the spec) — only the
   expensive model load is preserved.

   Also: when a fresh hook instance mounts and the slot singleton is
   already alive (user enabled Layer 2 earlier this session), the hook
   immediately dispatches layer2Loaded so the UI reflects the active
   state without re-running ensureLoaded.
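
The lazy-getter singleton from item 3 can be sketched as follows. `createSingleton`, `peek`, and `reset` are hypothetical names, and `WorkerHost` is a stand-in for whatever `createWorkerHost<TApi>` returns; only `getEmbedHost` and `getSlotHost` are named in this commit.

```typescript
// Hypothetical host shape; the real hosts come from createWorkerHost<TApi>.
interface WorkerHost {
  dispose(): void;
}

// Hypothetical helper: wraps a factory in a lazily initialised, module-level slot.
function createSingleton<T extends WorkerHost>(factory: () => T) {
  let instance: T | null = null;
  return {
    get(): T {
      if (instance === null) instance = factory(); // first caller pays the creation cost
      return instance;
    },
    // Lets a freshly mounted hook detect an already-live host without creating one.
    peek: (): T | null => instance,
    // Explicit teardown, e.g. before a retry; never called on ordinary unmount.
    reset(): void {
      instance?.dispose();
      instance = null;
    },
  };
}

// Usage with a stub factory; a real getter would wrap createEmbeddingHost().
let hostsCreated = 0;
const embedHost = createSingleton(() => {
  hostsCreated += 1;
  return {dispose: () => undefined};
});
const getEmbedHost = embedHost.get;
```

Because the slot lives at module scope rather than in React state, repeated mounts call `getEmbedHost()` and receive the same instance, while `peek()` supports the "already alive" check described for the slot singleton.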

All 145 invoice-ai unit tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2 participants