feat(SUD-1512): add ocho.i18n-parity plugin — full multi-locale scan + 5 worker tools by achtung-ocho · Pull Request #1915 · paperclipai/paperclip

achtung-ocho · 2026-03-27T14:37:41Z

Summary

Implements the ocho.i18n-parity Paperclip plugin (Phase 1 + Phase 2 of SUD-1510 plan).

manifest.ts: plugin ocho.i18n-parity with 5 tools, 3 UI slots, config schema
worker.ts: cheerio-based HTML scanner across all non-EN locales × EN_BASELINE_ROUTES
- Per-surface extraction (meta, nav, hero, main, cta, footer, embeds)
- EN likelihood scoring via stopword rate + script detection (CJK/Cyrillic/Devanagari)
- Missing locale files scored as 0 (still_english_flag: true, missing: true)
- v1 report schema: schema_version, generated_at, config, localization.{summary, pages}, analytics: null, search_console: null
- Scan history stored in-memory keyed by generated_at ISO string
5 tool handlers: run-scan, get-report, get-summary, get-page-detail, create-tickets
ui/index.tsx: dashboard page with filterable table, sidebar link, dashboard widget
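
The EN likelihood scoring listed above (stopword rate plus script detection) might look roughly like the sketch below. The function name `englishLikelihood`, the stopword list, the Unicode ranges, and the 0.3 script-ratio cutoff are assumptions for illustration; the plugin's actual code in worker.ts is not shown on this page.

```typescript
// Hypothetical sketch: none of these names come from the plugin source.
const EN_STOPWORDS = new Set([
  "the", "and", "of", "to", "in", "is", "for", "with",
  "that", "this", "on", "are", "you", "your",
]);

// Kana, CJK ideographs, Hangul, Cyrillic, and Devanagari blocks.
const NON_LATIN_SCRIPT =
  /[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af\u0400-\u04ff\u0900-\u097f]/g;

// Assumed cutoff: above this share of non-Latin characters,
// the text is treated as not English without consulting stopwords.
const SCRIPT_RATIO_CUTOFF = 0.3;

/** Returns a value in [0, 1]; higher means the text looks more like English. */
function englishLikelihood(text: string): number {
  const trimmed = text.trim();
  if (trimmed.length === 0) return 0;

  // Script detection: share of characters from non-Latin scripts.
  const nonSpaceLength = trimmed.replace(/\s+/g, "").length;
  const nonLatinCount = (trimmed.match(NON_LATIN_SCRIPT) ?? []).length;
  if (nonLatinCount / nonSpaceLength > SCRIPT_RATIO_CUTOFF) return 0;

  // Stopword rate: share of ASCII word tokens that are common English words.
  const words = trimmed.toLowerCase().match(/[a-z']+/g) ?? [];
  if (words.length === 0) return 0;
  const hits = words.filter((w) => EN_STOPWORDS.has(w)).length;
  return hits / words.length;
}
```

A page surface whose likelihood stays high in a non-EN locale would be a candidate for the still_english_flag; how the plugin maps this value onto its localization score is not stated here.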

Validation

tsc --noEmit: clean ✅
pnpm build: passes (dist/manifest.js, dist/worker.js, dist/ui/index.js) ✅
Smoke test: v1 schema shape confirmed, 16 non-EN locales × baseline routes scanned ✅

Issue

Closes SUD-1512 / SUD-1511 (scaffold + scanner core + full scan + tools)

🤖 Generated with Claude Code

…+ 5 worker tools

manifest.ts: plugin definition with 5 tools, 3 UI slots, config schema
worker.ts: cheerio-based scanner across all non-EN locales × EN_BASELINE_ROUTES
- per-surface extraction (meta, nav, hero, main, cta, footer, embeds)
- EN likelihood scoring via stopword rate + script detection (CJK/Cyrillic/Devanagari)
- missing locale files scored as 0 with still_english_flag=true
- v1 report schema with summary keyed by locale (total_pages, above_threshold, avg_score, worst_pages)
- scan history stored in memory keyed by generated_at timestamp
5 tool handlers: run-scan, get-report, get-summary, get-page-detail, create-tickets
ui/index.tsx: dashboard page with filterable table, sidebar link, widget

Co-Authored-By: Paperclip <noreply@paperclip.ing>

greptile-apps · 2026-03-27T14:41:24Z

Greptile Summary

This PR introduces the ocho.i18n-parity Paperclip plugin: a cheerio-based HTML scanner that scores translation parity across all non-EN locales and 30+ baseline routes, exposes 5 agent tools, and adds a React dashboard (sidebar, full-page report, widget). The scanner logic, surface extractors, and tool handler plumbing are well-structured.

However, there are three P1 defects that need to be resolved before merge:

Broken UI data contract (P1): The worker's ctx.data.register handler returns a raw V1Report object (with generated_at, localization.pages, localization.summary) after a scan, but the UI reads data.scannedAt, data.pages, and data.summary — fields that don't exist on V1Report. The result is that all three UI components (I18nParitySidebar, I18nParityPage, I18nParityWidget) will always render their "No scan data" empty state even after a successful scan completes.
pageLimit silently ignored (P1): The run-scan manifest schema advertises a pageLimit parameter, but neither the tool handler nor runScan() read or apply it. Passing pageLimit has no effect.
maxTickets missing from manifest (P1): The create-tickets tool handler reads and applies input.maxTickets, but it is absent from the manifest's parametersSchema, so AI consumers cannot discover or use this parameter.

Additionally, per CONTRIBUTING.md, this PR is a larger/impactful change and should include a thinking path, details about why the change matters, how to verify it works, any risks, and before/after screenshots of the new UI components.

Confidence Score: 4/5

Not safe to merge as-is — the UI will never display scan results due to a data-shape mismatch between the worker and UI.

Three P1 defects are present: (1) the data contract between worker and UI is broken so the dashboard is non-functional after scanning, (2) the pageLimit parameter is advertised but never applied, and (3) maxTickets is implemented in the worker but absent from the manifest schema. These all need fixing before this plugin can be used as intended.

src/worker.ts (data handler reshape + pageLimit wiring) and src/manifest.ts (add maxTickets) need the most attention; src/ui/index.tsx types should be reconciled with the worker's V1Report shape.

Important Files Changed

Filename	Overview
packages/plugins/examples/plugin-i18n-parity/src/worker.ts	Core scanner and tool handlers; contains three P1 bugs: the data handler returns a V1Report shape that the UI cannot consume (broken after scan), the pageLimit parameter is declared in the manifest but silently ignored here, and scan state is held only in memory, so it is lost on worker restart.
packages/plugins/examples/plugin-i18n-parity/src/manifest.ts	Plugin manifest with 5 tools and 3 UI slots; missing maxTickets from create-tickets parametersSchema, and pageLimit is declared for run-scan but not implemented in the worker.
packages/plugins/examples/plugin-i18n-parity/src/ui/index.tsx	Dashboard UI with sidebar, page, and widget components; type definitions diverge from V1Report (uses weightedScore, langAttr, scannedAt, flat summary array) so all components will render the empty/no-data state after a real scan.
packages/plugins/examples/plugin-i18n-parity/src/index.ts	Trivial barrel file re-exporting manifest and worker defaults.
packages/plugins/examples/plugin-i18n-parity/package.json	Package definition with correct workspace dependencies, build scripts, and plugin metadata.
packages/plugins/examples/plugin-i18n-parity/scripts/build-ui.mjs	esbuild script for bundling the UI; correctly externalises React and the plugin SDK UI module.
packages/plugins/examples/plugin-i18n-parity/tsconfig.json	TypeScript config; no issues observed.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: packages/plugins/examples/plugin-i18n-parity/src/worker.ts
Line: 541-544

Comment:
**Data contract mismatch: UI will never render scan results**

The data handler returns either `{ pages: [], summary: {}, scannedAt: null }` (no scan) or a full `V1Report` (after scan). However the `V1Report` type uses:
- `generated_at` (not `scannedAt`)
- `localization.pages` (not `pages`)
- `localization.summary` (not `summary`, and it's a `Record<string, V1LocaleSummary>` not an array)

The UI (`index.tsx`) checks `data.scannedAt` on lines 87, 127, and 233. Since `V1Report` has no `scannedAt` property, this is always `undefined`, causing the UI to permanently render the "No scan data available" empty state even after a successful scan.

Additionally, `data.pages` and `data.summary` are accessed in the UI but the actual data lives at `data.localization.pages` and `data.localization.summary` in the V1Report shape.

The data handler needs to project/reshape the V1Report into the shape the UI expects:

```ts
ctx.data.register("i18n-parity-report", async () => {
  if (!latestScanKey) return { pages: [], summary: [], scannedAt: null };
  const report = scanHistory.get(latestScanKey);
  if (!report) return { pages: [], summary: [], scannedAt: null };
  const summary = Object.entries(report.localization.summary).map(([locale, s]) => ({
    locale,
    pageCount: s.total_pages,
    flaggedCount: s.total_pages - s.above_threshold,
    averageScore: s.avg_score,
    minScore: s.worst_pages[0]?.page_localization_score ?? 0,
  }));
  return {
    scannedAt: report.generated_at,
    pages: report.localization.pages.map((p) => ({
      ...p,
      weightedScore: p.page_localization_score,
      langAttr: null,
      surfaces: Object.entries(p.surfaces).map(([surface, r]) => ({ surface, ...r })),
    })),
    summary,
  };
});
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/plugins/examples/plugin-i18n-parity/src/worker.ts
Line: 559-579

Comment:
**`pageLimit` parameter silently ignored**

The manifest's `run-scan` tool declares `pageLimit` as a supported parameter (manifest.ts lines 72-76), but the handler here only reads `input.locale` — `input.pageLimit` is never consumed. The `runScan()` function also has no `pageLimit` argument. Callers who pass `pageLimit` expecting a capped scan will receive a full unlimited scan with no warning.

Either remove `pageLimit` from the manifest schema, or thread it through:

```ts
const input = params as { locale?: string; pageLimit?: number };
const config = await getConfig(ctx);
if (!config.repoPath) return { error: "repoPath is not configured." };

const report = runScan(config, input.locale, input.pageLimit, ctx.logger);
```

And update `runScan` to accept and apply a `pageLimit` per locale.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/plugins/examples/plugin-i18n-parity/src/manifest.ts
Line: 116-136

Comment:
**`maxTickets` parameter missing from `create-tickets` manifest schema**

The worker's `create-tickets` handler reads and applies `input.maxTickets` (worker.ts lines 694–703), but this parameter is not declared in the manifest's `parametersSchema`. AI model consumers won't know it's a valid input, and Paperclip's parameter validation may reject or ignore it silently.

```suggestion
    {
      name: "create-tickets",
      displayName: "Create Parity Tickets",
      description:
        "Creates Paperclip issues for pages that fall below the minScore threshold from the most recent scan.",
      parametersSchema: {
        type: "object",
        properties: {
          minScore: {
            type: "number",
            description:
              "Override threshold (0–1). Defaults to plugin config minScore.",
          },
          dryRun: {
            type: "boolean",
            description:
              "If true, returns planned ticket list without creating issues.",
          },
          maxTickets: {
            type: "number",
            description:
              "Cap on the number of tickets to create in a single call.",
          },
        },
      },
    },
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/plugins/examples/plugin-i18n-parity/src/worker.ts
Line: 134-136

Comment:
**In-memory state lost on worker restart with no user-facing warning**

`scanHistory`, `latestScanKey`, and `cachedCompanyId` are module-level variables. Any worker restart (deploy, crash, idle timeout) silently wipes all scan history, causing all subsequent `get-report`, `get-summary`, `get-page-detail`, and `create-tickets` calls to return "No scan report available." The user will only discover this by running `run-scan` again.

Consider at minimum logging a warning on setup that state is volatile, or surfacing `scannedAt` in `onHealth()` so operators can detect a cold worker. A future improvement would be persisting to `ctx.kv` or similar storage.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "feat(SUD-1512): add ocho.i18n-parity plu..."

greptile-apps · 2026-03-27T14:41:27Z

packages/plugins/examples/plugin-i18n-parity/src/worker.ts

+    ctx.data.register("i18n-parity-report", async () => {
+      if (!latestScanKey) return { pages: [], summary: {}, scannedAt: null };
+      return scanHistory.get(latestScanKey) ?? { pages: [], summary: {}, scannedAt: null };
+    });


Data contract mismatch: UI will never render scan results

The data handler returns either { pages: [], summary: {}, scannedAt: null } (no scan) or a full V1Report (after scan). However the V1Report type uses:

- generated_at (not scannedAt)
- localization.pages (not pages)
- localization.summary (not summary, and it's a Record<string, V1LocaleSummary> not an array)

The UI (index.tsx) checks data.scannedAt on lines 87, 127, and 233. Since V1Report has no scannedAt property, this is always undefined, causing the UI to permanently render the "No scan data available" empty state even after a successful scan.

Additionally, data.pages and data.summary are accessed in the UI but the actual data lives at data.localization.pages and data.localization.summary in the V1Report shape.

The data handler needs to project/reshape the V1Report into the shape the UI expects:

ctx.data.register("i18n-parity-report", async () => {
  if (!latestScanKey) return { pages: [], summary: [], scannedAt: null };
  const report = scanHistory.get(latestScanKey);
  if (!report) return { pages: [], summary: [], scannedAt: null };
  const summary = Object.entries(report.localization.summary).map(([locale, s]) => ({
    locale,
    pageCount: s.total_pages,
    flaggedCount: s.total_pages - s.above_threshold,
    averageScore: s.avg_score,
    minScore: s.worst_pages[0]?.page_localization_score ?? 0,
  }));
  return {
    scannedAt: report.generated_at,
    pages: report.localization.pages.map((p) => ({
      ...p,
      weightedScore: p.page_localization_score,
      langAttr: null,
      surfaces: Object.entries(p.surfaces).map(([surface, r]) => ({ surface, ...r })),
    })),
    summary,
  };
});
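
To make the failure mode concrete, the sketch below reproduces the mismatch with trimmed-down types. The interfaces are inferred from the review text and omit most real fields, and `hasScanData` is a hypothetical stand-in for the UI's `data.scannedAt` check, not code from index.tsx.

```typescript
// Trimmed-down shapes inferred from the review; the real types carry more fields.
interface V1LocaleSummary {
  total_pages: number;
  above_threshold: number;
  avg_score: number;
}

interface V1Report {
  generated_at: string;
  localization: {
    pages: Array<{ page_localization_score: number }>;
    summary: Record<string, V1LocaleSummary>;
  };
}

// Shape the UI is described as reading.
interface UiReport {
  scannedAt: string | null;
  pages: unknown[];
  summary: unknown[];
}

// Hypothetical stand-in for the UI's empty-state check.
function hasScanData(data: Partial<UiReport>): boolean {
  return typeof data.scannedAt === "string";
}

const raw: V1Report = {
  generated_at: "2026-03-27T14:37:41Z",
  localization: {
    pages: [{ page_localization_score: 0.4 }],
    summary: { de: { total_pages: 1, above_threshold: 0, avg_score: 0.4 } },
  },
};

// Handing the raw report to the UI: scannedAt does not exist, so the
// empty state is rendered even though a scan has completed.
const rawSeenByUi = raw as unknown as Partial<UiReport>;
const rawRenders = hasScanData(rawSeenByUi);

// Projecting into the UI shape first makes the same check pass.
const projected: UiReport = {
  scannedAt: raw.generated_at,
  pages: raw.localization.pages,
  summary: Object.keys(raw.localization.summary).map((locale) => ({
    locale,
    ...raw.localization.summary[locale],
  })),
};
const projectedRenders = hasScanData(projected);
```

Sharing one exported type between worker and UI, rather than two hand-maintained definitions, would let the compiler catch this kind of drift.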


greptile-apps · 2026-03-27T14:41:28Z

packages/plugins/examples/plugin-i18n-parity/src/worker.ts

+      async (params): Promise<ToolResult> => {
+        try {
+          const input = params as { locale?: string };
+          const config = await getConfig(ctx);
+          if (!config.repoPath) return { error: "repoPath is not configured." };
+
+          const report = runScan(config, input.locale, ctx.logger);
+          latestScanKey = report.generated_at;
+          scanHistory.set(latestScanKey, report);
+
+          const totalPages = report.localization.pages.length;
+          const flagged = report.localization.pages.filter((p) => p.still_english_flag).length;
+          return {
+            content: `Scan complete. ${totalPages} pages across ${report.config.locales_scanned.length} locale(s). ${flagged} still-English pages flagged.`,
+            data: report,
+          };
+        } catch (err) {
+          return { error: `run-scan failed: ${err instanceof Error ? err.message : String(err)}` };
+        }
+      },
+    );


pageLimit parameter silently ignored

The manifest's run-scan tool declares pageLimit as a supported parameter (manifest.ts lines 72-76), but the handler here only reads input.locale — input.pageLimit is never consumed. The runScan() function also has no pageLimit argument. Callers who pass pageLimit expecting a capped scan will receive a full unlimited scan with no warning.

Either remove pageLimit from the manifest schema, or thread it through:

const input = params as { locale?: string; pageLimit?: number };
const config = await getConfig(ctx);
if (!config.repoPath) return { error: "repoPath is not configured." };

const report = runScan(config, input.locale, input.pageLimit, ctx.logger);

And update runScan to accept and apply a pageLimit per locale.
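
One way the per-locale cap could be applied inside the scan is sketched below. `planScanTargets` and `ScanTarget` are hypothetical names, and treating a missing, zero, negative, or non-integer `pageLimit` as "no cap" is an assumption, not behaviour confirmed by the PR.

```typescript
// Hypothetical helper: the real runScan() signature is not shown in this review.
interface ScanTarget {
  locale: string;
  route: string;
}

/**
 * Builds the list of (locale, route) pairs to scan, keeping at most
 * `pageLimit` routes per locale when a valid positive integer is given.
 */
function planScanTargets(
  locales: readonly string[],
  routes: readonly string[],
  pageLimit?: number,
): ScanTarget[] {
  // Assumption: invalid limits fall back to scanning every route.
  const cap =
    typeof pageLimit === "number" && Number.isInteger(pageLimit) && pageLimit > 0
      ? pageLimit
      : routes.length;

  const limitedRoutes = routes.slice(0, cap);
  const targets: ScanTarget[] = [];
  for (const locale of locales) {
    for (const route of limitedRoutes) {
      targets.push({ locale, route });
    }
  }
  return targets;
}
```

With two locales and three routes, a `pageLimit` of 2 yields four targets, while omitting it yields all six.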


greptile-apps · 2026-03-27T14:41:29Z

packages/plugins/examples/plugin-i18n-parity/src/manifest.ts

+    {
+      name: "create-tickets",
+      displayName: "Create Parity Tickets",
+      description:
+        "Creates Paperclip issues for pages that fall below the minScore threshold from the most recent scan.",
+      parametersSchema: {
+        type: "object",
+        properties: {
+          minScore: {
+            type: "number",
+            description:
+              "Override threshold (0–1). Defaults to plugin config minScore.",
+          },
+          dryRun: {
+            type: "boolean",
+            description:
+              "If true, returns planned ticket list without creating issues.",
+          },
+        },
+      },
+    },


maxTickets parameter missing from create-tickets manifest schema

The worker's create-tickets handler reads and applies input.maxTickets (worker.ts lines 694–703), but this parameter is not declared in the manifest's parametersSchema. AI model consumers won't know it's a valid input, and Paperclip's parameter validation may reject or ignore it silently.

Suggested change

    {
      name: "create-tickets",
      displayName: "Create Parity Tickets",
      description:
        "Creates Paperclip issues for pages that fall below the minScore threshold from the most recent scan.",
      parametersSchema: {
        type: "object",
        properties: {
          minScore: {
            type: "number",
            description:
              "Override threshold (0–1). Defaults to plugin config minScore.",
          },
          dryRun: {
            type: "boolean",
            description:
              "If true, returns planned ticket list without creating issues.",
          },
          maxTickets: {
            type: "number",
            description:
              "Cap on the number of tickets to create in a single call.",
          },
        },
      },
    },
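
Both this finding and the pageLimit one are cases of the manifest and the handler drifting apart. A small check along the following lines could catch that in a unit test. The helper names and the hand-written lists of keys each handler reads are assumptions; nothing like this exists in the PR.

```typescript
// Hypothetical drift check between a tool's declared schema and its handler.
interface ParametersSchema {
  type: "object";
  properties: Record<string, { type: string; description?: string }>;
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

/** Keys the handler reads that the manifest never declares (the maxTickets case). */
function undeclaredParams(schema: ParametersSchema, handlerReads: readonly string[]): string[] {
  return handlerReads.filter((key) => !hasOwn(schema.properties, key));
}

/** Keys the manifest declares that the handler never reads (the pageLimit case). */
function unusedParams(schema: ParametersSchema, handlerReads: readonly string[]): string[] {
  return Object.keys(schema.properties).filter((key) => handlerReads.indexOf(key) === -1);
}

// Schemas abbreviated from the review; the `locale` entry is assumed.
const createTicketsSchema: ParametersSchema = {
  type: "object",
  properties: { minScore: { type: "number" }, dryRun: { type: "boolean" } },
};
const runScanSchema: ParametersSchema = {
  type: "object",
  properties: { locale: { type: "string" }, pageLimit: { type: "number" } },
};

const missingFromManifest = undeclaredParams(createTicketsSchema, ["minScore", "dryRun", "maxTickets"]);
const ignoredByHandler = unusedParams(runScanSchema, ["locale"]);
```

The list of keys a handler reads has to be maintained by hand in this sketch, so it only helps if it is updated alongside the handler.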


greptile-apps · 2026-03-27T14:41:31Z

packages/plugins/examples/plugin-i18n-parity/src/worker.ts

+const scanHistory = new Map<string, V1Report>();
+let latestScanKey: string | null = null;
+let cachedCompanyId: string | null = null;


In-memory state lost on worker restart with no user-facing warning

scanHistory, latestScanKey, and cachedCompanyId are module-level variables. Any worker restart (deploy, crash, idle timeout) silently wipes all scan history, causing all subsequent get-report, get-summary, get-page-detail, and create-tickets calls to return "No scan report available." The user will only discover this by running run-scan again.

Consider at minimum logging a warning on setup that state is volatile, or surfacing scannedAt in onHealth() so operators can detect a cold worker. A future improvement would be persisting to ctx.kv or similar storage.
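
A minimal sketch of the "surface scannedAt" idea follows. `describeScanState` and the `ScanHealth` shape are hypothetical; how the result would be wired into `onHealth()` or a setup-time log line depends on SDK details this review does not show.

```typescript
// Hypothetical health summary derived from the module-level latestScanKey.
interface ScanHealth {
  scannedAt: string | null;
  cold: boolean; // true when no scan is held in memory
  ageSeconds: number | null;
  message: string;
}

function describeScanState(latestScanKey: string | null, now: Date = new Date()): ScanHealth {
  if (latestScanKey === null) {
    return {
      scannedAt: null,
      cold: true,
      ageSeconds: null,
      message: "No scan in memory. Scan history is volatile and is lost on worker restart; run run-scan again.",
    };
  }

  // latestScanKey is the report's generated_at ISO string.
  const scannedMs = Date.parse(latestScanKey);
  const ageSeconds = Number.isNaN(scannedMs)
    ? null
    : Math.max(0, Math.round((now.getTime() - scannedMs) / 1000));

  return {
    scannedAt: latestScanKey,
    cold: false,
    ageSeconds,
    message: "Scan available (in memory only).",
  };
}
```

Operators could then tell a cold worker apart from one that simply has no flagged pages, without waiting for a tool call to fail.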


achtung-ocho wants to merge 1 commit into paperclipai:master from achtung-ocho:feat/SUD-1512-i18n-parity-plugin
