feat(cursor): invisible sync-on-open migrator from legacy stream-json to ACP #844

Merged
tiann merged 2 commits into tiann:main from heavygee:spike/cursor-legacy-to-acp-migrator
Jun 10, 2026

Conversation

@heavygee heavygee commented Jun 8, 2026

Collaborator

Closes #824

When the user reopens a legacy stream-json Cursor session in HAPI, the
hub now transparently transplants its ~/.cursor/chats/<wsh>/<uuid>/store.db
into ~/.cursor/acp-sessions/<uuid>/, verifies it loads via agent acp,
flips metadata.cursorSessionProtocol = 'acp', and removes the legacy
source — all before resumeSession returns. Subsequent opens are pure ACP.

Why migration (not "leave legacy alone")

The primary justification is safety, not feature parity. #784 (cursor-agent
fabricates Questions skipped by the user responses in legacy stream-json
mode) still fires regularly in dogfood despite #801's mitigation: the agent
ships destructive side effects against fabricated consent. Migration to ACP
closes the protocol-level door because the AskQuestion tool does not exist
on the ACP side — there is nothing to fabricate.

#799 deliberately left existing sessions on legacy because they kept
working. That tradeoff was reasonable at the time. The accumulated #784
evidence makes legacy sessions actively unsafe; this PR makes the upgrade
path invisible enough that users stop avoiding it.

How: transplant → verify → flip → remove

A pre-PR spike established that legacy and ACP store.db files use the
identical SQLite schema; only the directory layout differs. The migrator
therefore:

  1. Sanity-checks the source store and pre-flip state (session.active,
    lifecycleState, on-disk presence, target collision)
  2. Optionally archives a stale-running row (forceArchiveRunning: true is
    the default for the auto-migrate path because the caller already
    verified session.active === false)
  3. Atomically creates ~/.cursor/acp-sessions/<uuid>/ with mode 0o700
  4. Copies store.db and chmods to 0o600 (multi-user-host hardening)
  5. Writes a minimal meta.json sidecar (schemaVersion, cwd, optional
    title) with mode 0o600
  6. Spawns agent acp under HAPI_HOME isolation and verifies the session
    loads via session/load. On long histories the verify also drives a
    trivial single-turn prompt; on short ones load-only is enough
  7. Flips cursorSessionProtocol = 'acp' AND clears the
    cursorMigrationState banner flag in a SINGLE metadata write
  8. Removes the legacy source store (only after verify succeeded and the
    protocol flip committed). The legacy ~/.cursor/chats parent dir is
    left as-is

Every failure leaves the legacy state intact. No rm fires without a
verify success AND a committed protocol flip.
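
The ordering above can be sketched with Node's synchronous `fs` calls. This is a minimal illustration, not the migrator's actual code: `transplantSketch`, the `TransplantHooks` interface, the outcome names, the `schemaVersion: 1` value, and the cleanup of a failed target are all assumptions made for the example. The real verify step spawns `agent acp`; here it is an injected callback.

```typescript
import { chmodSync, copyFileSync, existsSync, mkdirSync, mkdtempSync, rmSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Hypothetical hooks standing in for the real verify probe and metadata write.
interface TransplantHooks {
    verify: (targetDir: string) => boolean;  // e.g. session/load against `agent acp`
    commitProtocolFlip: () => boolean;       // true once the metadata write has committed
}

type TransplantOutcome = "migrated" | "target_collision" | "verify_failed" | "flip_failed";

function transplantSketch(sourceDb: string, targetDir: string, cwd: string, hooks: TransplantHooks): TransplantOutcome {
    mkdirSync(dirname(targetDir), { recursive: true });
    try {
        // Non-recursive mkdir fails with EEXIST, so an existing target is never overwritten.
        mkdirSync(targetDir, { mode: 0o700 });
    } catch (err) {
        if ((err as { code?: string }).code === "EEXIST") return "target_collision";
        throw err;
    }

    const targetDb = join(targetDir, "store.db");
    copyFileSync(sourceDb, targetDb);
    chmodSync(targetDb, 0o600);
    // schemaVersion value is a placeholder; the real sidecar format is not shown in the PR.
    writeFileSync(join(targetDir, "meta.json"), JSON.stringify({ schemaVersion: 1, cwd }), { mode: 0o600 });

    if (!hooks.verify(targetDir)) {
        rmSync(targetDir, { recursive: true, force: true }); // assumed cleanup of the failed target
        return "verify_failed";
    }
    if (!hooks.commitProtocolFlip()) return "flip_failed";

    // Reached only after verify succeeded AND the flip committed.
    rmSync(sourceDb, { force: true });
    return "migrated";
}

// Demo in a throwaway directory.
const demoRoot = mkdtempSync(join(tmpdir(), "transplant-sketch-"));
const demoSource = join(demoRoot, "legacy-store.db");
writeFileSync(demoSource, "not a real sqlite file");
const demoOutcome = transplantSketch(demoSource, join(demoRoot, "acp-sessions", "demo"), "/tmp/project", {
    verify: () => true,
    commitProtocolFlip: () => true
});
console.log(demoOutcome, existsSync(demoSource));
```

Every early return sits above the single `rmSync(sourceDb, ...)` call, which is one way to make the "no rm without verify and flip" rule hold by construction.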

In-flight upgrade banner

The transplant takes 15-20s on long histories (copy a multi-hundred-MB
store, spawn agent acp, replay thousands of notifications, tear down
the probe). Without a progress indicator the wait reads as "broken" to
a fresh user. A minimal banner ships alongside the migrator:

  • Hub sets metadata.cursorMigrationState = 'in_progress' before
    the long-running transplant. The session-cache refresh emits the
    existing session-updated SSE event (no new event type), so the web
    client picks it up in milliseconds. No client-side polling needed.
  • Hub clears the flag in the same metadata write that flips
    cursorSessionProtocol to 'acp' on success, so the banner disappears
    in the same render tick the chat re-renders as ACP — no flicker window.
  • Hub clears the flag explicitly in the auto-migrate helper's finally
    on failure/exception, so the banner never gets stuck if migration
    falls back to the legacy launcher.
  • Web renders an accessible (role="status", aria-live="polite")
    banner with an indeterminate spinner. Deliberately no fake
    percentage — we do not have phase data and a fake progress bar
    would lie.
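
The flag lifecycle in the bullets above can be sketched as a small state machine. This is an illustrative, synchronous simplification: `autoMigrateSketch`, `SessionMetadata`, and the `write` / `runTransplant` callbacks are hypothetical names, and the real hub helper is presumably asynchronous and versioned.

```typescript
interface SessionMetadata {
    cursorSessionProtocol: "stream-json" | "acp";
    cursorMigrationState?: "in_progress";
}

// `write` stands in for the hub's metadata write; `runTransplant` for the migrator call.
function autoMigrateSketch(
    initial: SessionMetadata,
    write: (next: SessionMetadata) => void,
    runTransplant: () => boolean
): SessionMetadata {
    let current: SessionMetadata = { ...initial, cursorMigrationState: "in_progress" };
    write(current); // banner appears before the long-running work starts

    try {
        if (runTransplant()) {
            // Success: protocol flip and banner clear travel in ONE write.
            current = { ...current, cursorSessionProtocol: "acp", cursorMigrationState: undefined };
            write(current);
        }
    } finally {
        // Failure or exception: clear the flag so the banner cannot get stuck.
        if (current.cursorMigrationState !== undefined) {
            current = { ...current, cursorMigrationState: undefined };
            write(current);
        }
    }
    return current;
}
```

On the success path the client never observes a state where the protocol is `'acp'` while the banner flag is still set, because both changes arrive in the same write.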

Coordination with the ACP refcount lock (#835 / #836 / #837)

This PR is sequenced after @swear01's three ACP mop-up PRs (merged
today as ad038bbf, 8094b500, fa363c2f), all of which are
prerequisite for safe concurrent ACP launches.

The verify probe spawns agent acp directly via AcpVerifyProbe under
HAPI_HOME isolation (the migrator overrides HOME to a temp dir for
the verify pass), so it never touches <real-HAPI_HOME>/locks/agent-acp-active/
at all. Per @swear01's #835 design note, the post-flip ACP launcher claims
the lock through the standard registerActiveAcpTransport entry and
behaves like any other concurrent ACP start. The migrator itself never
writes pid or count files directly.

Kill-switch

The auto-migrate path is gated by HAPI_CURSOR_LEGACY_AUTO_MIGRATE.
Set to 0, false, no, or off to suppress it entirely (legacy
sessions keep running through the existing stream-json launcher).
Default is on.
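
A gate like this could be parsed as follows. The sketch assumes the four listed values are matched case-insensitively after trimming, which the PR text does not state; `isAutoMigrateEnabled` is a hypothetical name.

```typescript
const OFF_VALUES = new Set(["0", "false", "no", "off"]);

// Returns true unless the variable is set to one of the documented "off" values.
// Case-insensitive matching and trimming are assumptions of this sketch.
function isAutoMigrateEnabled(env: Record<string, string | undefined>): boolean {
    const raw = env["HAPI_CURSOR_LEGACY_AUTO_MIGRATE"];
    if (raw === undefined) return true; // default is on
    return !OFF_VALUES.has(raw.trim().toLowerCase());
}

console.log(isAutoMigrateEnabled({}));                                         // true
console.log(isAutoMigrateEnabled({ HAPI_CURSOR_LEGACY_AUTO_MIGRATE: "off" })); // false
```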

Single-session escape hatch

A REST endpoint at POST /api/sessions/:id/migrate-to-acp allows
explicit migration of a single session outside the sync-on-open path
(e.g. for a specific cold archived session a user wants to re-engage).
Bulk migration surfaces (CLI subcommand, web button, bulk REST endpoint,
candidate-listing API) were deliberately stripped per reviewer feedback;
per-session sync-on-open + this escape hatch are the only two paths.

Test budget

  • 389 hub unit tests pass against the rebased branch (53 for the
    migrator core, 32 for the verify probe, 15 for the auto-migrate
    helper guard matrix, 4 for the migration banner flag transitions,
    plus all upstream suites)
  • 3 integration tests against a real agent acp (skipped by default
    unless HAPI_CURSOR_LEGACY_MIGRATOR_INTEGRATION=1)
  • 10 web unit tests for the banner component (visibility paths + a11y)
  • Typecheck clean across cli + web + hub

Files

  • hub/src/cursor/{acpVerifyProbe,cursorLegacyMigrator}.{ts,test.ts}:
    migrator core + verify probe + 53+32 unit tests
  • hub/src/cursor/cursorLegacyMigratorIntegration.test.ts: 3 real-agent acp integration tests
  • hub/src/cursor/fixtures/buildSyntheticLegacyStore.ts: synthetic store.db builder
  • hub/src/sync/syncEngine.ts: flipCursorSessionProtocolToAcp,
    maybeAutoMigrateLegacyCursorSession, setCursorMigrationStateInProgress,
    clearCursorMigrationState, buildMigratorForRequest
  • hub/src/sync/syncEngineAutoMigrate.test.ts: helper guard matrix
  • hub/src/web/routes/{sessions,cli}.ts: per-session escape-hatch endpoint
  • hub/src/store/index.ts: getCursorSessionByCursorIdAndProtocol lookup
  • shared/src/{schemas,apiTypes}.ts: cursorMigrationState field + request/outcome types
  • web/src/components/CursorMigrationBanner.{tsx,test.tsx}: banner + 10 tests
  • web/src/components/SessionChat.tsx: render banner
  • web/src/lib/locales/{en,zh-CN}.ts: banner copy
  • web/src/types/api.ts: Metadata re-export
  • web/src/api/client.ts: migrateCursorSessionToAcp client method

… to ACP

Closes tiann#824

When the operator reopens a legacy stream-json Cursor session in HAPI,
the hub now transparently transplants its `~/.cursor/chats/<wsh>/<uuid>/store.db`
into `~/.cursor/acp-sessions/<uuid>/`, verifies it loads via `agent acp`,
flips `metadata.cursorSessionProtocol = 'acp'`, and removes the legacy
source - all before `resumeSession` returns. Subsequent opens are pure ACP.

The primary justification is safety, not feature parity. tiann#784 (`cursor-agent`
fabricates `Questions skipped by the user` responses in legacy stream-json
mode) still fires regularly in dogfood despite tiann#801's mitigation: the agent
ships destructive side effects against fabricated consent. Migration to ACP
closes the protocol-level door because the `AskQuestion` tool does not exist
on the ACP side, so there is nothing to fabricate.

tiann#799 deliberately left existing sessions on legacy because they kept
working. That tradeoff was reasonable at the time. The accumulated tiann#784
evidence makes legacy sessions actively unsafe; this PR makes the upgrade
path invisible enough that users stop avoiding it.

A pre-PR spike established that legacy and ACP `store.db` files use the
identical SQLite schema; only the directory layout differs. The migrator
therefore:

1. Sanity-checks the source store and pre-flip state (`session.active`,
   `lifecycleState`, on-disk presence, target collision)
2. Optionally archives a stale-running row (`forceArchiveRunning: true` is
   the default for the auto-migrate path because the caller already
   verified `session.active === false`)
3. Atomically creates `~/.cursor/acp-sessions/<uuid>/` with mode `0o700`
4. Copies `store.db` and chmods to `0o600` (multi-user-host hardening)
5. Writes a minimal `meta.json` sidecar (`schemaVersion`, `cwd`, optional
   `title`) with mode `0o600`
6. Spawns `agent acp` under HAPI_HOME isolation and verifies the session
   loads via `session/load`. On long histories the verify also drives a
   trivial single-turn prompt; on short ones load-only is enough
7. Flips `cursorSessionProtocol = 'acp'` AND clears the
   `cursorMigrationState` banner flag in a SINGLE metadata write
8. Removes the legacy source store (only after verify succeeded and the
   protocol flip committed). The legacy `~/.cursor/chats` parent dir is
   left as-is

Every failure leaves the legacy state intact. No `rm` fires without a
verify success AND a committed protocol flip.

The transplant takes 15-20s on long histories (copy a multi-hundred-MB
store, spawn `agent acp`, replay thousands of notifications, tear down
the probe). Without a progress indicator the wait reads as "broken" to a
fresh reviewer. A minimal banner ships alongside the migrator:

- Hub sets `metadata.cursorMigrationState = 'in_progress'` BEFORE the
  long-running transplant. The session-cache refresh emits the existing
  `session-updated` SSE event (no new event type), so the web client
  picks it up in milliseconds. No client-side polling needed.
- Hub clears the flag in the SAME metadata write that flips
  `cursorSessionProtocol` to `'acp'` on success, so the banner disappears
  in the same render tick the chat re-renders as ACP - no flicker window.
- Hub clears the flag explicitly in the auto-migrate helper's `finally`
  on failure/exception, so the banner never gets stuck if migration
  falls back to the legacy launcher.
- Web renders an accessible (role=status, aria-live=polite) banner with
  an indeterminate spinner. Deliberately no fake percentage - we do not
  have phase data and a fake progress bar would lie.

This PR is intentionally sequenced AFTER swear01's three ACP mop-up PRs
(merged today as ad038bb, 8094b50, fa363c2), all of which are
prerequisite for safe concurrent ACP launches.

The verify probe spawns `agent acp` directly via `AcpVerifyProbe` under
HAPI_HOME isolation (the migrator overrides `HOME` to a temp dir for the
verify pass), so it never touches `<real-HAPI_HOME>/locks/agent-acp-active/`
at all. Per swear01's tiann#835 design note, the post-flip ACP launcher claims
the lock through the standard `registerActiveAcpTransport` entry and
behaves like any other concurrent ACP start. The migrator itself never
writes `pid` or `count` files directly.

The auto-migrate path is gated by `HAPI_CURSOR_LEGACY_AUTO_MIGRATE`. Set
to `0`, `false`, `no`, or `off` to suppress it entirely (legacy sessions
keep running through the existing stream-json launcher). Default is on.

A REST endpoint at `POST /api/sessions/:id/migrate-to-acp` allows
explicit migration of a single session outside the sync-on-open path
(e.g. for a specific cold archived session a user wants to re-engage).
Bulk migration surfaces (CLI subcommand, web button, bulk REST endpoint,
candidate-listing API) were deliberately stripped per reviewer feedback;
per-session sync-on-open + this escape hatch are the only two paths.

- 343 hub unit tests (4 new for the migration banner flag transitions,
  53 for the migrator core, 32 for the verify probe, 15 for the auto-
  migrate helper guard matrix, plus existing suites)
- 3 integration tests against a real `agent acp` (skipped by default
  unless `HAPI_CURSOR_LEGACY_MIGRATOR_INTEGRATION=1`)
- 10 web unit tests for the banner component (visibility paths + a11y)
- Typecheck clean across cli, web, hub

Co-authored-by: Cursor <cursoragent@cursor.com>

@github-actions github-actions Bot left a comment

Findings

  • [Major] Verify probe ignores metadata.homeDir for binary lookup — migrateOneWithLock() correctly resolves the legacy store under the recorded session owner home, but the default probe factory still sets agentLookupHome from this.deps.homeDir() instead of that resolved home. In the service-account hub case described in the PR, the store is found under metadata.homeDir, then agent acp lookup falls back to the hub user's ~/.local/bin, so verification fails and sync-on-open silently falls back to legacy. Evidence: hub/src/cursor/cursorLegacyMigrator.ts:319.
    Suggested fix:
    // widen the dependency and pass the resolved source home through verifyInTempHome
    createProbe?: (env: NodeJS.ProcessEnv, agentLookupHome: string) => AcpVerifyProbe
    
    createProbe: deps.createProbe ?? ((env, agentLookupHome) => new AcpVerifyProbe({
        env,
        skipLockAcquire: true,
        agentLookupHome
    }))
    
    const probe = this.deps.createProbe(env, opts.sourceHome)
  • [Major] Windows optional binary is locked to the wrong version — cli/package.json requires @twsxtd/hapi-win32-x64@0.20.1, but the added lock entry resolves @twsxtd/hapi-win32-x64@0.20.0. That can leave Windows installs pinned to the previous native binary or make frozen installs reject the lock. Evidence: bun.lock:1077.
    Suggested fix:
    -    "@twsxtd/hapi-win32-x64": ["@twsxtd/hapi-win32-x64@0.20.0", "", { "os": "win32", "cpu": "x64", "bin": { "hapi": "bin/hapi.exe" } }, "sha512-..."]
    +    "@twsxtd/hapi-win32-x64": ["@twsxtd/hapi-win32-x64@0.20.1", "", { "os": "win32", "cpu": "x64", "bin": { "hapi": "bin/hapi.exe" } }, "sha512-<0.20.1 integrity>"]

Summary
Review mode: initial

  • Two blocking release/runtime risks found in the added migration path and lockfile.

Testing

  • Not run (automation). I attempted bun install --frozen-lockfile, but bun is not available on this runner PATH.

HAPI Bot

Comment thread hub/src/cursor/cursorLegacyMigrator.ts Outdated
// PATH. Thread the migrator's homeDir dep through so the
// probe doesn't fall back to process.env.HOME when a more
// specific lookup home is available.
agentLookupHome: this.deps.homeDir()

@github-actions github-actions Bot Jun 8, 2026

[MAJOR] migrateOneWithLock() resolves the legacy store using metadata.homeDir, but the default probe factory still builds agentLookupHome from this.deps.homeDir(). In the service-account hub deployment this PR calls out, that means the store is found under the human user's home, then agent acp lookup falls back to the hub user's ~/.local/bin and verification fails back to legacy. Pass the resolved source home into the probe instead.

Suggested fix:

createProbe?: (env: NodeJS.ProcessEnv, agentLookupHome: string) => AcpVerifyProbe

createProbe: deps.createProbe ?? ((env, agentLookupHome) => new AcpVerifyProbe({
    env,
    skipLockAcquire: true,
    agentLookupHome
}))

const probe = this.deps.createProbe(env, opts.sourceHome)

Comment thread bun.lock Outdated

"@twsxtd/hapi-linux-x64": ["@twsxtd/hapi-linux-x64@0.20.1", "", { "os": "linux", "cpu": "x64", "bin": { "hapi": "bin/hapi" } }, "sha512-VWPCKdAgwfUNBRI9Xy14CKjx1d7JS1irOja5l6zufpaTi139jc51gyDcWFfygMwttQlNimmh2qHTfaFqqvcdNg=="],

"@twsxtd/hapi-win32-x64": ["@twsxtd/hapi-win32-x64@0.20.0", "", { "os": "win32", "cpu": "x64", "bin": { "hapi": "bin/hapi.exe" } }, "sha512-1GWfncMeaZvBIfSB0RY4UI4ywiKUtOAi41nRHxqUI/VdWS9Rw3syCRa4bH2gFJzrdRtDdi0kfSib9YRHs1uQgg=="],

@github-actions github-actions Bot Jun 8, 2026

[MAJOR] This added package entry resolves @twsxtd/hapi-win32-x64@0.20.0, while cli/package.json requires 0.20.1. That can pin Windows installs to the previous native binary or make frozen installs reject the lock. Regenerate/update the lockfile so the package entry is also 0.20.1 with the matching integrity.

Suggested fix:

-    "@twsxtd/hapi-win32-x64": ["@twsxtd/hapi-win32-x64@0.20.0", "", { "os": "win32", "cpu": "x64", "bin": { "hapi": "bin/hapi.exe" } }, "sha512-..."]
+    "@twsxtd/hapi-win32-x64": ["@twsxtd/hapi-win32-x64@0.20.1", "", { "os": "win32", "cpu": "x64", "bin": { "hapi": "bin/hapi.exe" } }, "sha512-<0.20.1 integrity>"]

…ajors)

Finding 1 (Major) — verify probe `agentLookupHome` ignored `metadata.homeDir`
in service-account hub deployments. `migrateOne` resolved the legacy store
under `metadata.homeDir` (the recorded session-owner home) but the default
createProbe factory still set `agentLookupHome` from `this.deps.homeDir()`
(the hub user's home). On a service-account hub, the store lookup
succeeded but `agent acp` discovery fell back to the hub user's
`~/.local/bin`, so verify silently failed and sync-on-open quietly fell
back to legacy.

Fix:
- Widen `CursorLegacyMigratorDeps.createProbe` signature from
  `(env) => AcpVerifyProbe` to `(env, agentLookupHome) => AcpVerifyProbe`
- Default factory uses the passed `agentLookupHome`
- `verifyInTempHome` threads `opts.sourceHome` (already the resolved
  session-owner home) through as the 2nd arg

2 new regression tests pin the contract:
- service-account case (metadata.homeDir != deps.homeDir()): captured
  agentLookupHome MUST equal metadata.homeDir
- legacy session record (no metadata.homeDir): falls back to
  deps.homeDir() correctly

Finding 2 (Major) — `bun.lock` win32-x64 pinned to 0.20.0 while
`cli/package.json` required 0.20.1 (rebase artifact from the v0.20.0 →
v0.20.1 release commit landing in upstream/main between the original
spike and the rebase). Frozen-install Windows users would either get
the wrong native binary or have the lock rejected.

Fix: regenerated bun.lock so the entry resolves
`@twsxtd/hapi-win32-x64@0.20.1`. `bun install --frozen-lockfile` now
passes clean.

Test budget:
- 391 hub unit tests pass (2 new for createProbe agentLookupHome
  contract, +0 regressions)
- Typecheck clean across cli + web + hub
- `bun install --frozen-lockfile` clean

Co-authored-by: Cursor <cursoragent@cursor.com>

heavygee commented Jun 8, 2026

Collaborator Author

Thanks for the review. Both Major findings addressed in e3e1c0b3:

Finding 1 — verify probe agentLookupHome ignored metadata.homeDir: Widened CursorLegacyMigratorDeps.createProbe signature from (env) => AcpVerifyProbe to (env, agentLookupHome) => AcpVerifyProbe. The default factory now uses the passed arg; verifyInTempHome threads opts.sourceHome (already the resolved session-owner home from metadata.homeDir) through. The earlier rounds plumbed the type into the factory default but never threaded the resolved home through the call path — exactly the gap you spotted. Two regression tests added in cursorLegacyMigrator.test.ts:

  • Service-account case (metadata.homeDir != deps.homeDir()): captured agentLookupHome MUST equal metadata.homeDir
  • Legacy session record (no metadata.homeDir): falls back to deps.homeDir() correctly
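
The contract those two tests pin can be sketched as a single resolution step. `resolveAgentLookupHome` is a hypothetical helper written for illustration; the real code threads `opts.sourceHome` through `verifyInTempHome` rather than calling a function by this name.

```typescript
// Prefer the recorded session-owner home; fall back to the hub's own home
// only when the session record carries no homeDir.
function resolveAgentLookupHome(metadataHomeDir: string | undefined, hubHomeDir: () => string): string {
    return metadataHomeDir ?? hubHomeDir();
}

// Service-account case: the owner's home wins over the hub user's home.
console.log(resolveAgentLookupHome("/home/alice", () => "/var/lib/hapi")); // /home/alice
// Legacy record without homeDir: falls back to the hub's home.
console.log(resolveAgentLookupHome(undefined, () => "/var/lib/hapi"));     // /var/lib/hapi
```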

Finding 2 — bun.lock win32-x64 pin mismatch: Rebase artifact. The squash-rebase onto upstream/main 1f92a31b (v0.20.1 release) updated cli/package.json to 0.20.1 but the win32-x64 lock entry stayed at 0.20.0. Regenerated with bun install; bun install --frozen-lockfile is now clean.

Status:

  • 391 hub unit tests pass (+2 new)
  • Typecheck clean across cli + web + hub

Ready for re-review.

@github-actions github-actions Bot left a comment

Findings

  • None.

Summary

  • Review mode: follow-up after new commits
  • The prior bot findings are addressed in the latest head: the Windows optional binary lock entry now points at @twsxtd/hapi-win32-x64@0.20.1, and the verify probe now receives the resolved session-owner home for agentLookupHome.
  • Residual risk: the real agent acp migration integration tests are opt-in, so this review relies on source inspection plus the added unit coverage for the destructive migration paths.

Testing

  • Not run (automation); bun is not available on this runner PATH.

HAPI Bot

@tiann tiann merged commit 55d1bbb into tiann:main Jun 10, 2026
2 checks passed
yoyoworms added a commit to yoyoworms/hapi that referenced this pull request Jun 10, 2026
…or ACP

Merge tiann/hapi upstream (50 commits, v0.20.0 + v0.20.1) into the
liuxin fork. Notable adoptions:

- POST /sessions/:id/reopen + web Reopen button (tiann#826), archive
  metadata preservation (tiann#825), clearSessionArchiveMetadata /
  restoreSessionArchiveMetadata on sessionCache
- runner self-restart resilience under external supervision (tiann#814)
- session export (tiann#808), scratchlist v1.1 (tiann#798), voice backend
  enabled (Gemini Live / Qwen Realtime, tiann#692/tiann#742)
- cursor legacy→ACP auto-migration machinery (tiann#799/tiann#824/tiann#835/tiann#844)
- stale queued-message ghost fix (tiann#811), composer send-error
  restore path, mermaid/remark-math web fixes

Conflict resolutions (32 files) — fork features preserved:
- shared/models.ts: adopt upstream `fable`/`fable[1m]` aliases
  (verified working via `claude --model fable -p`), drop our
  `claude-fable-5` long-form ids, keep opus-4-6/4-7 1M pins
- syncEngine.resumeSession: keep fork Restart behavior (archive
  active session then respawn — web Restart depends on it) over
  upstream's return-success-if-active; keep resumeWithSessionId
  override, --continue fallback, auto-discovery scan, and
  retry-without-token; adopt upstream's cursor auto-migrate hook,
  never-started fresh-spawn path, and the no-set-session-config
  fix; adapt upstream's 11-arg spawnSession call to our 13-arg
  signature (sandbox + continueLatest)
- sessions routes: keep fork /resume-options + /pin alongside
  upstream /reopen + /migrate-to-acp
- sessionBase.onSessionFound: merged signature
  (sessionId, extras?, sessionFilePath?) serving upstream's cursor
  metadata extras and our transcript-path scanner callback
- useSendMessage.onError: upstream's attachment-aware composer
  restore + fork's clearTurnLock/pauseQueue queue stop
- SessionChat: keep fork InactiveSessionBanner (one-click resume +
  browse) over upstream's plain-text banner; adopt voice
  integration; keep recent-12h section + pin + compact postTokens
- web/server.ts: keep 100MB body cap + options-object push routes;
  adopt voice WS proxies and codexDesktop routes
- telegram bot: adopt upstream formatReadyNotification

Validation: typecheck green (cli/web/hub); 983/984 tests pass.
The one failure is runner.integration stress test (spawn/stop 20
concurrent sessions) — environmental on this loaded dev machine;
not in the deploy script's focused test set.

via [HAPI](https://hapi.run)

Co-Authored-By: HAPI <noreply@hapi.run>
tiann pushed a commit that referenced this pull request Jun 11, 2026
…regression) (#877)

* fix(cursor): migrator path-priority + ambiguity surface (closes #844 regression)

The legacy-to-ACP migrator's `findLegacyChatStore()` walks
`~/.cursor/chats/<workspace-hash>/<cursorSessionId>/store.db` via
`readdirSync()` and returns the FIRST match. When the same cursor
session id exists in more than one workspace-hash drawer (operator
opened the session from a worktree, an old workspace clone, etc.)
the readdir order picks an arbitrary candidate. The migrator then
transplants alien content into the ACP target, deletes the source
drawer, and reports success - because the verify probe only checks
"loads cleanly", not "loaded the right content". Operator session
resurrects with no recall of its real history.

Four-part fix (all four must land together):

1. Path-priority discovery in `findLegacyChatStore(id, home, cwd?)`:
   - Optional 3rd arg = canonical workspace path (caller passes
     `session.metadata.path`).
   - Compute md5(cwd) and check that drawer FIRST.
   - Fall back to readdir scan only if the canonical drawer is empty.
   - If 2+ candidates remain after fallback, throw
     `AmbiguousLegacyStoreError` listing all of them
     (workspaceHash, sizeBytes, mtimeMs).
2. Ambiguity surface in `maybeAutoMigrateLegacyCursorSession`:
   - Catch `ambiguous_legacy_store` / `size_mismatch` refusals and
     promote `cursorMigrationState` from 'in_progress' to a new
     'ambiguous' state instead of silently clearing the banner.
     Operator sees an actionable web-banner.
3. Size sanity check before transplant:
   - Compare HAPI's known message count (new `MessageStore.countMessages`
     + `CursorLegacyMigratorDeps.getHapiMessageCount` dep) against
     the candidate `store.db`'s blob count. If message count > 100
     AND blob count < messageCount/4, refuse with `size_mismatch`.
   - Skipped when message count is 0 (brand-new session) or the dep
     is unwired (unit tests, CLI direct callers).
4. Diagnostic logging on every successful transplant:
   - `[migrator] transplanted` info log capturing cursorSessionId,
     picked workspaceHash, candidate count discovered, sourceBytes,
     sourceBlobCount, targetAcpPath, sourceRemoved, canonical-path
     md5. Future regressions of this bug shape are diagnosable from
     `journalctl -u hapi-hub` without blob-overlap forensics.
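
A minimal sketch of part 1 (path-priority discovery with an ambiguity refusal), assuming the drawer name is the hex md5 of the raw workspace path as the commit describes. `findStoreSketch` and `listCandidates` are illustrative names, and unlike the real `findLegacyChatStore(id, home, cwd?)` this version takes the chats root directly.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, mkdtempSync, readdirSync, statSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface StoreCandidate {
    workspaceHash: string;
    storePath: string;
    sizeBytes: number;
    mtimeMs: number;
}

class AmbiguousLegacyStoreError extends Error {
    candidates: StoreCandidate[];
    constructor(candidates: StoreCandidate[]) {
        super(`ambiguous legacy store: ${candidates.length} candidates`);
        this.name = "AmbiguousLegacyStoreError";
        this.candidates = candidates;
    }
}

// md5 over the raw path string: no trimming, no normalization.
function workspaceHashFromPath(cwd: string): string {
    return createHash("md5").update(cwd, "utf8").digest("hex");
}

function listCandidates(chatsRoot: string, sessionId: string): StoreCandidate[] {
    if (!existsSync(chatsRoot)) return [];
    const found: StoreCandidate[] = [];
    for (const workspaceHash of readdirSync(chatsRoot)) {
        const storePath = join(chatsRoot, workspaceHash, sessionId, "store.db");
        if (!existsSync(storePath)) continue;
        const stats = statSync(storePath);
        found.push({ workspaceHash, storePath, sizeBytes: stats.size, mtimeMs: stats.mtimeMs });
    }
    return found;
}

function findStoreSketch(chatsRoot: string, sessionId: string, cwd?: string): string | null {
    if (cwd !== undefined) {
        // Canonical drawer first, so readdir order can never pick the winner.
        const canonical = join(chatsRoot, workspaceHashFromPath(cwd), sessionId, "store.db");
        if (existsSync(canonical)) return canonical;
    }
    const [first, ...rest] = listCandidates(chatsRoot, sessionId);
    if (first === undefined) return null;
    if (rest.length > 0) throw new AmbiguousLegacyStoreError([first, ...rest]);
    return first.storePath;
}

// Demo: one session id present in two drawers, one of which is canonical.
const chatsRoot = mkdtempSync(join(tmpdir(), "chats-sketch-"));
const sessionId = "demo-session";
const realCwd = "/home/dev/project";
for (const drawer of [workspaceHashFromPath(realCwd), "stale-drawer"]) {
    mkdirSync(join(chatsRoot, drawer, sessionId), { recursive: true });
    writeFileSync(join(chatsRoot, drawer, sessionId, "store.db"), drawer);
}
const picked = findStoreSketch(chatsRoot, sessionId, realCwd);
console.log(picked);
```

Calling `findStoreSketch(chatsRoot, sessionId)` without the `cwd` argument on the same fixture throws `AmbiguousLegacyStoreError`, which is the refusal the hub then surfaces as the `'ambiguous'` banner state.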

Tests added in `hub/src/cursor/cursorLegacyMigrator.test.ts`:
  - regression guard for single-drawer discovery
  - canonical-path wins over readdir order
  - ambiguity throws with all candidates listed (3-drawer + 2-drawer
    no-canonical-arg variants)
  - canonical-path resolves ambiguity cleanly
  - listLegacyChatStoreCandidates enumeration
  - workspaceHashFromPath shape
  - migrateOne happy path with canonical workspace + 3 sibling decoys
  - migrateOne refuses with ambiguous_legacy_store (3 drawers, no
    canonical match) and leaves all sources untouched
  - migrateOne proceeds when canonical path resolves
  - size_mismatch refuses tiny candidate when messageCount=6000
  - size_mismatch passes when candidate blob count meets the floor
  - size sanity skipped on messageCount=0, missing dep, throwing dep,
    boundary (messageCount=100)
  - countLegacyStoreBlobs returns counts / null on bad path
And in `hub/src/sync/syncEngineAutoMigrate.test.ts`:
  - cursorMigrationState promoted to 'ambiguous' on
    ambiguous_legacy_store / size_mismatch refusals.

Schema:
  - `shared/src/schemas.ts`: cursorMigrationState enum gains 'ambiguous'.
  - `shared/src/apiTypes.ts`: CursorMigrateRefusalReason gains
    'ambiguous_legacy_store' + 'size_mismatch'.

Real-world repro (operator's tooling session, 2026-06-09): three legacy
drawers contained one cursor session id - one with the real 21k-blob
history, two with stale 19/568-blob diagnostic snapshots. Migrator
silently transplanted the 568-blob alien content; resurrected session
had no memory of prior history. Manual rescue completed; this fix
prevents recurrence and surfaces the ambiguity to the operator instead.
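
The size sanity check from part 3 reduces to a small predicate. The sketch below is illustrative: `sizeSanitySketch` and the verdict names other than `size_mismatch` are hypothetical, and representing an unwired or throwing count dependency as `null` is an assumption. The first call pairs the test suite's `messageCount=6000` with a 568-blob candidate like the one in the repro; that pairing is for illustration only, since the repro's actual message count is not given.

```typescript
type SizeVerdict = "skipped" | "ok" | "size_mismatch";

// `messageCount` is null when the count dependency is unwired or threw.
function sizeSanitySketch(messageCount: number | null, blobCount: number): SizeVerdict {
    // At or below 100 messages (including 0 for a brand-new session) the check does not engage.
    if (messageCount === null || messageCount <= 100) return "skipped";
    // Refuse when the candidate holds fewer than a quarter as many blobs as HAPI has messages.
    return blobCount < messageCount / 4 ? "size_mismatch" : "ok";
}

console.log(sizeSanitySketch(6000, 568));  // size_mismatch (floor is 1500)
console.log(sizeSanitySketch(6000, 1500)); // ok
console.log(sizeSanitySketch(100, 0));     // skipped
console.log(sizeSanitySketch(101, 0));     // size_mismatch
```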

* fix(cursor): address cold review on migrator path-priority fix

Self-review against the cold-PR rubric surfaces four polish items on
the previous commit; all four addressed in-loop before push.

- Major: `migrator:transplanted` candidate count was captured AFTER
  the source rm, so for the dominant single-candidate happy path the
  log reported `candidateCount=0, sourceRemoved=true`. Useless for
  diagnosing a future regression of the bug shape this PR is fixing.
  Snapshot candidates + source-side size + source-side blob count
  BEFORE any destructive step and use those for the log.
- Minor: `sourceBytes` and `sourceBlobCount` were read from the
  destination path (acpSessionDir/store.db). The cp guarantees they
  match, but the field names imply source-side measurement. Now they
  measure the source directly.
- Minor: `setCursorMigrationStateAmbiguous` silently returned false on
  cache miss / repeated version mismatch / write failure, letting the
  finally{} block clear the banner without any log. Now emits a
  warn-level log so the gap is diagnosable from journalctl.
- Minor: `findLegacyChatStore` is exported public API and used as a
  free function in unit tests. An out-of-band caller bypassing
  preflightSession could pass `..` or `/etc/passwd` and have the inner
  `join(chatsRoot, wsh, id, 'store.db')` resolve to an arbitrary on-
  disk path. The probe is read-only `statSync` so blast radius is
  small, but enforce the same CURSOR_SESSION_ID_RE at the function
  boundary as a defence-in-depth. New unit test locks the behaviour.
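
A boundary guard of this kind might look like the sketch below. The actual `CURSOR_SESSION_ID_RE` pattern is not shown in this thread, so the UUID-shaped regex here is a stand-in assumption, as are the helper name `storePathFor` and the choice to return `null` rather than throw.

```typescript
import { join } from "node:path";

// Stand-in pattern: assumes Cursor session ids are UUID-shaped.
const SESSION_ID_RE_SKETCH = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Rejects any id that is not a plain session id before it reaches path.join,
// so segments such as ".." can never steer the lookup out of the drawer.
function storePathFor(chatsRoot: string, workspaceHash: string, sessionId: string): string | null {
    if (!SESSION_ID_RE_SKETCH.test(sessionId)) return null;
    return join(chatsRoot, workspaceHash, sessionId, "store.db");
}

console.log(storePathFor("/home/dev/.cursor/chats", "abc123", ".."));
console.log(storePathFor("/home/dev/.cursor/chats", "abc123", "123e4567-e89b-12d3-a456-426614174000"));
```

Without the guard, `join(chatsRoot, wsh, "..", "store.db")` collapses to `<chatsRoot>/store.db`, one level above the intended drawer.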

Hub test suite: 414 pass, 0 fail. Typecheck clean across cli/web/hub.

* fix(cursor): cold-review polish on migrator path-priority (#873)

- Web `CursorMigrationBanner` now renders a "Manual review needed"
  state for `cursorMigrationState === 'ambiguous'` (Major #1: caller
  was promoting the metadata flag but no UI surfaced it).
- Pin the md5-fixture contract for `workspaceHashFromPath`: raw,
  no-normalization, trailing-slash-distinct hashes computed via
  `printf '%s' <path> | md5sum` (Major #2: prevents algorithm drift
  that would silently revert path-priority discovery to fallback).
- Snapshot full candidate set BEFORE the canonical fast-path resolves
  a single drawer so the `migrator:transplanted` log reports the
  decision-time count, not a post-rm undercount (Minor #1).
- Warn log when canonical-path drawer is missing but readdir hands
  back exactly one candidate - regression-equivalent behaviour, but
  the size mismatch warrants a journalctl trail (path-normalization
  corner case the maintainer can grep for).
- Boundary test: `messageCount = 101` (first value above the skip
  threshold) engages the size sanity check, pinning the cutoff
  contract (Nit).
- Schema docstring on `cursorMigrationState` enum spelling out the
  banner contract per value (Nit).
- syncEngine `getHapiMessageCount` warn-logs `countMessages` throws
  instead of silently downgrading to 0 (would chronically disable
  the floor).

Drafted with claude-4.6-sonnet-thinking via Cursor; reviewed and
tested by the operator. #873.

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cursor): correct log-search strings in ambiguous banner copy

The en/zh-CN locale strings told users to grep for
'migrator:ambiguous_legacy_store' and 'migrator:size_mismatch'
but the hub emits '[migrator] ambiguous legacy store; refusing
transplant' and '[migrator] size sanity check refused transplant'.

Fix both locale files to quote the actual log prefix so the
journalctl grep the operator is directed to actually hits.

Addresses #877 bot finding (Minor).

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(cursor): address #877 bot Minor findings (trim + boundary guard)

- Remove .trim() from canonical path before hashing: Cursor hashes
  raw workspace-path bytes; trimming a POSIX path with leading/
  trailing spaces would hash to the wrong drawer, causing a false
  canonical miss and potential ambiguity refusal.

- Add CURSOR_SESSION_ID_RE guard to listLegacyChatStoreCandidates:
  the function was exported without the same traversal-ID boundary
  check present in findLegacyChatStore. A future direct caller
  bypassing findLegacyChatStore could stat paths outside the intended
  <wsh>/<cursorSessionId>/store.db shape.

- Move CURSOR_SESSION_ID_RE declaration above both functions that
  reference it so there is no temporal-dead-zone hazard.

Addresses #877 bot review Minor findings.

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
heavygee added a commit to heavygee/hapi that referenced this pull request Jun 12, 2026
Legacy stream-json sessions do not exist in practice. Hub auto-migrates
them to ACP at resume time via maybeAutoMigrateLegacyCursorSession (PR
tiann#844): cursorSessionProtocol flips from 'stream-json' to 'acp' before
cursorRemoteLauncher selects a launcher. The legacy launcher is reached
only when migration soft-fails — a degraded fallback path, not a
supported flow.

Carrying duplicate model-error logic on the legacy path:
  - Doubles the surface for bugs in the model-error contract.
  - Implies legacy is a real, parity-required path (it is not).
  - Is dead code in practice (migration is ~100% reliable on healthy DBs).

Removes recordModelError, handleTextMessageClassification, the
turnHasModelError + lastAssistantText fields, the sendReady block on
modelError, and the classifier import. Replaced with an inline comment
documenting the rationale and pointing readers at the migration path
if they ever encounter a use case.

Net: -55 lines of dead code; ACP launcher remains the sole structural-
signal surface for model errors.

135/135 cursor tests still pass. bun typecheck clean across cli/web/hub.

Co-authored-by: Cursor <cursoragent@cursor.com>

Development

Successfully merging this pull request may close these issues.

feat(cursor): operator-driven legacy stream-json -> ACP session migrator (transplant strategy)

2 participants