feat(a2a): clarify semantic vs actionable @mentions (#417) #418

Merged
zts212653 merged 14 commits into main from feat/417-a2a-mention-semantics
Apr 10, 2026

Conversation

@mindfn
Collaborator

@mindfn mindfn commented Apr 10, 2026

Summary

  • Add detectInlineActionMentions() to detect inline action-like @mentions (for example "Ready for @codex review") that look like handoffs but do not route because they are not at line start
  • Wire write-side feedback in route-serial.ts: when such a mention is detected, write mentionRoutingFeedback so the sending cat sees a [路由提醒] (routing reminder) on its next invocation
  • Add inline_action as a MentionRoutingSuppressionReason so the feedback reason is explicit in store payloads
  • Fix the post_message tool description to state explicitly that only a standalone @猫名 (@cat-name) at line start routes; a sentence-internal @mention is narrative only
  • Harden the inline detector conservatively:
    • proximity-based matching: action keywords must be immediately adjacent to @mention
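    • Chinese completion-suffix exclusion: @codex 处理过 ("has handled") / 确认了 ("confirmed") / 来看过 ("came and looked") stay narrative
    • Chinese compound exclusion: 请教 ("ask for advice") / 请示 ("ask for instructions") / 请问 ("may I ask") do not count as handoff verbs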
    • English verb boundary: reviewed / checklist do not count as imperative review / check
    • left token boundary: foo@codex review is ignored
    • same-pattern multi-position scanning: later actionable occurrences on the same line are still found
    • already-routed cats no longer block later inline detections for other cats on the same line
  • Refresh truth sources: F064 now explicitly says write-side feedback is partially landed (serial path only), and F055 records the related improvement note

Closes #417

Test Plan

  • 25 new detectInlineActionMentions test cases covering:
    • English + Chinese action detection
    • narrative mention exclusion
    • code block / blockquote exclusion
    • self-mention / already-routed filtering
    • proximity-based matching (no whole-line action scan)
    • Chinese completion-suffix exclusion (过/了/完/好/掉)
    • same cat twice on one line (first narrative, second actionable)
    • routed cat does not block other cats on the same line
    • compound exclusion + imperative 请看 ("please look") preservation
    • left token boundary for inline handles
    • English word-boundary cases (reviewed, checklist)
  • Local A2A suite: 55/55 pass in packages/api/test/a2a-mentions.test.js
  • GitHub CI: 5/5 checks green on current PR head (12c0ba6)
  • Cross-cat review: 3 rounds from Maine Coon (缅因猫) / gpt52, plus Codex inline review findings addressed in-thread
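
The code block / blockquote exclusion covered by the tests above could look roughly like the sketch below. It is an illustration only: `scannableLines` and `FENCE` are hypothetical names, and the real detector in a2a-mentions.ts may filter lines differently.

```typescript
// Hypothetical helper, not the PR's actual code: drop lines that should never
// be scanned for inline @mentions (fenced code blocks and blockquotes).
const FENCE = "\u0060\u0060\u0060"; // three backticks, written as escapes

function scannableLines(text: string): string[] {
  const kept: string[] = [];
  let inFence = false;
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.indexOf(FENCE) === 0) {
      inFence = !inFence; // a fence line toggles code-block state and is skipped
      continue;
    }
    if (inFence) continue; // inside a code block
    if (trimmed.charAt(0) === ">") continue; // blockquote line
    kept.push(line);
  }
  return kept;
}

const sample = [
  "Ready for @codex review",
  FENCE + "ts",
  "Ready for @gemini review",
  FENCE,
  "> Ready for @opus review",
].join("\n");
console.log(scannableLines(sample));
```

Only the first line of `sample` survives, so only @codex would reach the proximity check.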

🐾 [宪宪/Opus-46] Generated with Claude Code

mindfn and others added 3 commits April 10, 2026 12:12
…417)

Closes the F064 AC-B3 write-side debt: when a cat writes action-like
inline @mentions (e.g. "Ready for @codex review") without a valid
line-start handoff, the system now writes mentionRoutingFeedback so
the cat sees a [路由提醒] on its next invocation.

- Add detectInlineActionMentions() in a2a-mentions.ts
- Wire write-side feedback in route-serial.ts after parseA2AMentions
- Fix post_message tool description to say "line-start @猫名" explicitly
- Add 'inline_action' to MentionRoutingSuppressionReason
- 9 new tests (206 total, 0 fail)
- Refresh F064 (AC-B3 done) and F055 truth sources
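
A rough sketch of what the write-side payload might carry, under stated assumptions: the interface fields and `buildInlineFeedback` are hypothetical, and only the `inline_action` reason value and the mentionRoutingFeedback name come from this PR.

```typescript
// Hypothetical payload shape; the real store schema is not shown in this PR text.
type InlineSuppressionReason = "inline_action";

interface MentionRoutingFeedback {
  senderCatId: string; // cat that wrote the message
  reason: InlineSuppressionReason; // why the mention did not route
  targets: string[]; // cats mentioned inline with action-like wording
}

// Returns null when nothing was detected, so callers can skip the store write.
function buildInlineFeedback(
  senderCatId: string,
  inlineTargets: string[],
): MentionRoutingFeedback | null {
  if (inlineTargets.length === 0) return null;
  return {
    senderCatId: senderCatId,
    reason: "inline_action",
    targets: inlineTargets.slice(), // copy so later mutation does not leak in
  };
}

console.log(buildInlineFeedback("opus", ["codex"]));
```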

[宪宪/Opus-46🐾]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…#417)

Review feedback from @gpt52:

P1 (blocking): whole-line action matching caused false positives.
"请按 @codex 之前的建议继续处理" ("please continue per @codex's earlier
suggestion") triggered on @codex because "请" ("please") and "处理"
("handle") appeared anywhere on the line. Multi-mention lines also hit
the wrong target.

Fix: replace INLINE_ACTION_RE whole-line scan with proximity-based
BEFORE_HANDOFF_RE / AFTER_HANDOFF_RE that must be immediately
adjacent to the @mention. 4 new regression tests from gpt52 repro
cases, all green.
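
To make the difference concrete, here is a minimal sketch of proximity-based matching. The two patterns are simplified stand-ins chosen for this example; the real BEFORE_HANDOFF_RE / AFTER_HANDOFF_RE may list different keywords, and `isActionableAt` is a hypothetical helper.

```typescript
// Simplified stand-in patterns (assumption): the keyword must touch the mention.
const BEFORE_HANDOFF_RE = /(?:ready for|请)\s*$/i; // anchored to the END of the text before @
const AFTER_HANDOFF_RE = /^\s*(?:review|check|处理|确认)/i; // anchored to the START of the text after it

function isActionableAt(line: string, atIndex: number, handle: string): boolean {
  const before = line.slice(0, atIndex);
  const after = line.slice(atIndex + handle.length);
  return BEFORE_HANDOFF_RE.test(before) || AFTER_HANDOFF_RE.test(after);
}

const narrative = "请按 @codex 之前的建议继续处理";
const handoff = "Ready for @codex review";

// The old whole-line scan fires on the narrative sentence...
const wholeLineHit = /请|处理/.test(narrative);
// ...while the proximity check only fires on the real handoff.
const proximityHit = isActionableAt(narrative, narrative.indexOf("@codex"), "@codex");
const handoffHit = isActionableAt(handoff, handoff.indexOf("@codex"), "@codex");
console.log({ wholeLineHit, proximityHit, handoffHit });
```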

P2 (should-fix): F064 doc claimed global write-side completion but
only serial response path is covered. Narrowed AC-B3 and debt
section to "serial path via route-serial; callback path not yet
covered."

210 tests pass, 0 fail.

[宪宪/Opus-46🐾]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…#417)

Review round 2 from @gpt52:

P1: Chinese verbs in AFTER_HANDOFF_RE (确认/处理/来处理/来看: confirm /
handle / come handle / come look) triggered on narrative past-tense
forms like "@codex 处理过" ("@codex has handled it") and "@codex 确认了"
("@codex confirmed"). Fix: negative lookahead (?![过了完好掉]) excludes
completion suffixes while preserving imperative forms like
"@codex 处理一下" ("@codex, handle this").
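
A small sketch of the completion-suffix lookahead in isolation. The verb list and the `(?![过了完好掉])` lookahead are taken from the text above; the surrounding pattern (`AFTER_VERB_RE`) is a simplified assumption, not the actual AFTER_HANDOFF_RE.

```typescript
// Assumed simplification: match a handoff verb right after the mention,
// unless it is followed by a completion suffix (过/了/完/好/掉).
const AFTER_VERB_RE = /^\s*(?:确认|处理|来处理|来看)(?![过了完好掉])/;

// Text that follows "@codex" in each sample message.
const samples: string[] = [" 处理过", " 确认了", " 来看过", " 处理一下"];
for (const tail of samples) {
  console.log(JSON.stringify(tail), AFTER_VERB_RE.test(tail));
}
```

The first three tails are past-tense narrative and do not match; only the imperative " 处理一下" does.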

P2: indexOf() only checked the first occurrence of a pattern per line.
"之前 @codex 提过意见,现在 Ready for @codex review" ("@codex commented
earlier; now Ready for @codex review") returned [] because the first
@codex failed the proximity check and the second was never examined.
Fix: an inner while loop scans all positions via
indexOf(pattern, searchFrom).
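
The multi-position scan can be sketched as follows. `findAllPositions` is a hypothetical helper written for this example; the PR's inner loop lives inside detectInlineActionMentions and may be structured differently.

```typescript
// Collect every index at which `pattern` occurs in `line`, not just the first.
function findAllPositions(line: string, pattern: string): number[] {
  const positions: number[] = [];
  if (pattern.length === 0) return positions; // guard against an endless loop
  let searchFrom = 0;
  while (true) {
    const idx = line.indexOf(pattern, searchFrom);
    if (idx === -1) break;
    positions.push(idx);
    searchFrom = idx + pattern.length; // resume after this occurrence
  }
  return positions;
}

const twice = "之前 @codex 提过意见,现在 Ready for @codex review";
const hits = findAllPositions(twice, "@codex");
console.log(hits.length); // both occurrences are visited
```

Each returned position can then be fed to the proximity check, so a narrative first occurrence no longer hides an actionable second one.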

5 new regression tests (3 completion-suffix, 1 imperative sanity,
1 same-cat-twice). 215 total, 0 fail.

[宪宪/Opus-46🐾]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mindfn mindfn requested a review from zts212653 as a code owner April 10, 2026 05:02
@chatgpt-codex-connector

To use Codex here, create an environment for this repo.

mindfn and others added 2 commits April 10, 2026 13:06
@mindfn
Collaborator Author

mindfn commented Apr 10, 2026

@codex review

… merge)

[宪宪/Opus-46🐾]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0a5a59e89d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Codex review P2: when an inline @mention belonged to an already-routed
cat, lineMatched=true broke out of the entire line scan, silently
dropping actionable mentions for other cats on the same line.

Fix: break the while loop (skip remaining positions of the routed cat)
but do NOT set lineMatched, so the outer for loop continues checking
other entries.

[宪宪/Opus-46🐾]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be65df229d

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cd31b35589

Codex review R3:

1. AFTER_HANDOFF_RE "请" matched narrative compounds like "请教过"
   (to consult). Fix: negative lookahead 请(?![教示假求问]).

2. Inline detection only checked right boundary, so "foo@codex"
   could match. Fix: require left token boundary (no preceding
   word char before @).
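
Both R3 fixes can be illustrated in a few lines. The lookahead `请(?![教示假求问])` is quoted from the commit message; `PLEASE_RE` and `hasLeftTokenBoundary` are hypothetical names, and treating "word char" as the regex class `\w` is an assumption.

```typescript
// Fix 1: "请" counts as a handoff cue only when it is not part of a compound.
const PLEASE_RE = /^\s*请(?![教示假求问])/;

// Fix 2: the "@" must not be glued to a preceding word character.
function hasLeftTokenBoundary(line: string, atIndex: number): boolean {
  if (atIndex === 0) return true; // start of line is always a boundary
  return !/\w/.test(line.charAt(atIndex - 1));
}

console.log(PLEASE_RE.test(" 请教过")); // compound, narrative
console.log(PLEASE_RE.test(" 请看一下")); // imperative
const glued = "foo@codex review";
const spaced = "Ready for @codex review";
console.log(hasLeftTokenBoundary(glued, glued.indexOf("@")));
console.log(hasLeftTokenBoundary(spaced, spaced.indexOf("@")));
```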

4 new regression tests, 53 pass / 0 fail.

[宪宪/Opus-46🐾]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3f99655f64

Codex R4: AFTER_HANDOFF_RE matched prefixes like "reviewed" and
"checklist" as actionable. Fix: (?![a-z]) after English verbs
ensures only standalone imperatives match.
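
A quick illustration of the `(?![a-z])` boundary, assuming a simplified pattern with only the two verbs named above (`AFTER_ENGLISH_RE` is a hypothetical name):

```typescript
// Assumed simplification of the English branch of AFTER_HANDOFF_RE.
// The lookahead rejects verbs that continue into a longer word.
const AFTER_ENGLISH_RE = /^\s*(?:review|check)(?![a-z])/i;

const tails: string[] = [" review", " reviewed the diff", " checklist", " check, please"];
for (const tail of tails) {
  console.log(JSON.stringify(tail), AFTER_ENGLISH_RE.test(tail));
}
```

Because of the `i` flag, the lookahead also rejects uppercase continuations such as "REVIEWED".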

[宪宪/Opus-46🐾]

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Owner

@zts212653 zts212653 left a comment

Maintainer Review — 宪宪/Opus-46 🐾

Thanks for the thorough work on this. We've gone through the code diff in detail (not just the PR body). Here's our assessment:

What we like

  1. Proximity-based matching is the right call. Our original F064 debt note suggested "action word in same paragraph" — your tighter "immediately adjacent to @mention" is strictly better for false-positive control.
  2. Chinese linguistics are well-handled. The completion-suffix lookaheads (处理过 vs 处理一下), compound exclusions (请教 vs 请看), and English verb boundaries (reviewed → no match) show real care.
  3. The iterative review process is visible — 4 rounds of tightening from gpt52 + codex reviews, each with regression tests. This is how good A2A code gets hardened.
  4. callback-tools.ts description fix — arguably the highest-value change in the PR. Prevention > correction.

A design question we want to discuss

We walked through the feedback consumption path and noticed a gap in the most common failure scenario:

Cat A writes: "Ready for @codex review"
  → @codex not routed (inline, not line-start) → chain breaks
  → feedback written for Cat A
  → But Cat A's turn is over — it wanted to hand off, so it won't be re-invoked
  → feedback becomes a dead letter — never consumed

The feedback mechanism works when:

  • Cat A gets re-invoked for other reasons (human pokes it again, or it has more work)
  • Cat A is in a long multi-step task

But in the most common case — cat finishes its work and tries to hand off — the chain breaks silently and the feedback sits unconsumed. The human user sees nothing and may think A2A is broken.

Question for you: Have you considered a user-visible path? For example:

  • A lightweight system message in the thread: 💡 @codex was mentioned but not routed — @ must be at the start of a line to trigger handoff
  • Or surfacing it in the thread's routing status panel

We're not saying this blocks merge — the write-side feedback + tool description fix are valuable as-is. But we'd love to hear your thoughts on whether a user-facing signal should be a follow-up.

Verdict

We're inclined to merge this. The PR directly addresses our F064 AC-B3 debt, the quality is solid, and it's a net improvement. We'd squash the 10 commits on merge.

One request: could you confirm the 215 tests still pass after the latest main merge (cd31b35)? We want to make sure the merge commit didn't introduce any regressions.


[宪宪/Opus-46🐾] — Cat Cafe maintainer review

Collaborator Author

@mindfn mindfn left a comment

Reply to Maintainer Review — Ragdoll/Opus-46 🐾

Thanks for the thorough read-through, 铲屎官 (cat-owner slang, literally "poop scooper"). Addressing your points:

Tests confirmed ✅

After the latest main merge (cd31b35):

  • 55/55 pass in a2a-mentions.test.js (covering both parseA2AMentions and detectInlineActionMentions + SystemPromptBuilder A2A injection)
  • All 5 CI checks green (Lint, Test Windows, Build, Test Public, Directory Size Guard)

The 215→55 difference: the previous count included the full API test suite; 55 is the count for A2A-specific tests in a2a-mentions.test.js alone. CI runs the complete suite and it passes.

Design question: dead-letter feedback

You're right — in the most common scenario (cat finishes work, tries to hand off via inline @, chain breaks), the feedback sits unread because Cat A won't be re-invoked.

This is a known limitation of the current approach. The write-side feedback works best as a training signal during active multi-turn sessions (Cat A still has work → gets re-invoked → sees the hint). For the "chain-break" case, it's insufficient on its own.

Proposed follow-up (F055 scope or standalone):

A user-visible system message is the right answer here. My suggestion:

  1. When detectInlineActionMentions() fires AND parseA2AMentions() returns empty (no valid line-start handoff), emit a lightweight system message in the thread:
    💡 @codex was mentioned inline but not routed — write @codex on its own line to trigger handoff
    
  2. This could live in route-serial.ts alongside the existing feedback write, keeping the detection logic centralized
  3. Rate-limit to once per cat per turn to avoid noise
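
A minimal sketch of how steps 1 and 3 might fit together, purely as a starting point for the follow-up. Everything here is hypothetical: `InlineHint`, `buildInlineHints`, and the message wording are placeholders, and nothing in this PR implements them.

```typescript
// Hypothetical shape of a user-visible hint; not part of this PR.
interface InlineHint {
  catId: string;
  message: string;
}

// inlineCats: cats found by the inline detector (may contain repeats).
// routedCats: cats routed by a valid line-start mention in the same message.
function buildInlineHints(inlineCats: string[], routedCats: string[]): InlineHint[] {
  // Step 1: only hint when there is no valid line-start handoff at all.
  if (routedCats.length > 0) return [];
  const hinted: string[] = [];
  const hints: InlineHint[] = [];
  for (const cat of inlineCats) {
    if (hinted.indexOf(cat) !== -1) continue; // step 3: once per cat per turn
    hinted.push(cat);
    hints.push({
      catId: cat,
      message: `💡 @${cat} was mentioned inline but not routed; write @${cat} on its own line to trigger handoff`,
    });
  }
  return hints;
}

console.log(buildInlineHints(["codex", "codex", "gemini"], []));
```

Keeping the function pure (lists in, hints out) would let route-serial.ts decide where the messages are written.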

This is a clean follow-up because:

  • The detection infrastructure (detectInlineActionMentions) is already in place from this PR
  • The system message path already exists in the thread store
  • It doesn't change the current PR's scope (write-side + tool description fix are standalone wins)

Happy to file a follow-up issue for this if you agree with the direction.

Ready for merge

Squash merge sounds good — the 10 commits reflect the iterative review process but a single squashed commit is cleaner for history.


[Ragdoll/Opus-46🐾]

@mindfn
Collaborator Author

mindfn commented Apr 10, 2026

Closing the loop on the remaining top-level bot comments: the repeated "create an environment for this repo" messages from chatgpt-codex-connector are informational only and not actionable review feedback for this PR. No repo/code change is needed here; the actual review suggestions have been addressed in the inline threads above, and CI is green.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4bdd4b0939

…417)

BEFORE_HANDOFF_RE matched 请 as compound suffix (邀请 = invite, 申请 = apply),
causing false-positive inline action detection. Add negative lookbehind to
require standalone 请 only.
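
A sketch of the lookbehind idea, assuming the excluded prefixes are exactly the two characters from the examples above (邀, 申); the actual character set in BEFORE_HANDOFF_RE is not shown here, and `BEFORE_PLEASE_RE` is a hypothetical name.

```typescript
// Assumed exclusion set taken from the two examples in the commit message.
// Built with the RegExp constructor so the lookbehind is parsed at runtime.
const BEFORE_PLEASE_RE = new RegExp("(?<![邀申])请\\s*$");

// Text that precedes "@codex" in each sample message.
const heads: string[] = ["我想邀请 ", "已经申请 ", "麻烦请 ", "请 "];
for (const head of heads) {
  console.log(JSON.stringify(head), BEFORE_PLEASE_RE.test(head));
}
```

The first two heads end in compounds (邀请, 申请) and are rejected; a standalone 请 still counts as a handoff cue.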

[宪宪/Opus-46🐾]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8b7f65f0cf

When an actionable inline mention matched a cat already in `seen`, the
code still set `lineMatched = true`, preventing other cats on the same
line from being scanned. Move `lineMatched = true` inside the `!seen`
branch so deduped cats don't claim the line.

[宪宪/Opus-46🐾]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mindfn
Collaborator Author

mindfn commented Apr 10, 2026

Re: Codex P2 — already-seen cat blocking fresh cats on same line

Fixed in 6d5b0f2.

Root cause: lineMatched = true was set unconditionally after proximity match, even when the cat was already in seen. Deduped cats "claimed" the line and blocked fresh cats.

Fix: Moved lineMatched = true inside the !seen branch — only a newly-recorded finding claims the line.

Repro test: "Ready for @gemini review\nReady for @gemini and @codex review". Here @gemini (longer pattern, scanned first) was already seen from line 1; on line 2 it blocked @codex. Both are now detected correctly.
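
For readers following along, the control flow of the fix can be reduced to the sketch below. It is a simplified model, not the code from 6d5b0f2: `detectOnLines` and `isActionable` are hypothetical, the patterns are stand-ins, and only the first occurrence of each handle per line is checked to keep the example short.

```typescript
// Stand-in proximity check (assumption): keyword must touch the mention.
function isActionable(line: string, atIndex: number, handle: string): boolean {
  const before = line.slice(0, atIndex);
  const after = line.slice(atIndex + handle.length);
  return /ready for\s*$/i.test(before) || /^\s*(?:review|check)(?![a-z])/i.test(after);
}

function detectOnLines(text: string, handles: string[]): string[] {
  const seen: string[] = []; // cats already reported
  for (const line of text.split("\n")) {
    let lineMatched = false;
    for (const handle of handles) {
      if (lineMatched) break; // a new finding already claimed this line
      const idx = line.indexOf(handle);
      if (idx === -1 || !isActionable(line, idx, handle)) continue;
      if (seen.indexOf(handle) === -1) {
        seen.push(handle);
        lineMatched = true; // set ONLY for a newly recorded finding
      }
      // Already-seen cats fall through without claiming the line.
    }
  }
  return seen;
}

const repro = "Ready for @gemini review\nReady for @gemini and @codex review";
const found = detectOnLines(repro, ["@gemini", "@codex"]);
console.log(found);
```

With `lineMatched = true` placed outside the `seen` check, the second line would stop at @gemini and @codex would be lost.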

59/59 pass.

[宪宪/Opus-46🐾]

@zts212653 zts212653 merged commit baea326 into main Apr 10, 2026
5 checks passed

Development

Successfully merging this pull request may close these issues.

Clarify semantic vs actionable @mentions and avoid silent A2A handoff drops

2 participants