feat(framework): self-hosted tool approval for the ai-sdk agent flavor fixes NV-8148 #11733

Draft
ChmaraX wants to merge 16 commits into next from nv-8148-self-hosted-tool-approval-flow-novuframework-ai-sdk


ChmaraX commented Jun 30, 2026

Summary

Adds self-hosted (vanilla + @novu/framework/ai-sdk) tool approval so a gated tool call pauses the turn, posts an Approve/Deny card, and resumes statelessly on the click — mirroring the managed runtime, with no server-side session.

  • Shared kit (packages/framework core): action-id grammar (tool-approval:{verdict}:{approvalId}), richContent.toolApproval payload codec, default + resolved approval cards, and click routing to onToolApproval.
  • AI-SDK adapter: detects the SDK's tool-approval-request, posts the card, ends the turn, and auto-resumes by re-running onMessage (toModelMessages(ctx.history) rebuilds the request body). onToolApproval is optional — return void to auto-resume or an AiSdkResult to drive it yourself.
  • API persistence/routing (apps/api/.../conversation-runtime): persists the approval decision to the transcript so the stateless resume can reconstruct the in-flight cycle.
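The action-id grammar in the first bullet can be illustrated with a small codec. This is only a sketch: the helper names are hypothetical, and the verdict tokens `approve` and `deny` are assumed from the Approve/Deny card rather than taken from the PR's code.

```typescript
// Hypothetical sketch of the `tool-approval:{verdict}:{approvalId}` grammar.
// The real codec in packages/framework may use different names and tokens.
type Verdict = 'approve' | 'deny'; // assumed verdict tokens

const APPROVAL_PREFIX = 'tool-approval';

interface ParsedApprovalAction {
  verdict: Verdict;
  approvalId: string;
}

function buildApprovalActionIdSketch(verdict: Verdict, approvalId: string): string {
  return `${APPROVAL_PREFIX}:${verdict}:${approvalId}`;
}

function parseApprovalActionIdSketch(actionId: string): ParsedApprovalAction | null {
  const parts = actionId.split(':');
  if (parts.length < 3 || parts[0] !== APPROVAL_PREFIX) {
    return null; // not a self-hosted approval action
  }

  const verdict = parts[1];
  if (verdict !== 'approve' && verdict !== 'deny') {
    return null; // unknown verdict token
  }

  // Re-join the tail so an approvalId that itself contains ':' survives the round trip.
  const approvalId = parts.slice(2).join(':');
  if (approvalId.length === 0) {
    return null;
  }

  return { verdict, approvalId };
}
```

Keeping the id down to a routing key fits the later commit that moves the full tool-call record into `richContent.toolApproval`.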

Flow

sequenceDiagram
    participant U as User
    participant N as Novu (adapter)
    participant M as Model (streamText)

    U->>N: "refund $250 for A1"
    N->>M: onMessage → streamText(tools)
    M-->>N: tool-approval-request (needsApproval)
    N-->>U: Approve / Deny card (one per turn)
    U->>N: click Approve/Deny
    N->>N: persist decision to history
    N->>M: resume → toModelMessages(history)
    M-->>N: executes (approve) / declines (deny) → reply
    N-->>U: result

Contract for onToolApproval resume

One rule for both paths: keep the reconstructed tool-approval-response as the last message; steer with the system option, not a pushed message. The AI SDK resolves the approval itself (execute on approve, mark denied on deny) only while that part is last. Appending a message after toModelMessages(...) leaves a tool_use with no tool_result and the provider (e.g. Anthropic) rejects it — symmetric on approve and deny.
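The rule can be expressed as a guard that a handler might run before resuming. This is a minimal sketch with simplified, hypothetical message shapes (the AI SDK's real `ModelMessage` type is richer); it only shows that steering text travels in `system` while the rebuilt message list is passed through unchanged.

```typescript
// Simplified, hypothetical shapes; not the AI SDK's actual ModelMessage type.
interface SketchPart {
  type: string;
}

interface SketchMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | SketchPart[];
}

// True only when the final message carries a tool-approval-response part.
function endsWithApprovalResponse(messages: SketchMessage[]): boolean {
  const last = messages[messages.length - 1];
  if (!last || typeof last.content === 'string') {
    return false;
  }

  return last.content.some((part) => part.type === 'tool-approval-response');
}

// Builds the options a handler would hand to the model call:
// steering goes in `system`, and nothing is appended to `messages`.
function buildResumeRequestSketch(history: SketchMessage[], steering: string) {
  if (!endsWithApprovalResponse(history)) {
    throw new Error('tool-approval-response must be the last message');
  }

  return { system: steering, messages: history };
}
```

A handler that pushes an extra user message after the rebuilt history would fail this guard, which is the same shape the provider rejects.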

Multiple gated tools in one turn

Only the first approval card is surfaced per turn (sequential approval, matching the managed runtime's pendingApprovalTools[0]). The AI SDK keeps parallel tool-calls in a single assistant message, so a lone in-flight approval is the only shape toModelMessages can replay without leaving a tool_use unanswered; the resumed model re-drives any remaining work.
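Sequential gating amounts to surfacing the first `tool-approval-request` part and ignoring the rest for this turn. A minimal sketch, assuming a hypothetical part shape whose `approvalId` and `toolName` fields are illustrative only:

```typescript
// Hypothetical part shape; field names are assumptions for illustration.
interface ResultPart {
  type: string;
  approvalId?: string;
  toolName?: string;
}

// Returns the single approval request to surface this turn, if any.
// Later requests are left for the resumed model to re-drive.
function pickApprovalToSurface(parts: ResultPart[]): ResultPart | undefined {
  return parts.find((part) => part.type === 'tool-approval-request');
}

const exampleParts: ResultPart[] = [
  { type: 'text' },
  { type: 'tool-approval-request', approvalId: 'apr_1', toolName: 'refund' },
  { type: 'tool-approval-request', approvalId: 'apr_2', toolName: 'sendEmail' },
];

const surfaced = pickApprovalToSurface(exampleParts);
console.log(surfaced?.approvalId);
```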

Test plan

  • cd packages/framework && pnpm test (unit: history-mapper, reply-mapper, ai-sdk-agent, action-id, agent)
  • pnpm build for @novu/framework (exports + no circular deps)
  • Manual: gated tool → approve executes; deny declines gracefully; multi-tool request surfaces cards sequentially

Notes

  • No breaking changes; new exports only (@novu/framework/ai-sdk toModelMessages, tool-approval types/kit).
  • Community edition unaffected — this is additive framework API.

Made with Cursor

Greptile Summary

This PR adds stateless tool-approval support for self-hosted (@novu/framework vanilla + AI SDK adapter): when the model requests a gated tool, the framework posts an Approve/Deny card, persists the decision to the conversation transcript, and auto-resumes by re-running onMessage with toModelMessages(ctx.history) — mirroring the managed runtime without server-side session state.

  • Framework core (agent.context.ts, agent-dispatch.ts, agent.types.ts): new ToolApprovalControl, PendingApproval sentinel, emitToolResult flush mechanism, ToolApprovalConfig customisation hook, and tool-approval:* action-id grammar.
  • AI SDK adapter (reply-mapper.ts, history-mapper.ts, ai-sdk-agent.ts): handleResult detects tool-approval-request content parts, posts the first card (sequential gating), collects executed tool results from steps/response.messages, and the resume helper re-invokes onMessage with the updated history.
  • API persistence (conversation-runtime, dal): three new activity types with a first-class toolData column; inbound-turn.handler persists decisions before dispatch so the bridge sees complete history.
  • Dashboard (conversation-timeline-grouping.ts): groups approval cycles into a single timeline entry.

Confidence Score: 5/5

Safe to merge; the feature is entirely additive with no breaking changes to existing agents, and the stateless resume logic is well-tested end-to-end.

The core resume contract (persist decision → rebuild history → re-run onMessage) is sound and backed by comprehensive unit tests covering approved/denied/chained/multi-tool cycles. New activity types and the toolData column are purely additive. The two findings are data-quality nits that do not affect the correctness of the approval flow or the durable transcript.

Files to review closely: history-mapper.ts for the executionDeniedResult tool-name omission, and inbound-turn.handler.ts for the managed-agent double-write concern raised in the prior review round.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/framework/src/ai-sdk/history-mapper.ts | Core mapper from Novu history to AI SDK ModelMessage[]; reconstructs tool-call/approval/denial message sequences for stateless resume. Logic is sound but executionDeniedResult hardcodes toolName: 'tool'. |
| packages/framework/src/ai-sdk/reply-mapper.ts | Adds handleResult, emitExecutedToolResults, and postApprovalCard. The AiSdkContext interface cleanly scopes the internal cast to one boundary. |
| packages/framework/src/ai-sdk/ai-sdk-agent.ts | Wires the AI SDK adapter to the framework: onToolApproval calls user handler then auto-resumes via resume(ctx). The double cast on ctx is intentional and safe. |
| packages/framework/src/resources/agent/agent-dispatch.ts | Routes onAction to onToolApproval when action ID matches tool-approval:*; falls back to resolvedApprovalCard if the handler doesn't edit the card. |
| packages/framework/src/resources/agent/agent.context.ts | Adds ToolApprovalControl, emitToolResult, createReplyHandle, and widens reply() to accept toolApproval payload; tool results are flushed with the next reply or in flush(). |
| apps/api/src/app/agents/conversation-runtime/ingress/inbound-turn.handler.ts | Adds recordApprovalVerdict which persists approval decisions before dispatch. Delegates to both self-hosted and managed parsers, so managed approvals also write a TOOL_APPROVAL_DECISION activity (noted in previous review). |
| apps/api/src/app/agents/conversation-runtime/conversation/agent-conversation.service.ts | Adds persistToolApprovalRequest, persistToolApprovalDecision, and persistToolResult through createToolActivity. Well-structured persistence layer additions. |
| apps/api/src/app/agents/conversation-runtime/egress/outbound.gateway.ts | Introduces splitReplyPersistence to separate renderable richContent from toolData; routes approval-carrying messages to persistToolApprovalRequest. |
| apps/api/src/app/agents/managed-runtime/managed-agent.service.ts | Changes history filter from !== SIGNAL to === MESSAGE, intentionally excluding edit/update/tool activity types. Marked with a TODO. |
| apps/dashboard/src/components/conversations/conversation-timeline-grouping.ts | New file: groups timeline activities into approval cycles and tool-progress groups; fallback for unlinked activities as standalone entries is correct. |
| libs/dal/src/repositories/conversation-activity/conversation-activity.entity.ts | Adds three new ConversationActivityTypeEnum values and ConversationActivityToolData interface with per-type field documentation. |
| apps/api/src/app/agents/shared/dtos/agent-reply-payload.dto.ts | Adds ToolResultDto (properly validated with @ValidateNested) and toolApproval?: Record<string, unknown> on ReplyContentDto. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant U as User
    participant IH as InboundTurnHandler
    participant DB as ConversationActivity DB
    participant FR as Framework (AgentContextImpl)
    participant RM as ReplyMapper
    participant M as Model (streamText)
    participant OG as OutboundGateway

    U->>IH: message
    IH->>FR: dispatch(onMessage)
    FR->>M: streamText(tools)
    M-->>FR: tool-approval-request
    FR->>RM: handleResult
    RM->>OG: ctx.reply(card, toolApproval)
    OG->>DB: TOOL_APPROVAL_REQUEST
    OG-->>U: Approve/Deny card

    U->>IH: click Approve
    IH->>DB: TOOL_APPROVAL_DECISION
    IH->>FR: dispatch(onAction)
    FR->>FR: resume(ctx)
    FR->>M: streamText(toModelMessages(history))
    M-->>FR: tool result + text
    RM->>FR: emitToolResult
    FR->>OG: ctx.reply(text) + toolResults
    OG->>DB: TOOL_RESULT + MESSAGE
    OG-->>U: reply

Comments Outside Diff (1)

  1. apps/api/src/app/agents/conversation-runtime/ingress/inbound-turn.handler.ts, lines 109-140

    P2 recordApprovalVerdict writes signal entries for managed-agent approvals too

    parseApprovalVerdict delegates to both parseApprovalActionId (self-hosted tool-approval:* prefix) and parseToolApprovalActionId (managed mcp-approval:* / direct-approval:*). This means every managed-agent approval click now also writes a tool-approval-response signal activity into the transcript — in addition to whatever the managed runtime persists via its own state machine. Managed agents don't rely on toModelMessages for stateless resume, so the extra writes are pure noise and could surface unexpectedly if toModelMessages is ever called on that history (the orphan signal would influence inFlightApprovalId). Consider guarding persistToolApprovalDecision to only fire for self-hosted action IDs (parseApprovalActionId only).
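The suggested guard could look roughly like the following. This is a sketch only: the function names and the `persist` callback are hypothetical stand-ins, not the handler's actual signature, and the prefix check stands in for whatever `parseApprovalActionId` really does.

```typescript
// Hypothetical guard: only self-hosted `tool-approval:*` clicks are persisted.
const SELF_HOSTED_PREFIX = 'tool-approval:';

function isSelfHostedApprovalAction(actionId: string): boolean {
  return actionId.startsWith(SELF_HOSTED_PREFIX);
}

// `persist` is a stand-in for the real persistence call.
// Returns true when a decision was written to the transcript.
function recordVerdictSketch(actionId: string, persist: (actionId: string) => void): boolean {
  if (!isSelfHostedApprovalAction(actionId)) {
    return false; // managed `mcp-approval:*` / `direct-approval:*` ids are skipped
  }

  persist(actionId);

  return true;
}
```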


Reviews (2): Last reviewed commit: "feat(agents): wire self-hosted tool appr..."

ChmaraX and others added 15 commits June 30, 2026 11:04
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Export only AgentToolCall publicly; keep action-id and card helpers as package-internal imports.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
… resolve

Co-authored-by: Cursor <cursoragent@cursor.com>
…proval handlers

Map persisted approval cards and synthetic decision entries into AI SDK
ModelMessage parts, and extend AiSdkAgentHandlers with onToolApproval and
toolApproval config for the pause/resume adapter.

Co-authored-by: Cursor <cursoragent@cursor.com>
Detect tool-approval-request in AI SDK results and post the approval card
instead of text; wire onToolApproval with default auto-resume via history.

Co-authored-by: Cursor <cursoragent@cursor.com>
…elf-hosted resume

A self-hosted (stateless) agent reconstructs its resume message list from the
conversation transcript, so an approval verdict must live there — not only in
the ephemeral approval card. Record the decision as a `tool-approval-response`
signal at ingestion (`handleAction`), the same seam that persists inbound
messages, covering both runtimes from one place. The ai-sdk history mapper now
reads the verdict from that persisted signal instead of an in-memory push,
removing the local/persisted history desync.

Co-authored-by: Cursor <cursoragent@cursor.com>
… of action ids

Shrink approval action ids to routing keys to stay within platform limits,
carry the full tool-call record via reply.toolApproval into message richContent,
and unify resume reconstruction across vanilla and ai-sdk paths.

Co-authored-by: Cursor <cursoragent@cursor.com>
Reconstruct tool-call/approval-request/response only for the approval
whose decision is the latest entry; collapse settled cycles to the
agent's reply so later turns don't replay a dangling tool_use (Anthropic
400) or re-execute the tool. Clear the typing indicator when a tool run
returns empty text, and fix the default approval card's Actions() call.

Co-authored-by: Cursor <cursoragent@cursor.com>
Declare AiSdkResult by the fields the adapter reads instead of the SDK's
StreamTextResult, whose toolset generic is invariant — returning
streamText() no longer needs `as unknown as AiSdkResult`.

Co-authored-by: Cursor <cursoragent@cursor.com>
…pprovals

Reconstructing a denied in-flight approval cycle emitted a bare
`tool-approval-response`, which providers drop, leaving the `tool_use`
without a matching `tool_result` and causing Anthropic to reject the
request. streamText only auto-synthesizes the denial result when the
tool message is last, so appending handler context broke it.

Map a denied decision to an `execution-denied` tool-result instead, so
the cycle is self-consistent regardless of what the handler appends.

Co-authored-by: Cursor <cursoragent@cursor.com>
…provals

The denial fix only rescued a handler that appends a message after
toModelMessages(...), which is the same anti-pattern that fails on the
approve path. Steering belongs in the `system` option so the
tool-approval-response stays last and the AI SDK resolves it itself
(execute on approve, mark denied on deny). Removing it keeps both paths
consistent and simplifies the mapper.

Co-authored-by: Cursor <cursoragent@cursor.com>
When a turn gates multiple tools, post only the first approval card,
mirroring the managed runtime's sequential approval. The AI SDK keeps
parallel tool-calls in a single assistant message, so a lone in-flight
approval is the only shape toModelMessages can replay without leaving a
tool_use unanswered. The resumed model re-drives remaining work.

Co-authored-by: Cursor <cursoragent@cursor.com>
linear-code Bot commented Jun 30, 2026

NV-8148

netlify Bot commented Jun 30, 2026

Deploy Preview for dashboard-v2-novu-staging ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 26e2302 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/dashboard-v2-novu-staging/deploys/6a4536483f1b2d0008c60341 |
| 😎 Deploy Preview | https://deploy-preview-11733.dashboard-v2.novu-staging.co |
Comment thread packages/framework/src/ai-sdk/reply-mapper.ts Outdated
Comment thread packages/framework/src/resources/agent/agent-dispatch.ts
Comment thread apps/api/src/app/agents/shared/dtos/agent-reply-payload.dto.ts
…istory

Persist executed tool results from AI SDK response messages on approval resume so multi-tool turns replay valid transcripts, reconstruct conversation history from activity ledger data, and group approval cycle events in the dashboard timeline using the existing InlineLogRow layout.

Co-authored-by: Cursor <cursoragent@cursor.com>
