feat(framework): self-hosted tool approval for the ai-sdk agent flavor fixes NV-8148 #11733
Draft
ChmaraX wants to merge 16 commits into
Co-authored-by: Cursor <cursoragent@cursor.com>
Export only AgentToolCall publicly; keep action-id and card helpers as package-internal imports. Co-authored-by: Cursor <cursoragent@cursor.com>
… resolve Co-authored-by: Cursor <cursoragent@cursor.com>
…proval handlers Map persisted approval cards and synthetic decision entries into AI SDK ModelMessage parts, and extend AiSdkAgentHandlers with onToolApproval and toolApproval config for the pause/resume adapter. Co-authored-by: Cursor <cursoragent@cursor.com>
Detect tool-approval-request in AI SDK results and post the approval card instead of text; wire onToolApproval with default auto-resume via history. Co-authored-by: Cursor <cursoragent@cursor.com>
…elf-hosted resume A self-hosted (stateless) agent reconstructs its resume message list from the conversation transcript, so an approval verdict must live there — not only in the ephemeral approval card. Record the decision as a `tool-approval-response` signal at ingestion (`handleAction`), the same seam that persists inbound messages, covering both runtimes from one place. The ai-sdk history mapper now reads the verdict from that persisted signal instead of an in-memory push, removing the local/persisted history desync. Co-authored-by: Cursor <cursoragent@cursor.com>
… of action ids Shrink approval action ids to routing keys to stay within platform limits, carry the full tool-call record via reply.toolApproval into message richContent, and unify resume reconstruction across vanilla and ai-sdk paths. Co-authored-by: Cursor <cursoragent@cursor.com>
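A routing-key action id can be illustrated with a small sketch. It assumes the `tool-approval:{verdict}:{approvalId}` grammar given in the PR summary and that the verdict is spelled `approve` or `deny`. The names `formatApprovalActionId` and `parseApprovalActionIdSketch` are hypothetical and are not the package's actual helpers.

```typescript
// Hypothetical sketch of the action-id grammar; not the framework's real helpers.
type Verdict = "approve" | "deny"; // assumed verdict spellings

const PREFIX = "tool-approval";

// Build a short routing key: the full tool-call record travels elsewhere.
function formatApprovalActionId(verdict: Verdict, approvalId: string): string {
  return `${PREFIX}:${verdict}:${approvalId}`;
}

// Parse a routing key back into its parts, or return undefined if it does not match.
function parseApprovalActionIdSketch(
  actionId: string,
): { verdict: Verdict; approvalId: string } | undefined {
  const parts = actionId.split(":");
  if (parts.length < 3 || parts[0] !== PREFIX) return undefined;
  const verdict = parts[1];
  if (verdict !== "approve" && verdict !== "deny") return undefined;
  // Re-join the tail so approval ids that contain ":" survive a round trip.
  const approvalId = parts.slice(2).join(":");
  if (approvalId.length === 0) return undefined;
  return { verdict, approvalId };
}
```

Keeping only the verdict and an id in the key is what lets it stay short; anything larger would have to ride in the message payload instead.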
Reconstruct tool-call/approval-request/response only for the approval whose decision is the latest entry; collapse settled cycles to the agent's reply so later turns don't replay a dangling tool_use (Anthropic 400) or re-execute the tool. Clear the typing indicator when a tool run returns empty text, and fix the default approval card's Actions() call. Co-authored-by: Cursor <cursoragent@cursor.com>
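The "latest decision wins" rule can be sketched over a simplified transcript. The `Entry` shape and both function names below are assumptions made for illustration; the real history mapper works on the framework's own history and AI SDK message types.

```typescript
// Simplified, hypothetical transcript entries (not the framework's real types).
type Entry =
  | { kind: "message"; role: "user" | "assistant"; text: string }
  | { kind: "approval-request"; approvalId: string; toolName: string }
  | { kind: "approval-decision"; approvalId: string; approved: boolean };

// An approval is in flight only when its decision is the very last entry.
function inFlightApprovalId(history: Entry[]): string | undefined {
  const last = history[history.length - 1];
  return last !== undefined && last.kind === "approval-decision"
    ? last.approvalId
    : undefined;
}

// Keep plain messages plus the in-flight cycle; drop settled approval cycles,
// whose outcome is already represented by the agent's reply that followed them.
function collapseSettledCycles(history: Entry[]): Entry[] {
  const inFlight = inFlightApprovalId(history);
  return history.filter(
    (entry) => entry.kind === "message" || entry.approvalId === inFlight,
  );
}
```

Under these assumptions, an older approve/deny cycle never reappears as tool-call parts on later turns, which is the property the commit describes.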
Declare AiSdkResult by the fields the adapter reads instead of the SDK's StreamTextResult, whose toolset generic is invariant — returning streamText() no longer needs `as unknown as AiSdkResult`. Co-authored-by: Cursor <cursoragent@cursor.com>
…pprovals Reconstructing a denied in-flight approval cycle emitted a bare `tool-approval-response`, which providers drop, leaving the `tool_use` without a matching `tool_result` and causing Anthropic to reject the request. streamText only auto-synthesizes the denial result when the tool message is last, so appending handler context broke it. Map a denied decision to an `execution-denied` tool-result instead, so the cycle is self-consistent regardless of what the handler appends. Co-authored-by: Cursor <cursoragent@cursor.com>
…provals The denial fix only rescued a handler that appends a message after toModelMessages(...), which is the same anti-pattern that fails on the approve path. Steering belongs in the `system` option so the tool-approval-response stays last and the AI SDK resolves it itself (execute on approve, mark denied on deny). Removing it keeps both paths consistent and simplifies the mapper. Co-authored-by: Cursor <cursoragent@cursor.com>
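The difference between steering through `system` and appending a message can be shown with a minimal sketch. The `Msg` and `Part` shapes are simplified stand-ins for AI SDK model messages, and `buildResumeRequest` and `endsWithApprovalResponse` are hypothetical names used only here.

```typescript
// Simplified stand-ins for model messages (assumed shapes, not the AI SDK's types).
type Part = { type: string };
type Msg = { role: "system" | "user" | "assistant" | "tool"; content: Part[] };

// True when the final message carries the approval response.
function endsWithApprovalResponse(messages: Msg[]): boolean {
  const last = messages[messages.length - 1];
  return (
    last !== undefined &&
    last.content.some((part) => part.type === "tool-approval-response")
  );
}

// Steering goes into `system`; the reconstructed message list is passed through unchanged.
function buildResumeRequest(
  messages: Msg[],
  steering: string,
): { system: string; messages: Msg[] } {
  return { system: steering, messages: [...messages] };
}
```

A handler that instead pushes a user message onto the list makes `endsWithApprovalResponse` return false, which is the shape the commit identifies as the anti-pattern.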
When a turn gates multiple tools, post only the first approval card, mirroring the managed runtime's sequential approval. The AI SDK keeps parallel tool-calls in a single assistant message, so a lone in-flight approval is the only shape toModelMessages can replay without leaving a tool_use unanswered. The resumed model re-drives remaining work. Co-authored-by: Cursor <cursoragent@cursor.com>
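Surfacing only the first gated tool can be sketched as a lookup over the result's content parts. The `ContentPart` union and its field names are simplified assumptions, and `firstApprovalRequest` is a hypothetical helper rather than the adapter's actual code.

```typescript
// Simplified, assumed content parts; the AI SDK's real part types carry more fields.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "tool-approval-request"; approvalId: string; toolName: string };

type ApprovalRequestPart = Extract<ContentPart, { type: "tool-approval-request" }>;

// Return the first approval request in document order, ignoring any later ones.
function firstApprovalRequest(parts: ContentPart[]): ApprovalRequestPart | undefined {
  return parts.find(
    (part): part is ApprovalRequestPart => part.type === "tool-approval-request",
  );
}
```

Any remaining gated tools are not lost under this scheme: per the commit message, the resumed model is expected to request them again on a later turn.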
✅ Deploy Preview for dashboard-v2-novu-staging ready!
…istory Persist executed tool results from AI SDK response messages on approval resume so multi-tool turns replay valid transcripts, reconstruct conversation history from activity ledger data, and group approval cycle events in the dashboard timeline using the existing InlineLogRow layout. Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
Adds self-hosted (vanilla + `@novu/framework/ai-sdk`) tool approval so a gated tool call pauses the turn, posts an Approve/Deny card, and resumes statelessly on the click, mirroring the managed runtime with no server-side session.
- `packages/framework` core: action-id grammar (`tool-approval:{verdict}:{approvalId}`), `richContent.toolApproval` payload codec, default + resolved approval cards, and click routing to `onToolApproval`.
- Detects `tool-approval-request`, posts the card, ends the turn, and auto-resumes by re-running `onMessage` (`toModelMessages(ctx.history)` rebuilds the request body). `onToolApproval` is optional: return `void` to auto-resume or an `AiSdkResult` to drive it yourself.
- `apps/api/.../conversation-runtime`: persists the approval decision to the transcript so the stateless resume can reconstruct the in-flight cycle.

Flow
sequenceDiagram
    participant U as User
    participant N as Novu (adapter)
    participant M as Model (streamText)
    U->>N: "refund $250 for A1"
    N->>M: onMessage → streamText(tools)
    M-->>N: tool-approval-request (needsApproval)
    N-->>U: Approve / Deny card (one per turn)
    U->>N: click Approve/Deny
    N->>N: persist decision to history
    N->>M: resume → toModelMessages(history)
    M-->>N: executes (approve) / declines (deny) → reply
    N-->>U: result

Contract for `onToolApproval` resume
One rule for both paths: keep the reconstructed `tool-approval-response` as the last message; steer with the `system` option, not a pushed message. The AI SDK resolves the approval itself (execute on approve, mark denied on deny) only while that part is last. Appending a message after `toModelMessages(...)` leaves a `tool_use` with no `tool_result`, and the provider (e.g. Anthropic) rejects it; this is symmetric on approve and deny.

Multiple gated tools in one turn
Only the first approval card is surfaced per turn (sequential approval, matching the managed runtime's `pendingApprovalTools[0]`). The AI SDK keeps parallel tool-calls in a single assistant message, so a lone in-flight approval is the only shape `toModelMessages` can replay without leaving a `tool_use` unanswered; the resumed model re-drives any remaining work.

Test plan
- `cd packages/framework && pnpm test` (unit: history-mapper, reply-mapper, ai-sdk-agent, action-id, agent)
- `pnpm build` for `@novu/framework` (exports + no circular deps)

Notes
- `@novu/framework/ai-sdk` (`toModelMessages`, tool-approval types/kit).

Made with Cursor
Greptile Summary
This PR adds stateless tool-approval support for self-hosted (`@novu/framework` vanilla + AI SDK adapter): when the model requests a gated tool, the framework posts an Approve/Deny card, persists the decision to the conversation transcript, and auto-resumes by re-running `onMessage` with `toModelMessages(ctx.history)`, mirroring the managed runtime without server-side session state.
- `agent.context.ts`, `agent-dispatch.ts`, `agent.types.ts`: new `ToolApprovalControl`, `PendingApproval` sentinel, `emitToolResult` flush mechanism, `ToolApprovalConfig` customisation hook, and `tool-approval:*` action-id grammar.
- `reply-mapper.ts`, `history-mapper.ts`, `ai-sdk-agent.ts`: `handleResult` detects `tool-approval-request` content parts, posts the first card (sequential gating), collects executed tool results from `steps`/`response.messages`, and the `resume` helper re-invokes `onMessage` with the updated history.
- `conversation-runtime`, `dal`: three new activity types with a first-class `toolData` column; `inbound-turn.handler` persists decisions before dispatch so the bridge sees complete history.
- `conversation-timeline-grouping.ts`: groups approval cycles into a single timeline entry.

Confidence Score: 5/5
Safe to merge; the feature is entirely additive with no breaking changes to existing agents, and the stateless resume logic is well-tested end-to-end.
The core resume contract (persist decision → rebuild history → re-run onMessage) is sound and backed by comprehensive unit tests covering approved/denied/chained/multi-tool cycles. New activity types and the toolData column are purely additive. The two findings are data-quality nits that do not affect the correctness of the approval flow or the durable transcript.
history-mapper.ts for the executionDeniedResult tool-name omission, and inbound-turn.handler.ts for the managed-agent double-write concern raised in the prior review round.
Important Files Changed
- `executionDeniedResult` hardcodes `toolName: 'tool'`.
- `handleResult`, `emitExecutedToolResults`, and `postApprovalCard`. The `AiSdkContext` interface cleanly scopes the internal cast to one boundary.
- `onToolApproval` calls user handler then auto-resumes via `resume(ctx)`. The double cast on `ctx` is intentional and safe.
- `onAction` to `onToolApproval` when action ID matches `tool-approval:*`; falls back to `resolvedApprovalCard` if the handler doesn't edit the card.
- `ToolApprovalControl`, `emitToolResult`, `createReplyHandle`, and widens `reply()` to accept `toolApproval` payload; tool results are flushed with the next reply or in `flush()`.
- `recordApprovalVerdict` which persists approval decisions before dispatch. Delegates to both self-hosted and managed parsers; managed approvals also write a `TOOL_APPROVAL_DECISION` activity (noted in previous review).
- `persistToolApprovalRequest`, `persistToolApprovalDecision`, and `persistToolResult` through `createToolActivity`. Well-structured persistence layer additions.
- `splitReplyPersistence` to separate renderable `richContent` from `toolData`; routes approval-carrying messages to `persistToolApprovalRequest`.
- `!== SIGNAL` to `=== MESSAGE`, intentionally excluding edit/update/tool activity types. Marked with a TODO.
- `ConversationActivityTypeEnum` values and `ConversationActivityToolData` interface with per-type field documentation.
- `ToolResultDto` (properly validated with `@ValidateNested`) and `toolApproval?: Record<string, unknown>` on `ReplyContentDto`.

Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant U as User
    participant IH as InboundTurnHandler
    participant DB as ConversationActivity DB
    participant FR as Framework (AgentContextImpl)
    participant RM as ReplyMapper
    participant M as Model (streamText)
    participant OG as OutboundGateway
    U->>IH: message
    IH->>FR: dispatch(onMessage)
    FR->>M: streamText(tools)
    M-->>FR: tool-approval-request
    FR->>RM: handleResult
    RM->>OG: ctx.reply(card, toolApproval)
    OG->>DB: TOOL_APPROVAL_REQUEST
    OG-->>U: Approve/Deny card
    U->>IH: click Approve
    IH->>DB: TOOL_APPROVAL_DECISION
    IH->>FR: dispatch(onAction)
    FR->>FR: resume(ctx)
    FR->>M: streamText(toModelMessages(history))
    M-->>FR: tool result + text
    RM->>FR: emitToolResult
    FR->>OG: ctx.reply(text) + toolResults
    OG->>DB: TOOL_RESULT + MESSAGE
    OG-->>U: reply

Comments Outside Diff (1)
`apps/api/src/app/agents/conversation-runtime/ingress/inbound-turn.handler.ts`, line 109-140
`recordApprovalVerdict` writes signal entries for managed-agent approvals too
`parseApprovalVerdict` delegates to both `parseApprovalActionId` (self-hosted `tool-approval:*` prefix) and `parseToolApprovalActionId` (managed `mcp-approval:*` / `direct-approval:*`). This means every managed-agent approval click now also writes a `tool-approval-response` signal activity into the transcript, in addition to whatever the managed runtime persists via its own state machine. Managed agents don't rely on `toModelMessages` for stateless resume, so the extra writes are pure noise and could surface unexpectedly if `toModelMessages` is ever called on that history (the orphan signal would influence `inFlightApprovalId`). Consider guarding `persistToolApprovalDecision` to only fire for self-hosted action IDs (`parseApprovalActionId` only).
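The suggested guard could look roughly like the sketch below. The three prefixes are taken from the comment above; `shouldPersistApprovalDecision` is a hypothetical name, and the real handler would presumably go through `parseApprovalActionId` rather than a raw prefix check.

```typescript
// Prefixes as named in the review comment; the function itself is a hypothetical sketch.
const SELF_HOSTED_PREFIX = "tool-approval:";
const MANAGED_PREFIXES = ["mcp-approval:", "direct-approval:"];

// Persist a transcript decision only for self-hosted approval clicks.
function shouldPersistApprovalDecision(actionId: string): boolean {
  return actionId.startsWith(SELF_HOSTED_PREFIX);
}

// Convenience check used to show that managed ids are excluded by the guard.
function isManagedApprovalActionId(actionId: string): boolean {
  return MANAGED_PREFIXES.some((prefix) => actionId.startsWith(prefix));
}
```

With a guard of this kind, managed-agent clicks would skip the extra signal write while self-hosted clicks keep the transcript entry that stateless resume depends on.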
Reviews (2): Last reviewed commit: "feat(agents): wire self-hosted tool appr..."