Conversation history returns MessageSelect[] and formats at call site by mike-inkeep · Pull Request #2743 · inkeep/agents

mike-inkeep · 2026-03-18T14:48:01Z

No description provided.

changeset-bot · 2026-03-18T14:48:06Z

⚠️ No Changeset found

Latest commit: a96c949

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

vercel · 2026-03-18T14:48:14Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agents-api	Ready	Preview, Comment	Mar 23, 2026 11:11pm
agents-docs	Ready	Preview, Comment	Mar 23, 2026 11:11pm
agents-manage-ui	Ready	Preview, Comment	Mar 23, 2026 11:11pm

pullfrog

The return-type refactor from string to MessageSelect[] is well-structured — separation of fetching/compressing vs. formatting is a good direction, and the multi-artifact grouping fix is correct. Three areas need attention before merging: a behavioral inconsistency between getFormattedConversationHistory and the new formatMessagesAsConversationHistory, a createdAt regression in the summary message, and missing test coverage for two new code paths.

pullfrog · 2026-03-18T14:51:33Z

agents-api/src/domains/run/data/conversations.ts

@@ -937,8 +947,13 @@ function formatMessagesAsConversationHistory(messages: any[]): string {
        roleLabel = msg.role || 'system';
      }

-      return `${roleLabel}: """${reconstructMessageText(msg)}"""`;
+      const reconstructedMessage = reconstructMessageText(msg);
+      if (!reconstructedMessage) {
+        return null;
+      }
+      return `${roleLabel}: """${reconstructedMessage}"""`;
    })
+    .filter((line): line is string => line !== null)
    .join('\n');

  return `<conversation_history>\n${formattedHistory}\n</conversation_history>\n`;


The newly exported formatMessagesAsConversationHistory differs from the inline formatting in getFormattedConversationHistory (around line 376) in two meaningful ways:

This function uses reconstructMessageText(msg) (handles multi-part content with artifact refs), while the other uses msg.content.text directly — silently dropping artifact reference tags from multi-part messages.

This function filters out messages with empty reconstructed text, while the other includes them (producing role: """""" entries).

Since getFormattedConversationHistory is still actively called from AgentSession.ts, these inconsistencies produce different conversation history formats depending on the code path. Consider refactoring getFormattedConversationHistory to delegate to this exported function.
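The delegation suggested above could look roughly like the following sketch. The `Msg` type, the simplified `reconstructText` helper, and the `fetchMessages` parameter are hypothetical stand-ins and not the repository's actual signatures; the only point is that both entry points share one formatter, so multi-part handling and empty-message filtering cannot drift apart.

```typescript
// Hypothetical, simplified message shape; the real MessageSelect has many more fields.
type Msg = {
  role: string;
  content: { text?: string; parts?: { text?: string }[] };
};

// Stand-in for reconstructMessageText: prefers multi-part content, falls back to text.
function reconstructText(msg: Msg): string {
  const parts = msg.content.parts;
  if (parts && parts.length > 0) {
    return parts.map((part) => part.text ?? '').join('');
  }
  return msg.content.text ?? '';
}

// Single source of truth for the history format.
function formatHistory(messages: Msg[]): string {
  const lines = messages
    .map((msg) => {
      const text = reconstructText(msg);
      return text ? `${msg.role}: """${text}"""` : null;
    })
    .filter((line): line is string => line !== null);
  return `<conversation_history>\n${lines.join('\n')}\n</conversation_history>\n`;
}

// The legacy entry point only fetches and then delegates; it no longer formats inline.
async function getFormattedHistory(fetchMessages: () => Promise<Msg[]>): Promise<string> {
  return formatHistory(await fetchMessages());
}
```

With this shape, any future change to the output format is made in one place and both code paths pick it up.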

pullfrog · 2026-03-18T14:51:33Z

packages/agents-core/src/data-access/runtime/conversations.ts

+        const referenceMessage = messageHistory[0];
+        const summaryMessage: MessageSelect = {
          id: `summary-${getConversationId()}`,
+          tenantId: referenceMessage.tenantId,
+          projectId: referenceMessage.projectId,
+          conversationId: referenceMessage.conversationId,
          role: 'system',
+          fromSubAgentId: null,
+          toSubAgentId: null,
+          fromExternalAgentId: null,
+          toExternalAgentId: null,
+          fromTeamAgentId: null,
+          toTeamAgentId: null,
          content: {
            text: `[Previous conversation history truncated - ${i + 1} earlier messages]`,
          },
          visibility: 'system',
          messageType: 'chat',
-          createdAt: messageHistory[0].createdAt,
+          taskId: null,
+          parentMessageId: null,
+          a2aTaskId: null,
+          a2aSessionId: null,
+          metadata: null,
+          createdAt: new Date().toISOString(),
+          updatedAt: new Date().toISOString(),


createdAt changed from messageHistory[0].createdAt to new Date().toISOString(). The summary message now has a timestamp after all the messages it summarizes, rather than being anchored to the start of the conversation. The array ordering is still correct (it's unshifted), but if any downstream code sorts or filters by createdAt, the summary would sort to the end. The previous behavior was more semantically correct — consider reverting to referenceMessage.createdAt.
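To make the sorting concern concrete, here is a small sketch. The `Stamped` type and the fixed timestamps are hypothetical; a fixed later instant stands in for `new Date().toISOString()` so the result is deterministic.

```typescript
// Hypothetical minimal shape; only the fields needed for the illustration.
type Stamped = { id: string; createdAt: string };

const m1: Stamped = { id: 'm1', createdAt: '2026-03-18T10:00:00.000Z' };
const m2: Stamped = { id: 'm2', createdAt: '2026-03-18T10:05:00.000Z' };

// Summary stamped "now" (the PR's behavior), modeled as a fixed later instant.
const summaryNow: Stamped = { id: 'summary', createdAt: '2026-03-18T11:00:00.000Z' };
// Summary anchored to the first message (the previous behavior).
const summaryAnchored: Stamped = { id: 'summary', createdAt: m1.createdAt };

// ISO-8601 UTC strings in the same format compare correctly as plain strings.
// Array.prototype.sort is stable, so equal timestamps keep their input order.
const byCreatedAt = (a: Stamped, b: Stamped): number =>
  a.createdAt < b.createdAt ? -1 : a.createdAt > b.createdAt ? 1 : 0;

const orderWithNow = [summaryNow, m1, m2].sort(byCreatedAt).map((m) => m.id);
const orderAnchored = [summaryAnchored, m1, m2].sort(byCreatedAt).map((m) => m.id);

console.log(orderWithNow); // ['m1', 'm2', 'summary']
console.log(orderAnchored); // ['summary', 'm1', 'm2']
```

The anchored variant keeps the summary first only because it is placed first in the input and the sort is stable; the "now" variant moves to the end under any sort by `createdAt`.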

pullfrog · 2026-03-18T14:51:33Z

agents-api/src/__tests__/run/data/conversations.artifact-replacement.test.ts

+    const result = await getConversationHistoryWithCompression(baseParams);
+    const toolResult = result.find((msg) => msg.messageType === 'tool-result');
+    const toolResultText = toolResult?.content?.text ?? '';
+
+    expect(toolResultText).toContain('id: art-1');
+    expect(toolResultText).toContain('id: art-2');


This test verifies both artifact IDs are present but doesn't assert the \n join between them. If the join separator were accidentally changed to empty string or space, this test would still pass. Consider:

expect(toolResultText).toMatch(/id: art-1[\s\S]*\n[\s\S]*id: art-2/);
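A quick standalone check of why the stricter pattern matters. The reference strings below are illustrative and only mimic the `Artifact: "..." (id: ...)` format from the diff; they are not taken from the test fixtures.

```typescript
const refs = ['Artifact: "A" (id: art-1)', 'Artifact: "B" (id: art-2)'];
const pattern = /id: art-1[\s\S]*\n[\s\S]*id: art-2/;

const joinedWithNewline = refs.join('\n');
const joinedWithSpace = refs.join(' ');

console.log(pattern.test(joinedWithNewline)); // true
console.log(pattern.test(joinedWithSpace)); // false

// Both strings still contain each id, so containment checks alone cannot tell them apart.
const containsBothIds =
  joinedWithSpace.includes('id: art-1') && joinedWithSpace.includes('id: art-2');
console.log(containsBothIds); // true
```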

pullfrog · 2026-03-18T14:51:33Z

agents-api/src/domains/run/data/conversations.ts

  return parts
    .map((part: any) => {
-      if (part.type === 'text') {
+      const partKind = part.kind ?? part.type;


The part.kind ?? part.type fallback handles both A2A protocol shape ({ kind: 'text' }) and legacy shape ({ type: 'text' }). The existing reconstructMessageText tests only cover { type: ... } parts — there's no test coverage for the { kind: ... } shape. Add a test case to verify the fallback works.
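A test for the fallback could be modeled on the following sketch. `TextishPart` and `textFromParts` are hypothetical simplifications of the text branch only; the real `reconstructMessageText` also handles artifact references and takes a full message.

```typescript
// Hypothetical part shape covering both variants discussed above.
type TextishPart = { kind?: string; type?: string; text?: string };

// Simplified stand-in for the text branch of reconstructMessageText.
function textFromParts(parts: TextishPart[]): string {
  return parts
    .map((part) => {
      // 'kind' wins when present; 'type' is only consulted as a fallback.
      const partKind = part.kind ?? part.type;
      return partKind === 'text' ? (part.text ?? '') : '';
    })
    .join('');
}

const fromKind = textFromParts([{ kind: 'text', text: 'hello' }]);
const fromType = textFromParts([{ type: 'text', text: 'hello' }]);
const fileOnly = textFromParts([{ kind: 'file' }]);
const kindWins = textFromParts([{ kind: 'file', type: 'text', text: 'ignored' }]);
```

Here `fromKind` and `fromType` both yield `'hello'`, while `fileOnly` and `kindWins` yield an empty string, which is the case that the new empty-message filter would drop.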

pullfrog · 2026-03-18T14:51:33Z

agents-api/src/domains/run/data/conversations.ts

+      const reconstructedMessage = reconstructMessageText(msg);
+      if (!reconstructedMessage) {
+        return null;
+      }
+      return `${roleLabel}: """${reconstructedMessage}"""`;
    })
+    .filter((line): line is string => line !== null)


formatMessagesAsConversationHistory now filters out messages where reconstructMessageText returns empty string — this is new behavior (previously all messages were formatted). A message with only non-text/non-artifact parts (e.g. file parts) would be silently dropped. This needs a direct test, and it's worth confirming this is the desired behavior for all message types.

claude

PR Review Summary

(1) Total Issues | Risk: Low

🟡 Minor (1) 🟡

Inline Comments:

🟡 Minor: conversations.ts:510 Args duplicated for each artifact sharing a toolCallId

💭 Consider (2) 💭

💭 1) conversations.ts:903 Defensive fallback for part.kind ?? part.type lacks documentation

Issue: The change from part.type to part.kind ?? part.type suggests two different part schemas exist, but there's no documentation of when each occurs.
Why: The canonical types use kind. If type is a legacy format, a comment would clarify intent.
Fix: Add inline comment explaining the variants, or log occurrences if type is unexpected.
Refs: utility.ts:91-96 (canonical kind-based schema)

💭 2) conversations.ts:950-956 Behavioral change - empty messages now filtered out

Issue: New implementation filters out messages with empty reconstructed text. Previously these appeared as roleLabel: """""".
Why: This is arguably an improvement but goes beyond the stated refactoring goal.
Fix: Document this as intentional and consider adding a test case.

Inline Comments:

💭 Consider: conversations.ts:903 Defensive kind ?? type fallback
💭 Consider: conversations.ts:950-956 Empty message filtering behavior change

💡 APPROVE WITH SUGGESTIONS

Summary: This is a well-structured refactoring that improves type safety by returning MessageSelect[] from getConversationHistoryWithCompression and moving formatting to call sites. The multi-artifact support for shared toolCallId values is a valuable fix. The one Minor issue (args duplication) is a small optimization opportunity, and the Consider items are documentation/testing suggestions rather than blocking concerns.

The PR is ready to merge. Nice work on improving the type boundaries! 🎉

Discarded (3)

Location	Issue	Reason Discarded
`conversations.artifact-replacement.test.ts:147`	Test doesn't verify newline-joining format	Nitpick - the core assertion (both artifacts present) is correct
`conversations.test.ts`	Missing tests for `kind` property in `reconstructMessageText`	Valid but downgraded - defensive code for compatibility, not a bug risk
`conversations.artifact-replacement.test.ts:35`	Test helper missing `MessageSelect` fields	Pre-existing pattern in test fixtures, low impact

Reviewers (5)

Reviewer	Returned	Consider	Inline Comments	Discarded
`pr-review-tests`	4	0	0	3
`pr-review-precision`	3	2	1	0
`pr-review-standards`	0	0	0	0
`pr-review-types`	0	0	0	0
`pr-review-errors`	0	0	0	0
Total	7	2	1	3

claude · 2026-03-18T14:54:00Z

agents-api/src/domains/run/data/conversations.ts

+            const artifactParts = [
+              `Artifact: "${artifact.name ?? artifact.artifactId}" (id: ${artifact.artifactId})`,
+            ];
+            if (argsStr) artifactParts.push(`args: ${argsStr}`);


🟡 Minor: Args duplicated for each artifact sharing a toolCallId

Issue: When multiple artifacts share the same toolCallId, the argsStr (computed once from message metadata at lines 492-495) gets appended to every artifact's reference string inside the map. Since all artifacts for the same tool call came from the same invocation, they have identical toolArgs, resulting in the same args: {...} appearing N times in the output.

Why: This unnecessarily bloats token count in the conversation history. For tool calls that produce multiple artifacts (e.g., multiple images or documents), the args string could be repeated 2-10x.

Fix: Consider moving args to a single prefix line before the artifact list:

const argsLine = argsStr ? `Tool call args: ${argsStr}\n` : '';
const artifactRefs = relatedArtifacts.map((artifact) => {
  // ... build artifact-specific parts WITHOUT args
});
return {
  ...msg,
  content: { text: argsLine + artifactRefs.join('\n') },
};

Refs:

Test showing multi-artifact case
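A runnable version of that suggestion, as a sketch only: `ArtifactRef` and `buildToolResultText` are hypothetical names, and the reference string keeps just the name and id portion shown in the diff.

```typescript
// Hypothetical artifact shape; only the fields used in the reference string.
type ArtifactRef = { artifactId: string; name?: string | null };

function buildToolResultText(relatedArtifacts: ArtifactRef[], argsStr: string): string {
  // Emit the shared args once, ahead of the per-artifact lines.
  const argsLine = argsStr ? `Tool call args: ${argsStr}\n` : '';
  const artifactRefs = relatedArtifacts.map(
    (artifact) =>
      `Artifact: "${artifact.name ?? artifact.artifactId}" (id: ${artifact.artifactId})`
  );
  return argsLine + artifactRefs.join('\n');
}

const toolResultText = buildToolResultText(
  [{ artifactId: 'art-1', name: 'Report' }, { artifactId: 'art-2' }],
  '{"query":"q"}'
);
console.log(toolResultText);
```

The args string appears once regardless of how many artifacts the tool call produced, so the output grows by one short line per artifact.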

claude · 2026-03-18T14:54:05Z

agents-api/src/domains/run/data/conversations.ts

  return parts
    .map((part: any) => {
-      if (part.type === 'text') {
+      const partKind = part.kind ?? part.type;


💭 Consider: Defensive fallback for part.kind ?? part.type

Issue: This fallback suggests there are two different part schemas in play: the canonical MessageContent.parts type (which uses kind per the schema definitions) and some other format that uses type. The fallback is reasonable defensive code, but there's no documentation explaining when each variant occurs.

Why: Without clarity on the source of type-keyed parts, this could mask upstream inconsistencies rather than solve a real polymorphism need. The canonical types in utility.ts and a2a.ts both use kind.

Fix: If both are legitimate variants (e.g., legacy data or external formats), add a brief inline comment:

// Support both 'kind' (canonical schema) and 'type' (legacy/external format)
const partKind = part.kind ?? part.type;

If type is never expected, consider logging when encountered to track occurrences.

claude · 2026-03-18T14:54:08Z

agents-api/src/domains/run/data/conversations.ts

+      const reconstructedMessage = reconstructMessageText(msg);
+      if (!reconstructedMessage) {
+        return null;
+      }
+      return `${roleLabel}: """${reconstructedMessage}"""`;
    })
+    .filter((line): line is string => line !== null)


💭 Consider: Behavioral change - empty messages now filtered out

Issue: The new implementation filters out messages with empty reconstructed text (returning null then filtering). The previous implementation would include these as roleLabel: """""". This is arguably an improvement, but goes beyond the stated refactoring goal.

Why: This could affect downstream consumers that rely on message boundaries (e.g., compression logic counting messages) or debugging traces. If intentional, it's worth documenting.

Fix: If intentional, consider adding a test case validating the filtering behavior and noting this in the PR description as a deliberate improvement.

pullfrog · 2026-03-23T23:10:01Z

TL;DR — Shifts getConversationHistoryWithCompression from returning a pre-formatted string to returning raw MessageSelect[], moving the formatting responsibility to the call site via the now-exported formatMessagesAsConversationHistory. Also fixes a data-loss bug where multiple artifacts sharing a toolCallId collapsed to a single reference.

Key changes

getConversationHistoryWithCompression returns MessageSelect[] instead of string — call sites in buildConversationHistory now explicitly call formatMessagesAsConversationHistory to produce the final string, decoupling data retrieval from presentation.
Fix multi-artifact toolCallId grouping — the artifact-replacement logic now collects all artifacts per toolCallId into an array instead of keeping only the last one, preserving every reference in the compressed output.
Export formatMessagesAsConversationHistory — promoted from a private function to a public export so call sites can format MessageSelect[] on demand; also filters out messages that reconstruct to empty strings.
Replace any with MessageSelect across history APIs — getScopedHistory, applyContextWindowManagement, getConversationHistory, and related locals are now properly typed.
Fix reconstructMessageText part-kind detection — checks part.kind before falling back to part.type, aligning with the MessageSelect content shape.

Summary ｜ 6 files ｜ 2 commits ｜ base: main ← stack/history_shape

Return structured messages instead of pre-formatted strings

Before: getConversationHistoryWithCompression called formatMessagesAsConversationHistory internally and returned a string.
After: It returns MessageSelect[]; buildConversationHistory calls the formatter explicitly.

This decouples history retrieval from serialization, letting future consumers of the history data (e.g. token-counting, streaming, or alternate formats) operate on structured messages without re-parsing a string. The summary placeholder created by applyContextWindowManagement is now a fully-typed MessageSelect object rather than a loosely-typed bag.

conversations.ts · conversation-history.ts · runtime/conversations.ts

Multi-artifact grouping per `toolCallId`

Before: artifactsByToolCallId was a Map<string, Artifact> — when multiple artifacts shared a toolCallId, only the last one survived.
After: It is a Map<string, Artifact[]>, and the replacement template iterates all related artifacts, joining their references with newlines.

A new test case (preserves all artifact references when multiple artifacts share a toolCallId) covers the fix end-to-end. The ledgerArtifacts data-access test suite also gains a returns multiple artifacts for the same toolCallId case.
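The grouping change can be illustrated with a minimal sketch. The `Artifact` shape and `groupByToolCallId` helper here are hypothetical and reduced to the two fields that matter for the fix.

```typescript
// Hypothetical artifact shape for illustration.
type Artifact = { artifactId: string; toolCallId: string };

function groupByToolCallId(artifacts: Artifact[]): Map<string, Artifact[]> {
  const grouped = new Map<string, Artifact[]>();
  for (const artifact of artifacts) {
    const existing = grouped.get(artifact.toolCallId);
    if (existing) {
      // Append instead of overwriting, so earlier artifacts are not lost.
      existing.push(artifact);
    } else {
      grouped.set(artifact.toolCallId, [artifact]);
    }
  }
  return grouped;
}

const grouped = groupByToolCallId([
  { artifactId: 'art-1', toolCallId: 'call-1' },
  { artifactId: 'art-2', toolCallId: 'call-1' },
  { artifactId: 'art-3', toolCallId: 'call-2' },
]);
const idsForCall1 = (grouped.get('call-1') ?? []).map((a) => a.artifactId);
console.log(idsForCall1); // ['art-1', 'art-2']
```

With a `Map<string, Artifact>` and a plain `set` per artifact, the same input would leave only `art-2` under `call-1`.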

conversations.ts · conversations.artifact-replacement.test.ts · ledgerArtifacts.test.ts

claude

PR Review Summary

(0) Total Issues | Risk: Low

Re-review scope: Changes since last review (8eea5bf88a06). The delta consists of a merge commit bringing in PDF attachment support from origin/main (PR #2709).

Delta Review Summary

Files with delta changes:

conversation-history.ts: Added mapFileToAiSdkContentPart helper for PDF/image handling, renamed imageParts → fileParts
conversations.ts: Updated createMessage API to use { scopes, data } structure
Agent.test.ts: Added PDF test + mock path rename

Findings: No new issues meeting the severity/confidence thresholds were identified in the delta changes. The PDF support code is well-structured with proper type handling, defensive null returns for unsupported file types, and appropriate warning logging.

🕐 Pending Recommendations (5)

Prior review feedback that remains applicable to the original PR changes:

🟡 conversations.ts:510 Args duplicated for each artifact sharing a toolCallId
💭 conversations.ts:903 Defensive kind ?? type fallback lacks documentation
💭 conversations.ts:950-956 Empty messages now filtered out (behavioral change)
🟠 runtime/conversations.ts:268 createdAt regression in summary message (pullfrog)
🟠 conversations.ts:961 Behavioral inconsistency between formatting functions (pullfrog)

✅ APPROVE

Summary: The delta changes from the merge commit are clean. PDF attachment support is correctly integrated with the existing file handling infrastructure. Prior review feedback from the first review run remains applicable to the original PR changes — see Pending Recommendations above. 🎉

Reviewers (2)

Reviewer	Returned	Discarded
`pr-review-standards`	0	0
`pr-review-tests`	3	3
Total	3	3

Note: Test coverage findings (unsupported file types, filename metadata, URI path) were assessed as MEDIUM/LOW confidence and discarded per review criteria.

itoqa · 2026-03-24T01:02:20Z

Ito Test Report ❌

7 test cases ran. 1 failed, 6 passed.

The unified run executed 7 test cases: 6 passed and 1 failed (several verification runs had no executable cases). Behavior was generally solid, with one confirmed production defect. Auth setup and run-domain gates behaved correctly: web_client app creation returned 201, PoW was required and minted anon tokens when solved, and unauthorized origins were blocked with 403. Chat edge checks for 10 parallel requests and a ~50k-character prompt were stable. However, first-turn requests incorrectly inject an empty <conversation_history> wrapper (medium severity, likely introduced by this PR), which drifts from the intended no-history semantics.

❌ Failed (1)

Category	Summary	Screenshot
Edge	🟠 First-turn requests can still inject an empty <conversation_history> wrapper when prior history is empty.	N/A

🟠 Empty conversation history wrapper injected on first turn

What failed: Expected no conversation-history payload on first turn, but the formatter still emits <conversation_history>\n\n</conversation_history>\n, which is treated as non-empty and injected.
Impact: First-turn prompts can carry an empty synthetic wrapper that should not exist, adding avoidable prompt noise and behavior drift. This can cause brittle instruction-following in edge flows that depend on exact no-history semantics.
Introduced by this PR: Yes – this PR modified the relevant code
Steps to reproduce:
1. Create a brand-new conversation with no prior messages.
2. Call the chat endpoint for the first turn and inspect how conversation history is built.
3. Observe that an empty <conversation_history> wrapper is still added to initial model messages.
Code analysis: Inspected conversation-history retrieval/formatting and message assembly paths. getConversationHistoryWithCompression correctly returns an empty array when there is no history, but formatMessagesAsConversationHistory always wraps output, and buildInitialMessages only checks trim() !== '', so the wrapper is injected even when there are zero historical messages.
Why this is likely a bug: The production path composes [] history into a non-empty wrapper string and then unconditionally injects it, which contradicts intended first-turn no-history behavior.

Relevant code:

agents-api/src/domains/run/data/conversations.ts (lines 462-464)

if (!messagesToFormat.length) {
  return [];
}

agents-api/src/domains/run/data/conversations.ts (lines 925-962)

export function formatMessagesAsConversationHistory(messages: MessageSelect[]): string {
  const formattedHistory = messages
    .map((msg: MessageSelect) => {
      // ...role formatting...
      const reconstructedMessage = reconstructMessageText(msg);
      if (!reconstructedMessage) {
        return null;
      }
      return `${roleLabel}: """${reconstructedMessage}"""`;
    })
    .filter((line): line is string => line !== null)
    .join('\n');

  return `<conversation_history>\n${formattedHistory}\n</conversation_history>\n`;
}

agents-api/src/domains/run/agents/generation/conversation-history.ts (lines 124-126)

if (conversationHistory.trim() !== '') {
  messages.push({ role: 'user', content: conversationHistory });
}
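One possible way to close the gap, sketched under assumptions: the report does not prescribe a fix, and `wrapConversationHistory` is a hypothetical helper that takes already-formatted lines. Returning an empty string when nothing remains lets the existing `trim() !== ''` check at the call site do its job.

```typescript
// Hypothetical helper: wraps formatted lines, but emits nothing when there are none.
function wrapConversationHistory(lines: string[]): string {
  if (lines.length === 0) {
    return '';
  }
  return `<conversation_history>\n${lines.join('\n')}\n</conversation_history>\n`;
}

const firstTurn = wrapConversationHistory([]);
const laterTurn = wrapConversationHistory(['user: """hi"""']);

// Same guard as the call site quoted above; the first-turn case is now skipped.
const messages: { role: 'user'; content: string }[] = [];
for (const conversationHistory of [firstTurn, laterTurn]) {
  if (conversationHistory.trim() !== '') {
    messages.push({ role: 'user', content: conversationHistory });
  }
}
console.log(messages.length); // 1
```

An alternative with the same effect would be to skip the formatter call entirely at the call site when the returned message array is empty.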

✅ Passed (6)

Category	Summary	Screenshot
Adversarial	Valid PoW from an unauthorized origin is denied, preventing token issuance.
Auth	Created prerequisite project/app setup and successfully created a web_client app via manage API (HTTP 201).
Auth	Anonymous-session creation without a PoW solution is rejected with the expected bad-request error.
Auth	Solved PoW challenge mints an anonymous session token successfully, with expected JWT claims.
Edge	10 parallel non-stream requests succeeded with HTTP 200 and conversation remained readable.
Edge	A ~50k-character prompt completed with HTTP 200 and no uncaught 500/internal error.

Commit: a96c949

Return MessageSelect[] from history APIs and format at generation cal…

8eea5bf

…l sites

vercel bot deployed to Preview – agents-api March 18, 2026 14:50 View deployment

vercel bot deployed to Preview – agents-docs March 18, 2026 14:50 View deployment

pullfrog bot reviewed Mar 18, 2026

View reviewed changes

vercel bot deployed to Preview – agents-manage-ui March 18, 2026 14:51 View deployment

claude bot reviewed Mar 18, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 18, 2026

This was referenced Mar 18, 2026

fix(agents-api): sanitize artifact binary data before persistence #2680

Closed

feat(images): enable multi-turn conversations about image context parts #2604

Closed

Merge remote-tracking branch 'origin/main' into stack/history_shape

a96c949

vercel bot deployed to Preview – agents-api March 23, 2026 23:11 View deployment

vercel bot deployed to Preview – agents-manage-ui March 23, 2026 23:11 View deployment

vercel bot deployed to Preview – agents-docs March 23, 2026 23:11 View deployment

claude bot reviewed Mar 23, 2026

View reviewed changes

github-actions bot deleted a comment from claude bot Mar 23, 2026
