Bugfix/compressor bug #2833

Open

tim-inkeep wants to merge 8 commits into main from bugfix/compressor-bug

Conversation

@tim-inkeep (Contributor)

No description provided.

changeset-bot bot commented Mar 25, 2026

🦋 Changeset detected

Latest commit: 4fdb505

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 10 packages:

Name | Type
@inkeep/agents-api | Patch
@inkeep/agents-core | Patch
@inkeep/agents-manage-ui | Patch
@inkeep/agents-cli | Patch
@inkeep/agents-sdk | Patch
@inkeep/agents-work-apps | Patch
@inkeep/ai-sdk-provider | Patch
@inkeep/create-agents | Patch
@inkeep/agents-email | Patch
@inkeep/agents-mcp | Patch


vercel bot commented Mar 25, 2026

The latest updates on your projects.

Project | Status | Actions | Updated (UTC)
agents-api | Ready | Preview, Comment | Mar 26, 2026 1:03am
agents-docs | Ready | Preview, Comment | Mar 26, 2026 1:03am
agents-manage-ui | Ready | Preview, Comment | Mar 26, 2026 1:03am


pullfrog bot commented Mar 25, 2026

TL;DR — Prevents redundant artifact retrieval when the compressor has already summarized an artifact. The get_artifact_full tool now short-circuits with cached key findings and a sentinel hint, compression summaries persist key_findings back to the ledger artifact in the database, and a new updateLedgerArtifactParts DAL function supports partial updates to artifact parts.

Key changes

  • Short-circuit get_artifact_full for already-summarized artifacts — returns cached key findings and a sentinel-based hint instead of re-fetching the full artifact from the artifact service.
  • Persist key_findings to ledger artifacts after compression — writes summary key findings back to each related artifact's parts[0].data.summary in the runtime database so they survive across sessions.
  • Add hasSummarizedArtifact / getSummarizedArtifact to BaseCompressor — lookup methods that check the cumulative summary's related_artifacts for a given artifact ID.
  • Add updateLedgerArtifactParts DAL function — new data-access function for updating the parts column of an existing ledger artifact by ID.
  • Add unit tests for new compressor and DAL methods — covers hasSummarizedArtifact, getSummarizedArtifact, and updateLedgerArtifactParts with success and edge-case scenarios.

Summary | 6 files | 8 commits | base: main ← bugfix/compressor-bug


Short-circuit artifact retrieval for compressed artifacts

Before: get_artifact_full always fetched the full artifact from the artifact service, even if the compressor had already distilled it into key findings — wasting tokens and potentially re-inflating a compressed context.
After: The tool checks ctx.currentCompressor.hasSummarizedArtifact(artifactId) first. If the artifact was already summarized, it returns status: 'already_summarized' with the extracted key_findings and a sentinel-based hint for downstream tool use.

The sentinel hint instructs the model to reference the artifact by ID using SENTINEL_KEY.ARTIFACT and SENTINEL_KEY.TOOL rather than inlining its content, keeping the context window lean.
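The short-circuit payload described above can be sketched as a small helper. This is illustrative only: the sentinel key values, the helper name, and the hint wording are assumptions, not the repo's actual constants or implementation.

```typescript
// Placeholder sentinel keys -- the real values live in the repo's SENTINEL_KEY constants.
const SENTINEL_KEY = { ARTIFACT: "$artifact", TOOL: "$tool" } as const;

interface SummarizedArtifact {
  key_findings: string[];
  tool_call_id: string;
}

// Build the short-circuit response returned in place of the full artifact.
function buildAlreadySummarizedResponse(
  artifactId: string,
  toolCallId: string,
  summarized: SummarizedArtifact | null
) {
  return {
    artifactId,
    status: "already_summarized" as const,
    key_findings: summarized?.key_findings ?? [],
    // The hint tells the model to pass the artifact by reference (sentinel),
    // preferring the compressor-known tool_call_id over the caller-supplied one.
    hint: `Use { "${SENTINEL_KEY.ARTIFACT}": "${artifactId}", "${SENTINEL_KEY.TOOL}": "${summarized?.tool_call_id ?? toolCallId}" } instead of retrieving it.`,
  };
}
```

Note the `?? toolCallId` fallback: if the compressor has no record for the artifact, the caller's tool call ID is used so the hint is still actionable.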

default-tools.ts


Persist compression key findings to the runtime database

Before: Key findings produced during compression lived only in the in-memory cumulativeSummary and were lost when the compressor was garbage-collected.
After: After each compression pass, persistArtifactKeyFindings writes key_findings into each related artifact's ledger entry via updateLedgerArtifactParts, merging them into parts[0].data.summary.

The persistence is fire-and-forget — failures are logged at warn level and do not block the compression pipeline. The method iterates over related_artifacts, skips entries with no key findings or no matching DB record, and returns a { persisted, skipped, failed } tally for observability.
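The skip/persist/fail bookkeeping described above can be sketched as follows. `persistKeyFindings` and `updateParts` are hypothetical stand-ins for the real `persistArtifactKeyFindings` and `updateLedgerArtifactParts`; only the tally contract mirrors the PR.

```typescript
interface RelatedArtifact {
  id: string;
  key_findings: string[];
}

type UpdateParts = (artifactId: string, findings: string[]) => Promise<boolean>;

// Iterate related artifacts, skipping empty entries, counting DB no-ops as
// skipped, and counting thrown errors as failed (logged at warn in the real code).
async function persistKeyFindings(artifacts: RelatedArtifact[], updateParts: UpdateParts) {
  const tally = { persisted: 0, skipped: 0, failed: 0 };
  for (const artifact of artifacts) {
    if (!artifact.key_findings.length) {
      tally.skipped++; // nothing to write back
      continue;
    }
    try {
      const matched = await updateParts(artifact.id, artifact.key_findings);
      if (matched) tally.persisted++;
      else tally.skipped++; // no matching DB record
    } catch {
      tally.failed++; // best-effort: never block the compression pipeline
    }
  }
  return tally;
}
```

Returning the tally instead of throwing keeps the persist best-effort while still giving callers something to log for observability.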

The new updateLedgerArtifactParts DAL function takes a project-scoped artifact ID and a replacement parts array, issues a Drizzle update().set({ parts, updatedAt }).where(...) against ledgerArtifacts, and returns a boolean indicating whether a row was matched. The caller (persistArtifactKeyFindings) uses this to distinguish successful updates from no-ops.
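That contract can be illustrated against an in-memory stand-in for the table. The real function issues a Drizzle query against the database; this sketch mirrors only its observable behavior (replace parts, bump updatedAt, report whether a row matched) and the row shape is an assumption.

```typescript
// Minimal stand-in for a ledger_artifacts row; the real schema has more columns.
interface LedgerRow {
  id: string;
  projectId: string;
  parts: unknown[];
  updatedAt: string;
}

// Mirror of the DAL contract: update parts on the scoped row if it exists,
// return true when a row was matched, false for a no-op.
function updateLedgerArtifactParts(
  rows: LedgerRow[],
  scope: { projectId: string; artifactId: string },
  parts: unknown[]
): boolean {
  const row = rows.find(
    (r) => r.id === scope.artifactId && r.projectId === scope.projectId
  );
  if (!row) return false; // no match -> caller counts this as "skipped"
  row.parts = parts;
  row.updatedAt = new Date().toISOString();
  return true;
}
```

Scoping by project ID in the WHERE-equivalent predicate is what prevents one project's persist pass from touching another project's artifacts.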

BaseCompressor.ts · ledgerArtifacts.ts


Unit tests for compressor lookups and DAL update

Tests cover hasSummarizedArtifact (null summary, missing artifact, match) and getSummarizedArtifact (null summary, no match, returns correct key_findings and tool_call_id). The updateLedgerArtifactParts tests use a mock DB to verify the set payload and the true/false return based on row match.

BaseCompressor.test.ts · ledgerArtifacts.test.ts



@pullfrog pullfrog bot left a comment


Medium urgency — the retrieval guard in default-tools.ts looks correct, but persistArtifactKeyFindings has a silent data-loss bug: upsertLedgerArtifact is insert-only (returns { created: false } on conflict rather than updating), so the mutated parts with key_findings are never actually written back to the database. This means the hint message returned from get_reference_artifact references key findings that may not match what's actually stored.


Comment on lines +653 to +662
await upsertLedgerArtifact(runDbClient)({
  scopes: { tenantId: this.tenantId, projectId: this.projectId },
  contextId: this.conversationId,
  taskId: dbArtifact.taskId ?? `task_${this.conversationId}-${this.sessionId}`,
  toolCallId: dbArtifact.toolCallId,
  artifact: {
    ...dbArtifact,
    parts,
  },
});

Bug: upsertLedgerArtifact is insert-only — this update is silently lost.

upsertLedgerArtifact calls db.insert() and on a unique-constraint violation returns { created: false, existing } without writing anything. Since compression artifacts already exist at this point (they were created in createNewArtifact / saveToolResultsAsArtifacts), the insert will always conflict and the mutated parts with key_findings will never reach the database.

You need either:

  1. A proper UPDATE query (e.g., db.update(ledgerArtifacts).set({ parts }).where(...)) — likely the right call here since you're patching a single field on an existing row.
  2. An onConflictDoUpdate in the upsert function.

Without this fix, the key_findings data is only held in the in-memory cumulativeSummary and will be lost on process restart.

Comment on lines +644 to +651
const dbArtifact = existing[0];
const parts = (dbArtifact.parts ?? []) as any[];
if (parts.length > 0 && parts[0]?.data?.summary) {
  parts[0].data.summary = {
    ...parts[0].data.summary,
    key_findings: artifact.key_findings,
  };
}

Mutates a shared object in place. parts aliases dbArtifact.parts — the parts[0].data.summary = { ... } assignment mutates the fetched DB row object. This is harmless today since the object is not reused, but if getLedgerArtifacts ever adds caching, this will silently corrupt the cache. Consider cloning:

const parts = structuredClone(dbArtifact.parts ?? []) as any[];

Comment on lines +635 to +636
for (const artifact of relatedArtifacts) {
  if (!artifact.id || !artifact.key_findings?.length) continue;

Nit: artifact.id is typed as z.string() (non-nullable) from the ConversationSummarySchema, so the !artifact.id guard is dead code — it can never be falsy. Consider removing it for clarity, or if you genuinely need to guard against empty strings, use !artifact.id.length.

Comment on lines +24 to +37
const compressor = ctx.currentCompressor;
if (compressor?.hasSummarizedArtifact(artifactId)) {
  const summarized = compressor.getSummarizedArtifact(artifactId);
  logger.info(
    { artifactId, toolCallId },
    'Blocked retrieval of artifact already summarized in compression'
  );
  return {
    artifactId,
    status: 'already_summarized',
    key_findings: summarized?.key_findings ?? [],
    hint: `This artifact's key findings are already in your compressed context. Use them directly to answer. To pass this artifact to a tool, use { "${SENTINEL_KEY.ARTIFACT}": "${artifactId}", "${SENTINEL_KEY.TOOL}": "${summarized?.tool_call_id ?? toolCallId}" } sentinel instead of retrieving it.`,
  };
}

The guard correctly prevents re-fetching a summarized artifact and points the model at the sentinel syntax. One concern: hasSummarizedArtifact matches on artifact.id (the compression-generated compress_* ID), but the model calls get_reference_artifact with an artifactId that came from the artifact list. Are these guaranteed to be the same ID? If the distillation produces the compression-generated ID in related_artifacts[].id but the model sees a different ID in available_artifacts, this guard will never trigger. Worth verifying the ID mapping is consistent end-to-end.

Comment on lines +532 to +539
if (summary.related_artifacts?.length) {
  this.persistArtifactKeyFindings(summary.related_artifacts).catch((err) =>
    logger.warn(
      { err: err instanceof Error ? err.message : String(err) },
      'Failed to persist key_findings to artifacts'
    )
  );
}

The .catch() fire-and-forget is fine for a best-effort persist, but since persistArtifactKeyFindings currently has no effect (see the upsert bug above), this entire block is inert. Once the persist path is fixed, consider whether a failure here should propagate — if the key findings aren't stored, the retrieval guard in default-tools.ts will still return them from memory, creating an inconsistency after restart.

@claude claude bot left a comment

PR Review Summary

(4) Total Issues | Risk: High

🔴❗ Critical (1) ❗🔴

Inline Comments:

  • 🔴 Critical: BaseCompressor.ts:653-662 upsertLedgerArtifact is insert-only — key_findings will never be persisted to the database

🟠⚠️ Major (2) 🟠⚠️

Inline Comments:

  • 🟠 Major: BaseCompressor.ts:646-651 Silent no-op when artifact structure doesn't match expected format
  • 🟠 Major: BaseCompressor.ts:532-538 Fire-and-forget async call masks persistence failures

🟡 Minor (2) 🟡

Inline Comments:

  • 🟡 Minor: BaseCompressor.ts:617-629 Missing test coverage for hasSummarizedArtifact() and getSummarizedArtifact() methods
  • 🟡 Minor: default-tools.ts:24-36 Missing test coverage for early return path when artifact already summarized

💭 Consider (1) 💭

💭 1) BaseCompressor.ts:632-673 Track persistence results for better observability

Issue: The persistArtifactKeyFindings method silently continues when individual artifacts fail. The method completes "successfully" even if ALL artifacts failed to persist.

Why: A "successful" call could have persisted 0 of N artifacts. Mix of silent skips and warn-logged failures makes tracking impossible.

Fix: Consider returning a result object: { persisted: number; skipped: number; failed: number } to enable monitoring and debugging.


🚫 REQUEST CHANGES

Summary: The core functionality of this PR — persisting key_findings to artifacts — will not work due to upsertLedgerArtifact being insert-only (no update capability). The PR adds the methods and calls them, but the database writes silently fail to update existing records. This is a blocking issue that must be fixed before merge.

Additionally:

  • The new code paths lack test coverage (per AGENTS.md requirements)
  • Silent failure modes in persistArtifactKeyFindings could mask data integrity issues
  • A changeset is needed since this changes runtime behavior in agents-api
Discarded (1)
Location | Issue | Reason Discarded
default-tools.ts:34 | Race condition if hasSummarizedArtifact returns true but getSummarizedArtifact returns null | Extremely unlikely given current implementation — both methods use the same cumulativeSummary object synchronously. Low confidence finding.
Reviewers (3)
Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded
pr-review-standards | 1 | 0 | 0 | 0 | 1 | 0 | 0
pr-review-errors | 4 | 0 | 1 | 0 | 2 | 0 | 1
pr-review-tests | 4 | 0 | 0 | 0 | 2 | 0 | 0
Total | 9 | 0 | 1 | 0 | 5 | 0 | 1

Comment on lines +653 to +662
await upsertLedgerArtifact(runDbClient)({
  scopes: { tenantId: this.tenantId, projectId: this.projectId },
  contextId: this.conversationId,
  taskId: dbArtifact.taskId ?? `task_${this.conversationId}-${this.sessionId}`,
  toolCallId: dbArtifact.toolCallId,
  artifact: {
    ...dbArtifact,
    parts,
  },
});

🔴 CRITICAL: upsertLedgerArtifact is insert-only — key_findings will never be persisted

Issue: The persistArtifactKeyFindings method calls upsertLedgerArtifact to update existing artifacts with key_findings, but this function doesn't actually update existing records. When a duplicate key error occurs, it returns { created: false, existing: ... } without applying any changes.

Why: Looking at ledgerArtifacts.ts:166-196, the function uses db.insert() without .onConflictDoUpdate(). The modified parts array with key_findings is never persisted to the database. This means:

  1. The in-memory cumulativeSummary will have key_findings
  2. The database artifacts will NOT have them
  3. On service restart/session reload, key_findings are lost
  4. The get_artifact_full early-return path will show stale data

Fix: Either:

  1. Add a dedicated updateLedgerArtifact function that uses db.update() with a WHERE clause, or
  2. Modify upsertLedgerArtifact to use .onConflictDoUpdate() similar to other upserts in the codebase (e.g., upsertWorkAppSlackChannelAgentConfig)


Comment on lines +646 to +651
if (parts.length > 0 && parts[0]?.data?.summary) {
  parts[0].data.summary = {
    ...parts[0].data.summary,
    key_findings: artifact.key_findings,
  };
}

🟠 MAJOR: Silent no-op when artifact structure doesn't match expected format

Issue: The code checks if (parts.length > 0 && parts[0]?.data?.summary) but has no else branch. If this condition is false (e.g., artifact has parts: [] or a different structure), the artifact is still upserted with unmodified parts, potentially overwriting existing data without adding key_findings.

Why: This creates a silent failure path where:

  1. The mutation at lines 647-650 never executes
  2. The upsert still runs with original (unmodified) parts
  3. No indication this happened — no log, no error
  4. Could corrupt artifact data if there's a race condition between read and write

Fix: Add explicit handling for unexpected structures:

const parts = (dbArtifact.parts ?? []) as any[];
if (parts.length === 0) {
  logger.debug({ artifactId: artifact.id }, 'Artifact has no parts, skipping key_findings persistence');
  continue; // Skip this artifact entirely
}
if (!parts[0]?.data?.summary) {
  logger.debug(
    { artifactId: artifact.id, partKind: parts[0]?.kind },
    'Artifact first part has no summary structure, skipping key_findings persistence'
  );
  continue; // Skip rather than risk overwriting with unmodified data
}

Comment on lines +532 to +538
if (summary.related_artifacts?.length) {
  this.persistArtifactKeyFindings(summary.related_artifacts).catch((err) =>
    logger.warn(
      { err: err instanceof Error ? err.message : String(err) },
      'Failed to persist key_findings to artifacts'
    )

🟠 MAJOR: Fire-and-forget async call masks persistence failures

Issue: The persistArtifactKeyFindings() method is called fire-and-forget with .catch() that only logs a warning. There's no mechanism for the caller to know if persistence failed, and no retry logic.

Why: Silent data loss scenario:

  • If persistence fails (DB connection issue, constraint violation, etc.), the key_findings data will never be persisted
  • The compression summary in memory will contain key_findings, but database artifacts won't
  • On service restart or session reload, key_findings would be lost
  • Debugging is difficult — only a WARN log entry with minimal context (no artifact IDs, no correlation ID)

Fix: If persistence is critical, make it synchronous and propagate errors:

if (summary.related_artifacts?.length) {
  try {
    await this.persistArtifactKeyFindings(summary.related_artifacts);
  } catch (err) {
    logger.error(
      { 
        err: err instanceof Error ? err.message : String(err),
        conversationId: this.conversationId,
        artifactIds: summary.related_artifacts.map(a => a.id),
      },
      'Failed to persist key_findings to artifacts - summary may be inconsistent'
    );
    // Decide: throw to fail compression, or continue with explicit degraded state
  }
}

If fire-and-forget is intentional, at minimum add richer logging with artifact IDs and conversation context.

Comment on lines +617 to +629
hasSummarizedArtifact(artifactId: string): boolean {
  return this.cumulativeSummary?.related_artifacts?.some((a) => a.id === artifactId) ?? false;
}

getSummarizedArtifact(
  artifactId: string
): { key_findings: string[]; tool_call_id: string } | null {
  const artifact = this.cumulativeSummary?.related_artifacts?.find((a) => a.id === artifactId);
  if (!artifact) return null;
  return {
    key_findings: artifact.key_findings,
    tool_call_id: artifact.tool_call_id,
  };

🟡 Minor: Missing test coverage for new public methods

Issue: The new hasSummarizedArtifact() and getSummarizedArtifact() methods are called in the critical path of getArtifactTools but have no test coverage.

Why: These methods determine whether to short-circuit artifact retrieval. If broken (e.g., a typo changing a.id to a.artifactId), the agent could:

  1. Fail to recognize already-summarized artifacts, causing redundant retrievals that bloat context
  2. Return incorrect key_findings, causing the agent to use stale/wrong information

Fix: Add unit tests in BaseCompressor.test.ts:

describe('hasSummarizedArtifact', () => {
  it('should return true when artifact exists in related_artifacts', () => {
    compressor['cumulativeSummary'] = {
      ...baseSummary,
      related_artifacts: [
        { id: 'artifact-123', name: 'Test', tool_name: 'search', tool_call_id: 'call-1', content_type: 'results', key_findings: ['finding1'] }
      ],
    };
    expect(compressor.hasSummarizedArtifact('artifact-123')).toBe(true);
  });

  it('should return false when cumulativeSummary is null', () => {
    compressor['cumulativeSummary'] = null;
    expect(compressor.hasSummarizedArtifact('any-id')).toBe(false);
  });
});

describe('getSummarizedArtifact', () => {
  it('should return key_findings and tool_call_id when artifact exists', () => {
    // ... test implementation
  });
});

Comment on lines +24 to +36
const compressor = ctx.currentCompressor;
if (compressor?.hasSummarizedArtifact(artifactId)) {
  const summarized = compressor.getSummarizedArtifact(artifactId);
  logger.info(
    { artifactId, toolCallId },
    'Blocked retrieval of artifact already summarized in compression'
  );
  return {
    artifactId,
    status: 'already_summarized',
    key_findings: summarized?.key_findings ?? [],
    hint: `This artifact's key findings are already in your compressed context. Use them directly to answer. To pass this artifact to a tool, use { "${SENTINEL_KEY.ARTIFACT}": "${artifactId}", "${SENTINEL_KEY.TOOL}": "${summarized?.tool_call_id ?? toolCallId}" } sentinel instead of retrieving it.`,
  };

🟡 Minor: Missing test coverage for early return path

Issue: This new code path that returns status: 'already_summarized' when an artifact is in the compression summary has no test coverage.

Why: This code path:

  1. Returns a different response shape that the agent needs to handle
  2. Constructs a hint with SENTINEL_KEY values — if constants change, tests should fail
  3. Uses null-safety fallbacks (summarized?.key_findings ?? []) that should be validated

Fix: Add tests in the appropriate test file:

describe('getArtifactTools with summarized artifact', () => {
  it('should return already_summarized response when artifact is in compression summary', async () => {
    const mockCompressor = {
      hasSummarizedArtifact: vi.fn().mockReturnValue(true),
      getSummarizedArtifact: vi.fn().mockReturnValue({
        key_findings: ['finding1', 'finding2'],
        tool_call_id: 'original-call-123',
      }),
    };
    
    const ctx = { ...baseAgentRunContext, currentCompressor: mockCompressor };
    const tool = getArtifactTools(ctx);
    const result = await tool.execute({ artifactId: 'artifact-xyz', toolCallId: 'call-456' });
    
    expect(result.status).toBe('already_summarized');
    expect(result.key_findings).toEqual(['finding1', 'finding2']);
    expect(result.hint).toContain('$artifact');
  });
});

@github-actions github-actions bot deleted a comment from claude bot Mar 25, 2026

itoqa bot commented Mar 25, 2026

Ito Test Report ✅

10 test cases ran. 10 passed.

Across the unified run, all 10 executed test cases passed with 0 failures, and additional verification-only checks found no reportable product defects when reviewing ROUTE-2/ROUTE-3 artifacts and related code. Key findings were that playground token issuance and retry-after-failure flows are robust, summarized-artifact retrieval correctly short-circuits with already_summarized plus key_findings and compressor-preferred tool_call_id hints, and edge-case behavior remained stable across rapid double submit, send-clear-send resets, back/forward interruptions, mobile debug toggling, and multi-tab usage.

✅ Passed (10)
Category | Summary | Screenshot
Edge | Rapid double-submit completed without chat corruption symptoms. | EDGE-1
Edge | Send-clear-send loop reset conversation state without leakage. | EDGE-2
Edge | Back-forward interruption recovered and follow-up message succeeded. | EDGE-3
Edge | Mobile debug toggle kept input and send controls operable. | EDGE-4
Edge | Two-tab run stayed consistent for same agent conversation flow. | EDGE-5
Logic | Confirmed summarized artifacts short-circuit with status: already_summarized rather than full retrieval. | N/A
Logic | Confirmed short-circuit payload includes key_findings and sentinel-based hint guidance. | N/A
Logic | Confirmed hint generation prefers compressor-known summarized tool_call_id over caller-supplied value. | N/A
Happy-path | Playground initialization succeeded after Try it, with token endpoint returning 200 and valid apiKey/expiresAt for the expected agent context. | ROUTE-1
Happy-path | Forced token generation failure surfaced recoverable initialization error UI, and retry recovered successfully with token 200 and active chat input. | ROUTE-4

Commit: 9db3ef6


@claude claude bot left a comment

PR Review Summary

(2) Total Issues | Risk: Low

Re-review scope: This review covers only the delta since the previous automated review. The author has addressed all Critical and Major issues.

✅ Issues Addressed Since Last Review

The following issues from the previous review have been fixed:

Prior Finding | Resolution
🔴 Critical: upsertLedgerArtifact is insert-only | ✅ Now uses new updateLedgerArtifactParts function with proper db.update()
🟠 Major: Silent no-op when artifact structure doesn't match | ✅ Now logs at debug level and increments result.skipped
🟠 Major: Fire-and-forget async masks persistence failures | ✅ Now synchronous with try/await/catch and rich error logging
💭 Consider: Track persistence results for observability | ✅ Returns { persisted, skipped, failed } and logs at debug level
🐸 Pullfrog: Mutates shared object in place | ✅ Now uses structuredClone(dbArtifact.parts ?? [])

🟡 Minor (2) 🟡

Inline Comments:

  • 🟡 Minor: BaseCompressor.ts:627-640 Missing test coverage for hasSummarizedArtifact() and getSummarizedArtifact() methods
  • 🟡 Minor: ledgerArtifacts.ts:373-389 Missing test coverage for updateLedgerArtifactParts function

💭 Consider (1) 💭

💭 1) scope Add a changeset for this bug fix

Issue: The existing changeset (informal-green-porpoise.md) covers the PDF attachments feature from a merged commit, not this compressor bug fix.

Why: Per AGENTS.md, any bug fix or behavior change to a published package needs a changeset. This PR changes runtime behavior in agents-api and agents-core.

Fix:

pnpm bump patch --pkg agents-api --pkg agents-core "Fix key_findings persistence in compressor by using proper update instead of insert-only upsert"

💡 APPROVE WITH SUGGESTIONS

Summary: Great work addressing all the critical and major issues from the previous review! The core bug is fixed — updateLedgerArtifactParts now properly updates existing artifacts instead of silently failing. The remaining items are test coverage (recommended but non-blocking for a bug fix) and adding a changeset. This PR is ready to merge once a changeset is added.

Discarded (0)

No findings were discarded in this review.

Reviewers (1)
Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded
orchestrator | 3 | 0 | 1 | 0 | 2 | 0 | 0
Total | 3 | 0 | 1 | 0 | 2 | 0 | 0

Note: This is a delta re-review. Sub-reviewers were not re-dispatched since the prior Critical/Major issues have been addressed and no new significant code was introduced.

    key_findings: artifact.key_findings,
    tool_call_id: artifact.tool_call_id,
  };
}

🟡 Minor: Missing test coverage for hasSummarizedArtifact() and getSummarizedArtifact() methods

Issue: These new public methods are called in the critical path of getArtifactTools but have no test coverage in BaseCompressor.test.ts.

Why: These methods determine whether to short-circuit artifact retrieval. If broken (e.g., a typo changing a.id to a.artifactId), the agent could fail to recognize already-summarized artifacts, causing redundant retrievals that bloat context, or return incorrect key_findings.

Fix: Add unit tests for edge cases:

  • hasSummarizedArtifact when cumulativeSummary is null
  • hasSummarizedArtifact when artifact exists vs doesn't exist in related_artifacts
  • getSummarizedArtifact returning correct key_findings and tool_call_id


    .returning();

  return result.length > 0;
};

🟡 Minor: Missing test coverage for updateLedgerArtifactParts

Issue: This new data-access function has no test coverage in ledgerArtifacts.test.ts.

Why: This function is critical to the key_findings persistence feature. Without tests, regressions (e.g., incorrect WHERE clause, missing scope filtering) could go undetected.

Fix: Add tests covering:

  • Successful update returns true
  • Non-existent artifact returns false
  • Tenant/project scoping is enforced (can't update artifacts from other projects)


@claude claude bot left a comment

PR Review Summary

(2) Total Issues | Risk: Low

Re-review scope: This review covers only the delta since the previous automated review. The new commits are merge commits bringing in upstream changes — the core bug fix code is unchanged.

✅ Delta Assessment

The commits since last review (780e46ce2f98..HEAD) are:

  • d22b0f70a — merged main
  • c0018a66c — Route LLM calls through AI Gateway (#2803)
  • d2b5368b7 — return structured conversation history (#2743)

These changes affect BaseCompressor.ts but only introduce upstream modifications:

  • Import reorganization (estimateTokens moved to @inkeep/agents-core)
  • Added agentId constructor parameter
  • Added usageContext to distillConversation call for cost tracking

The core bug fix functionality is intact and unchanged. All Critical and Major issues from prior reviews remain addressed.

🕐 Pending Recommendations (2)

💭 Consider (1)

💭 1) scope Add a changeset for this bug fix

Issue: No changeset exists for this compressor bug fix. The existing changesets (golden-crimson-emu.md, informal-green-porpoise.md) are from other merged PRs.

Why: Per AGENTS.md, any bug fix or behavior change to a published package needs a changeset. This PR fixes a data persistence bug in agents-api and adds a new DAL function in agents-core.

Fix:

pnpm bump patch --pkg agents-api --pkg agents-core "Fix key_findings persistence in compressor by using proper update instead of insert-only upsert"

💡 APPROVE WITH SUGGESTIONS

Summary: The merge commits introduce no new issues — they only bring in upstream changes from main (cost tracking, structured conversation history). The core bug fix remains solid: updateLedgerArtifactParts correctly persists key_findings to existing artifacts, and the get_artifact_full guard properly short-circuits for already-summarized artifacts.

Remaining items before merge:

  1. Changeset required — Run pnpm bump patch --pkg agents-api --pkg agents-core "..." to document the bug fix
  2. Test coverage recommended — The new methods lack tests, though this is non-blocking for a bug fix
Discarded (0)

No findings were discarded in this review.

Reviewers (1)
Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded
orchestrator | 3 | 0 | 1 | 0 | 0 | 2 | 0
Total | 3 | 0 | 1 | 0 | 0 | 2 | 0

Note: Sub-reviewers were not re-dispatched — the delta contains only merge commits with no substantive changes to the bug fix code.

@github-actions github-actions bot deleted a comment from claude bot Mar 25, 2026

itoqa bot commented Mar 25, 2026

Ito Test Report ✅

8 test cases ran. 8 passed.

Across the unified run, 8 executable test cases were included and all 8 passed (0 failed), while two additional code-first verification checks reported no includable pass/fail outcomes. Key findings were that conversation detail retrieval remained valid after compression-related artifact persistence, the playground stayed resilient and usable through refresh/mobile/rapid-submit/stream-interruption scenarios, and adversarial attempts (sentinel-style payloads, policy-bypass prompt injection, and cross-project artifact guessing) were safely rejected without unauthorized data exposure, aligning with expected local non-production behavior.

✅ Passed (8)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Adversarial | Rapid repeated submits and a normal follow-up prompt completed without destabilizing the conversation state. | ADV-1 |
| Adversarial | Sentinel-like payload was handled safely and did not bypass artifact boundaries. | ADV-2 |
| Adversarial | Prompt-injection bypass attempt did not defeat summarized-artifact guard behavior. | ADV-3 |
| Adversarial | Cross-project artifact ID guessing was rejected without data exposure. | ADV-6 |
| Adversarial | Interrupt/reload and immediate resend recovered cleanly without unrecoverable state or visible harmful duplicate effects. | ADV-7 |
| Edge | Refresh during an active chat recovered correctly; conversation remained usable and accepted a follow-up prompt. | EDGE-4 |
| Edge | Mobile viewport (390x844) controls stayed reachable, with Try it, Debug, and prompt submission flows usable. | EDGE-5 |
| Happy-path | Conversation detail retrieval returned valid, parseable JSON after artifact persistence flow. | ROUTE-2 |

Commit: d22b0f7

View Full Run




@claude claude bot left a comment


PR Review Summary

(0) Total Issues | Risk: Low

Re-review scope: This review covers only the delta since the previous automated review. All pending recommendations have been addressed.

✅ All Prior Recommendations Addressed

The following items from the previous review are now complete:

| Prior Recommendation | Resolution |
| --- | --- |
| 🟡 Missing test coverage for hasSummarizedArtifact() and getSummarizedArtifact() | ✅ Added 7 comprehensive tests in BaseCompressor.test.ts covering: null cumulativeSummary, null related_artifacts, artifact not found, artifact found, and correct return values |
| 🟡 Missing test coverage for updateLedgerArtifactParts | ✅ Added 2 tests in ledgerArtifacts.test.ts covering: successful update returns true, no-match returns false |
| 💭 Add a changeset for this bug fix | ✅ Added unconscious-brown-deer.md with proper packages and a clear message: "Fix key_findings persistence in compressor by using proper update instead of insert-only upsert" |

Implementation Summary

This PR correctly fixes the key_findings persistence bug:

  1. New DAL function: updateLedgerArtifactParts uses proper db.update() to update existing artifacts
  2. Error handling: try/catch around awaited persistence calls, with rich error logging (artifact IDs, conversation ID)
  3. Observability: Returns { persisted, skipped, failed } results and logs at debug level
  4. Data safety: Uses structuredClone() to avoid mutating cached DB objects
  5. Artifact retrieval guard: hasSummarizedArtifact() check in default-tools.ts short-circuits redundant fetches
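
The data-safety point above (item 4) is the subtle part: without a deep clone, writing key findings into `parts[0].data.summary` on an object the DAL may have cached would corrupt other readers of that object. A minimal sketch of the pattern, using hypothetical names (`withKeyFindings`, `ArtifactPart`) and a simplified shape rather than the PR's actual code:

```typescript
// Hypothetical sketch (not the PR's actual code): merge compression
// key_findings into an artifact's parts without mutating the cached object,
// mirroring the parts[0].data.summary shape this PR describes.
type SummaryData = { key_findings?: string[] };
type ArtifactPart = { kind: string; data: { summary?: SummaryData } };

function withKeyFindings(parts: ArtifactPart[], keyFindings: string[]): ArtifactPart[] {
  // structuredClone prevents mutation of objects that may be cached elsewhere
  const next = structuredClone(parts);
  if (next[0]) {
    next[0].data.summary = { ...next[0].data.summary, key_findings: keyFindings };
  }
  return next;
}

const cached: ArtifactPart[] = [{ kind: 'data', data: {} }];
const updated = withKeyFindings(cached, ['rate limit is 100 rpm']);

console.log(cached[0].data.summary); // undefined: the cached object is untouched
console.log(updated[0].data.summary?.key_findings); // ['rate limit is 100 rpm']
```

The cloned array is what would then be handed to the update call, so the in-memory cache and the database row cannot drift apart mid-write.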

✅ APPROVE

Summary: Excellent work addressing all feedback! The tests thoroughly cover edge cases, the changeset is properly formatted, and the core bug fix implementation is solid. This PR is ready to merge.

Discarded (0)

No findings were discarded in this review.

Reviewers (1)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
| --- | --- | --- | --- | --- | --- | --- | --- |
| orchestrator | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Note: Sub-reviewers were not dispatched — this delta review verifies that all prior pending recommendations have been addressed.

@github-actions github-actions bot deleted a comment from claude bot Mar 26, 2026
@itoqa

itoqa bot commented Mar 26, 2026

Ito Test Report ❌

12 test cases ran. 1 failed, 11 passed.

Across 12 executed test cases, 11 passed and 1 failed. API/UI behavior was generally stable, including streaming completions, scope-header auth enforcement, Manage UI playground chat, conversation-history serialization, and race/mobile/navigation resilience. The single confirmed high-severity defect: after a refresh, the summarized-artifact retrieval guard can be bypassed, because summarized state lives in request-scoped compressor memory that is cleaned up and never rehydrated. Non-refresh paths (including already_summarized responses and empty key_findings handling) behaved as designed.

❌ Failed (1)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Logic | ⚠️ After refresh, summarized-artifact detection can be lost, allowing retrieval to bypass the expected already_summarized guard. | LOGIC-4 |
⚠️ Post-refresh retrieval loses summarized-artifact guard
  • What failed: Expected behavior is a stable already_summarized response with key findings after refresh, but the guard can be skipped because summarized state is not reloaded into the new request compressor context.
  • Impact: Users can receive full artifact payloads after refresh when summarized behavior should persist, increasing context pressure and undermining the retrieval guard contract. This can cause inconsistent behavior across turns for the same artifact.
  • Introduced by this PR: Yes – this PR modified the relevant code
  • Steps to reproduce:
    1. Create an artifact and summarize it via compress_context in a conversation.
    2. Refresh the browser and reopen the same conversation.
    3. Request get_reference_artifact for the previously summarized artifact.
    4. Observe that retrieval can proceed without stable already_summarized blocking because summary state is not rehydrated into request-scoped compressor memory.
  • Code analysis: The retrieval block checks only ctx.currentCompressor in-memory summary state, while compressor instances are recreated per generation and explicitly cleared at the end of each run. There is no rehydration path from persisted summary data into currentCompressor before guard evaluation.
  • Why this is likely a bug: The guard condition is tied to ephemeral request memory rather than durable conversation state, so refresh/new-request flows can violate expected summarized-artifact behavior in production.

Relevant code:

`agents-api/src/domains/run/agents/tools/default-tools.ts` (lines 24-37)

```typescript
const compressor = ctx.currentCompressor;
if (compressor?.hasSummarizedArtifact(artifactId)) {
  const summarized = compressor.getSummarizedArtifact(artifactId);
  logger.info(
    { artifactId, toolCallId },
    'Blocked retrieval of artifact already summarized in compression'
  );
  return {
    artifactId,
    status: 'already_summarized',
    key_findings: summarized?.key_findings ?? [],
```

`agents-api/src/domains/run/agents/generation/compression.ts` (lines 20-35)

```typescript
const compressor = compressionConfig.enabled
  ? new MidGenerationCompressor(
      sessionId,
      contextId,
      ctx.config.tenantId,
      ctx.config.projectId,
      ctx.config.agentId,
      compressionConfig,
      getSummarizerModel(ctx.config),
      primaryModelSettings
    )
  : null;

ctx.currentCompressor = compressor;
```

`agents-api/src/domains/run/agents/generation/generate.ts` (lines 401-405)

```typescript
if (compressor) {
  compressor.fullCleanup();
}
ctx.currentCompressor = null;
```
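
Mapping the excerpts above to the defect: the guard reads only `ctx.currentCompressor`, which compression.ts creates fresh per generation and generate.ts clears afterward, so a refreshed session starts with an empty in-memory map even though key_findings were durably persisted. A minimal illustrative sketch of one possible fix direction, rebuilding guard state from persisted ledger data before the guard runs; `MiniCompressor` and `rehydrateFromLedger` are hypothetical names, not the PR's API:

```typescript
// Illustrative only: rehydrate summarized-artifact state from durable
// ledger records so a fresh request-scoped compressor does not "forget"
// which artifacts were already summarized.
type SummaryEntry = { key_findings: string[] };
type LedgerArtifact = { artifactId: string; summary?: SummaryEntry };

class MiniCompressor {
  private summarized = new Map<string, SummaryEntry>();

  hasSummarizedArtifact(id: string): boolean {
    return this.summarized.has(id);
  }

  getSummarizedArtifact(id: string): SummaryEntry | undefined {
    return this.summarized.get(id);
  }

  // Rebuild in-memory guard state from persisted parts[0].data.summary rows
  rehydrateFromLedger(artifacts: LedgerArtifact[]): void {
    for (const a of artifacts) {
      if (a.summary) this.summarized.set(a.artifactId, a.summary);
    }
  }
}

// Simulate a fresh request after a browser refresh: the compressor is new,
// but the durable ledger rows still carry the persisted key_findings.
const compressor = new MiniCompressor();
compressor.rehydrateFromLedger([
  { artifactId: 'art-1', summary: { key_findings: ['uses db.update, not upsert'] } },
  { artifactId: 'art-2' }, // never summarized: the guard should not block it
]);

console.log(compressor.hasSummarizedArtifact('art-1')); // true
console.log(compressor.hasSummarizedArtifact('art-2')); // false
```

The point of the sketch is only that the guard's source of truth becomes durable conversation state rather than ephemeral request memory; where the rehydration call would actually live in the request lifecycle is a design decision for the fix.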
✅ Passed (11)
| Category | Summary | Screenshot |
| --- | --- | --- |
| Adversarial | A run request with a bearer token but missing required scope headers was correctly rejected with HTTP 401. | ADV-4 |
| Adversarial | Same-conversation concurrent requests completed without retrieval-state corruption. | ADV-6 |
| Edge | Empty key_findings in summarized retrieval returns [] safely without server failure. | EDGE-1 |
| Edge | Rapid concurrent submits completed without duplicate-output corruption. | EDGE-2 |
| Edge | Back/forward navigation mid-flow did not break the conversation; a follow-up prompt still produced a valid response. | EDGE-3 |
| Edge | Mobile viewport flow remained usable through artifact, compression, and retrieval prompts, with readable streamed output. | EDGE-4 |
| Edge | Multi-turn compression/retrieval loop stayed stable across the long run. | EDGE-5 |
| Logic | Verified summarized retrieval is blocked with already_summarized, includes key_findings, and provides sentinel guidance. | LOGIC-1 |
| Happy-path | OpenAI-compatible completions endpoint returned HTTP 200 SSE with assistant delta chunks and a completion marker using valid scope headers. | ROUTE-2 |
| Happy-path | Manage UI playground chat completed successfully after local non-production remediations, with assistant response visible. | ROUTE-3 |
| Happy-path | Conversation detail endpoint returned HTTP 200 with parseable JSON messages after artifact compression and retrieval flow. | ROUTE-4 |

Commit: 4fdb505

View Full Run


