16 changes: 16 additions & 0 deletions agents-api/src/domains/run/agents/tools/default-tools.ts
@@ -3,6 +3,7 @@ import { type Tool, type ToolSet, tool } from 'ai';
import { getLogger } from '../../../../logger';
import { formatOversizedRetrievalReason } from '../../artifacts/artifact-utils';
import { getModelAwareCompressionConfig } from '../../compression/BaseCompressor';
import { SENTINEL_KEY } from '../../constants/artifact-syntax';
import { agentSessionManager } from '../../session/AgentSession';
import type { AgentRunContext } from '../agent-types';
import { wrapToolWithStreaming } from './tool-wrapper';
@@ -20,6 +21,21 @@ export function getArtifactTools(ctx: AgentRunContext): Tool<any, any> {
execute: async ({ artifactId, toolCallId }) => {
  logger.info({ artifactId, toolCallId }, 'get_artifact_full executed');

  const compressor = ctx.currentCompressor;
  if (compressor?.hasSummarizedArtifact(artifactId)) {
    const summarized = compressor.getSummarizedArtifact(artifactId);
    logger.info(
      { artifactId, toolCallId },
      'Blocked retrieval of artifact already summarized in compression'
    );
    return {
      artifactId,
      status: 'already_summarized',
      key_findings: summarized?.key_findings ?? [],
      hint: `This artifact's key findings are already in your compressed context. Use them directly to answer. To pass this artifact to a tool, use the { "${SENTINEL_KEY.ARTIFACT}": "${artifactId}", "${SENTINEL_KEY.TOOL}": "${summarized?.tool_call_id ?? toolCallId}" } sentinel instead of retrieving it.`,
    };
Comment on lines +24 to +36 (Contributor):

🟡 Minor: Missing test coverage for early return path

Issue: This new code path that returns status: 'already_summarized' when an artifact is in the compression summary has no test coverage.

Why: This code path:

  1. Returns a different response shape that the agent needs to handle
  2. Constructs a hint with SENTINEL_KEY values — if constants change, tests should fail
  3. Uses null-safety fallbacks (summarized?.key_findings ?? []) that should be validated

Fix: Add tests in the appropriate test file:

describe('getArtifactTools with summarized artifact', () => {
  it('should return already_summarized response when artifact is in compression summary', async () => {
    const mockCompressor = {
      hasSummarizedArtifact: vi.fn().mockReturnValue(true),
      getSummarizedArtifact: vi.fn().mockReturnValue({
        key_findings: ['finding1', 'finding2'],
        tool_call_id: 'original-call-123',
      }),
    };
    
    const ctx = { ...baseAgentRunContext, currentCompressor: mockCompressor };
    const tool = getArtifactTools(ctx);
    const result = await tool.execute({ artifactId: 'artifact-xyz', toolCallId: 'call-456' });
    
    expect(result.status).toBe('already_summarized');
    expect(result.key_findings).toEqual(['finding1', 'finding2']);
    expect(result.hint).toContain('$artifact');
  });
});

  }
Comment on lines +24 to +37 (Contributor):

The guard correctly prevents re-fetching a summarized artifact and points the model at the sentinel syntax. One concern: hasSummarizedArtifact matches on artifact.id (the compression-generated compress_* ID), but the model calls get_reference_artifact with an artifactId that came from the artifact list. Are these guaranteed to be the same ID? If the distillation produces the compression-generated ID in related_artifacts[].id but the model sees a different ID in available_artifacts, this guard will never trigger. Worth verifying the ID mapping is consistent end-to-end.
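If the two ID spaces can diverge, a cheap invariant check in a test catches it early. A sketch, assuming both ID lists are available at assertion time (all names here are hypothetical, not from the codebase):

```typescript
// Hypothetical shape, mirroring related_artifacts entries in the summary.
interface SummarizedArtifact {
  id: string;
  tool_call_id: string;
  key_findings: string[];
}

// Returns summary IDs that the model's artifact list does not contain.
// If this is non-empty, hasSummarizedArtifact(artifactId) can never match
// and the retrieval guard is dead code.
function findUnmatchedSummaryIds(
  relatedArtifacts: SummarizedArtifact[],
  availableArtifactIds: string[]
): string[] {
  const available = new Set(availableArtifactIds);
  return relatedArtifacts.map((a) => a.id).filter((id) => !available.has(id));
}

// e.g. a compression-generated ID vs the IDs the model actually sees:
const unmatched = findUnmatchedSummaryIds(
  [{ id: 'compress_abc', tool_call_id: 'call-1', key_findings: ['f1'] }],
  ['artifact-123']
);
// unmatched is ['compress_abc'], i.e. the guard would never trigger for it
```

An assertion like this in an integration test would fail the moment distillation starts emitting IDs the artifact list does not expose.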


  const streamRequestId = ctx.streamRequestId ?? '';
  const artifactService = agentSessionManager.getArtifactService(streamRequestId);

70 changes: 69 additions & 1 deletion agents-api/src/domains/run/compression/BaseCompressor.ts
@@ -1,6 +1,6 @@
import { randomUUID } from 'node:crypto';
import type { ModelSettings } from '@inkeep/agents-core';
import { getLedgerArtifacts, SPAN_KEYS } from '@inkeep/agents-core';
import { getLedgerArtifacts, SPAN_KEYS, upsertLedgerArtifact } from '@inkeep/agents-core';
import { type Span, SpanStatusCode } from '@opentelemetry/api';
import runDbClient from '../../../data/db/runDbClient';
import { getLogger } from '../../../logger';
@@ -528,6 +528,16 @@ export abstract class BaseCompressor {
);

this.cumulativeSummary = summary;

if (summary.related_artifacts?.length) {
  this.persistArtifactKeyFindings(summary.related_artifacts).catch((err) =>
    logger.warn(
      { err: err instanceof Error ? err.message : String(err) },
      'Failed to persist key_findings to artifacts'
    )
  );
Contributor:

🟠 MAJOR: Fire-and-forget async call masks persistence failures

Issue: The persistArtifactKeyFindings() method is called fire-and-forget with .catch() that only logs a warning. There's no mechanism for the caller to know if persistence failed, and no retry logic.

Why: Silent data loss scenario:

  • If persistence fails (DB connection issue, constraint violation, etc.), the key_findings data will never be persisted
  • The compression summary in memory will contain key_findings, but database artifacts won't
  • On service restart or session reload, key_findings would be lost
  • Debugging is difficult — only a WARN log entry with minimal context (no artifact IDs, no correlation ID)

Fix: If persistence is critical, make it synchronous and propagate errors:

if (summary.related_artifacts?.length) {
  try {
    await this.persistArtifactKeyFindings(summary.related_artifacts);
  } catch (err) {
    logger.error(
      { 
        err: err instanceof Error ? err.message : String(err),
        conversationId: this.conversationId,
        artifactIds: summary.related_artifacts.map(a => a.id),
      },
      'Failed to persist key_findings to artifacts - summary may be inconsistent'
    );
    // Decide: throw to fail compression, or continue with explicit degraded state
  }
}

If fire-and-forget is intentional, at minimum add richer logging with artifact IDs and conversation context.

}
Comment on lines +544 to +561 (Contributor):

The .catch() fire-and-forget is fine for a best-effort persist, but since persistArtifactKeyFindings currently has no effect (see the upsert bug on the upsertLedgerArtifact call below), this entire block is inert. Once the persist path is fixed, consider whether a failure here should propagate — if the key findings aren't stored, the retrieval guard in default-tools.ts will still return them from memory, creating an inconsistency after restart.


return summary;
}

@@ -604,6 +614,64 @@
return this.cumulativeSummary;
}

hasSummarizedArtifact(artifactId: string): boolean {
  return this.cumulativeSummary?.related_artifacts?.some((a) => a.id === artifactId) ?? false;
}

getSummarizedArtifact(
  artifactId: string
): { key_findings: string[]; tool_call_id: string } | null {
  const artifact = this.cumulativeSummary?.related_artifacts?.find((a) => a.id === artifactId);
  if (!artifact) return null;
  return {
    key_findings: artifact.key_findings,
    tool_call_id: artifact.tool_call_id,
  };
Comment on lines +639 to +651 (Contributor):

🟡 Minor: Missing test coverage for new public methods

Issue: The new hasSummarizedArtifact() and getSummarizedArtifact() methods are called in the critical path of getArtifactTools but have no test coverage.

Why: These methods determine whether to short-circuit artifact retrieval. If broken (e.g., a typo changing a.id to a.artifactId), the agent could:

  1. Fail to recognize already-summarized artifacts, causing redundant retrievals that bloat context
  2. Return incorrect key_findings, causing the agent to use stale/wrong information

Fix: Add unit tests in BaseCompressor.test.ts:

describe('hasSummarizedArtifact', () => {
  it('should return true when artifact exists in related_artifacts', () => {
    compressor['cumulativeSummary'] = {
      ...baseSummary,
      related_artifacts: [
        { id: 'artifact-123', name: 'Test', tool_name: 'search', tool_call_id: 'call-1', content_type: 'results', key_findings: ['finding1'] }
      ],
    };
    expect(compressor.hasSummarizedArtifact('artifact-123')).toBe(true);
  });

  it('should return false when cumulativeSummary is null', () => {
    compressor['cumulativeSummary'] = null;
    expect(compressor.hasSummarizedArtifact('any-id')).toBe(false);
  });
});

describe('getSummarizedArtifact', () => {
  it('should return key_findings and tool_call_id when artifact exists', () => {
    // ... test implementation
  });
});

}

protected async persistArtifactKeyFindings(
  relatedArtifacts: NonNullable<ConversationSummary['related_artifacts']>
): Promise<void> {
  for (const artifact of relatedArtifacts) {
    if (!artifact.id || !artifact.key_findings?.length) continue;
Contributor:

Nit: artifact.id is typed as z.string() (non-nullable) from the ConversationSummarySchema, so the !artifact.id guard is dead code — it can never be falsy. Consider removing it for clarity, or if you genuinely need to guard against empty strings, use !artifact.id.length.

    try {
      const existing = await getLedgerArtifacts(runDbClient)({
        scopes: { tenantId: this.tenantId, projectId: this.projectId },
        artifactId: artifact.id,
      });
      if (existing.length === 0) continue;

      const dbArtifact = existing[0];
      const parts = (dbArtifact.parts ?? []) as any[];
      if (parts.length > 0 && parts[0]?.data?.summary) {
        parts[0].data.summary = {
          ...parts[0].data.summary,
          key_findings: artifact.key_findings,
        };
      }
Contributor:

Mutates a shared object in place. parts aliases dbArtifact.parts — the parts[0].data.summary = { ... } assignment mutates the fetched DB row object. This is harmless today since the object is not reused, but if getLedgerArtifacts ever adds caching, this will silently corrupt the cache. Consider cloning:

const parts = structuredClone(dbArtifact.parts ?? []) as any[];
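To illustrate why the clone matters, a minimal self-contained demo of the aliasing, with plain objects standing in for the fetched ledger row (the shape is illustrative):

```typescript
// Plain object standing in for the row returned by getLedgerArtifacts.
const dbArtifact = {
  parts: [{ data: { summary: { key_findings: ['old'] } } }],
};

// `aliased` is the SAME array object, so the patch rewrites the row in place:
const aliased = dbArtifact.parts;
aliased[0].data.summary = { ...aliased[0].data.summary, key_findings: ['patched'] };
// dbArtifact.parts[0].data.summary.key_findings is now ['patched']

// structuredClone (Node 17+) breaks the alias; the fetched row stays untouched:
const dbArtifact2 = {
  parts: [{ data: { summary: { key_findings: ['old'] } } }],
};
const cloned = structuredClone(dbArtifact2.parts);
cloned[0].data.summary = { ...cloned[0].data.summary, key_findings: ['patched'] };
// dbArtifact2.parts[0].data.summary.key_findings is still ['old']
```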

Contributor:

🟠 MAJOR: Silent no-op when artifact structure doesn't match expected format

Issue: The code checks if (parts.length > 0 && parts[0]?.data?.summary) but has no else branch. If this condition is false (e.g., artifact has parts: [] or a different structure), the artifact is still upserted with unmodified parts, potentially overwriting existing data without adding key_findings.

Why: This creates a silent failure path where:

  1. The mutation at lines 647-650 never executes
  2. The upsert still runs with original (unmodified) parts
  3. No indication this happened — no log, no error
  4. Could corrupt artifact data if there's a race condition between read and write

Fix: Add explicit handling for unexpected structures:

const parts = (dbArtifact.parts ?? []) as any[];
if (parts.length === 0) {
  logger.debug({ artifactId: artifact.id }, 'Artifact has no parts, skipping key_findings persistence');
  continue; // Skip this artifact entirely
}
if (!parts[0]?.data?.summary) {
  logger.debug(
    { artifactId: artifact.id, partKind: parts[0]?.kind },
    'Artifact first part has no summary structure, skipping key_findings persistence'
  );
  continue; // Skip rather than risk overwriting with unmodified data
}


      await upsertLedgerArtifact(runDbClient)({
        scopes: { tenantId: this.tenantId, projectId: this.projectId },
        contextId: this.conversationId,
        taskId: dbArtifact.taskId ?? `task_${this.conversationId}-${this.sessionId}`,
        toolCallId: dbArtifact.toolCallId,
        artifact: {
          ...dbArtifact,
          parts,
        },
      });
Contributor:

Bug: upsertLedgerArtifact is insert-only — this update is silently lost.

upsertLedgerArtifact calls db.insert() and on a unique-constraint violation returns { created: false, existing } without writing anything. Since compression artifacts already exist at this point (they were created in createNewArtifact / saveToolResultsAsArtifacts), the insert will always conflict and the mutated parts with key_findings will never reach the database.

You need either:

  1. A proper UPDATE query (e.g., db.update(ledgerArtifacts).set({ parts }).where(...)) — likely the right call here since you're patching a single field on an existing row.
  2. An onConflictDoUpdate in the upsert function.

Without this fix, the key_findings data is only held in the in-memory cumulativeSummary and will be lost on process restart.
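A toy sketch of the semantic difference, using an in-memory Map in place of the real tables (insertOnly and updateParts are illustrative stand-ins, not the actual @inkeep/agents-core functions):

```typescript
// Illustrative only: Map-backed stand-ins for insert-only "upsert" vs a real update.
type Row = { id: string; parts: unknown[] };

// Mirrors the reported behavior: on conflict, return the existing row untouched.
function insertOnly(db: Map<string, Row>, row: Row): { created: boolean; stored: Row } {
  const existing = db.get(row.id);
  if (existing) return { created: false, stored: existing }; // patched parts silently dropped
  db.set(row.id, row);
  return { created: true, stored: row };
}

// What the fix needs: an update path that actually writes the new parts,
// analogous to db.update(ledgerArtifacts).set({ parts }).where(...).
function updateParts(db: Map<string, Row>, id: string, parts: unknown[]): boolean {
  const existing = db.get(id);
  if (!existing) return false;
  db.set(id, { ...existing, parts });
  return true;
}

const ledger = new Map<string, Row>();
ledger.set('artifact-1', { id: 'artifact-1', parts: [{ data: { summary: {} } }] });

// The compression artifact already exists, so the insert-only "upsert" is a no-op:
insertOnly(ledger, { id: 'artifact-1', parts: [{ data: { summary: { key_findings: ['f1'] } } }] });

// A dedicated update actually lands the patched parts:
updateParts(ledger, 'artifact-1', [{ data: { summary: { key_findings: ['f1'] } } }]);
```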

Contributor:

🔴 CRITICAL: upsertLedgerArtifact is insert-only — key_findings will never be persisted

Issue: The persistArtifactKeyFindings method calls upsertLedgerArtifact to update existing artifacts with key_findings, but this function doesn't actually update existing records. When a duplicate key error occurs, it returns { created: false, existing: ... } without applying any changes.

Why: Looking at ledgerArtifacts.ts:166-196, the function uses db.insert() without .onConflictDoUpdate(). The modified parts array with key_findings is never persisted to the database. This means:

  1. The in-memory cumulativeSummary will have key_findings
  2. The database artifacts will NOT have them
  3. On service restart/session reload, key_findings are lost
  4. The get_artifact_full early-return path will show stale data

Fix: Either:

  1. Add a dedicated updateLedgerArtifact function that uses db.update() with a WHERE clause, or
  2. Modify upsertLedgerArtifact to use .onConflictDoUpdate() similar to other upserts in the codebase (e.g., upsertWorkAppSlackChannelAgentConfig)


    } catch (error) {
      logger.warn(
        {
          artifactId: artifact.id,
          error: error instanceof Error ? error.message : String(error),
        },
        'Failed to persist key_findings to artifact'
      );
    }
  }
}

cleanup(options: { resetSummary?: boolean; keepRecentToolCalls?: number } = {}): void {
  const { resetSummary = false, keepRecentToolCalls = 0 } = options;
  if (keepRecentToolCalls > 0) {