Skip to content
16 changes: 16 additions & 0 deletions agents-api/src/domains/run/agents/tools/default-tools.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ import { type Tool, type ToolSet, tool } from 'ai';
import { getLogger } from '../../../../logger';
import { formatOversizedRetrievalReason } from '../../artifacts/artifact-utils';
import { getModelAwareCompressionConfig } from '../../compression/BaseCompressor';
import { SENTINEL_KEY } from '../../constants/artifact-syntax';
import { agentSessionManager } from '../../session/AgentSession';
import type { AgentRunContext } from '../agent-types';
import { wrapToolWithStreaming } from './tool-wrapper';
Expand All @@ -20,6 +21,21 @@ export function getArtifactTools(ctx: AgentRunContext): Tool<any, any> {
execute: async ({ artifactId, toolCallId }) => {
logger.info({ artifactId, toolCallId }, 'get_artifact_full executed');

const compressor = ctx.currentCompressor;
if (compressor?.hasSummarizedArtifact(artifactId)) {
const summarized = compressor.getSummarizedArtifact(artifactId);
logger.info(
{ artifactId, toolCallId },
'Blocked retrieval of artifact already summarized in compression'
);
return {
artifactId,
status: 'already_summarized',
key_findings: summarized?.key_findings ?? [],
hint: `This artifact's key findings are already in your compressed context. Use them directly to answer. To pass this artifact to a tool, use { "${SENTINEL_KEY.ARTIFACT}": "${artifactId}", "${SENTINEL_KEY.TOOL}": "${summarized?.tool_call_id ?? toolCallId}" } sentinel instead of retrieving it.`,
};
Comment on lines +24 to +36
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Missing test coverage for early return path

Issue: This new code path that returns status: 'already_summarized' when an artifact is in the compression summary has no test coverage.

Why: This code path:

  1. Returns a different response shape that the agent needs to handle
  2. Constructs a hint with SENTINEL_KEY values — if constants change, tests should fail
  3. Uses null-safety fallbacks (summarized?.key_findings ?? []) that should be validated

Fix: Add tests in the appropriate test file:

describe('getArtifactTools with summarized artifact', () => {
  it('should return already_summarized response when artifact is in compression summary', async () => {
    const mockCompressor = {
      hasSummarizedArtifact: vi.fn().mockReturnValue(true),
      getSummarizedArtifact: vi.fn().mockReturnValue({
        key_findings: ['finding1', 'finding2'],
        tool_call_id: 'original-call-123',
      }),
    };
    
    const ctx = { ...baseAgentRunContext, currentCompressor: mockCompressor };
    const tool = getArtifactTools(ctx);
    const result = await tool.execute({ artifactId: 'artifact-xyz', toolCallId: 'call-456' });
    
    expect(result.status).toBe('already_summarized');
    expect(result.key_findings).toEqual(['finding1', 'finding2']);
    expect(result.hint).toContain('$artifact');
  });
});

}
Comment on lines +24 to +37
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guard correctly prevents re-fetching a summarized artifact and points the model at the sentinel syntax. One concern: hasSummarizedArtifact matches on artifact.id (the compression-generated compress_* ID), but the model calls get_reference_artifact with an artifactId that came from the artifact list. Are these guaranteed to be the same ID? If the distillation produces the compression-generated ID in related_artifacts[].id but the model sees a different ID in available_artifacts, this guard will never trigger. Worth verifying the ID mapping is consistent end-to-end.


const streamRequestId = ctx.streamRequestId ?? '';
const artifactService = agentSessionManager.getArtifactService(streamRequestId);

Expand Down
105 changes: 104 additions & 1 deletion agents-api/src/domains/run/compression/BaseCompressor.ts
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,9 @@ import { randomUUID } from 'node:crypto';
import type { ModelSettings } from '@inkeep/agents-core';
import {
estimateTokens as estimateTokensUtil,
GENERATION_TYPES,
getLedgerArtifacts,
SPAN_KEYS,
updateLedgerArtifactParts,
} from '@inkeep/agents-core';
import { type Span, SpanStatusCode } from '@opentelemetry/api';
import runDbClient from '../../../data/db/runDbClient';
Expand Down Expand Up @@ -539,6 +539,26 @@ export abstract class BaseCompressor {
);

this.cumulativeSummary = summary;

if (summary.related_artifacts?.length) {
try {
const persistResult = await this.persistArtifactKeyFindings(summary.related_artifacts);
logger.debug(
{ ...persistResult, conversationId: this.conversationId },
'Artifact key_findings persistence completed'
);
} catch (err) {
logger.warn(
{
conversationId: this.conversationId,
artifactIds: summary.related_artifacts.map((a) => a.id),
err: err instanceof Error ? err.message : String(err),
},
'Failed to persist key_findings to artifacts'
);
}
}
Comment on lines +544 to +561
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The .catch() fire-and-forget is fine for a best-effort persist, but since persistArtifactKeyFindings currently has no effect (see the upsert bug above), this entire block is inert. Once the persist path is fixed, consider whether a failure here should propagate — if the key findings aren't stored, the retrieval guard in default-tools.ts will still return them from memory, creating an inconsistency after restart.


return summary;
}

Expand Down Expand Up @@ -615,6 +635,89 @@ export abstract class BaseCompressor {
return this.cumulativeSummary;
}

hasSummarizedArtifact(artifactId: string): boolean {
return this.cumulativeSummary?.related_artifacts?.some((a) => a.id === artifactId) ?? false;
}

getSummarizedArtifact(
artifactId: string
): { key_findings: string[]; tool_call_id: string } | null {
const artifact = this.cumulativeSummary?.related_artifacts?.find((a) => a.id === artifactId);
if (!artifact) return null;
return {
key_findings: artifact.key_findings,
tool_call_id: artifact.tool_call_id,
};
Comment on lines +639 to +651
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Missing test coverage for new public methods

Issue: The new hasSummarizedArtifact() and getSummarizedArtifact() methods are called in the critical path of getArtifactTools but have no test coverage.

Why: These methods determine whether to short-circuit artifact retrieval. If broken (e.g., a typo changing a.id to a.artifactId), the agent could:

  1. Fail to recognize already-summarized artifacts, causing redundant retrievals that bloat context
  2. Return incorrect key_findings, causing the agent to use stale/wrong information

Fix: Add unit tests in BaseCompressor.test.ts:

describe('hasSummarizedArtifact', () => {
  it('should return true when artifact exists in related_artifacts', () => {
    compressor['cumulativeSummary'] = {
      ...baseSummary,
      related_artifacts: [
        { id: 'artifact-123', name: 'Test', tool_name: 'search', tool_call_id: 'call-1', content_type: 'results', key_findings: ['finding1'] }
      ],
    };
    expect(compressor.hasSummarizedArtifact('artifact-123')).toBe(true);
  });

  it('should return false when cumulativeSummary is null', () => {
    compressor['cumulativeSummary'] = null;
    expect(compressor.hasSummarizedArtifact('any-id')).toBe(false);
  });
});

describe('getSummarizedArtifact', () => {
  it('should return key_findings and tool_call_id when artifact exists', () => {
    // ... test implementation
  });
});

}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Missing test coverage for hasSummarizedArtifact() and getSummarizedArtifact() methods

Issue: These new public methods are called in the critical path of getArtifactTools but have no test coverage in BaseCompressor.test.ts.

Why: These methods determine whether to short-circuit artifact retrieval. If broken (e.g., a typo changing a.id to a.artifactId), the agent could fail to recognize already-summarized artifacts, causing redundant retrievals that bloat context, or return incorrect key_findings.

Fix: Add unit tests for edge cases:

  • hasSummarizedArtifact when cumulativeSummary is null
  • hasSummarizedArtifact when artifact exists vs doesn't exist in related_artifacts
  • getSummarizedArtifact returning correct key_findings and tool_call_id

Refs:


protected async persistArtifactKeyFindings(
relatedArtifacts: NonNullable<ConversationSummary['related_artifacts']>
): Promise<{ persisted: number; skipped: number; failed: number }> {
const result = { persisted: 0, skipped: 0, failed: 0 };

for (const artifact of relatedArtifacts) {
if (!artifact.key_findings?.length) {
result.skipped++;
continue;
}
try {
const existing = await getLedgerArtifacts(runDbClient)({
scopes: { tenantId: this.tenantId, projectId: this.projectId },
artifactId: artifact.id,
});
if (existing.length === 0) {
logger.debug(
{ artifactId: artifact.id },
'Artifact not found in DB, skipping key_findings persistence'
);
result.skipped++;
continue;
}

const dbArtifact = existing[0];
const parts = structuredClone(dbArtifact.parts ?? []) as any[];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting to any[] after structuredClone drops all type safety on the parts structure. The subsequent parts[0]?.data?.summary access chain is fragile — if the part shape ever changes (e.g. data gets renamed, or summary moves), this will silently produce malformed artifacts.

Consider adding a minimal type guard or narrowing before mutating, e.g.:

const firstPart = parts[0];
if (firstPart?.kind !== 'data' || !firstPart.data?.summary) { ... }

This also makes the kind === 'data' assumption explicit rather than hidden.

if (parts.length === 0 || !parts[0]?.data?.summary) {
logger.debug(
{ artifactId: artifact.id, hasParts: parts.length > 0 },
'Artifact has no summary structure, skipping key_findings persistence'
);
result.skipped++;
continue;
}

parts[0].data.summary = {
...parts[0].data.summary,
key_findings: artifact.key_findings,
};

const updated = await updateLedgerArtifactParts(runDbClient)({
scopes: { tenantId: this.tenantId, projectId: this.projectId },
artifactId: artifact.id,
parts,
});

if (updated) {
result.persisted++;
} else {
logger.warn({ artifactId: artifact.id }, 'updateLedgerArtifactParts matched no rows');
result.failed++;
}
} catch (error) {
result.failed++;
logger.warn(
{
artifactId: artifact.id,
conversationId: this.conversationId,
error: error instanceof Error ? error.message : String(error),
},
'Failed to persist key_findings to artifact'
);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sequential for...of loop issues a SELECT + UPDATE per artifact. For conversations with many artifacts this creates N×2 round-trips. Not a problem today if artifact counts per compression are small, but worth a // TODO: batch if artifact count grows note if you expect this to scale.


return result;
}

cleanup(options: { resetSummary?: boolean; keepRecentToolCalls?: number } = {}): void {
const { resetSummary = false, keepRecentToolCalls = 0 } = options;
if (keepRecentToolCalls > 0) {
Expand Down
22 changes: 22 additions & 0 deletions packages/agents-core/src/data-access/runtime/ledgerArtifacts.ts
Original file line number Diff line number Diff line change
Expand Up @@ -366,6 +366,28 @@ export const getLedgerArtifactsByContext =
);
};

/**
* Update the parts column of an existing ledger artifact by ID.
* Used to persist modifications (e.g. key_findings) to artifacts that already exist.
*/
export const updateLedgerArtifactParts =
(db: AgentsRunDatabaseClient) =>
async (params: {
scopes: ProjectScopeConfig;
artifactId: string;
parts: Part[];
}): Promise<boolean> => {
const { scopes, artifactId, parts } = params;

const result = await db
.update(ledgerArtifacts)
.set({ parts, updatedAt: new Date().toISOString() })
.where(and(projectScopedWhere(ledgerArtifacts, scopes), eq(ledgerArtifacts.id, artifactId)))
.returning();

return result.length > 0;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Minor: Missing test coverage for updateLedgerArtifactParts

Issue: This new data-access function has no test coverage in ledgerArtifacts.test.ts.

Why: This function is critical to the key_findings persistence feature. Without tests, regressions (e.g., incorrect WHERE clause, missing scope filtering) could go undetected.

Fix: Add tests covering:

  • Successful update returns true
  • Non-existent artifact returns false
  • Tenant/project scoping is enforced (can't update artifacts from other projects)

Refs:


/**
* Delete ledger artifacts by task ID
*/
Expand Down
Loading