Use actual AI SDK token usage for compression and fix pricing lookup #2803
tim-inkeep wants to merge 30 commits into main
Add append-only usage_events table for tracking LLM generation usage across all call sites. Includes token counts (input, output, reasoning, cached), dynamic pricing cost estimate, generation type classification, and OTel correlation fields. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two-tier dynamic pricing: gateway getAvailableModels() as primary (when AI_GATEWAY_API_KEY is set), models.dev API as universal fallback. In-memory cache with periodic refresh (1h gateway, 6h models.dev). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
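The lookup order described in this commit (gateway first, models.dev as fallback) could look roughly like the sketch below. It is an illustration only, not the PR's PricingService: the `ModelPricing` shape, the per-million-token unit, and both function names are assumptions made for the example.

```typescript
// Hypothetical pricing record; the per-million-token unit is an assumption.
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

type PricingTable = Map<string, ModelPricing>;

// The primary table wins; the fallback is consulted only on a miss.
function lookupPricing(
  modelName: string,
  gateway: PricingTable,
  modelsDev: PricingTable,
): ModelPricing | null {
  return gateway.get(modelName) ?? modelsDev.get(modelName) ?? null;
}

// Cost estimate in USD from token counts and per-million-token prices.
function estimateCostUsd(
  pricing: ModelPricing,
  inputTokens: number,
  outputTokens: number,
): number {
  return (
    (inputTokens * pricing.inputPerMTok + outputTokens * pricing.outputPerMTok) / 1_000_000
  );
}
```

Returning `null` on a double miss keeps "pricing unavailable" distinguishable from a genuine zero cost.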
Insert, query (paginated), and summary aggregation functions for usage_events table. Supports groupBy model/agent/day/generation_type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
recordUsage() extracts tokens from AI SDK responses, looks up pricing, sets OTel span attributes, and fire-and-forgets a usage_event insert. New SPAN_KEYS: total_tokens, reasoning_tokens, cached_read_tokens, response.model, cost.estimated_usd, generation.step_count, generation.type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add usage, totalUsage, and response fields to ResolvedGenerationResponse. resolveGenerationResponse now resolves these Promise-based getters from the AI SDK alongside steps/text/finishReason/output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Call recordUsage() after resolveGenerationResponse in runGenerate(), capturing tenant/project/agent/subAgent context, model, streaming status, and finish reason. Fire-and-forget, non-blocking. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add recordUsage() calls for status_update and artifact_metadata generation types in AgentSession. Compression call sites deferred (need context threading through function signatures). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Consolidate estimateTokens() and AssembleResult into packages/agents-core/src/utils/token-estimator.ts. Update all 10 import sites in agents-api to use @inkeep/agents-core. Removes duplicate code and prepares for usage tracker integration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace recordUsage() with trackedGenerate() — wraps generateText/streamText calls to automatically record usage on success AND failure. Failed calls check error type: 429/network = 0 tokens, other errors = estimated input tokens from prompt. All call sites (generate.ts, AgentSession status updates + artifact metadata, EvaluationService simulation) now use the wrapper consistently. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
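The failure rule in this commit (429 or network errors count as 0 tokens, other errors use an estimate from the prompt) might be sketched as below. The error shape, the list of network error codes, and the four-characters-per-token heuristic are all assumptions for illustration; the PR's actual trackedGenerate() is not shown on this page.

```typescript
// Hypothetical error shape; real SDK errors may expose status differently.
interface GenerationErrorLike {
  statusCode?: number;
  code?: string;
}

// Assumed set of codes treated as network failures.
const NETWORK_ERROR_CODES = new Set(['ECONNRESET', 'ECONNREFUSED', 'ETIMEDOUT', 'ENOTFOUND']);

// Rough character-based estimate; 4 characters per token is an assumption.
function estimateTokensFromText(text: string): number {
  return Math.ceil(text.length / 4);
}

// Rate-limited and network failures are recorded as 0 input tokens;
// any other failure falls back to an estimate from the prompt text.
function inputTokensForFailedCall(error: GenerationErrorLike, prompt: string): number {
  const rateLimited = error.statusCode === 429;
  const networkFailure = error.code !== undefined && NETWORK_ERROR_CODES.has(error.code);
  if (rateLimited || networkFailure) return 0;
  return estimateTokensFromText(prompt);
}
```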
GET /manage/v1/usage/summary — aggregated usage by model/agent/day/generation_type with optional projectId filter. GET /manage/v1/usage/events — paginated individual usage events with filters for project, agent, model, generation type. Both enforce tenant auth with project-level access checks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tenant-level usage dashboard at /{tenantId}/usage with:
- Summary stats: total tokens, estimated cost, generation count, models
- Token usage over time chart (daily buckets via AreaChartCard)
- Breakdown tables by model and generation type
- Project filter and date range picker
- Nav item added to sidebar
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract UsageDashboard, UsageStatCards, UsageBreakdownTable into
reusable component. Both tenant-level (/{tenantId}/usage) and
project-level (/{tenantId}/projects/{projectId}/usage) pages import
the shared component. Register Usage tag in OpenAPI spec + docs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Route handlers use c.get('tenantId') from middleware context
- Client fetches through /api/usage Next.js proxy (forwards cookies)
- Initialize PricingService at server startup for cost estimation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
resolvedModel from the AI SDK doesn't include provider prefix (e.g. 'claude-sonnet-4-6' not 'anthropic/claude-sonnet-4-6'). Parse requestedModel once at the top and use the extracted modelName for pricing lookup, falling back to resolvedModel when available. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cking
Data layer:
- Add steps JSONB column for per-step token breakdown
- Populate traceId/spanId from active OTel span
- Add conversation/message groupBy + conversationId filter
- Thread agentId/conversationId through compression call chain
- Wrap compression generateText calls with trackedGenerate
Traces integration:
- Conversation detail route fetches usage events and merges cost into activities by spanId (with parentSpanId fallback)
- Cost shows on timeline items and span detail panels
- Usage Cost card on conversation detail page
UI:
- Events table with pagination, trace links, agent/sub-agent columns
- 50/50 chart + events layout
- conversationId filter in usage API client
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
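The cost merge by spanId with a parentSpanId fallback could be approximated as follows. The `Activity` and `UsageEvent` shapes and the function name are hypothetical, and the fallback direction (looking up the activity's parent span when its own span has no usage event) is an assumption about what the commit means.

```typescript
// Hypothetical shapes for illustration only.
interface Activity {
  spanId: string;
  parentSpanId?: string;
  costUsd?: number;
}

interface UsageEvent {
  spanId: string;
  estimatedCostUsd: number;
}

function mergeCostIntoActivities(activities: Activity[], events: UsageEvent[]): Activity[] {
  // Sum cost per span so multiple events on one span are not lost.
  const costBySpan = new Map<string, number>();
  for (const e of events) {
    costBySpan.set(e.spanId, (costBySpan.get(e.spanId) ?? 0) + e.estimatedCostUsd);
  }
  return activities.map((a) => {
    const direct = costBySpan.get(a.spanId);
    const viaParent =
      a.parentSpanId !== undefined ? costBySpan.get(a.parentSpanId) : undefined;
    const cost = direct ?? viaParent;
    // Activities with no matching event are returned unchanged.
    return cost === undefined ? a : { ...a, costUsd: cost };
  });
}
```

One caveat with this sketch: if several activities share a parent span, each of them would display the parent's cost, so a real implementation may need to attribute it only once.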
Write path:
- Remove Postgres insert from persistEvent, keep OTel span attributes
- Add all schema fields as span attributes (requested_model, provider, status, streamed, byok, finish_reason, duration_ms, error_code, message_id)
- Add UsageCostSpanProcessor that enriches doGenerate/doStream spans with per-step cost from PricingService before export
- Standardize experimental_telemetry.metadata.generationType across all call sites (status_update, artifact_metadata, compression)
- Extract USAGE_GENERATION_TYPES constant for shared filter
Read path:
- Add getUsageCostSummary (SigNoz aggregation by model/agent/type/conversation)
- Add getUsageEventsList (SigNoz LIST query for individual spans)
- Dashboard fetches from SigNoz via existing signoz-stats client
- Events table and chart derived from SigNoz span data
- Filter to only spans with known generationType to reduce noise
trackedGenerate no longer takes db parameter.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🦋 Changeset detected. Latest commit: d61caec. The changes in this PR will be included in the next version bump. This PR includes changesets to release 10 packages.
@pullfrog, please review my PR
Solid PR — the shift from heuristic token estimation to actual AI SDK usage data is well-motivated and the implementation is clean. The wrapLanguageModel approach to fix pricing lookups is elegant. Two medium-severity issues (potential PricingService interval leak, v3 middleware spec version risk) and a few minor items below.

    if (this.initialized) return;
    this.initialized = true;

    await Promise.allSettled([this.refreshGateway(), this.refreshModelsDev()]);

Medium: initialize() is not idempotent against concurrent callers. The initialized guard prevents re-entry but uses a synchronous boolean — if two callers race past the guard before the first sets this.initialized = true, both will set up duplicate intervals. Consider storing the init promise:

    private initPromise: Promise<void> | null = null;

    async initialize(): Promise<void> {
      if (!this.initPromise) {
        this.initPromise = this.doInitialize();
      }
      return this.initPromise;
    }

Alternatively, guard the interval creation behind this.gatewayInterval == null to be safe.
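For reference, a self-contained version of the shared-promise pattern might look like the sketch below. The class name, the injected `refresh` callback, and the `intervalsCreated` counter are hypothetical and exist only to make the behaviour observable; the 1h and 6h periods come from the PR's commit description. This is not the PR's PricingService.

```typescript
type Timer = ReturnType<typeof setInterval>;

class PricingCacheSketch {
  private initPromise: Promise<void> | null = null;
  private timers: Timer[] = [];
  private readonly refresh: () => Promise<void>;
  intervalsCreated = 0; // exposed only so the sketch can be checked

  constructor(refresh: () => Promise<void>) {
    this.refresh = refresh;
  }

  // Not declared async on purpose: every caller gets the same promise object.
  initialize(): Promise<void> {
    if (!this.initPromise) {
      this.initPromise = this.doInitialize();
    }
    return this.initPromise;
  }

  private async doInitialize(): Promise<void> {
    // Two timers, mirroring the gateway (1h) and models.dev (6h) refreshes.
    this.timers.push(setInterval(() => void this.refresh().catch(() => {}), 60 * 60 * 1000));
    this.timers.push(setInterval(() => void this.refresh().catch(() => {}), 6 * 60 * 60 * 1000));
    this.intervalsCreated += 2;
    // allSettled never rejects, so a failed first refresh does not poison initPromise.
    await Promise.allSettled([this.refresh()]);
  }

  // Clears timers and resets the promise so initialize() can run again.
  destroy(): void {
    for (const t of this.timers) clearInterval(t);
    this.timers = [];
    this.initPromise = null;
  }
}
```

With this shape, a second caller also waits for the first refresh to finish instead of returning early while initialization is still in flight.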

    if (this.modelsDevInterval) clearInterval(this.modelsDevInterval);
    this.gatewayInterval = null;
    this.modelsDevInterval = null;
    this.initialized = false;

Minor: destroy() does not clear initPromise / caches. If someone calls destroy() then initialize() again, this.initialized is false but the caches still contain stale data from the previous lifecycle. Not blocking — the singletons are long-lived in practice — but worth noting for test hygiene.

    }

    export const usageCostMiddleware: LanguageModelMiddleware = {
      specificationVersion: 'v3',

Medium: specificationVersion: 'v3' ties this to an unreleased/experimental middleware API version. If the AI SDK ships a breaking change to the v3 spec (usage shape, callback signatures), this will silently break cost tracking. Confirm this version is stable in the ai package version pinned in your lockfile. If not, add a comment noting the version dependency.

    const result = await doGenerate();

    try {
      const inputTokens = result.usage.inputTokens.total ?? 0;

Minor: result.usage.inputTokens.total assumes a nested .total property. This matches the v3 spec's structured usage shape, but the old v1/v2 shape used flat inputTokens: number. If any codepath bypasses wrapLanguageModel and hits this middleware with the old shape, it will throw. The try/catch on line 77 guards against this, so it's safe — just noting the implicit contract.
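A defensive reader that accepts both the nested and the flat usage shape could look like this. It is a hypothetical sketch: the type names and `extractUsageTokensSketch` are made up for illustration and are not the helper that ships in the PR.

```typescript
// Either a flat number (older shape) or an object with a total (structured shape).
type TokenCount = number | { total?: number } | null | undefined;

interface UsageLike {
  inputTokens?: TokenCount;
  outputTokens?: TokenCount;
}

// Normalizes any supported shape to a plain number, defaulting to 0.
function readCount(value: TokenCount): number {
  if (typeof value === 'number') return Number.isFinite(value) ? value : 0;
  if (value && typeof value.total === 'number') return value.total;
  return 0;
}

function extractUsageTokensSketch(
  usage: UsageLike | null | undefined,
): { inputTokens: number; outputTokens: number } {
  return {
    inputTokens: readCount(usage?.inputTokens),
    outputTokens: readCount(usage?.outputTokens),
  };
}
```

Normalizing in one place means a shape mismatch yields 0 rather than a TypeError, at the cost of hiding the mismatch unless it is also logged.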

    `To access other models, use OpenRouter (openrouter/model-id), Vercel AI Gateway (gateway/model-id), NVIDIA NIM (nim/model-id), or Custom OpenAI-compatible (custom/model-id).`
    );
    }
    return wrapLanguageModel({

The modelId: modelString here passes the full provider/model-name string (e.g. anthropic/claude-sonnet-4). This is what calculateAndSetCost receives as modelId, and then it splits on / to extract the model name when providerId is present (line 29 of usage-cost-middleware.ts). This works correctly — just confirming the data flow is intentional since the middleware does its own parsing.
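The prefix split described here can be illustrated with a small parser. `parseModelString` and `ParsedModel` are hypothetical names, and splitting on the first slash only is an assumption about how nested ids such as gateway routes should be handled.

```typescript
interface ParsedModel {
  provider?: string;
  modelName: string;
}

// Splits "provider/model-name" on the first slash; strings without a slash
// are treated as a bare model name.
function parseModelString(modelString: string): ParsedModel {
  const slash = modelString.indexOf('/');
  if (slash === -1) return { modelName: modelString };
  return {
    provider: modelString.slice(0, slash),
    modelName: modelString.slice(slash + 1),
  };
}
```

For example, `parseModelString('anthropic/claude-sonnet-4')` yields provider `anthropic` and model name `claude-sonnet-4`, which is the form a pricing table keyed by bare model name would expect.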

    if (hasReliableUsage) {
      // Use actual token counts from the last completed step
      // Next step's context ≈ last step's input + last step's output (assistant response appended)
      totalTokens = actualInputTokens + (actualOutputTokens ?? 0);

Correctness check: totalTokens = actualInputTokens + (actualOutputTokens ?? 0) approximates the next step's context size as "last input + last output". This is a good heuristic but slightly oversimplifies — the output gets appended as a new assistant message, so the actual input for the next step includes the original context plus the output tokens, which is what inputTokens already captures for the current step. So the formula effectively double-counts the prior context. In practice this is conservative (triggers compression earlier), which is arguably safer. Worth documenting the rationale.

    safetyBuffer,
    triggerAt,
    remaining: hardLimit - totalTokens,
    source: steps.length > 0 ? 'actual_sdk_usage' : 'estimated',

Nit: source: steps.length > 0 ? 'actual_sdk_usage' : 'estimated' — at this point in the code, we're inside the compressionNeeded branch. The source was already determined above, but this ternary re-derives it from steps.length which doesn't account for the hasReliableUsage check (e.g. steps.length > 0 but inputTokens was 0 → fell back to estimate). Consider using a local source variable set at the decision point.
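One way to keep the source label tied to the decision is to return it together with the token count. The sketch below uses the formula from the diff (last input plus last output); `resolveContextSize`, `StepUsage`, and the reliability test (`inputTokens` is a positive number) are assumptions, since the PR's hasReliableUsage condition is not shown in full.

```typescript
type UsageSource = 'actual_sdk_usage' | 'estimated';

interface StepUsage {
  inputTokens?: number;
  outputTokens?: number;
}

// Returns the context-size figure and where it came from, decided in one place.
function resolveContextSize(
  lastStep: StepUsage | undefined,
  estimatedTokens: number,
): { totalTokens: number; source: UsageSource } {
  const input = lastStep?.inputTokens;
  if (typeof input === 'number' && input > 0) {
    return {
      totalTokens: input + (lastStep?.outputTokens ?? 0),
      source: 'actual_sdk_usage',
    };
  }
  // Step 0, or a provider that reported undefined/0: fall back to the estimate.
  return { totalTokens: estimatedTokens, source: 'estimated' };
}
```

Because the label is produced by the same branch that picks the number, a step that exists but reports 0 input tokens is logged as `estimated` rather than `actual_sdk_usage`.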

    // USAGE GENERATION TYPES (table removed — usage now tracked via OTel/SigNoz)
    // ============================================================================

    import { USAGE_GENERATION_TYPES } from '../../constants/otel-attributes';

Importing from ../../constants/otel-attributes inside a schema file is a bit unusual — it creates a dependency from the DB schema layer to the telemetry constants layer. Since this is just a type re-export and the comment says "table removed — usage now tracked via OTel/SigNoz", it makes sense, but consider whether USAGE_GENERATION_TYPES + GenerationType belong in otel-attributes.ts or in a shared usage-types.ts to keep the schema file focused on DB concerns.

    }),
    };

    const result = await generateText(genConfig as Parameters<typeof generateText>[0]);

The as Parameters<typeof generateText>[0] cast here and in several other places (AgentSession.ts, EvaluationService.ts) suggests the config object doesn't naturally satisfy the generateText parameter type. This is a known pattern when building configs incrementally, but the number of casts in this PR is growing. Not blocking — just flagging for awareness.

    const MODEL_ALIASES: Record<string, string[]> = {
      'claude-sonnet-4': ['claude-sonnet-4'],
      'claude-opus-4': ['claude-opus-4'],
      'claude-haiku-3.5': ['claude-3-5-haiku', 'claude-3.5-haiku'],
      'claude-sonnet-3.5': ['claude-3-5-sonnet', 'claude-3.5-sonnet'],
      'claude-opus-3': ['claude-3-opus'],
      'claude-haiku-3': ['claude-3-haiku'],
    };

The alias map is Anthropic-only right now. OpenAI, Google, and other providers have similar aliasing needs (e.g. gpt-4o vs gpt-4o-2024-08-06). This is fine as a starting point — the stripDateSuffix regex handles the most common case — but the map will need expansion as users hit pricing misses for other providers.
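To show how a date-suffix strip and an alias map can combine into a list of pricing lookup keys, here is a rough sketch. The regex is an assumption (the PR's stripDateSuffix pattern is not shown on this page), `pricingCandidates` is a hypothetical helper, and the alias map is a subset of the one in the diff.

```typescript
// Subset of the alias map from the diff above.
const MODEL_ALIASES: Record<string, string[]> = {
  'claude-haiku-3.5': ['claude-3-5-haiku', 'claude-3.5-haiku'],
  'claude-sonnet-3.5': ['claude-3-5-sonnet', 'claude-3.5-sonnet'],
  'claude-opus-3': ['claude-3-opus'],
};

// Assumed pattern: a trailing -YYYYMMDD or -YYYY-MM-DD version stamp.
function stripDateSuffix(modelName: string): string {
  return modelName.replace(/-(\d{8}|\d{4}-\d{2}-\d{2})$/, '');
}

// Returns every key worth trying against a pricing table, most specific first.
function pricingCandidates(modelName: string): string[] {
  const base = stripDateSuffix(modelName);
  const out = [modelName, base];
  for (const [canonical, aliases] of Object.entries(MODEL_ALIASES)) {
    if (canonical === base || aliases.includes(base)) {
      out.push(canonical, ...aliases);
    }
  }
  return Array.from(new Set(out)); // de-duplicate while keeping order
}
```

Under these assumptions `gpt-4o-2024-08-06` reduces to `gpt-4o` through the suffix strip alone, while `claude-3-5-haiku-20241022` additionally picks up `claude-haiku-3.5` through the alias map.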
TL;DR: Replaces the inaccurate character-based token estimation heuristic with actual AI SDK token usage for mid-generation compression decisions.
Summary | 46 files | 22 commits
Key changes:
- Actual token usage for mid-generation compression
- Pricing service and cost middleware
- Enriched OTEL span attributes
- Usage Dashboard UI (data is fetched from SigNoz via two new query methods)
Summary | 46 files | 21 commits
Key changes:
- Actual SDK token counts for compression decisions
- Pricing service and cost middleware
- OTel instrumentation and generation telemetry
- Usage dashboard UI and SigNoz API integration
PR Review Summary
(0) New Issues | Risk: Low
This is a delta review covering 1 commit (62bfdad67) and 2 files changed since the previous automated review (e626b64).
Delta Changes ✅
The commit "updated again" makes only TypeScript syntax improvements:
| File | Change |
|---|---|
| pricing-service.test.ts | 5 instances of non-null assertion ! → optional chaining ?. |
| usage-cost-middleware.test.ts | 4 instances of non-null assertion ! → optional chaining ?. |
These are style/safety improvements with no functional impact. The tests still properly verify null before accessing properties via expect(...).not.toBeNull() assertions.
🕐 Pending Recommendations (1)
- 🟠 signoz-stats.ts:2769-2777: Hardcoded generation types missing component_render, inconsistent with the USAGE_GENERATION_TYPES constant (line 2667 uses the constant correctly)
💡 APPROVE WITH SUGGESTIONS
Summary: The delta contains only TypeScript style improvements (non-null assertions → optional chaining), which is a good practice. No new issues introduced. One prior issue remains unaddressed: the hardcoded generation types list in signoz-stats.ts:2769-2777 should use [...USAGE_GENERATION_TYPES] for consistency, matching how it's used at line 2667.
Reviewers (1)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| orchestrator | 2 | 0 | 0 | 0 | 0 | 1 | 0 |
| Total | 2 | 0 | 0 | 0 | 0 | 1 | 0 |
Note: Delta review — verified prior syntax improvement commit. One prior Major issue remains unaddressed (hardcoded generation types).
PR Review Summary
(0) New Issues | Risk: Low
This is a delta review covering 1 commit (62647f150) and 3 files changed since the previous automated review (62bfdad67).
Prior Issues Addressed ✅
The commit successfully addresses the pending issue from the previous review:
| Prior Issue | Resolution |
|---|---|
| 🟠 Hardcoded generation types missing component_render in signoz-stats.ts:2769-2777 | ✅ Replaced with [...USAGE_GENERATION_TYPES] constant |
Delta Changes
1. agents-manage-ui/src/lib/api/signoz-stats.ts
- Fixed the hardcoded generation types list by using the USAGE_GENERATION_TYPES constant, ensuring buildUsageCostPayload stays in sync with getUsageEventsList
2. packages/agents-core/src/utils/__tests__/pricing-service.test.ts
- Added test: "creates only one set of intervals despite concurrent calls" (verifies setInterval is called exactly twice, gateway + modelsDev, even with concurrent initialize() calls)
- Added test: "clears both intervals on destroy" (verifies clearInterval is called twice during destroy())
3. packages/agents-core/src/utils/__tests__/usage-cost-middleware.test.ts
- Added test: "does not throw when no active span" for wrapStream (mirrors existing test for wrapGenerate)
- Added test: "trims provider prefix from modelId when provider is set" (verifies anthropic/claude-sonnet-4 with explicit provider becomes claude-sonnet-4 for pricing lookup)
All test additions directly address the minor test coverage feedback from the previous review cycle.
✅ APPROVE
Summary: The delta addresses all pending feedback cleanly. The hardcoded generation types issue is fixed with the proper constant, and the test coverage gaps have been filled with well-structured tests for interval lifecycle management and edge cases. This PR is ready to merge.
Note: Unable to submit formal GitHub approval due to permission constraints — this is a COMMENT review with approval recommendation.
Reviewers (1)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| orchestrator | 3 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 3 | 0 | 0 | 0 | 0 | 0 | 0 |
Note: Delta review — verified prior pending issue was addressed and reviewed new test coverage additions. No new issues found.
Ito Test Report ✅
18 test cases ran. 18 passed.
All 18 test cases passed with zero failures, confirming the covered cost/usage and conversation-detail flows behave as expected and matching code review findings that revealed no plausible production defects in scope. Key validated behaviors included correct org/project routing, stable and persistent time-range/query filtering (including rapid interactions), proper project scoping and formatting/no-data fallbacks, resilient conversation error handling with graceful degradation, and security protections such as login redirection for unauthorized deep links, cross-project access isolation, and inert rendering of XSS-like payloads.
PR Review Summary
(0) New Issues | Risk: Low
This is a delta review covering 1 commit (0b8718448) since the last review (62647f150).
Delta Changes Reviewed ✅
The commit "fixed for all llm calls" adds comprehensive telemetry instrumentation:
| File | Change | Assessment |
|---|---|---|
| EvaluationService.ts | Added withOtelBaggage helper + experimental_telemetry for simulation/scoring | ✅ Consistent with existing patterns |
| otel-attributes.ts | Added GENERATION_TYPES object constant | ✅ Type-safe, backward compatible |
| artifact-components/generate-render/route.ts | Added telemetry with GENERATION_TYPES.COMPONENT_RENDER | ✅ Correct |
| data-components/generate-render/route.ts | Added telemetry with GENERATION_TYPES.COMPONENT_RENDER | ✅ Correct |
| UI pages/components | Renamed usage → cost | ✅ Consistent naming |
Verification
- Pattern consistency: The withOtelBaggage helper and experimental_telemetry usage matches existing patterns in:
  - AgentSession.ts (status updates, artifact metadata)
  - generate.ts (main generation)
  - distill-utils.ts (compression)
- GENERATION_TYPES refactor: Safe change:
  - Old: USAGE_GENERATION_TYPES = ['sub_agent_generation', ...] as const
  - New: GENERATION_TYPES = { SUB_AGENT_GENERATION: 'sub_agent_generation', ... } as const
  - USAGE_GENERATION_TYPES = Object.values(GENERATION_TYPES) preserves backward compatibility
- Type safety: Using GENERATION_TYPES.EVAL_SIMULATION provides compile-time safety vs string literals.
- Change from generateObject to generateText with Output.object: The EvaluationService now uses the AI SDK pattern consistent with the rest of the codebase (e.g., AgentSession.ts). The (result as any).object cast is expected given the AI SDK types.
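The object-constant refactor described in this review can be sketched as follows. The keys SUB_AGENT_GENERATION, COMPONENT_RENDER, and EVAL_SIMULATION and the values taken from the commit messages come from this page; the `'eval_simulation'` string value, the remaining key names, and the `isGenerationType` guard are assumptions for illustration.

```typescript
const GENERATION_TYPES = {
  SUB_AGENT_GENERATION: 'sub_agent_generation',
  STATUS_UPDATE: 'status_update',
  ARTIFACT_METADATA: 'artifact_metadata',
  COMPRESSION: 'compression',
  COMPONENT_RENDER: 'component_render',
  EVAL_SIMULATION: 'eval_simulation', // assumed value; only the key appears in the review
} as const;

// Union of the literal values, derived from the object so the two cannot drift.
type GenerationType = (typeof GENERATION_TYPES)[keyof typeof GENERATION_TYPES];

// Array form kept for callers that filter by the full list.
const USAGE_GENERATION_TYPES: GenerationType[] = Object.values(GENERATION_TYPES);

// Hypothetical runtime guard for untyped input such as span attributes.
function isGenerationType(value: string): value is GenerationType {
  return (USAGE_GENERATION_TYPES as readonly string[]).includes(value);
}
```

A call site can then write `GENERATION_TYPES.COMPONENT_RENDER` and get a compile error on a typo, while filters keep consuming the derived array.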
💭 Consider (1)
💭 1) withOtelBaggage helper duplication
Issue: The withOtelBaggage helper is defined in 3 places: EvaluationService.ts and inline in both artifact/data component routes.
Why: Minor code duplication. The pattern is simple enough that inline duplication is acceptable.
Fix: Optionally extract to a shared utility in @inkeep/agents-core if this pattern expands further.
✅ APPROVE
Summary: The delta adds comprehensive telemetry instrumentation to evaluation service and component render routes, following established patterns. The GENERATION_TYPES object refactor improves type safety while maintaining backward compatibility. All prior review issues remain addressed. Ready to merge.
Note: Unable to submit formal GitHub approval due to permission constraints — this is a COMMENT review with approval recommendation.
Reviewers (1)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| orchestrator | 5 | 0 | 1 | 0 | 0 | 0 | 0 |
| Total | 5 | 0 | 1 | 0 | 0 | 0 | 0 |
Note: Delta review — verified delta changes follow established patterns. No blocking issues found.
PR Review Summary
(0) New Issues | Risk: Low
This is a delta review covering 1 commit (adb275078) since the last automated review (0b8718448).
Delta Changes Reviewed ✅
The commit "fixed tests" updates model-factory.test.ts to align mock objects with the wrapped model interface:
| Change | Assessment |
|---|---|
| Mock createAzure returns modelId instead of model | ✅ Correct — aligns with wrapLanguageModel changes |
| All 6 expect().toMatchObject() assertions updated | ✅ Consistent with wrapped model behavior |
This is a straightforward test alignment fix with no functional impact on production code.
Prior Issues Status
All Critical and Major issues from prior reviews have been addressed:
| Prior Issue | Status |
|---|---|
| 🔴 External HTTP call without timeout | ✅ Fixed — AbortSignal.timeout(10_000) added |
| 🟠 Nested property access may throw TypeError | ✅ Fixed — extractUsageTokens() helper added |
| 🟠 Silent catch swallows model parsing errors | ✅ Fixed — logger.debug() added |
| 🟠 Silent cost calculation failure | ✅ Fixed — gen_ai.cost.pricing_unavailable attribute added |
| 🟠 Sequential data fetching | ✅ Fixed — Promise.allSettled() parallelization |
| 🟠 initialize() not idempotent | ✅ Fixed — initPromise pattern implemented |
| 🟠 Hardcoded generation types missing component_render | ✅ Fixed — uses USAGE_GENERATION_TYPES constant |
| 🟠 Test coverage gaps | ✅ Fixed — comprehensive tests added for pricing-service and usage-cost-middleware |
💭 Consider (1)
💭 1) withOtelBaggage helper duplication
Issue: The withOtelBaggage helper is defined inline in 3 places: EvaluationService.ts and both artifact/data component render routes.
Why: Minor code duplication. The pattern is simple enough that inline duplication is acceptable for now.
Fix: Optionally extract to a shared utility in @inkeep/agents-core if this pattern expands further.
✅ APPROVE
Summary: The delta contains only a test alignment fix that updates mock objects to use modelId instead of model, which is consistent with the wrapLanguageModel changes. All prior review feedback has been addressed. The PR is ready to merge.
Note: Unable to submit formal GitHub approval due to permission constraints — this is a COMMENT review with approval recommendation.
Reviewers (1)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| orchestrator | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| Total | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
Note: Delta review — verified test fix commit aligns with model wrapping changes. No new issues found.
PR Review Summary
(0) New Issues | Risk: Low
This is a delta review covering 1 commit (d61caec7b) since the last automated review (adb275078).
Delta Changes ✅
The commit "knipped" removes unused exports from cost-dashboard.tsx:
| Change | Assessment |
|---|---|
| export function formatCost → function formatCost | ✅ Correct — function is only used internally |
| export function formatTokens → function formatTokens | ✅ Correct — function is only used internally |
This is a standard knip cleanup with no functional impact. The functions remain in use within the component; only the unused exports are removed.
Prior Issues Status
All Critical and Major issues from prior reviews have been addressed:
| Prior Issue | Status |
|---|---|
| 🔴 External HTTP call without timeout | ✅ Fixed — AbortSignal.timeout(10_000) added |
| 🟠 Nested property access may throw TypeError | ✅ Fixed — extractUsageTokens() helper added |
| 🟠 Silent catch swallows model parsing errors | ✅ Fixed — logger.debug() added |
| 🟠 Silent cost calculation failure | ✅ Fixed — gen_ai.cost.pricing_unavailable attribute added |
| 🟠 Sequential data fetching | ✅ Fixed — Promise.allSettled() parallelization |
| 🟠 initialize() not idempotent | ✅ Fixed — initPromise pattern implemented |
| 🟠 Hardcoded generation types missing component_render | ✅ Fixed — uses USAGE_GENERATION_TYPES constant |
| 🟠 Test coverage gaps | ✅ Fixed — comprehensive tests added |
💭 Consider (1)
💭 1) withOtelBaggage helper duplication
Issue: The withOtelBaggage helper is defined inline in 3 places: EvaluationService.ts and both artifact/data component render routes.
Why: Minor code duplication. The pattern is simple enough that inline duplication is acceptable for now.
Fix: Optionally extract to a shared utility in @inkeep/agents-core if this pattern expands further.
✅ APPROVE
Summary: The delta is a trivial knip cleanup removing unused exports from two helper functions. All prior Critical and Major issues have been addressed across the previous review iterations. This PR is ready to merge.
Note: Unable to submit formal GitHub approval due to permission constraints — this is a COMMENT review with approval recommendation.
Reviewers (1)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
| orchestrator | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| Total | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
Note: Delta review — verified knip cleanup commit removes unused exports. No new issues found.
last completed step. Falls back to estimates when usage data is unavailable (step 0, or providers returning undefined/0) with a warning log.
Pricing misses are now logged (deduplicated per refresh cycle).
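Deduplicating miss logs per refresh cycle could be done with a set that is cleared on each refresh, as in the sketch below. `PricingMissLog`, its method names, and the log message are hypothetical; the PR's own logging code is not shown on this page.

```typescript
class PricingMissLog {
  private seen = new Set<string>();
  private readonly warn: (message: string) => void;

  constructor(warn: (message: string) => void) {
    this.warn = warn;
  }

  // Logs the first miss for a model in the current cycle; returns whether it logged.
  recordMiss(modelName: string): boolean {
    if (this.seen.has(modelName)) return false;
    this.seen.add(modelName);
    this.warn(`No pricing found for model "${modelName}"`);
    return true;
  }

  // Called after each pricing refresh so a still-missing model is reported again.
  onRefresh(): void {
    this.seen.clear();
  }
}
```

Resetting on refresh keeps the log quiet during a cycle while still surfacing models that remain unpriced after new data arrives.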