@galkleinman galkleinman commented Nov 19, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced comprehensive OpenTelemetry GenAI semantic attributes for enhanced AI/LLM observability and tracing.
    • Added Vector Database operation attributes for structured vector store monitoring.
    • Added Workflow tracking attributes for complex AI workflows.
    • Deprecated legacy LLM attributes with migration guidance to new standards.
  • Tests

    • Added extensive test coverage for GenAI Semantic Conventions compliance.

Important

Align AI SDK attributes with OpenTelemetry GenAI semantic conventions, adding new attributes and maintaining backward compatibility.

  • Attributes:
    • Add new attributes in SemanticAttributes.ts for GenAI semantic conventions, including GEN_AI_OPERATION_NAME, GEN_AI_PROVIDER_NAME, GEN_AI_REQUEST_MODEL, GEN_AI_RESPONSE_ID, and others.
    • Mark several attributes as deprecated, such as LLM_SYSTEM, LLM_REQUEST_MODEL, and LLM_PROMPTS.
  • Transformations:
    • Update ai-sdk-transformations.ts to transform AI SDK attributes to new GenAI attributes, ensuring backward compatibility.
    • Implement functions like addOperationName, transformModelId, transformFunctionId, and transformResponseMetadata.
  • Tests:
    • Add comprehensive tests in ai-sdk-otel-attributes.test.ts to verify new GenAI attributes and backward compatibility.
    • Update ai-sdk-integration.test.ts and ai-sdk-transformations.test.ts to include new attributes and transformations.
  • Recordings:
    • Add new .har files in recordings to test various scenarios, including provider name setting and operation name setting.
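The dual-emission pattern described above can be sketched as follows — a minimal illustration with simplified attribute shapes; the key strings are hypothetical placeholders, not the real constants from SemanticAttributes.ts:

```typescript
// Minimal sketch of dual emission: each transformer writes the new
// gen_ai.* attribute while keeping the deprecated llm.* twin in sync.
// Key strings here are illustrative placeholders, not the real constants.
type Attributes = Record<string, string | number | undefined>;

function transformModelId(attrs: Attributes): Attributes {
  const modelId = attrs["ai.model.id"];
  if (modelId !== undefined && attrs["gen_ai.request.model"] === undefined) {
    attrs["gen_ai.request.model"] = modelId; // new GenAI semantic convention
    attrs["llm.request.model"] = modelId; // deprecated twin, kept for backward compatibility
  }
  return attrs;
}

const result = transformModelId({ "ai.model.id": "gpt-4o-mini" });
```

The same shape applies to the other transformers listed above: derive the new attribute, then mirror it onto the deprecated one so existing consumers keep working.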

This description was created by Ellipsis for 7cea57e.

coderabbitai bot commented Nov 19, 2025

Walkthrough

Adds comprehensive OpenTelemetry GenAI semantic convention support with ~60 new attribute constants, deprecates legacy LLM attributes with migration guidance, and refactors transformation logic to emit both new and deprecated attributes concurrently for backward compatibility.

Changes

Cohort / File(s) — Summary

  • Semantic Attribute Constants — packages/ai-semantic-conventions/src/SemanticAttributes.ts: Added ~60 new OpenTelemetry GenAI attributes (GEN_AI_OPERATION_NAME, GEN_AI_PROVIDER_NAME, GEN_AI_REQUEST_*, GEN_AI_RESPONSE_*, GEN_AI_USAGE_*, GEN_AI_INPUT_MESSAGES, GEN_AI_OUTPUT_MESSAGES, GEN_AI_TOOL_DEFINITIONS, GEN_AI_AGENT_NAME, GEN_AI_SYSTEM_INSTRUCTIONS); added non-standard LLM attributes (LLM_REQUEST_TYPE, LLM_USAGE_TOTAL_TOKENS, LLM_TOP_K, LLM_FREQUENCY_PENALTY, LLM_PRESENCE_PENALTY, LLM_CHAT_STOP_SEQUENCES, LLM_REQUEST_FUNCTIONS); added Vector DB attributes (VECTOR_DB_VENDOR, VECTOR_DB_QUERY_TOP_K, VECTOR_DB_TABLE_NAME, VECTOR_DB_ADD_*, VECTOR_DB_DELETE_*, VECTOR_DB_GET_*); added Traceloop workflow attributes (TRACELOOP_SPAN_KIND, TRACELOOP_WORKFLOW_NAME, TRACELOOP_ENTITY_*, TRACELOOP_ASSOCIATION_PROPERTIES); marked 12 legacy LLM_* attributes as deprecated with migration guidance.
  • AI SDK Transformation Logic — packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts: Introduced OTel provider mapping and legacy vendor mapping; added transformers for operation name derivation, model ID, function ID, provider/response metadata, response attributes (including finish reasons array), and structured tool definitions. Enhanced prompts handling to separate system instructions into GEN_AI_SYSTEM_INSTRUCTIONS and combine messages into GEN_AI_INPUT_MESSAGES. Upgraded token usage to populate GEN_AI_USAGE_INPUT_TOKENS and GEN_AI_USAGE_OUTPUT_TOKENS. Updated transformLLMSpans signature to accept an optional spanName parameter and forward it into transformAiSdkSpanAttributes. All new attributes coexist with deprecated counterparts for backward compatibility.
  • Integration Test Updates — packages/traceloop-sdk/test/ai-sdk-integration.test.ts: Updated span lookup from exact match ("text.generate") to prefix-based matching to accommodate transformed span names with model information. Enhanced telemetry assertions to validate new GEN_AI_* attributes (operation name, provider name, request/usage tokens) alongside deprecated LLM_* attributes. Adjusted model and prompt/response references to new SpanAttributes constants.
  • Test Suite: OTel GenAI Attributes — packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts: Added comprehensive test suite exercising OTel GenAI semantic conventions (operation.name, provider.name, tool.definitions, system_instructions, usage tokens, backward compatibility, span naming). Uses Polly for HTTP recording/replay with OpenAI and Anthropic providers. Validates presence and structure of new and deprecated attributes concurrently.
  • Transformation Unit Tests — packages/traceloop-sdk/test/ai-sdk-transformations.test.ts: Added assertion verifying empty-string provider handling: both GEN_AI_PROVIDER_NAME and deprecated provider attributes are set to the empty string when ai.model.provider is empty.
  • Test Recordings (HAR) — packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/.../recording.har (8 files): Added test fixtures capturing OpenAI and Anthropic API interactions: backward compatibility (deprecated + new attributes), span naming patterns, operation name ("chat"), provider name attribution, system instructions separation, tool definitions structure, and token usage validation.

Sequence Diagram

sequenceDiagram
    participant AI as AI SDK Span
    participant Transform as Transformation Pipeline
    participant OTel as OTel Attributes
    participant Legacy as Legacy Attributes

    AI->>Transform: ai.model.id, ai.model.provider,<br/>ai.response.*, ai.telemetry.functionId
    
    rect rgb(200, 220, 240)
    Note over Transform: New GenAI Mappings
    Transform->>OTel: addOperationName() → GEN_AI_OPERATION_NAME
    Transform->>OTel: transformModelId() → GEN_AI_REQUEST_MODEL
    Transform->>OTel: transformProviderMetadata() → GEN_AI_PROVIDER_NAME
    Transform->>OTel: transformResponseMetadata() → GEN_AI_RESPONSE_ID/MODEL/FINISH_REASONS
    end
    
    rect rgb(220, 240, 200)
    Note over Transform: Enhanced Message/Tool Handling
    Transform->>OTel: transformPrompts() → GEN_AI_SYSTEM_INSTRUCTIONS,<br/>GEN_AI_INPUT_MESSAGES
    Transform->>OTel: transformTools() → GEN_AI_TOOL_DEFINITIONS
    Transform->>OTel: transformResponse*() → GEN_AI_OUTPUT_MESSAGES
    end
    
    rect rgb(240, 220, 200)
    Note over Transform: Token Usage
    Transform->>OTel: transformPromptTokens() → GEN_AI_USAGE_INPUT_TOKENS
    Transform->>OTel: transformCompletionTokens() → GEN_AI_USAGE_OUTPUT_TOKENS
    end
    
    rect rgb(220, 200, 220)
    Note over Transform: Backward Compatibility
    Transform->>Legacy: LLM_SYSTEM, LLM_REQUEST_MODEL,<br/>LLM_INPUT_MESSAGES, LLM_OUTPUT_MESSAGES,<br/>LLM_USAGE_PROMPT_TOKENS,<br/>LLM_USAGE_COMPLETION_TOKENS
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas requiring extra attention:

  • ai-sdk-transformations.ts: Core transformation pipeline with multiple interdependent transformation functions and dual-attribute emission logic; verify backward compatibility semantics and token calculation accuracy.
  • Deprecation markers and migration paths: Ensure all 12 deprecated attributes correctly reference their GEN_AI_* equivalents in JSDoc; validate that legacy attribute values align with new ones.
  • Span naming logic: Review changes to span name construction with model appending and ensure alignment with OTel naming pattern.
  • Test coverage: Verify that the new ai-sdk-otel-attributes.test.ts suite adequately exercises all new attributes and that updated assertions in ai-sdk-integration.test.ts properly validate dual-attribute emission without test flakiness from prefix matching.
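For the span-naming review item above, the OTel GenAI convention composes span names as "{gen_ai.operation.name} {gen_ai.request.model}". A hedged sketch of that construction (buildSpanName is a hypothetical helper, not the SDK's actual function):

```typescript
// Sketch of the OTel GenAI span-naming pattern
// "{gen_ai.operation.name} {gen_ai.request.model}", e.g. "chat gpt-4o".
// buildSpanName is a hypothetical helper used for illustration only.
function buildSpanName(operationName: string, requestModel?: string): string {
  return requestModel ? `${operationName} ${requestModel}` : operationName;
}

const named = buildSpanName("chat", "gpt-4o");
const unnamed = buildSpanName("chat");
```

This also explains why the integration tests switched to prefix-based span lookup: the model suffix varies per recording.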

Suggested reviewers

  • nirga
  • avivhalfon

Poem

🐰 Hop along the OTel path so bright,
New GenAI attributes take their flight!
Provider names and operation calls,
Legacy friends still answer the halls—
Backward compat hops, both old and new, 🌟
A rabbit's refactor, through and through!

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

  • Description Check — ✅ Passed: Check skipped - CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: The title clearly and specifically describes the main change: aligning AI SDK attributes with gen-ai semantic conventions, which accurately reflects the comprehensive updates to SemanticAttributes.ts adding new GEN_AI_* attributes.
  • Docstring Coverage — ✅ Passed: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
packages/traceloop-sdk/test/ai-sdk-transformations.test.ts (1)

1-1753: Fix Prettier formatting and resolve ESLint warnings

Prettier detected code style issues, and ESLint identified unused imports and variables that should be cleaned up:

  • Remove unused imports: context (line 4), ASSOCATION_PROPERTIES_KEY (line 5), transformAiSdkSpanAttributes (line 8), transformAiSdkSpanNames (line 9)
  • Remove unused helper functions: createMockSpan (line 13), createMockSpanWithUpdate (line 25)
  • Replace any types with specific types in createMockSpanWithUpdate parameter (line 27) and the removed helper functions' parameters (line 15)
  • Run prettier --write packages/traceloop-sdk/test/ai-sdk-transformations.test.ts to fix formatting

After cleanup and formatting, re-run tests to confirm no snapshots or logic are affected.

packages/traceloop-sdk/test/ai-sdk-integration.test.ts (1)

1-326: Fix Prettier issues reported by CI

CI flagged code style issues in this file as well. Please run the formatter (e.g., prettier --write packages/traceloop-sdk/test/ai-sdk-integration.test.ts) so tests follow the repo’s formatting rules.

packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts (1)

1-703: Fix Prettier issues reported by CI for this file

CI reports style problems here as well. Please run the configured formatter (e.g., prettier --write packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts) so this file conforms to the project’s formatting rules.

🧹 Nitpick comments (6)
packages/traceloop-sdk/test/ai-sdk-transformations.test.ts (1)

1001-1015: Empty provider test correctly exercises new + deprecated attributes

The new assertion for SpanAttributes.GEN_AI_PROVIDER_NAME alongside SpanAttributes.LLM_SYSTEM for ai.model.provider: "" gives good coverage of the “present but empty” case and ensures both new and deprecated attributes are kept in sync. This matches the deprecation strategy in SemanticAttributes.ts.

If you anticipate additional edge-cases (e.g., whitespace-only provider strings), consider adding a small parametrized test to pin that behavior as well.

packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts (1)

71-87: Avoid redundant forceFlush in individual tests

afterEach already calls await traceloop.forceFlush(); and resets the exporter, so the extra forceFlush calls inside individual tests (for example, Line 86) are redundant. You can rely on the shared teardown unless a specific test needs an extra flush.

packages/traceloop-sdk/test/ai-sdk-integration.test.ts (1)

115-117: Align span-name expectation comment with actual assertion

The comments say the span name “should be transformed and include model name,” but the assertions only check startsWith("text.generate"). If you do want to enforce model presence, consider asserting that the name includes the model string; otherwise, update the comment to match the looser condition.

Also applies to: 210-212

packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts (2)

100-163: Model/response/metadata transforms look correct; centralize provider-metadata key

  • transformModelId safely migrates ai.model.id to GEN_AI_REQUEST_MODEL without clobbering an existing value.
  • transformFunctionId maps ai.telemetry.functionId into TRACELOOP_ENTITY_NAME, which keeps entity naming consistent.
  • transformResponseMetadata correctly adapts ai.response.{id,model,finishReason} into the new GEN_AI_* response attributes.

For transformProviderMetadata, you introduce gen_ai.provider.metadata as a new attribute but emit it as a bare string key. To keep all AI/LLM span attributes centralized (per SemanticAttributes.ts guidance), it would be better to:

  • Add a GEN_AI_PROVIDER_METADATA: "gen_ai.provider.metadata" constant to SpanAttributes, and
  • Use that constant here instead of the hardcoded string.

331-400: Tool definitions transformation is solid; consider richer support for non-function tools

transformTools builds an OTel-compliant GEN_AI_TOOL_DEFINITIONS array while still emitting the flat LLM_REQUEST_FUNCTIONS.* attributes. For function tools this is spot-on and matches the tests.

For non-type === "function" tools, you currently only record the type in the structured definition (details are only in the flat attributes). If you expect other tool types in the future, you might want to extend the structured payload to preserve name/description/parameters (where applicable) there as well.
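As a rough illustration of the behavior discussed in this comment — shapes and key strings are assumed, not the actual implementation — the structured definitions could be built alongside the flat attributes like so:

```typescript
// Illustrative sketch: structured gen_ai.tool.definitions alongside flat
// llm.request.functions.{i}.* attributes. Only function tools carry full
// details; other tool types record just their type, as noted above.
interface ToolDef {
  type: string;
  name?: string;
  description?: string;
  parameters?: Record<string, unknown>;
}

function transformTools(tools: ToolDef[]): Record<string, string> {
  const attrs: Record<string, string> = {};
  const definitions = tools.map((t) =>
    t.type === "function"
      ? { type: t.type, name: t.name, description: t.description, parameters: t.parameters }
      : { type: t.type }, // non-function tools: type only
  );
  attrs["gen_ai.tool.definitions"] = JSON.stringify(definitions);
  tools.forEach((t, i) => {
    // flat deprecated attributes, one index-prefixed group per tool
    if (t.name !== undefined) attrs[`llm.request.functions.${i}.name`] = t.name;
    if (t.description !== undefined)
      attrs[`llm.request.functions.${i}.description`] = t.description;
  });
  return attrs;
}

const toolAttrs = transformTools([
  { type: "function", name: "getWeather", description: "Look up weather" },
  { type: "web_search" },
]);
```

Extending the non-function branch to preserve name/description (where present) would be a small change to the `definitions` mapping.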

packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)

47-52: Add a SpanAttributes entry for gen_ai.provider.metadata

transformProviderMetadata in ai-sdk-transformations.ts emits a gen_ai.provider.metadata attribute directly as a string key. To fully follow the “centralize AI/LLM attributes here” guideline, consider adding:

GEN_AI_PROVIDER_METADATA: "gen_ai.provider.metadata",

to SpanAttributes and updating the transformer to reference it. This keeps all GenAI attribute names discoverable from a single place.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories


📥 Commits

Reviewing files that changed from the base of the PR and between 2561982 and 7cea57e.

📒 Files selected for processing (13)
  • packages/ai-semantic-conventions/src/SemanticAttributes.ts (1 hunks)
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/Backward-compatibility_2327270860/should-maintain-all-deprecated-attributes-alongside-new-ones_410416891/recording.har (1 hunks)
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/Span-naming_1988960947/should-follow-OTel-pattern-operation-model_2170500561/recording.har (1 hunks)
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-operation-name-attribute_34593844/should-set-operation-name-to-chat-for-generateText_962058512/recording.har (1 hunks)
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-provider-name-attribute_149611164/should-set-provider-name-to-anthropic-for-Anthropic_274976646/recording.har (1 hunks)
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-provider-name-attribute_149611164/should-set-provider-name-to-openai-for-OpenAI_936627494/recording.har (1 hunks)
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-system_instructions-attribute_2872379897/should-separate-system-instructions-from-input-messages_554757073/recording.har (1 hunks)
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-tool-definitions-attribute_2710797144/should-create-structured-tool-definitions-for-tools_2917601873/recording.har (1 hunks)
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-usage-tokens-attributes_1495437318/should-set-both-new-and-deprecated-token-attributes_3016620579/recording.har (1 hunks)
  • packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts (14 hunks)
  • packages/traceloop-sdk/test/ai-sdk-integration.test.ts (5 hunks)
  • packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts (1 hunks)
  • packages/traceloop-sdk/test/ai-sdk-transformations.test.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
**/recordings/**

📄 CodeRabbit inference engine (CLAUDE.md)

Store HTTP interaction recordings for tests under recordings/ directories for Polly.js replay

Files:

  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-system_instructions-attribute_2872379897/should-separate-system-instructions-from-input-messages_554757073/recording.har
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-operation-name-attribute_34593844/should-set-operation-name-to-chat-for-generateText_962058512/recording.har
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-usage-tokens-attributes_1495437318/should-set-both-new-and-deprecated-token-attributes_3016620579/recording.har
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-tool-definitions-attribute_2710797144/should-create-structured-tool-definitions-for-tools_2917601873/recording.har
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-provider-name-attribute_149611164/should-set-provider-name-to-openai-for-OpenAI_936627494/recording.har
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/Span-naming_1988960947/should-follow-OTel-pattern-operation-model_2170500561/recording.har
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/Backward-compatibility_2327270860/should-maintain-all-deprecated-attributes-alongside-new-ones_410416891/recording.har
  • packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-provider-name-attribute_149611164/should-set-provider-name-to-anthropic-for-Anthropic_274976646/recording.har
packages/{instrumentation-*,traceloop-sdk}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Import AI/LLM semantic attribute constants from @traceloop/ai-semantic-conventions rather than hardcoding strings

Files:

  • packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts
  • packages/traceloop-sdk/test/ai-sdk-transformations.test.ts
  • packages/traceloop-sdk/test/ai-sdk-integration.test.ts
  • packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts
packages/traceloop-sdk/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

packages/traceloop-sdk/**/*.{ts,tsx}: Use the provided decorators (@workflow, @task, @agent) for workflow/task/agent spans instead of re-implementing them
For manual LLM operations, use trace.withLLMSpan from @traceloop/node-server-sdk

Files:

  • packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts
  • packages/traceloop-sdk/test/ai-sdk-transformations.test.ts
  • packages/traceloop-sdk/test/ai-sdk-integration.test.ts
  • packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts
packages/ai-semantic-conventions/src/SemanticAttributes.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Define all AI/LLM span attribute constants in packages/ai-semantic-conventions/src/SemanticAttributes.ts

Files:

  • packages/ai-semantic-conventions/src/SemanticAttributes.ts
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: traceloop/openllmetry-js PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T22:08:07.023Z
Learning: Applies to packages/{instrumentation-*,traceloop-sdk}/**/*.{ts,tsx} : Import AI/LLM semantic attribute constants from traceloop/ai-semantic-conventions rather than hardcoding strings
Learnt from: CR
Repo: traceloop/openllmetry-js PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T22:08:07.023Z
Learning: Applies to packages/ai-semantic-conventions/src/SemanticAttributes.ts : Define all AI/LLM span attribute constants in packages/ai-semantic-conventions/src/SemanticAttributes.ts
📚 Learning: 2025-08-24T22:08:07.023Z
Learnt from: CR
Repo: traceloop/openllmetry-js PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T22:08:07.023Z
Learning: Applies to packages/instrumentation-*/**/*.{ts,tsx} : Instrumentations must create spans with appropriate AI/LLM semantic attributes for calls they wrap

Applied to files:

  • packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts
  • packages/traceloop-sdk/test/ai-sdk-transformations.test.ts
  • packages/traceloop-sdk/test/ai-sdk-integration.test.ts
  • packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts
  • packages/ai-semantic-conventions/src/SemanticAttributes.ts
📚 Learning: 2025-08-24T22:08:07.023Z
Learnt from: CR
Repo: traceloop/openllmetry-js PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T22:08:07.023Z
Learning: Applies to packages/traceloop-sdk/**/*.{ts,tsx} : For manual LLM operations, use trace.withLLMSpan from traceloop/node-server-sdk

Applied to files:

  • packages/traceloop-sdk/test/ai-sdk-integration.test.ts
  • packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts
📚 Learning: 2025-08-24T22:08:07.023Z
Learnt from: CR
Repo: traceloop/openllmetry-js PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-08-24T22:08:07.023Z
Learning: Applies to packages/traceloop-sdk/**/*.{ts,tsx} : Use the provided decorators (workflow, task, agent) for workflow/task/agent spans instead of re-implementing them

Applied to files:

  • packages/traceloop-sdk/test/ai-sdk-integration.test.ts
  • packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts
  • packages/ai-semantic-conventions/src/SemanticAttributes.ts
📚 Learning: 2025-08-12T13:57:05.901Z
Learnt from: galzilber
Repo: traceloop/openllmetry-js PR: 643
File: packages/traceloop-sdk/test/datasets-final.test.ts:97-105
Timestamp: 2025-08-12T13:57:05.901Z
Learning: The traceloop-sdk uses a response transformer (`transformApiResponse` in `packages/traceloop-sdk/src/lib/utils/response-transformer.ts`) that converts snake_case API responses to camelCase for SDK interfaces. Raw API responses use snake_case but SDK consumers see camelCase fields.

Applied to files:

  • packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts
🧬 Code graph analysis (4)
packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts (3)
packages/traceloop-sdk/test/test-setup.ts (2)
  • getSharedExporter (37-39)
  • initializeSharedTraceloop (23-35)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-109)
packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts (1)
  • transformLLMSpans (621-653)
packages/traceloop-sdk/test/ai-sdk-transformations.test.ts (1)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-109)
packages/traceloop-sdk/test/ai-sdk-integration.test.ts (1)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-109)
packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts (1)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-109)
🪛 GitHub Actions: CI
packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts

[warning] 1-1: Code style issues found in this file. Run 'prettier --write' to fix.

packages/traceloop-sdk/test/ai-sdk-transformations.test.ts

[warning] 1-1: Code style issues found in this file. Run 'prettier --write' to fix.

packages/traceloop-sdk/test/ai-sdk-integration.test.ts

[warning] 1-1: Code style issues found in this file. Run 'prettier --write' to fix.

packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts

[warning] 1-1: Code style issues found in this file. Run 'prettier --write' to fix.

🔇 Additional comments (17)
packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-operation-name-attribute_34593844/should-set-operation-name-to-chat-for-generateText_962058512/recording.har (2)

1-172: Well-structured recording with proper token usage metadata.

The HAR recording is properly formatted and includes the necessary token usage data (9 input tokens, 10 output tokens) that aligns with the PR's GenAI semantic conventions for tracking GEN_AI_USAGE_INPUT_TOKENS and GEN_AI_USAGE_OUTPUT_TOKENS. The file is correctly placed in the recordings directory as per Polly.js guidelines.


26-33: No issues found—recording correctly uses the legitimate OpenAI Responses API.

The Responses API is a stateful API that combines the best capabilities from the chat completions and assistants APIs, and was released in March 2025. The recording correctly uses the /v1/responses endpoint with the "input" field format, which is the proper specification for this API. Your recording accurately represents the OpenAI API being instrumented.

packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-provider-name-attribute_149611164/should-set-provider-name-to-anthropic-for-Anthropic_274976646/recording.har (1)

1-170: HAR fixture looks good and free of secrets

Request/response only include content-type and standard Anthropic/Cloudflare headers, no auth headers or API keys. Suitable as a Polly recording for provider-name tests.

packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/Backward-compatibility_2327270860/should-maintain-all-deprecated-attributes-alongside-new-ones_410416891/recording.har (1)

1-172: OpenAI backward-compatibility HAR is acceptable

Recording contains typical OpenAI/Cloudflare headers and a _cfuvid cookie but no API keys or Authorization header; appropriate as a test artifact for new + deprecated attributes.

packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-system_instructions-attribute_2872379897/should-separate-system-instructions-from-input-messages_554757073/recording.har (1)

1-172: System-instructions HAR is structurally sound

Recording accurately captures a system + user interaction for OpenAI responses, with no embedded API keys or Authorization headers. Good basis for testing separation of system instructions from input messages.

packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-tool-definitions-attribute_2710797144/should-create-structured-tool-definitions-for-tools_2917601873/recording.har (1)

1-144: Tool-definitions HAR appropriately exercises a failure path

This fixture captures a tools payload plus the 400 invalid_function_parameters error, which is valuable for validating GEN_AI_TOOL_DEFINITIONS handling even when the provider rejects the schema. No secrets or auth headers present.

packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-provider-name-attribute_149611164/should-set-provider-name-to-openai-for-OpenAI_936627494/recording.har (1)

1-172: OpenAI provider-name HAR is consistent and non-sensitive

The recording cleanly captures a basic OpenAI response scenario for provider-name mapping, with only standard headers and no API keys. Appropriate as a fixture for GEN_AI_PROVIDER_NAME tests.

packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/Span-naming_1988960947/should-follow-OTel-pattern-operation-model_2170500561/recording.har (1)

1-172: Span-naming HAR is suitable for operation/model tests

This recording provides a minimal OpenAI interaction with full usage metadata, ideal for verifying span naming conventions without exposing secrets.

packages/traceloop-sdk/recordings/AI-SDK-OTel-GenAI-Semantic-Conventions_247892713/gen_ai-usage-tokens-attributes_1495437318/should-set-both-new-and-deprecated-token-attributes_3016620579/recording.har (1)

1-172: Token-usage HAR cleanly supports GEN_AI_USAGE_* testing

The fixture captures detailed token usage fields without including secrets or API keys, matching the needs of tests that assert both new and deprecated token attributes.

packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts (1)

26-416: Strong semantic-conventions coverage and backward-compat tests

The suite does a good job exercising operation/provider, tools, system instructions, token usage, backward compatibility, and span naming against the transform pipeline. The expectations align with the new SpanAttributes and transformLLMSpans behavior, and the direct transformation test for tools is a nice touch.

packages/traceloop-sdk/test/ai-sdk-integration.test.ts (1)

107-175: Integration checks for new GEN_AI_* + legacy LLM_* attributes look good

The updated OpenAI and Google tests correctly:

  • Find spans by name.startsWith("text.generate") to accommodate transformed names.
  • Assert GEN_AI_OPERATION_NAME === "chat" and provider name ("openai" / "gcp.vertex_ai").
  • Verify GEN_AI_REQUEST_MODEL, prompt/completion attributes, and both GEN_AI_* and LLM_* token-usage attributes plus total tokens.

This gives solid end-to-end coverage of the new transformations while keeping backward compatibility.

Also applies to: 199-270

packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts (5)

40-78: Provider/vendor normalization and backward compatibility are well-handled

The combination of OTEL_PROVIDER_MAPPING, LEGACY_VENDOR_MAPPING, and transformVendor cleanly:

  • Normalizes provider identifiers into GEN_AI_PROVIDER_NAME.
  • Keeps LLM_SYSTEM in sync for backward compatibility.
  • Handles unknown and empty-string providers sensibly.

This matches the dual “new GEN_AI_* + deprecated LLM_*” pattern used elsewhere.

Also applies to: 543-585
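
As a rough illustration of the normalization flow described above (the mapping entries, attribute keys, and function shape are stand-ins inferred from the names quoted in this review, not the SDK's actual code):

```typescript
// Illustrative sketch only; the real OTEL_PROVIDER_MAPPING and SpanAttributes
// constants live in the SDK and may differ in detail.
const GEN_AI_PROVIDER_NAME = "gen_ai.provider.name";
const LLM_SYSTEM = "gen_ai.system"; // deprecated attribute kept in sync

const OTEL_PROVIDER_MAPPING: Record<string, string> = {
  openai: "openai",
  anthropic: "anthropic",
  "google.vertex": "gcp.vertex_ai",
};

function transformVendor(
  raw: string | undefined,
  attributes: Record<string, unknown>,
): void {
  if (!raw) return; // empty/missing providers are left untouched
  const lowered = raw.toLowerCase();
  // First mapping key contained in the provider string wins;
  // unknown providers pass through unchanged.
  let normalized = lowered;
  for (const [key, value] of Object.entries(OTEL_PROVIDER_MAPPING)) {
    if (lowered.includes(key)) {
      normalized = value;
      break;
    }
  }
  attributes[GEN_AI_PROVIDER_NAME] = normalized; // new OTel attribute
  attributes[LLM_SYSTEM] = normalized; // deprecated twin, for backward compat
}
```

The point of the sketch is the dual write at the end: one normalization pass feeds both the new and the deprecated key, so neither can drift.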


165-223: Response text/object/tool-call transformations align with new GEN_AI_OUTPUT_MESSAGES

The updated transformResponseText, transformResponseObject, and transformResponseToolCalls correctly:

  • Build a structured assistant message with parts (text or tool_call).
  • Populate GEN_AI_OUTPUT_MESSAGES with the JSON array of messages.
  • Keep LLM_OUTPUT_MESSAGES in sync for backward compatibility.
  • Maintain the legacy flat completion attributes.

This matches the expectations in the new tests and keeps both representations available.

Also applies to: 225-276
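
A minimal sketch of the dual-emission shape for output messages (the deprecated key name below is a placeholder; the real constant lives in SpanAttributes):

```typescript
// Sketch, not SDK source: a single assistant message with typed parts is
// serialized into the new structured attribute and its deprecated twin.
type Part =
  | { type: "text"; content: string }
  | { type: "tool_call"; name: string; arguments: string };

function setOutputMessages(
  parts: Part[],
  attributes: Record<string, unknown>,
): void {
  const messages = [{ role: "assistant", parts }];
  const serialized = JSON.stringify(messages);
  attributes["gen_ai.output_messages"] = serialized; // new OTel attribute
  attributes["llm.output_messages"] = serialized; // deprecated twin (placeholder key)
  attributes["gen_ai.completion.0.role"] = "assistant"; // legacy flat attribute survives too
}
```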


402-502: System instructions and input messages separation matches spec while preserving compat

transformPrompts’s split of system messages into GEN_AI_SYSTEM_INSTRUCTIONS and the combined GEN_AI_INPUT_MESSAGES (plus deprecated LLM_INPUT_MESSAGES) is consistent with the new tests and keeps older consumers working.

You already note in comments that OTel recommends excluding system messages from GEN_AI_INPUT_MESSAGES; the current “all messages” approach is a reasonable compatibility trade-off until you can afford a breaking change.
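
The split described above might look like this in miniature (message shape and attribute keys are assumptions based on the test expectations, not the SDK source):

```typescript
// Sketch: system messages are copied into gen_ai.system_instructions while
// gen_ai.input_messages keeps *all* messages as a compatibility trade-off.
type ChatMessage = { role: string; content: string };

function splitSystemInstructions(
  messages: ChatMessage[],
  attributes: Record<string, unknown>,
): void {
  const system = messages
    .filter((m) => m.role === "system")
    .map((m) => ({ role: "system", parts: [{ type: "text", content: m.content }] }));
  if (system.length > 0) {
    attributes["gen_ai.system_instructions"] = JSON.stringify(system);
  }
  // OTel recommends excluding system messages here; keeping them avoids a
  // breaking change for existing consumers.
  const all = messages.map((m) => ({
    role: m.role,
    parts: [{ type: "text", content: m.content }],
  }));
  attributes["gen_ai.input_messages"] = JSON.stringify(all);
}
```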


504-541: Token usage mapping maintains new GEN_AI_* and legacy LLM_* in sync

transformPromptTokens and transformCompletionTokens now:

  • Populate GEN_AI_USAGE_INPUT_TOKENS / GEN_AI_USAGE_OUTPUT_TOKENS.
  • Preserve LLM_USAGE_PROMPT_TOKENS / LLM_USAGE_COMPLETION_TOKENS.
  • Allow calculateTotalTokens to keep using the legacy fields.

This matches the new tests that assert equality between GEN_AI and LLM token counts.
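
In outline, the token mapping amounts to the following (the total-tokens key is a hypothetical name standing in for the legacy constant):

```typescript
// Sketch, not SDK source: writes the new gen_ai.usage.* names alongside the
// deprecated prompt/completion fields and derives the total from the legacy pair.
function setTokenUsage(
  inputTokens: number,
  outputTokens: number,
  attributes: Record<string, unknown>,
): void {
  attributes["gen_ai.usage.input_tokens"] = inputTokens; // new OTel attribute
  attributes["gen_ai.usage.prompt_tokens"] = inputTokens; // deprecated twin, kept in sync
  attributes["gen_ai.usage.output_tokens"] = outputTokens; // new OTel attribute
  attributes["gen_ai.usage.completion_tokens"] = outputTokens; // deprecated twin
  // calculateTotalTokens keeps reading the legacy fields; key name is hypothetical
  attributes["llm.usage.total_tokens"] =
    (attributes["gen_ai.usage.prompt_tokens"] as number) +
    (attributes["gen_ai.usage.completion_tokens"] as number);
}
```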


587-619: Telemetry metadata normalization is consistent with association-properties model

transformTelemetryMetadata properly:

  • Collects all ai.telemetry.metadata.* keys.
  • Re-emits them under TRACELOOP_ASSOCIATION_PROPERTIES.*.
  • Cleans up the original keys.

That keeps the span attribute surface cleaner while still exposing metadata in a structured way.
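
A self-contained sketch of the metadata re-homing (the prefix strings are assumptions matching the attribute names discussed here):

```typescript
// Sketch: move every ai.telemetry.metadata.* key under
// traceloop.association.properties.* and delete the original.
function transformTelemetryMetadata(attributes: Record<string, unknown>): void {
  const prefix = "ai.telemetry.metadata.";
  for (const key of Object.keys(attributes)) {
    if (key.startsWith(prefix)) {
      const suffix = key.slice(prefix.length);
      attributes[`traceloop.association.properties.${suffix}`] = attributes[key];
      delete attributes[key]; // clean up the original AI SDK key
    }
  }
}
```

Since `Object.keys` snapshots the key list up front, mutating the object inside the loop is safe.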

packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)

17-109: GenAI SpanAttributes expansion and deprecation hints look consistent

The new SpanAttributes entries for GEN_AI operation, provider, request/response, usage, messages, tools, and agent attributes, plus the LLM_* deprecation comments, look consistent and match how they’re consumed in the transform and test code. Having all of these centralized here is exactly what’s needed for the semconv alignment and future discoverability.
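
For readers unfamiliar with the file, the paired-constant pattern might look roughly like this (values and names illustrate the pattern; they are not copied from the actual SemanticAttributes.ts):

```typescript
// Sketch of the paired-constant pattern: new GEN_AI_* keys sit next to their
// deprecated LLM_* counterparts, with JSDoc pointing at the replacement.
const SpanAttributes = {
  GEN_AI_PROVIDER_NAME: "gen_ai.provider.name",
  GEN_AI_USAGE_INPUT_TOKENS: "gen_ai.usage.input_tokens",
  /** @deprecated Use GEN_AI_PROVIDER_NAME instead. */
  LLM_SYSTEM: "gen_ai.system",
  /** @deprecated Use GEN_AI_USAGE_INPUT_TOKENS instead. */
  LLM_USAGE_PROMPT_TOKENS: "gen_ai.usage.prompt_tokens",
} as const;
```

Keeping both sets in one object means editors surface the deprecation note at every legacy call site while the new names stay discoverable beside them.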

Comment on lines +80 to +99
/**
 * Adds gen_ai.operation.name attribute based on AI SDK span name
 */
const addOperationName = (
  spanName: string,
  attributes: Record<string, any>,
): void => {
  // Map AI SDK span names to OTel operation names
  const operationMapping: Record<string, string> = {
    [AI_GENERATE_TEXT_DO_GENERATE]: "chat",
    [AI_GENERATE_OBJECT_DO_GENERATE]: "generate_content",
    [AI_STREAM_TEXT_DO_STREAM]: "chat",
    "ai.embed.doEmbed": "embeddings",
    "ai.embedMany.doEmbed": "embeddings",
  };

  const operation = operationMapping[spanName] || "chat";
  attributes[SpanAttributes.GEN_AI_OPERATION_NAME] = operation;
};


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Verify addOperationName sees the expected span names

addOperationName maps AI SDK span names like ai.generateText.doGenerate, ai.generateObject.doGenerate, and ai.streamText.doStream to OTel operations, but transformLLMSpans is called with span.name from transformAiSdkSpanAttributes, and transformAiSdkSpanNames may already have renamed the span to "text.generate ...", "object.generate ...", etc.

If transformAiSdkSpanNames runs before transformAiSdkSpanAttributes, the mapping entries based on the original AI SDK names will never match, and you’ll always fall back to "chat" for those operations (particularly generateObject and stream cases).

Consider either:

  • Passing a “raw” span-name key into transformLLMSpans (e.g., derive it from HANDLED_SPAN_NAMES in transformAiSdkSpanAttributes), or
  • Changing the operationMapping keys to match the renamed bases ("text.generate", "object.generate", "text.stream") if that’s what is actually observed at this stage.

Also applies to: 621-653, 673-692


🏁 Script executed:

```shell
#!/bin/bash
# Inspect how these transformers are wired to span processors to confirm call order.
rg -n "transformAiSdkSpanNames" -C5
rg -n "transformAiSdkSpanAttributes" -C5
```

Length of output: 8076


I need to see the span processor's structure around these functions to understand the execution order:


---

🏁 Script executed:

```shell
#!/bin/bash
# Extract the relevant section of span-processor.ts to see function structure
sed -n '140,250p' packages/traceloop-sdk/src/lib/tracing/span-processor.ts
```

Length of output: 2737



---

🏁 Script executed:

```shell
#!/bin/bash
# Get more context to see which function contains line 169
sed -n '80,180p' packages/traceloop-sdk/src/lib/tracing/span-processor.ts
```

Length of output: 2915


Now I need to see what span names transformAiSdkSpanNames actually produces:


---

🏁 Script executed:

```shell
#!/bin/bash
# Extract transformAiSdkSpanNames function to see what it renames spans to
sed -n '673,694p' packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts
```

Length of output: 694



---

🏁 Script executed:

```shell
#!/bin/bash
# Find HANDLED_SPAN_NAMES definition
rg -n "HANDLED_SPAN_NAMES" -A 20 packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts | head -40
```

Length of output: 1588



---

🏁 Script executed:

```shell
#!/bin/bash
# Search for tests related to addOperationName
rg -n "addOperationName\|operation.name\|GEN_AI_OPERATION_NAME" packages/traceloop-sdk/test/ai-sdk-transformations.test.ts -C 3
```

Length of output: 0



---

🏁 Script executed:

```shell
#!/bin/bash
# Check if embed span names are in HANDLED_SPAN_NAMES or if they pass through with original names
rg -n "embed\|Embed" packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts | head -20
```

Length of output: 0


Fix addOperationName mapping to use transformed span names instead of original AI SDK names

The issue is valid: transformAiSdkSpanNames runs in onSpanStart and renames AI SDK spans (e.g., "ai.generateText.doGenerate""text.generate"). By the time transformAiSdkSpanAttributes runs in onSpanEnd, the span name has already been transformed. When transformLLMSpans receives span.name, it passes the transformed name to addOperationName, which then fails to match against its original AI SDK name keys.

Update operationMapping keys to match the transformed names from HANDLED_SPAN_NAMES:

  • Change [AI_GENERATE_TEXT_DO_GENERATE] to "text.generate"
  • Change [AI_GENERATE_OBJECT_DO_GENERATE] to "object.generate"
  • Change [AI_STREAM_TEXT_DO_STREAM] to "text.stream"

The hardcoded "ai.embed.doEmbed" keys work because embed spans are not in HANDLED_SPAN_NAMES and keep their original names.

🤖 Prompt for AI Agents
In packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts around lines
80 to 99, addOperationName currently maps using original AI SDK span name
constants but onSpanStart already transforms span names; update operationMapping
keys to use the transformed names from HANDLED_SPAN_NAMES so matches succeed
(replace AI_GENERATE_TEXT_DO_GENERATE with "text.generate",
AI_GENERATE_OBJECT_DO_GENERATE with "object.generate", and
AI_STREAM_TEXT_DO_STREAM with "text.stream"), keep the embed keys as-is, and
ensure the default remains "chat" while assigning
attributes[SpanAttributes.GEN_AI_OPERATION_NAME] = operation.
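
Under the review's assumption about the renamed span bases, the corrected mapping could be sketched like this (the model-suffix handling is an illustrative guess, not the SDK's exact logic):

```typescript
// Sketch of the fix: key the mapping on the already-transformed span names,
// since renaming happens in onSpanStart before attributes are transformed.
const operationMapping: Record<string, string> = {
  "text.generate": "chat",
  "object.generate": "generate_content",
  "text.stream": "chat",
  // embed spans are not renamed, so they keep their original AI SDK names
  "ai.embed.doEmbed": "embeddings",
  "ai.embedMany.doEmbed": "embeddings",
};

function operationFor(spanName: string): string {
  // Transformed names may carry a model suffix, e.g. "text.generate gpt-3.5-turbo"
  const base = spanName.split(" ")[0];
  return operationMapping[base] ?? "chat"; // default stays "chat"
}
```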

Comment on lines +1 to +417
/*
* Comprehensive tests for OTel GenAI Semantic Conventions compliance
* Tests all new gen_ai.* attributes added for OTel compliance
*/

import * as assert from "assert";
import { openai as vercel_openai } from "@ai-sdk/openai";
import { anthropic as vercel_anthropic } from "@ai-sdk/anthropic";
import { generateText } from "ai";
import { SpanAttributes } from "@traceloop/ai-semantic-conventions";

import * as traceloop from "../src";

import { Polly, setupMocha as setupPolly } from "@pollyjs/core";
import NodeHttpAdapter from "@pollyjs/adapter-node-http";
import FetchAdapter from "@pollyjs/adapter-fetch";
import FSPersister from "@pollyjs/persister-fs";
import { initializeSharedTraceloop, getSharedExporter } from "./test-setup";

const memoryExporter = getSharedExporter();

Polly.register(NodeHttpAdapter);
Polly.register(FetchAdapter);
Polly.register(FSPersister);

describe("AI SDK OTel GenAI Semantic Conventions", function () {
// Increase timeout for all tests in this suite
this.timeout(10000);

setupPolly({
adapters: ["node-http", "fetch"],
persister: "fs",
recordIfMissing: process.env.RECORD_MODE === "NEW",
recordFailedRequests: true,
mode: process.env.RECORD_MODE === "NEW" ? "record" : "replay",
matchRequestsBy: {
headers: false,
url: {
protocol: true,
hostname: true,
pathname: true,
query: false,
},
},
logging: true,
});

before(async function () {
if (process.env.RECORD_MODE !== "NEW") {
// Set dummy API keys for replay mode
process.env.OPENAI_API_KEY = "test";
process.env.ANTHROPIC_API_KEY = "test";
process.env.GOOGLE_GENERATIVE_AI_API_KEY = "test";
}

initializeSharedTraceloop();
});

beforeEach(function () {
const { server } = this.polly as Polly;
server.any().on("beforePersist", (_req, recording) => {
recording.request.headers = recording.request.headers.filter(
({ name }: { name: string }) =>
!["authorization", "x-api-key", "anthropic-version"].includes(
name.toLowerCase(),
),
);
});
});

afterEach(async () => {
await traceloop.forceFlush();
memoryExporter.reset();
});

describe("gen_ai.operation.name attribute", () => {
it("should set operation.name to 'chat' for generateText", async () => {
await traceloop.withWorkflow({ name: "test_operation_name" }, async () => {
await generateText({
messages: [{ role: "user", content: "Say hello" }],
model: vercel_openai("gpt-3.5-turbo"),
experimental_telemetry: { isEnabled: true },
});
});

await traceloop.forceFlush();
const spans = memoryExporter.getFinishedSpans();
const aiSpan = spans.find((s) => s.name.startsWith("text.generate"));

assert.ok(aiSpan, "AI span not found");
assert.strictEqual(
aiSpan.attributes[SpanAttributes.GEN_AI_OPERATION_NAME],
"chat",
"Operation name should be 'chat'",
);
});
});

describe("gen_ai.provider.name attribute", () => {
it("should set provider.name to 'openai' for OpenAI", async () => {
await traceloop.withWorkflow({ name: "test_openai_provider" }, async () => {
await generateText({
messages: [{ role: "user", content: "Hello" }],
model: vercel_openai("gpt-3.5-turbo"),
experimental_telemetry: { isEnabled: true },
});
});

await traceloop.forceFlush();
const spans = memoryExporter.getFinishedSpans();
const aiSpan = spans.find((s) => s.name.startsWith("text.generate"));

assert.ok(aiSpan);
assert.strictEqual(
aiSpan.attributes[SpanAttributes.GEN_AI_PROVIDER_NAME],
"openai",
"Provider name should be 'openai' (OTel standard)",
);
});

it("should set provider.name to 'anthropic' for Anthropic", async () => {
await traceloop.withWorkflow(
{ name: "test_anthropic_provider" },
async () => {
await generateText({
messages: [{ role: "user", content: "Hello" }],
model: vercel_anthropic("claude-3-haiku-20240307"),
experimental_telemetry: { isEnabled: true },
});
},
);

await traceloop.forceFlush();
const spans = memoryExporter.getFinishedSpans();
const aiSpan = spans.find((s) => s.name.startsWith("text.generate"));

assert.ok(aiSpan);
assert.strictEqual(
aiSpan.attributes[SpanAttributes.GEN_AI_PROVIDER_NAME],
"anthropic",
"Provider name should be 'anthropic' (OTel standard)",
);
});
});

describe("gen_ai.tool.definitions attribute", () => {
it("should create structured tool.definitions via transformation", () => {
// Test transformation directly rather than full API call
// since tool schema validation is complex
const { transformLLMSpans } = require("../src/lib/tracing/ai-sdk-transformations");

const attributes: Record<string, any> = {
"ai.prompt.tools": [
{
type: "function",
name: "getWeather",
description: "Get the current weather for a location",
parameters: {
type: "object",
properties: {
location: { type: "string", description: "The city and state" },
unit: {
type: "string",
enum: ["celsius", "fahrenheit"],
default: "celsius",
},
},
required: ["location"],
},
},
],
};

transformLLMSpans(attributes);

// Check for gen_ai.tool.definitions (new OTel attribute)
const toolDefs = attributes[SpanAttributes.GEN_AI_TOOL_DEFINITIONS];
assert.ok(toolDefs, "tool.definitions should be set");

const parsed = JSON.parse(toolDefs);
assert.ok(Array.isArray(parsed), "tool.definitions should be an array");
assert.strictEqual(parsed.length, 1, "Should have 1 tool");
assert.strictEqual(parsed[0].type, "function", "Tool type should be 'function'");
assert.strictEqual(
parsed[0].function.name,
"getWeather",
"Tool name should be 'getWeather'",
);
assert.ok(
parsed[0].function.description,
"Tool should have description",
);
assert.ok(parsed[0].function.parameters, "Tool should have parameters");

// Also verify backward compatibility - flat format should still exist
assert.strictEqual(
attributes[`${SpanAttributes.LLM_REQUEST_FUNCTIONS}.0.name`],
"getWeather",
);
});
});

describe("gen_ai.system_instructions attribute", () => {
it("should separate system instructions from input messages", async () => {
await traceloop.withWorkflow(
{ name: "test_system_instructions" },
async () => {
await generateText({
messages: [
{
role: "system",
content: "You are a helpful assistant specialized in weather.",
},
{ role: "user", content: "What's the weather?" },
],
model: vercel_openai("gpt-3.5-turbo"),
experimental_telemetry: { isEnabled: true },
});
},
);

await traceloop.forceFlush();
const spans = memoryExporter.getFinishedSpans();
const aiSpan = spans.find((s) => s.name.startsWith("text.generate"));

assert.ok(aiSpan);

// Check for gen_ai.system_instructions (new OTel attribute)
const systemInstructions =
aiSpan.attributes[SpanAttributes.GEN_AI_SYSTEM_INSTRUCTIONS];
assert.ok(
systemInstructions,
"system_instructions should be set",
);

const parsed = JSON.parse(systemInstructions as string);
assert.ok(Array.isArray(parsed), "system_instructions should be an array");
assert.strictEqual(parsed.length, 1, "Should have 1 system message");
assert.strictEqual(parsed[0].role, "system");
assert.ok(
parsed[0].parts[0].content.includes("helpful assistant"),
"Should contain system message content",
);

// Check that input messages still include both (for backward compat)
const inputMessages =
aiSpan.attributes[SpanAttributes.GEN_AI_INPUT_MESSAGES];
assert.ok(inputMessages);
const inputParsed = JSON.parse(inputMessages as string);
assert.strictEqual(inputParsed.length, 2, "Input messages should include both");
});
});

describe("gen_ai.usage tokens attributes", () => {
it("should set both new and deprecated token attributes", async () => {
await traceloop.withWorkflow({ name: "test_token_attributes" }, async () => {
await generateText({
messages: [{ role: "user", content: "Count to 5" }],
model: vercel_openai("gpt-3.5-turbo"),
experimental_telemetry: { isEnabled: true },
});
});

await traceloop.forceFlush();
const spans = memoryExporter.getFinishedSpans();
const aiSpan = spans.find((s) => s.name.startsWith("text.generate"));

assert.ok(aiSpan);

// Check new OTel attributes
assert.ok(
aiSpan.attributes[SpanAttributes.GEN_AI_USAGE_INPUT_TOKENS],
"gen_ai.usage.input_tokens should be set",
);
assert.ok(
aiSpan.attributes[SpanAttributes.GEN_AI_USAGE_OUTPUT_TOKENS],
"gen_ai.usage.output_tokens should be set",
);

// Check deprecated attributes still exist (backward compatibility)
assert.ok(
aiSpan.attributes[SpanAttributes.LLM_USAGE_PROMPT_TOKENS],
"gen_ai.usage.prompt_tokens should still be set",
);
assert.ok(
aiSpan.attributes[SpanAttributes.LLM_USAGE_COMPLETION_TOKENS],
"gen_ai.usage.completion_tokens should still be set",
);

// Verify values match
assert.strictEqual(
aiSpan.attributes[SpanAttributes.GEN_AI_USAGE_INPUT_TOKENS],
aiSpan.attributes[SpanAttributes.LLM_USAGE_PROMPT_TOKENS],
"Input tokens should match prompt tokens",
);
assert.strictEqual(
aiSpan.attributes[SpanAttributes.GEN_AI_USAGE_OUTPUT_TOKENS],
aiSpan.attributes[SpanAttributes.LLM_USAGE_COMPLETION_TOKENS],
"Output tokens should match completion tokens",
);
});
});

describe("Backward compatibility", () => {
it("should maintain all deprecated attributes alongside new ones", async () => {
await traceloop.withWorkflow(
{ name: "test_backward_compatibility" },
async () => {
await generateText({
messages: [
{ role: "system", content: "You are helpful" },
{ role: "user", content: "Hello" },
],
model: vercel_openai("gpt-3.5-turbo"),
experimental_telemetry: { isEnabled: true },
});
},
);

await traceloop.forceFlush();
const spans = memoryExporter.getFinishedSpans();
const aiSpan = spans.find((s) => s.name.startsWith("text.generate"));

assert.ok(aiSpan);

// New attributes should exist
assert.ok(
aiSpan.attributes[SpanAttributes.GEN_AI_OPERATION_NAME],
"New: operation.name",
);
assert.ok(
aiSpan.attributes[SpanAttributes.GEN_AI_PROVIDER_NAME],
"New: provider.name",
);
assert.ok(
aiSpan.attributes[SpanAttributes.GEN_AI_USAGE_INPUT_TOKENS],
"New: usage.input_tokens",
);
assert.ok(
aiSpan.attributes[SpanAttributes.GEN_AI_USAGE_OUTPUT_TOKENS],
"New: usage.output_tokens",
);
assert.ok(
aiSpan.attributes[SpanAttributes.GEN_AI_INPUT_MESSAGES],
"New: input.messages",
);
assert.ok(
aiSpan.attributes[SpanAttributes.GEN_AI_OUTPUT_MESSAGES],
"New: output.messages",
);

// Deprecated attributes should still exist
assert.ok(
aiSpan.attributes[SpanAttributes.LLM_SYSTEM],
"Deprecated: gen_ai.system",
);
assert.ok(
aiSpan.attributes[SpanAttributes.LLM_USAGE_PROMPT_TOKENS],
"Deprecated: usage.prompt_tokens",
);
assert.ok(
aiSpan.attributes[SpanAttributes.LLM_USAGE_COMPLETION_TOKENS],
"Deprecated: usage.completion_tokens",
);
assert.ok(
aiSpan.attributes[SpanAttributes.LLM_INPUT_MESSAGES],
"Deprecated: LLM_INPUT_MESSAGES",
);
assert.ok(
aiSpan.attributes[SpanAttributes.LLM_OUTPUT_MESSAGES],
"Deprecated: LLM_OUTPUT_MESSAGES",
);

// Flat format prompts/completions should still exist
assert.ok(
aiSpan.attributes["gen_ai.prompt.0.role"],
"Flat format: prompt.0.role",
);
assert.ok(
aiSpan.attributes["gen_ai.completion.0.role"],
"Flat format: completion.0.role",
);
});
});

describe("Span naming", () => {
it("should follow OTel pattern: {operation} {model}", async function () {
this.timeout(10000); // Increase timeout for API call

// Clear any previous spans
memoryExporter.reset();

await traceloop.withWorkflow({ name: "test_span_naming" }, async () => {
await generateText({
messages: [{ role: "user", content: "Say hi" }],
model: vercel_openai("gpt-3.5-turbo"),
experimental_telemetry: { isEnabled: true },
});
});

await traceloop.forceFlush();
const spans = memoryExporter.getFinishedSpans();
const aiSpan = spans.find((s) => s.name.startsWith("text.generate"));

assert.ok(aiSpan, "AI span should exist");
// Should be like "text.generate gpt-3.5-turbo"
assert.ok(
aiSpan.name.includes("text.generate"),
"Span name should include operation",
);
assert.ok(
aiSpan.name.includes("gpt-3.5-turbo") || aiSpan.name === "text.generate",
"Span name should include model name when available",
);
});
});
});

⚠️ Potential issue | 🟡 Minor

Fix Prettier issues reported by CI

CI reports style issues for this file. Please run the repo’s formatter (e.g., pnpm lint or prettier --write packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts) to align with the configured style.

🧰 Tools
🪛 GitHub Actions: CI

[warning] 1-1: Code style issues found in this file. Run 'prettier --write' to fix.

🤖 Prompt for AI Agents
In packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts lines 1 to 417, CI
reported Prettier/formatting violations; run the project's formatter (e.g., pnpm
lint or prettier --write
packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts), stage the modified
file, and commit the changes so the file matches the repository's Prettier
configuration and CI will pass.


@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to 7cea57e in 36 minutes and 41 seconds. Click for details.
  • Reviewed 2533 lines of code in 13 files
  • Skipped 0 files when reviewing.
  • Skipped posting 10 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts:83
  • Draft comment:
    Good use of an operation mapping to assign the new GEN_AI_OPERATION_NAME attribute. Consider adding a brief comment on why the default is chosen as ‘chat’ when no match is found.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts:545
  • Draft comment:
    In transformVendor, the loop over OTEL_PROVIDER_MAPPING keys performs a lower-case conversion on the value in each iteration. Consider precomputing the provider value in lower case once to avoid repeated conversion in the loop.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 35% vs. threshold = 50% This is a valid micro-optimization - calling toLowerCase() once instead of potentially multiple times in a loop is a reasonable suggestion. However, I need to consider: 1) The loop has a break statement so it exits on first match, 2) The mapping has only ~12 entries, 3) This is a transformation function that likely runs once per span, 4) The performance impact is negligible. The comment is technically correct but this is a very minor optimization that may not be worth the code change. It's borderline - not obviously wrong, but also not a significant issue. The rules say comments should be actionable and clear, which this is, but also that they shouldn't be obvious or unimportant. While technically correct, this is a micro-optimization on a small loop (12 items) that exits early on match. The performance gain would be negligible in practice. This might fall under "obvious or unimportant" comments that should be removed. However, the comment is actionable, clear, and represents a legitimate code quality improvement. It's a simple one-line change that makes the code slightly more efficient and is a best practice. Even if the performance impact is small, it's still a valid refactoring suggestion. This is a borderline case - it's a valid, actionable code quality suggestion but the impact is minimal. Given the rules emphasize removing comments unless there's strong evidence they're useful, and this is a micro-optimization with negligible real-world impact, I'll lean toward removing it.
3. packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts:407
  • Draft comment:
    The transformPrompts function separates system instructions from input messages and then combines them for backward compatibility. It may be worth adding a comment explaining the rationale behind concatenating systemInstructions and inputMessages to support older consumers.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
4. packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts:332
  • Draft comment:
    In transformTools, the try-catch for JSON parsing of string tools suppresses errors. It might be useful to log (or at least comment) when a tool item fails parsing, so that unexpected formats can be diagnosed.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts:591
  • Draft comment:
    The transformTelemetryMetadata function iterates over all keys and then deletes them separately. This approach is safe, but ensure that no needed metadata is lost. A note in the comments might help clarify that these attributes are solely for association properties.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
6. packages/traceloop-sdk/src/lib/tracing/ai-sdk-transformations.ts:621
  • Draft comment:
    The overall transformation pipeline in transformLLMSpans is well structured, calling each helper in sequence. It might be beneficial to document the order dependency (e.g., vendor transformation after tokens) to help future maintainers understand why the order matters.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
7. packages/traceloop-sdk/test/ai-sdk-transformations.test.ts:40
  • Draft comment:
    Test helpers for creating mock spans are concise and effectively simulate updateName behavior. Ensure that any future changes to the span interface continue to be reflected in these helpers.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
8. packages/traceloop-sdk/test/ai-sdk-integration.test.ts:115
  • Draft comment:
    The integration test for OpenAI provider checks both new and deprecated attributes. This is excellent for backward compatibility. Ensure that any further changes to attribute naming conventions also update these tests.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
9. packages/traceloop-sdk/test/ai-sdk-otel-attributes.test.ts:76
  • Draft comment:
    The comprehensive test suite for GenAI semantic conventions is impressive. The tests cover multiple edge cases including invalid JSON and mixed content types. Consider adding a remark in the documentation explaining the necessity of keeping both new and deprecated attribute sets.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
10. packages/traceloop-sdk/test/ai-sdk-transformations.test.ts:1019
  • Draft comment:
    The final tests that combine all attribute transformations give confidence in the end-to-end correctness. No issues noted; the removal of original AI SDK attributes is verified rigorously.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_U8V7oo9LDxa3YBrV

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
