
Added tracing for inference #30800

Closed
wants to merge 2 commits into from

Conversation

@howieleung (Member) commented Aug 18, 2024

Packages impacted by this PR

Issues associated with this PR

Describe the problem that is addressed by this PR

Added telemetry tracing for inference.

TODO before merging:
  • Add tests
  • PR for TypeSpecs
  • Add event tracking
  • Add a condition to enable/disable certain attributes

What are the possible designs available to address the problem? If there are more than one possible design, why was the one in this PR chosen?

Are there test cases added in this PR? (If not, why?)

Provide a list of related PRs (if any)

Command used to generate this PR: (Applicable only to SDK release request PRs)

Checklists

  • Added impacted package name to the issue description
  • Does this PR need any fixes in the SDK Generator? (If so, create an Issue in the Autorest/typescript repository and link it here)
  • Added a changelog (if necessary)

@azure-sdk (Collaborator) commented Aug 18, 2024

API change check

API changes are not detected in this pull request.

@maorleger (Member) left a comment:

My meta-question is whether we can put this into core-client-rest or one of the other RLC core packages instead of a one-off? code changes look good

@howieleung (Member, Author) replied:

My meta-question is whether we can put this into core-client-rest or one of the other RLC core packages instead of a one-off? code changes look good

Good question. My assignment is specifically for inference only, mirroring this PR for Python:
Azure/azure-sdk-for-python#36890
We are following this schema:
https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-spans.md

I guess this schema isn't for all clients, and this is the first one we are instrumenting.

@lmolkova (Member) left a comment:

Great start!

Would it be possible to try it out and attach some screenshots?

It's probably easiest to create an Azure Monitor (Application Insights) resource and enable Azure Monitor OTel:
https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/monitor/monitor-opentelemetry#enable-azure-monitor-opentelemetry-client

Let me know if you need any help!


function tryProcessError(span: TracingSpan, error: unknown): void {
  try {
    span.setStatus({
Review comment (Member):

We need to set error.type here too (to the fully qualified exception type name).

};
*/
span.setStatus({
status: "success",
Review comment (Member):

We should not set the status to success; let's leave it unset.

Reply (Member, Author):

Unset now

console.log("== Chat Completions Sample ==");

// initialize a span named "main" with default options for the spans
await tracingClient.withSpan("main", {}, async (updatedOptions) => {
Review comment (Member):

Please use the OTel API here; this is end-user code.

Suggested change:
- await tracingClient.withSpan("main", {}, async (updatedOptions) => {
+ await tracer.startActiveSpan('rollTheDice', (span: Span) => {

https://opentelemetry.io/docs/languages/js/instrumentation/#create-spans

@howieleung (Member, Author) replied:

My meta-question is whether we can put this into core-client-rest or one of the other RLC core packages instead of a one-off? code changes look good

With the new changes, there are trace and traceAsync functions in core-tracing that accept several callback functions. LLM devs are welcome to use them to trace their own functions. In addition, ai-inference has the mapper functions to create the span attributes to invoke tracing.

console.log("== Chat Completions Sample ==");

// Initialize a span named "main" with default options for the spans
await tracingClient.withSpan("main", {}, async (updatedOptions) => {
Review comment (Member):

same here

const body = (request as RequestParameterWithBodyType).body;

map.set(TracingAttributesEnum.Operation_Name, getOperationName(path));
map.set(TracingAttributesEnum.System, "az.ai_inference");
Review comment (Member):

Sorry for changing things; we'll need to update this again to whatever is merged in open-telemetry/semantic-conventions#1393 (currently it's az.ai.inference).

map.set(TracingAttributesEnum.Request_Frequency_Penalty, body?.frequency_penalty);
map.set(TracingAttributesEnum.Request_Max_Tokens, body?.max_tokens);
map.set(TracingAttributesEnum.Request_Presence_Penalty, body?.presence_penalty);
map.set(TracingAttributesEnum.Request_Stop_Sequences, body?.stop);
Review comment (Member):

What's the type of body.stop? The attribute should be an array, even if there is only one stop sequence (and not set if there is no stop sequence).

Reply (Member, Author):

Right. It is an array.
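Per the review thread above, the stop-sequences attribute should always be an array, even when there is a single stop sequence, and should be left unset when there is none. A minimal sketch of that normalization, using a hypothetical helper (not code from this PR) and the gen_ai.request.stop_sequences attribute name from the gen-ai semantic conventions:

```typescript
// Hypothetical helper (not part of the PR): normalize a stop value to an
// array, or undefined when absent, so the attribute is either an array or unset.
function normalizeStopSequences(
  stop: string | string[] | null | undefined,
): string[] | undefined {
  if (stop === undefined || stop === null) return undefined; // leave the attribute unset
  return Array.isArray(stop) ? stop : [stop];
}

// Only set the attribute when a value actually exists.
const attributes = new Map<string, unknown>();
const stops = normalizeStopSequences("\n");
if (stops !== undefined) {
  attributes.set("gen_ai.request.stop_sequences", stops);
}
```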

const map = new Map<string, unknown>();
if (error) {
  if (error instanceof Error) {
    map.set(TracingAttributesEnum.Error_Type, error.message);
Review comment (Member):

The error.type attribute should be the fully qualified type of the exception, not the message.

If an error has happened, we also need to set the status on the span, e.g. like here:

span.setStatus({
  status: "error",
  error: isError(error) ? error : undefined,
});
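The two points above (error.type should be the exception's fully qualified type name rather than its message, and an error status should be recorded) can be sketched as follows. JavaScript has no namespaced exception names, so the constructor name is the closest stand-in; the helper below is an illustrative assumption, not the PR's code:

```typescript
// Illustrative helper (not the PR's code): derive error.type from the error's
// constructor name rather than its message, with the semconv "_OTHER" fallback
// for thrown values that are not Error instances.
function errorAttributes(error: unknown): Map<string, unknown> {
  const map = new Map<string, unknown>();
  if (error instanceof Error) {
    map.set("error.type", error.constructor.name); // e.g. "RangeError", not error.message
  } else if (error !== undefined && error !== null) {
    map.set("error.type", "_OTHER");
  }
  return map;
}
```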


const request = args as RequestParameterWithBodyType;

const name = `${getOperationName(path)} ${request.body?.model ?? ""}`;
Review comment (Member):

would there be a trailing comma if there is no model?

Reply (Member, Author):

No, it will just be the word "chat" after trimming.
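The exchange above (span name built from the operation name plus the model, with trimming so there is no trailing separator when the model is absent) can be sketched as, using a hypothetical helper name:

```typescript
// Hypothetical helper illustrating the naming discussed above: the span name is
// "<operation> <model>", and trim() removes the trailing space when no model is set.
function spanName(operationName: string, model?: string): string {
  return `${operationName} ${model ?? ""}`.trim();
}
```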

import { createTracingClient, OperationTracingOptions } from "@azure/core-tracing";
import { GetChatCompletionsBodyParam, GetEmbeddingsBodyParam, GetImageEmbeddingsBodyParam } from "./parameters.js";

const traceCLient = createTracingClient({ namespace: "Micirsoft.CognitiveServices", packageName: "ai-inference-rest", packageVersion: "1.0.0" });
Review comment (Member):

Suggested change:
- const traceCLient = createTracingClient({ namespace: "Micirsoft.CognitiveServices", packageName: "ai-inference-rest", packageVersion: "1.0.0" });
+ const traceClient = createTracingClient({ namespace: "Micirsoft.CognitiveServices", packageName: "ai-inference-rest", packageVersion: "1.0.0" });

@maorleger (Member) commented Sep 6, 2024:

My meta-question is whether we can put this into core-client-rest or one of the other RLC core packages instead of a one-off? code changes look good

With the new changes, there are trace and traceAsync functions in core-tracing that accept several callback functions. LLM devs are welcome to use them to trace their own functions. In addition, ai-inference has the mapper functions to create the span attributes to invoke tracing.

core-tracing is not meant to be used by LLM devs, only Azure SDK client libraries. I think there's a misunderstanding of roles and responsibilities here that I would like to clarify:

  • core-tracing: provides wrappers and convenience for adding implementation-agnostic tracing to client libraries
    • Decoupled from opentelemetry
  • client libraries: use core-tracing to create a tracing client with the correct attributes, then wrap async operations with calls to withSpan
    • Decoupled from opentelemetry
  • @azure/opentelemetry-instrumentation-azure-sdk - the glue that hooks up the core-tracing APIs with opentelemetry at runtime.
  • end users: developers who use your client library to interact with Azure services. They will use opentelemetry's APIs as needed to configure their tracing provider, exporter, etc.
    • They will also, as part of their configuration, use the opentelemetry-instrumentation-azure-sdk to wire up OTel as the implementation provider for core-tracing. They are decoupled from core-tracing and do not use it directly; the instrumentation package does

You, as a client library author, probably do not need to:

  • Use OTel's APIs directly
  • Make changes to @azure/core-tracing

An end user probably does not need to:

  • call createTracingClient
  • Use @azure/core-tracing directly

So I am not 100% sure why these changes are needed, and I think we're going down the wrong path. Let's take a step back. All I would expect to see from client library code is:

Is all this stuff needed because RLC code-gen does not support tracing today? Should we prioritize that work or add something in core-client-rest? cc @joheredi who might know the state of RLC tracing

Happy to set up a meeting about this if it helps
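The layering described above (client libraries wrap their async operations with withSpan, staying decoupled from OpenTelemetry) can be sketched with a minimal stand-in. Everything below is a hypothetical, self-contained illustration of the pattern, not the real @azure/core-tracing API:

```typescript
// Minimal stand-in types for the withSpan pattern (the real API lives in
// @azure/core-tracing; these are simplified illustrations).
interface TracingSpan {
  setAttribute(name: string, value: unknown): void;
  setStatus(status: { status: "error"; error?: Error }): void;
  end(): void;
}

function createSpan(_name: string): TracingSpan {
  const attributes = new Map<string, unknown>();
  return {
    setAttribute: (name, value) => void attributes.set(name, value),
    setStatus: () => {
      /* record error status */
    },
    end: () => {
      /* export the span */
    },
  };
}

// The wrapper a client library would use: start a span, run the operation,
// mark the span as errored if the operation throws, and always end the span.
async function withSpan<T>(
  name: string,
  callback: (span: TracingSpan) => Promise<T>,
): Promise<T> {
  const span = createSpan(name);
  try {
    return await callback(span);
  } catch (e) {
    span.setStatus({ status: "error", error: e instanceof Error ? e : undefined });
    throw e;
  } finally {
    span.end();
  }
}

// A client operation wrapped in a span; the attribute name follows the
// gen-ai semantic conventions, and the HTTP call is stubbed out.
async function getChatCompletions(model: string): Promise<string> {
  return withSpan(`chat ${model}`, async (span) => {
    span.setAttribute("gen_ai.request.model", model);
    return "ok"; // the real operation would issue the HTTP request here
  });
}
```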

@howieleung (Member, Author) replied, quoting @maorleger's comment above:

@maorleger I have the following requirement:
The SDK will provide support for flexible decorators that developers can easily apply to mark methods for tracing purposes, especially for functions that are not LLM calls. These decorators serve as annotations within the codebase, allowing developers to selectively enable tracing for specific methods or functions without extensive modifications. This approach ensures that tracing can be integrated into existing codebases with minimal disruption, enhancing flexibility and maintainability. This will be achieved via supporting language-specific decorators for tracing purposes. User function tracing will be provided by the existing APIs in OpenTelemetry and the Azure SDK.

Since JavaScript doesn't support decorators, I propose an alternative solution here:
https://github.com/howieleung/tracer
trace and traceAsync are the implementation. Since I am adding the tracing feature to the ai-inference functions, I use it here as well to get a feel for it.

@lmolkova (Member) commented Sep 6, 2024:

@howieleung

The SDK will provide support for flexible decorators that developers can easily apply to mark methods for tracing purposes, especially for functions that are not LLM calls. These decorators serve as annotations within the codebase, allowing developers to selectively enable tracing for specific methods or functions without extensive modifications. This approach ensures that tracing can be integrated into existing codebases with minimal disruption, enhancing flexibility and maintainability. This will be achieved via supporting language-specific decorators for tracing purposes. User function tracing will be provided by the existing APIs in OpenTelemetry and the Azure SDK.

OTel JS already provides flexible general-purpose decorators - https://github.com/open-telemetry/opentelemetry-js/blob/main/api/src/experimental/trace/SugaredTracer.ts

They are experimental, so we can't use them here, but users can if they want to. These decorators will eventually become stable, and your requirement will fix itself. If you want to improve the design, please share suggestions in https://github.com/open-telemetry/opentelemetry-js.

Azure SDKs don't provide user-facing tracing API - users are expected to interact with opentelemetry directly.

I'll leave it up to JS experts and JS core-tracing owners such as @maorleger and @mpodwysocki to decide whether new APIs are necessary/make sense for internal usage and convenience, but by any means, we should never recommend them to end users.

@howieleung (Member, Author) replied, quoting @lmolkova's comment above:
Yes, I read this a while ago, and I have been trying to come up with a better way to help LLM devs write cleaner code. I just got a new idea and pushed my new design again.

function tryCreateTracingClient(): TracingClient | undefined {
  try {
    return createTracingClient({
      namespace: "Micirsoft.CognitiveServices", packageName: "ai-inference-rest", packageVersion: "1.0.0"
Review comment (Member):

Typo: Micirsoft -> Microsoft

// @public
export function getClient(endpoint: string, credentials?: TokenCredential | KeyCredential, options?: ClientOptions): Client;
export function getClient(endpoint: string, credentials?: TokenCredential | KeyCredential, options?: ClientOptions, tracer?: TracerCallback): Client;
Review comment (Member):

Is this intentional? We cannot add random properties to our core public APIs without design, cross-language consistency, and archboard reviews.

Reply (Member, Author):

I think this is absolutely needed. I have to find a way to pass the tracing logic from inference to the post function in core-client-rest. I am having an SDK review meeting on Tuesday; I will bring up this topic.

@@ -68,6 +70,8 @@ export interface TracingClient {
span: TracingSpan;
updatedOptions: OptionsWithTracingContext<Options>;
};
trace<Arguments, Return>(name: string, args: Arguments, methodToTrace: () => Return, onStartTracing?: (span: TracingSpan, args: Arguments) => void, onEndTracing?: (span: TracingSpan, args: Arguments, rt?: Return, error?: unknown) => void, options?: OperationTracingOptions, spanKind?: TracingSpanKind): Return;
Review comment (Member):

Same here, we will not be able to add this to our core tracing APIs

Reply (Member, Author):

I can move this to inference or another, more public-facing library and let inference consume it.

@maorleger (Member) left a comment:

Unfortunately, we cannot add these properties to our core interfaces. We can discuss your scenarios, but this is not the right approach.

@maorleger (Member) replied:

My meta-question is whether we can put this into core-client-rest or one of the other RLC core packages instead of a one-off? code changes look good

With the new changes, there are trace and traceAsync functions in core-tracing that accept several callback functions. LLM devs are welcome to use them to trace their own functions. In addition, ai-inference has the mapper functions to create the span attributes to invoke tracing.

Just to clarify, this isn't quite what I meant: LLM devs should not be using core-tracing APIs directly. They should use OTel for tracing their own functions. Core tracing provides implementation-agnostic tracing primitives for client library authors that can be enabled via our OTel plugin by end users.

@howieleung (Member, Author) replied, quoting @maorleger's comment above:

I hear you, but there is a requirement to let users trace their custom functions in a handy way. If we need to deliver this, we need to introduce a function for it. I will once again ask in the SDK review meeting whether users should use OTel directly or whether we really should design something.

@howieleung howieleung marked this pull request as ready for review September 16, 2024 18:52