Added tracing for inference #30800
Conversation
API change check: API changes are not detected in this pull request.
My meta-question is whether we can put this into core-client-rest or one of the other RLC core packages instead of a one-off? Code changes look good.
Good question. My assignment is specifically for inference only, mirroring the corresponding PR for Python. I guess this schema isn't for all clients, and this is the first one we are instrumenting.
Great start!
Would it be possible to try it out and attach some screenshots?
It's probably easiest to create an Azure Monitor (Application Insights) resource and enable the Azure Monitor OpenTelemetry client -
https://github.com/Azure/azure-sdk-for-js/tree/main/sdk/monitor/monitor-opentelemetry#enable-azure-monitor-opentelemetry-client
Let me know if you need any help!
function tryProcessError(span: TracingSpan, error: unknown): void {
  try {
    span.setStatus({
We need to set error.type here too (to the fully qualified exception type name).
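The point above can be sketched as a small helper (hypothetical name getErrorType; JavaScript has no package-qualified exception names, so the constructor name is the closest analogue of a fully qualified type):

```typescript
// Hypothetical helper: derive an "error.type" value from a thrown value.
// The constructor name stands in for a fully qualified exception type,
// which JavaScript does not have.
function getErrorType(error: unknown): string | undefined {
  if (error instanceof Error) {
    return error.constructor.name; // e.g. "RangeError", "RestError"
  }
  if (error !== undefined && error !== null) {
    return typeof error; // non-Error throw, e.g. "string"
  }
  return undefined; // nothing thrown: leave the attribute unset
}
```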
};
*/
span.setStatus({
  status: "success",
We should not set the status to success; let's leave it unset.
Unset now
Force-pushed 2ad5d64 to a3dbfde
Force-pushed 040eacd to be263b4
sdk/ai/ai-inference-rest/samples/v1-beta/javascript/telemetry.js (outdated, resolved)
console.log("== Chat Completions Sample ==");

// initialize a span named "main" with default options for the spans
await tracingClient.withSpan("main", {}, async (updatedOptions) => {
Please use the OTel API here - this is end-user code:

- await tracingClient.withSpan("main", {}, async (updatedOptions) => {
+ await tracer.startActiveSpan('rollTheDice', (span: Span) => {
https://opentelemetry.io/docs/languages/js/instrumentation/#create-spans
sdk/ai/ai-inference-rest/samples/v1-beta/javascript/telemetry.js (outdated, resolved)
With the new changes, there are trace and traceAsync functions in core-tracing that accept several callback functions. LLM devs are welcome to use them to trace their own functions. In addition, ai-inference has the mapper functions that create the span attributes used to invoke tracing.
sdk/ai/ai-inference-rest/samples/v1-beta/typescript/src/telemetry.ts (outdated, resolved)
console.log("== Chat Completions Sample ==");

// Initialize a span named "main" with default options for the spans
await tracingClient.withSpan("main", {}, async (updatedOptions) => {
same here
const body = (request as RequestParameterWithBodyType).body;

map.set(TracingAttributesEnum.Operation_Name, getOperationName(path));
map.set(TracingAttributesEnum.System, "az.ai_inference");
Sorry for changing things - we'll need to update it again to whatever gets merged in open-telemetry/semantic-conventions#1393 (currently it's az.ai.inference).
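For illustration, a sketch of how an operation name might be derived from the request path (the helper name getOperationName comes from the snippet above; the route matching below is an assumption, not the PR's actual logic, and the return values follow the gen_ai semantic-convention operation names):

```typescript
// Hypothetical sketch: map an inference REST route to a gen_ai operation
// name. The exact route patterns are assumptions for illustration only.
function getOperationName(path: string): string {
  if (path.endsWith("/chat/completions")) return "chat";
  if (path.endsWith("/embeddings")) return "embeddings"; // text or image embeddings
  return "unknown";
}
```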
map.set(TracingAttributesEnum.Request_Frequency_Penalty, body?.frequency_penalty);
map.set(TracingAttributesEnum.Request_Max_Tokens, body?.max_tokens);
map.set(TracingAttributesEnum.Request_Presence_Penalty, body?.presence_penalty);
map.set(TracingAttributesEnum.Request_Stop_Sequences, body?.stop);
What's the type of body.stop? The attribute should be an array, even if there is only one stop sequence (and not set if there is no stop sequence).
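The normalization the comment asks for can be sketched as a small helper (hypothetical name; the union type is an assumption about what the service accepts):

```typescript
// Hypothetical helper: the stop-sequence attribute must always be an
// array, even for a single stop string, and must be omitted (undefined)
// when there is no stop sequence at all.
function normalizeStopSequences(
  stop?: string | string[] | null,
): string[] | undefined {
  if (stop === undefined || stop === null) return undefined;
  return Array.isArray(stop) ? stop : [stop];
}
```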
Right. It is an array.
const map = new Map<string, unknown>();
if (error) {
  if (error instanceof Error) {
    map.set(TracingAttributesEnum.Error_Type, error.message);
The error type attribute should be the fully qualified type of the exception, not the message. If an error has happened, we also need to set the status on the span, e.g. like here (azure-sdk-for-js/sdk/core/ts-http-runtime/src/policies/tracingPolicy.ts, lines 138 to 141 in f169f9e):

span.setStatus({ status: "error", error: isError(error) ? error : undefined });
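Both points in this thread can be combined into one sketch (the span shape below is a local stand-in for core-tracing's TracingSpan so the snippet is self-contained; it is not the real interface):

```typescript
// Minimal stand-in for the real TracingSpan, just enough for the sketch.
interface SketchSpan {
  status?: { status: "error"; error?: Error };
  attributes: Record<string, unknown>;
}

// Sketch: on failure, set the span status to "error" AND record the
// error type (constructor name), never the error message.
function tryProcessError(span: SketchSpan, error: unknown): void {
  try {
    span.status = {
      status: "error",
      error: error instanceof Error ? error : undefined,
    };
    if (error instanceof Error) {
      span.attributes["error.type"] = error.constructor.name;
    }
  } catch {
    // tracing must never throw into the caller
  }
}
```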
const request = args as RequestParameterWithBodyType;

const name = `${getOperationName(path)} ${request.body?.model ?? ""}`;
Would there be a trailing comma if there is no model?
No. It will just be the word chat after trimming.
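The trimming behavior described above can be sketched as (hypothetical helper name; behavior inferred from the snippet and the reply):

```typescript
// Hypothetical sketch: the span name is "{operation} {model}", trimmed so
// that a missing model leaves just the operation name (e.g. "chat").
function getSpanName(operationName: string, model?: string): string {
  return `${operationName} ${model ?? ""}`.trim();
}
```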
import { createTracingClient, OperationTracingOptions } from "@azure/core-tracing";
import { GetChatCompletionsBodyParam, GetEmbeddingsBodyParam, GetImageEmbeddingsBodyParam } from "./parameters.js";

const traceCLient = createTracingClient({ namespace: "Micirsoft.CognitiveServices", packageName: "ai-inference-rest", packageVersion: "1.0.0" });
- const traceCLient = createTracingClient({ namespace: "Micirsoft.CognitiveServices", packageName: "ai-inference-rest", packageVersion: "1.0.0" });
+ const traceClient = createTracingClient({ namespace: "Micirsoft.CognitiveServices", packageName: "ai-inference-rest", packageVersion: "1.0.0" });
core-tracing is not meant to be used by LLM devs, only Azure SDK client libraries. I think there's a misunderstanding of roles and responsibilities here that I would like to clarify:
You, as a client library author, probably do not need to:
An end user probably does not need to:
So I am not 100% sure why these changes are needed, and I think we're going down the wrong path. Let's take a step back. All I would expect to see from client library code is:
Is all this stuff needed because RLC code-gen does not support tracing today? Should we prioritize that work, or add something in core-client-rest? cc @joheredi, who might know the state of RLC tracing. Happy to set up a meeting about this if it helps.
@maorleger I have the following requirement: since JavaScript doesn't support decorators, I propose an alternative solution here.
OTel JS already provides flexible general-purpose decorators - https://github.com/open-telemetry/opentelemetry-js/blob/main/api/src/experimental/trace/SugaredTracer.ts - they are experimental, so we can't use them here, but users can if they want to. Azure SDKs don't provide a user-facing tracing API; users are expected to interact with OpenTelemetry directly. I'll leave it up to JS experts and JS core-tracing owners such as @maorleger and @mpodwysocki to decide whether new APIs are necessary/make sense for internal usage and convenience, but by all means, we should never recommend them to end users.
Force-pushed 9f1ee25 to 3aee693
function tryCreateTracingClient(): TracingClient | undefined {
  try {
    return createTracingClient({
      namespace: "Micirsoft.CognitiveServices", packageName: "ai-inference-rest", packageVersion: "1.0.0"
Typo: Micirsoft -> Microsoft
Force-pushed c39c4c7 to cac0b81
Force-pushed cac0b81 to ddc4daa
// @public
- export function getClient(endpoint: string, credentials?: TokenCredential | KeyCredential, options?: ClientOptions): Client;
+ export function getClient(endpoint: string, credentials?: TokenCredential | KeyCredential, options?: ClientOptions, tracer?: TracerCallback): Client;
Is this intentional? We cannot add random properties to our core public APIs without design, cross-language consistency, and archboard reviews
I think this is absolutely needed. I have to find a way to pass the tracing logic from inference to the post function in core-client-rest. I have an SDK review meeting on Tuesday; I will bring up this topic.
@@ -68,6 +70,8 @@ export interface TracingClient {
    span: TracingSpan;
    updatedOptions: OptionsWithTracingContext<Options>;
  };
+ trace<Arguments, Return>(name: string, args: Arguments, methodToTrace: () => Return, onStartTracing?: (span: TracingSpan, args: Arguments) => void, onEndTracing?: (span: TracingSpan, args: Arguments, rt?: Return, error?: unknown) => void, options?: OperationTracingOptions, spanKind?: TracingSpanKind): Return;
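A dependency-free sketch of what an implementation behind the proposed trace signature could look like (the span below is a local stub, not core-tracing's real TracingSpan, so the snippet is self-contained):

```typescript
// Local stub standing in for core-tracing's TracingSpan.
interface StubSpan {
  setAttribute(key: string, value: unknown): void;
  end(): void;
}

// Sketch of the proposed trace(): run methodToTrace inside a span,
// invoking the optional start/end callbacks around it, and rethrow errors
// after reporting them to onEndTracing.
function trace<Arguments, Return>(
  name: string,
  args: Arguments,
  methodToTrace: () => Return,
  onStartTracing?: (span: StubSpan, args: Arguments) => void,
  onEndTracing?: (span: StubSpan, args: Arguments, rt?: Return, error?: unknown) => void,
): Return {
  const attributes: Record<string, unknown> = { "span.name": name };
  const span: StubSpan = {
    setAttribute: (key, value) => { attributes[key] = value; },
    end: () => {},
  };
  onStartTracing?.(span, args);
  try {
    const rt = methodToTrace();
    onEndTracing?.(span, args, rt, undefined);
    return rt;
  } catch (error) {
    onEndTracing?.(span, args, undefined, error);
    throw error;
  } finally {
    span.end();
  }
}
```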
Same here, we will not be able to add this to our core tracing APIs
I can move this to inference or another more public-facing library and let inference consume it.
Unfortunately we cannot add these properties to our core interfaces - we can discuss your scenarios but this is not the right approach
Just to clarify, this isn't quite what I meant - LLM devs should not be using core-tracing APIs directly. They should use OTel for tracing their own functions. Core tracing provides implementation-agnostic tracing primitives for client library authors that can be enabled via our OTel plugin by end users.
I hear you. But there is a requirement to let users trace their custom functions in a handy way. If we need to deliver this, we need to introduce a function. I will once again ask people in the SDK review meeting whether they should use OTel or whether we really should design something.
Force-pushed 22be8e3 to 7040082
Force-pushed 7040082 to 5f66030
Packages impacted by this PR
Issues associated with this PR
Describe the problem that is addressed by this PR
Added telemetry tracing for inference
TODO before merging:
add test
PR for typespecs
Add event tracking
Add condition to enable/disable certain attributes
What are the possible designs available to address the problem? If there are more than one possible design, why was the one in this PR chosen?
Are there test cases added in this PR? (If not, why?)
Provide a list of related PRs (if any)
Command used to generate this PR: (Applicable only to SDK release request PRs)
Checklists