.Net: New Azure AI Inference Connector (#7963)
# Motivation and Context

This PR adds support for Azure AI Studio Model Catalog models, including those
deployed through GitHub Models. The connector uses the `Azure AI Inference SDK`
client library.

Closes #3992 
Closes #7958
RogerBarreto authored Sep 10, 2024
1 parent f79eaaf commit c11ab29
Showing 40 changed files with 2,871 additions and 23 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/dotnet-build-and-test.yml
@@ -125,6 +125,8 @@ jobs:
Bing__ApiKey: ${{ secrets.BING__APIKEY }}
OpenAI__ApiKey: ${{ secrets.OPENAI__APIKEY }}
OpenAI__ChatModelId: ${{ vars.OPENAI__CHATMODELID }}
AzureAIInference__ApiKey: ${{ secrets.AZUREAIINFERENCE__APIKEY }}
AzureAIInference__Endpoint: ${{ secrets.AZUREAIINFERENCE__ENDPOINT }}

# Generate test reports and check coverage
- name: Generate test reports
46 changes: 46 additions & 0 deletions docs/decisions/0051-dotnet-azure-model-as-a-service.md
@@ -0,0 +1,46 @@
---
# These are optional elements. Feel free to remove any of them.
status: proposed
contact: rogerbarreto
date: 2024-08-07
deciders: rogerbarreto, markwallace-microsoft
consulted: taochen
---

# Support Connector for .Net Azure Model-as-a-Service (Azure AI Studio)

## Context and Problem Statement

Customers have asked for native support for models deployed in [Azure AI Studio - Serverless APIs](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/model-catalog-overview#model-deployment-managed-compute-and-serverless-api-pay-as-you-go). This mode of consumption operates on a pay-as-you-go basis, typically using tokens for billing purposes. Clients can access the service via the [Azure AI Model Inference API](https://learn.microsoft.com/en-us/azure/ai-studio/reference/reference-model-inference-api?tabs=azure-studio) or client SDKs.

At present, there is no official support for [Azure AI Studio](https://learn.microsoft.com/en-us/azure/ai-studio/what-is-ai-studio). The purpose of this ADR is to examine the constraints of the service and explore potential solutions to enable support for the service via the development of a new AI connector.

## Azure Inference Client library for .NET

The Azure team provides a new .Net client library, [Azure.AI.Inference](https://github.com/Azure/azure-sdk-for-net/blob/Azure.AI.Inference_1.0.0-beta.1/sdk/ai/Azure.AI.Inference/README.md), for interacting with the service. Although the service API is OpenAI-compatible, it is not permissible to use the OpenAI and Azure OpenAI client libraries to interact with the service, because they are not independent of the models and their providers. Azure AI Studio features a diverse range of open-source models in addition to OpenAI models.

### Limitations

It is currently known that the first version of the client SDK will support only `Chat Completion`, `Text Embedding Generation`, and `Image Embedding Generation`, with `TextToImage Generation` planned.

There are no current plans to support the `Text Generation` modality.

## AI Connector

### Namespace options

- `Microsoft.SemanticKernel.Connectors.AzureAI`
- `Microsoft.SemanticKernel.Connectors.AzureAIInference`
- `Microsoft.SemanticKernel.Connectors.AzureAIModelInference`

Decision: `Microsoft.SemanticKernel.Connectors.AzureAIInference`

### Support for model-specific parameters

Models can possess supplementary parameters that are not part of the default API. The service API and the client SDK enable the provision of model-specific parameters. Users can provide model-specific settings via a dedicated argument along with other settings, such as `temperature` and `top_p`, among others.

The Azure AI Inference specialized `PromptExecutionSettings` will support these customizable parameters.

### Feature Branch

The development of the Azure AI Inference connector will be done in a feature branch named `feature-connectors-azureaiinference`.
1 change: 1 addition & 0 deletions dotnet/Directory.Packages.props
@@ -5,6 +5,7 @@
<ManagePackageVersionsCentrally>true</ManagePackageVersionsCentrally>
</PropertyGroup>
<ItemGroup>
<PackageVersion Include="Azure.AI.Inference" Version="1.0.0-beta.1" />
<PackageVersion Include="OpenAI" Version="2.0.0-beta.10" />
<PackageVersion Include="System.ClientModel" Version="1.1.0-beta.7" />
<PackageVersion Include="Azure.AI.ContentSafety" Version="1.0.0" />
18 changes: 18 additions & 0 deletions dotnet/SK-dotnet.sln
@@ -334,6 +334,10 @@ Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Connectors.AzureOpenAI", "s
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Connectors.AzureOpenAI.UnitTests", "src\Connectors\Connectors.AzureOpenAI.UnitTests\Connectors.AzureOpenAI.UnitTests.csproj", "{8CF06B22-50F3-4F71-A002-622DB49DF0F5}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Connectors.AzureAIInference", "src\Connectors\Connectors.AzureAIInference\Connectors.AzureAIInference.csproj", "{063044B2-A901-43C5-BFDF-5E4E71C7BC33}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Connectors.AzureAIInference.UnitTests", "src\Connectors\Connectors.AzureAIInference.UnitTests\Connectors.AzureAIInference.UnitTests.csproj", "{E0D45DDB-6D32-40FC-AC79-E1F342C4F513}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "OnnxSimpleRAG", "samples\Demos\OnnxSimpleRAG\OnnxSimpleRAG.csproj", "{8972254B-B8F0-4119-953B-378E3BACA59A}"
EndProject
Global
@@ -853,6 +857,18 @@ Global
{8CF06B22-50F3-4F71-A002-622DB49DF0F5}.Publish|Any CPU.Build.0 = Debug|Any CPU
{8CF06B22-50F3-4F71-A002-622DB49DF0F5}.Release|Any CPU.ActiveCfg = Release|Any CPU
{8CF06B22-50F3-4F71-A002-622DB49DF0F5}.Release|Any CPU.Build.0 = Release|Any CPU
{063044B2-A901-43C5-BFDF-5E4E71C7BC33}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{063044B2-A901-43C5-BFDF-5E4E71C7BC33}.Debug|Any CPU.Build.0 = Debug|Any CPU
{063044B2-A901-43C5-BFDF-5E4E71C7BC33}.Publish|Any CPU.ActiveCfg = Publish|Any CPU
{063044B2-A901-43C5-BFDF-5E4E71C7BC33}.Publish|Any CPU.Build.0 = Publish|Any CPU
{063044B2-A901-43C5-BFDF-5E4E71C7BC33}.Release|Any CPU.ActiveCfg = Release|Any CPU
{063044B2-A901-43C5-BFDF-5E4E71C7BC33}.Release|Any CPU.Build.0 = Release|Any CPU
{E0D45DDB-6D32-40FC-AC79-E1F342C4F513}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{E0D45DDB-6D32-40FC-AC79-E1F342C4F513}.Debug|Any CPU.Build.0 = Debug|Any CPU
{E0D45DDB-6D32-40FC-AC79-E1F342C4F513}.Publish|Any CPU.ActiveCfg = Debug|Any CPU
{E0D45DDB-6D32-40FC-AC79-E1F342C4F513}.Publish|Any CPU.Build.0 = Debug|Any CPU
{E0D45DDB-6D32-40FC-AC79-E1F342C4F513}.Release|Any CPU.ActiveCfg = Release|Any CPU
{E0D45DDB-6D32-40FC-AC79-E1F342C4F513}.Release|Any CPU.Build.0 = Release|Any CPU
{8972254B-B8F0-4119-953B-378E3BACA59A}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{8972254B-B8F0-4119-953B-378E3BACA59A}.Debug|Any CPU.Build.0 = Debug|Any CPU
{8972254B-B8F0-4119-953B-378E3BACA59A}.Publish|Any CPU.ActiveCfg = Debug|Any CPU
@@ -975,6 +991,8 @@ Global
{36DDC119-C030-407E-AC51-A877E9E0F660} = {1B4CBDE0-10C2-4E7D-9CD0-FE7586C96ED1}
{7AAD7388-307D-41FB-B80A-EF9E3A4E31F0} = {1B4CBDE0-10C2-4E7D-9CD0-FE7586C96ED1}
{8CF06B22-50F3-4F71-A002-622DB49DF0F5} = {1B4CBDE0-10C2-4E7D-9CD0-FE7586C96ED1}
{063044B2-A901-43C5-BFDF-5E4E71C7BC33} = {1B4CBDE0-10C2-4E7D-9CD0-FE7586C96ED1}
{E0D45DDB-6D32-40FC-AC79-E1F342C4F513} = {1B4CBDE0-10C2-4E7D-9CD0-FE7586C96ED1}
{8972254B-B8F0-4119-953B-378E3BACA59A} = {5D4C0700-BBB5-418F-A7B2-F392B9A18263}
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
@@ -0,0 +1,97 @@
// Copyright (c) Microsoft. All rights reserved.

using System.Text;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.AzureAIInference;

namespace ChatCompletion;

// The following example shows how to use Semantic Kernel with Azure AI Inference / Azure AI Studio
public class AzureAIInference_ChatCompletion(ITestOutputHelper output) : BaseTest(output)
{
    [Fact]
    public async Task ServicePromptAsync()
    {
        Console.WriteLine("======== Azure AI Inference - Chat Completion ========");

        var chatService = new AzureAIInferenceChatCompletionService(
            endpoint: new Uri(TestConfiguration.AzureAIInference.Endpoint),
            apiKey: TestConfiguration.AzureAIInference.ApiKey);

        Console.WriteLine("Chat content:");
        Console.WriteLine("------------------------");

        var chatHistory = new ChatHistory("You are a librarian, expert about books");

        // First user message
        chatHistory.AddUserMessage("Hi, I'm looking for book suggestions");
        OutputLastMessage(chatHistory);

        // First assistant message
        var reply = await chatService.GetChatMessageContentAsync(chatHistory);
        chatHistory.Add(reply);
        OutputLastMessage(chatHistory);

        // Second user message
        chatHistory.AddUserMessage("I love history and philosophy, I'd like to learn something new about Greece, any suggestion");
        OutputLastMessage(chatHistory);

        // Second assistant message
        reply = await chatService.GetChatMessageContentAsync(chatHistory);
        chatHistory.Add(reply);
        OutputLastMessage(chatHistory);

        /* Output:
        Chat content:
        ------------------------
        System: You are a librarian, expert about books
        ------------------------
        User: Hi, I'm looking for book suggestions
        ------------------------
        Assistant: Sure, I'd be happy to help! What kind of books are you interested in? Fiction or non-fiction? Any particular genre?
        ------------------------
        User: I love history and philosophy, I'd like to learn something new about Greece, any suggestion?
        ------------------------
        Assistant: Great! For history and philosophy books about Greece, here are a few suggestions:
        1. "The Greeks" by H.D.F. Kitto - This is a classic book that provides an overview of ancient Greek history and culture, including their philosophy, literature, and art.
        2. "The Republic" by Plato - This is one of the most famous works of philosophy in the Western world, and it explores the nature of justice and the ideal society.
        3. "The Peloponnesian War" by Thucydides - This is a detailed account of the war between Athens and Sparta in the 5th century BCE, and it provides insight into the political and military strategies of the time.
        4. "The Iliad" by Homer - This epic poem tells the story of the Trojan War and is considered one of the greatest works of literature in the Western canon.
        5. "The Histories" by Herodotus - This is a comprehensive account of the Persian Wars and provides a wealth of information about ancient Greek culture and society.
        I hope these suggestions are helpful!
        ------------------------
        */
    }

    [Fact]
    public async Task ChatPromptAsync()
    {
        StringBuilder chatPrompt = new("""
            <message role="system">You are a librarian, expert about books</message>
            <message role="user">Hi, I'm looking for book suggestions</message>
            """);

        var kernel = Kernel.CreateBuilder()
            .AddAzureAIInferenceChatCompletion(
                endpoint: new Uri(TestConfiguration.AzureAIInference.Endpoint),
                apiKey: TestConfiguration.AzureAIInference.ApiKey)
            .Build();

        var reply = await kernel.InvokePromptAsync(chatPrompt.ToString());

        chatPrompt.AppendLine($"<message role=\"assistant\"><![CDATA[{reply}]]></message>");
        chatPrompt.AppendLine("<message role=\"user\">I love history and philosophy, I'd like to learn something new about Greece, any suggestion</message>");

        reply = await kernel.InvokePromptAsync(chatPrompt.ToString());

        Console.WriteLine(reply);
    }
}
@@ -0,0 +1,176 @@
// Copyright (c) Microsoft. All rights reserved.

using System.Text;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.AzureAIInference;

namespace ChatCompletion;

/// <summary>
/// These examples demonstrate the ways different content types are streamed by Azure AI Inference via the chat completion service.
/// </summary>
public class AzureAIInference_ChatCompletionStreaming(ITestOutputHelper output) : BaseTest(output)
{
    /// <summary>
    /// This example demonstrates chat completion streaming using Azure AI Inference.
    /// </summary>
    [Fact]
    public Task StreamChatAsync()
    {
        Console.WriteLine("======== Azure AI Inference - Chat Completion Streaming ========");

        var chatService = new AzureAIInferenceChatCompletionService(
            endpoint: new Uri(TestConfiguration.AzureAIInference.Endpoint),
            apiKey: TestConfiguration.AzureAIInference.ApiKey);

        return this.StartStreamingChatAsync(chatService);
    }

    /// <summary>
    /// This example demonstrates chat completion streaming using Azure AI Inference via the kernel.
    /// </summary>
    [Fact]
    public async Task StreamChatPromptAsync()
    {
        Console.WriteLine("======== Azure AI Inference - Chat Prompt Completion Streaming ========");

        StringBuilder chatPrompt = new("""
            <message role="system">You are a librarian, expert about books</message>
            <message role="user">Hi, I'm looking for book suggestions</message>
            """);

        var kernel = Kernel.CreateBuilder()
            .AddAzureAIInferenceChatCompletion(
                endpoint: new Uri(TestConfiguration.AzureAIInference.Endpoint),
                apiKey: TestConfiguration.AzureAIInference.ApiKey)
            .Build();

        var reply = await StreamMessageOutputFromKernelAsync(kernel, chatPrompt.ToString());

        chatPrompt.AppendLine($"<message role=\"assistant\"><![CDATA[{reply}]]></message>");
        chatPrompt.AppendLine("<message role=\"user\">I love history and philosophy, I'd like to learn something new about Greece, any suggestion</message>");

        reply = await StreamMessageOutputFromKernelAsync(kernel, chatPrompt.ToString());

        Console.WriteLine(reply);
    }

    /// <summary>
    /// This example demonstrates how the chat completion service streams text content.
    /// It shows how to access the response update via StreamingChatMessageContent.Content property
    /// and alternatively via the StreamingChatMessageContent.Items property.
    /// </summary>
    [Fact]
    public async Task StreamTextFromChatAsync()
    {
        Console.WriteLine("======== Stream Text from Chat Content ========");

        // Create chat completion service
        var chatService = new AzureAIInferenceChatCompletionService(
            endpoint: new Uri(TestConfiguration.AzureAIInference.Endpoint),
            apiKey: TestConfiguration.AzureAIInference.ApiKey);

        // Create chat history with initial system and user messages
        ChatHistory chatHistory = new("You are a librarian, an expert on books.");
        chatHistory.AddUserMessage("Hi, I'm looking for book suggestions.");
        chatHistory.AddUserMessage("I love history and philosophy. I'd like to learn something new about Greece, any suggestion?");

        // Start streaming chat based on the chat history
        await foreach (StreamingChatMessageContent chatUpdate in chatService.GetStreamingChatMessageContentsAsync(chatHistory))
        {
            // Access the response update via StreamingChatMessageContent.Content property
            Console.Write(chatUpdate.Content);

            // Alternatively, the response update can be accessed via the StreamingChatMessageContent.Items property
            Console.Write(chatUpdate.Items.OfType<StreamingTextContent>().FirstOrDefault());
        }
    }

    /// <summary>
    /// Starts streaming chat with the chat completion service.
    /// </summary>
    /// <param name="chatCompletionService">The chat completion service instance.</param>
    private async Task StartStreamingChatAsync(IChatCompletionService chatCompletionService)
    {
        Console.WriteLine("Chat content:");
        Console.WriteLine("------------------------");

        var chatHistory = new ChatHistory("You are a librarian, expert about books");
        OutputLastMessage(chatHistory);

        // First user message
        chatHistory.AddUserMessage("Hi, I'm looking for book suggestions");
        OutputLastMessage(chatHistory);

        // First assistant message
        await StreamMessageOutputAsync(chatCompletionService, chatHistory, AuthorRole.Assistant);

        // Second user message
        chatHistory.AddUserMessage("I love history and philosophy, I'd like to learn something new about Greece, any suggestion?");
        OutputLastMessage(chatHistory);

        // Second assistant message
        await StreamMessageOutputAsync(chatCompletionService, chatHistory, AuthorRole.Assistant);
    }

    /// <summary>
    /// Streams the message output from the chat completion service.
    /// </summary>
    /// <param name="chatCompletionService">The chat completion service instance.</param>
    /// <param name="chatHistory">The chat history instance.</param>
    /// <param name="authorRole">The author role.</param>
    private async Task StreamMessageOutputAsync(IChatCompletionService chatCompletionService, ChatHistory chatHistory, AuthorRole authorRole)
    {
        bool roleWritten = false;
        string fullMessage = string.Empty;

        await foreach (var chatUpdate in chatCompletionService.GetStreamingChatMessageContentsAsync(chatHistory))
        {
            if (!roleWritten && chatUpdate.Role.HasValue)
            {
                Console.Write($"{chatUpdate.Role.Value}: {chatUpdate.Content}");
                roleWritten = true;
            }

            if (chatUpdate.Content is { Length: > 0 })
            {
                fullMessage += chatUpdate.Content;
                Console.Write(chatUpdate.Content);
            }
        }

        Console.WriteLine("\n------------------------");
        chatHistory.AddMessage(authorRole, fullMessage);
    }

    /// <summary>
    /// Outputs the chat history by streaming the message output from the kernel.
    /// </summary>
    /// <param name="kernel">The kernel instance.</param>
    /// <param name="prompt">The prompt message.</param>
    /// <returns>The full message output from the kernel.</returns>
    private async Task<string> StreamMessageOutputFromKernelAsync(Kernel kernel, string prompt)
    {
        bool roleWritten = false;
        string fullMessage = string.Empty;

        await foreach (var chatUpdate in kernel.InvokePromptStreamingAsync<StreamingChatMessageContent>(prompt))
        {
            if (!roleWritten && chatUpdate.Role.HasValue)
            {
                Console.Write($"{chatUpdate.Role.Value}: {chatUpdate.Content}");
                roleWritten = true;
            }

            if (chatUpdate.Content is { Length: > 0 })
            {
                fullMessage += chatUpdate.Content;
                Console.Write(chatUpdate.Content);
            }
        }

        Console.WriteLine("\n------------------------");
        return fullMessage;
    }
}
@@ -96,7 +96,7 @@ private async Task SimpleChatAsync(Kernel kernel)
chatHistory.AddUserMessage("Hi, I'm looking for new power tools, any suggestion?");
await MessageOutputAsync(chatHistory);

-// First bot assistant message
+// First assistant message
var reply = await chat.GetChatMessageContentAsync(chatHistory);
chatHistory.Add(reply);
await MessageOutputAsync(chatHistory);
@@ -105,7 +105,7 @@ private async Task SimpleChatAsync(Kernel kernel)
chatHistory.AddUserMessage("I'm looking for a drill, a screwdriver and a hammer.");
await MessageOutputAsync(chatHistory);

-// Second bot assistant message
+// Second assistant message
reply = await chat.GetChatMessageContentAsync(chatHistory);
chatHistory.Add(reply);
await MessageOutputAsync(chatHistory);