[Feature ASK] Add fields to the metadata file "templates.json" to enable better retrieval from AI systems. #4242

Open
abhishekbhombore opened this issue Aug 23, 2024 · 12 comments
@abhishekbhombore

This is not a bug.

This is a request for the AZD team to consider adding some new fields that will help AI systems discover these templates better. The static workloads team has already done work to aid discovery of the AZD templates in the Copilots on Azure Portal and GitHub. The endpoint indexes information from the templates.json file.

The information in these fields is used to match one of the maintained templates to a user intent; the matched template is then recommended as a good starting point (or a close match to the intent).

The static workloads team currently maintains a copy of some of these templates here: static workload templates.

We have added some additional fields to support better matching and recommendations via the Copilot systems in the Azure Portal and GitHub.

Adding these additional fields (if it is reasonable after consideration) to this central location has the following benefits:

  • The AI systems can source this information directly from here.
  • Authors gain the ability to define and fine-tune how these templates are recommended.

Specifically, consider adding these optional fields:

  • Products - string[] - List of Azure services that are used in the template. This will help match better when users ask for a suggestion regarding a specific Azure service. E.g.: Azure OpenAI, Azure Key Vault.
  • Sample queries - string[] - A list of sample user queries for which this template would be a good recommendation.
  • Tech - string[] - The tech stack used. E.g.: Python, Node.js. Very useful in matching intents for specific queries such as "I want a web app built using Node.js".
  • Negative match terms - string[] - A list of terms that should NOT be considered a match for the template. E.g.: LLMs have been observed to match RAG (Retrieval Augmented Generation) to React and GraphQL, leading to some poor recommendations.

Please reach out to - [email protected] if you would like to discuss this further outside of this Issue. Happy to elaborate more on the ask.

@abhishekbhombore
Author

@kristenwomack - FYI

@rajeshkamal5050 added this to the Backlog milestone Aug 23, 2024
@rajeshkamal5050
Contributor

Thanks for reaching out @unrealab

Let us see how to get those additional requirements into our templates.json and have a single source of truth. cc @gkulin @hemarina @vhvb1989

@kristenwomack
Collaborator

Thank you for this suggestion @unrealab

@spboyer
Member

spboyer commented Aug 23, 2024

I would also add a flag to this metadata, such as approved, that controls whether a template is exposed through the API, as not all of them should be readily available, for various reasons.

@abhishekbhombore
Author

I would also add a flag to this metadata, such as approved, that controls whether a template is exposed through the API, as not all of them should be readily available, for various reasons.

This is a great addition.

@vhvb1989
Member

@unrealab would it be ok to group all those fields in one node key, like:

{
    "title": "ChatGPT + Enterprise data with Azure OpenAI and AI Search",
    "description": "A sample app for the Retrieval-Augmented",
    "author": "Microsoft",
    "source": "https://github.com/Azure-Samples/azure-search-openai-demo",
    "tags": [
        "OpenAI",
        "Azure",
        "AI Search",
        "ChatGPT",
        "Enterprise"
    ],
    "ai-integration": {
        "id": "88bce5a4-8e70-4c83-a87c-4d68e34eaf7e",
        "deploymentOptions": [
            "AzD"
        ],
        "deploymentConfig": {},
        "products": [
            "Azure OpenAI and AI Search"
        ],
        "sampleQueries": [
            "How to use Azure AI Search to power ChatGPT-style and Q&A experiences"
        ],
        "sourceType": "Azd"
    }
}

See the key ai-integration. We can use and reserve the name you chose. We don't need an approved field in this model, as you will just ignore entries without the ai-integration field.

This will also help us easily add more fields for you and quickly recognize which fields are for the gallery and which are for integrating with your system.
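
The "ignore entries without the ai-integration field" rule can be sketched in a few lines of Python. This is only an illustration of the idea, assuming templates.json is a JSON array of entries shaped like the snippet above; the second entry and the function name `ai_ready` are made up for the example.

```python
import json

# Two abbreviated entries; the second one is a hypothetical
# gallery-only template with no "ai-integration" node.
TEMPLATES_JSON = """
[
    {
        "title": "ChatGPT + Enterprise data with Azure OpenAI and AI Search",
        "source": "https://github.com/Azure-Samples/azure-search-openai-demo",
        "ai-integration": {
            "products": ["Azure OpenAI and AI Search"],
            "sampleQueries": [
                "How to use Azure AI Search to power ChatGPT-style and Q&A experiences"
            ]
        }
    },
    {
        "title": "Example gallery-only template",
        "source": "https://example.com/hypothetical-template"
    }
]
"""


def ai_ready(templates):
    """Keep only entries that carry an "ai-integration" object.

    The presence of the node replaces an explicit "approved" flag.
    """
    return [t for t in templates if isinstance(t.get("ai-integration"), dict)]


ready = ai_ready(json.loads(TEMPLATES_JSON))
print([t["title"] for t in ready])
```

Only the first entry survives the filter, so a consumer indexing the file would never see templates whose authors have not opted in.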

@abhishekbhombore
Author

@vhvb1989 - Yes, that would work; the ai-integration reserved name works fine. I can use its presence to determine whether a template is ready for consumption by the Copilot system.

Quick note: We don't need all the fields from your snippet. Some fields are specific to the Copilot client (Copilot in Azure Portal); we need them on our end, but AZD doesn't need to be aware of them, e.g. deploymentOptions, deploymentConfig, sourceType. Do you have a list of fields you will add (in addition to the ones I mentioned in the issue)? I can confirm which fields are client-specific and which make sense to include here.

@vhvb1989
Member

@abhishekbhombore, do you have some public docs about the supported fields and why/when folks would want to add them for the ai-integration?

Something like what you described at the top:

Products - string[] - List of Azure services that are used in the template. This will help match better when users ask for a suggestion regarding a specific Azure service. Eg: Azure Open AI , Azure Key Vault.

Sample queries - string[] - A list of sample User queries that are good candidate examples for what this template would be a recommendation.

Tech - string[] - The tech stack used. Eg: Python/Node.js - Very useful in matching Intents for specific queries- "I want a web app built using Node.js"

Negative match terms - string[] - A list of terms that should NOT be considered a match for the template. Eg - LLMs have been observed to match RAG (Retrieval Augmented Generation) to React and GraphQL leading to some poor recommendations.

We don't own all the samples/templates in the gallery. We can update the ones we are familiar with as a way for you to test (it will be just a few).
Then, I can create an issue in each of the other repositories that lack the ai-integration key, inviting them to add the key and its configuration. But for that, we would need some public docs where folks can learn which fields they can add, which are mandatory, and how they can test/use/advertise the integration after merging the changes to the gallery.

For new templates, we can include such documentation as part of the - How to add your template - docs here: https://azure.github.io/awesome-azd/docs/contribute/

@spboyer
Member

spboyer commented Aug 23, 2024

We would want to create requirements for these fields, much like we have in azure.yaml for the template name etc., and then have the validation pipeline also check them as "optional".
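
A check of this kind could look roughly like the Python sketch below. It does not reflect the actual validation pipeline, which is not shown in this thread. It assumes the four proposed fields are optional lists of strings under the ai-integration node, and the key names `tech` and `negativeMatchTerms` are guesses at the final spelling.

```python
# Assumed field names; only "products" and "sampleQueries" appear
# in the snippet earlier in this thread.
OPTIONAL_STRING_LIST_FIELDS = (
    "products",
    "sampleQueries",
    "tech",
    "negativeMatchTerms",
)


def validate_ai_integration(entry):
    """Return a list of problems; an empty list means the entry passes.

    Every field is optional: a missing node or a missing field is fine,
    but a field that is present must be a list of non-empty strings.
    """
    node = entry.get("ai-integration")
    if node is None:
        return []  # the whole node is optional
    if not isinstance(node, dict):
        return ["ai-integration must be an object"]

    errors = []
    for field in OPTIONAL_STRING_LIST_FIELDS:
        if field not in node:
            continue
        value = node[field]
        is_string_list = isinstance(value, list) and all(
            isinstance(item, str) and item.strip() for item in value
        )
        if not is_string_list:
            errors.append(
                f"ai-integration.{field} must be a list of non-empty strings"
            )
    return errors
```

Returning a list of messages rather than raising on the first problem lets a pipeline report every issue in a template entry in one pass.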

@abhishekbhombore
Author

@vhvb1989 - I don't have public docs, but I can share a doc (I'll do it over Teams) explaining what each of those fields does, how the author should use it, and how it helps.

Updating a few to begin with will work as a test.

This would be great as well.

Then, I can create an issue in each of the other repositories that lack the ai-integration key, inviting them to add the key and its configuration. But for that, we would need some public docs where folks can learn which fields they can add, which are mandatory, and how they can test/use/advertise the integration after merging the changes to the gallery.

Could we add the information about the fields in ai-integration to the AZD public docs for authors - https://azure.github.io/awesome-azd/docs/contribute/ ? Given we're introducing the fields directly into this repo, it might be the best place for the docs about them to live as well.

@spboyer - I'm not very familiar with what azure.yaml is, but I can help by providing all the information about the fields and their optionality.

We would want to create requirements for these fields, much like we have in azure.yaml for the template name etc., and then have the validation pipeline also check them as "optional".

@abhishekbhombore
Author

@vhvb1989, @kristenwomack, @rajeshkamal5050, @spboyer - I've shared the doc with the fields and descriptions with you. Let me know if there's anything else needed from me.

@abhishekbhombore
Author

@vhvb1989 @kristenwomack @rajeshkamal5050 @spboyer - Folks, I want to check whether this work was committed to a sprint. I'd like to move away from keeping a copy of the AZD templates when possible.
