
Sharing the learnings we have been gathering along the way to enable Azure OpenAI at enterprise scale in a secure manner. GPT-RAG core is a Retrieval-Augmented Generation pattern running in Azure, using Azure Cognitive Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.


Generative Pretrained Transformer (GPT) Retrieval-Augmented Generation (RAG)

Overview

The RAG pattern enables businesses to apply the reasoning capabilities of LLMs to their own data, using existing models to process and generate responses grounded in newly retrieved content. RAG supports periodic data updates without fine-tuning, streamlining the integration of LLMs into business applications.

The Enterprise RAG Solution Accelerator (GPT-RAG) offers a robust architecture tailored for enterprise-grade deployment of the RAG pattern. It delivers grounded responses and is built on Zero Trust security and Responsible AI principles, providing availability, scalability, and auditability. It is ideal for organizations transitioning from exploration and PoC stages to MVPs and full-scale production.

✨ See our User & Admin Guide for complete setup and usage details.

Application Components

GPT-RAG follows a modular approach, consisting of three components, each with a specific function.

  • Data Ingestion - Optimizes data chunking and indexing for the RAG retrieval step.

  • Orchestrator - Coordinates the flow to retrieve information and generate a user response. It offers two options: Function, using Semantic Kernel functions (default), and Agentic, using AutoGen agents. See deployment instructions to switch to Agentic.

  • App Front-End - Uses the Backend for Front-End pattern to provide a scalable and efficient web interface.

Concepts

Refer to the linked documentation if you want to learn more about the RAG pattern and the GPT-RAG architecture.

Setup Guide

  1. Basic Architecture Deployment: for quick demos with no network isolation ⚙️

    Learn how to quickly set up the basic architecture for scenarios without network isolation. Click the link to proceed.

  2. Standard Zero-Trust Architecture Deployment: fastest Zero-Trust deployment option

    Deploy the solution accelerator using the standard zero-trust architecture with pre-configured solution settings. No customization needed. Click the link to proceed.

  3. Custom Zero-Trust Architecture Setup: most used

    Explore options for customizing the deployment of the solution accelerator with a zero-trust architecture, adjusting solution settings to your needs. Click the link to proceed.

  4. Step-by-Step Manual Setup: Zero-Trust Architecture: hands-on approach 🛠️

    For those who prefer complete control, follow this detailed guide to manually set up the solution accelerator with a zero-trust architecture. Click the link to proceed.

Getting Started

This guide will walk you through the deployment process of Enterprise RAG. There are two deployment options: the Basic Architecture and the Zero Trust Architecture. Before beginning the deployment, please ensure you have prepared all the necessary tools and services as outlined in the Prerequisites section.

Prerequisites

Basic Architecture Deployment

For quick demonstrations or proof-of-concept projects without network isolation requirements, you can deploy the accelerator using its basic architecture.

Basic Architecture

The deployment procedure is straightforward: install the prerequisites mentioned above, then follow these four steps using the Azure Developer CLI (azd) in a terminal:

  1. Clone the Repository

    git clone https://github.com/0Upjh80d/gpt-rag

Note

If using the Agentic AutoGen-based orchestrator, uncomment the git clone command for the gpt-rag-agentic repository and comment out the git clone command for the gpt-rag-orchestrator repository in both the fetchComponents.sh and fetchComponents.ps1 scripts. Below is a screenshot of the fetchComponents.sh script as an example:

Fetch Component Script
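
For orientation, the relevant portion of fetchComponents.sh would end up looking roughly like the sketch below after the edit. The repository URLs and exact lines are assumptions based on the component names mentioned above, so verify them against the actual script in your clone:

    # Default Function orchestrator: comment this clone out
    # git clone https://github.com/Azure/gpt-rag-orchestrator.git
    # Agentic (AutoGen-based) orchestrator: uncomment this clone
    git clone https://github.com/Azure/gpt-rag-agentic.git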

  2. Log in to Azure

    2a. Azure Developer CLI:

    azd auth login

    2b. Azure CLI:

    az login
  3. Provision the Infrastructure and Deploy the Application

    azd up
  4. Add Source Documents

    Upload your documents to the documents folder located in the Azure Storage Account. The name of this account should start with strag. This is the default storage account, as shown in the sample image below.

    Storage Sample
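
If you prefer the command line to the portal, the Azure CLI can upload files directly. The sketch below is a hedged example: the storage account name strag0abcd1234 and the file name are placeholders, and documents is assumed to be the blob container backing the folder mentioned above.

    az storage blob upload \
      --account-name strag0abcd1234 \
      --container-name documents \
      --name product-manual.pdf \
      --file ./product-manual.pdf \
      --auth-mode login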

Note

If you want to upload documents for ingestion into the GPT-RAG storage account, you must have the Storage Blob Data Contributor role assigned in Microsoft Entra ID (formerly Azure Active Directory).
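
A hedged example of assigning that role with the Azure CLI is shown below; the user, subscription ID, resource group, and storage account name are all placeholders.

    az role assignment create \
      --assignee user@contoso.com \
      --role "Storage Blob Data Contributor" \
      --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/strag0abcd1234"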

Done! The basic deployment is complete.

Tip

(Recommended) Add app authentication. Watch this quick tutorial for step-by-step guidance.

Zero Trust Architecture Deployment

For more secure and isolated deployments, you can opt for the Zero Trust architecture. This architecture is ideal for production environments where network isolation and stringent security measures are highly valued.

Zero Trust Architecture

Before deploying the Zero Trust architecture, make sure to review the prerequisites. Note that you will only need Node.js and Python for the second part of the process, which is carried out on the Virtual Machine created during the deployment of this architecture.

The deployment procedure is similar to that of the Basic Architecture, but with some additional steps. For a detailed guide on deploying this option, refer to the instructions below:

  1. Clone the Repository

    git clone https://github.com/0Upjh80d/gpt-rag

Note

If using the Agentic AutoGen-based orchestrator, uncomment the git clone command for the gpt-rag-agentic repository and comment out the git clone command for the gpt-rag-orchestrator repository in both the fetchComponents.sh and fetchComponents.ps1 scripts. Below is a screenshot of the fetchComponents.sh script as an example:

Fetch Component Script

  2. Enable Network Isolation

    azd env set AZURE_NETWORK_ISOLATION true
  3. Log in to Azure

    3a. Azure Developer CLI:

    azd auth login

    3b. Azure CLI:

    az login
  4. Provision the Infrastructure

    azd provision

Tip

The regions tested most often were eastus, eastus2, and westus3.
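
If you want to pin the region before running azd provision rather than choosing it interactively, you can set it on the azd environment first. AZURE_LOCATION is the usual azd convention, but confirm the variable name against your environment's configuration:

    azd env set AZURE_LOCATION eastus2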

  5. Next, use the Virtual Machine with the Bastion connection (created in step 4) to continue the deployment. Log into the Virtual Machine with the user gptrag and authenticate with the password stored in the Azure Key Vault, as shown in the figure below:

    Azure Key Vault Login

  6. Once in Windows, install PowerShell; the other prerequisites are already installed on the Virtual Machine.

  7. Open the command prompt and run the following command to update azd to the latest version:

    choco upgrade azd

    After updating azd, simply close and reopen the terminal.

  8. Create a new directory, for example deploy, then enter it.

    mkdir deploy
    cd deploy

    To finish, run the following commands in the command prompt to complete the deployment:

    git clone https://github.com/0Upjh80d/gpt-rag
    azd auth login
    az login
    azd env refresh
    azd deploy

Important

When running azd env refresh, use the same environment name, subscription, and region that were used in the initial provisioning of the infrastructure.
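
If you manage several environments, you can pass the environment name explicitly so the refresh and deploy target the one you provisioned earlier. In the hedged sketch below, my-gpt-rag is a placeholder name and -e is azd's standard environment selector; check azd env refresh --help on your azd version if in doubt.

    azd env refresh -e my-gpt-rag
    azd deploy -e my-gpt-rag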

Done! The Zero Trust deployment is complete.

Note

If you want to upload documents for ingestion into the GPT-RAG storage account, you must have the Storage Blob Data Contributor role assigned in Microsoft Entra ID (formerly Azure Active Directory).

Tip

(Recommended) Add app authentication. Watch this quick tutorial for step-by-step guidance.

How To Guide

This section provides quick guides for customizing, managing, and troubleshooting your deployment.

Customize Your Deployment

The standard deployment process sets up Azure resources and deploys the accelerator components with a standard configuration. To tailor the deployment to your specific needs, follow the steps in the Custom Deployment section for further customization options.
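
As a rough illustration of the usual flow, customizations are applied by setting azd environment variables before provisioning. The variable name in the sketch below is an assumption; use the variable names documented in the Custom Deployment section.

    # AZURE_RESOURCE_GROUP_NAME is an assumed variable name; check the Custom Deployment docs
    azd env set AZURE_RESOURCE_GROUP_NAME rg-gpt-rag-dev
    azd provision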

Multi-Environment Deployment

Once you have deployed the GPT-RAG solution as a proof of concept and are ready to formalize it with a proper CI/CD process for production, refer to the multi-environment deployment guides for either Azure DevOps or GitHub.
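
If you follow the azd-based flow described in those guides, azd can bootstrap the pipeline definition and its Azure credentials for you. The sketch below assumes the standard azd pipeline config command, with azdo as the alternative provider value for Azure DevOps:

    azd pipeline config --provider github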

Troubleshooting Deployment Issues

If you encounter any errors during the deployment process, consult the Troubleshooting page for guidance on resolving common issues.

Performance Evaluation

To assess the performance of your deployment, refer to the Performance Testing guide for testing methodologies and best practices.

Querying Conversation History

Learn how to query and analyze conversation data by following the steps outlined in the How to Query and Analyze Conversations document.

Pricing Estimation

Understand the cost implications of your deployment by reviewing the Pricing Model for detailed pricing estimation.

Governance Management

Ensure proper governance of your deployment by following the guidelines provided in the Governance Model.

Contributing

We appreciate your interest in contributing to this project! Please refer to the CONTRIBUTING.md page for detailed guidelines on how to contribute and the process for submitting pull requests.

Thank you for your support and contributions!

