This template is licensed under Apache 2.0 and contains the following components:
A Retrieval-Augmented Generation (RAG) application is an AI-powered system that combines retrieval-based and generation-based models. Its main goal is to improve the quality and accuracy of generated content by incorporating relevant information retrieved from a large knowledge base. This approach leverages the strengths of both retrieval and generation, producing more coherent and contextually appropriate outputs.
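As a toy illustration of the retrieve-then-generate pattern (this is not the app's actual code; the "embedding" here is a fake bag-of-words vector purely for demonstration):

```python
# Toy sketch of the RAG pattern: 1) embed the query, 2) retrieve the
# closest documents, 3) feed them to the generator as context.

def embed(text, vocab):
    # Fake embedding: word counts over a fixed vocabulary
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def retrieve(query_vec, doc_vecs, k=1):
    # Rank documents by dot-product similarity to the query
    scored = sorted(
        enumerate(doc_vecs),
        key=lambda iv: -sum(q * d for q, d in zip(query_vec, iv[1])),
    )
    return [i for i, _ in scored[:k]]

docs = ["the cat sat on the mat", "stocks rose sharply today"]
vocab = sorted({w for d in docs for w in d.lower().split()})
doc_vecs = [embed(d, vocab) for d in docs]

query = "where did the cat sit"
top = retrieve(embed(query, vocab), doc_vecs, k=1)
prompt = f"Answer using this context: {docs[top[0]]}\n\nQuestion: {query}"
```

In the real application, `embed` is replaced by the Azure OpenAI embedding model, `retrieve` by a Qdrant similarity search, and `prompt` is sent to the GPT chat model.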
This project provides a template for a Streamlit application hosted on Domino. We use an Azure OpenAI embedding model to create the embeddings of our documents and a GPT model (GPT-4, GPT-4o, etc.) as the base chat model.
The important files in this template are:
- .env: Our environment file, a simple text file used to set environment variables.
- utils.py: This file contains utility functions that are used throughout the codebase.
- base_agent.py: This file defines a BaseAgent class that initializes a chat model and embeddings using Azure's OpenAI services, configured with environment variables for deployment and API access.
- UseCases.py: The page of the Streamlit application for use case management.
- pages/Chatbot.py: The chatbot interface of the application, where users select their use case and chat with BioRAG.
- app.sh: The script needed to run the app.
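As a hedged sketch of how base_agent.py might wire the services together (the langchain_openai class and parameter names below are assumptions based on the requirements list, not taken from the actual source file):

```python
import os

# Hypothetical sketch of base_agent.py's BaseAgent. The environment
# variable names match the table in Step 4; the langchain_openai
# constructor parameters are assumptions, not confirmed from the template.

class BaseAgent:
    """Initializes an Azure OpenAI chat model and embeddings from env vars."""

    def __init__(self):
        self.api_version = os.environ["OPENAI_API_VERSION"]
        self.chat_endpoint = os.environ["AZURE_CHAT_ENDPOINT"]
        self.chat_deployment = os.environ["AZURE_CHAT_DEPLOYMENT_NAME"]
        self.chat_api_key = os.environ["AZURE_CHAT_API_KEY"]
        self.emb_endpoint = os.environ["AZURE_EMBEDDINGS_ENDPOINT"]
        self.emb_deployment = os.environ["AZURE_EMBEDDINGS_DEPLOYMENT_NAME"]
        self.emb_api_key = os.environ["AZURE_EMBEDDINGS_API_KEY"]

    def build_chat_model(self):
        # Deferred import so the config can be inspected without langchain installed
        from langchain_openai import AzureChatOpenAI
        return AzureChatOpenAI(
            azure_endpoint=self.chat_endpoint,
            azure_deployment=self.chat_deployment,
            api_key=self.chat_api_key,
            api_version=self.api_version,
        )

    def build_embeddings(self):
        from langchain_openai import AzureOpenAIEmbeddings
        return AzureOpenAIEmbeddings(
            azure_endpoint=self.emb_endpoint,
            azure_deployment=self.emb_deployment,
            api_key=self.emb_api_key,
            api_version=self.api_version,
        )
```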
Step 1: Ensure you have the following external services set up, with the correct credentials provided in the .env file:
- Azure Blob Storage (credentials tutorial)
- Qdrant Vector Database (tutorial)
- Azure OpenAI chat model resource (deployment types; see below for Azure OpenAI keys and endpoints)
- Azure OpenAI embeddings resource (deployment types; see below for Azure OpenAI keys and endpoints)
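A .env file for these services might look like the following (every value is a placeholder for your own credentials; the variable names come from the table in Step 4):

```shell
# .env — placeholder values only; replace with your own credentials
QDRANT_URL="YOUR QDRANT URL"
QDRANT_KEY="YOUR QDRANT KEY"
AZURE_BLOB_CONTAINER_NAME="YOUR AZURE BLOB CONTAINER NAME"
AZURE_BLOB_CONNECTION_STRING="YOUR AZURE BLOB CONNECTION STRING"
AZURE_CHAT_ENDPOINT="YOUR AZURE CHAT ENDPOINT"
AZURE_CHAT_DEPLOYMENT_NAME="YOUR AZURE CHAT DEPLOYMENT NAME"
AZURE_CHAT_API_KEY="YOUR AZURE CHAT API KEY"
AZURE_CHAT_MODEL="YOUR AZURE CHAT MODEL"
AZURE_EMBEDDINGS_ENDPOINT="YOUR AZURE EMBEDDINGS ENDPOINT"
AZURE_EMBEDDINGS_DEPLOYMENT_NAME="YOUR AZURE EMBEDDINGS DEPLOYMENT NAME"
AZURE_EMBEDDINGS_API_KEY="YOUR AZURE EMBEDDINGS API KEY"
AZURE_EMBEDDINGS_MODEL_NAME="text-embedding-ada-002"
OPENAI_API_VERSION="2023-05-15"
OPEN_AI_TYPE="azure"
```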
Should you require assistance setting these services up, please reach out to Gabriele Oliva from BIP, [email protected]
The necessary packages and versions can be found in the requirements.txt file. Ensure these packages are installed in a custom Domino environment. The Docker instructions are provided below:
Step 2: Create a custom environment on Domino, using the Domino Standard Environment as its base.
Step 3: Under the Dockerfile instructions, add the lines below to install the Python packages into the environment:
```dockerfile
RUN pip install langchain==0.2.0 --user
RUN pip install langchain_community==0.2.0 --user
RUN pip install langchain_openai==0.1.7 --user
RUN pip install langchain_text_splitters==0.2.0 --user
RUN pip install pandas==2.2.2 --user
RUN pip install pdfplumber==0.11.0 --user
RUN pip install python-dotenv==1.0.1 --user
RUN pip install qdrant_client==1.9.1 --user
RUN pip install streamlit==1.33.0 --user
RUN pip install azure-storage-blob==12.20.0 --user
RUN pip install python-docx==1.1.1 --user
```
Step 4: Use a small hardware tier.
In your Domino project, add the following environment variables under the project's settings:
Environment Variable Name | Your value |
---|---|
QDRANT_URL | "YOUR QDRANT URL" |
QDRANT_KEY | "YOUR QDRANT KEY" |
AZURE_EMBEDDINGS_DEPLOYMENT_NAME | "YOUR AZURE EMBEDDINGS DEPLOYMENT NAME" |
AZURE_EMBEDDINGS_API_KEY | "YOUR AZURE EMBEDDINGS API KEY" |
AZURE_EMBEDDINGS_MODEL_NAME | "text-embedding-ada-002" |
AZURE_EMBEDDINGS_ENDPOINT | "YOUR AZURE EMBEDDINGS ENDPOINT" |
OPENAI_API_VERSION | "2023-05-15" |
OPEN_AI_TYPE | "azure" |
AZURE_CHAT_ENDPOINT | "YOUR AZURE CHAT ENDPOINT" |
AZURE_CHAT_DEPLOYMENT_NAME | "YOUR AZURE CHAT DEPLOYMENT NAME" |
AZURE_CHAT_API_KEY | "YOUR AZURE CHAT API KEY" |
AZURE_CHAT_MODEL | "YOUR AZURE CHAT MODEL" |
AZURE_BLOB_CONTAINER_NAME | "YOUR AZURE BLOB CONTAINER NAME" |
AZURE_BLOB_CONNECTION_STRING | "YOUR AZURE BLOB CONNECTION STRING" |
Once all the variables are saved, you are ready to run the app!
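As a convenience, a small hypothetical check (not part of the template) can confirm that none of the variables from the table were missed before launching the app:

```python
import os

# Every variable from the table above; the check itself is an
# illustrative addition, not code shipped with the template.
REQUIRED_VARS = [
    "QDRANT_URL", "QDRANT_KEY",
    "AZURE_EMBEDDINGS_DEPLOYMENT_NAME", "AZURE_EMBEDDINGS_API_KEY",
    "AZURE_EMBEDDINGS_MODEL_NAME", "AZURE_EMBEDDINGS_ENDPOINT",
    "OPENAI_API_VERSION", "OPEN_AI_TYPE",
    "AZURE_CHAT_ENDPOINT", "AZURE_CHAT_DEPLOYMENT_NAME",
    "AZURE_CHAT_API_KEY", "AZURE_CHAT_MODEL",
    "AZURE_BLOB_CONTAINER_NAME", "AZURE_BLOB_CONNECTION_STRING",
]

def missing_vars(env=os.environ):
    """Return the required variables that are not set (or set to empty)."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example usage:
# for name in missing_vars():
#     print("Missing environment variable:", name)
```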