- We'll create a question-answering (QA) system that understands both text and images
- We'll build this system using Vertex AI
- Focus on Fundamentals: We will start with the essential design pattern of Retrieval Augmented Generation (RAG), a way to find relevant information and use it to answer questions.
- Work with Text and Images: We will extend RAG to handle both the text and the images found in PDF documents.
- Use Vertex AI: We will use the Vertex AI Embeddings API and the Vertex AI Gemini API.
By the end, we will have a solid foundation in building multimodal QA systems.
- Gemini is a family of generative AI models designed for multimodal use cases.
- The Vertex AI Gemini API gives us access to the following models (a minimal usage sketch follows this list):
- Gemini 1.0 Pro Model
- Gemini 1.0 Pro Vision Model
- Gemini 1.5 Pro Model
- Gemini 1.5 Flash Model
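
For instance, a Gemini model can be loaded through the Vertex AI SDK for Python. Here is a minimal sketch, assuming an existing Google Cloud project; the project ID and region below are placeholders:

```python
import vertexai
from vertexai.generative_models import GenerativeModel

# Placeholder project and region; replace with your own values.
vertexai.init(project="your-project-id", location="us-central1")

# Any of the Gemini models listed above can be named here.
model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content("Summarize Retrieval Augmented Generation in one sentence.")
print(response.text)
```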
Multimodal RAG (mRAG) offers several advantages over text-based RAG:
- Enhanced knowledge access: mRAG can access and process both textual and visual information, providing a more comprehensive knowledge base for the LLM.
- Improved reasoning capabilities: By incorporating visual cues, mRAG can make better-informed inferences across different data modalities (see the sketch after this list).
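
To illustrate the second point, the sketch below (illustrative, not the notebook's code) passes an image and a text question to Gemini in a single request. The Cloud Storage URI is hypothetical, and it assumes `vertexai.init()` has been called as in the earlier sketch:

```python
from vertexai.generative_models import GenerativeModel, Part

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content([
    # Hypothetical chart in Cloud Storage; any accessible image URI works.
    Part.from_uri("gs://your-bucket/revenue_chart.png", mime_type="image/png"),
    "What trend does this chart show, and what might explain it?",
])
print(response.text)
```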
To build a document search engine, we will implement RAG using:
- the Vertex AI Gemini API
- the Vertex AI Embeddings API, for both text embeddings and multimodal embeddings (a minimal sketch follows this list)
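
As a rough sketch of how these two kinds of embeddings can be generated with the Vertex AI SDK (the model versions shown are commonly published ones, so confirm which your project can use; the image filename is a placeholder):

```python
from vertexai.language_models import TextEmbeddingModel
from vertexai.vision_models import Image, MultiModalEmbeddingModel

# Text embeddings for document chunks.
text_model = TextEmbeddingModel.from_pretrained("text-embedding-004")
text_embedding = text_model.get_embeddings(["What is multimodal RAG?"])[0].values

# Multimodal embeddings place images and text in a shared vector space,
# so an image can be retrieved with a text query and vice versa.
mm_model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
mm_embeddings = mm_model.get_embeddings(
    image=Image.load_from_file("page_1_figure.png"),  # placeholder file
    contextual_text="Architecture diagram from the PDF",
)
image_embedding = mm_embeddings.image_embedding
```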
This notebook provides a step-by-step guide to building a document search engine with mRAG:
- Extract and store metadata of documents containing both text and images, and generate embeddings for the documents
- Search the metadata with text queries to find similar text or images (a retrieval sketch follows this list)
- Search the metadata with image queries to find similar images
- Using a text query as input, search for contextual answers using both text and images
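
As a minimal sketch of the search steps, assuming embeddings have already been generated and stored in memory: retrieval can be as simple as ranking stored vectors by cosine similarity against the query embedding. The helper names below are illustrative, not from the notebook:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(query_embedding, doc_embeddings, doc_metadata, k=3):
    """Return the k stored chunks (text or image metadata) closest to the query."""
    scores = [cosine_similarity(query_embedding, e) for e in doc_embeddings]
    ranked = sorted(zip(scores, doc_metadata), key=lambda pair: pair[0], reverse=True)
    return ranked[:k]
```

The top-ranked text chunks and images can then be passed to Gemini as context, as in the multimodal request sketched earlier, to produce a grounded answer.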
- Vertex AI pricing
- Pricing Calculator (generates a cost estimate)