Skip to content

Latest commit

 

History

History
52 lines (36 loc) · 2.97 KB

gpt4v.md

File metadata and controls

52 lines (36 loc) · 2.97 KB

Enabling GPT-4 Turbo with Vision

This repository now includes an example of integrating GPT-4 Turbo with Vision with Azure AI Search. This feature enables indexing and searching images and graphs, such as financial documents, in addition to text-based content.

Feature Overview

  • Document Handling: Source documents are split into pages and saved as PNG files in blob storage. Each file's name and page number are embedded for reference.
  • Data Extraction: Text data is extracted using OCR.
  • Data Indexing: Text and image embeddings, generated using Azure AI Vision (Azure AI Vision Embeddings), are indexed in Azure AI Search along with the raw text.
  • Search and Response: Searches can be conducted using vectors or hybrid methods. Responses are generated by GPT-4 Turbo with Vision based on the retrieved content.

Getting Started

Prerequisites

Setup and Usage

  1. Update repository: Pull the latest changes.

  2. Enable GPT-4 Turbo with Vision: Set the environment variable with azd env set USE_GPT4V true. This flag is used to deploy necessary components for vision fuctionality and to toggle UI components.

  3. Clean old deployments (optional): Run azd down --purge for a fresh setup.

  4. Start the application: Execute azd up to build, provision, deploy, and initiate document preparation.

  5. Web Application Usage: GPT4V configuration screenshot

    • Access the developer options in the web app and select "Use GPT-4 Turbo with Vision".
    • Sample questions will be updated for testing.
    • Interact with the questions to view responses.
    • The 'Thought Process' tab shows the retrieved data and its processing by GPT-4 Turbo with Vision.

Feel free to explore and contribute to enhancing this feature. For questions or feedback, use the repository's issue tracker.