Lens - AI-powered Nepali Document Digitization and Interaction Platform

Lens is a platform that uses AI and OCR to digitize Nepali documents so they can be edited, printed, and stored, including scanned receipts and invoices. Its "Chat with PDF" feature lets users query a document's content, and the Text-to-Speech tool reads the extracted text aloud and supports translation.

Features

PDF Upload & Management: Upload Nepali PDF documents for processing.
Text Extraction: Extract text from Nepali PDFs for querying and manipulation.
AI-Powered Chat: Ask questions about the document content, powered by Google's Generative AI (Gemini).
Text-to-Speech: Listen to the text extracted from PDFs using Text-to-Speech technology.
Editable Content: Modify and interact with the content of scanned documents.
OCR: Digitize scanned Nepali receipts and invoices for easy access and management.
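
As a rough illustration of the AI-Powered Chat feature, the sketch below shows one way the extracted text and a user's question could be combined into a single prompt before it is sent to Gemini. The function name `build_prompt`, the prompt wording, and the `max_chars` budget are assumptions made for this example; they are not taken from the Lens code.

```python
def build_prompt(document_text: str, question: str, max_chars: int = 12000) -> str:
    """Combine extracted document text and a user question into one prompt.

    max_chars is a hypothetical character budget for this sketch,
    not a documented Gemini limit.
    """
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    # Naive truncation: keep only the first max_chars characters of the document.
    context = document_text[:max_chars]
    return (
        "Answer the question using only the document below. "
        "If the answer is not in the document, say so.\n\n"
        f"Document:\n{context}\n\n"
        f"Question: {question}"
    )
```

The returned string would then be passed to whichever Gemini client the backend uses. Truncating by character count is the simplest option; a real deployment might split long documents into chunks instead.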

Architecture

The Lens platform consists of the following main components:

Frontend (Streamlit UI/UX): Provides an intuitive interface for users to upload PDFs, ask questions, and interact with the document.
Backend (Text Extraction, AI, and Storage): Extracts text from PDFs with PyMuPDF, answers user questions with a generative AI model (Google Gemini), and caches extracted text for faster access.
Text-to-Speech Module: Converts the extracted text to speech for a hands-free experience.
User Interaction Logging: Logs user interactions with the platform for later analysis and improvement.
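
The backend's extraction and caching steps described above might look roughly like the following. This is a minimal sketch, not the project's actual code: the `TextCache` class, its hash-based key, and the function names are assumptions introduced here. Only `fitz.open(stream=..., filetype="pdf")` and `page.get_text()` are PyMuPDF calls, and PyMuPDF is imported lazily so the cache can be exercised without it.

```python
import hashlib
from typing import Callable, Dict


def extract_text_pymupdf(pdf_bytes: bytes) -> str:
    """Extract text from every page of a PDF using PyMuPDF."""
    import fitz  # PyMuPDF; imported here so the rest of the module works without it

    with fitz.open(stream=pdf_bytes, filetype="pdf") as doc:
        return "\n".join(page.get_text() for page in doc)


class TextCache:
    """In-memory cache of extracted text, keyed by the SHA-256 of the PDF bytes."""

    def __init__(self, extractor: Callable[[bytes], str]) -> None:
        self._extractor = extractor
        self._store: Dict[str, str] = {}
        self.misses = 0  # number of times the extractor actually ran

    def get_text(self, pdf_bytes: bytes) -> str:
        key = hashlib.sha256(pdf_bytes).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._extractor(pdf_bytes)
        return self._store[key]
```

With this shape, `TextCache(extract_text_pymupdf).get_text(uploaded_bytes)` would run extraction once per distinct file and serve repeat uploads from memory. Keying on the content hash rather than the file name means a renamed copy of the same PDF still hits the cache.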

System Architecture

+------------------+             +---------------------------+       +-------------------------------+
|   User Device    |   --->      |   Frontend (Streamlit)    |  ---> |   Backend (Server/Cloud)      |
|  (Web Browser)   |             |   (UI/UX Interface)       |       |   (AI Models, PDF Processor)  |
+------------------+             +---------------------------+       +-------------------------------+
                                              |                                     |
                                              |                                     |
                                              v                                     v
                          +--------------------------------+       +-------------------------------+
                          |    PDF Upload & Management     |       |    Text Extraction Module     |
                          |    (File upload, storage)      |       |   (PyMuPDF)                   |
                          +--------------------------------+       +-------------------------------+
                                              |                                     |
                                              v                                     v
                          +--------------------------------+       +-------------------------------+
                          |   Text-to-Speech Module        |       |   Generative AI Model (Gemini)|
                          |   (Convert text to speech)     |       |   (Question answering, chat)  |
                          +--------------------------------+       +-------------------------------+
                                              |                                     |
                                              v                                     v
                          +--------------------------------+       +-------------------------------+
                          |  Text Storage & Caching        |       |   Response Generation         |
                          |  (Cache extracted text)        |       |   (Process responses)         |
                          +--------------------------------+       +-------------------------------+
                                              |
                                              v
                          +--------------------------------+
                          |   User Interaction Logging     |
                          |   (Store user questions &      |
                          |    interactions)               |
                          +--------------------------------+
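
The final box in the diagram, User Interaction Logging, could be implemented in many ways. The sketch below assumes a simple JSON Lines file with one record per question; the function names, the record fields, and the file format are all illustrative choices rather than details from the Lens source.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def log_interaction(log_path: Path, question: str, answer: str) -> dict:
    """Append one question/answer pair to a JSON Lines file and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
    }
    with log_path.open("a", encoding="utf-8") as f:
        # ensure_ascii=False keeps Devanagari text readable in the log file
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def read_interactions(log_path: Path) -> List[dict]:
    """Load all logged records; returns an empty list if the file does not exist."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

An append-only text file keeps the logging step independent of any database, and each line can be parsed on its own when the log is analysed later.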
