📚 Multi-Source Knowledge Intelligence Agent

A Streamlit-based AI system that intelligently analyzes and interprets information from multiple data sources, including URLs and uploaded documents (PDF, DOCX, CSV, PPTX, TXT, JSON). It performs automated summarization, sentiment analysis, intent detection, stance interpretation, and knowledge graph visualization, and provides an interactive chat interface powered by Retrieval-Augmented Generation (RAG) with contextual memory.


🚀 Key Features

  • Multi-Source Input: Supports both URL-based and file-based knowledge ingestion.
  • Automated Insights: Generates summary, sentiment, intent, and stance analysis.
  • Dynamic Knowledge Graphs: Visualizes entity relationships with interactive graph rendering.
  • Conversational Agent: A context-aware chatbot that answers factual and personal queries using RAG and intent classification.
  • Robust Session Handling: Independent workflows for URL and file modes with state management and persistent memory.
  • Seamless User Experience: Real-time feedback, auto-refresh, and intuitive interface design.

🧠 Tech Stack

  • Frontend: Streamlit
  • Backend: LangGraph, LangChain, LangSmith, FAISS Vector Store, Python
  • LLM: OpenAI GPT (via langchain_openai)
  • Visualization: Matplotlib, NetworkX
  • Memory & Retrieval: ConversationalRetrievalChain, ContextualCompressionRetriever

⚙️ Installation

```shell
# Clone the repository
git clone https://github.com/SayamAlt/Multi-Source-Knowledge-Intelligence-Agent.git
cd Multi-Source-Knowledge-Intelligence-Agent

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows use venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

🔑 Environment Setup

Create a .env file in the root directory and add your OpenAI API key:

```
OPENAI_API_KEY=your_openai_api_key_here
```
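At startup the app expects this key in the environment. A minimal, stdlib-only sketch of loading a `.env` file (the project may well use `python-dotenv` instead; `load_env_file` is an illustrative helper, not the app's actual code):

```python
import os

def load_env_file(path: str = ".env") -> None:
    """Parse simple KEY=VALUE lines and export them to the environment.

    Illustrative stand-in for python-dotenv's load_dotenv(): skips blank
    lines and comments, and never overwrites existing variables.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Usage: load_env_file(); then read os.environ["OPENAI_API_KEY"].
```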

▶️ Usage

```shell
streamlit run app.py
```

Then open the displayed local URL in your browser.

Choose between:

  • 🌐 URL Mode: Enter a web link to extract insights.
  • 📁 File Upload Mode: Upload a supported document for intelligent analysis.

After processing, you will receive:

  • Comprehensive report (summary, sentiment, stance, etc.)
  • Interactive knowledge graph
  • RAG-powered chatbot interface
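The knowledge graph output boils down to turning extracted (subject, relation, object) triples into a graph structure. A stdlib stand-in for what the app does with NetworkX (`build_graph` and the sample triples are illustrative, not taken from the codebase):

```python
def build_graph(triples: list[tuple[str, str, str]]) -> dict[str, list[tuple[str, str]]]:
    """Collect (subject, relation, object) triples into an adjacency map.

    Toy equivalent of adding labeled edges to a NetworkX DiGraph before
    rendering it with Matplotlib.
    """
    graph: dict[str, list[tuple[str, str]]] = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, []).append((relation, obj))
        graph.setdefault(obj, [])  # ensure leaf entities appear as nodes
    return graph

graph = build_graph([
    ("LangChain", "integrates_with", "FAISS"),
    ("LangChain", "calls", "OpenAI GPT"),
])
```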

🧩 Core Functionalities

| Component | Description |
| --- | --- |
| Workflow Engine | Orchestrates knowledge extraction and LLM-driven insights. |
| Intent Classifier | Categorizes user queries as personal, factual, or hybrid. |
| Memory Recall System | Retains contextual user information and relevant past facts. |
| RAG Chatbot | Answers user queries using retrieved document context and memory. |
| Graph Visualizer | Displays structured knowledge as interconnected entities. |
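The app's intent classifier is LLM-based; as a rough illustration of the personal / factual / hybrid routing described above, here is a keyword-heuristic stand-in (the rule set is invented for illustration, only the three category labels come from the project):

```python
import re

def classify_intent(query: str) -> str:
    """Route a user query to 'personal', 'factual', or 'hybrid'.

    Toy heuristic standing in for the LLM classifier: queries about the
    user hit memory, document questions hit RAG, mixed queries use both.
    """
    words = set(re.findall(r"[a-z']+", query.lower()))
    personal = bool(words & {"i", "my", "me", "mine", "remember"})
    factual = bool(words & {"what", "who", "when", "where", "how", "why",
                            "summarize", "explain"})
    if personal and factual:
        return "hybrid"
    if personal:
        return "personal"
    return "factual"
```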

🧩 System Architecture Overview

```mermaid
flowchart TD

    subgraph UI["🖥️ Streamlit Frontend"]
        A1["🌐 URL / 📁 File Input"]
        A2["⚙️ Workflow Execution"]
        A3["💬 Conversational Agent Interface"]
    end

    subgraph CORE["🧠 LangChain Core Pipeline"]
        B1["Retriever (FAISS VectorStore)"]
        B2["LLM (OpenAI GPT via LangChain)"]
        B3["Memory (ConversationBufferMemory)"]
        B4["Contextual Compression (LLMChainExtractor)"]
        B5["Intent Classifier (OpenAI Mini LLM)"]
    end

    subgraph WORKFLOW["🔄 LangGraph Workflow Orchestration"]
        C1["Document Preprocessing"]
        C2["Content Extraction (URL/File)"]
        C3["Embedding Generation"]
        C4["Knowledge Graph Construction (NetworkX)"]
        C5["Insights Generation (Summary, Sentiment, Intent, Stance)"]
    end

    subgraph OBSERVABILITY["📊 LangSmith & Monitoring"]
        D1["Trace & Debug Chains"]
        D2["Performance Metrics & Latency Tracking"]
        D3["Prompt Optimization & Error Reporting"]
    end

    subgraph STORAGE["🗃️ Persistent Storage Layer"]
        E1["FAISS Vector Database"]
        E2["Temporary File Cache"]
        E3["Session & State Management"]
    end

    %% Connections
    UI --> WORKFLOW
    WORKFLOW --> CORE
    CORE --> STORAGE
    CORE --> OBSERVABILITY
    WORKFLOW --> OBSERVABILITY
    UI --> CORE
```

⚙️ Data Flow Summary

  1. Input Phase: User provides a URL or uploads a document (PDF, DOCX, CSV, etc.).
  2. Extraction & Processing: The LangGraph workflow extracts raw text, cleans it, and generates embeddings.
  3. Analysis: LangChain orchestrates summarization, sentiment analysis, and intent detection using connected LLMs.
  4. Knowledge Graph Generation: Extracted entities and relationships are visualized interactively using NetworkX.
  5. Conversational Querying: A RAG-based Conversational Agent allows users to query the processed content contextually.
  6. Monitoring: LangSmith tracks every chain invocation, latency, and model performance for continuous observability.
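Steps 2 and 3 hinge on splitting extracted text into overlapping chunks before embedding. A simplified character-level splitter in the spirit of LangChain's text splitters (the helper name and default sizes are assumptions, not the app's actual settings):

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap.

    Mimics the basic behavior of a character-based text splitter: each
    chunk repeats the last `overlap` characters of the previous one so
    that sentences cut at a boundary survive in at least one chunk.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, max(len(text), 1), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```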

🧱 Architectural Highlights

  • 🧩 Modular, Agentic Design — Each component (ingestion, analysis, visualization) runs as an independent node within a LangGraph workflow.
  • 🧠 Dynamic Context Memory — Past user queries and responses are retained via LangChain Memory for personalized interactions.
  • 🔄 Isolated Workflows per Input Mode — URL and File pipelines function independently with clean session and cache resets.
  • 🪶 Observability First — Full tracing, debugging, and metrics through LangSmith integration.
  • 🚀 Scalable Foundation — Designed to easily extend toward multi-user, API-driven, or enterprise knowledge management use cases.
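The modular design described above, independent nodes threading shared state through a workflow, can be sketched with plain functions over a state dict (conceptual only; the actual app wires its nodes through LangGraph, and these node bodies are placeholders):

```python
from typing import Callable

State = dict  # shared workflow state passed between nodes

def extract(state: State) -> State:
    # Placeholder for content extraction from a URL or uploaded file.
    state["text"] = state["raw"].strip()
    return state

def analyze(state: State) -> State:
    # Placeholder for LLM-driven summary / sentiment / stance analysis.
    state["summary"] = state["text"][:60]
    return state

def run_workflow(state: State, nodes: list[Callable[[State], State]]) -> State:
    """Run each node in order, threading the state dict through,
    mirroring the pattern a compiled LangGraph graph orchestrates."""
    for node in nodes:
        state = node(state)
    return state

result = run_workflow({"raw": "  Example document text.  "}, [extract, analyze])
```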

📈 Example Use Cases

  • Academic and research paper summarization
  • Intelligent document-based Q&A
  • Sentiment and author stance profiling
  • Context-aware data interpretation
  • Enterprise knowledge management

🌟 Future Enhancements

  • Citation tracing and evidence linking
  • Multi-user shared memory persistence
  • Integration with dashboards and APIs
  • Support for additional formats (Excel, HTML scraping)

🧾 License

Released under the Apache 2.0 License — free for modification and distribution with attribution.


🤝 Acknowledgments

Built with ❤️ using LangGraph, Streamlit, LangChain, FAISS, LangSmith, and OpenAI GPT to enable intelligent, explainable, and interactive document analysis.
