@ai-solution-dev

Description

This PR integrates the Polaris AI DataInsight Document Loader into LangChain.js.
The loader uses Polaris AI's document extraction service to process a range of document formats and extract text, images, tables, charts, and mathematical equations.


Key Features

  • Added Polaris AI DataInsight Document Loader support in LangChain.js

    • Load documents either from the file system or buffer data
    • Extract text, images, tables, charts, and equations from various document formats
    • Process documents through Polaris AI's DataInsight API service
  • Flexible document loading modes

    • element: Load each element in the pages as a separate Document object
    • page: Load each page in the document as a separate Document object
    • single: Load the entire document as a single Document object
  • Comprehensive document element support

    • Text content extraction
    • Table processing with HTML output
    • Chart data with CSV format and metadata
    • Image handling with file path references
  • Source: Located under libs/langchain-community

  • Includes comprehensive testing and examples for the integration
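
To make the difference between the three loading modes concrete, here is a minimal sketch of how parsed elements could be grouped into documents. It is not the loader's actual implementation: `ParsedElement`, `Doc`, and `groupElements` are hypothetical names introduced only for illustration, and the metadata keys are assumptions.

```typescript
// Hypothetical shapes for illustration only; not the loader's real types.
type Mode = "element" | "page" | "single";

interface ParsedElement {
  page: number; // 1-based page the element was found on
  type: "text" | "table" | "chart" | "image";
  content: string; // text, table HTML, chart CSV, or an image file path
}

interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Group parsed elements into documents according to the chosen mode.
function groupElements(elements: ParsedElement[], mode: Mode): Doc[] {
  if (mode === "element") {
    // One document per element.
    return elements.map((el) => ({
      pageContent: el.content,
      metadata: { page: el.page, type: el.type },
    }));
  }
  if (mode === "page") {
    // One document per page, with that page's elements joined in order.
    const byPage = new Map<number, string[]>();
    for (const el of elements) {
      const parts = byPage.get(el.page) ?? [];
      parts.push(el.content);
      byPage.set(el.page, parts);
    }
    return Array.from(byPage.entries())
      .sort((a, b) => a[0] - b[0])
      .map(([page, parts]) => ({
        pageContent: parts.join("\n\n"),
        metadata: { page },
      }));
  }
  // "single": one document for the whole file.
  return [
    {
      pageContent: elements.map((el) => el.content).join("\n\n"),
      metadata: {},
    },
  ];
}

const sample: ParsedElement[] = [
  { page: 1, type: "text", content: "Intro paragraph" },
  { page: 1, type: "table", content: "<table><tr><td>1</td></tr></table>" },
  { page: 2, type: "image", content: "resources/figure1.png" },
];

const elementDocs = groupElements(sample, "element");
const pageDocs = groupElements(sample, "page");
const singleDocs = groupElements(sample, "single");
```

With this sample input, `element` mode yields three documents, `page` mode yields two, and `single` mode yields one, which is the trade-off between fine-grained retrieval and keeping full context together.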


Example Usage

import { PolarisAIDataInsightLoader } from "@langchain/community/document_loaders/web/polaris_ai_datainsight";
import fs from "fs";

// Load a document from a path on the file system.
const loaderFromPath = new PolarisAIDataInsightLoader({
  filePath: "path/to/file.docx",
  apiKey: process.env.POLARIS_AI_DATA_INSIGHT_API_KEY,
  resourcesDir: "path/to/save/resources/",
  mode: "single",
});
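const docsFromPath = await loaderFromPath.load();

// Load a document from in-memory buffer data.
// Assumption: the `file` and `filename` option names are illustrative;
// check the loader's type definitions for the exact buffer options.
const buffer = fs.readFileSync("path/to/file.docx");
const loaderFromBuffer = new PolarisAIDataInsightLoader({
  file: buffer,
  filename: "file.docx",
  apiKey: process.env.POLARIS_AI_DATA_INSIGHT_API_KEY,
  resourcesDir: "path/to/save/resources/",
  mode: "single",
});
const docsFromBuffer = await loaderFromBuffer.load();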

…s and examples

    - Add PolarisAIDataInsightLoader for extracting documents from Polaris AI DataInsight API
    - Implement load(), file handling, unzip, and resource mapping
    - Match error messages and comments with Python implementation
    - Provide success and failure unit tests covering element, page, and single modes
    - Add example script demonstrating usage of PolarisAIDataInsightLoader
    - Ensure compliance with LangChain.js coding and linting guidelines
@changeset-bot

changeset-bot bot commented Oct 13, 2025

🦋 Changeset detected

Latest commit: 59d6409

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 5 packages
Name Type
@langchain/redis Patch
@langchain/openai Patch
@langchain/anthropic Patch
@langchain/mongodb Minor
langchain Patch

@github-actions github-actions bot added community Issues related to `@langchain/community` examples labels Oct 13, 2025
@vercel

vercel bot commented Oct 13, 2025

1 Skipped Deployment
Project Deployment Preview Updated (UTC)
langchainjs-docs Ignored Ignored Oct 13, 2025 5:56am
