title: Unstructured
sidebarTitle: Overview

Unstructured provides a platform and tools to ingest and process unstructured documents for Retrieval Augmented Generation (RAG) and model fine-tuning.

This 60-second video explains what Unstructured does and the benefits it offers:

<iframe width="560" height="315" src="https://www.youtube.com/embed/b2AcxJDXOLs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen ></iframe>

This 40-second video demonstrates a simple use case that Unstructured helps solve:

<iframe width="560" height="315" src="https://www.youtube.com/embed/E-tupjji22U" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen ></iframe>

Unstructured offers the Unstructured user interface (UI) and the Unstructured API. Read on to learn more.

Unstructured user interface (UI)

No-code UI. Production-ready. Pay as you go. Learn more.

Here is a screenshot of the Unstructured UI Start page:

Partial view of the Unstructured UI

This 90-second video provides a brief overview of the Unstructured UI:

<iframe width="560" height="315" src="https://www.youtube.com/embed/IVKcQDZa9Zc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen ></iframe>

To start using the Unstructured UI right away, skip ahead to the quickstart.

Unstructured API

Use scripts or code. Production-ready. Pay as you go. Learn more.

The Unstructured API consists of two parts:

  • The Unstructured Workflow Endpoint provides the full range of partitioning, chunking, embedding, and enrichment options for your files and data. It is designed to batch-process files and data in remote locations; send processed results to various storage locations, databases, and vector stores; and use the latest and highest-performing models available. It has built-in logic to deliver the highest quality results at the lowest cost. Learn more.
  • The Unstructured Partition Endpoint is intended for rapid prototyping of Unstructured's partitioning strategies, with limited support for chunking. It processes local files only, one file at a time. For production-level scenarios, use the Unstructured Workflow Endpoint instead: it supports batch processing, files and data in remote locations, embedding generation, post-transform enrichments, and the latest and highest-performing models, and it is designed for the highest quality results at the lowest cost. Learn more.
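To make the "one local file at a time" model of the Partition Endpoint concrete, here is a minimal sketch that builds a single multipart upload request using only the Python standard library. The endpoint URL, the `unstructured-api-key` header name, and the `files` and `strategy` form fields are assumptions for illustration; confirm them against the API reference before relying on them. The helper name `build_partition_request` is hypothetical, and the function only builds the request without sending it.

```python
import mimetypes
import uuid
from pathlib import Path
from urllib.request import Request

# Assumed values for illustration; confirm against the Unstructured API reference.
PARTITION_URL = "https://api.unstructuredapp.io/general/v0/general"
API_KEY_HEADER = "unstructured-api-key"


def build_partition_request(path, api_key, strategy="auto", url=PARTITION_URL):
    """Build (but do not send) a multipart POST that uploads one local file."""
    file_path = Path(path)
    boundary = uuid.uuid4().hex
    content_type = (
        mimetypes.guess_type(file_path.name)[0] or "application/octet-stream"
    )

    # First part: the partitioning strategy as a plain form field.
    # Second part: the file itself, under the assumed "files" field name.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="strategy"\r\n\r\n'
        f"{strategy}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="files"; '
        f'filename="{file_path.name}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + file_path.read_bytes() + tail

    return Request(
        url,
        data=body,
        method="POST",
        headers={
            API_KEY_HEADER: api_key,
            "accept": "application/json",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Processing several files with this approach means one request per file, which is the limitation the Workflow Endpoint is designed to remove.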

Here is a screenshot of some Python code that calls the Unstructured Workflow Endpoint:

Python code that calls the Unstructured Workflow Endpoint
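As a rough text counterpart to the screenshot, the following minimal sketch builds an authenticated request that lists workflows, using only the Python standard library. It is not the code shown in the screenshot. The base URL, the `/workflows/` path, and the `unstructured-api-key` header name are assumptions for illustration, and `build_list_workflows_request` is a hypothetical helper; check the API reference for the actual values.

```python
from urllib.request import Request

# Assumed values for illustration; confirm against the Unstructured API reference.
WORKFLOW_BASE_URL = "https://platform.unstructuredapp.io/api/v1"
API_KEY_HEADER = "unstructured-api-key"


def build_list_workflows_request(api_key, base_url=WORKFLOW_BASE_URL):
    """Build (but do not send) a GET request that lists workflows."""
    return Request(
        f"{base_url.rstrip('/')}/workflows/",
        method="GET",
        headers={API_KEY_HEADER: api_key, "accept": "application/json"},
    )


# Replace the placeholder with a real key before sending the request.
request = build_list_workflows_request("YOUR-API-KEY")
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. Unlike a Partition Endpoint call, no file is uploaded here, because a workflow reads from and writes to remote locations that are configured separately.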

To start using the Unstructured Workflow Endpoint right away, skip ahead to the quickstart.


Supported file types

import SupportedFileTypes from '/snippets/general-shared-text/supported-file-types.mdx';


Unstructured UI quickstart

import SharedSingleFileUI from '/snippets/quickstarts/single-file-ui.mdx';

Learn more about the Unstructured UI.


import LocalToLocalPythonIngestLibrary from '/snippets/ingestion/local-to-local.v2.py.mdx';
import AdditionalIngestDependencies from '/snippets/general-shared-text/ingest-dependencies.mdx';

Unstructured Workflow Endpoint quickstart

import SharedPlatformAPI from '/snippets/quickstarts/platform-api.mdx';

Learn more about the Unstructured API.


Get in touch

If you can't find the information you're looking for in the documentation, or if you need help, contact us directly or join our Slack, where our team and community can help you.