LlamaExtract (Experimental)

LlamaExtract is an API created by LlamaIndex to efficiently infer schemas and extract data from unstructured files.

LlamaExtract directly integrates with LlamaIndex.

Note: LlamaExtract is currently experimental and may change in the future.

Read below for some quickstart information, or see the full documentation.

Getting Started

First, log in and get an API key from https://cloud.llamaindex.ai.

Install the package:

pip install llama-extract

Now you can easily infer schemas and extract data from your files:

# nest_asyncio lets the client run inside an already-running event loop (e.g. a Jupyter notebook)
import nest_asyncio

nest_asyncio.apply()

from llama_extract import LlamaExtract

extractor = LlamaExtract(
    api_key="llx-...",  # can also be set in your env as LLAMA_CLOUD_API_KEY
    num_workers=4,  # if multiple files are passed, split them across `num_workers` API calls
    verbose=True,
)

# Infer schema
schema = extractor.infer_schema(
    "my_schema", ["./my_file1.pdf", "./my_file2.pdf"]
)

# Extract data
results = extractor.extract(schema.id, ["./my_file1.pdf", "./my_file2.pdf"])
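The quickstart notes that the API key can also be set in your environment as LLAMA_CLOUD_API_KEY. A minimal sketch of resolving the key before constructing the extractor is shown below. The helper name `resolve_api_key` is hypothetical and not part of the llama-extract package; only the environment variable name comes from the snippet above.

```python
import os
from typing import Optional

# Environment variable name taken from the quickstart comment above.
ENV_VAR = "LLAMA_CLOUD_API_KEY"


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else fall back to the environment."""
    key = explicit or os.environ.get(ENV_VAR)
    if not key:
        # Fail early with a clear message instead of at the first API call.
        raise RuntimeError(f"Set {ENV_VAR} or pass an API key explicitly.")
    return key
```

The result could then be passed as `api_key=resolve_api_key()` when creating `LlamaExtract`, which keeps the key out of source files.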

Examples

Several end-to-end examples can be found in the examples folder:

- Getting Started

Documentation

https://docs.cloud.llamaindex.ai/
