How to read html files #14939
Unanswered
SarahSohana27
asked this question in
Q&A
Replies: 1 comment, 6 replies
To properly read and load HTML files into LlamaIndex:

    from pathlib import Path

    from llama_index.readers.file import UnstructuredFileLoader

    file_path = Path('./output1.html')

    # Initialize the loader with the path to the HTML file
    loader = UnstructuredFileLoader(
        file_path=file_path, continue_on_failure=False, headers={"User-Agent": "value"}
    )
    documents = loader.load_data()

    # Process the documents as needed
    for doc in documents:
        print(doc.text)

This code initializes the loader with the specified HTML file path and loads its contents as documents. The reader package is installed with:

    pip install llama-index-readers-file

This approach uses the …
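If the import in the snippet above does not resolve in your installed version (the question mentions `UnstructuredReader`, and the class name and arguments shown in the reply may not match the package you have), a dependency-light fallback is to extract the text yourself and wrap it in a `Document`. The sketch below is only an illustration under those assumptions: `html_to_text` and `load_html_file` are hypothetical helper names, not part of LlamaIndex, and the text extraction uses only the Python standard library. It falls back to a plain dict when `llama_index.core` is not installed.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    _SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Hypothetical helper: return the visible text of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def load_html_file(path):
    """Hypothetical helper: read one .html file and wrap its text.

    Returns a llama_index Document if llama_index.core is importable,
    otherwise a plain dict with the same two fields.
    """
    path = Path(path)
    text = html_to_text(path.read_text(encoding="utf-8"))
    metadata = {"file_name": path.name}
    try:
        from llama_index.core import Document
    except ImportError:
        return {"text": text, "metadata": metadata}
    return Document(text=text, metadata=metadata)
```

With the file from the reply, usage would look like `doc = load_html_file("./output1.html")`. The result can then be passed to an index like any other document. This skips the layout-aware parsing that the `unstructured` package provides, so it is best suited to simple pages.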
I have some documents in .html files. How do I load them into LlamaIndex? I tried UnstructuredReader, but it's not working.