Commit 3e164c4

tryptofanik authored and Manul from Pathway committed
Rename parsers to be consistent (#8106)
GitOrigin-RevId: 94d099a84e913a56289b66baa8b6a1f5a7a2fd41
1 parent 969cfcc commit 3e164c4
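For downstream code that still uses the old name, the rename can be applied mechanically. The following is a minimal sketch, not part of this commit: the helper `rename_parsers` and the `OLD_TO_NEW` table are hypothetical names, and the table contains only the single rename visible in this diff (`ParseUnstructured` to `UnstructuredParser`). Whole-word matching keeps longer identifiers that merely contain the old name untouched.

```python
import re

# Hypothetical mapping; it lists only the rename shown in this diff.
OLD_TO_NEW = {"ParseUnstructured": "UnstructuredParser"}

# Match old names only as whole identifiers.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, OLD_TO_NEW)) + r")\b")


def rename_parsers(text: str) -> str:
    """Return text with every old parser name replaced by its new name."""
    return _PATTERN.sub(lambda m: OLD_TO_NEW[m.group(1)], text)


if __name__ == "__main__":
    print(rename_parsers("parser = parsers.ParseUnstructured()"))
    print(rename_parsers("$parser: !pw.xpacks.llm.parsers.ParseUnstructured"))
```

The same function works on Python sources, YAML configs, and notebook JSON, since it operates on plain text.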

8 files changed, with 14 additions and 14 deletions

cookbooks/self-rag-agents/pathway_deploy_langgraph_agents.ipynb

Lines changed: 3 additions & 3 deletions
@@ -42,7 +42,7 @@
 "id": "9e055de9-723f-44e3-ad39-70cc5f8932bf",
 "metadata": {},
 "source": [
-"Magic library is used for detecting file types in the `ParseUnstructured` module.\n",
+"Magic library is used for detecting file types in the `UnstructuredParser` module.\n",
 "\n",
 "If you are running this notebook on **MacOS**, you can install it with:\n",
 "> `brew install libmagic`\n",
@@ -193,7 +193,7 @@
 "\n",
 "\n",
 "1. [Connectors](https://pathway.com/developers/user-guide/connect/pathway-connectors): Use Pathway’s file reader to ingest all text files under the `DATA_PATH`.\n",
-"2. [Parsers](https://pathway.com/developers/api-docs/pathway-xpacks-llm/parsers): Utilize the ParseUnstructured to parse the documents. This parser supports multiple file types, including PDF, DOCX, and PPTX.\n",
+"2. [Parsers](https://pathway.com/developers/api-docs/pathway-xpacks-llm/parsers): Utilize the UnstructuredParser to parse the documents. This parser supports multiple file types, including PDF, DOCX, and PPTX.\n",
 "3. [Text Splitters](https://pathway.com/developers/api-docs/pathway-xpacks-llm/splitters): Split the document content into chunks.\n",
 "4. [Embedders](https://pathway.com/developers/api-docs/pathway-xpacks-llm/embedders): Use OpenAI API for embeddings."
 ]
@@ -242,7 +242,7 @@
 "sources = [folder]\n",
 "\n",
 "# define the document processing steps\n",
-"parser = parsers.ParseUnstructured()\n",
+"parser = parsers.UnstructuredParser()\n",
 "\n",
 "text_splitter = splitters.TokenCountSplitter(min_tokens=150, max_tokens=450)\n",
 "\n",

cookbooks/self-rag-agents/pathway_langgraph_agentic_rag.ipynb

Lines changed: 3 additions & 3 deletions
@@ -31,7 +31,7 @@
 "id": "f42f3015-33e6-48f4-827a-1e44541507cd",
 "metadata": {},
 "source": [
-"Magic library is used for detecting file types in the `ParseUnstructured` module.\n",
+"Magic library is used for detecting file types in the `UnstructuredParser` module.\n",
 "\n",
 "If you are running this notebook on **MacOS**, you can install it with:\n",
 "> `brew install libmagic`\n",
@@ -193,7 +193,7 @@
 "\n",
 "\n",
 "1. [Connectors](https://pathway.com/developers/user-guide/connect/pathway-connectors): Use Pathway’s file reader to ingest all text files under the `DATA_PATH`.\n",
-"2. [Parsers](https://pathway.com/developers/api-docs/pathway-xpacks-llm/parsers): Utilize the ParseUnstructured to parse the documents. This parser supports multiple file types, including PDF, DOCX, and PPTX.\n",
+"2. [Parsers](https://pathway.com/developers/api-docs/pathway-xpacks-llm/parsers): Utilize the UnstructuredParser to parse the documents. This parser supports multiple file types, including PDF, DOCX, and PPTX.\n",
 "3. [Text Splitters](https://pathway.com/developers/api-docs/pathway-xpacks-llm/splitters): Split the document content into chunks.\n",
 "4. [Embedders](https://pathway.com/developers/api-docs/pathway-xpacks-llm/embedders): Use OpenAI API for embeddings.\n",
 "5. [VectorStore](https://pathway.com/developers/api-docs/pathway-xpacks-llm/vectorstore): Orchestrates all the above modules."
@@ -255,7 +255,7 @@
 "sources = [folder]\n",
 "\n",
 "# define the document processing steps\n",
-"parser = parsers.ParseUnstructured()\n",
+"parser = parsers.UnstructuredParser()\n",
 "\n",
 "text_splitter = splitters.TokenCountSplitter(min_tokens=150, max_tokens=450)\n",
 "\n",

examples/pipelines/adaptive-rag/app.yaml

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ $embedder: !pw.xpacks.llm.embedders.OpenAIEmbedder
 $splitter: !pw.xpacks.llm.splitters.TokenCountSplitter
   max_tokens: 400
 
-$parser: !pw.xpacks.llm.parsers.ParseUnstructured
+$parser: !pw.xpacks.llm.parsers.UnstructuredParser
   cache_strategy: !pw.udfs.DefaultCache
 
 $retriever_factory: !pw.stdlib.indexing.BruteForceKnnFactory

examples/pipelines/demo-document-indexing/app.yaml

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ $embedder: !pw.xpacks.llm.embedders.SentenceTransformerEmbedder
 $splitter: !pw.xpacks.llm.splitters.TokenCountSplitter
   max_tokens: 400
 
-$parser: !pw.xpacks.llm.parsers.ParseUnstructured
+$parser: !pw.xpacks.llm.parsers.UnstructuredParser
   cache_strategy: !pw.udfs.DefaultCache
 
 $retriever_factory: !pw.stdlib.indexing.BruteForceKnnFactory

examples/pipelines/demo-question-answering/app.yaml

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ $embedder: !pw.xpacks.llm.embedders.OpenAIEmbedder
 $splitter: !pw.xpacks.llm.splitters.TokenCountSplitter
   max_tokens: 400
 
-$parser: !pw.xpacks.llm.parsers.ParseUnstructured
+$parser: !pw.xpacks.llm.parsers.UnstructuredParser
   cache_strategy: !pw.udfs.DefaultCache
 
 $retriever_factory: !pw.stdlib.indexing.BruteForceKnnFactory

examples/pipelines/drive_alert/app.py

Lines changed: 2 additions & 2 deletions
@@ -35,7 +35,7 @@
 from pathway.stdlib.ml.index import KNNIndex
 from pathway.xpacks.llm.embedders import OpenAIEmbedder
 from pathway.xpacks.llm.llms import OpenAIChat, prompt_chat_single_qa
-from pathway.xpacks.llm.parsers import ParseUnstructured
+from pathway.xpacks.llm.parsers import UnstructuredParser
 from pathway.xpacks.llm.splitters import TokenCountSplitter
 
 # To use advanced features with Pathway Scale, get your free license key from
@@ -165,7 +165,7 @@ def run(
         service_user_credentials_file=service_user_credentials_file,
         refresh_interval=30,  # interval between fetch operations in seconds, lower this for more responsiveness
     )
-    parser = ParseUnstructured()
+    parser = UnstructuredParser()
     documents = files.select(texts=parser(pw.this.data))
     documents = documents.flatten(pw.this.texts)
     documents = documents.select(texts=pw.this.texts[0])

examples/pipelines/private-rag/app.yaml

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ $embedder: !pw.xpacks.llm.embedders.SentenceTransformerEmbedder
 $splitter: !pw.xpacks.llm.splitters.TokenCountSplitter
   max_tokens: 400
 
-$parser: !pw.xpacks.llm.parsers.ParseUnstructured
+$parser: !pw.xpacks.llm.parsers.UnstructuredParser
   cache_strategy: !pw.udfs.DefaultCache
 
 $retriever_factory: !pw.stdlib.indexing.BruteForceKnnFactory

examples/pipelines/unstructured_to_sql_on_the_fly/app.py

Lines changed: 2 additions & 2 deletions
@@ -65,7 +65,7 @@
 import tiktoken
 from pathway.stdlib.utils.col import unpack_col
 from pathway.xpacks.llm.llms import OpenAIChat, prompt_chat_single_qa
-from pathway.xpacks.llm.parsers import ParseUnstructured
+from pathway.xpacks.llm.parsers import UnstructuredParser
 
 # To use advanced features with Pathway Scale, get your free license key from
 # https://pathway.com/features and paste it below.
@@ -302,7 +302,7 @@ def run(
         data_dir,
         format="binary",
     )
-    parser = ParseUnstructured()
+    parser = UnstructuredParser()
     unstructured_documents = files.select(texts=parser(pw.this.data)).select(
         texts=strip_metadata(pw.this.texts)
     )
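
Code that has to run against library versions from both before and after this rename could look the class up by name. This is a hedged sketch rather than anything this commit provides: `resolve_parser_class` is a hypothetical helper, and the commit does not say whether the old name remains available as an alias. The demo uses `SimpleNamespace` stand-ins so it runs without Pathway installed; in real use the argument would be the imported `pathway.xpacks.llm.parsers` module.

```python
from types import SimpleNamespace


def resolve_parser_class(module, new_name="UnstructuredParser",
                         old_name="ParseUnstructured"):
    """Return the parser class, preferring the new name over the old one."""
    for name in (new_name, old_name):
        cls = getattr(module, name, None)
        if cls is not None:
            return cls
    raise AttributeError(
        f"neither {new_name!r} nor {old_name!r} found on {module!r}"
    )


if __name__ == "__main__":
    # Stand-in modules (assumptions for the demo, not the real library).
    class FakeParser:
        pass

    after_rename = SimpleNamespace(UnstructuredParser=FakeParser)
    before_rename = SimpleNamespace(ParseUnstructured=FakeParser)

    print(resolve_parser_class(after_rename).__name__)
    print(resolve_parser_class(before_rename).__name__)
```

Preferring the new name first means the fallback is only exercised on versions that predate the rename.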
