Replies: 1 comment
-
The discrepancy in the number of points in your vector database (572 points instead of 113) could be due to the way the ingestion pipeline and the vector store handle the data. Here is a snippet of the relevant part of the index structure from the provided document:
{
  "index_struct": {
    "__type__": "simple_dict",
    "__data__": {
      "index_id": "9343c0c6-8313-4549-a423-1348960b6958",
      "summary": null,
      "nodes_dict": {
        "ddb890e7-82d5-4dc5-a950-b498c3ef2494": "ddb890e7-82d5-4dc5-a950-b498c3ef2494",
        "5a03268d-b654-48a4-b3ea-b9620a5ec614": "5a03268d-b654-48a4-b3ea-b9620a5ec614",
        "8d29350a-a932-4665-b436-a810e66def48": "8d29350a-a932-4665-b436-a810e66def48",
        "6b151549-3bcc-4e11-a181-c1ac4dd34c8a": "6b151549-3bcc-4e11-a181-c1ac4dd34c8a",
        "e2ac2af1-b59f-4c83-b365-f0205ba1f732": "e2ac2af1-b59f-4c83-b365-f0205ba1f732",
        "307b2feb-5b0a-4b86-aec7-994074c94d29": "307b2feb-5b0a-4b86-aec7-994074c94d29"
      },
      "doc_id_dict": {
        "707af1fd-62af-4693-93e1-01433b900aca": [
          "ddb890e7-82d5-4dc5-a950-b498c3ef2494",
          "5a03268d-b654-48a4-b3ea-b9620a5ec614",
          "8d29350a-a932-4665-b436-a810e66def48",
          "6b151549-3bcc-4e11-a181-c1ac4dd34c8a",
          "e2ac2af1-b59f-4c83-b365-f0205ba1f732",
          "307b2feb-5b0a-4b86-aec7-994074c94d29"
        ]
      },
      "embeddings_dict": {}
    }
  }
}
To resolve this issue, you should verify the following:
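One check that fits the index structure shown above is to count the node IDs it actually stores and compare that with the number of nodes you expect. This is a minimal sketch, assuming the structure is available as a JSON string with the same layout as the snippet; the helper name `count_index_nodes` and the short sample IDs are hypothetical and only for illustration.

```python
import json
from typing import Dict, Tuple


def count_index_nodes(index_json: str) -> Tuple[int, Dict[str, int]]:
    """Return (total node count, node count per document ID)."""
    data = json.loads(index_json)["index_struct"]["__data__"]
    # Assumption: some persisted stores keep __data__ as a JSON-encoded string
    # instead of a nested object, so decode it a second time if needed.
    if isinstance(data, str):
        data = json.loads(data)
    per_doc = {doc_id: len(ids) for doc_id, ids in data["doc_id_dict"].items()}
    return len(data["nodes_dict"]), per_doc


# Hypothetical sample with the same layout as the snippet above.
sample = json.dumps({
    "index_struct": {
        "__type__": "simple_dict",
        "__data__": {
            "index_id": "idx",
            "summary": None,
            "nodes_dict": {"n1": "n1", "n2": "n2", "n3": "n3"},
            "doc_id_dict": {"d1": ["n1", "n2", "n3"]},
            "embeddings_dict": {},
        },
    }
})

total, per_doc = count_index_nodes(sample)
print(total, per_doc)
```

If the total here already differs from the node count reported by the ingestion pipeline, the extra entries were probably created before the data reached the vector store.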
Additionally, based on the LlamaIndex integration tests, it is important to ensure that the ingestion pipeline correctly transforms documents into nodes with embeddings. The test cases provided in the document show how to validate the ingestion pipeline and vector store operations:
    from typing import List

    from llama_index.core.schema import TextNode

    def test_nodes(nodes: List[TextNode]) -> None:
        """Test Ingestion Pipeline transforming documents into nodes with embeddings."""
        assert isinstance(nodes, list)
        assert isinstance(nodes[0], TextNode)
Make sure that your ingestion pipeline correctly transforms documents into nodes and that the nodes are correctly added to the vector store. This will help ensure that the number of points in your vector database matches the expected count [1][2].
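To narrow down where the extra points come from, it can help to compare the stored points against the expected node count. The sketch below is only an illustration under stated assumptions: it assumes you have already exported the text of every point from the vector store into a plain list (the export step is not shown), and the helper name `summarize_points` and the sample strings are hypothetical.

```python
import hashlib
from typing import Dict, List


def summarize_points(texts: List[str], expected_nodes: int) -> Dict[str, int]:
    """Compare stored point texts against the expected node count."""
    # Hash each text so that identical chunks collapse to a single entry.
    unique_hashes = {hashlib.sha256(t.encode("utf-8")).hexdigest() for t in texts}
    total = len(texts)
    unique = len(unique_hashes)
    return {
        "expected_nodes": expected_nodes,
        "total_points": total,
        "unique_texts": unique,
        "duplicate_points": total - unique,
    }


# Hypothetical export: three distinct chunks, some stored more than once.
exported_texts = ["alpha", "beta", "alpha", "gamma", "beta", "alpha"]
report = summarize_points(exported_texts, expected_nodes=3)
print(report)
```

If `unique_texts` matches the expected node count while `total_points` is higher, that would suggest the same nodes were inserted more than once. If `unique_texts` itself is well above the expected count, that would suggest the text was split again somewhere between the pipeline and the vector store.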
-
I am using Qdrant as my vector DB and an ingestion pipeline to create nodes.
This is my ingestion pipeline code.
As you can see, I have a total of 113 nodes.
Now I am trying to create an index using this piece of code.
However, Qdrant shows 572 points.
Can anyone please help me understand why it shows 572 points? I have even tried Pinecone and Milvus, and all of them show 572 points.