
PGVectorStore fails with AzureAIEmbeddingsModel ('async_generator' object is not iterable) #253

@pccrdotnet

When I call store.aadd_documents() with AzureAIEmbeddingsModel as the embedding_service, it fails with TypeError: 'async_generator' object is not iterable. I think the problem occurs when the async method iterates over the metadata objects to build the rows for insertion into the DB. This is the exception:

File "/home/pablo/projects/RAGPrototype/.venv/lib/python3.10/site-packages/langchain_postgres/v2/vectorstores.py", line 218, in aadd_documents
return await self._engine._run_as_async(
File "/home/pablo/projects/RAGPrototype/.venv/lib/python3.10/site-packages/langchain_postgres/v2/engine.py", line 121, in _run_as_async
return await asyncio.wrap_future(
File "/home/pablo/projects/RAGPrototype/.venv/lib/python3.10/site-packages/langchain_postgres/v2/async_vectorstore.py", line 397, in aadd_documents
ids = await self.aadd_texts(texts, metadatas=metadatas, ids=ids, **kwargs)
File "/home/pablo/projects/RAGPrototype/.venv/lib/python3.10/site-packages/langchain_postgres/v2/async_vectorstore.py", line 377, in aadd_texts
ids = await self.aadd_embeddings(
File "/home/pablo/projects/RAGPrototype/.venv/lib/python3.10/site-packages/langchain_postgres/v2/async_vectorstore.py", line 286, in aadd_embeddings
for id, content, embedding, metadata in zip(ids, texts, embeddings, metadatas):
TypeError: 'async_generator' object is not iterable
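
For reference, the same TypeError can be reproduced without any database or Azure dependency. This is only a minimal sketch, assuming that one of the arguments reaching zip() (possibly embeddings) is an async generator object; fake_embeddings is a hypothetical stand-in and not part of langchain-postgres or langchain-azure-ai.

```python
import asyncio


async def fake_embeddings():
    # Hypothetical stand-in for an embedding call that yields vectors lazily.
    for vector in ([0.1, 0.2], [0.3, 0.4]):
        yield vector


async def main() -> str:
    ids = ["a", "b"]
    texts = ["Apples and oranges", "Cars and airplanes"]
    # Calling an async generator function returns an async generator object.
    # It supports "async for", but not the plain iteration that zip() performs.
    embeddings = fake_embeddings()
    try:
        # Same shape as the failing loop in aadd_embeddings.
        for _ in zip(ids, texts, embeddings):
            pass
    except TypeError as exc:
        return type(exc).__name__
    return "no error"


result = asyncio.run(main())
print(result)
```

If this assumption holds, the store itself is behaving as expected and the value handed to it would need to be collected into a list first.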

Example code that reproduces the error:

import asyncio
from langchain_postgres.v2.engine import Column
from langchain_postgres import PGEngine, PGVectorStore
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.documents import Document
from langchain_azure_ai.embeddings import AzureAIEmbeddingsModel

async def StartAgent() -> None:
    # Configuration
    CONNECTION_STRING = "postgresql+psycopg://postgres:XXXXXXXXXX@localhost:5432/TestDb3"
    VECTOR_SIZE = 1536
    TABLE_NAME = "my_doc_collection2"

    azKey = "XXXXXXXXXXX"
    azEndPointEmbedding = "https://XXXXXXXXX.openai.azure.com/openai/deployments/text-embedding-ada-002/"
    azModelEmbedding = "text-embedding-ada-002"

    # Initialize engine and embedding
    engine = PGEngine.from_connection_string(url=CONNECTION_STRING)

    embedding = AzureAIEmbeddingsModel(
            endpoint=azEndPointEmbedding,
            credential=azKey,
            model_name=azModelEmbedding
        )

    # Define metadata columns when creating the table (run once, then comment out)
    # engine.init_vectorstore_table(
    #     table_name=TABLE_NAME,
    #     vector_size=VECTOR_SIZE,
    #     metadata_columns=[
    #         Column(name="country", data_type="text"),
    #         Column(name="city", data_type="text"),
    #         Column(name="address", data_type="text"),
    #         # Add more columns as needed
    #     ]
    # )

    # Create the vector store
    store = await PGVectorStore.create(
        engine=engine,
        table_name=TABLE_NAME,
        embedding_service=embedding,
        metadata_columns=["country", "city", "address"]  # Must match the columns defined above
    )

    # Add documents with metadata
    docs = [
        Document(
            page_content="Apples and oranges",
            metadata={"country": "USA", "city": "New York", "address": "123 Main St"}
        ),
        Document(
            page_content="Cars and airplanes",
            metadata={"country": "France", "city": "Paris", "address": "456 Rue de la Paix"}
        ),
        Document(
            page_content="Train",
            metadata={"country": "Japan", "city": "Tokyo", "address": "789 Shinjuku Ave"}
        )
    ]

    await store.aadd_documents(documents=docs)


if __name__ == "__main__":
    asyncio.run(StartAgent())

Libraries:
langchain 0.3.27
langchain-azure-ai 0.1.5
langchain-cohere 0.4.6
langchain-community 0.3.27
langchain-core 0.3.76
langchain-experimental 0.3.4
langchain-openai 0.3.33
langchain-postgres 0.0.15
langchain-text-splitters 0.3.9
langgraph 0.6.7
langgraph-checkpoint 2.1.1
langgraph-prebuilt 0.6.4
langgraph-sdk 0.2.6
langsmith 0.4.14

Note: The same code works fine when the embedding service is switched to DeterministicFakeEmbedding.
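
Since the failure only appears with one embedding service, a possible stopgap is to wrap the embedder so that its async method always returns a plain list. The sketch below is an assumption-heavy illustration: LazyEmbedder, MaterializingEmbedder and materialize are hypothetical names, and it assumes each item yielded by the async generator is a single embedding vector. I have not verified that AzureAIEmbeddingsModel behaves like LazyEmbedder.

```python
import asyncio
import inspect
from typing import Any, List


async def materialize(result: Any) -> List[List[float]]:
    """Return a plain list whether `result` is a list or an async generator."""
    if inspect.isasyncgen(result):
        # Drain the async generator with an async comprehension.
        return [vector async for vector in result]
    return list(result)


class LazyEmbedder:
    """Hypothetical embedder whose async method returns an async generator."""

    async def aembed_documents(self, texts: List[str]):
        async def gen():
            for text in texts:
                # Dummy one-dimensional "embedding" based on text length.
                yield [float(len(text))]

        return gen()


class MaterializingEmbedder:
    """Hypothetical wrapper that always hands back a list of vectors."""

    def __init__(self, inner: Any) -> None:
        self._inner = inner

    async def aembed_documents(self, texts: List[str]) -> List[List[float]]:
        return await materialize(await self._inner.aembed_documents(texts))


wrapped = MaterializingEmbedder(LazyEmbedder())
vectors = asyncio.run(wrapped.aembed_documents(["Train", "Apples"]))

# The result can now be consumed by zip() like any other list.
pairs = list(zip(["id-1", "id-2"], vectors))
print(pairs)
```

To pass such a wrapper as embedding_service, it would presumably also need to subclass langchain_core.embeddings.Embeddings and forward embed_documents, embed_query and aembed_query to the wrapped model.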
