Fixed the bug of reindex and drop_index when specifying the schema #222

zhcn000000 · 2025-06-26T07:22:25Z

Fixed a bug in reindex and drop_index when a schema is specified, and
added a use_jsonb parameter to PGEngine for storing metadata as JSONB (default: False).
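
For readers unfamiliar with this class of bug: in PostgreSQL, `DROP INDEX` and `REINDEX INDEX` resolve an unqualified index name through the search path, so an index living in a non-default schema may not be found unless the name is schema-qualified. The sketch below only illustrates that idea. The helper names (`quote_ident`, `drop_index_sql`, `reindex_sql`) are hypothetical and are not taken from this PR's diff, which is not shown here.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified_index(schema_name: str, index_name: str) -> str:
    """Return the schema-qualified, quoted index name."""
    return f"{quote_ident(schema_name)}.{quote_ident(index_name)}"


def drop_index_sql(index_name: str, schema_name: str = "public") -> str:
    # Hypothetical helper: qualify the index so it is found outside the search path.
    return f"DROP INDEX IF EXISTS {qualified_index(schema_name, index_name)};"


def reindex_sql(index_name: str, schema_name: str = "public") -> str:
    # Hypothetical helper: REINDEX INDEX also accepts a schema-qualified name.
    return f"REINDEX INDEX {qualified_index(schema_name, index_name)};"


if __name__ == "__main__":
    print(drop_index_sql("docs_idx", "app"))
    print(reindex_sql("docs_idx", "app"))
```

Quoting both parts separately keeps mixed-case or otherwise unusual schema names intact.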

averikitsch · 2025-07-02T22:32:38Z

Hi @zhcn000000, thank you for this PR. Can you provide more details on the purpose of this change? Currently, we recommend that any metadata that should be indexed and filtered on be specified as dedicated "metadata_columns", which performs even better than JSONB. Additionally, the JSON data type has faster insertion performance than JSONB.
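
To make the distinction concrete, here is a rough sketch of the two filter shapes being compared: a predicate on a dedicated column versus a predicate that extracts a key from a JSON/JSONB column with the `->>` operator. The function names and the `langchain_metadata` and `category` identifiers are illustrative assumptions, not the library's actual query builder.

```python
def column_filter(column: str) -> str:
    """Predicate on a dedicated metadata column (can use a plain B-tree index)."""
    quoted = '"' + column.replace('"', '""') + '"'
    return f"{quoted} = :value"


def json_filter(json_column: str, key: str) -> str:
    """Predicate that extracts a key as text from a JSON or JSONB column."""
    quoted_col = '"' + json_column.replace('"', '""') + '"'
    quoted_key = "'" + key.replace("'", "''") + "'"
    # ->> works on both json and jsonb and returns the value as text.
    return f"{quoted_col}->>{quoted_key} = :value"


if __name__ == "__main__":
    # Assumed names, for illustration only.
    print("WHERE " + column_filter("category"))
    print("WHERE " + json_filter("langchain_metadata", "category"))
```

The first form compares a typed column directly, while the second has to extract and compare text for every row unless a suitable expression or GIN index exists.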

zhcn000000 · 2025-07-04T02:58:33Z

This may not have an obvious effect: JSONB is faster to read and JSON is faster to write. Still, it gives users an additional option, just like the use_jsonb option in the traditional PGVector engine:

    def __init__(
        self,
        embeddings: Embeddings,
        *,
        connection: Union[None, DBConnection, Engine, AsyncEngine, str] = None,
        embedding_length: Optional[int] = None,
        collection_name: str = _LANGCHAIN_DEFAULT_COLLECTION_NAME,
        collection_metadata: Optional[dict] = None,
        distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
        pre_delete_collection: bool = False,
        logger: Optional[logging.Logger] = None,
        relevance_score_fn: Optional[Callable[[float], float]] = None,
        engine_args: Optional[dict[str, Any]] = None,
        use_jsonb: bool = True,
        create_extension: bool = True,
        async_mode: bool = False,
    ) -> None:
        """Initialize the PGVector store.
        For an async version, use `PGVector.acreate()` instead.

        Args:
            connection: Postgres connection string or (async)engine.
            embeddings: Any embedding function implementing
                `langchain.embeddings.base.Embeddings` interface.
            embedding_length: The length of the embedding vector. (default: None)
                NOTE: This is not mandatory. Defining it will prevent vectors of
                any other size to be added to the embeddings table but, without it,
                the embeddings can't be indexed.
            collection_name: The name of the collection to use. (default: langchain)
                NOTE: This is not the name of the table, but the name of the collection.
                The tables will be created when initializing the store (if not exists)
                So, make sure the user has the right permissions to create tables.
            distance_strategy: The distance strategy to use. (default: COSINE)
            pre_delete_collection: If True, will delete the collection if it exists.
                (default: False). Useful for testing.
            engine_args: SQLAlchemy's create engine arguments.
            use_jsonb: Use JSONB instead of JSON for metadata. (default: True)
                Strongly discouraged from using JSON as it's not as efficient
                for querying.
                It's provided here for backwards compatibility with older versions,
                and will be removed in the future.
            create_extension: If True, will create the vector extension if it
                doesn't exist. disabling creation is useful when using ReadOnly
                Databases.
        """

zhcn000000 · 2025-07-04T03:00:37Z

Setting the default value of use_jsonb to False lets users keep storing metadata with the original scheme (JSON) by default.
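
A minimal sketch of what such an opt-in flag could look like when building the table definition, assuming a simplified table layout. The function names and the `content`, `embedding`, and `langchain_metadata` column names are hypothetical and do not reproduce the PR's implementation; `vector(n)` is the pgvector column type.

```python
def metadata_column_type(use_jsonb: bool = False) -> str:
    """Pick the metadata column type; JSON stays the default for compatibility."""
    return "JSONB" if use_jsonb else "JSON"


def create_table_sql(table_name: str, vector_size: int, use_jsonb: bool = False) -> str:
    # Hypothetical, simplified table layout for illustration only.
    meta_type = metadata_column_type(use_jsonb)
    return (
        f'CREATE TABLE "{table_name}" (\n'
        f'    "content" TEXT NOT NULL,\n'
        f'    "embedding" vector({vector_size}) NOT NULL,\n'
        f'    "langchain_metadata" {meta_type}\n'
        f");"
    )


if __name__ == "__main__":
    print(create_table_sql("docs", 768))                  # JSON (default)
    print(create_table_sql("docs", 768, use_jsonb=True))  # JSONB (opt-in)
```

Keeping the default at False means existing callers get the same column type as before, and only callers who pass `use_jsonb=True` see a change.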

zhcn000000 · 2025-07-23T11:37:26Z

Fixed the bug in reindex and drop_index when a schema is specified.

zhcn000000 and others added 5 commits June 18, 2025 16:48

Remove wrong async

c50b8a8

Merge branch 'main' into main

9629def

Merge branch 'langchain-ai:main' into main

b487335

Add feature use_jsonb

2841f25

Add feature use_jsonb

25c8cc2

Fix bug on reindex and drop index when schema set

fc8e9ce

zhcn000000 changed the title ~~Add the use_jsonb parameter to PGEngine for storing metadata using JSONB~~ Fixed the bug of reindex and drop_index when specifying the schema Jul 23, 2025
