Skip to content

Conversation

GromNaN
Copy link
Member

@GromNaN GromNaN commented Sep 5, 2025

Q A
Type feature
BC Break no
Fixed issues PHPORM-381

Summary

Add a new #[ODM\VectorSearchIndex] attribute to create Atlas Vector Search Indexes.

In XML, attribute and annotations, this is a new type of object. But they create Search Indexes in the class metadata with the type vectorSearch. The index management is exactly the same for search and vector search indexes.

@GromNaN GromNaN added the Feature label Sep 5, 2025
@GromNaN GromNaN added this to the 2.13.0 milestone Sep 5, 2025
{
/** @param list<VectorSearchIndexField> $fields */
public function __construct(
public array $fields,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the fields are required, it has to be the 1st parameter of the constructor.

@GromNaN GromNaN marked this pull request as ready for review September 5, 2025 15:17
@GromNaN GromNaN requested a review from Copilot September 5, 2025 15:17
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for vector search indexes in MongoDB ODM, introducing a new #[VectorSearchIndex] attribute for creating Atlas Vector Search Indexes. This feature enables efficient similarity search on vector fields, commonly used for machine learning embeddings.

Key changes include:

  • New #[VectorSearchIndex] attribute for annotating document classes
  • Extended ClassMetadata to handle vector search indexes alongside regular search indexes
  • XML mapping support for vector search index definitions

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
lib/Doctrine/ODM/MongoDB/Mapping/Annotations/VectorSearchIndex.php New annotation class for vector search index definition
lib/Doctrine/ODM/MongoDB/Mapping/ClassMetadata.php Extended metadata handling to support vector search indexes with validation
lib/Doctrine/ODM/MongoDB/Mapping/Driver/AttributeDriver.php Added attribute driver support for processing VectorSearchIndex annotations
lib/Doctrine/ODM/MongoDB/Mapping/Driver/XmlDriver.php Added XML driver support for vector-search-indexes configuration
lib/Doctrine/ODM/MongoDB/Mapping/MappingException.php New exception for empty vector search index validation
tests/Documents/VectorEmbedding.php Test document demonstrating vector search index usage
tests/Doctrine/ODM/MongoDB/Tests/SchemaManagerTest.php Updated tests to handle vector search indexes in schema management
tests/Doctrine/ODM/MongoDB/Tests/Mapping/ClassMetadataTest.php Added tests for vector search index validation and creation
doctrine-mongo-mapping.xsd Updated XSD schema to support vector search index XML definitions
docs/en/reference/attributes-reference.rst Documentation for the new VectorSearchIndex attribute

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

}

if (isset($vectorField['hnswNumEdgeCandidates'])) {
$field['hnswOptions']['numEdgeCandidates'] = (int) $vectorField['hnswNumEdgeCandidates'];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my information, when/how do the types specified in the XML schema get checked? I see that the schema defines the types, but does that get validated somewhere before we get here and cast them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type is only used when validating the XML file with the XSD. It's never used to cast the node to the correct type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I mean it's cast here on line 778 so I was wondering if we're making sure somewhere else that it actually is a number. But ok cool, so it is validated!

@paulinevos
Copy link
Contributor

Please hold, I accepted and realized I'd only reviewed one commit. Can't remove the review 💀

@GromNaN GromNaN requested a review from paulinevos September 8, 2025 11:38
Copy link
Contributor

@paulinevos paulinevos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good now as far as I'm concerned

@GromNaN GromNaN requested a review from alcaeus September 8, 2025 16:32
- ``fields`` (required): A list of field definitions. Each field definition is an associative array describing a vector or filter field. For vector fields, the following keys are supported:

- ``type``: Must be set to ``'vector'`` for vector fields or ``'filter'`` for filter fields.
- ``path``: The name of the field in the document to index.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it use the class metadata to map the doctrine field name to the mongodb field name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, we do not do that for the fields option on SearchIndex. The values here should correspond to the exact names in the database schema.

This may be something worth clarifying in the documentation files for both, though.

@GromNaN GromNaN merged commit 9a0e994 into doctrine:2.13.x Sep 10, 2025
22 checks passed
@GromNaN GromNaN deleted the PHPORM-381 branch September 10, 2025 10:45
- ``type``: Must be set to ``'vector'`` for vector fields or ``'filter'`` for filter fields.
- ``path``: The name of the field in the document to index.
- ``numDimensions``: (vector fields only) The number of dimensions in the vector.
- ``similarity``: (vector fields only) The vector similarity function to use. Supported values include ``'euclidean'``, ``'cosine'``, and ``'dotProduct'``. Use the constants from ``Doctrine\ODM\MongoDB\Mapping\ClassMetadata::VECTOR_SIMILARITY_*`` for best compatibility.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you chose not to use an enum here? I suppose keeping this open makes it easier to be forward-compatible should the server introduce more similarity types.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I want to keep it open for new values, and let people use strings from the documentation.

- ``fields`` (required): A list of field definitions. Each field definition is an associative array describing a vector or filter field. For vector fields, the following keys are supported:

- ``type``: Must be set to ``'vector'`` for vector fields or ``'filter'`` for filter fields.
- ``path``: The name of the field in the document to index.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, we do not do that for the fields option on SearchIndex. The values here should correspond to the exact names in the database schema.

This may be something worth clarifying in the documentation files for both, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants