-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cosmos Full Text Search support #35868
base: main
Are you sure you want to change the base?
Conversation
@@ -927,7 +927,19 @@ protected override ShapedQueryExpression TranslateCast(ShapedQueryExpression sou | |||
|
|||
if (TranslateLambdaExpression(source, keySelector) is SqlExpression translation) | |||
{ | |||
((SelectExpression)source.QueryExpression).ApplyOrdering(new OrderingExpression(translation, ascending)); | |||
if (translation is SqlFunctionExpression { IsScoringFunction: true }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a bit magical (using ORDER BY vs ORDER BY RANK based on the type of argument passed to the OrderBy linq method). Initially I had an separate OrderByRank method (on CosmosQueryablEextension), so users could be explicit about wanting to order by scoring function, but hit all sorts of issues in the nav expansion - we need process the navigations inside the scoring function and/or push includes via pending selector/orderby, but Nav expansion doesn't know about the new OrderByRank method and there is no good way to extend it :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, adding a queryable operator to EF is kinda impossible because of our nav expansion design, I've ran into this several times in the past.
Will review when this is out of draft, we'll see what makes sense here...
{ | ||
} | ||
|
||
internal SqlFunctionExpression( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will add to SqlExpressionFactory later
@@ -276,9 +298,26 @@ private static async Task<bool> CreateContainerIfNotExistsOnceAsync( | |||
containerProperties.VectorEmbeddingPolicy = new VectorEmbeddingPolicy(embeddings); | |||
} | |||
|
|||
if (vectorIndexes.Any()) | |||
if (vectorIndexes.Any()/* || fullTextIndexes.Any()*/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix and clean-up this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for full-text search in Cosmos by introducing new full-text index types, translators, model validations, and extensions.
- Introduces full-text indexing support in container creation and client wrappers.
- Adds a new CosmosFullTextSearchTranslator and corresponding DbFunctions extensions.
- Updates model validation and Fluent API extensions to include full-text search configuration.
Reviewed Changes
Copilot reviewed 23 out of 26 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
src/EFCore.Cosmos/Storage/Internal/CosmosClientWrapper.cs | Adds support for full-text indexing in container creation, but the new full-text index assignment may override vector index settings. |
src/EFCore.Cosmos/Query/Internal/Translators/CosmosVectorSearchTranslator.cs | Modifies method checking logic that may prematurely reject valid method calls by changing && to |
src/EFCore.Cosmos/Query/Internal/Translators/CosmosFullTextSearchTranslator.cs | Implements a new translator for full-text search functions. |
src/EFCore.Cosmos/Query/Internal/Expressions/SqlFunctionExpression.cs | Introduces an extra constructor parameter to flag scoring functions. |
src/EFCore.Cosmos/Query/Internal/Expressions/SelectExpression.cs | Adds a RankOrdering property and methods to apply rank-based ordering. |
src/EFCore.Cosmos/Query/Internal/CosmosQueryableMethodTranslatingExpressionVisitor.cs | Adjusts ordering logic to support scoring functions and rank ordering. |
src/EFCore.Cosmos/Query/Internal/CosmosQuerySqlGenerator.cs | Updates SQL generation to include rank ordering handling. |
src/EFCore.Cosmos/Metadata/Internal/CosmosAnnotationNames.cs | Defines new annotations for full-text search configuration. |
src/EFCore.Cosmos/Infrastructure/Internal/CosmosModelValidator.cs | Validates full-text index configuration, ensuring only single-property indexes with language annotation are defined. |
src/EFCore.Cosmos/Extensions/CosmosPropertyExtensions.cs, CosmosPropertyBuilderExtensions.cs, CosmosIndexExtensions.cs, CosmosIndexBuilderExtensions.cs, CosmosDbFunctionsExtensions.cs | Adds Fluent API and extension methods for configuring full-text search on properties and indexes. |
Files not reviewed (3)
- Directory.Packages.props: Language not supported
- src/EFCore.Cosmos/EFCore.Cosmos.csproj: Language not supported
- src/EFCore.Cosmos/Properties/CosmosStrings.Designer.cs: Language not supported
Comments suppressed due to low confidence (2)
src/EFCore.Cosmos/Query/Internal/Translators/CosmosVectorSearchTranslator.cs:33
- Changing the logical operator from '&&' to '||' alters the intended logic and causes the translator to return null for methods declared on CosmosDbFunctionsExtensions that are not named 'VectorDistance'. Revisit the condition to ensure it only returns null when both conditions are unmet.
if (method.DeclaringType != typeof(CosmosDbFunctionsExtensions) || method.Name != nameof(CosmosDbFunctionsExtensions.VectorDistance))
src/EFCore.Cosmos/Extensions/CosmosPropertyExtensions.cs:137
- [nitpick] The Experimental attribute for full-text search methods is using 'CosmosVectorSearchExperimental' instead of the more appropriate 'CosmosFullTextSearchExperimental'. This may cause confusion regarding the feature's experiment status.
[Experimental(EFDiagnostics.CosmosVectorSearchExperimental)]
@@ -52,6 +52,41 @@ public static T CoalesceUndefined<T>( | |||
T expression2) | |||
=> throw new InvalidOperationException(CoreStrings.FunctionOnClient(nameof(CoalesceUndefined))); | |||
|
|||
/// <summary> | |||
/// TODO | |||
/// </summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leverage CoPilot to fill these in, and then review quickly
@@ -12,6 +12,7 @@ | |||
<NoWarn>$(NoWarn);EF9101</NoWarn> <!-- Metrics is experimental --> | |||
<NoWarn>$(NoWarn);EF9102</NoWarn> <!-- Paging is experimental --> | |||
<NoWarn>$(NoWarn);EF9103</NoWarn> <!-- Vector search is experimental --> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still experimental - I remember a discussion to update these, or am I mis-remembering?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're right - I think it's no longer experimental. Open an issue for us to make this non-experimental for 10?
0f39e44
to
a836b63
Compare
/// any release. You should only use it directly in your code with extreme caution and knowing that | ||
/// doing so can result in application failures when updating to a new Entity Framework Core release. | ||
/// </summary> | ||
public class RankOrderingExpression(SqlExpression expression) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use OrderingExpression and set IsAscending to true, cause we don't generate ASC for it on Cosmos, just DESC for descending. But it feels a bit hacky. Upside is less public surface. Thoughts?
- Adding model building API to configure property as full-text search enabled, as well as setup the index for it, - Adding model validation (e.g. FTS index not matching FTS property), - Adding EF.Functions stubs and translations for FullTextContains, FullTextContainsAll, FullTextContainsAny, FullTextScore and RRF (for hybrid), - Adding logic in SelectExpression to produce ORDER BY RANK when necessary, - Adding validation when attempting to mix with ORDER BY RANK with regular ORDER BY, - Rewrite OFFSET/LIMIT from parameter to constant when ORDER BY RANK is present. outstanding work: - support for FTS Container building using Azure.ResourceManager.CosmosDb (currently blocked) - support for Owned types (adjust model in tests and fix paths) - this also needs to happen for vector search - add model building support for default language, - add more tests for hybrid search, - add more model validation tests for invalid scenarios (index on multiple columns, FTS on non-string etc) - add xml docs, - clean up exception messages and put them in a resource file, Fixes #35476 Fixes #35853 Fixes #35867
a836b63
to
898cc5f
Compare
@maumar just to note that I'm holding back from reviewing since this is still in draft (you seemed to maybe want people to take a look in yesterday's meeting). Let me know if you want a review etc. |
outstanding work:
Fixes #35476
Fixes #35853
Fixes #35867