[LCORE-1331] Add Solr filter and update doc #1178
@@ -282,6 +282,14 @@ providers:
content_field: chunk
embedding_dimension: 384
embedding_model: ${env.EMBEDDING_MODEL_DIR}
chunk_window_config:
chunk_parent_id_field: "parent_id"
chunk_content_field: "chunk_field"
chunk_index_field: "chunk_index"
chunk_token_count_field: "num_tokens"
parent_total_chunks_field: "total_chunks"
parent_total_tokens_field: "total_tokens"
chunk_filter_query: "is_chunk:true"
persistence:
namespace: portal-rag
backend: kv_default
@@ -294,6 +302,19 @@ registered_resources:
embedding_dimension: 384
```

Note: if the vector database (portal-rag) is not in the persistent data store within the vector_io provider
(e.g. after deleting the llama stack cache) you will need to register the vector database under registered resources:

```yaml
vector_stores:
  - embedding_dimension: 384
    embedding_model: sentence-transformers/${env.EMBEDDING_MODEL_DIR}
    provider_id: solr-vector
    vector_store_id: portal-rag
```

**2. Configure Lightspeed Stack (`lightspeed-stack.yaml`):**

```yaml
@@ -324,6 +345,14 @@ Note: Solr does not currently work with RAG tools. You will need to specify "no_
- **Offline mode**: Uses `parent_id` with Mimir base URL
- **Online mode**: Uses `reference_url` from document metadata

**Query Filtering:**

To filter the Solr context edit the *chunk_filter_query* field in the
Solr **vector_io** provider in the `run.yaml`. Filters should follow the key:value format:
ex. `"product:*openshift*`"

Note: This static filter is a temporary work-around.

Comment on lines 348 to 354
Contributor
Three minor doc issues in the Query Filtering section.
📝 Proposed fix
-To filter the Solr context edit the *chunk_filter_query* field in the
-Solr **vector_io** provider in the `run.yaml`. Filters should follow the key:value format:
-ex. `"product:*openshift*`"
-
-Note: This static filter is a temporary work-around.
+To filter the Solr context, edit the `chunk_filter_query` field in the Solr **vector_io** provider in `run.yaml`. Filters must follow Solr query syntax (`field:value`), for example: `"product:*openshift*"`
+
+> [!NOTE]
+> This static filter is a temporary workaround.
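
To make the suggested wording concrete, here is a minimal sketch of a filter value. It is an assumption, not part of the PR: it keeps the `is_chunk:true` default shown in the diff and narrows it with the `product:*openshift*` example using Solr's boolean `AND`. Whether the provider passes a compound expression through to Solr unchanged is not confirmed here. The key's nesting inside the Solr **vector_io** provider config is left out because the diff view does not preserve indentation.

```yaml
# Hypothetical value, not taken from the PR: combines the default chunk filter
# with the product example. Both clauses use Solr field:value syntax.
chunk_filter_query: "is_chunk:true AND product:*openshift*"
```

Because the filter is static, it would apply to every query until `run.yaml` is edited again.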

**Prerequisites:**

- Solr must be running and accessible at the configured URL

`embedding_model` in the note snippet is inconsistent with the primary Solr example above and will mislead users.

Line 301 (the main Solr provider YAML block, six lines above) correctly shows `embedding_model: granite-embedding-30m`, the registered model's `model_id`. The note snippet at line 312 reverts to `sentence-transformers/${env.EMBEDDING_MODEL_DIR}`, which will not resolve correctly (see the corresponding `run.yaml` issue).

Additionally, the plain `Note:` prefix is inconsistent with the GFM alert style used throughout the rest of this document.
Verify each finding against the current code and only fix it if needed.
In @docs/rag_guide.md around lines 305 - 315, update the inline YAML snippet under the explanatory note to match the main Solr example by changing the `vector_stores` entry's `embedding_model` from the placeholder `sentence-transformers/${env.EMBEDDING_MODEL_DIR}` to `granite-embedding-30m` (ensure `provider_id: solr-vector` and `vector_store_id: portal-rag` remain), and replace the plain "Note:" prefix with the repository's standard GFM alert style used elsewhere in the doc so the warning format is consistent.
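
The collapsed proposed fix for this comment is not reproduced on the page. As a rough sketch of what the comment describes, the note's snippet might end up as follows. This takes the reviewer's `granite-embedding-30m` value at face value and assumes conventional two-space YAML list indentation. The surrounding note text would move into a `> [!NOTE]` alert, which is not shown here.

```yaml
# Sketch only: the embedding_model value comes from the review comment and
# should be checked against the model_id actually registered in run.yaml.
vector_stores:
  - embedding_dimension: 384
    embedding_model: granite-embedding-30m
    provider_id: solr-vector
    vector_store_id: portal-rag
```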