diff --git a/www/docs/api-reference/search-apis/reranking.md b/www/docs/api-reference/search-apis/reranking.md index 906846696..3dfd6977a 100644 --- a/www/docs/api-reference/search-apis/reranking.md +++ b/www/docs/api-reference/search-apis/reranking.md @@ -10,7 +10,6 @@ import {vars} from '@site/static/variables.json'; import CodePanel from '@site/src/theme/CodePanel'; - Initial search results often fail to capture nuanced relevance or diversity, potentially leading to suboptimal user experiences. Utilizing Vectara's reranking can significantly enhance the quality and usefulness of @@ -23,17 +22,24 @@ more accurate results. ## Available rerankers -Vectara currently provides the following rerankers: +Vectara offers multiple reranking models that enable you to choose the best one +for your data and use case. You can evaluate different models +against your own dataset to determine which provides optimal results for your +domain, accuracy, and latency requirements. -* [**Multilingual Reranker v1**](/docs/learn/vectara-multi-lingual-reranker) (`type=customer_reranker` and `reranker_name=Rerank_Multilingual_v1`) - also known as Slingshot, provides more accurate neural ranking than the - initial Boomerang retrieval. While computationally more expensive, it offers - improved text scoring across a wide range of languages, making it suitable - for diverse content. -* [**Maximal Marginal Relevance (MMR) Reranker**](/docs/learn/mmr-reranker) (`type=mmr`) - for diversifying results while maintaining relevance. -* [**User Defined Function Reranker**](/docs/learn/user-defined-function-reranker) (`type=userfn`) for - custom scoring based on metadata. +| Reranker Name | API Name | Description | +|--------------|----------|-------------| +| **Qwen3 Reranker** (default) | `qwen3-reranker` | High-performance multilingual neural reranker optimized for accuracy. In many benchmarks, Qwen3 demonstrates strong performance, though results vary by dataset. 
| +| **Mixbread Reranker** | `mxbai-rerank-base-v2` | Efficient production-friendly model offering a good balance between speed and accuracy. | +| [**Multilingual Reranker v1**](/docs/learn/vectara-multi-lingual-reranker) (Slingshot) | `Rerank_Multilingual_v1` | Neural reranker providing more accurate ranking than initial Boomerang retrieval. While computationally more expensive, it offers improved text scoring across a wide range of languages. | +| [**Maximal Marginal Relevance (MMR) Reranker**](/docs/learn/mmr-reranker) | `type=mmr` | Diversifies results while maintaining relevance. | +| [**User Defined Function Reranker**](/docs/learn/user-defined-function-reranker) | `type=userfn` | Applies custom scoring based on metadata or business rules. | + +:::tip +To enable reranking in the Vectara console, navigate to the +Query tab of a corpus and select **Retrieval**. Use this for exploration +and experimenting with the API. +::: ### Chain reranking @@ -51,32 +57,66 @@ precision while maintaining recall. ## Enable reranking -To enable reranking, specify the appropriate value for the `type` in the -`reranker` object. For the MMR reranker, use `mmr`. In most scenarios, -it makes sense to use the default query `start` value of `0` so that you're -reranking all of the best initial results. You can also set the `limit` of the -`query` to the total number of documents you wish to rerank. The default value +To enable reranking, specify the appropriate value for the `type` in the +`reranker` object. For the MMR reranker, use `mmr`. In most scenarios, +it makes sense to use the default query `start` value of `0` so that you're +reranking all of the best initial results. You can also set the `limit` of the +`query` to the total number of documents you wish to rerank. The default value is `25`. -The following example shows the `limit` and `type` values in a query. Note that +The following example shows the `limit` and `type` values in a query. 
Note that this simplified example intentionally omits several parameter values. +### Using neural rerankers + +For neural rerankers like Qwen3, Mixbread, or Multilingual v1, use +`type=customer_reranker` and specify the `reranker_name`. + + + + + +## Best practices + +When working with multiple rerankers, consider the following best practices: + +* **Experimentation**: Each reranker behaves differently depending on your + content and queries. Evaluate each reranker on your own dataset to determine + which provides the best results for your specific use case. +* **Latency vs. accuracy**: Larger models like Qwen3 tend to provide more + accurate results but can add more latency compared to smaller models like + Mixbread. Test both models to find the right balance for your application. +* **Fallback handling**: Ensure your application handles reranker errors + gracefully and can fall back to retrieval-only results if a reranker fails + or times out. ## Search cutoffs @@ -92,9 +132,9 @@ level of relevance. For example, when you set the `cutoff` to `0.5`, only result with a score of `0.5` or higher are considered. For example: When a reranker is applied with a cutoff, it performs the following steps: @@ -111,10 +151,11 @@ cutoff is applied first, followed by the limit. ::: :::caution -Search cutoffs are most effective when used with neural rerankers like -the Vectara Multilingual reranker (Slingshot). This provides normalized -scores between 0 and 1. If you use hybrid search methods that involve BM25, -scores may be unbounded, making cutoff values less predictable. +Search cutoffs are most effective when used with neural rerankers like +Qwen3, Mixbread, or the Vectara Multilingual reranker (Slingshot), which +provide normalized scores between 0 and 1. If you use hybrid search methods +that involve BM25, scores may be unbounded, making cutoff values less +predictable. ::: ## Search limits @@ -204,8 +245,4 @@ only highly relevant and recent documents for summarization. 2. 
The next stage prioritizes documents based on their `publish_ts` value, which represents the publication timestamp. -:::tip -You can also enable reranking in the Vectara console after navigating to the -Query tab of a corpus and selecting **Retrieval**. Use this for exploration -and experimenting with the API. -::: + diff --git a/www/docs/learn/knee-reranking.md b/www/docs/learn/knee-reranking.md index e1ef37a62..0323b5515 100644 --- a/www/docs/learn/knee-reranking.md +++ b/www/docs/learn/knee-reranking.md @@ -15,17 +15,17 @@ distributions across queries. Knee reranking addresses this challenge by detecting natural boundaries between relevant and irrelevant results automatically. -Knee reranking combines statistical analysis with configurable parameters -to provide intelligent, adaptive filtering. Designed specifically to work -after the Slingshot reranker, it analyzes score patterns to identify -significant drops in relevance while maintaining safeguards against -over-aggressive filtering. For more details about how this reranker works, see +Knee reranking combines statistical analysis with configurable parameters +to provide intelligent, adaptive filtering. Designed specifically to work +after the Slingshot reranker, it analyzes score patterns to identify +significant drops in relevance while maintaining safeguards against +over-aggressive filtering. For more details about how this reranker works, see this [**blog post**](https://www.vectara.com/blog/introducing-the-knee-reranking-smart-result-filtering-for-better-results). ## Enable knee reranking -Enable knee reranking by adding it your reranking chain after the Slingshot -reranker. The default settings balance precision and recall, making them +Enable knee reranking by adding it to your reranking chain after the Slingshot +reranker. The default settings balance precision and recall, making them suitable for most use cases. 
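For intuition only, the idea of cutting results at a significant score drop can be sketched in a few lines of Python. This is not Vectara's knee algorithm: it is a simplified largest-gap heuristic, and the function name `knee_filter` and the parameters `min_drop` and `min_keep` are hypothetical, chosen for this illustration rather than taken from the Vectara API.

```python
from typing import List


def knee_filter(scores: List[float], min_drop: float = 0.2, min_keep: int = 1) -> List[float]:
    """Illustrative largest-gap cutoff (hypothetical, not the Vectara implementation).

    Sorts scores in descending order, finds the largest drop between
    neighbouring scores, and keeps only the scores before that drop.
    """
    ranked = sorted(scores, reverse=True)
    # Too few results to cut safely: return everything.
    if len(ranked) <= max(min_keep, 1):
        return ranked
    # Drop between each consecutive pair of ranked scores.
    drops = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    # Safeguard: only consider cut points that keep at least min_keep results.
    candidates = range(max(min_keep, 1) - 1, len(drops))
    best = max(candidates, key=lambda i: drops[i])
    # No sufficiently sharp drop: treat the list as having no knee.
    if drops[best] < min_drop:
        return ranked
    return ranked[: best + 1]


if __name__ == "__main__":
    # Keeps the three scores that precede the large drop to 0.41.
    print(knee_filter([0.92, 0.90, 0.88, 0.41, 0.39]))
```

The `min_keep` guard plays the role of a safeguard against over-aggressive filtering, and returning the full list when no drop reaches `min_drop` avoids cutting a flat score distribution at an arbitrary point.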
+Knee reranking also works with the new neural rerankers like Qwen3 and Mixbread: + + + Customize the behavior of knee reranking through two key parameters: * **Sensitivity:** Controls how sharply the score must drop to identify a cutoff. diff --git a/www/docs/learn/vectara-multi-lingual-reranker.md b/www/docs/learn/vectara-multi-lingual-reranker.md index bc1d4cfc3..015dc2006 100644 --- a/www/docs/learn/vectara-multi-lingual-reranker.md +++ b/www/docs/learn/vectara-multi-lingual-reranker.md @@ -11,28 +11,30 @@ import {vars} from '@site/static/variables.json'; import CodePanel from '@site/src/theme/CodePanel'; -Generative AI applications often struggle with ranking the most relevant -information, leading to hallucinations and irrelevant responses. The new -Vectara Multilingual Reranker V1, also known as Slingshot, is a -state-of-the-art reranking model that significantly enhances the precision of -retrieved results. Providing advanced neural ranking, it refines the output of -initial models like [Boomerang](https://vectara.com/blog/introducing-boomerang-vectaras-new-and-improved-retrieval-model/), -offering even more accurate document scoring and response quality in Retrieval -Augmented Generation (RAG) pipelines. - -The Vectara Multilingual Reranker operates as a second-pass refinement tool, -building on Boomerang’s high-recall capabilities. While Boomerang quickly -retrieves a broad set of relevant documents, the Multilingual Reranker -delivers more precise results, ensuring that the top-ranked documents are the -most relevant. This reranker also excels across both English and multilingual -datasets, making it a powerful tool for global use cases. - -While more computationally expensive and introducing some additional latency, -the multilingual reranker improves neural ranking beyond Boomerang’s initial -selection by providing more precise text scoring. 
Think of the Slingshot -reranker as a "better Boomerang" for refining results, with the multilingual -capability serving primarily as a differentiator from other rerankers in the -market, which are often English-only. +Generative AI applications often struggle with ranking the most relevant +information, leading to hallucinations and irrelevant responses. The Vectara +Multilingual Reranker V1, also known as Slingshot, is a neural reranking model +that enhances the precision of retrieved results. Providing advanced neural +ranking, it refines the output of initial models like [Boomerang](https://vectara.com/blog/introducing-boomerang-vectaras-new-and-improved-retrieval-model/), +offering improved document scoring and response quality in Retrieval Augmented +Generation (RAG) pipelines. + +The Vectara Multilingual Reranker operates as a second-pass refinement tool, +building on Boomerang's high-recall capabilities. While Boomerang quickly +retrieves a broad set of relevant documents, the Multilingual Reranker +delivers more precise results, ensuring that the top-ranked documents are the +most relevant. This reranker excels across both English and multilingual +datasets, making it a strong choice for global use cases. + +Although it is more computationally expensive and introduces some additional latency, +the multilingual reranker improves neural ranking beyond Boomerang's initial +selection by providing more precise text scoring. The multilingual capability +serves as a key differentiator, as many rerankers on the market are English-only. + +Vectara now offers multiple reranking models, including Qwen3 (the default for +SaaS) and Mixbread. You should evaluate different rerankers on your own +dataset to determine which provides the best results for your specific use +case and latency requirements. Using this reranker requires both the `type` and `reranker_name` in the `reranker` object. 
Set the `type` as `customer_reranker` and the `reranker_name` diff --git a/www/docs/release-notes.mdx b/www/docs/release-notes.mdx index 617b994bb..ce45cbaef 100644 --- a/www/docs/release-notes.mdx +++ b/www/docs/release-notes.mdx @@ -20,6 +20,34 @@ and how these product and documentation changes can benefit your enterprise. --- +## New Reranking Models: Qwen3 and Mixbread + +_September 30, 2025_ + +Vectara now offers two new neural reranking models: **Qwen3** and **Mixbread**. +These models provide more flexibility to optimize search result relevance +based on your specific accuracy and latency requirements. + +**Why it matters:** Each use case demands different tradeoffs. With multiple +available rerankers, you can evaluate and select the model that best fits your +data and performance needs. + +**What's new:** + +- **Qwen3 Reranker** (`qwen3-reranker`): High-accuracy multilingual model, now the + default option. Optimized for precision across diverse datasets. +- **Mixbread Reranker** (`mxbai-rerank-base-v2`): Efficient model balancing speed + and accuracy for high-volume production workloads. +- Override defaults per query or combine rerankers in chains for advanced strategies. +- Evaluate all rerankers against your data to find the optimal fit. + +**More information:** + +* [Reranking](/docs/api-reference/search-apis/reranking) +* [Chain Reranker](/docs/learn/chain-reranker) + +--- + ## Vectara Agents Framework _September 3, 2025_