Commit
Update documents to include rerank tool (#3691)
# Description

Please add an informative description that covers that changes made by the pull request and link all relevant issues.

# All Promptflow Contribution checklist:
- [x] **The pull request does not introduce [breaking changes].**
- [x] **CHANGELOG is updated for new features, bug fixes or other significant changes.**
- [x] **I have read the [contribution guidelines](https://github.com/microsoft/promptflow/blob/main/CONTRIBUTING.md).**
- [x] **I confirm that all new dependencies are compatible with the MIT license.**
- [x] **Create an issue and link to the pull request to get dedicated review from promptflow team. Learn more: [suggested workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices
- [x] Title of the pull request is clear and informative.
- [x] There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, [see this page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines
- [ ] Pull request includes test coverage for the included changes.
1 parent b5a68f4, commit 6e1d23d
Showing 3 changed files with 73 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -244,6 +244,9 @@ | |
"usecwd", | ||
"locustio", | ||
"euap", | ||
"Rerank", | ||
"rerank", | ||
"reranker", | ||
"rcfile", | ||
"pylintrc" | ||
], | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
# Rerank | ||
|
||
## Introduction | ||
Rerank is a semantic search tool that improves search quality with a semantic reranking system, which can contextualize the meaning of a user's query beyond keyword relevance. This tool works best with the lookup tool, acting as a ranker after the initial retrieval. The currently supported ranking methods are listed below. | ||
|
||
| Name | Description | | ||
| --- | --- | | ||
| BM25 | BM25 is an open-source ranking algorithm that measures the relevance of documents to a given query. | | ||
| Scaled Score Fusion | Scaled Score Fusion calculates a scaled relevance score. | | ||
| Cohere Rerank | Cohere Rerank is the market’s leading reranking model used for semantic search and retrieval-augmented generation (RAG). | | ||
|
||
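To make the BM25 entry in the table concrete, here is a minimal sketch of textbook Okapi BM25 scoring. It is not the tool's implementation, which this document does not show. The function name `bm25_scores`, the whitespace tokenization, and the parameter defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with textbook Okapi BM25.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, and k1/b are common default values.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue  # terms absent from the document contribute nothing
            # Smoothed inverse document frequency; the "+ 1.0" keeps it positive.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1.0 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1.0) / norm
        scores.append(score)
    return scores


docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "reranking improves search quality",
]
scores = bm25_scores("search quality", docs)
ranked = sorted(zip(scores, docs), reverse=True)  # highest score first
```

Only the third document contains the query terms, so it is the only one with a non-zero score and it sorts to the front of `ranked`.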
## Requirements | ||
- For AzureML users, the tool is installed in the default image, so you can use it without extra installation. | ||
- For local users, install the package: | ||
|
||
`pip install promptflow-vectordb` | ||
|
||
## Prerequisites | ||
|
||
BM25 and Scaled Score Fusion are included as default reranking methods. To use the Cohere Rerank model, create a serverless deployment of the model and establish a connection between the tool and the resource as follows. | ||
|
||
- Add a Serverless Model connection. Fill in the "API base" and "API key" fields with the values from your serverless deployment. | ||
|
||
|
||
## Inputs | ||
|
||
| Name | Type | Description | Required | | ||
|------------------------|-------------|-----------------------------------------------------------------------|----------| | ||
| query | string | the question relevant to your input documents | Yes | | ||
| ranker_parameters | string | the type of ranking method to use | Yes | | ||
| result_groups | object | the list of document chunks to rerank. Normally this is the output of the lookup tool | Yes | | ||
| top_k | int | the maximum number of relevant documents to return | No | | ||
|
||
|
||
|
||
## Outputs | ||
|
||
| Field | Description | | ||
|-------------|------------------------------------------| | ||
| text | text of the entity | | ||
| metadata | metadata such as file path and URL | | ||
| additional_fields | metadata and rerank score | | ||
|
||
<details> | ||
<summary>Output</summary> | ||
|
||
```json | ||
[ | ||
  { | ||
    "text": "sample text", | ||
    "metadata": | ||
    { | ||
      "filepath": "sample_file_path", | ||
      "metadata_json_string": "meta_json_string", | ||
      "title": "", | ||
      "url": "" | ||
    }, | ||
    "additional_fields": | ||
    { | ||
      "filepath": "sample_file_path", | ||
      "metadata_json_string": "meta_json_string", | ||
      "title": "", | ||
      "url": "", | ||
      "@promptflow_vectordb.reranker_score": 0.013795365 | ||
    } | ||
  } | ||
] | ||
``` | ||
</details> |
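
As a rough illustration of consuming this output in a downstream node, the sketch below sorts entities by the `@promptflow_vectordb.reranker_score` key shown in the sample and keeps the top results. The helper name `top_k_by_score` and the two sample entities are hypothetical; only the field names come from the output sample above.

```python
import json

# Key name taken from the sample output above.
SCORE_KEY = "@promptflow_vectordb.reranker_score"


def top_k_by_score(results, top_k=None):
    """Sort reranked entities by score, highest first, and keep at most top_k.

    Hypothetical helper; entities without a score are treated as 0.0.
    """
    ranked = sorted(
        results,
        key=lambda item: item["additional_fields"].get(SCORE_KEY, 0.0),
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]


# Hand-written stand-in for the tool output; the values are illustrative only.
sample_output = json.loads("""
[
  {
    "text": "first chunk",
    "metadata": {"filepath": "a.md", "title": "", "url": ""},
    "additional_fields": {
      "filepath": "a.md",
      "@promptflow_vectordb.reranker_score": 0.0138
    }
  },
  {
    "text": "second chunk",
    "metadata": {"filepath": "b.md", "title": "", "url": ""},
    "additional_fields": {
      "filepath": "b.md",
      "@promptflow_vectordb.reranker_score": 0.0421
    }
  }
]
""")

best = top_k_by_score(sample_output, top_k=1)
print(best[0]["text"])  # prints "second chunk"
```

Sorting defensively like this makes no assumption about the order in which the tool returns entities.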