Add persistent pipeline to the Sherlock RAG example #1416
Labels
feature request
New feature or request
sherlock
Issues/PRs related to Sherlock workflows and components
Is this a new feature, an improvement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request
High
Please provide a clear description of problem this feature solves
As part of the Sherlock work, an example is needed that shows how to use Morpheus to execute multiple LLM queries that utilize RAG inside of a pipeline.
Describe your ideal solution
Purpose
The purpose of this example is to illustrate how a user could build a pipeline which integrates an LLM service into a Morpheus pipeline. This example builds on the previous example, #1305, by adding the ability to augment LLM queries with context information from a knowledge base. Appending this context helps improve the responses from the LLM by providing additional background and factual information which the LLM can draw on for its response.
In order for this pipeline to function correctly, a Vector Database must already have been populated with information that can be retrieved. An example of populating a database is illustrated in #1298. This example assumes that pipeline has already been run to completion.
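The retrieval step described above can be sketched in plain Python. This is a minimal illustration and not the Morpheus or Vector Database API: the in-memory `KNOWLEDGE_BASE`, the three-dimensional embeddings, and the helper names `cosine_similarity` and `retrieve_context` are all assumptions made up for this example.

```python
import math

# Toy stand-in for a populated vector database (hypothetical data).
KNOWLEDGE_BASE = [
    {"text": "Morpheus pipelines are built from stages.", "embedding": [1.0, 0.0, 0.0]},
    {"text": "A vector database stores embeddings.", "embedding": [0.0, 1.0, 0.0]},
    {"text": "Kafka topics carry messages.", "embedding": [0.0, 0.0, 1.0]},
]


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_context(query_embedding, knowledge_base, top_k=2):
    """Return the texts of the top_k entries most similar to the query."""
    ranked = sorted(
        knowledge_base,
        key=lambda entry: cosine_similarity(query_embedding, entry["embedding"]),
        reverse=True,
    )
    return [entry["text"] for entry in ranked[:top_k]]
```

In a real deployment the embeddings would come from an embedding model and the search would be delegated to the Vector Database populated by #1298.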
Scenario
This example will show two different implementations of a RAG pipeline but the pipeline and components could be used in many scenarios with different requirements. At a high level, the following illustrates different customization points for this pipeline and the specific choices made for this example:
LLMService interface.
llama-cpp-python
"stuff" retrievers in Langchain. Using a simple custom prompt keeps the implementation easy to understand.
LLMEngine
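As a rough sketch of what a simple custom prompt in the "stuff" style could look like, the snippet below places every retrieved document into a single prompt ahead of the question. The template wording and the function name `build_stuff_prompt` are hypothetical and are not taken from the example's actual code.

```python
# Hypothetical prompt template; the real example may word this differently.
PROMPT_TEMPLATE = """Use the following context to answer the question.

Context:
{context}

Question: {question}
Answer:"""


def build_stuff_prompt(question, documents):
    """Concatenate all retrieved documents into one prompt ("stuff" style)."""
    context = "\n\n".join(doc.strip() for doc in documents)
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

The resulting string would then be passed to whichever LLM service is configured. Stuffing every document into one prompt is only workable while the combined context fits within the model's context window.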
Implementation
This example will add a new version of the pipeline using a click command.
Persistent Morpheus pipeline
The persistent Morpheus pipeline is functionally similar to the standalone pipeline; however, it uses multiple sources and multiple sinks to perform both the upload and retrieval portions in the same pipeline. The benefit of this pipeline over the standalone pipeline is that no VDB upload process needs to be run beforehand. Everything runs in a single pipeline.
The implementation for this pipeline is illustrated by the following diagram:
The major differences between the diagram and the example pipeline are:
KafkaSourceStage to make it easy for the user to control when messages are processed by the example pipeline
upload and retrieve_input. Pushing messages to one or the other will have a different effect on the final message but both will perform the same tasks until the final part of the pipeline.
SplitStage added after the embedding portion of the pipeline which determines which sink to send each message to.
SplitStage determines where to send each message by looking at the task attached to each ControlMessage
retrieval task is sent to another Kafka topic retrieve_output
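The routing behaviour in the list above can be sketched in plain Python. This is not the Morpheus API: plain dictionaries stand in for ControlMessage, `embed` is a placeholder for the shared embedding step, and the `upload` task name and `vector_db` sink name are assumptions. Only the topic names `upload`, `retrieve_input`, and `retrieve_output` and the `retrieval` task come from the description.

```python
from collections import defaultdict

# Which task is attached to messages arriving on each input topic
# (the "upload" task name is an assumption).
TOPIC_TO_TASK = {"upload": "upload", "retrieve_input": "retrieval"}

# Which sink each task is routed to ("vector_db" is a hypothetical sink name).
TASK_TO_SINK = {"upload": "vector_db", "retrieval": "retrieve_output"}


def make_message(topic, payload):
    """Build a dict standing in for a ControlMessage with a task attached."""
    return {"task": TOPIC_TO_TASK[topic], "payload": payload}


def embed(message):
    """Placeholder for the shared embedding step run on every message."""
    embedded = dict(message)
    embedded["embedding"] = [float(len(message["payload"]))]
    return embedded


def split(messages):
    """Group messages by destination sink based on their attached task."""
    sinks = defaultdict(list)
    for message in messages:
        sinks[TASK_TO_SINK[message["task"]]].append(message)
    return dict(sinks)
```

Both kinds of message pass through `embed` before `split`, which mirrors the point that the two paths perform the same tasks until the final part of the pipeline.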
Completion Criteria
The following items need to be satisfied to consider this issue complete: