
[FEA]: Create a NeMo Service and NeMo Stage #1130

Closed
2 tasks done
Tracked by #1140 ...
mdemoret-nv opened this issue Aug 18, 2023 · 1 comment · May be fixed by #1204
Labels
feature request (New feature or request) · sherlock (Issues/PRs related to Sherlock workflows and components)


@mdemoret-nv
Contributor

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

High

Please provide a clear description of problem this feature solves

This feature would allow Morpheus pipelines to integrate with NVIDIA's NeMo LLM service by sending inference requests to the service from a stage in the pipeline.

The ability to run LLM models from a Morpheus pipeline will allow pipelines to execute complex NLP tasks with very large models. These models are often too large to run inside of a Morpheus pipeline itself, so sending the requests off to an external service fits well with how other inference services, such as Triton, are already used.

Describe your ideal solution

This new feature should be built from 2 components:

  1. A NeMo LLM service which lives outside of the pipeline
    1. The LLM service should live outside of the pipeline so that multiple stages can utilize the same LLM model while batching their requests together. The best way to do this is to make a singleton service which can be accessed at any time from multiple stages (see the sketch after this list).
    2. Requests sent to this service should be batched and then sent off to the NeMo endpoint via the nemo_llm library (Python) or cURL (C++)
  2. A NeMo LLM Inference Stage which lives inside of the pipeline
    1. This stage will primarily interact with the LLM service, sending input messages to it for inference
    2. Messages returned from the LLM service will be sent to the next stage in the pipeline.
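
As a rough illustration of the singleton-plus-batching idea, here is a minimal sketch in Python. All names (`NeMoLLMService`, `generate`, etc.) are hypothetical, and the actual NeMo call is left as a placeholder rather than a real `nemo_llm` invocation:

```python
import queue
import threading


class NeMoLLMService:
    """Hypothetical process-wide singleton that batches prompts from many stages."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls, *args, **kwargs):
        # Ensure every stage gets the same service instance.
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, batch_size: int = 32):
        if getattr(self, "_initialized", False):
            return
        self._initialized = True
        self._batch_size = batch_size
        self._pending = queue.Queue()  # (prompt, reply queue) pairs from stages
        threading.Thread(target=self._batch_loop, daemon=True).start()

    def generate(self, prompt: str) -> str:
        """Called by a stage; blocks until the batched response is available."""
        reply = queue.Queue(maxsize=1)
        self._pending.put((prompt, reply))
        return reply.get()

    def _batch_loop(self):
        while True:
            batch = [self._pending.get()]  # wait for at least one request
            while len(batch) < self._batch_size and not self._pending.empty():
                batch.append(self._pending.get_nowait())

            prompts = [p for p, _ in batch]
            # Placeholder for the real batched call, e.g. via the nemo_llm client
            # (Python) or an HTTP/cURL request to the NeMo endpoint (C++).
            responses = [f"<response for: {p}>" for p in prompts]

            for (_, reply), resp in zip(batch, responses):
                reply.put(resp)
```

Because the service is a singleton, any number of stages can call `NeMoLLMService().generate(...)` concurrently and their prompts will be coalesced into batched requests.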

Configurable Options
The NeMo Inference stage should include (but not be limited to) the following configurable parameters (see the configuration sketch after this list):

  • The column containing the text to use for the inference request
  • The model name
  • The model customization ID
  • NeMo endpoint
  • API Key
  • Organization Key
  • Any model parameters that nemo_llm supports
    • For example, tokens_to_generate, stop, temperature, etc.
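
Purely as an illustration, the options above could be grouped into a configuration object along these lines. All names (`NeMoInferenceStageConfig`, `input_column`, the endpoint URL, the model name, etc.) are assumptions made for this sketch, not the final Morpheus API:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class NeMoInferenceStageConfig:
    input_column: str                        # DataFrame column containing the prompt text
    model_name: str                          # name of the NeMo model to invoke
    customization_id: Optional[str] = None   # model customization ID, if any
    nemo_endpoint: str = "https://nemo.example.invalid"  # placeholder endpoint URL
    api_key: Optional[str] = None
    org_id: Optional[str] = None
    # Extra parameters forwarded to nemo_llm, e.g. tokens_to_generate, stop, temperature
    model_kwargs: dict = field(default_factory=dict)


# Example usage with made-up values:
config = NeMoInferenceStageConfig(
    input_column="body",
    model_name="example-nemo-model",
    model_kwargs={"tokens_to_generate": 100, "temperature": 0.7},
)
```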

Describe any alternatives you have considered

A test prototype has been created here: https://github.com/mdemoret-nv/Morpheus/tree/mdd_nemo-stage/examples/nemo

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request
@mdemoret-nv mdemoret-nv added the feature request New feature or request label Aug 18, 2023
@mdemoret-nv mdemoret-nv added this to the 23.11 - Sherlock milestone Aug 21, 2023
@mdemoret-nv mdemoret-nv added the sherlock Issues/PRs related to Sherlock workflows and components label Sep 8, 2023
@mdemoret-nv
Contributor Author

Closing since this was completed in 23.11.
