Conversation

@sriramsowmithri9807 (Contributor) commented Jun 13, 2025

Description (fixes #1367)

This PR adds support for using different endpoints for LLM inference and embedding models, which is particularly useful when running separate llama.cpp servers for each function.

Changes Made

  • Added LLM_ENDPOINT and EMBEDDING_ENDPOINT configuration options
  • Updated GenericLLMProvider to handle custom base URLs for different providers (see the sketch after this list)
  • Enhanced embedding initialization to support separate endpoints
  • Improved configuration handling for both LLM and embedding providers
  • Added proper environment variable support for endpoint configuration
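
In essence, each role resolves its own base URL instead of sharing one. As a rough illustration only (the resolution logic below is a simplified assumption, not the PR's actual diff, and it assumes the OpenAI-compatible path via langchain_openai):

```python
import os

from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Hypothetical sketch: the variable names mirror this PR's config options,
# but the wiring here is a simplified assumption, not the PR's code.
llm_endpoint = os.environ.get("LLM_ENDPOINT")               # e.g. http://localhost:8080/v1
embedding_endpoint = os.environ.get("EMBEDDING_ENDPOINT")   # e.g. http://localhost:8081/v1

# FAST_LLM / EMBEDDING use a "provider:model" format, e.g. "openai:llama3".
_, llm_model = os.environ.get("FAST_LLM", "openai:llama3").split(":", 1)
_, emb_model = os.environ.get("EMBEDDING", "openai:llama-embed").split(":", 1)

# Each role gets its own base_url, so chat completions and embeddings
# can target two different llama.cpp servers.
llm = ChatOpenAI(model=llm_model, base_url=llm_endpoint, api_key="sk-no-key-needed")
embeddings = OpenAIEmbeddings(model=emb_model, base_url=embedding_endpoint, api_key="sk-no-key-needed")
```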

How to Test

  1. Set up your environment variables:
    export LLM_ENDPOINT="http://localhost:8080/v1"
    export EMBEDDING_ENDPOINT="http://localhost:8081/v1"
    export FAST_LLM="openai:llama3"  # or your preferred model
    export EMBEDDING="openai:llama-embed"  # or your preferred embedding model
    

Or update your config file:

```json
{
    "LLM_ENDPOINT": "http://localhost:8080/v1",
    "EMBEDDING_ENDPOINT": "http://localhost:8081/v1",
    "FAST_LLM": "openai:llama3",
    "EMBEDDING": "openai:llama-embed"
}
```
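
Either way, you can sanity-check that both servers answer before running the app. This quick test is not part of the PR; it just points the standard openai Python client at each endpoint (the api_key value is a placeholder, since llama.cpp ignores it by default):

```python
from openai import OpenAI

# Point one client at each llama.cpp server (OpenAI-compatible API).
llm_client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-needed")
emb_client = OpenAI(base_url="http://localhost:8081/v1", api_key="sk-no-key-needed")

# Chat completion against the inference server.
chat = llm_client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(chat.choices[0].message.content)

# Embedding against the embedding server.
emb = emb_client.embeddings.create(model="llama-embed", input="hello world")
print(len(emb.data[0].embedding))
```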

sriramsowmithri9807 and others added 2 commits June 13, 2025 17:40
- Added LLM_ENDPOINT and EMBEDDING_ENDPOINT configuration options
- Updated GenericLLMProvider to handle custom base URLs
- Enhanced embedding initialization to use separate endpoints
- Improved configuration handling for both LLM and embedding providers