
Conversation

@laszukdawid commented Dec 5, 2025

Description

Adds explicit support for VoyageAI embeddings. The embedding functions already existed in the code under anthropic.py, but there was no explicit way to select them via EMBEDDING_MODEL.

Related Issues

None that I've seen.

Changes Made

  • Move the VoyageAI logic out of anthropic.py into a new voyageai.py
  • Add validation for voyageai as an args.embedding_model value

Checklist

  • Changes tested locally
  • Code reviewed
  • Documentation updated (if necessary)
  • Unit tests added (if applicable)

Additional Notes

N/A


Note

Adds first-class VoyageAI embeddings with a new provider module and integrates it into server binding/dispatch; removes VoyageAI code from Anthropic.

  • Embeddings:
    • New provider: Introduces lightrag/llm/voyageai.py with voyageai_embed (retry, AsyncClient) wrapped via wrap_embedding_func_with_attrs (embedding_dim=1024, max_token_size=16000) and helper get_available_embedding_models.
    • Server integration: Adds "voyageai" to supported embedding_binding values and dispatch in create_optimized_embedding_function to call lightrag.llm.voyageai.voyageai_embed.
  • Refactor/Cleanup:
    • Removes VoyageAI embedding code and numpy import from lightrag/llm/anthropic.py.
  • DX:
    • Improves unsupported embedding error message to include the invalid value.

Written by Cursor Bugbot for commit 9afc599.
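The DX improvement mentioned above (an error message that includes the invalid value) could look roughly like this minimal sketch; the set name and message wording are assumptions, not the exact code in lightrag_server.py:

```python
# Illustrative dispatch guard: report the offending binding value in the error.
# SUPPORTED_EMBEDDING_BINDINGS and the message text are assumptions for this sketch.
SUPPORTED_EMBEDDING_BINDINGS = {"openai", "jina", "gemini", "aws_bedrock", "voyageai"}

def resolve_embedding_binding(binding: str) -> str:
    if binding not in SUPPORTED_EMBEDDING_BINDINGS:
        # Including the invalid value makes misconfiguration much easier to debug.
        raise ValueError(
            f"Unsupported embedding binding: {binding!r}. "
            f"Supported values: {sorted(SUPPORTED_EMBEDDING_BINDINGS)}"
        )
    return binding
```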

@laszukdawid changed the title from "feat: voyageai embed support" to "Explicit voyageai embed support" on Dec 5, 2025
@LarFii (Collaborator) commented Dec 11, 2025

@cursor review

"aws_bedrock",
"jina",
"gemini",
"voyageai",

Bug: Missing voyageai in argparse embedding-binding choices

The PR adds voyageai to the validation list in lightrag_server.py but doesn't add it to the argparse choices list in config.py. The argparse configuration at config.py lines 242-250 restricts --embedding-binding to specific values that don't include voyageai. As a result, argparse will reject voyageai before reaching the server validation, making the new embedding binding unusable via command-line arguments or the EMBEDDING_BINDING environment variable.

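A minimal sketch of the fix, assuming an argparse setup along the lines of the one in config.py; the exact flag list shown here is illustrative and would need to match the repo's actual choices:

```python
import argparse

# "voyageai" must appear in the argparse choices list, otherwise argparse
# rejects the value before the server-side validation ever runs.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--embedding-binding",
    default="openai",
    # Illustrative list; the real one lives in config.py around lines 242-250.
    choices=["openai", "ollama", "azure_openai", "aws_bedrock", "jina", "gemini", "voyageai"],
)

args = parser.parse_args(["--embedding-binding", "voyageai"])
```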

model=model,
api_key=api_key,
embedding_dim=embedding_dim,
)

Bug: Model passed unconditionally may override default with None

The voyageai binding passes model=model unconditionally, unlike other bindings (jina, openai, gemini) which check if model: before including it in kwargs. When args.embedding_model is None (user didn't set EMBEDDING_MODEL), this passes model=None to voyageai_embed, which overrides the function's default value of "voyage-3" and subsequently passes None to the VoyageAI API. The API likely requires a valid model name, causing the call to fail.

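The conditional-kwargs pattern the bot describes (used by the jina/openai/gemini bindings) can be sketched as follows; voyageai_embed here is a local stand-in with the relevant signature, not the real provider function:

```python
import asyncio

async def voyageai_embed(texts, model="voyage-3", api_key=None, embedding_dim=1024):
    # Stand-in for the real provider call; returns the resolved model name
    # so the default-vs-override behavior is visible.
    return {"model": model, "n": len(texts)}

async def dispatch(texts, configured_model, api_key, embedding_dim):
    kwargs = {"api_key": api_key, "embedding_dim": embedding_dim}
    if configured_model:  # only pass model when the user actually set one
        kwargs["model"] = configured_model
    # Omitting the key entirely lets the provider's own default apply,
    # instead of clobbering it with None.
    return await voyageai_embed(texts, **kwargs)

result = asyncio.run(dispatch(["hello"], None, "key", 1024))
```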

@laszukdawid (Author):

@LarFii updated

@laszukdawid (Author):

@danielaskdd @LarFii ?

@danielaskdd (Collaborator):

Please refer to the openai_embed function and implement a max_token_size truncation mechanism for the VoyageAI and Anthropic embedding models.

@danielaskdd added the enhancement label on Dec 22, 2025
laszukdawid pushed a commit to laszukdawid/LightRAG that referenced this pull request Dec 28, 2025
This commit addresses all feedback from PR HKUDS#2484 review:

1. Fix model parameter override bug (Critical Bug #2)
   - Add conditional model parameter passing in lightrag_server.py
   - Only pass model parameter if provided by user
   - Prevents None from overriding default "voyage-3" model
   - Mirrors pattern used by jina and openai bindings

2. Enable max_token_size truncation by default
   - Set truncation=True as default in voyageai_embed()
   - Prevents API errors when texts exceed token limits
   - Improves documentation for truncation parameter
   - Adds embedding_dim parameter documentation

Note: Critical Bug #1 (argparse config) was already fixed in commit 2752b01

Addresses feedback from:
- Cursor Bot review (model parameter bug)
- @danielaskdd (truncation mechanism request)

Related: PR HKUDS#2484 on HKUDS/LightRAG
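The truncation mechanism requested above could be sketched as below. A real implementation would count tokens with the provider's tokenizer (as openai_embed does); the whitespace split here is only a stand-in, so the counts are purely illustrative:

```python
def truncate_texts(texts, max_token_size=16000):
    """Truncate each text to at most max_token_size tokens before embedding."""
    truncated = []
    for text in texts:
        tokens = text.split()  # stand-in for the provider's real tokenizer
        if len(tokens) > max_token_size:
            # Keep only the leading tokens so the API call stays within limits.
            text = " ".join(tokens[:max_token_size])
        truncated.append(text)
    return truncated
```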
@danielaskdd (Collaborator):

@codex review


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2752b01f12


Comment on lines +855 to +858
return await actual_func(
texts,
api_key=api_key,
embedding_dim=embedding_dim,


P2: Pass embedding model to VoyageAI provider

The VoyageAI embedding dispatch never forwards the configured model name. This means EMBEDDING_MODEL is ignored for the voyageai binding, so calls always use the default voyage-3. In practice, setting EMBEDDING_MODEL=voyage-3-lite (and EMBEDDING_DIM=512) will still send the default model and then fail dimension validation because the wrapper expects 512 but the API returns 1024. Please pass model when calling voyageai_embed here.


Comment on lines +81 to +85
if not api_key:
api_key = os.environ.get("VOYAGEAI_API_KEY")
if not api_key:
logger.error("VOYAGEAI_API_KEY environment variable not set")
raise ValueError(


P2: Accept legacy VOYAGE_API_KEY for VoyageAI embeddings

The new VoyageAI module only checks VOYAGEAI_API_KEY when no explicit api_key is provided. The previous implementation (removed from anthropic.py in this commit) used VOYAGE_API_KEY, so existing deployments that only set VOYAGE_API_KEY will now hit the ValueError here unless they also set EMBEDDING_BINDING_API_KEY or rename the env var. Consider falling back to VOYAGE_API_KEY to avoid a silent regression.

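The fallback the reviewer suggests could look like this sketch; the function name is illustrative, while the environment variable names come from the review comment itself:

```python
import os

def resolve_voyage_api_key(api_key=None):
    """Prefer an explicit key, then VOYAGEAI_API_KEY, then legacy VOYAGE_API_KEY."""
    key = (
        api_key
        or os.environ.get("VOYAGEAI_API_KEY")
        or os.environ.get("VOYAGE_API_KEY")  # backwards-compatible fallback
    )
    if not key:
        raise ValueError(
            "VoyageAI API key not set; provide api_key or set VOYAGEAI_API_KEY "
            "(or the legacy VOYAGE_API_KEY)"
        )
    return key
```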
