
Conversation

@tomaarsen

Description

  • Add Sentence Transformers embeddings & reranking

Related Issues

None

Changes Made

  1. Add sentence_transformers_embed with automatically inferred embedding_dim and max_token_size, allowing you to use any model from https://huggingface.co/models?library=sentence-transformers
  2. Add sentence_transformers_rerank, allowing you to use any model from https://huggingface.co/models?pipeline_tag=text-ranking&library=sentence-transformers (a hedged usage sketch follows this list)
  3. Update READMEs
  4. Add a demo script based on lightrag_hf_demo.py
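
For orientation, a rough usage sketch of the new reranker. The exact signature of sentence_transformers_rerank isn't shown in this thread, so the query/documents parameter names and the example model below are assumptions; the embedding helper is demonstrated with its real signature further down the thread.

import asyncio

from sentence_transformers import CrossEncoder

from lightrag.llm.sentence_transformers import sentence_transformers_rerank


async def main():
    # Any text-ranking model from the Hub should work here.
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    # NOTE: `query` and `documents` are assumed parameter names, for illustration only.
    results = await sentence_transformers_rerank(
        query="What is LightRAG?",
        documents=["LightRAG is a RAG framework.", "Unrelated text."],
        model=model,
    )
    print(results)


if __name__ == "__main__":
    asyncio.run(main())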

Checklist

  • Changes tested locally
  • Code reviewed
  • Documentation updated (if necessary)
  • Unit tests added (if applicable)

Additional Notes

I wasn't able to get lightrag_hf_demo.py to run nicely, so I couldn't extensively test the new demo lightrag_sentence_transformers_demo.py either, but the embedding and reranking components do work on their own.

  • Tom Aarsen

@danielaskdd
Collaborator

@codex review


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

choices=[
    "lollms",
    "ollama",
    "openai",
    "azure_openai",

P2: Allow CLI to select sentence_transformers embedding binding

Even though create_app adds logic for a sentence_transformers embedding binding, the CLI still restricts --embedding-binding to the list shown here, so passing --embedding-binding sentence_transformers results in an argparse "invalid choice" error. Because of this omission the new embedding provider cannot be enabled from the server’s documented configuration interface.
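
A minimal sketch of the fix this comment points at, assuming the choices list quoted above is simply extended in place:

choices=[
    "lollms",
    "ollama",
    "openai",
    "azure_openai",
    "sentence_transformers",  # proposed addition so argparse accepts the new binding
    # ... remaining existing choices unchanged
],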


"--rerank-binding",
type=str,
default=get_env_value("RERANK_BINDING", DEFAULT_RERANK_BINDING),
choices=["null", "cohere", "jina", "aliyun"],

P2: Add sentence_transformers to rerank binding choices

Similarly, the argparse definition for --rerank-binding still lists only null|cohere|jina|aliyun even though the server now maps a sentence_transformers reranker (lightrag/api/lightrag_server.py lines 943‑998). Attempting to run the API with --rerank-binding sentence_transformers is rejected before the app starts, making the newly added reranker unreachable via the supported CLI/environment configuration.
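
Correspondingly, a sketch with the new binding added to the rerank flag (assuming the surrounding parser.add_argument call is otherwise unchanged):

parser.add_argument(
    "--rerank-binding",
    type=str,
    default=get_env_value("RERANK_BINDING", DEFAULT_RERANK_BINDING),
    choices=["null", "cohere", "jina", "aliyun", "sentence_transformers"],  # proposed addition
)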


Comment on lines 790 to 794
        sentence_transformers_embed.func
        if isinstance(sentence_transformers_embed, EmbeddingFunc)
        else sentence_transformers_embed
    )
    return await actual_func(texts, embedding_dim=embedding_dim)


P1: Wire sentence_transformers binding with required model

Selecting the new sentence_transformers embedding binding crashes immediately. The server block calls actual_func(texts, embedding_dim=embedding_dim) without ever constructing or passing a SentenceTransformer instance, yet sentence_transformers_embed (lightrag/llm/sentence_transformers.py lines 13‑32) requires a model positional argument and does not accept embedding_dim. As soon as the binding is chosen the API raises TypeError (missing model / unexpected embedding_dim), so the embedding provider cannot be used at all.
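
One possible shape for the fix, sketched under assumptions (the config field holding the model name isn't shown in this thread, so the hard-coded name below is a stand-in): construct the SentenceTransformer once at startup, then pass it through instead of the unsupported embedding_dim.

from sentence_transformers import SentenceTransformer

# Hypothetical: in the real server this name would come from the embedding configuration.
st_model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")


async def _sentence_transformers_embed(texts):
    actual_func = (
        sentence_transformers_embed.func
        if isinstance(sentence_transformers_embed, EmbeddingFunc)
        else sentence_transformers_embed
    )
    # sentence_transformers_embed requires `model` and does not accept `embedding_dim`.
    return await actual_func(texts=texts, model=st_model)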


@danielaskdd
Collaborator

Please implement the embedding function in accordance with the OpenAI embedding standards. Ensure the implementation includes robust error handling or truncation logic to prevent failures when the input text exceeds the max_token_size.

@danielaskdd added the enhancement (New feature or request) label on Dec 23, 2025
@tomaarsen
Author

> Ensure the implementation includes robust error handling or truncation logic to prevent failures when the input text exceeds the max_token_size.

Sentence Transformers internally takes care of truncation in both the SentenceTransformer and CrossEncoder classes, so we're all good in that regard. See for example:

import asyncio

import numpy as np
from sentence_transformers import SentenceTransformer

from lightrag.llm.sentence_transformers import sentence_transformers_embed


async def main():
    texts = [
        "This is a test sentence.",
        "Another sentence for embedding, except this one is extremely long." * 1000,
    ]
    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    embedded = await sentence_transformers_embed(texts=texts, model=model)
    print(np.array(embedded).shape)
    # (2, 384)


if __name__ == "__main__":
    asyncio.run(main())

  • Tom Aarsen
