Skip to content

Conversation

@edlee123
Copy link
Contributor

Description

  1. Allow dataprep/src/integrations/neo4j_llama_index.py and retrievers/src/integrations/neo4j.py to use openai-like endpoints + api_key.
  2. generate_community_summary updated to use open ai API specification to handle max_token. Previously this method relied on hugging face tokenizer to evaluate token limits, and therefore only allowed endpoints with models on Hugging Face.
  3. Pinned compatible versions of llama-index-llms-openai vs openai. The import from llama_index.llms.openai import OpenAI failed with unpinned packages. llama index required the ResponseTextAnnotationDeltaEvent and was no longer available in openai>1.66.3.

Issues

List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

List the newly introduced 3rd party dependency if exists.

Tests

Describe the tests that you ran to verify your changes.

edlee123 added 5 commits May 23, 2025 09:51
Signed-off-by: Ed Lee <[email protected]>
openai like endpoints with key TGI_LLM_ENDPOINT_KEY

Signed-off-by: Ed Lee <[email protected]>
…onent. The latest openai does not have ResponseTextAnnotationDeltaEvent required by these components and so pinned openai==1.66.3

Signed-off-by: Ed Lee <[email protected]>
… limits. dont need to use hugging face tokenizer to trim messages

Signed-off-by: Ed Lee <[email protected]>
@rbrugaro
Copy link
Collaborator

@aMahanna any ideas about arango retriever failed test? I can reproduce it when running locally, I tried pinning some package versions but still same error. it's a 500 code seems ingestion issue.. dataprep ingestion was fine. thanks!

@aMahanna
Copy link
Contributor

aMahanna commented May 30, 2025

@aMahanna any ideas about arango retriever failed test? I can reproduce it when running locally, I tried pinning some package versions but still same error. it's a 500 code seems ingestion issue.. dataprep ingestion was fine. thanks!

Strange...Looks like an env var issue.

[2025-05-29 03:09:42,292] [    INFO] - OPEA_RETRIEVER_ARANGODB - Graph name: , Start Collection name: _ENTITY
[2025-05-29 03:09:42,341] [   ERROR] - opea_retrievers_microservice - [ retrieval ] Error during retrieval invocation: [HTTP 404][ERR 1924] graph 'vertex' not found

Graph Name seems to be empty here, even though it has a default if no env var is provided: https://github.com/opea-project/GenAIComps/blob/main/comps/retrievers/src/integrations/config.py#L203

In any case this PR has no direct relation to ArangoDB, so feel free to proceed with merging while I investigate this

@aMahanna
Copy link
Contributor

@rbrugaro I've opened #1764 to investigate, will let you know

@rbrugaro
Copy link
Collaborator

Thank you @aMahanna!

Copy link
Collaborator

@ashahba ashahba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

…ens for when a hf tokenizer is available or not available

Signed-off-by: Ed Lee <[email protected]>
@edlee123 edlee123 requested a review from rbrugaro June 5, 2025 20:14
Copy link
Collaborator

@rbrugaro rbrugaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing back the HF tokenizer trimming method and addressing the case when that's not available. Once tests pass is good to merge

Signed-off-by: Ed Lee <[email protected]>
@joshuayao joshuayao added this to OPEA Jun 6, 2025
@joshuayao joshuayao added this to the v1.4 milestone Jun 6, 2025
@joshuayao joshuayao moved this to In review in OPEA Jun 6, 2025
@chickenrae chickenrae merged commit b0572b9 into opea-project:main Jun 9, 2025
38 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in OPEA Jun 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

6 participants