[Bug] dataprep/retriever uses different environment variables for different DB backend #1239

@lianhao

Priority

P3

OS type

Ubuntu

Hardware type

Xeon-GNR

Installation method

  • Pull docker images from hub.docker.com
  • Build docker images from source
  • Other

Deploy method

  • Docker
  • Docker Compose
  • Kubernetes Helm Charts
  • Other

Running nodes

Single Node

What's the version?

1.2

Description

Currently, for the following common settings, dataprep and retriever use different environment variable names depending on the DB backend. We should unify them so that each setting is controlled by a single environment variable name, which reduces configuration confusion for end users:

  • Endpoint of the backend TEI service: some backends use TEI_ENDPOINT, others use TEI_EMBEDDING_ENDPOINT. Suggestion: use TEI_EMBEDDING_ENDPOINT.
  • Local embedding model: some backends use EMBED_MODEL, others use LOCAL_EMBEDDING_MODEL. Suggestion: use EMBED_MODEL.

While making this change, we should also keep backward compatibility so that old configurations still work with the new service, with the new environment variable names taking precedence, e.g.:

TEI_EMBEDDING_ENDPOINT = os.getenv("TEI_EMBEDDING_ENDPOINT") or os.getenv("TEI_ENDPOINT")
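The same fallback could be factored into a small helper so that every backend resolves both settings the same way. The following is only a sketch of that idea: the helper name `getenv_with_fallback`, its `environ` parameter, and the deprecation warning are assumptions for illustration and are not part of the existing dataprep/retriever code. As with the `or` expression above, an empty value is treated as unset.

```python
import os
import warnings


def getenv_with_fallback(preferred, legacy, default=None, environ=None):
    """Return the value of `preferred`, falling back to `legacy`, then `default`.

    Hypothetical helper: `environ` can be any mapping (defaults to os.environ),
    which makes the lookup easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ

    # The new name takes precedence; an empty string counts as "not set".
    value = env.get(preferred)
    if value:
        return value

    # Old configurations keep working, but the user is nudged to migrate.
    legacy_value = env.get(legacy)
    if legacy_value:
        warnings.warn(
            f"{legacy} is deprecated; set {preferred} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return legacy_value

    return default


# The two settings named in this issue, resolved with the proposed precedence.
TEI_EMBEDDING_ENDPOINT = getenv_with_fallback("TEI_EMBEDDING_ENDPOINT", "TEI_ENDPOINT")
EMBED_MODEL = getenv_with_fallback("EMBED_MODEL", "LOCAL_EMBEDDING_MODEL")
```

Whether to warn on the legacy names or accept them silently is a design choice; the warning is included here only to show one way of steering users toward the unified names.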

Reproduce steps

n/a

Raw log

Attachments

No response

Metadata

Labels

A3Maintain, bug (Something isn't working)
