@RazvanLiviuVarzaru
Contributor

Description

Add MariaDB Vector integrations for retriever & dataprep microservices.

Issues

n/a

Type of change

  • New feature (non-breaking change which adds new functionality)

Dependencies

  • retrievers Dockerfile: libmariadb-dev, build-essential
  • retrievers Python requirements.txt: mariadb, langchain_mariadb
  • dataprep Dockerfile: libmariadb-dev
  • dataprep Python requirements.txt: mariadb, langchain_mariadb

Tests

The following tests build each service's Docker image, run it, and exercise the exposed API endpoints.

cd tests
bash dataprep/test_dataprep_mariadb.sh
bash retrievers/test_retrievers_mariadb.sh

MariaDB Vector was introduced in MariaDB Server 11.7.
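
Because the integration depends on MariaDB Server 11.7 or later, a deployment script may want to check the server version before enabling it. The following is a minimal sketch, not code from this PR: the helper name `supports_mariadb_vector` is hypothetical, and it assumes the version string looks like the output of `SELECT VERSION()` (for example `11.7.2-MariaDB`).

```python
import re


def supports_mariadb_vector(version: str) -> bool:
    """Return True if a MariaDB version string is 11.7 or newer.

    Hypothetical helper for illustration; `version` is assumed to start
    with "<major>.<minor>", as in the output of SELECT VERSION().
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison handles both 11.7+ and any later major release.
    return (major, minor) >= (11, 7)
```

A caller could fetch the string with any MariaDB client and skip the vector store setup when the check returns False.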

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
@lvliang-intel
Collaborator

@RazvanLiviuVarzaru,
please fix the code scan issue.

- md5 is used for the primary key not as a security hash
- fixed mariadb readme headers

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
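
The diff for this fix is not shown in the conversation, so the following is only a sketch of one common way to tell a code scanner that MD5 is being used as a non-security identifier. The function name `doc_primary_key` is hypothetical; `usedforsecurity=False` is a standard `hashlib` keyword available since Python 3.9.

```python
import hashlib


def doc_primary_key(text: str) -> str:
    """Derive a deterministic primary key from document text.

    Hypothetical helper for illustration. MD5 serves here only as a
    compact content fingerprint, not as a cryptographic hash, so it is
    flagged with usedforsecurity=False (Python 3.9+).
    """
    digest = hashlib.md5(text.encode("utf-8"), usedforsecurity=False)
    return digest.hexdigest()
```

The same text always maps to the same 32-character hexadecimal key, which makes repeated ingestion of identical content idempotent at the key level.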
@RazvanLiviuVarzaru RazvanLiviuVarzaru force-pushed the feature/mariadb-vector branch from 42a3038 to 535f4cd Compare May 5, 2025 12:02
@RazvanLiviuVarzaru
Contributor Author

@lvliang-intel fixed in 535f4cd

thanks!

Collaborator

@letonghan letonghan left a comment

LGTM, thanks @RazvanLiviuVarzaru for your contribution!

@chickenrae chickenrae merged commit 9cf40c1 into opea-project:main May 6, 2025
66 of 67 checks passed
@joshuayao joshuayao added this to the v1.4 milestone May 8, 2025
@joshuayao joshuayao added this to OPEA May 8, 2025
@joshuayao joshuayao moved this to Done in OPEA May 8, 2025
@joshuayao joshuayao added the feature New feature or request label May 8, 2025
jilongW pushed a commit to jilongW/GenAIComps that referenced this pull request May 12, 2025
…roject#1645)

* Add MariaDB Vector third-party service

MariaDB Vector was introduced since MariaDB Server 11.7

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* Add retriever MariaDB Vector integration

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* Add dataprep MariaDB Vector integration

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix CI failures

- md5 is used for the primary key not as a security hash
- fixed mariadb readme headers

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

---------

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jilongw <[email protected]>
madison-evans pushed a commit to SAPD-Intel/GenAIComps that referenced this pull request May 12, 2025
…roject#1645)

alexsin368 pushed a commit to alexsin368/GenAIComps that referenced this pull request May 15, 2025
…roject#1645)

Signed-off-by: alexsin368 <[email protected]>
jilongW pushed a commit to jilongW/GenAIComps that referenced this pull request May 15, 2025
…roject#1645)

Signed-off-by: jilongw <[email protected]>
yinghu5 added a commit that referenced this pull request May 16, 2025
* add support for remote server

Signed-off-by: alexsin368 <[email protected]>

* add steps to enable remote server

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove use_remote_service

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add OpenAI models instructions, fix format of commands

Signed-off-by: alexsin368 <[email protected]>

* simplify ChatOpenAI instantiation

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert "simplify ChatOpenAI instantiation"

This reverts commit b7c4acf.

* add back check and logic for llm_engine, set openai_key argument

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Provide ARCH option for lvm-video-llama image build (#1630)

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* Add sglang microservice for supporting llama4 model (#1640)

Signed-off-by: Ye, Xinyu <[email protected]>
Co-authored-by: Lv,Liang1 <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* Remove invalid codeowner. (#1642)

Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* add support for remote server

Signed-off-by: alexsin368 <[email protected]>

* add steps to enable remote server

Signed-off-by: alexsin368 <[email protected]>

* remove use_remote_service

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Signed-off-by: alexsin368 <[email protected]>

* bug fix for chunk_size and overlap cause error in dataprep ingestion (#1643)

* bug fix for dataingest url

Signed-off-by: Mustafa <[email protected]>

* add validation function

Signed-off-by: Mustafa <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* validation update

Signed-off-by: Mustafa <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update validation function

Signed-off-by: Mustafa <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Mustafa <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: alexsin368 <[email protected]>

* MariaDB Vector integrations for retriever & dataprep services (#1645)

* Add MariaDB Vector third-party service

MariaDB Vector was introduced since MariaDB Server 11.7

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* Add retriever MariaDB Vector integration

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* Add dataprep MariaDB Vector integration

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix CI failures

- md5 is used for the primary key not as a security hash
- fixed mariadb readme headers

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

---------

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: alexsin368 <[email protected]>

* update PR reviewers (#1651)

Signed-off-by: chensuyue <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* Expand test matrix, find all tests use 3rd party Dockerfiles (#1676)

* Expand test matrix, find all tests use 3rd party Dockerfiles

Signed-off-by: chensuyue <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* fix the typo of README.md Comp (#1679)

Update README.md for first entry of OPEA

Signed-off-by: alexsin368 <[email protected]>

* Fix request handle timeout issue (#1687)

Signed-off-by: lvliang-intel <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* FEAT: Enable OPEA microservices to start as MCP servers (#1635)

Signed-off-by: alexsin368 <[email protected]>

* Fix huggingface_hub API upgrade issue (#1691)

* Fix huggingfacehub API upgrade issue

Signed-off-by: lvliang-intel <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* add OpenAI models instructions, fix format of commands

Signed-off-by: alexsin368 <[email protected]>

* Fix dataprep opensearch ingest issue (#1697)

Signed-off-by: lvliang-intel <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* Fix embedding issue with ArangoDB due to deprecated HuggingFace API (#1694)

Signed-off-by: lvliang-intel <[email protected]>
Signed-off-by: alexsin368 <[email protected]>

* simplify ChatOpenAI instantiation

Signed-off-by: alexsin368 <[email protected]>

* Revert "simplify ChatOpenAI instantiation"

This reverts commit b7c4acf.

Signed-off-by: alexsin368 <[email protected]>

* add back check and logic for llm_engine, set openai_key argument

Signed-off-by: alexsin368 <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: alexsin368 <[email protected]>
Signed-off-by: ZePan110 <[email protected]>
Signed-off-by: Ye, Xinyu <[email protected]>
Signed-off-by: Mustafa <[email protected]>
Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
Signed-off-by: chensuyue <[email protected]>
Signed-off-by: lvliang-intel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Ying Hu <[email protected]>
Co-authored-by: ZePan110 <[email protected]>
Co-authored-by: Liang Lv <[email protected]>
Co-authored-by: Mustafa <[email protected]>
Co-authored-by: Razvan Liviu Varzaru <[email protected]>
Co-authored-by: chen, suyue <[email protected]>
Co-authored-by: Spycsh <[email protected]>
ZePan110 pushed a commit that referenced this pull request May 16, 2025
* Update prepare_xtune.sh

Signed-off-by: jilongw <[email protected]>

* Update prepare_xtune.sh

Signed-off-by: jilongw <[email protected]>

* MariaDB Vector integrations for retriever & dataprep services (#1645)

* Add MariaDB Vector third-party service

MariaDB Vector was introduced since MariaDB Server 11.7

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* Add retriever MariaDB Vector integration

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* Add dataprep MariaDB Vector integration

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix CI failures

- md5 is used for the primary key not as a security hash
- fixed mariadb readme headers

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>

---------

Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: jilongw <[email protected]>

* update PR reviewers (#1651)

Signed-off-by: chensuyue <[email protected]>
Signed-off-by: jilongw <[email protected]>

* Expand test matrix, find all tests use 3rd party Dockerfiles (#1676)

* Expand test matrix, find all tests use 3rd party Dockerfiles

Signed-off-by: chensuyue <[email protected]>
Signed-off-by: jilongw <[email protected]>

* fix the typo of README.md Comp (#1679)

Update README.md for first entry of OPEA

Signed-off-by: jilongw <[email protected]>

* add version check

Signed-off-by: jilongw <[email protected]>

* add doc

Signed-off-by: jilongw <[email protected]>

* update doc

Signed-off-by: jilongw <[email protected]>

---------

Signed-off-by: jilongw <[email protected]>
Signed-off-by: Razvan-Liviu Varzaru <[email protected]>
Signed-off-by: chensuyue <[email protected]>
Co-authored-by: Razvan Liviu Varzaru <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: chen, suyue <[email protected]>
Co-authored-by: Ying Hu <[email protected]>
Labels

feature New feature or request

Projects

Status: Done

5 participants