
@lianhao
Collaborator

@lianhao lianhao commented Mar 31, 2025

Description

This PR adds partial support for running dataprep in an air-gapped environment. It has only been tested with the redis, milvus, and qdrant backends.

Issues

Partially addresses #1488.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • New feature (non-breaking change which adds new functionality)

Dependencies

List the newly introduced 3rd party dependency if exists.

Tests

Describe the tests that you ran to verify your changes.

Contributor

@eero-t eero-t left a comment

I think there should be Dockerfile comments stating that these models and nltk are redis dependencies.

@lianhao lianhao changed the title from "dataprep: support airgap env for redis backend" to "dataprep: support airgap env" Apr 1, 2025
@lianhao
Collaborator Author

lianhao commented Apr 1, 2025

> I think there should be Dockerfile comments stating that these models and nltk are redis dependencies.

Actually, these are common to most DB backends, but I've only tested it on redis. I've updated the description to be more precise.

@joshuayao joshuayao linked an issue Apr 17, 2025 that may be closed by this pull request
@joshuayao joshuayao added this to the v1.4 milestone Apr 17, 2025
@joshuayao joshuayao added the r1.4 label Apr 17, 2025
@lianhao lianhao force-pushed the 1488 branch 8 times, most recently from f5db704 to ac7b982 Compare May 19, 2025 08:16
@lianhao lianhao marked this pull request as draft May 19, 2025 09:02
Contributor

@eero-t eero-t left a comment

Some doc/comment suggestions.

- Enhance dataprep so it can run in an air-gapped environment.

- Add documentation on how to run dataprep in an air-gapped
  environment.

Related to bug opea-project#1488.

Signed-off-by: Lianhao Lu <[email protected]>
@lianhao lianhao marked this pull request as ready for review May 20, 2025 06:19
@lianhao lianhao changed the title from "dataprep: support airgap env" to "dataprep: support air gapped env" May 26, 2025
@yinghu5 yinghu5 requested review from Copilot and yinghu5 June 6, 2025 02:16

Copilot AI left a comment

Pull Request Overview

This PR introduces partial support for running the dataprep microservice in air-gapped environments by adding offline mode configurations for redis, qdrant, and milvus backends. Key changes include:

  • Adding an optional "offline" parameter to start and validate service functions in the test scripts.
  • Updating documentation and Docker configurations to guide air-gapped setup.
  • Extending docker-compose configurations to support offline service variants.
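
As an illustration of the optional "offline" parameter described above, here is a minimal sketch of how a test script might pick a compose service variant. The function names (`compose_service_for`, `start_service_cmd`) and the `dataprep-<backend>-offline` naming scheme are assumptions made for this example and are not taken from the PR; the command is printed rather than executed so the sketch runs without Docker.

```shell
# Hypothetical sketch: function names and service naming are illustrative only.

# Map a backend and an optional mode argument to a compose service name.
# $1: backend (for example redis); $2: optional mode, "offline" selects
# the air-gapped variant, anything else (or nothing) selects the default.
compose_service_for() {
    local backend="$1"
    local mode="${2:-online}"
    if [ "$mode" = "offline" ]; then
        echo "dataprep-${backend}-offline"   # assumed naming scheme
    else
        echo "dataprep-${backend}"
    fi
}

# Print the command a start function might run for the chosen variant.
start_service_cmd() {
    local service
    service="$(compose_service_for "$@")"
    echo "docker compose up -d ${service}"
}
```

Calling `start_service_cmd redis` would print the command for the default service, while `start_service_cmd redis offline` would print the command for the offline variant. Keeping the parameter optional means existing callers that pass no mode keep their current behavior.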

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/dataprep/test_dataprep_redis.sh | Added offline parameter handling and adjusted service startup logic. |
| tests/dataprep/test_dataprep_qdrant.sh | Introduced offline mode with corresponding start and validate logic. |
| tests/dataprep/test_dataprep_milvus.sh | Added offline mode support with a minor inconsistency in cleanup logic. |
| tests/dataprep/dataprep_utils.sh | Added a new utility function for pre-downloading dataprep models. |
| comps/dataprep/src/README_{redis,qdrant,milvus}.md | Updated documentation to include air gapped environment instructions. |
| comps/dataprep/src/Dockerfile | Created a data directory and configured environment variables for offline mode. |
| comps/dataprep/deployment/docker_compose/compose.yaml | Extended docker-compose definitions to include offline service variants. |
| comps/dataprep/README.md | Added common guide and steps for running dataprep in an air gapped env. |
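
To make the "environment variables for offline mode" idea concrete, here is a rough sketch of the kind of settings an air-gapped container could use so that Hugging Face models and nltk data are read from a pre-populated directory. `HF_HUB_OFFLINE`, `HF_HOME`, and `NLTK_DATA` are real variables honored by huggingface_hub and nltk, but which variables this PR actually sets, and the `/data` default path, are assumptions here; `offline_env` is a hypothetical helper.

```shell
# Hypothetical helper: prints KEY=VALUE lines for an offline run.
# $1: directory holding the pre-downloaded models and nltk data
#     (defaults to /data, an assumed path).
offline_env() {
    local data_dir="${1:-/data}"
    cat <<EOF
HF_HUB_OFFLINE=1
HF_HOME=${data_dir}/huggingface
NLTK_DATA=${data_dir}/nltk_data
EOF
}
```

The output could be written to a file with `offline_env /data > offline.env` and passed to a container through an env file, provided the same directory was filled on a connected machine beforehand and is mounted into the container.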

Collaborator

@yinghu5 yinghu5 left a comment

LGTM, thank you!

Contributor

@eero-t eero-t left a comment

Approved, looks OK to me.

@lianhao lianhao merged commit 711a305 into opea-project:main Jun 13, 2025
41 of 43 checks passed

Development

Successfully merging this pull request may close these issues.

[Feature] dataprep: support air-gapped environment

6 participants