[Build] Add S3 dvc remote #227

@rojberr

User Story:

As a machine learning engineer,
I want to configure an S3 bucket as a DVC remote for our ML project,
So that we can securely version large datasets/models, enable team collaboration, and reduce local storage requirements.


Acceptance Criteria:

  1. Cost analysis of the S3 remote completed
  2. S3 bucket created with proper IAM permissions and bucket policies
  3. DVC configured to use S3 as default remote (dvc remote add -d)
  4. Authentication credentials securely stored (AWS CLI profile or env variables)
  5. Existing data pushed to S3 remote (dvc push)
  6. Cost monitoring alerts configured in AWS
  7. Documentation for team members on S3 remote usage
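
Criteria 2 to 5 could be sketched roughly as follows. This is a minimal illustration, not the project's actual setup: the bucket name `example-bucket`, the prefix `dvc-store`, the remote name `storage`, and the profile `ml-team` are all hypothetical placeholders. The policy grants only the four S3 actions that DVC's documentation lists for an S3 remote, and the helper only builds the DVC command lines rather than running them.

```python
import json
from typing import List, Optional


def dvc_s3_policy(bucket: str, prefix: str = "") -> dict:
    """Build a least-privilege IAM policy document for a DVC S3 remote."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    clean = prefix.strip("/")
    # Object-level actions are scoped to the prefix when one is given.
    object_arn = f"{bucket_arn}/{clean}/*" if clean else f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Listing is a bucket-level action.
            {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": [bucket_arn]},
            # Read, write, and delete are object-level actions.
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [object_arn],
            },
        ],
    }


def dvc_remote_commands(
    remote: str, bucket: str, prefix: str = "", profile: Optional[str] = None
) -> List[List[str]]:
    """Return the DVC commands that register the S3 remote and push data."""
    clean = prefix.strip("/")
    url = f"s3://{bucket}/{clean}" if clean else f"s3://{bucket}"
    # -d marks the new remote as the project default.
    commands = [["dvc", "remote", "add", "-d", remote, url]]
    if profile:
        # --local writes to .dvc/config.local, which DVC keeps out of Git.
        commands.append(
            ["dvc", "remote", "modify", "--local", remote, "profile", profile]
        )
    commands.append(["dvc", "push"])
    return commands


if __name__ == "__main__":
    # All names below are placeholders, not values from this issue.
    print(json.dumps(dvc_s3_policy("example-bucket", "dvc-store"), indent=2))
    for argv in dvc_remote_commands("storage", "example-bucket", "dvc-store", "ml-team"):
        print(" ".join(argv))
```

Keeping the profile name in the local config means each team member can point the same remote at their own AWS CLI profile without committing credentials-related settings.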

Definition of Done:

  • All acceptance criteria met
  • Code reviewed and approved
  • Validation tests for push/pull operations pass
  • Documentation updated in /docs/data_management.md
  • Team members can successfully pull data from S3
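
One possible shape for the push/pull validation is sketched below. It is an assumption about how such a check might look, not an agreed test plan: it only confirms that `dvc push` and `dvc pull` exit successfully, and the `runner` parameter is a hypothetical hook that lets the check be exercised without AWS access.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes a command line and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code (needs the dvc CLI on PATH)."""
    return subprocess.run(list(argv), check=False).returncode


def validate_round_trip(runner: Runner = run_command) -> List[str]:
    """Run dvc push, then dvc pull; return the commands that failed."""
    steps = [["dvc", "push"], ["dvc", "pull"]]
    failed = []
    for argv in steps:
        # A non-zero exit code is treated as a failed step.
        if runner(argv) != 0:
            failed.append(" ".join(argv))
    return failed
```

Calling `validate_round_trip()` inside the repository returns an empty list when both commands succeed. A stricter variant would run the pull in a fresh clone, so that data already in the local cache cannot hide a broken remote.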
