✨ Add exporter code to storage #7218

Open
wants to merge 60 commits into
base: master

GitHK (Contributor) commented Feb 12, 2025

What do these changes do?

Adds functionality to simcore_s3_dsm that exports an archive, given a list of S3 object_keys.

Exposes this functionality to the frontend via the Celery job endpoint.
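
For readers unfamiliar with the idea, here is a minimal sketch of "archive from a list of object keys". It is not the code in this PR: it works on an in-memory mapping instead of S3, and `build_archive` and `list_archive` are hypothetical names used only for illustration.

```python
import io
import zipfile


def build_archive(objects: dict[str, bytes]) -> bytes:
    """Pack {object_key: content} pairs into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for object_key, content in sorted(objects.items()):
            # the object key is reused as the entry path inside the archive
            archive.writestr(object_key, content)
    return buffer.getvalue()


def list_archive(data: bytes) -> list[str]:
    """Return the entry names stored in a ZIP archive given as bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()
```

A real implementation would presumably stream objects from S3 rather than hold them in memory; the sketch only shows the key-to-entry mapping.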

Related issue/s

How to test

Dev-ops checklist

@GitHK GitHK self-assigned this Feb 12, 2025
@GitHK GitHK added the a:storage issue related to storage service label Feb 12, 2025
@GitHK GitHK added this to the Singularity milestone Feb 12, 2025

codecov bot commented Feb 12, 2025

Codecov Report

Attention: Patch coverage is 96.15385% with 4 lines in your changes missing coverage. Please review.

Project coverage is 87.24%. Comparing base (b5230f1) to head (2766d20).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7218      +/-   ##
==========================================
- Coverage   87.31%   87.24%   -0.08%     
==========================================
  Files        1712     1550     -162     
  Lines       66429    62729    -3700     
  Branches     1125      909     -216     
==========================================
- Hits        58004    54726    -3278     
+ Misses       8105     7733     -372     
+ Partials      320      270      -50     
Flag Coverage Δ
integrationtests 65.29% <100.00%> (-0.02%) ⬇️
unittests 86.35% <96.15%> (-0.14%) ⬇️
Components Coverage Δ
api ∅ <ø> (∅)
pkg_aws_library ∅ <ø> (∅)
pkg_dask_task_models_library ∅ <ø> (∅)
pkg_models_library 92.04% <100.00%> (ø)
pkg_notifications_library 84.57% <ø> (ø)
pkg_postgres_database ∅ <ø> (∅)
pkg_service_integration 70.03% <ø> (ø)
pkg_service_library 72.25% <100.00%> (-0.01%) ⬇️
pkg_settings_library ∅ <ø> (∅)
pkg_simcore_sdk 85.46% <ø> (ø)
agent 96.46% <ø> (ø)
api_server 90.68% <ø> (ø)
autoscaling 96.08% <ø> (ø)
catalog 92.14% <ø> (ø)
clusters_keeper 99.24% <ø> (ø)
dask_sidecar 91.25% <ø> (ø)
datcore_adapter 98.11% <ø> (ø)
director 76.68% <ø> (ø)
director_v2 91.25% <ø> (-0.05%) ⬇️
dynamic_scheduler 97.33% <ø> (ø)
dynamic_sidecar 90.11% <ø> (ø)
efs_guardian 89.79% <ø> (ø)
invitations 93.28% <ø> (ø)
payments 92.66% <ø> (ø)
resource_usage_tracker 89.12% <ø> (ø)
storage 85.84% <95.65%> (+1.55%) ⬆️
webclient ∅ <ø> (∅)
webserver 85.90% <100.00%> (+0.04%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@GitHK GitHK marked this pull request as ready for review March 20, 2025 13:57
pcrespov (Member) left a comment

thx

@@ -45,7 +46,9 @@
)

# Storage basic file ID
-SIMCORE_S3_FILE_ID_RE = rf"^(api|({UUID_RE_BASE}))\/({UUID_RE_BASE})\/(.+)$"
+SIMCORE_S3_FILE_ID_RE = (
+    rf"^(api\/{UUID_RE_BASE}|exports\/\d+|{UUID_RE_BASE}\/{UUID_RE_BASE})\/(.+)$"
Member

TIP: this might be easier to read with named constants. It would lead to something like

rf"^({API_PREFIX_RE}|{EXPORTS_USER_RE}|{UUID_RE_BASE}\/{UUID_RE_BASE})\/(.+)$"
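
A runnable sketch of that suggestion, under assumptions: `UUID_RE_BASE` below is a stand-in and may differ from the project's definition, and `API_PREFIX_RE`, `EXPORTS_USER_RE`, `PROJECT_NODE_RE` and `is_valid_file_id` are hypothetical names, not identifiers from the codebase.

```python
import re

# Stand-in for the project's UUID_RE_BASE (assumed shape)
UUID_RE_BASE = (
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)

# One named constant per accepted key prefix (hypothetical names)
API_PREFIX_RE = rf"api\/{UUID_RE_BASE}"
EXPORTS_USER_RE = r"exports\/\d+"
PROJECT_NODE_RE = rf"{UUID_RE_BASE}\/{UUID_RE_BASE}"

SIMCORE_S3_FILE_ID_RE = (
    rf"^({API_PREFIX_RE}|{EXPORTS_USER_RE}|{PROJECT_NODE_RE})\/(.+)$"
)


def is_valid_file_id(object_key: str) -> bool:
    """True if the key starts with one of the three prefixes and has a file part."""
    return re.match(SIMCORE_S3_FILE_ID_RE, object_key) is not None
```

The parentheses around the alternation matter: without them the `^` anchor and the trailing `\/(.+)$` would each bind to only one branch.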

@pytest.mark.parametrize(
    "object_key",
    [
        f"api/{UUID_0}/some-random-file.png",
Member

Seeing these, I would rather add a union of three patterns
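
One possible reading of "a union of three patterns", sketched with plain `re`: each key shape gets its own pattern and a key is accepted if any of them matches. The names `UUID_RE`, `_PATTERNS` and `classify_object_key` are hypothetical, and the UUID pattern is a stand-in for the project's own.

```python
import re
from typing import Optional

# Stand-in UUID pattern (assumed shape)
UUID_RE = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"

# One independent pattern per kind of object key
_PATTERNS: dict[str, re.Pattern[str]] = {
    "api": re.compile(rf"^api/{UUID_RE}/.+$"),
    "exports": re.compile(r"^exports/\d+/.+$"),
    "project": re.compile(rf"^{UUID_RE}/{UUID_RE}/.+$"),
}


def classify_object_key(object_key: str) -> Optional[str]:
    """Return the name of the first matching pattern, or None if none match."""
    for kind, pattern in _PATTERNS.items():
        if pattern.match(object_key):
            return kind
    return None
```

Keeping the patterns separate makes it easy to parametrize tests per kind, at the cost of no longer having a single regex to hand to a validator.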

@@ -334,6 +335,10 @@ async def get_file_access_rights(
# ownership still not defined, so we assume it is user_id
return AccessRights.all()

if parent == "exports":
Member

Use a shared constant, or make sure it stays in sync via tests.
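
A sketch of what that could look like, assuming a hypothetical constant `EXPORTS_S3_PREFIX` that both the regex and the access-rights check import. `is_export_key` and the test function are illustrative names, not code from this PR.

```python
import re
from typing import Final

# Hypothetical shared constant: the single source of truth for the prefix
EXPORTS_S3_PREFIX: Final[str] = "exports"

# The regex fragment is derived from the constant instead of repeating the literal
EXPORTS_USER_RE: Final[str] = rf"{re.escape(EXPORTS_S3_PREFIX)}\/\d+"


def is_export_key(file_id: str) -> bool:
    """Mirror of the `parent == "exports"` check, using the shared constant."""
    parent = file_id.split("/", maxsplit=1)[0]
    return parent == EXPORTS_S3_PREFIX


def test_prefix_and_regex_stay_in_sync() -> None:
    # a key built from the constant must satisfy both the check and the regex
    sample = f"{EXPORTS_S3_PREFIX}/42/archive.zip"
    assert is_export_key(sample)
    assert re.match(rf"^{EXPORTS_USER_RE}\/.+$", sample)
```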

for entry in meta_data_files:
    source_object_keys.add(entry.object_key)

_logger.debug("will archive '%s' files", len(source_object_keys))
Member

MINOR: a log context might be more informative?

await self.abort_file_upload(
    user_id=user_id, file_id=destination_object_key
)
raise
Member

who is handling this exception?
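
For context, a minimal sketch of the clean-up-then-re-raise pattern in the snippet above and of a caller that handles the re-raised exception. Everything here is hypothetical: `FakeStorage`, `_upload_archive`, `create_archive` and `run_export_job` only mimic the shape of the code under review, and the failure is simulated.

```python
import asyncio


class FakeStorage:
    """Hypothetical stand-in for the storage class; only what the sketch needs."""

    def __init__(self) -> None:
        self.aborted: list[str] = []

    async def abort_file_upload(self, *, user_id: int, file_id: str) -> None:
        self.aborted.append(file_id)

    async def _upload_archive(self, destination_object_key: str) -> None:
        raise RuntimeError("upload failed")  # simulated failure

    async def create_archive(self, *, user_id: int, destination_object_key: str) -> None:
        try:
            await self._upload_archive(destination_object_key)
        except Exception:
            # clean up the partial upload, then let the caller decide what to do
            await self.abort_file_upload(
                user_id=user_id, file_id=destination_object_key
            )
            raise


async def run_export_job(storage: FakeStorage, user_id: int, key: str) -> str:
    """Caller that owns the exception and turns it into a job status."""
    try:
        await storage.create_archive(user_id=user_id, destination_object_key=key)
    except RuntimeError as err:
        return f"FAILED: {err}"
    return "SUCCESS"
```

In this sketch the job runner is the handler; which layer plays that role in the PR (for example the Celery task wrapper) is exactly what the question asks.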

Labels
a:storage issue related to storage service
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement functionality for fetching files/folders from s3, zipping them and uploading the zip to s3
3 participants