Making Storage syncing explicit and moving it to SYNC_FILE_MOUNTS stage
#1028
romilbhardwaj started this conversation in RFC
+1. Making that API lazy (or lazier) is more in line with other parts of our system and dataflow systems.
I like the proposed method. I was confused by the
There are two possible solutions:
I think the second solution may make more sense.
Context
In the current implementation of `sky.Storage`, uploading of local files ("syncing") to a `Storage` is an implicit operation when the `Storage` is initialized. That is, we force a sync whenever the `Storage.add_store()` method is called. This is undesirable because:
- The sync runs when the `Storage` object is created, not when it's actually used. E.g., `sky launch examples/storage_demo.yaml` would first sync the local directories with the S3 stores, even before the confirmation prompt is shown to the user. If the user cancels the launch, the file upload is wasted.
- For operations such as `sky storage delete`, we have to add the `sync_on_reconstruction=False` argument to force it to not sync. @Michaelvll also rightly brought this up in his review of `force_managed` field for Storage #992.
- It ties `add_store` to syncing, which I now think should be two distinct operations.
Proposal
I propose we make syncing an explicit operation that must be called by the `Storage` object creator. We already have the `Storage.sync_all_stores()` method; the creator can call this method to execute a file upload. For example:
Current implementation
Proposal
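The contrast between the two flows can be sketched roughly as below. This is only an illustration under stated assumptions: `ToyStorage`, its `sync_on_add` flag, and the `uploads` list are hypothetical stand-ins invented for this sketch, not the real `sky.Storage` API. Only the method names `add_store` and `sync_all_stores` are taken from the discussion above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyStorage:
    """Hypothetical stand-in for sky.Storage; not the real class."""
    name: str
    source: str
    # Assumption for illustration: True mimics the current eager behavior.
    sync_on_add: bool = False
    stores: Dict[str, str] = field(default_factory=dict)
    uploads: List[str] = field(default_factory=list)

    def add_store(self, store_type: str) -> None:
        # Register the backing store; nothing is uploaded unless sync_on_add is set.
        self.stores[store_type] = f"{store_type}://{self.name}"
        if self.sync_on_add:
            self._sync(store_type)

    def sync_all_stores(self) -> None:
        # Explicit upload of the local source to every registered store.
        for store_type in self.stores:
            self._sync(store_type)

    def _sync(self, store_type: str) -> None:
        # A real implementation would upload files here; the toy only records it.
        self.uploads.append(f"{self.source} -> {self.stores[store_type]}")


# Current implementation: the upload is a side effect of add_store().
eager = ToyStorage(name="demo", source="~/data", sync_on_add=True)
eager.add_store("s3")

# Proposal: add_store() only registers the store; the creator syncs explicitly
# later, e.g. in the SYNC_FILE_MOUNTS stage after the user confirms the launch.
lazy = ToyStorage(name="demo", source="~/data")
lazy.add_store("s3")
uploads_before_sync = list(lazy.uploads)
lazy.sync_all_stores()
```

In the lazy variant nothing is uploaded until `sync_all_stores()` runs, so a launch cancelled at the confirmation prompt would cost no upload.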
This allows us to do the file upload in the `SYNC_FILE_MOUNTS` stage, while `add_store` can still be invoked earlier by the `.from_yaml_config` or `__init__` methods.
Thoughts? cc @Michaelvll @michaelzhiluo