Description
We should make rsync-based syncing (workdir, file_mounts) match Git behavior as closely as practical, i.e. consistent ignores, ownership, and symlinks across all sync methods.
Context
workdir and local file_mounts still rely on rsync, which many developer workflows depend on. But rsync behavior differs from Git in several key areas (ignores, ownership, symlinks), causing inconsistencies.
- Git workdir support ([Core] Support git repo #6257) gives us a clean baseline where remote state matches git clone
- We have .skyignore ([Storage] Add .skyignore support #4038), but with issues reported in [core] inconsistent .gitignore behavior with sky launch rsync #5492
- Earlier cleanup attempt in Consolidate file filtering logic between rsync and bucket upload #5006 (closed as stale)
Problems
1. .skyignore lacks negation pattern support (!pattern)
The current --filter='dir-merge,- .skyignore' implementation strips Git-style negation patterns. We should either port to the storage_utils parser for full Git-style support, or at least detect and forbid unsupported syntax with clear error messages.
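As a rough illustration of the "detect and forbid" option, the sketch below scans .skyignore content for Git-style negation lines and raises an error that names each offending line. The function names (find_negation_patterns, check_skyignore_text) are hypothetical and not part of the existing codebase; the sketch only assumes gitignore conventions, where a leading "!" negates a pattern and "\!" is a literal exclamation mark.

```python
from typing import List, Tuple


def find_negation_patterns(lines: List[str]) -> List[Tuple[int, str]]:
    """Return (line_number, pattern) for every Git-style negation line.

    Hypothetical helper, not an existing SkyPilot function.
    """
    found = []
    for lineno, raw in enumerate(lines, start=1):
        pattern = raw.rstrip("\n")
        # Blank lines and comment lines carry no pattern.
        if not pattern.strip() or pattern.startswith("#"):
            continue
        # A leading "!" negates in gitignore syntax; an escaped "\!" is a
        # literal "!" and is therefore not flagged.
        if pattern.startswith("!"):
            found.append((lineno, pattern))
    return found


def check_skyignore_text(text: str) -> None:
    """Raise ValueError with a clear message if negation patterns are present."""
    bad = find_negation_patterns(text.splitlines())
    if bad:
        details = ", ".join(f"line {n}: {p!r}" for n, p in bad)
        raise ValueError(
            f"Negation patterns are not supported in .skyignore ({details})"
        )
```

Failing early like this keeps the rsync path unchanged while making the limitation visible; porting to the storage_utils parser would remove the need for the check entirely.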
2. Nested .skyignore files ignored without root file
We only enable .skyignore mode if the source root has one; otherwise everything falls back to .gitignore. This contradicts the docs, which say "Any .skyignore … is respected." We should either honor nested files or warn users clearly.
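For the "warn users clearly" option, one possible shape is a pre-sync scan that reports nested .skyignore files when the source root has none. This is a minimal sketch under that assumption; find_orphaned_skyignores is a hypothetical name, and skipping the .git directory is an illustrative choice rather than existing behavior.

```python
import os
from typing import List

SKYIGNORE = ".skyignore"


def find_orphaned_skyignores(source_root: str) -> List[str]:
    """Return nested .skyignore paths (relative to source_root) that would be
    ignored because the root itself has no .skyignore.

    Hypothetical helper, not an existing SkyPilot function.
    """
    # If the root has a .skyignore, .skyignore mode is enabled and nothing
    # is orphaned.
    if os.path.isfile(os.path.join(source_root, SKYIGNORE)):
        return []
    nested = []
    for dirpath, dirnames, filenames in os.walk(source_root):
        # Skip Git metadata and keep traversal order deterministic.
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        if SKYIGNORE in filenames:
            full_path = os.path.join(dirpath, SKYIGNORE)
            nested.append(os.path.relpath(full_path, source_root))
    return nested
```

A caller could log one warning listing the returned paths before falling back to .gitignore, so the fallback is no longer silent.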
3. Root/Docker ownership inconsistencies
When the remote login user is root, rsync -a preserves local UID/GID on remote files. This differs from Git checkout and from non-root clusters. We should force --no-owner --no-group under root, or at least document/warn.
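The ownership fix could look roughly like the following when the rsync command is assembled. This is a sketch only: build_rsync_args and its remote_user parameter are hypothetical, and the real command builder likely takes more inputs. It relies on the rsync behavior that -a implies -o and -g, which only take effect when the receiving side runs as super-user.

```python
from typing import List


def build_rsync_args(remote_user: str) -> List[str]:
    """Return base rsync arguments, dropping ownership preservation for root.

    Hypothetical helper, not an existing SkyPilot function.
    """
    args = ["rsync", "-a"]
    if remote_user == "root":
        # -a implies -o/-g. As root on the receiving side, rsync would apply
        # the local UID/GID to remote files, so negate both options.
        args += ["--no-owner", "--no-group"]
    return args
```

With this, files synced to a root login end up owned by the receiving user, which is closer to what a Git checkout on the remote would produce.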
Note: We still need to check whether COPY and storage_mounts have similar issues (#5006).