Moving archives between disks #140
Would that simply be `-H`?
Trying, as instructed by https://www.cyberciti.biz/faq/linux-unix-apple-osx-bsd-rsync-copy-hard-links/, without deleting the source.
@tlaurion Good question. I'm fairly certain that `-H` is what you want here. As for Wyng itself, you're probably aware there are two ways to do this: a normal import like
Btw, if you think
It worked as expected!
That is what I use now; compression burdens both server and client without significant gains (the archive's contents are compressed, after all).
FWIW, if this is used beyond a fresh copy/move of the archive (i.e. an existing copy is being updated), then
[Hopefully it's OK to comment on closed issues] [edited to add rsync's

I do daily wyng backups to a local ssh server and sync that local server's wyng archive weekly to a remote server at night, when my Qubes OS PC is powered off. So this isn't something that #199 could solve (except if wyng could run on the local server without any passphrase/key, but if I'm not mistaken that's not the case). For that use case, the readme's

I thought about how this could be solved within wyng, but in the end it seemed too convoluted/fragile. A simple workaround is to keep a hard-linked copy of the wyng archive that is synchronized with the current archive after each successful rsync to the remote host. Eg.:
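The original example commands did not survive extraction. A minimal sketch of the workaround as described (hypothetical paths, and the remote push shown as a comment since it needs a real host) might look like:

```shell
set -e
WORK=$(mktemp -d)
ARCHIVE=$WORK/wyng.backup   # hypothetical archive path
SHADOW=$WORK/wyng.ln        # hard-linked copy reflecting the last push

mkdir -p "$ARCHIVE"
echo chunkdata > "$ARCHIVE/x.zst"

# 1. Push the live archive to the remote (not run here; needs a real host):
#    rsync -aH --delete "$ARCHIVE/" backuphost::wyng/

# 2. On success, refresh the shadow copy. cp -al creates new directory
#    entries pointing at the same inodes -- no chunk data is duplicated.
rm -rf "$SHADOW"
cp -al "$ARCHIVE" "$SHADOW"

stat -c %h "$SHADOW/x.zst"   # prints 2: archive and shadow share the inode
```

The shadow copy costs only directory entries, so keeping it around between weekly syncs is cheap.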
I tested this with a few volumes and it seems to be working pretty well. I'll report back if any issue pops up. Caveats:
Hope this helps! (I thought about sending a PR to update the readme, but the content might be a bit too specific/unrelated to wyng.)
@t-gh-ctrl That's an interesting solution, thanks for posting it.
I think if you check inode usage you'll see this isn't true. Hard links are directory references to inodes; they are not inodes themselves. If at the end of the procedure the inode use has really doubled, then rsync has done something to de-link wyng.ln from the original tree (or the archive had a very high degree of change between sessions and there are few commonalities between old and new). FWIW, using an alternative to
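The point about hard links being directory entries rather than inodes is easy to verify from a shell (throwaway temp files only):

```shell
set -e
WORK=$(mktemp -d)
echo hello > "$WORK/orig"
ln "$WORK/orig" "$WORK/link"   # a second directory entry, same inode

# Both names report the same inode number and a link count of 2; the
# file's data blocks and its inode exist only once on disk.
stat -c '%i %h' "$WORK/orig"
stat -c '%i %h' "$WORK/link"
```

So a fully hard-linked duplicate of a tree consumes no extra inodes and no extra data blocks, only directory entries.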
A simple log of Wyng's pruning actions might get us 99% there, even without authenticating. On the data-chunk level, pruning is extremely simple: there is a range of session dirs to be merged, the oldest one becomes the target, and the files from the newer (to-be-deleted) sessions are moved into the target. I think that gives rsync all the help it needs to avoid wasting bandwidth.

(Can this be done without a pruning log? Probably: if you compare the S_* entries between the origin and the duplicate, you could automatically find the 'missing' sessions on the origin, and it would work as long as there is a prior session present in both locations.)

You don't need to worry about the metadata files, since they are small and rsync can duplicate them. With the following difference between Src archive and Dest...
...on Dest do something like:
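The original listing is missing here, but based on the description above, replaying a prune on Dest might look like this (session names are hypothetical stand-ins for Wyng's `S_*` dirs):

```shell
set -e
WORK=$(mktemp -d); cd "$WORK"
# Hypothetical session dirs, oldest first.
mkdir S_20230101 S_20230105 S_20230110
echo a > S_20230101/chunk-a
echo b > S_20230105/chunk-b
echo c > S_20230110/chunk-c

# Suppose the origin merged S_20230105 and S_20230110 into S_20230101
# during pruning. Replay that on Dest: move each newer session's files
# into the oldest (target) session, newest last so newer chunks win,
# then drop the emptied dirs. rsync then sees files that merely moved,
# not fresh data to retransmit.
for s in S_20230105 S_20230110; do
    mv -f "$s"/* S_20230101/
    rmdir "$s"
done
ls S_20230101   # lists chunk-a, chunk-b, chunk-c
```

Real code would also need to handle chunk files with the same name in multiple sessions (the newer one must win), which the `mv -f` ordering above only sketches.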
I'm not sure
@tasket, thank you for your detailed reply!
No idea why, but for some reason I had always assumed that hard-linking files consumed inodes; I now realize that was wrong. I stand corrected!
It's funny: despite using rsync since a couple of years after it was created (I'm old...), I never paid attention to those options. I'll give them a try out of curiosity to see how they perform, but I expect that the sheer number of files in wyng archives may not play well with fuzzy matching and/or performance...
Indeed, there might be other tools that work better. In hindsight I realize I was bent on using rsync because I'm used to it: all my other data-synchronization tasks use rsync extensively, and until now there hasn't been anything rsync couldn't do well.
Yes, "fixing" the destination is something I initially planned to do (and actually did manually while testing yesterday). But even though moving files to the oldest session is pretty trivial, the bulk of the code would be error handling for everything that can go wrong (somewhat mitigated by having rsync run afterwards to sync everything properly). So it seemed easier to "bend" rsync to my specific use case. The approach I outlined does work, but it is definitely sub-optimal compared to a tool that knows what wyng is doing: for instance, rsync takes 5+ minutes just to build the file list of a 160GB wyng archive on old-ish but still decent hardware. Which is fine for a cron job at night, but not when you want to run it interactively.
I wouldn't consider the error handling too critical. In bash you can
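The comment is cut off here, but one common bash idiom it may be pointing at (my assumption, not the author's exact suggestion) is fail-fast semantics: abort the whole script on the first failed command so a half-finished merge never gets rsynced afterwards.

```shell
#!/bin/bash
# Assumption: "in bash you can" likely refers to fail-fast error handling.
set -euo pipefail
trap 'echo "sync step failed at line $LINENO" >&2' ERR

work=$(mktemp -d)
mkdir "$work/S_old" "$work/S_new"
echo data > "$work/S_new/chunk"

# Any failure in these steps aborts the script immediately, so the state
# is either untouched or fully merged before rsync runs.
mv "$work/S_new"/* "$work/S_old"/
rmdir "$work/S_new"
echo "merge OK"
```

This keeps the per-step error handling down to almost nothing while still guaranteeing that rsync only ever sees a consistent tree.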
Just wondering if rsync has the benefit of its remote daemon in your case? That is supposed to accelerate some operations.
Indeed. One of these days I might give it a try. Rsync works for now; it's just not optimal...

It's already running in daemon mode to avoid an additional encapsulation layer (rsync over wireguard, instead of rsync over ssh over wireguard [edit: I can't ssh into the host directly, I have to use wireguard]); that was actually one of the reasons for sticking with rsync...
@tasket: I'm looking into moving a fix03 (latest) pruned archive between disks.

After reading the documentation, I'm not sure how to do this properly while keeping the hard links from origin to destination drive.

Obviously, mv won't do it.

rsync seems to do it properly on the same local filesystem, but it's unclear whether doing so will duplicate content, which I'm tight on.

What rsync command would you recommend for syncing between the origin and destination drives?