Replies: 6 comments 5 replies
-
Funnily enough, I was actually wishing I had something like this a couple of weeks ago during a recovery job, where I had a heap of damaged directory ZAPs and was trying to rebuild the directory structure from references between objects. So it might be useful even if there's no standard filesystem API to get at it, though we could easily make one if we wanted.

I think the way to do it reasonably efficiently is something like this, which I thought about a little bit but am also making up as I go:

- Make a new variable-length system attribute, registered along the lines of `{"ZPL_LINKDIRS", 0, SA_UINT64_ARRAY, 0}`, to store an array of uint64s.
- When we make a new link to an existing object and increment its link count, also record the object number of the directory the new link was created in.
- To handle multiple hard links in the same dir, store a refcount alongside each directory object number rather than repeating the entry.

This lets us fix a low-key nit as well: right now, a hardlinked file's stored parent can go stale when the link in that directory is removed; with this list we can promote one of the other directories to be the new parent.
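The attribute sketched above can be modelled in miniature. This is a toy model in Python for brevity; the real thing would be a flat uint64 array managed by the ZFS SA layer in C, and packing (dir object, refcount) pairs into that array is my assumption about the layout, not settled design:

```python
# Toy model of the proposed ZPL_LINKDIRS system attribute: a flat list
# of uint64 pairs, (directory object number, refcount), recording which
# directories hold links to this object. Hypothetical sketch only.

def linkdirs_add(attr, dir_obj):
    """Record a new link created in directory dir_obj."""
    for i in range(0, len(attr), 2):
        if attr[i] == dir_obj:
            attr[i + 1] += 1       # another link in the same dir: bump refcount
            return
    attr.extend([dir_obj, 1])      # first link in this dir: append a new pair

def linkdirs_remove(attr, dir_obj):
    """Forget one link removed from directory dir_obj."""
    for i in range(0, len(attr), 2):
        if attr[i] == dir_obj:
            attr[i + 1] -= 1
            if attr[i + 1] == 0:
                del attr[i:i + 2]  # last link in this dir: drop the pair
            return
    # Entry not found: tolerate it, per the "partial list" discussion below.

attr = []
linkdirs_add(attr, 34)    # link in dir 34
linkdirs_add(attr, 34)    # second link in the same dir
linkdirs_add(attr, 56)    # link in a different dir
print(attr)               # [34, 2, 56, 1]
linkdirs_remove(attr, 34)
linkdirs_remove(attr, 34)
print(attr)               # [56, 1]
```

The object numbers (34, 56) are made up; the point is just that same-dir links share one pair via the refcount.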
Notice I haven't stored the name here. That's mostly because that's a lot more effort to manage, as it gets tied into the rename path, and also just a lot more stuff to store, for a function that probably isn't used all that often (I guess). The actual call that gets all the names can go and scan the listed dirs to find them, which isn't any different to finding a file by name.

That feels about right as the MVP, I think. It's a little like block cloning in shape, really. I think the overhead is low enough for a relatively rare event that we wouldn't need to complicate it further.

I have two questions that might change how I'd do stuff here. The first: what kind of programs use this feature, and how do they use it? If it's used infrequently, then the name lookup overhead is probably not a huge deal. But if it's all the time, maybe we'd need to do something more there (I have vague ideas, but I'm hoping it's not needed).

The other is: what is the right behaviour if the list isn't complete? I don't see any reason why it shouldn't be in normal operation, but I think it informs a migration path. If the system calls involved can handle a partial or incorrect answer, and the overheads are negligible, then we can just start adding link information the first time a hard link is created, and on remove, if it's not there, just ignore it. If they need complete information, then it's slightly different; we can't just ignore a missing attribute (or we add a dataset-specific feature flag, I guess, but that feels heavy-handed; system attributes are designed to be extended).

Yeah, this might even work 😆
-
> Make new variable-length system attribute to store an array of uint64_t.

That is an interesting idea, so a file/inode would have an array in its system attributes of the directories that reference it. Although it does sound like you would have to list each directory to look for the matching names.

> What kind of programs use this feature?

Only thing I have heard of is backup-style programs. But yeah, I have not seen anything actually break from returning an empty hardlink list; it'd just back up the same file data multiple times. Honestly, hardlinks do not appear to be used all that much from what I see, maybe mostly by those backup tools. So how much effort would one want to put into something like this? But it is nice to be "technically correct" :)
-
I'm not sure what the theoretical maximum size of a system attribute is. Once I saw that
Yeah, this model means a directory search to find the names, but that's no different to what already happens when finding a file by name. A possible in-between option is storing the ZAP hash/cursor (a uint64) alongside the directory, so we could jump closer to the entry instead of scanning the whole ZAP. (I'm not really optimising ahead of time here, just thinking about the shapes we might want.)

For the link id, I guess the question is how "immutable" it has to be. It's a little tricky to find information about it, but going by the source, I think really the only way to do that part is to keep some 1:1 association of link id to name. If I was building from scratch, I'd be inclined to put it in the dir ZAP itself, alongside the name; but if I was building from scratch, maybe I wouldn't do any of this this way heh. So yeah, maybe that's a macOS-specific extension to all this, unless it can be made extremely cheap.
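The "link id in the dir ZAP, alongside the name" idea can be sketched as a toy model. This is Python for brevity, with a plain dict standing in for a directory ZAP; the layout and all names here are hypothetical, not how ZFS dir ZAPs actually work today:

```python
# Toy model: each directory entry maps name -> (object number, link id),
# so resolving a link id to a name is a scan of one known directory
# rather than a global search. Hypothetical layout for illustration.

class Dir:
    def __init__(self):
        self.entries = {}           # name -> (obj, linkid)

    def add(self, name, obj, linkid):
        self.entries[name] = (obj, linkid)

    def name_for_linkid(self, linkid):
        for name, (_, lid) in self.entries.items():
            if lid == linkid:
                return name
        return None                 # unknown link id

d = Dir()
d.add("a.txt", obj=7, linkid=100)
d.add("b.txt", obj=7, linkid=101)   # hardlink to the same object
print(d.name_for_linkid(101))       # b.txt
```

The 1:1 association lives with the name, so renames and removes keep it consistent for free.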
Yeah, backup software is all I can think of; many things use hardlinks, but if they care about where they all are they do their own tracking. That's kinda good, we can do this how we like heh. Honestly, for me this is kind of a fun academic exercise, though like I said before, I would have liked to have it available under the hood.
Well maybe I'll prototype it this afternoon for fun heh :)
-
Here we go! https://github.com/robn/zfs/commits/zpl-linkdirs/

So, we make a file as normal, and get its parent and link count:
Adding a link sets up the new attribute:
As we add more, the refcount goes up:
Adding a new reference in a different dir sets up a new entry:
As we remove them, the refcount goes down:
Removing the last one from the listed parent dir will take a reference from the array and move it to the parent, a nice improvement on current releases.
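The lifecycle above can be modelled in miniature. This is a Python sketch, not the prototype's code: the parent-promotion rule follows the description above, and everything else (names, object numbers) is made up for illustration:

```python
# Toy model of the walkthrough: a file carries its primary parent (like
# the existing parent attribute) plus a hypothetical linkdirs map of
# {dir object: refcount} for the extra links. Illustrative only.

class File:
    def __init__(self, parent):
        self.parent = parent         # dir obj number of the first link
        self.links = 1
        self.linkdirs = {}           # dir obj number -> refcount

    def link(self, dir_obj):
        """A new hard link was created in dir_obj."""
        self.links += 1
        self.linkdirs[dir_obj] = self.linkdirs.get(dir_obj, 0) + 1

    def unlink(self, dir_obj):
        """A hard link in dir_obj was removed."""
        self.links -= 1
        if dir_obj in self.linkdirs:
            self.linkdirs[dir_obj] -= 1
            if self.linkdirs[dir_obj] == 0:
                del self.linkdirs[dir_obj]
        elif dir_obj == self.parent and self.linkdirs:
            # The parent dir's own link went away: take a reference from
            # the array and move it into the parent slot.
            new_parent = next(iter(self.linkdirs))
            self.linkdirs[new_parent] -= 1
            if self.linkdirs[new_parent] == 0:
                del self.linkdirs[new_parent]
            self.parent = new_parent

f = File(parent=34)
f.link(34)                   # second link, same dir: refcount goes up
f.link(56)                   # link in a different dir: new entry
print(f.parent, f.linkdirs)  # 34 {34: 1, 56: 1}
f.unlink(34)
f.unlink(34)
print(f.parent, f.linkdirs)  # 56 {}
```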
So if that's workable, the next thing would be to write a function to take that list and resolve names. And then it gets wired up to the platform APIs. That proves it out at least; pretty sure the overhead is low enough that it could be always-on, but I'd have to think a bit more about the finer details. If you think it's useful I can push on upstreaming it. And if not, it was still a pleasant diversion for a lazy day :)
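That "resolve names" step could look something like this toy model (Python; directories are modelled as name-to-object-number maps, and all names here are hypothetical):

```python
# Sketch of name resolution: given the linkdirs list for a file object,
# scan only the listed directories for entries that reference it, which
# is the same cost as an ordinary lookup per directory. Illustrative only.

def resolve_names(dirs, linkdirs, obj):
    """Yield (dir_obj, name) for every link to obj in the listed dirs."""
    for dir_obj in linkdirs:
        for name, entry_obj in dirs[dir_obj].items():
            if entry_obj == obj:
                yield (dir_obj, name)

dirs = {
    34: {"a.txt": 7, "other": 9},
    56: {"b.txt": 7},
}
print(list(resolve_names(dirs, [34, 56], 7)))
# [(34, 'a.txt'), (56, 'b.txt')]
```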
-
It might be worth revisiting this talk from the OpenZFS 2020 Dev Summit:
-
Testing the code, I added a simple iterator. I do not need to build `fullpath`, as I already have to do that anyway, but it has a test:
Comments (yes, it's just a POC):
Still worked well: https://github.com/openzfsonwindows/openzfs/tree/zpl_linkdirs
-
So here I am, on my second platform that has an API call to retrieve all sibling names pointing to the same hardlink.

In Windows, it is the `IRP_MJ_QUERY_INFORMATION` / `FileHardLinkInformation` call, returning ParentIDs, FileIDs, and names. And under macOS, `APFSIOC_NEXT_LINK` (`HFSIOC_NEXT_LINK`), returning `inode` and `linkid`. (`inode` IDs are the same with a hardlink, but `linkID`s are unique, so you can distinguish between them; you can `stat()` with either. The current implementation raises `linkID` up into the same range upstream uses for `.zfs` entries - macOS lets you `stat(Filename)` as well.)

So I am pondering solutions for what to do about hardlinks and the ability to look up siblings. Doing a big recursive search is not really going to be an option.
For macOS, we currently build an in-memory map whenever we come across a file with `n_links > 1`, and dynamically assign a `linkID` to the full path name. As it happens, `linkID`s do not need to survive reboot/remounts, just be unique during the mount. It does mean it works when you import a pool from Linux/FreeBSD which contains hardlinks (I mean, eventually, see below).

But it does have a pretty big drawback: if it has not yet "come across" a hardlink (via an earlier directory-listing request, or similar), then it does not know about its siblings (or knows only a subset of them).
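A miniature version of that in-memory map, in Python; the starting ID value and the paths are made up, and the real code keys things rather differently, so treat this purely as a shape sketch:

```python
# Toy model of the macOS approach: lazily assign a linkID the first time
# a file with n_links > 1 is encountered, keyed by full path. IDs only
# need to be unique for the life of the mount. Illustrative only.

class LinkIdMap:
    def __init__(self, first_id=0x10000):    # base value is arbitrary here
        self.next_id = first_id
        self.by_path = {}                    # full path -> linkID
        self.by_id = {}                      # linkID -> full path

    def observe(self, path, n_links):
        """Call whenever a listing (or similar) encounters a file."""
        if n_links <= 1 or path in self.by_path:
            return self.by_path.get(path)
        linkid = self.next_id
        self.next_id += 1
        self.by_path[path] = linkid
        self.by_id[linkid] = path
        return linkid

m = LinkIdMap()
a = m.observe("/tank/a", n_links=2)
b = m.observe("/tank/dir/b", n_links=2)   # sibling, only known once seen
print(a, b)                               # two distinct IDs
```

The drawback described above falls straight out of this shape: until `observe()` has seen a sibling, the map simply has no entry for it.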
Another option would be to build an xattr and store it with the file. Fairly easy to maintain: add ParentID/FileID/Name when increasing `n_links`, and remove it when decreasing. But alas, this does nothing when the pool was created by Linux/FreeBSD. Although, with upstream support, a dataset feature like "hardlink_tracking" could be PRed, should Linux/FreeBSD one day want to add API calls for sibling support. However unlikely that is.
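The xattr bookkeeping could look something like this toy model (Python, with a JSON-encoded record list; the xattr name and encoding are assumptions for illustration, not what any platform actually stores):

```python
# Sketch of the xattr option: keep a list of (parent_id, file_id, name)
# records in an xattr, appended when n_links goes up and removed when it
# goes down. A plain dict stands in for the file's xattr store.

import json

def xattr_link_added(xattrs, parent_id, file_id, name):
    recs = json.loads(xattrs.get("user.hardlinks", "[]"))
    recs.append([parent_id, file_id, name])
    xattrs["user.hardlinks"] = json.dumps(recs)

def xattr_link_removed(xattrs, parent_id, name):
    recs = json.loads(xattrs.get("user.hardlinks", "[]"))
    recs = [r for r in recs if not (r[0] == parent_id and r[2] == name)]
    xattrs["user.hardlinks"] = json.dumps(recs)

x = {}
xattr_link_added(x, 34, 7, "a.txt")
xattr_link_added(x, 56, 7, "b.txt")
xattr_link_removed(x, 34, "a.txt")
print(x["user.hardlinks"])   # [[56, 7, "b.txt"]]
```

Maintaining it inline with link/unlink is cheap; the gap, as noted, is files that gained their links before this code existed.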
Something in between? Leverage `scrub` to silently/softly track hardlinks, then update the xattr once completed? What about resumed scrubs? Too hacky?

Ultimately though, I have not found anything that breaks when returning empty sibling lists for hardlinks, so it is more a pursuit of perfection/compliance than any actual need.
Thoughts?