Description
I just want to see if anyone has ideas or experience around this. I have a separate metadata catalogue of my Zarr (soon to be Icechunk) repositories to make it easier to find data. Right now, the metadata database has a field for the URL pointing to the object, but at some point I may want to migrate the data to another object store. It would be nice to have resolvable PIDs in that field instead, so links don't break for people if I move the data.
With DOIs, the expectation seems to be that the DOI resolves to a landing page, which might have a link to download the 'file'. It feels like it would be better to have something that resolves directly to the object (or collection of objects) in S3, so users can:
- Query the metadata catalogue with our API
- Read the `url` field which contains a PID
- Put that directly into `icechunk.s3_storage()`
I guess you'd then need some way to parse out the `endpoint_url`, bucket, and prefix so Icechunk knows what to do with it. What would be even cooler is if you could use versionable PIDs and have the PID somehow resolve to specific tags of your Icechunk repo... but maybe I'm getting carried away at this point.
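For the parsing step, here is a minimal sketch of what I have in mind, using only the standard library. It assumes the PID has already been resolved to either an `s3://bucket/prefix` URL or a path-style `https://endpoint/bucket/prefix` URL; the function name `parse_object_store_url` and the example URLs are hypothetical, and virtual-hosted-style URLs (bucket in the hostname) are not handled.

```python
from urllib.parse import urlsplit


def parse_object_store_url(url: str) -> dict:
    """Split a resolved object-store URL into endpoint_url, bucket, and prefix.

    Hypothetical helper. Supports two forms only:
      s3://bucket/prefix                 (endpoint_url is None)
      https://endpoint/bucket/prefix     (path-style addressing assumed)
    """
    parts = urlsplit(url)

    if parts.scheme == "s3":
        # For s3:// URLs the "host" part is the bucket name.
        endpoint_url = None
        bucket = parts.netloc
        prefix = parts.path.strip("/")
    elif parts.scheme in ("http", "https"):
        # Assumption: path-style URL, so the first path segment is the bucket.
        endpoint_url = f"{parts.scheme}://{parts.netloc}"
        bucket, _, prefix = parts.path.lstrip("/").partition("/")
        prefix = prefix.rstrip("/")
    else:
        raise ValueError(f"Unsupported URL scheme: {parts.scheme!r}")

    if not bucket:
        raise ValueError(f"No bucket found in URL: {url!r}")

    return {"endpoint_url": endpoint_url, "bucket": bucket, "prefix": prefix}


# Example with a made-up endpoint and bucket.
location = parse_object_store_url(
    "https://objects.example.org/my-bucket/data/repo.icechunk/"
)
# The resulting keys are meant to line up with the endpoint_url, bucket, and
# prefix mentioned above; whether they can be passed straight through to
# icechunk.s3_storage() is an assumption to check against the Icechunk docs.
```

This only covers the last hop. Getting from the PID to the URL (for example by following the resolver's redirects) would still need to happen first, and resolving to a specific tag would need some extra convention on top.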