Need to provide support for different "instances" of DANDI archive #76

Open
yarikoptic opened this issue Aug 11, 2021 · 4 comments

@yarikoptic
Member

ATM we have

id: DANDI:000018/draft
identifier: DANDI:000018

on both the staging instance AND the deployed (main) instance, for completely disconnected dandisets. I have committed a "crime": with my glorious admin powers I uploaded a test file to the main deployment instance instead of staging, because I forgot to specify the instance I wanted to upload to (not a biggie, since this dandiset is empty on deployment).

I think:

  • the staging instance should acquire its own prefix, such as DANDI-STAGING, so that any new dandiset created on staging gets an identifier with that prefix
  • the schema should, to start with, allow a few explicitly listed prefixes (plus ones for tests etc.), but eventually we might make it more of a schema "parametrization", e.g. via an env var (but see "avoid needing DANDI_ALLOW_LOCALHOST_URLS env var for tests" #67 regarding DANDI_ALLOW_LOCALHOST_URLS, so there may be some "better" way)
  • the dandischema model for the *Dandiset should gain a sameAs or similar field to provide alternative identifiers (if not ids) to allow for mapping
  • dandi-cli should default to the prefix corresponding to the instance it interacts with and allow that to be overridden, but then consult sameAs for a possible mapping to another "identifier" on the other instance
  • each dandi-api instance should be given the prefix to use (so DANDI for deployment, DANDI-STAGING for staging)
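
To make the list above a bit more concrete, here is a minimal sketch of how explicitly listed prefixes and a sameAs-style mapping could fit together. Everything in it is an assumption for illustration: `ALLOWED_PREFIXES`, `parse_identifier`, `DandisetRef`, and `identifier_for` are hypothetical names, not existing dandischema or dandi-cli API, and the six-digit number format is inferred only from the `DANDI:000018` example.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

# Hypothetical explicit list; only these two prefixes are named in this issue.
ALLOWED_PREFIXES = ("DANDI", "DANDI-STAGING")


def identifier_pattern(prefixes: Sequence[str] = ALLOWED_PREFIXES) -> "re.Pattern[str]":
    """Build a regex accepting '<prefix>:<6 digits>' for the listed prefixes."""
    # Longest prefix first, so DANDI-STAGING is tried before DANDI.
    ordered = sorted(prefixes, key=len, reverse=True)
    alternatives = "|".join(re.escape(p) for p in ordered)
    return re.compile(rf"^(?P<prefix>{alternatives}):(?P<number>\d{{6}})$")


def parse_identifier(
    identifier: str, prefixes: Sequence[str] = ALLOWED_PREFIXES
) -> Tuple[str, str]:
    """Split an identifier into (prefix, number) or raise ValueError."""
    match = identifier_pattern(prefixes).match(identifier)
    if match is None:
        raise ValueError(f"{identifier!r} does not use an allowed prefix")
    return match["prefix"], match["number"]


@dataclass
class DandisetRef:
    """Hypothetical stand-in for the identifier-related fields of a Dandiset."""

    identifier: str
    same_as: List[str] = field(default_factory=list)  # alternative identifiers

    def identifier_for(self, prefix: str) -> Optional[str]:
        """Return this dandiset's identifier under the given prefix, if any."""
        for candidate in [self.identifier, *self.same_as]:
            if candidate.split(":", 1)[0] == prefix:
                return candidate
        return None
```

With something like this, a client pointed at staging could look up `DandisetRef("DANDI:000018", ["DANDI-STAGING:000123"]).identifier_for("DANDI-STAGING")` instead of assuming the number is the same on both instances (the staging number here is made up).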

This might also become relevant for orchestrating data interactions for embargoed datasets, if we allow some kind of hybrid where some prior versions are made public and subsequent ones are embargoed/private, and we have separate DANDI archives for public/private.

@kabilar
Member

kabilar commented Apr 11, 2024

Thanks @yarikoptic. This would also be helpful for our LINC staging and production instances. cc @aaronkanzer

@yarikoptic
Member Author

@satra @kabilar @aaronkanzer we had better attack this one ASAP. I think there should be allowance for an arbitrarily long list of ids and identifiers, but that would then affect manifestLocation and url.
We should review this in the scope of "instances" as described in dandi-cli https://github.com/dandi/dandi-cli/blob/e8611498496f90f722370e7b852cfc2ef1fd08b5/dandi/consts.py#L120 and extend them with the missing metadata (like identifiers_org_prefix, which is present for the main instance as DANDI) etc.
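
As a rough sketch of what "instances extended with prefix metadata" could look like, and how it would catch the accidental upload described at the top of this issue: the record below is hypothetical and is not the structure defined at the linked consts.py line. The instance names, the `identifier_prefix` field, both helper functions, and the example.org URLs are placeholders introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class InstanceInfo:
    """Hypothetical instance record carrying its identifier prefix."""

    name: str
    api_url: str  # placeholder URLs below, not real endpoints
    identifier_prefix: str


# Hypothetical registry; names and URLs are illustrative only.
KNOWN_INSTANCES: Dict[str, InstanceInfo] = {
    "dandi": InstanceInfo("dandi", "https://api.example.org", "DANDI"),
    "dandi-staging": InstanceInfo(
        "dandi-staging", "https://api-staging.example.org", "DANDI-STAGING"
    ),
}


def instance_for_identifier(identifier: str) -> InstanceInfo:
    """Find the instance whose prefix matches the identifier's prefix."""
    prefix = identifier.split(":", 1)[0]
    for info in KNOWN_INSTANCES.values():
        if info.identifier_prefix == prefix:
            return info
    raise KeyError(f"no known instance uses prefix {prefix!r}")


def check_upload_target(identifier: str, instance_name: str) -> None:
    """Raise if a dandiset identifier is about to be used on the wrong instance."""
    expected = instance_for_identifier(identifier)
    if expected.name != instance_name:
        raise ValueError(
            f"{identifier} belongs to instance {expected.name!r}, "
            f"not {instance_name!r}"
        )
```

A check like `check_upload_target("DANDI-STAGING:000018", "dandi")` would then fail loudly instead of silently uploading to the main deployment.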

@kabilar
Member

kabilar commented Aug 10, 2024

Thank you, Yarik. Agree that this is an important issue.

Is this a blocker for something you are working on? Trying to decide if we should change our current priority list and bump down an active target or the targets in the queue (i.e. improving the user sign up flow, and DOI handling).

@satra
Member

satra commented Aug 11, 2024

@yarikoptic - perhaps we can start with pulling out a config system (there are already pieces of this through environment variables, but even those variables need to be configurable based on prefix). this part should not be hard. the harder part is how we do systems like identifiers.org which only have one prefix. for this second part we can consider another config system on related services based on the initial config. a production instance may interact with different sets of services compared to a test or sandbox instance. that flexibility will also allow us to evaluate new services and interactions.

since @aaronkanzer did a lot of work in developing the linc instance and noting components that were similar or different, i think one of the priorities that the engineering core could consider after audit is in place is the white-label infrastructure.

in the meantime, abstracting the config object in dandi-schema would be a good thing to do.
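
A minimal sketch of what such an abstracted config object might look like, assuming it is populated from environment variables. The class, the function, and the `EXAMPLE_*` variable names are all hypothetical placeholders, not existing dandi-schema settings; the only detail taken from this thread is that an instance has a prefix and may or may not have an identifiers.org-style prefix.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SchemaConfig:
    """Hypothetical per-instance configuration for the schema."""

    instance_name: str = "DANDI"
    id_prefix: str = "DANDI"
    # None means this instance has no prefix registered with an external resolver.
    resolver_prefix: Optional[str] = None


def load_config(environ: Mapping[str, str] = os.environ) -> SchemaConfig:
    """Build the config from an environment-like mapping.

    The EXAMPLE_* variable names are placeholders for illustration only.
    """
    name = environ.get("EXAMPLE_INSTANCE_NAME", "DANDI")
    # By default the id prefix follows the instance name.
    prefix = environ.get("EXAMPLE_ID_PREFIX", name)
    # Treat an unset or empty variable as "no resolver prefix".
    resolver = environ.get("EXAMPLE_RESOLVER_PREFIX") or None
    return SchemaConfig(instance_name=name, id_prefix=prefix, resolver_prefix=resolver)
```

Passing the mapping in explicitly (rather than reading `os.environ` deep inside the models) keeps tests from needing to mutate the process environment, which is part of what #67 was about.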
