-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix validation error when only Zarr assets are uploaded #2062
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need a bit more explanation of the actual bug, and why this fixes it. We need to be careful with Zarr statuses because those have a complex relationship with Assets that results in (perhaps overly) complex semantics.
@@ -90,7 +92,9 @@ def version_aggregate_assets_summary(version: Version) -> None: | |||
|
|||
assets_summary = aggregate_assets_summary( | |||
asset.full_metadata | |||
for asset in version.assets.filter(status=Asset.Status.VALID) | |||
for asset in version.assets.filter( | |||
Q(status=Asset.Status.VALID) | Q(zarr__status=ZarrArchiveStatus.UPLOADED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the problem that led to this particular change? We need @jjnesbitt to confirm, but I don't think we should be counting Zarr archives in the UPLOADED
state (those have not yet been "finalized" in the terminology of the Zarr API; once they are, I believe they move into the COMPLETE
state).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch -- updated to COMPLETE
here and in the unit test.
The expansion of the filter
query here is to match similar logic in the validate_pending_asset_metadata
Celery task -- specific code reference here
In the case where there is no
only evaluates the |
Hi @aaronkanzer, thank you for the original fix and pushing these changes upstream to DANDI. Hi @waxlamp @jjnesbitt, I made some minor changes to the original comment above to further describe the issue and fix. Please take a look at it when you have a chance. Thank you. |
if asset_type == 'blob': | ||
assert version.metadata['assetsSummary']['numberOfBytes'] == asset.blob.size | ||
|
||
elif asset_type == 'zarr': | ||
assert version.metadata['assetsSummary']['numberOfBytes'] == asset.size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe asset.size
can be used for both here, since it abstracts over blob vs zarr.
if asset_type == 'blob': | |
assert version.metadata['assetsSummary']['numberOfBytes'] == asset.blob.size | |
elif asset_type == 'zarr': | |
assert version.metadata['assetsSummary']['numberOfBytes'] == asset.size | |
assert version.metadata['assetsSummary']['numberOfBytes'] == asset.size |
Description of error
As documented in #1814, we receive the following validation error on the web application when uploading only Zarr asset(s) to a Dandiset (and not blobs):
The error is raised from
dandi-schema
when a validation check occurs on theassetSummary
field in thedandiset.yaml
. See the models module:And it arises because in the query below only the
blob
asset is evaluated:dandi-archive/dandiapi/api/services/metadata/__init__.py
Line 127 in 0a4bf70
Description of proposed fix
The proposed changes additionally evaluate the size of Zarr assets that are in the
COMPLETE
state to generate theassetSummary:numberOfBytes
metadata field.cc @kabilar