--mode verify detects lots of unexpected diffs in metadata without server timestamp boost #49
Comments
@yarikoptic The error message seems pretty clear to me:
This is the Archive's fault for not updating the Dandiset's draft version's |
But it is |
@yarikoptic Based on the script below, only 6 Dandisets have been unembargoed (000253, 000408, 000773, 000774, 000897, and 000935). Is the problem described in the original comment still an issue?
#!/bin/bash
# List Dandisets that are OPEN now but had EMBARGOED in their config history.
set -eu -o pipefail
dandiset_root=/mnt/backup/dandi/dandisets
cd "$dandiset_root"
for ds in 0*
do
    # Current embargo status; falls back to OPEN when the key is unset.
    embargo_status="$(git -C "$ds" config --file .datalad/config --default OPEN --get dandi.dandiset.embargo-status)"
    # Pickaxe search: did any commit add or remove "EMBARGOED" in .datalad/config?
    if [ "$embargo_status" = OPEN ] \
        && git -C "$ds" log -S EMBARGOED -n1 -- .datalad/config | grep -q .
    then echo "$ds"
    fi
done
|
@yarikoptic Ping. |
Blocked by #56 ATM. Please just rerun that command with --verify whenever we do not have an ongoing backup process running. |
@yarikoptic This problem is still occurring, but seeing as it's affecting Dandisets that are still embargoed, the problem seems to be solely with Dandi Archive. I have filed dandi/dandi-archive#2002. |
currently, after @jjnesbitt ping on dandi/dandi-archive#2002 we have
detailed diff - so went from null to filled out | Metadata diff:
|
| --- old-metadata
| +++ new-metadata
| @@ -1,2 +1,51 @@
| -null
| -...
| +asset_id: a2958ce9-a5c8-4137-9fd0-ace1ffd0655b
| +blob: 8a0a157a-58d7-463a-885b-796202eefe7d
| +created: '2024-07-15T15:29:35.791159Z'
| +metadata:
| + '@context': https://raw.githubusercontent.com/dandi/schema/master/releases/0.6.7/context.json
| + access:
| + - schemaKey: AccessRequirements
| + status: dandi:OpenAccess
| + blobDateModified: '2024-02-06T13:30:46.886873-05:00'
| + contentSize: 173
| + contentUrl:
| + - https://api.dandiarchive.org/api/assets/a2958ce9-a5c8-4137-9fd0-ace1ffd0655b/download/
| + - https://dandiarchive.s3.amazonaws.com/blobs/8a0/a15/8a0a157a-58d7-463a-885b-796202eefe7d
| + dateModified: '2024-07-15T11:29:34.548579-04:00'
| + digest:
| + dandi:dandi-etag: be589a56cf11aa658c8a8a368c90d299-1
| + dandi:sha2-256: f3692617d26821b40426a41de2ddcbdfbf4bc29e59a9f413c9bd6d0bd9b43ca9
| + encodingFormat: application/json
| + id: dandiasset:a2958ce9-a5c8-4137-9fd0-ace1ffd0655b
| + identifier: a2958ce9-a5c8-4137-9fd0-ace1ffd0655b
| + path: derivatives/MRI-pipeline/sub-SP002/anat/sub-SP002_ses-MRI_flip-3_chunk-1_to-SP002_xfm.json
| + schemaKey: Asset
| + schemaVersion: 0.6.7
| + wasGeneratedBy:
| + - description: Metadata generated by DANDI cli
| + endDate: '2024-07-15T11:29:34.217129-04:00'
| + id: urn:uuid:7069a42d-fedc-4eb5-a275-c4d578a18093
| + name: Metadata generation
| + schemaKey: Activity
| + startDate: '2024-07-15T11:29:34.217129-04:00'
| + wasAssociatedWith:
| + - identifier: RRID:SCR_019009
| + name: DANDI Command Line Interface
| + schemaKey: Software
| + url: https://github.com/dandi/dandi-cli
| + version: 0.62.3
| + - description: Metadata generated by DANDI cli
| + endDate: '2024-07-15T11:29:34.548518-04:00'
| + id: urn:uuid:738f5381-7476-429a-9efc-c00a0f058c6e
| + name: Metadata generation
| + schemaKey: Activity
| + startDate: '2024-07-15T11:29:34.548518-04:00'
| + wasAssociatedWith:
| + - identifier: RRID:SCR_019009
| + name: DANDI Command Line Interface
| + schemaKey: Software
| + url: https://github.com/dandi/dandi-cli
| + version: 0.62.3
| +modified: '2024-07-15T15:29:35.791174Z'
| +path: derivatives/MRI-pipeline/sub-SP002/anat/sub-SP002_ses-MRI_flip-3_chunk-1_to-SP002_xfm.json
| +size: 173
and the same for
similarly a diff on derivatives/OCT-pipeline/sub-SP002/micr/sub-SP002_ses-OCT_sample-01_res-20um_OCT.json
| --- old-metadata
| +++ new-metadata
| @@ -1,52 +1,51 @@
| -asset_id: fc7f85dd-abd5-488e-857f-efc600e6dd0e
| -blob: 035e3880-07c5-4fca-85d7-31162988506e
| -created: '2024-01-30T16:57:58.459166Z'
| +asset_id: 3e98c412-b4be-4e3d-8709-662e721cba30
| +blob: d7aab918-d0a0-4abd-b116-edd53b379591
| +created: '2024-07-15T15:43:57.648669Z'
| metadata:
| - '@context': https://raw.githubusercontent.com/dandi/schema/master/releases/0.6.4/context.json
| + '@context': https://raw.githubusercontent.com/dandi/schema/master/releases/0.6.7/context.json
| access:
| - schemaKey: AccessRequirements
| status: dandi:OpenAccess
| - blobDateModified: '2024-01-23T15:00:15.383186-05:00'
| - contentSize: 164
| + blobDateModified: '2024-02-06T13:23:09.788738-05:00'
| + contentSize: 165
| contentUrl:
| - - https://api.dandiarchive.org/api/assets/fc7f85dd-abd5-488e-857f-efc600e6dd0e/download/
| - - https://dandiarchive-embargo.s3.amazonaws.com/000874/blobs/035/e38/035e3880-07c5-4fca-85d7-31162988506e
| - dateModified: '2024-01-30T11:57:57.799582-05:00'
| + - https://api.dandiarchive.org/api/assets/3e98c412-b4be-4e3d-8709-662e721cba30/download/
| + - https://dandiarchive.s3.amazonaws.com/blobs/d7a/ab9/d7aab918-d0a0-4abd-b116-edd53b379591
| + dateModified: '2024-07-15T11:43:55.907615-04:00'
| digest:
| - dandi:dandi-etag: 97fb6db3d084dd8d1e73ea1cef3ec2ca-1
| - dandi:sha2-256: a2b6f12545fca55730d5295967fac3f6c3b28c24bd2b56047043057a5ecbbe08
| + dandi:dandi-etag: bc5150cdd5c99e1981297ab02c89cbff-1
| + dandi:sha2-256: e4affd47e0e521bfb16e05146816b85c58ac9e50ef7db0158b86fe0abeb7689c
| encodingFormat: application/json
| - id: dandiasset:fc7f85dd-abd5-488e-857f-efc600e6dd0e
| - identifier: fc7f85dd-abd5-488e-857f-efc600e6dd0e
| + id: dandiasset:3e98c412-b4be-4e3d-8709-662e721cba30
| + identifier: 3e98c412-b4be-4e3d-8709-662e721cba30
| path: derivatives/OCT-pipeline/sub-SP002/micr/sub-SP002_ses-OCT_sample-01_res-20um_OCT.json
| schemaKey: Asset
| - schemaVersion: 0.6.4
| - wasAttributedTo: []
| + schemaVersion: 0.6.7
| wasGeneratedBy:
| - description: Metadata generated by DANDI cli
| - endDate: '2024-01-30T11:57:57.019465-05:00'
| - id: urn:uuid:03a8ccfc-6407-41f0-9eab-f5a40c670f11
| + endDate: '2024-07-15T11:43:54.388406-04:00'
| + id: urn:uuid:83534e7b-f35c-41f4-ae5c-6f3cde372951
| name: Metadata generation
| schemaKey: Activity
| - startDate: '2024-01-30T11:57:57.019465-05:00'
| + startDate: '2024-07-15T11:43:54.388406-04:00'
| wasAssociatedWith:
| - identifier: RRID:SCR_019009
| name: DANDI Command Line Interface
| schemaKey: Software
| url: https://github.com/dandi/dandi-cli
| - version: 0.59.0
| + version: 0.62.3
| - description: Metadata generated by DANDI cli
| - endDate: '2024-01-30T11:57:57.799471-05:00'
| - id: urn:uuid:5f2bf9bb-dcfd-41e3-b418-4195ce154b24
| + endDate: '2024-07-15T11:43:55.907564-04:00'
| + id: urn:uuid:fa95ee7b-a5e2-42f1-8fcb-cf0d016d89a9
| name: Metadata generation
| schemaKey: Activity
| - startDate: '2024-01-30T11:57:57.799471-05:00'
| + startDate: '2024-07-15T11:43:55.907564-04:00'
| wasAssociatedWith:
| - identifier: RRID:SCR_019009
| name: DANDI Command Line Interface
| schemaKey: Software
| url: https://github.com/dandi/dandi-cli
| - version: 0.59.0
| -modified: '2024-01-30T16:57:58.459185Z'
| + version: 0.62.3
| +modified: '2024-07-15T15:43:57.700913Z'
| path: derivatives/OCT-pipeline/sub-SP002/micr/sub-SP002_ses-OCT_sample-01_res-20um_OCT.json
| -size: 164
| +size: 165
So on 2024-07-15 we seem to have had an upload of a few updated files to 000874 (note: embargoed), but we still have metadata for the Dandiset saying
|
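The symptom described here (asset records modified on 2024-07-15 while the Dandiset's draft-version metadata still looks older) can be checked mechanically. Below is a minimal sketch, not the tool's actual logic: the two asset `modified` timestamps are copied from the diffs above, while `hypothetical_version_modified` is a made-up placeholder, since the real value is not shown in this thread. The function names are illustrative only.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is treated as UTC."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def assets_newer_than_version(version_modified: str, assets: list) -> list:
    """Return paths of assets whose `modified` is later than the version's."""
    cutoff = parse_ts(version_modified)
    return [a["path"] for a in assets if parse_ts(a["modified"]) > cutoff]


# HYPOTHETICAL placeholder: the draft version's real `modified` value
# does not appear in this thread.
hypothetical_version_modified = "2024-02-06T00:00:00.000000Z"

# Asset paths and `modified` values as they appear in the diffs above.
assets = [
    {
        "path": "derivatives/MRI-pipeline/sub-SP002/anat/"
                "sub-SP002_ses-MRI_flip-3_chunk-1_to-SP002_xfm.json",
        "modified": "2024-07-15T15:29:35.791174Z",
    },
    {
        "path": "derivatives/OCT-pipeline/sub-SP002/micr/"
                "sub-SP002_ses-OCT_sample-01_res-20um_OCT.json",
        "modified": "2024-07-15T15:43:57.700913Z",
    },
]

for path in assets_newer_than_version(hypothetical_version_modified, assets):
    print(f"asset newer than draft version: {path}")
```

Any path printed by such a check would be an asset that changed after the draft version's recorded `modified`, which is the situation this issue is about.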
on 000719 zarr issue getting the bits together
dandi@drogon:~$ s3cmd -c ~/.s3cfg-dandi-backup ls -l s3://dandiarchive/zarr/7e1b3b36-a94a-427f-9793-b14e344a04f2/acquisition/TwoPhotonSeries/data/10006.5.7
2024-04-02 13:03 24908 5feda0eb2702d2422f1451d873caf677 STANDARD s3://dandiarchive/zarr/7e1b3b36-a94a-427f-9793-b14e344a04f2/acquisition/TwoPhotonSeries/data/10006.5.7
our local zarr git has clean state and
and no file
I wish we had an audit readily available to get records on this zarr, to check exactly what was happening on it and when ... |
ok, no time for chasing this bug in dandi-archive somewhere ATM. For now
now rerunning for all |
@yarikoptic What is the concise bug report that we could move forward on solving in the archive once we're able? Is it mainly that the metadata of dandiset draft versions are updated without the |
For non-zarr assets -- I have not troubleshot it down to that form, beyond the fact that changes to asset-level (not Dandiset-level) metadata did not result in a change of the Dandiset; maybe you re-extracted some metadata for some assets (listed in the above report), also modifying them in place and thus not triggering
For zarr -- filed |
After
I manually ran the --mode verify sweep and it errored out quite loudly -- here is the trail pointing to the full log
from which it looks like unembargoing is potentially forgetting to reset the modified field, maybe?