remove `skip_fits_update` to reduce data duplication and risk of data modification loss #271

braingram · 2024-03-01T17:20:56Z

`skip_fits_update` determines whether, when a file is opened, the FITS headers are read to reconstruct the ASDF tree (including `extra_fits`).

This setting can be `False`, `True`, or `None`.

When `False`, the ASDF tree (read from the ASDF extension) is updated on file read as follows:

1. FITS HDUs linked to the tree via the schema will have their data read from the HDU (this always occurs).
2. FITS keywords linked to the tree via the schema will be read, and the tree updated to reflect the data in the keywords.
3. FITS HDUs not linked to the tree will have their data assigned to the `extra_fits` portion of the ASDF tree.
4. FITS keywords not linked to the tree will have their values recorded in `extra_fits`.
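
The four read-time steps above can be sketched as a toy model. This is not the stdatamodels implementation: plain dicts stand in for HDUs and the ASDF tree, the function name and its arguments are hypothetical, and the `extra_fits` layout is simplified.

```python
def update_tree_from_fits(tree, hdus, linked_keywords, linked_hdus):
    """Toy model of the read-time update (hypothetical helper, not stdatamodels API).

    hdus:            {hdu_name: {"header": {keyword: value}, "data": object or None}}
    linked_keywords: {(hdu_name, keyword): key under tree["meta"]}  (schema-linked)
    linked_hdus:     {hdu_name: key in tree}                        (schema-linked)
    """
    extra_fits = {}
    for name, hdu in hdus.items():
        if name in linked_hdus:
            # Step 1: schema-linked HDU data is always read from the HDU.
            tree[linked_hdus[name]] = hdu["data"]
        elif hdu["data"] is not None:
            # Step 3: data from an unlinked HDU goes to extra_fits.
            extra_fits.setdefault(name, {})["data"] = hdu["data"]

        for keyword, value in hdu["header"].items():
            target = linked_keywords.get((name, keyword))
            if target is not None:
                # Step 2: schema-linked keywords overwrite the tree value.
                tree.setdefault("meta", {})[target] = value
            else:
                # Step 4: unlinked keywords are recorded in extra_fits.
                header = extra_fits.setdefault(name, {}).setdefault("header", [])
                header.append((keyword, value))

    if extra_fits:
        tree["extra_fits"] = extra_fits
    return tree


hdus = {
    "PRIMARY": {"header": {"TELESCOP": "JWST", "MYKEY": 1}, "data": None},
    "SCI": {"header": {}, "data": [[1, 2], [3, 4]]},
    "EXTRA": {"header": {}, "data": [5, 6]},
}
tree = update_tree_from_fits(
    {"meta": {"telescope": "stale"}},
    hdus,
    linked_keywords={("PRIMARY", "TELESCOP"): "telescope"},
    linked_hdus={"SCI": "data"},
)
```

In this sketch the stale `telescope` value in the tree is replaced by the keyword value, while `MYKEY` and the `EXTRA` HDU end up under `extra_fits`.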

When `skip_fits_update` is `True`, stdatamodels will attempt to skip steps 2, 3, and 4 above and perform only step 1 (linking HDU data defined in the schema). However, steps 2, 3, and 4 will still occur if any of the following is true:

- No ASDF extension is found.
- The model type recorded in the FITS file doesn't match the type of the model being opened.
- A hash computed from the FITS headers (excluding the ASDF extension) does not match the hash stored in `_fits_hash` on write.

See `_verify_skip_fits_update` for details.
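
The fallback conditions can be summarized in a small predicate. This is a minimal sketch assuming the caller has already extracted the relevant values from the file; the function name and parameters are hypothetical and do not mirror the signature of `_verify_skip_fits_update`.

```python
def can_skip_fits_update(
    has_asdf_extension,
    stored_model_type,
    requested_model_type,
    stored_hash,
    computed_hash,
):
    """Return True only if none of the three fallback conditions applies.

    Hypothetical helper: all arguments are assumed to be gathered by the caller.
    """
    if not has_asdf_extension:
        # Without an ASDF extension there is no tree to trust.
        return False
    if stored_model_type != requested_model_type:
        # The file was written as a different model type.
        return False
    if stored_hash != computed_hash:
        # The headers changed since the file was written.
        return False
    return True


ok = can_skip_fits_update(True, "ImageModel", "ImageModel", "abc", "abc")
wrong_type = can_skip_fits_update(True, "ImageModel", "CubeModel", "abc", "abc")
```

Any `False` result here corresponds to steps 2, 3, and 4 being performed even though `skip_fits_update` was requested.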

When `None`, the value is read from the `SKIP_FITS_UPDATE` environment variable (defaulting to `False` if it is not set).
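
The three-state resolution could look roughly like the following. This is a sketch under assumptions: the helper name is hypothetical, and the set of strings treated as true is a guess, since the issue does not say how the environment variable is parsed.

```python
import os


def resolve_skip_fits_update(skip_fits_update=None, environ=None):
    """Resolve the tri-state setting to a bool (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    if skip_fits_update is not None:
        # An explicit True/False always wins over the environment.
        return bool(skip_fits_update)
    raw = environ.get("SKIP_FITS_UPDATE")
    if raw is None:
        # Unset environment variable: default to False.
        return False
    # Assumption: these spellings count as "true"; the real parsing may differ.
    return raw.strip().lower() in {"1", "true", "t", "yes", "y"}


from_env = resolve_skip_fits_update(None, {"SKIP_FITS_UPDATE": "true"})
explicit = resolve_skip_fits_update(False, {"SKIP_FITS_UPDATE": "true"})
```

Passing `environ` explicitly keeps the sketch easy to exercise without touching the process environment.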

`skip_fits_update` is unused in jwst, as is `SKIP_FITS_UPDATE`.

Furthermore, the above behavior has some issues.

First, the use of `extra_fits` requires that any data not linked to the schema be duplicated in both the FITS headers/HDUs and the ASDF extension on write. This is required so that on read, if `skip_fits_update` is `True`, the values in `extra_fits` will still be readable. This is not a big issue for keywords, but an extra HDU (with table or image data) will have its data saved twice.

Second, the computed hash uses only the header values (not the HDU data). This means that changes to an HDU not linked to the schema, if made outside of stdatamodels, will be ignored when `skip_fits_update` is `True`, provided the hash of the headers still matches (which seems unlikely but possible).
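
The blind spot can be illustrated with a header-only hash over toy HDUs. This is a sketch, not the stdatamodels hashing code: the use of SHA-256, the serialization of header cards, and the dict layout are all assumptions made for the example.

```python
import hashlib


def header_only_hash(hdus, skip=("ASDF",)):
    """Hash header keywords only, ignoring HDU data (hypothetical helper)."""
    digest = hashlib.sha256()
    for name, hdu in hdus.items():
        if name in skip:
            # The ASDF extension itself is excluded from the hash.
            continue
        for keyword, value in hdu["header"].items():
            digest.update(f"{name}:{keyword}={value!r}\n".encode())
    return digest.hexdigest()


original = {
    "PRIMARY": {"header": {"TELESCOP": "JWST"}, "data": None},
    "EXTRA": {"header": {"NAXIS1": 2}, "data": [5, 6]},
}
# Same headers, but the unlinked HDU's data was edited by an external tool.
data_edited = {
    "PRIMARY": {"header": {"TELESCOP": "JWST"}, "data": None},
    "EXTRA": {"header": {"NAXIS1": 2}, "data": [7, 8]},
}
# A header edit, by contrast, changes the hash.
header_edited = {
    "PRIMARY": {"header": {"TELESCOP": "HST"}, "data": None},
    "EXTRA": {"header": {"NAXIS1": 2}, "data": [5, 6]},
}

data_change_detected = header_only_hash(original) != header_only_hash(data_edited)
header_change_detected = header_only_hash(original) != header_only_hash(header_edited)
```

Here the data edit goes undetected because the headers are identical, so a reader trusting the hash would keep the stale copy of the data stored in `extra_fits`.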

braingram mentioned this issue Mar 1, 2024

deprecate skip_fits_update #270

Merged

braingram mentioned this issue Mar 18, 2024

safely convert FITS_rec for non-schema data #268

Merged
