remove skip_fits_update to reduce data duplication and risk of data modification loss #271

Open
braingram opened this issue Mar 1, 2024 · 0 comments

Comments

@braingram
Collaborator

skip_fits_update determines if, when a file is opened, the fits headers are read to reconstruct the ASDF tree (including extra_fits).

This setting can be False, True, or None.

When False, the ASDF tree (read from the ASDF extension) will be updated on file read as follows:

  1. fits hdus linked to the tree via the schema will have their data read from the hdu (this always occurs)
  2. fits keywords linked to the tree via the schema will be read and the tree updated to reflect the data in the keywords
  3. fits hdus not linked to the tree will have their data assigned to the extra_fits portion of the ASDF tree
  4. fits keywords not linked to the tree will have their values recorded in extra_fits
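Steps 2 and 4 amount to partitioning header keywords by whether the schema links them to a location in the tree. The following is a minimal sketch using plain dictionaries, not the stdatamodels implementation. The function name `update_tree_from_header`, the `schema_links` mapping, and the simplified `extra_fits` layout are all hypothetical and only illustrate the idea.

```python
def update_tree_from_header(tree, header, schema_links):
    """Toy version of steps 2 and 4 for a single header.

    tree         -- nested dict standing in for the ASDF tree
    header       -- dict of FITS keyword -> value
    schema_links -- dict of FITS keyword -> dotted tree path (hypothetical)
    """
    # Simplified stand-in for the extra_fits portion of the tree.
    extra = tree.setdefault("extra_fits", {}).setdefault("PRIMARY", {})
    extra_header = extra.setdefault("header", [])
    for keyword, value in header.items():
        path = schema_links.get(keyword)
        if path is None:
            # Step 4: keyword not linked by the schema, record it in extra_fits.
            extra_header.append([keyword, value])
            continue
        # Step 2: keyword linked by the schema, overwrite the tree value.
        node = tree
        *parents, leaf = path.split(".")
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return tree


tree = {"meta": {"telescope": "stale value"}}
header = {"TELESCOP": "JWST", "MYKEY": 42}
links = {"TELESCOP": "meta.telescope"}
update_tree_from_header(tree, header, links)
```

In this sketch the linked keyword replaces the value read from the ASDF extension, while the unlinked keyword ends up only under `extra_fits`.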

When skip_fits_update is True, stdatamodels will (attempt to) skip steps 2, 3, and 4 above, performing only step 1 (linking hdu data defined in the schema). However, steps 2, 3, and 4 will still occur if:

  • No ASDF extension is found
  • The model type recorded in the fits file doesn't match the type of the model being used to open it
  • A computed hash of the fits header (except for the ASDF extension) does not match a hash stored in _fits_hash on write
    (see _verify_skip_fits_update for details).
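The fallback conditions above can be summarized as a single predicate. This is a sketch of the logic as described in this issue, not the code in `_verify_skip_fits_update`. The function `effective_skip` and all of its parameters are hypothetical names chosen for illustration.

```python
def effective_skip(requested, has_asdf_extension, file_model_type,
                   opening_model_type, stored_hash, computed_hash):
    """Return True only if steps 2, 3, and 4 would actually be skipped."""
    if not requested:
        # skip_fits_update is False: always do the full update.
        return False
    if not has_asdf_extension:
        # Nothing to load the tree from, so the headers must be read.
        return False
    if file_model_type != opening_model_type:
        # The file was written as a different model type.
        return False
    if stored_hash != computed_hash:
        # The headers changed since the file was written.
        return False
    return True


# A header edit after writing changes the computed hash and disables the skip.
skipped = effective_skip(True, True, "ImageModel", "ImageModel", "abc", "abc")
not_skipped = effective_skip(True, True, "ImageModel", "ImageModel", "abc", "xyz")
```

Every check has to pass for the skip to take effect, which is why the setting is described as an attempt rather than a guarantee.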

When None, the value will be read from the SKIP_FITS_UPDATE environment variable (defaulting to False if the variable is not set).
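The three-valued setting could be resolved roughly as follows. This is a sketch only: `resolve_skip_fits_update` is a hypothetical helper, and the set of strings treated as true is an assumption made for illustration, not the parsing rule stdatamodels applies.

```python
import os


def resolve_skip_fits_update(value, environ=None):
    """Resolve a True/False/None setting to a bool."""
    environ = os.environ if environ is None else environ
    if value is not None:
        # An explicit True or False always wins over the environment.
        return bool(value)
    raw = environ.get("SKIP_FITS_UPDATE")
    if raw is None:
        # Variable unset: default to False (do the full update).
        return False
    # Assumption: which strings count as "true" is illustrative only.
    return raw.strip().lower() in {"1", "true", "t", "yes", "y"}


from_env = resolve_skip_fits_update(None, {"SKIP_FITS_UPDATE": "true"})
explicit = resolve_skip_fits_update(False, {"SKIP_FITS_UPDATE": "true"})
unset = resolve_skip_fits_update(None, {})
```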

skip_fits_update is unused in jwst, as is SKIP_FITS_UPDATE.

Furthermore, the above behavior has some issues.

First, the use of extra_fits requires that any data not linked to the schema be duplicated on write, once in the fits headers/hdus and once in the ASDF extension. This is required so that on read, if skip_fits_update is True, the values in extra_fits will still be readable. This is not a big issue for keywords, but an extra hdu (with table or image data) will have its data saved twice.

Second, the computed hash uses only the header values (not the hdu data). This means that if an hdu not linked to the schema is modified outside of stdatamodels, the modification will be ignored when skip_fits_update is True, provided the hash of the headers still matches (which seems unlikely but possible).
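The second issue can be illustrated with a toy header-only hash. The example below models hdus as plain dictionaries and uses SHA-256 over keyword/value pairs. Both choices are assumptions made for illustration, since this issue only states that the hash covers header values and excludes the ASDF extension. `header_hash` is a hypothetical function, not the stdatamodels one.

```python
import hashlib


def header_hash(hdus):
    """Hash header keywords and values only, skipping the ASDF extension."""
    digest = hashlib.sha256()
    for hdu in hdus:
        if hdu["name"] == "ASDF":
            continue
        for keyword, value in hdu["header"].items():
            digest.update(f"{keyword}={value!r};".encode())
    return digest.hexdigest()


original = [
    {"name": "PRIMARY", "header": {"TELESCOP": "JWST"}, "data": None},
    {"name": "EXTRA", "header": {"NAXIS": 1, "NAXIS1": 3}, "data": [1, 2, 3]},
]
# Change the data of the unlinked hdu without touching its header.
modified = list(original)
modified[1] = {**original[1], "data": [9, 9, 9]}

same_hash = header_hash(original) == header_hash(modified)
```

Here the data edit keeps the same shape, so no header keyword changes and `same_hash` is true. A reader that trusts the hash would keep serving the stale copy stored in the ASDF extension.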
