Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adapting CutoutFactory to account for error-less TICA Cubes #88

Merged
merged 18 commits into from
Aug 8, 2023

Conversation

jaymedina
Copy link
Collaborator

@jaymedina jaymedina commented Jul 26, 2023

Summary

This PR updates CutoutFactory to account for the removed error array in TicaCubeFactory (work that was done in #87). It involves turning the product kwarg from cube_cut() into a class attribute so that it doesn't have to be fed into every method as an argument before being called. It also involves using this new self.product class attribute to make conditionals that make the code dynamic and changes the logic depending on whether the product for the cutouts is "TICA" or "SPOC". Unit test changes are also made where appropriate.


Task List


Screenshots

@jaymedina jaymedina added the enhancement New feature or request label Jul 26, 2023
@jaymedina jaymedina self-assigned this Jul 26, 2023
@jaymedina jaymedina marked this pull request as draft July 26, 2023 14:02
@codecov
Copy link

codecov bot commented Jul 26, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.05% 🎉

Comparison is base (f0b4c45) 94.40% compared to head (367da13) 94.45%.
Report is 5 commits behind head on main.

❗ Current head 367da13 differs from pull request most recent head fef425a. Consider uploading reports for the commit fef425a to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #88      +/-   ##
==========================================
+ Coverage   94.40%   94.45%   +0.05%     
==========================================
  Files           8        8              
  Lines        1429     1443      +14     
==========================================
+ Hits         1349     1363      +14     
  Misses         80       80              
Files Changed Coverage Δ
astrocut/cube_cut.py 98.97% <100.00%> (+0.03%) ⬆️
astrocut/cutouts.py 88.84% <100.00%> (ø)
astrocut/make_cube.py 94.15% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@falkben falkben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tesscut code will need to change for this to work

astroquery? anything else?

astrocut/cube_cut.py Show resolved Hide resolved
astrocut/cube_cut.py Outdated Show resolved Hide resolved
astrocut/cube_cut.py Outdated Show resolved Hide resolved
astrocut/cube_cut.py Outdated Show resolved Hide resolved
astrocut/cube_cut.py Outdated Show resolved Hide resolved
astrocut/cube_cut.py Outdated Show resolved Hide resolved
astrocut/make_cube.py Outdated Show resolved Hide resolved
@jaymedina
Copy link
Collaborator Author

jaymedina commented Jul 28, 2023

tesscut code will need to change for this to work

astroquery? anything else?

astroquery.mast.Tesscut wraps around the TESSCut API, so once we redeploy TESSCut API, we should be good to go (I will keep an eye on unit tests). We also have to ensure that this notebook runs correctly, too. More detail is in the Jira ticket for this Epic.

Side note: @hmlewis-astro will be doing a science review of this PR (cutout image/WCS inspection, etc) next sprint, so I'd like to hold off on merging until I get her approval. Thanks!

astrocut/cube_cut.py Outdated Show resolved Hide resolved
astrocut/tests/test_cube_cut.py Show resolved Hide resolved
@falkben
Copy link
Member

falkben commented Jul 31, 2023

There's some flake8 failures in CI -- look relatively trivial to fix: https://github.com/spacetelescope/astrocut/actions/runs/5716882528/job/15489424621?pr=88

would be good to have CI passing. I didn't look at the "allowed failure" at all -- that can be ignored for now, i think.

@jaymedina
Copy link
Collaborator Author

There's some flake8 failures in CI -- look relatively trivial to fix: https://github.com/spacetelescope/astrocut/actions/runs/5716882528/job/15489424621?pr=88

Interesting - I haven't seen that codestyle error before. I'll update with a new commit.

falkben
falkben previously approved these changes Jul 31, 2023
Copy link
Member

@falkben falkben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regarding the CI failure for python 3.11 and installing everything from source, I think we might need an additional step to install openblas: libblas-dev

this can be done in a different PR, though. Once this is merged, I don't mind taking a look.

@jaymedina jaymedina dismissed falkben’s stale review July 31, 2023 17:27

The merge-base changed after approval.

falkben
falkben previously approved these changes Jul 31, 2023
@jaymedina jaymedina dismissed falkben’s stale review July 31, 2023 17:48

The merge-base changed after approval.

@jaymedina
Copy link
Collaborator Author

Before we merge this into main:

I demo'ed the work here earlier this week and got a suggestion from @scfleming to keep the FLUX_ERR cutouts in TICA for the sake of making the code easier to integrate into 3rd party software like Lightkurve and user scripts. I believe this would involve changing the logic that starts here in this branch so that an empty array (empty_arr) gets filled in for each row of this column when the product is TICA, and also updating the unit tests accordingly (we will check TICA for the existence of FLUX_ERR again, and ensure that they are empty arrays). Let me know what you all think of this logic before I make a commit @falkben @hmlewis-astro

@falkben
Copy link
Member

falkben commented Aug 2, 2023

I'd be happy with that change @jaymedina -- it will make this cutout code easier to maintain, as well.

It will make the cutouts the same size as the SPOC cutouts, though, (not any smaller) but I don't think that was the intention with any of this.

@jaymedina
Copy link
Collaborator Author

It will make the cutouts the same size as the SPOC cutouts, though, (not any smaller) but I don't think that was the intention with any of this.

Updated accordingly. Yep, the cutouts themselves are small, so it's not as big of a deal for them to have the error array. The important thing was to reduce the size of the cubes by removing the error array there, and to adapt the CutoutFactory code to take this change into account. Thanks for the quick response!

@@ -729,10 +729,10 @@ def _build_tpf(self, cube_fits, img_cube, uncert_cube, cutout_wcs_dict, aperture

# Adding flux and flux_err (data we actually have!)
pixel_unit = 'e-/s' if self.product == 'SPOC' else 'e-'
flux_err_array = uncert_cube if self.product == 'SPOC' else empty_arr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm a little uncertain about this. numpy array assignment is done by reference. i'm just worried that because we are using empty_arry here to assign into cadence_arry, or appending to columns, that something might change inside of empty_arr and then would be applied to flux_err_arry at this point.

e.g. to illustrate what i'm worried about

>>> import numpy as np
>>> empty_arr = np.zeros((2,3))
>>> other_arr = empty_arr[:,1]
>>> other_arr[0] = 1
>>> other_arr
array([1., 0.])
>>> empty_arr
array([[0., 1., 0.],
       [0., 0., 0.]])

maybe on this line, we should assign flux_err_arry to just np.zeros(img_cube.shape) to be sure it's empty?

maybe we're not changing cadence_array so this isn't an issue?

Copy link
Member

@falkben falkben Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another thing i'm wondering -- should the error be zero? or should it be np.NaN?

Copy link
Collaborator

@scfleming scfleming Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this too. I lean towards 0. for two reeasons at the moment:

  • the TICA cubes had the dummy values sent to 0. I think, so this maintains that part of it
  • adding NaN may result in issues for non-NaN-safe calculations, e.g., produce values of NaN for computation that included the flux uncertainty. Of course 0. could also introduce problems depending on what the user is doing with it, it's not obious to me which is better if someone isn't aware that these are essentially undefined but use the flux uncertainties in calcuations within their code.

E.g., a flux uncertainty of 0 wouldn't impact a propagation of error calculation, but a value of NaN would result in a NaN result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in that case wouldn't it be better to nullify the results than to silently make the calculation with an incorrect value for the error?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that was my old philosophy: use NaN's as a forced flag to indicate to an unaware user that something is not defined, and you either need to mask or add nan-safe options to your function calls. these days I wonder how many functions filter out NaN values automatically or not. Do you have thoughts on this @hmlewis-astro @jaymedina ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of that. I'll go ahead and make a commit unless there are other suggestions!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not in favor of using any actual number here -- the error/uncertainty is unknown. using actual numbers is likely just going to result in confusion. if we're returning an array to conform with some expectation that the cutout has a certain number of extensions or whatever, but the values are not to be used -- well, this is precisely what np.NaN is for. we might also consider writing something in the header of the fits file about it?

Copy link
Collaborator Author

@jaymedina jaymedina Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not opposed to changing the vals to np.nans. One thing to consider tho, is that np.zero arrays are used for all the other columns which we don't have information about, for example CADENCEN0 which is the column that was referenced in the start of this thread (when making references to cadence_array). So if 1 column uses np.nan to fill in values we don't know about, and 1 column uses np.zero, that will be even more confusing.

What we can do for now is stick with the traditional np.zero array (I can reassign flux_err_array to np.zeros(img_cube.shape) to ensure its a zero array as suggested in the first comment of the thread), and once we come to can agreement on what the fill vals should be, I'll open up a new PR to update the empty_arr for all relevant columns. Let me know if this works with you all and I'll make a Jira ticket.

As for timeline, I'd like to get this PR merged in by end of this week or beginning of next week so I have enough time to update and redeploy the TESSCut API before the end of this sprint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this solution works for now! Once you make the changes to reassign flux_err_array following Ben's original comment, I think this is ready to merge

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks @hmlewis-astro !

@falkben
Copy link
Member

falkben commented Aug 2, 2023

Should the title of this PR change?

@jaymedina jaymedina changed the title Removing TICA Error Array from CutoutFactory Adapting CutoutFactory to account for error-less TICA Cubes Aug 2, 2023
@hmlewis-astro
Copy link
Contributor

I agree, keeping the FLUX_ERR in the TICA cutouts is fine, and will make things easier for users that want to run the same/similar code on cutouts made from SPOC and TICA cubes.

falkben
falkben previously approved these changes Aug 4, 2023
getattr(cube[1], cube_data_prop), threads=threads, verbose=verbose
)
img_cutout, uncert_cutout, aperture = self._get_cutout(getattr(cube[1], cube_data_prop), threads=threads,
verbose=verbose)
Copy link
Member

@falkben falkben Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably align on what whitespace formatting we're using on this project to avoid unnecessary changes back and forth...

I use darker which just applies black selectively to my set of changes (instead of on the entire file).

@jaymedina jaymedina dismissed falkben’s stale review August 4, 2023 19:56

The merge-base changed after approval.

falkben
falkben previously approved these changes Aug 4, 2023
@jaymedina jaymedina dismissed falkben’s stale review August 4, 2023 20:03

The merge-base changed after approval.

@jaymedina jaymedina dismissed falkben’s stale review August 7, 2023 14:16

The merge-base changed after approval.

hmlewis-astro
hmlewis-astro previously approved these changes Aug 8, 2023
@jaymedina jaymedina dismissed hmlewis-astro’s stale review August 8, 2023 14:14

The merge-base changed after approval.

hmlewis-astro
hmlewis-astro previously approved these changes Aug 8, 2023
@jaymedina jaymedina dismissed hmlewis-astro’s stale review August 8, 2023 14:40

The merge-base changed after approval.

@jaymedina jaymedina merged commit b02f1d4 into main Aug 8, 2023
5 of 6 checks passed
@jaymedina jaymedina deleted the tica-cutout-factory branch August 8, 2023 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants