Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate how to tackle cropping timestamps in raw data #369

Closed
yousefmoazzam opened this issue Jun 14, 2024 · 10 comments
Closed

Investigate how to tackle cropping timestamps in raw data #369

yousefmoazzam opened this issue Jun 14, 2024 · 10 comments
Labels
enhancement New feature or request question Further information is requested

Comments

@yousefmoazzam
Copy link
Collaborator

Some beamlines collect data such that there is a timestamp in the top left corner of projections. Leaving in this information can cause the processing of methods to behave unexpectedly (for example, incorrect normalisation performed), so it is best to crop out this region by cropping the detector_y dimension in some way.

This issue serves as a discussion place for how to go about doing this in a way that users would find easy.

@yousefmoazzam yousefmoazzam added enhancement New feature or request question Further information is requested labels Jun 14, 2024
@yousefmoazzam
Copy link
Collaborator Author

One simple idea mentioned in a Slack thread by myself:

I'd lean towards keeping it simple for now: have a loader template for data that can be copy+pasted for cropping out such a timestamp, or have the users enter it themselves. Anything being done "behind the scenes" feels like it's going to require some knowledge of which data is given to httomo (ie, if the data has the timestamp or not), and that sounds not very nice to be checking stuff like "is this I12 data -> it has timestamp, so crop automatically"

@dkazanc
Copy link
Collaborator

dkazanc commented Jun 14, 2024

The timestamps contain either zeros/ones or the values close to the limit of the data type (65535). In normalisation we seem to avoid division by zero, but then remove_nans should be always enabled to remove the consequences of log(0) or log(-value). Currently this is set to false, may be we should change it to true by default?

We might also want to alert the user that the data contains "too many" zeroes of values close to the precision limit. May be counting those elements in raw data won't be a big burden?

@dkazanc
Copy link
Collaborator

dkazanc commented Jun 14, 2024

quick follow-up. May be we shouldn't terminate a job but warn the user of the percentage of those values in the raw data. I'd say the levels higher than 5% should be alarming.

@jessicavers
Copy link
Collaborator

You may have already discussed this further, or I may be unaware of other details, so you may not need to reply. Relating to your comment about a loader template:

have a loader template for data that can be copy+pasted for cropping out such a timestamp, or have the users enter it themselves

Would you mean using the standard loader with an alternative preview value? Or standard loader with an additional parameter to specify the exclusion on time stamp? or a slightly different loader?

If it was the extra parameter suggestion, would it be something like this:

'crop_timestamp'
This parameter will crop the 10 data points collected across the whole of one axis. By default, the axis chosen is [.,*,.]

Relating to Daniil's comment:

We might also want to alert the user that the data contains "too many" zeroes of values close to the precision limit.

It does sound like a piece of feedback the user would want to be aware of if they weren't already.

@yousefmoazzam
Copy link
Collaborator Author

You may have already discussed this further, or I may be unaware of other details, so you may not need to reply. Relating to your comment about a loader template:

have a loader template for data that can be copy+pasted for cropping out such a timestamp, or have the users enter it themselves

Would you mean using the standard loader with an alternative preview value? Or standard loader with an additional parameter to specify the exclusion on time stamp? or a slightly different loader?

If it was the extra parameter suggestion, would it be something like this:

'crop_timestamp' This parameter will crop the 10 data points collected across the whole of one axis. By default, the axis chosen is [.,*,.]

@jessicavers I was meaning using the standard loader with a value for the preview parameter set to the desired value for a specific beamline. For example, for I12 data that needs the timestamp cropped out, have a loader template with the standard loader that has a preview value filled in which we know crops the timestamp out, like

preview:
  detector_y:
    start: 30

The suggestion was trying to provide the "simplest" solution in that it doesn't do anything clever, nor does it require a new feature or code changes (like an additional parameter or a different loader would). Simply a pre-written loader config for a beamline that can be copy+pasted.

@dkazanc
Copy link
Collaborator

dkazanc commented Jun 24, 2024

I think we shouldn't really do anything in addition from the loader point of view. We just need to make sure that the pipeline won't be exited based on some nan's / inf's in some pixels. I think previously it was due to rescale_to_int raising an error, which is fixed now.
The warning to the users on some issues like that is useful though but requires a different implementation. I'll create a different issue. @yousefmoazzam you created a similar issue somewhere, sorry cannot find it.

@yousefmoazzam
Copy link
Collaborator Author

yousefmoazzam commented Jun 24, 2024

The warning to the users on some issues like that is useful though but requires a different implementation. I'll create a different issue. @yousefmoazzam you created a similar issue somewhere, sorry cannot find it.

Possibly #267, though I'm not 100% sure if that's the one you mean?

@jessicavers
Copy link
Collaborator

Thank you for clarifying A loader template, using the standard loader, with a specific preview value.

A further suggestion, maybe including a message/recommendation within the verbose comment for the preview parameter? Drawbacks would be that it adds extra text. I'm also not sure that many users would know that -vv exists or would notice.

Or the possibility of a warning message, "I12 users should be aware that there may be a timestamp present. If so, we recommend preview value: " - perhaps slightly long.

@jessicavers
Copy link
Collaborator

jessicavers commented Jun 25, 2024

Okay, yes, that sounds good

@dkazanc
Copy link
Collaborator

dkazanc commented Jul 5, 2024

closing this as not planned for now and #372 should help partially. But, as I said earlier, the presence of nans/infs shouldn't break the run. If you remember in Savu the zeros were even intentionally converted to nans in the recon so that Dawn could do better scaling when visualising the images. Not that I like it...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants