Add parameter sweep functionality #380
3195c7f to ee5d7af
For an example of how the logging looks so far (I've only added the basics), with the following pipeline (sweeps over 10

    - method: standard_tomo
      module_path: httomo.data.hdf.loaders
      parameters:
        name: tomo
        data_path: entry1/tomo_entry/data/data
        image_key_path: entry1/tomo_entry/instrument/detector/image_key
        rotation_angles:
          data_path: /entry1/tomo_entry/data/rotation_angle
        dimension: 1
        pad: 0
        preview:
          detector_y:
            start: 50
            stop: 57
    - method: normalize
      module_path: httomolibgpu.prep.normalize
      parameters:
        cutoff: 10.0
        minus_log: true
        nonnegativity: false
        remove_nans: false
    - method: paganin_filter_tomopy
      module_path: httomolibgpu.prep.phase
      parameters:
        pixel_size: 0.0001
        dist: 50.0
        energy: 53.0
        alpha: 0.001
    - method: remove_all_stripe
      module_path: httomolibgpu.prep.stripe
      parameters:
        snr: 3.0
        la_size: 61
        sm_size: 21
        dim: 1
    - method: FBP
      module_path: httomolibgpu.recon.algorithm
      save_result: False
      parameters:
        center: !SweepRange
          start: 10
          stop: 20
          step: 1
        filter_freq_cutoff: 0.6
        recon_size: null
        recon_mask_radius: null
    - method: save_to_images
      module_path: httomolib.misc.images
      parameters:
        subfolder_name: images
        axis: 1
        file_format: tif
        bits: 8
        perc_range_min: 0.0
        perc_range_max: 100.0
        jpeg_quality: 95
        offset: 0
        asynchronous: true

the following terminal output is produced:
Feel free to make any suggestions on tweaks/additions for improvement 🙂
8c6f82a to e4cf022
Good idea re factoring out the various block interfaces into smaller protocols, etc.
Is there any chance to re-use more of `TaskRunner` in `ParameterSweepRunner` somehow? Although it's not too much, there is a good bit of overlap. What do you think?
Good to go for me.
Implements the simple property getters.
The motivation behind this change is that it would be useful to be able to use method wrapper objects (implementors of `MethodWrapper`) to execute methods in both:
- the high throughput runner `TaskRunner`
- the parameter sweep runner `ParamSweepRunner`

The `DataSetBlock` class contains code that is mostly compatible with a parameter sweep run. However, one incompatibility is the `DataSetBlock.data.setter()` method, which places a constraint on the data involving the slicing dimension of the data. In parameter sweep runs there is only a single process being run, and so the concept of slicing dimension isn't relevant.

Given that `DataSetBlock` contains a fair amount of logic that *is* usable in a parameter sweep run, after various iterations of testing out what could work well, the logic in `DataSetBlock` that is usable across both high throughput runs and parameter sweep runs has been extracted into a separate `BaseBlock` class.

A high-level overview of the organisation of block-related functionality now is:
- the various functionalities needed for a block type to be processable by implementors of `MethodWrapper` have been organised into three separate protocols: `BlockData`, `BlockTransfer`, and `BlockIndexing`
- `BaseBlock` provides "typical implementations" for `BlockData` and `BlockTransfer`, which are reused by `DataSetBlock` via inheritance
- thus, `BaseBlock` contains the code for "typical" implementations of methods that are common to blocks across both high throughput runs and parameter sweep runs
- `DataSetBlock` contains only the code specific to processing blocks in high throughput runs
- in the future, the "typical implementations" contained in `BaseBlock` can be reused by a new block type that will be used in parameter sweep runs
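As a rough illustration of the organisation described above, here is a minimal sketch using `typing.Protocol`. Only the names `BlockData`, `BlockTransfer`, `BlockIndexing`, `BaseBlock` and `DataSetBlock` come from the commit message; the members (`data`, `to_gpu`, `to_cpu`, `global_index`), the list-of-rows data type, and the "length along the first dimension must not change" constraint are all assumptions made for the example, not the actual httomo definitions.

```python
from typing import List, Protocol, Tuple, runtime_checkable

Rows = List[List[float]]  # toy stand-in for an array: a list of rows


@runtime_checkable
class BlockData(Protocol):
    """Access to the data held by a block (hypothetical member)."""
    @property
    def data(self) -> Rows: ...


@runtime_checkable
class BlockTransfer(Protocol):
    """Moving a block between CPU and GPU (hypothetical members)."""
    def to_gpu(self) -> None: ...
    def to_cpu(self) -> None: ...


@runtime_checkable
class BlockIndexing(Protocol):
    """Where the block sits within the full dataset (hypothetical member)."""
    @property
    def global_index(self) -> Tuple[int, int]: ...


class BaseBlock:
    """'Typical' implementations shared by both kinds of run."""

    def __init__(self, data: Rows) -> None:
        self._data = data
        self._on_gpu = False

    @property
    def data(self) -> Rows:
        return self._data

    @data.setter
    def data(self, new_data: Rows) -> None:
        # No slicing-dimension constraint here, so a sweep block could reuse it.
        self._data = new_data

    def to_gpu(self) -> None:
        self._on_gpu = True  # placeholder for a real host-to-device copy

    def to_cpu(self) -> None:
        self._on_gpu = False  # placeholder for a real device-to-host copy


class DataSetBlock(BaseBlock):
    """Adds the parts that only make sense in high throughput runs."""

    def __init__(self, data: Rows, global_index: Tuple[int, int] = (0, 0)) -> None:
        super().__init__(data)
        self._global_index = global_index

    @property
    def global_index(self) -> Tuple[int, int]:
        return self._global_index

    @BaseBlock.data.setter
    def data(self, new_data: Rows) -> None:
        # Assumed constraint: the length along the slicing dimension
        # (dimension 0 in this toy example) must stay the same.
        if len(new_data) != len(self._data):
            raise ValueError("length along the slicing dimension changed")
        BaseBlock.data.fset(self, new_data)
```

With this layout, a `BaseBlock` instance satisfies `BlockData` and `BlockTransfer` but not `BlockIndexing`, while a `DataSetBlock` satisfies all three and is the only one that rejects data whose first dimension changes.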
… + no of sweep

Before, the array to hold all sweep results was inferred from the `single_shape` value that was given when the writer object was first created. This had the implicit assumption that the shape of a single sweep result would be known at the time the writer object is created.

This assumption holds when the method being executed in the sweep doesn't change the shape of the data during processing (most methods adhere to this). However, this assumption doesn't hold when the method executed in the sweep does change the shape of the data during processing (for example, reconstruction methods).

Therefore, the writer now infers the shape of a single sweep result when the first block to be written has been given. This can then be used to infer the shape of the array to hold all sweep results for both cases (when the method being executed in the sweep does and doesn't change the shape of the data).
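The deferred allocation described above could look roughly like the sketch below. The class name `LazySweepWriter`, its method names, and the choice to place sweep results next to each other along one axis (`concat_dim`) are assumptions for illustration; the real `ParamSweepWriter` may store results differently.

```python
import numpy as np


class LazySweepWriter:
    """Hypothetical writer that allocates storage once the first block arrives."""

    def __init__(self, no_of_sweeps: int, concat_dim: int = 1) -> None:
        self.no_of_sweeps = no_of_sweeps
        self.concat_dim = concat_dim  # assumed axis along which results are laid out
        self.single_shape = None      # unknown until the first block is written
        self.results = None
        self._written = 0

    def write_sweep_result(self, block: np.ndarray) -> None:
        if self.results is None:
            # Infer the shape of one result from the first block, then the
            # shape of the array that will hold all sweep results.
            self.single_shape = block.shape
            total_shape = list(block.shape)
            total_shape[self.concat_dim] *= self.no_of_sweeps
            self.results = np.empty(tuple(total_shape), dtype=block.dtype)
        if block.shape != self.single_shape:
            raise ValueError("all sweep results must have the same shape")
        if self._written >= self.no_of_sweeps:
            raise ValueError("more results written than sweep values")
        length = self.single_shape[self.concat_dim]
        index = [slice(None)] * block.ndim
        index[self.concat_dim] = slice(
            self._written * length, (self._written + 1) * length
        )
        self.results[tuple(index)] = block
        self._written += 1


# A method that changes the data shape (as a reconstruction would) needs no
# special handling: the writer never sees the input shape, only the output.
writer = LazySweepWriter(no_of_sweeps=3)
for value in range(3):
    writer.write_sweep_result(np.full((4, 1, 4), value, dtype=np.float32))
```

After the loop, `writer.results` has shape `(4, 3, 4)`, with one slice along axis 1 per sweep value.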
Instead of creating multiple copies of the same wrapper with different values for the parameter to sweep over, use a single wrapper and update the parameters before each execution to reflect the new sweep value to be used.
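A minimal sketch of the "one wrapper, updated before each execution" idea follows. `SimpleWrapper`, `update_params`, `execute`, `run_sweep` and the toy `shift` method are hypothetical stand-ins, not the `MethodWrapper` API.

```python
from typing import Any, Callable, Iterable, List


class SimpleWrapper:
    """Hypothetical stand-in for a method wrapper holding a method and its parameters."""

    def __init__(self, method: Callable[..., Any], **params: Any) -> None:
        self.method = method
        self.params = dict(params)

    def update_params(self, **new_params: Any) -> None:
        self.params.update(new_params)

    def execute(self, data: Any) -> Any:
        return self.method(data, **self.params)


def run_sweep(
    wrapper: SimpleWrapper, data: Any, param_name: str, values: Iterable[Any]
) -> List[Any]:
    """Execute the same wrapper once per sweep value, updating it in between."""
    results = []
    for value in values:
        wrapper.update_params(**{param_name: value})
        results.append(wrapper.execute(data))
    return results


def shift(data: List[float], center: float, scale: float = 1.0) -> List[float]:
    """Toy method: subtract `center` from every element, then scale."""
    return [(x - center) * scale for x in data]


wrapper = SimpleWrapper(shift, center=0.0, scale=2.0)
sweep_results = run_sweep(wrapper, [10.0, 11.0], "center", [10.0, 11.0, 12.0])
```

Only the swept parameter changes between executions; the other parameters (here `scale`) are set once when the wrapper is created.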
Instead of requiring that the caller of the sweep runner creates the `Stages` object, have the sweep runner generate that itself from the pipeline object.
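One way the runner could derive the stages from the pipeline is sketched below. The `Stages` name comes from this PR, but its fields, the `SweepValues` marker type, the dict-based pipeline representation, and the rule that exactly one method carries the sweep are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Method = Dict[str, Any]  # toy representation of one pipeline method


class SweepValues(tuple):
    """Hypothetical marker type for a parameter that is swept over."""


@dataclass
class Stages:
    before_sweep: List[Method]
    sweep: List[Method]
    after_sweep: List[Method]


def has_sweep(method: Method) -> bool:
    """True if any parameter of the method holds sweep values."""
    return any(
        isinstance(value, SweepValues)
        for value in method.get("parameters", {}).values()
    )


def build_stages(pipeline: List[Method]) -> Stages:
    """Split the pipeline around the single method that carries the sweep."""
    sweep_indices = [i for i, method in enumerate(pipeline) if has_sweep(method)]
    if len(sweep_indices) != 1:
        raise ValueError("expected exactly one method with a parameter sweep")
    i = sweep_indices[0]
    return Stages(
        before_sweep=pipeline[:i],
        sweep=[pipeline[i]],
        after_sweep=pipeline[i + 1:],
    )


pipeline = [
    {"method": "normalize", "parameters": {"cutoff": 10.0}},
    {"method": "FBP", "parameters": {"center": SweepValues(range(10, 20))}},
    {"method": "save_to_images", "parameters": {"bits": 8}},
]
stages = build_stages(pipeline)
```

The caller then only hands over the pipeline, and the grouping stays an internal detail of the sweep runner.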
By default, the YAML loader used by the UI layer to load YAML pipeline files is now the modified version of `yaml.SafeLoader` that additionally handles the `!Sweep` and `!SweepRange` tags.
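For readers unfamiliar with custom YAML tags, a loader of this kind can be built with PyYAML roughly as follows. This is a sketch, not the loader added in this PR: the class name `SweepLoader`, the choice to represent sweeps as tuples, and treating `stop` as exclusive are assumptions, and `median_filter` / `kernel_size` are made-up names for the `!Sweep` example.

```python
import yaml


class SweepLoader(yaml.SafeLoader):
    """SafeLoader subclass; constructors added here don't affect yaml.SafeLoader."""


def construct_sweep(loader: yaml.SafeLoader, node: yaml.Node) -> tuple:
    # `!Sweep [a, b, c]`: an explicit list of values to sweep over.
    return tuple(loader.construct_sequence(node))


def construct_sweep_range(loader: yaml.SafeLoader, node: yaml.Node) -> tuple:
    # `!SweepRange` with start/stop/step keys. Integer values only, and
    # `stop` is treated as exclusive here (an assumption).
    spec = loader.construct_mapping(node)
    return tuple(range(spec["start"], spec["stop"], spec["step"]))


SweepLoader.add_constructor("!Sweep", construct_sweep)
SweepLoader.add_constructor("!SweepRange", construct_sweep_range)

PIPELINE_TEXT = """
- method: FBP
  parameters:
    center: !SweepRange
      start: 10
      stop: 20
      step: 1
- method: median_filter
  parameters:
    kernel_size: !Sweep [3, 5, 7]
"""

loaded = yaml.load(PIPELINE_TEXT, Loader=SweepLoader)
```

Because the loader subclasses `yaml.SafeLoader`, tags other than the two registered here are still rejected, as they would be with the plain safe loader.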
af0ff1b to 12a6c8c
Attempt to fix #362

Main changes:
- `ParamSweepWriter` + `ParamSweepReader` for writing + reading parameter sweep results using blocks
- `Stages` to group methods that are before the sweep, in the sweep, and after the sweep
- `ParamSweepRunner` to orchestrate the parameter sweep run
- `SideOutputManager` to separate side output logic from runner
- `DataSetBlock` and introduce three interfaces that describe the distinct sets of behaviour that a block needs to be processable by an implementor of `MethodWrapper` (see commit message of 4123373 for more info)
- `ParamSweepBlock` to hold data during a parameter sweep run
- `run` command to choose between executing high-throughput or param sweep run

Things left to do:
- `!Sweep` and `!SweepRange` YAML "tags"
- `!Sweep` and `!SweepRange` tags

Acceptance criteria checklist