Refactor hash_obj to handle nested arbitrary objects #329

Open
ShiveshM opened this issue Apr 6, 2017 · 1 comment

ShiveshM commented Apr 6, 2017

hash_obj should recurse into nested objects, similar to how it's done in the normQuant function.

Backstory:
When a Pipeline object is used as a Param value in a ParamSet, the ParamSet method values_hash fails to obtain a hash value. Inside values_hash, hash_obj is applied to a tuple of the ParamSet values, one of which is the Pipeline object. Currently, hash_obj handles tuples (and all other Sequences) by converting them to a string using pickle. The Pipeline object is not picklable, so the call fails.
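
For illustration, the failure mode described above can be reproduced outside PISA with a minimal sketch. Everything here is an assumption made for the example: `FakePipeline` is a hypothetical stand-in (the real Pipeline may be unpicklable for a different reason), and `naive_hash_obj` only mimics the "pickle the whole Sequence" behaviour; it is not the actual hash_obj code.

```python
import hashlib
import pickle
import threading


class FakePipeline:
    """Hypothetical stand-in for a Pipeline: holds something pickle rejects."""

    def __init__(self):
        self._lock = threading.Lock()  # lock objects cannot be pickled


def naive_hash_obj(obj):
    """Hash by pickling the whole object at once (assumed behaviour)."""
    return hashlib.md5(pickle.dumps(obj)).hexdigest()


# A tuple of "param values", one of which is the unpicklable object.
values = (1.0, "nufit", FakePipeline())

try:
    naive_hash_obj(values)
    failed = False
except (TypeError, pickle.PicklingError) as err:
    # One unpicklable element makes the whole tuple unhashable this way.
    failed = True
    print("hashing failed:", err)
```

The point of the sketch is that a single unpicklable element is enough to break hashing of the entire tuple, even though every other element would hash without trouble.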

@ShiveshM ShiveshM changed the title Refactor hash_obj to handle nested arbitrary objects Refactor hash_obj to handle nested arbitrary objects Apr 6, 2017
jllanfranchi pushed a commit that referenced this issue Apr 13, 2017
…329 (#339)

* Add hash property to Pipeline and implement a temporary fix to issue #329
* Respond to PR comments
jllanfranchi pushed a commit that referenced this issue Apr 25, 2017
…eio mkdir function (#342)

* Add hash property to Pipeline and implement a temporary fix to issue #329
* Make compare script work with MapSets and return the path in the fileio mkdir function
* Respond to PR comments, add ability to pass Map, Pipeline and DistributionMaker objects as input to the compare function
jllanfranchi pushed a commit that referenced this issue Jun 20, 2017
* get greco sample working with included flux-reweighted weights
* Add hashing to transformation of data
* add caching in unfolding stage for the creation of the initial histograms
* misc bugs
* reconfigure caching slightly
* revert cfx example script settings back to use the leesard sample
* cleanup roounfold
* roountils convenience script
* progress on adding eff corrections to unfolding stage
* allow loading in gen_lvl sample from sample.py and more progress of eff corr in unfolding stage
* make separate function for use of real data in roounfold.py and finish up eff implementation
* Add option in sample cfg to load only specific keys
* Fix bug in caching where it took 2 attempts at each stage for caching to kick in
* misc
* make unfolding with efficiency work for the greco sample
* Bug: Fix de-sync of the sample pipeline with the gen_lvl pipeline
The sample pipeline uses the output Data object of the generator level
pipeline, to work out efficiencies. The gen lvl pipeline is fed into the
sample pipeline. Previously this was done through a config file.

In scripts, when the params of the sample pipeline are dynamically
changed, the gen lvl needs to be kept in sync. Now the gen lvl Pipeline
object can also be fed into the sample config.

Issue with hashing came up. The Pipeline object cannot be pickled, so I
did some try/except acrobatics in hash.py as a fix. Probably a more solid
fix is to check if the object (or any object inside the Sequence)
already contains a hashed value in a more general way, then replace the
unpickleable object with this hash value.

* add keep_keys for greco sample
* add true_e_scale osc param to weight.py and cfx pipeline
* Fix invalid values bugs and others in weight.py
* add is_dir and is_valid_file in pisa fileio utils
* Make compare functionality callable from another script
* go over which params should be free for CFX analysis
* Implement memcaching so that it doesn't have to deepcopy every stage
* suffix comes after, not before u dope
* add noise
* bug in scaled flux systematics
* misc
* misc
* Implement alias feature in sample.py
* output rates in debug
* fix units for greco sample
* units of gen level sample
* Add ability to return the efficiency maps in roounfold.py
* change name of binning from unsmeared to unfolded
* begin discrete systematics stage
* Implement discrete systematics stage
* Make sure all tests work
* Add hash property to Pipeline and implement a temporary fix to issue #329
* Respond to PR comments
* Flavint uses properties now, fix in sample.py
* Move output_events param to instantiate at __init__
* update muon example cfg
* remove extraneous object file
* remove remnant from old commit
* remove remnant from old commit
* Turn on file caching in fit.py and fix bug in roounfold.py
* update sample.py to use the latest flavint class
* fix import bug in roounfold
* Add README for mc settings and respond to PR comments
* Add Madison locations to mc config files
* make set, get methods case insensitive in the Data object and raise error if applyCut is used due to bugs
jllanfranchi pushed a commit that referenced this issue Oct 17, 2017
* get greco sample working with included flux-reweighted weights
* Add hashing to transformation of data
* add caching in unfolding stage for the creation of the initial histograms
* misc bugs
* reconfigure caching slightly
* revert cfx example script settings back to use the leesard sample
* cleanup roounfold
* roountils convenience script
* progress on adding eff corrections to unfolding stage
* allow loading in gen_lvl sample from sample.py and more progress of eff corr in unfolding stage
* make separate function for use of real data in roounfold.py and finish up eff implementation
* Add option in sample cfg to load only specific keys
* Fix bug in caching where it took 2 attempts at each stage for caching to kick in
* misc
* make unfolding with efficiency work for the greco sample
* Bug: Fix de-sync of the sample pipeline with the gen_lvl pipeline
The sample pipeline uses the output Data object of the generator level
pipeline, to work out efficiencies. The gen lvl pipeline is fed into the
sample pipeline. Previously this was done through a config file.
In scripts, when the params of the sample pipeline are dynamically
changed, the gen lvl needs to be kept in sync. Now the gen lvl Pipeline
object can also be fed into the sample config.
Issue with hashing came up. The Pipeline object cannot be pickled, so I
did some try/except acrobatics in hash.py as a fix. Probably a more solid
fix is to check if the object (or any object inside the Sequence)
already contains a hashed value in a more general way, then replace the
unpickleable object with this hash value.
* add keep_keys for greco sample
* add true_e_scale osc param to weight.py and cfx pipeline
* Fix invalid values bugs and others in weight.py
* add is_dir and is_valid_file in pisa fileio utils
* Make compare functionality callable from another script
* go over which params should be free for CFX analysis
* Implement memcaching so that it doesn't have to deepcopy every stage
* suffix comes after, not before u dope
* add noise
* bug in scaled flux systematics
* misc
* misc
* Implement alias feature in sample.py
* output rates in debug
* fix units for greco sample
* units of gen level sample
* Add ability to return the efficiency maps in roounfold.py
* change name of binning from unsmeared to unfolded
* begin discrete systematics stage
* Implement discrete systematics stage
* Make sure all tests work
* Add hash property to Pipeline and implement a temporary fix to issue #329
* Respond to PR comments
* Flavint uses properties now, fix in sample.py
* Move output_events param to instantiate at __init__
* update muon example cfg
* remove extraneous object file
* remove remnant from old commit
* remove remnant from old commit
* Turn on file caching in fit.py and fix bug in roounfold.py
* update sample.py to use the latest flavint class
* fix import bug in roounfold
* Add README for mc settings and respond to PR comments
* Add Madison locations to mc config files
* Modify separator in CFX stages to reflect recent PISA updates
* Implement MCEq into PISA
* add some docs and pep8ify mceq
* bug with units in mceq
* Remove incorrect doc in mceq.py
* add doc links to default param options in mceq
jllanfranchi pushed a commit that referenced this issue Feb 13, 2019
…329 (#339)

jllanfranchi pushed a commit that referenced this issue Feb 13, 2019
…eio mkdir function (#342)

jllanfranchi pushed a commit that referenced this issue Feb 13, 2019
jllanfranchi pushed a commit that referenced this issue Feb 13, 2019
@LeanderFischer LeanderFischer added this to the PISA 4.2 milestone May 27, 2024
@LeanderFischer
Collaborator

This still seems to be the case, but I'm not sure how serious the problem is 🤔 If someone feels up to it, they could try implementing the refactor, apparently along the lines of the normQuant implementation.
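
As a starting point for anyone picking this up, here is a rough sketch of what a recursive approach could look like. It is not PISA code: `reduce_for_hashing`, `hash_obj_nested` and `FakePipeline` are hypothetical names, and the sketch assumes that an unpicklable object exposes a precomputed `hash` attribute (as the temporary fix added to Pipeline). Whether normQuant is structured this way is not confirmed here.

```python
import hashlib
import pickle
import threading
from collections.abc import Mapping, Sequence


class FakePipeline:
    """Hypothetical stand-in: unpicklable, but exposes its own `hash`."""

    def __init__(self, state_hash):
        self.hash = state_hash
        self._lock = threading.Lock()  # makes pickle.dumps(self) fail


def reduce_for_hashing(obj):
    """Return a picklable stand-in for `obj`, recursing into containers."""
    own_hash = getattr(obj, "hash", None)
    if own_hash is not None:
        # Substitute the object's precomputed hash for the object itself.
        return ("hashed", type(obj).__name__, own_hash)
    if isinstance(obj, (str, bytes)):
        return obj  # strings are Sequences; do not recurse into characters
    if isinstance(obj, Mapping):
        items = [(repr(k), reduce_for_hashing(v)) for k, v in obj.items()]
        return tuple(sorted(items, key=lambda kv: kv[0]))
    if isinstance(obj, Sequence):
        return tuple(reduce_for_hashing(item) for item in obj)
    return obj  # leaf: left as-is and pickled directly


def hash_obj_nested(obj):
    """Hash the reduced structure; still raises if a leaf is unpicklable."""
    return hashlib.md5(pickle.dumps(reduce_for_hashing(obj))).hexdigest()


# Unpicklable objects at several nesting depths are now handled.
values = (1.0, "nufit", [FakePipeline(1234), {"nested": FakePipeline(5678)}])
digest = hash_obj_nested(values)
```

One design consequence to be aware of: lists and tuples are both reduced to tuples, so `[1, 2]` and `(1, 2)` hash identically in this sketch. A real implementation would need to decide whether that is acceptable.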

@thehrh thehrh removed this from the PISA 4.2 milestone Feb 20, 2025