GSoC 2022: Multiweight integration #125
base: substructure
destructor now deletes allocated region selection objects when going out of scope.
…nager.h Co-authored-by: Jack Y. Araz <[email protected]>
Substructure
…tatistics for each weight id
…ical for all samples within one analysis
Co-authored-by: Jack Y. Araz <[email protected]>
…ern to avoid using compiler flags
…nalyzer now passes in necessary data to output manager, further implementation only need to be added to the output manager
…may be a bug with certain graphs, need to track it down
…e file only if it exists
Here are some comments that need resolving.
This file should not be here!
Database Manager is not a good name here; it indicates an extremely general structure. Please create two dedicated classes for cutflow and histogramming. The current structure does not look expandable, i.e. we might want to use SQLite for something completely different than just writing histograms and cutflows, so dedicated code structures would be ideal. Additionally, this piece of code needs a bit of documentation, so please add a README.md file here explaining how to read the cutflow and histogram files generated by this code through Python.

def AutoDetection(self):
    # Which
    result = ShellCommand.Which('sqlite3',all=False,mute=True)
Please also add
if importlib.util.find_spec("sqlite3") is None: ...
since one might have an odd installation where the C++ headers are present but the Python module is missing.
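A minimal sketch of the combined check suggested above, written as a standalone helper. The function name `sqlite3_available` is hypothetical, and `shutil.which` stands in for `ShellCommand.Which`, which is internal to the project; how the result is wired into `AutoDetection` is left open.

```python
import importlib.util
import shutil


def sqlite3_available():
    """Return (has_cli, has_python_module) for SQLite3.

    Hypothetical helper: shutil.which is a stand-in for ShellCommand.Which.
    """
    # Is the sqlite3 command-line tool on the PATH?
    has_cli = shutil.which("sqlite3") is not None
    # Can the Python sqlite3 module be located by the import system?
    has_python_module = importlib.util.find_spec("sqlite3") is not None
    return has_cli, has_python_module


has_cli, has_module = sqlite3_available()
if not has_module:
    print("Python sqlite3 module not found")
```

Returning both flags separately lets the caller decide whether a missing command-line tool alone should count as a failed detection.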
}

// Add a cut to the CutFlow
void AddCut(std::string const &CutName)
{ cutflow_.InitCut(CutName); }

/// Getting ready for a new event
- void InitializeForNewEvent(const MAfloat64 &weight)
+ void InitializeForNewEvent(const MAfloat64 &weight,const std::map<MAuint32, MAfloat64> &multiweight)
This is not backwards compatible; it breaks all the analysis codes developed in the past! Please split this into two as follows:
/// Initializing the new event with a single weight
void InitializeForNewEvent(const MAfloat64 &weight);
/// Initializing the new event with multi-weight
void InitializeForNewEvent(const std::map<MAuint32, MAfloat64> &multiweight);
This way, we won't need to modify old implementations individually. You can test old implementations by using the install PADForSFS command, which will download a set of implementation codes into tools/PADForSFS/Build/SampleAnalyzer/User/Analyzer, and you can see which functions cannot be changed. All those codes have to work.
Context:
GSoC 2022 - Multiweight integration
Description of the Change:
This update extends the Cutflow and Histogram classes to hold data for n weights and outputs the multidimensional histogram and cutflow data in SQLite3 format.
This draft is currently fully functional for both cutflows and histograms; an interface for the SQLite3 output format is still in progress.
Benefits:
This implementation will output Cutflow and Histogram data for all weights in a single SQLite3 file, which is long-term stable (updates in software version will not change accessibility) and easily transferable. SQLite3 supports the core query commands of the SQL standard, so users will be able to process the result data however they wish. The database file can be accessed independently of MadAnalysis 5 and plotted in Excel or whichever plotting package the user chooses.
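As an illustration of reading such a file outside MadAnalysis 5, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `cutflow` and its columns (`region`, `cut`, `weight_id`, `sum_weights`) are assumptions made for this example and are not the schema written by this PR; an in-memory database is filled with toy rows so the snippet runs on its own.

```python
import sqlite3

# Hypothetical schema: the table and column names are assumptions for
# illustration, not the layout produced by this PR.
conn = sqlite3.connect(":memory:")  # replace with the path to the output file
conn.execute(
    "CREATE TABLE cutflow "
    "(region TEXT, cut TEXT, weight_id INTEGER, sum_weights REAL)"
)
conn.executemany(
    "INSERT INTO cutflow VALUES (?, ?, ?, ?)",
    [
        ("SR1", "pt > 20", 0, 10.0),
        ("SR1", "pt > 20", 1, 12.5),
        ("SR1", "met > 50", 0, 4.0),
        ("SR1", "met > 50", 1, 5.0),
    ],
)


def cutflow_for_weight(connection, weight_id):
    """Return (cut, sum_weights) rows for one weight id, in insertion order."""
    rows = connection.execute(
        "SELECT cut, sum_weights FROM cutflow "
        "WHERE weight_id = ? ORDER BY rowid",
        (weight_id,),
    )
    return rows.fetchall()


print(cutflow_for_weight(conn, 1))
```

With a real output file, only the `connect` path and the table and column names in the query would need to change to match the actual schema.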
Possible Drawbacks:
An interface to detect whether SQLite3 is available on the system and, if not, to give the user the option to install it is still in progress. macOS includes SQLite3 by default, so this issue will only apply to Linux users.
Existing analyses need an update to the SampleAnalyzer.cpp file:
in the Execute function, the user needs to pass in the weight map, as shown in the example below. The WeightCollection object supports the *, /, +, and - operators.
Example: to multiply all weights by a double, say 1.2345, the user can do the following.
EvMultiweight*=1.2345;
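To make the effect of that line concrete, here is a toy Python model of scaling every entry of a weight map in place. It is not the C++ WeightCollection class from this PR; the Python class and its `weights` attribute are assumptions that only illustrate the element-wise behaviour described above.

```python
class WeightCollection:
    """Toy Python model of a weight-id -> weight map (not the C++ class)."""

    def __init__(self, weights):
        self.weights = dict(weights)

    def __imul__(self, factor):
        # Scale every weight in place by the same scalar.
        for weight_id in self.weights:
            self.weights[weight_id] *= factor
        return self


ev_multiweight = WeightCollection({0: 1.0, 1: 2.0})
ev_multiweight *= 1.2345  # every weight id is multiplied by 1.2345
```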
The current implementation runs both the single-weight and multi-weight paths simultaneously and writes their outputs independently (the existing SAF format is preserved and untouched; the multi-weight path writes Cutflow and Histogram SQLite3 files in their respective directories). The plan is to deprecate the single-weight implementation once numerical validation is complete. The current implementation is a little cumbersome but is fully functional and easier to debug should there be discrepancies between the existing and new implementations.
Related GitHub Issues: