Adds a SplinePostProcessor singleton class to Generator#468
Adds a SplinePostProcessor singleton class to Generator#468kjplows wants to merge 7 commits intoGENIE-MC:masterfrom
SplinePostProcessor singleton class to Generator#468Conversation
The SplinePostProcessor acts on a user-given set of algorithms upon spline generation. It applies a Gaussian kernel smoothing on the spline knots' ordinates without changing the abscissae, with user-configurable widths interpolated linearly from user-defined stops. The intended use case is for algorithms that depend on MC throws (such as NewQELXSec) to undergo a post-processing stage to reduce fluctuations in the splines.
| <priority msgstream="SpectralFunc"> NOTICE </priority> | ||
| <priority msgstream="Spline"> INFO </priority> | ||
| <priority msgstream="SplinePostProcessor"> WARN </priority> | ||
| <priority msgstream="Target"> INFO </priority> |
There was a problem hiding this comment.
I suggest SplinePostProcessor entries in the other Messenger_XYZZY.xml files as well; have them consistent with the general theme of the file.
| for(iter = nuvec.begin(); iter != nuvec.end(); ++iter) { | ||
| list->push_back( atoi(iter->c_str()) ); | ||
| } | ||
| } |
There was a problem hiding this comment.
Unnecessary addition of trailing blank space.
| @@ -0,0 +1,333 @@ | |||
| //____________________________________________________________________________ | |||
| /* | |||
| Copyright (c) 2003-2025, The GENIE Collaboration | |||
There was a problem hiding this comment.
Since this is going into the code base in 2026, perhaps that should be the copyright date (here and elsewhere)
| <!-- separate list for nutau due to high thresholds --> | ||
| <param type="vec-double" delim="," name="Stops@Pdg=16,-16"> 2.8,3.0,4.0,5.0,10.0 </param> | ||
| <param type="vec-double" delim="," name="Widths@Pdg=16,-16"> 0.005,0.01,0.03,0.075,0.1 </param> | ||
|
|
There was a problem hiding this comment.
Does the stops/widths list for nu_tau(_bar) need to go as far as the standard one? Why or why not?
Also, what happens above the upper stop? We typically generate splines out to E=1000 GeV, though out that high there tends not to be much jitter.
There was a problem hiding this comment.
Yeah, I definitely didn't give too much thought to the stops / widths for nutau, nutaubar. But as you noted -- above the upper stop, the last width is used. Now that I think about it, perhaps the best solution is to add a stop at 1 TeV with some width.
(The scaling between points is linear, currently; line 262 and friends of SplinePostProcessor.cxx. So if last 2 stops are e.g. at 10 GeV, 1e3 GeV; at E =100 GeV, the activation becomes (100-10) / (1000 - 10) = 1/11 , meaning the width at that point is width(10) + (width(1e3) - width(10)) / 11 . Probably should add an option for logarithmic scaling, come to think of it.)
| SplinePostProcessor(); | ||
| SplinePostProcessor(std::string name); | ||
| SplinePostProcessor(std::string name, std::string config); | ||
| ~SplinePostProcessor(); |
There was a problem hiding this comment.
Usually for singletons the constructor and destructor are private and the only public way to get the one-and-only instance is via Instance(). This prevents users from accidentally creating one outside the singleton framework. I don't know if we have any other singletons that are algorithms, and haven't reviewed all the code to see how this works in actuality here so I don't know whether it is possible for this code. This is something to consider.
Also a common pattern in singletons is a Cleaner to destroy the one instance when the program is shutting down. If you look at some of the other singletons you'll see code (here for AlgFactory in the private portion of the header) like:
//! singleton cleaner
struct Cleaner {
void DummyMethodAndSilentCompiler() { }
~Cleaner() {
if (AlgFactory::fInstance !=0) {
delete AlgFactory::fInstance;
AlgFactory::fInstance = 0;
}
}
};
friend struct Cleaner;
and in the .cxx file:
AlgFactory * AlgFactory::Instance()
{
if(fInstance == 0) {
static AlgFactory::Cleaner cleaner;
cleaner.DummyMethodAndSilentCompiler();
fInstance = new AlgFactory;
}
return fInstance;
}
Then when the static is unloaded, it cleans up the associated instance.
There was a problem hiding this comment.
I think that makes sense, if turning the Instance into a pointer means I can still assign an fPostProcessor at XSecSplineList level then happy to do this!
| this->GetParam( tmp_key, stops[i] ); | ||
| } | ||
| fStopMap[pdg] = stops; | ||
| } // split strings |
There was a problem hiding this comment.
This block of code for if( key.find("Stops")... (and the latter one for Widths) have a very wonky indentation, in the GitHub representation, that doesn't follow the code flow. The line following the if is 2 spaces to the left from the if. It might be that the files are using tabs (which we discourage because their interpretation can wildly vary) instead of spaces.
Also, just to note, that given subs (without having appended a , on dup_name) one could also have had the elements extracted using
vector<string> pdgstrvec = genie::utils::str::Split(subs,",");
while ( size_t indx=0, indx < pdgstrvec.size(); indx++) {
int pdf = std::stoi(pdgstrvec[indx]);
...
}
I'm not sure I'm advocating changing what's here, but more making you aware of some GENIE utility functions.
There was a problem hiding this comment.
Huh, I never knew about tabs vs spaces could resolve in wonky indentation! Thanks for the catch, I'll make changes.
I'll give the utility a go. It looks like a more elegant solution.
| double width = w1 + (w2 - w1) * rat; // between w1 and w2 | ||
| double num = 0.0; double den = 0.0; | ||
| for(int j = 0; j < NKnots; j++) { | ||
| double w = std::exp(-0.5 * std::pow((E[i] - E[j]) / width, 2.0)); // Gauss |
There was a problem hiding this comment.
This line is the heart of the Gaussian kernel smoother, and probably needs a comment to that effect ahead of the line rather than a simple // Gauss. :-)
| if( xmax > stops.back() ) { | ||
| stops.emplace_back(xmax); | ||
| widths.emplace_back(widths.back()); | ||
| } |
There was a problem hiding this comment.
Ah, I think this answers an earlier question about what happen to when the stops/widths don't cover the range of the spline. I read this as the first (or last) entry in the stops/widths are duplicated at the front or back to fully cover the range.
|
|
||
| } | ||
|
|
||
| fPostProcessor = *( SplinePostProcessor::Instance() ); |
There was a problem hiding this comment.
This and line 103 of the header. Because fPostProcessor is not a pointer, what happens here is a full copy of the one given by SplinePostProcessor::Instance(). I think what you should be doing is making fPostProcessor a pointer, assigning the pointer from the Instance() just for convenience and converting the following uses from fPostProcessor.X() to fPostProcessor->X(). Otherwise you're making unnecessary copies of a fairly hefty data structure.
|
|
||
| static XSecSplineList * fInstance; | ||
|
|
||
| SplinePostProcessor fPostProcessor; |
There was a problem hiding this comment.
See comment in XSecSplineList.cxx about this being an actual object and not a pointer.
|
Hi @nusense , thanks for the detailed comments! I'll start implementing these as soon as I can. |
For the cross-section algorithms that rely on
NewQELXSecto integrate, MC throws of the struck nucleon are required to compute the cross-section. This leads to a "jitter" in the graph of the cross-section as a function of neutrino energy.This PR adds a new singleton class
genie::SplinePostProcessor, that acts after spline knots have been collected inXSecSplineList, and applies a Gaussian kernel smoother to the knots, "smoothing" the spline.The
SplinePostProcessoracts only on algorithms that it has been configured to run on (one can add these as keys in theSplinePostProcessor.xmlfile), and only if it has been switched on at the configuration. By default, theSplinePostProcessoris configured to act onNievesQELCCPXSec/ZExp.One may specify a sequence of "stops", or reference energies, at which the width of the window is given by a corresponding entry in a sequence of "widths". In between "stops", the width is interpolated linearly.
One may also specify additional "stops" and "widths" per neutrino PDG; this is done by specifying the additional string
@Pdg=<probe PDG>in the configuration key. In this XML file, a separate key is set for tau neutrinos, which have significantly higher thresholds than muon or electron neutrinos.Some validation plots, in the form of "Brazil bands" follow: first, for 12C, 16O, and 40Ar, and for each of the (anti)neutrino flavours, one hundred runs of
gmkspl -e 10 -n 100(using tuneAR23_20i_00_000) were generated; for each of the spline xmls made, for each of the energies sampled, the 100 results for σ were collected into a distribution.A band of (<σ> - 2υ, <σ> + 2υ) is coloured in yellow; a band of (<σ> - υ, <σ> + υ) is coloured in green, where <σ>, υ are the sample mean and variance of this distribution.
Then, a spline is generated, with the
SplinePostProcessorswitched off: this is plotted as red points. The jitter is evident, with a fraction of points lying outside the "2 sigma" yellow band.Finally, a spline with the same seed is generated, with the
SplinePostProcessorswitched on, and plotted as the dashed blue line. The graph of the cross-section is both less "jittery", and now lies within the band, indicating a better agreement with the "expected" cross-section.Shown here: numu on 40Ar, nue on 12C, nutau on 16O.