Skip to content

Adds a SplinePostProcessor singleton class to Generator#468

Open
kjplows wants to merge 7 commits intoGENIE-MC:masterfrom
kjplows:feature/kjplows_smooth_QELSplines
Open

Adds a SplinePostProcessor singleton class to Generator#468
kjplows wants to merge 7 commits intoGENIE-MC:masterfrom
kjplows:feature/kjplows_smooth_QELSplines

Conversation

@kjplows
Copy link
Contributor

@kjplows kjplows commented Dec 10, 2025

For the cross-section algorithms that rely on NewQELXSec to integrate, MC throws of the struck nucleon are required to compute the cross-section. This leads to a "jitter" in the graph of the cross-section as a function of neutrino energy.

This PR adds a new singleton class genie::SplinePostProcessor, that acts after spline knots have been collected in XSecSplineList, and applies a Gaussian kernel smoother to the knots, "smoothing" the spline.

The SplinePostProcessor acts only on algorithms that it has been configured to run on (one can add these as keys in the SplinePostProcessor.xml file), and only if it has been switched on at the configuration. By default, the SplinePostProcessor is configured to act on NievesQELCCPXSec/ZExp.

One may specify a sequence of "stops", or reference energies, at which the width of the window is given by a corresponding entry in a sequence of "widths". In between "stops", the width is interpolated linearly.
One may also specify additional "stops" and "widths" per neutrino PDG; this is done by specifying the additional string @Pdg=<probe PDG> in the configuration key. In this XML file, a separate key is set for tau neutrinos, which have significantly higher thresholds than muon or electron neutrinos.

Some validation plots, in the form of "Brazil bands" follow: first, for 12C, 16O, and 40Ar, and for each of the (anti)neutrino flavours, one hundred runs of gmkspl -e 10 -n 100 (using tune AR23_20i_00_000) were generated; for each of the spline xmls made, for each of the energies sampled, the 100 results for σ were collected into a distribution.

A band of (<σ> - 2υ, <σ> + 2υ) is coloured in yellow; a band of (<σ> - υ, <σ> + υ) is coloured in green, where <σ>, υ are the sample mean and variance of this distribution.
Then, a spline is generated, with the SplinePostProcessor switched off: this is plotted as red points. The jitter is evident, with a fraction of points lying outside the "2 sigma" yellow band.
Finally, a spline with the same seed is generated, with the SplinePostProcessor switched on, and plotted as the dashed blue line. The graph of the cross-section is both less "jittery", and now lies within the band, indicating a better agreement with the "expected" cross-section.
Shown here: numu on 40Ar, nue on 12C, nutau on 16O.

40Ar_numu 12C_nue 16O_nutau

kjplows and others added 7 commits November 10, 2025 10:05
The SplinePostProcessor acts on a user-given set of algorithms upon spline generation.
It applies a Gaussian kernel smoothing on the spline knots' ordinates without changing the abscissae,
with user-configurable widths interpolated linearly from user-defined stops.

The intended use case is for algorithms that depend on MC throws (such as NewQELXSec) to undergo a
post-processing stage to reduce fluctuations in the splines.
@kjplows kjplows requested a review from sjgardiner December 10, 2025 18:23
@kjplows kjplows self-assigned this Dec 10, 2025
@kjplows kjplows added the framework Framework related issues or improvements label Dec 10, 2025
@kjplows kjplows requested review from candreop and nusense December 11, 2025 17:59
Copy link
Member

@nusense nusense left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A first pass review

<priority msgstream="SpectralFunc"> NOTICE </priority>
<priority msgstream="Spline"> INFO </priority>
<priority msgstream="SplinePostProcessor"> WARN </priority>
<priority msgstream="Target"> INFO </priority>
Copy link
Member

@nusense nusense Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest SplinePostProcessor entries in the other Messenger_XYZZY.xml files as well; have them consistent with the general theme of the file.

for(iter = nuvec.begin(); iter != nuvec.end(); ++iter) {
list->push_back( atoi(iter->c_str()) );
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary addition of trailing blank space.

@@ -0,0 +1,333 @@
//____________________________________________________________________________
/*
Copyright (c) 2003-2025, The GENIE Collaboration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is going into the code base in 2026, perhaps that should be the copyright date (here and elsewhere)

<!-- separate list for nutau due to high thresholds -->
<param type="vec-double" delim="," name="Stops@Pdg=16,-16"> 2.8,3.0,4.0,5.0,10.0 </param>
<param type="vec-double" delim="," name="Widths@Pdg=16,-16"> 0.005,0.01,0.03,0.075,0.1 </param>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the stops/widths list for nu_tau(_bar) need to go as far as the standard one? Why or why not?
Also, what happens above the upper stop? We typically generate splines out to E=1000 GeV, though out that high there tends not to be much jitter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I definitely didn't give too much thought to the stops / widths for nutau, nutaubar. But as you noted -- above the upper stop, the last width is used. Now that I think about it, perhaps the best solution is to add a stop at 1 TeV with some width.

(The scaling between points is linear, currently; line 262 and friends of SplinePostProcessor.cxx. So if last 2 stops are e.g. at 10 GeV, 1e3 GeV; at E =100 GeV, the activation becomes (100-10) / (1000 - 10) = 1/11 , meaning the width at that point is width(10) + (width(1e3) - width(10)) / 11 . Probably should add an option for logarithmic scaling, come to think of it.)

SplinePostProcessor();
SplinePostProcessor(std::string name);
SplinePostProcessor(std::string name, std::string config);
~SplinePostProcessor();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually for singletons the constructor and destructor are private and the only public way to get the one-and-only instance is via Instance(). This prevents users from accidentally creating one outside the singleton framework. I don't know if we have any other singletons that are algorithms, and haven't reviewed all the code to see how this works in actuality here so I don't know whether it is possible for this code. This is something to consider.
Also a common pattern in singletons is a Cleaner to destroy the one instance when the program is shutting down. If you look at some of the other singletons you'll see code (here for AlgFactory in the private portion of the header) like:

  //! singleton cleaner
  struct Cleaner {
      void DummyMethodAndSilentCompiler() { }
      ~Cleaner() {
         if (AlgFactory::fInstance !=0) {
            delete AlgFactory::fInstance;
            AlgFactory::fInstance = 0;
         }
      }
  };
  friend struct Cleaner;

and in the .cxx file:

AlgFactory * AlgFactory::Instance()
{
  if(fInstance == 0) {
    static AlgFactory::Cleaner cleaner;
    cleaner.DummyMethodAndSilentCompiler();

    fInstance = new AlgFactory;
  }
  return fInstance;
}

Then when the static is unloaded, it cleans up the associated instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense, if turning the Instance into a pointer means I can still assign an fPostProcessor at XSecSplineList level then happy to do this!

this->GetParam( tmp_key, stops[i] );
}
fStopMap[pdg] = stops;
} // split strings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of code for if( key.find("Stops")... (and the latter one for Widths) have a very wonky indentation, in the GitHub representation, that doesn't follow the code flow. The line following the if is 2 spaces to the left from the if. It might be that the files are using tabs (which we discourage because their interpretation can wildly vary) instead of spaces.

Also, just to note, that given subs (without having appended a , on dup_name) one could also have had the elements extracted using

 vector<string> pdgstrvec = genie::utils::str::Split(subs,",");
 while ( size_t indx=0, indx < pdgstrvec.size(); indx++) {
    int pdf = std::stoi(pdgstrvec[indx]);
...
 }

I'm not sure I'm advocating changing what's here, but more making you aware of some GENIE utility functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I never knew about tabs vs spaces could resolve in wonky indentation! Thanks for the catch, I'll make changes.

I'll give the utility a go. It looks like a more elegant solution.

double width = w1 + (w2 - w1) * rat; // between w1 and w2
double num = 0.0; double den = 0.0;
for(int j = 0; j < NKnots; j++) {
double w = std::exp(-0.5 * std::pow((E[i] - E[j]) / width, 2.0)); // Gauss
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is the heart of the Gaussian kernel smoother, and probably needs a comment to that effect ahead of the line rather than a simple // Gauss. :-)

if( xmax > stops.back() ) {
stops.emplace_back(xmax);
widths.emplace_back(widths.back());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think this answers an earlier question about what happen to when the stops/widths don't cover the range of the spline. I read this as the first (or last) entry in the stops/widths are duplicated at the front or back to fully cover the range.


}

fPostProcessor = *( SplinePostProcessor::Instance() );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and line 103 of the header. Because fPostProcessor is not a pointer, what happens here is a full copy of the one given by SplinePostProcessor::Instance(). I think what you should be doing is making fPostProcessor a pointer, assigning the pointer from the Instance() just for convenience and converting the following uses from fPostProcessor.X() to fPostProcessor->X(). Otherwise you're making unnecessary copies of a fairly hefty data structure.


static XSecSplineList * fInstance;

SplinePostProcessor fPostProcessor;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment in XSecSplineList.cxx about this being an actual object and not a pointer.

@kjplows
Copy link
Contributor Author

kjplows commented Mar 4, 2026

Hi @nusense , thanks for the detailed comments! I'll start implementing these as soon as I can.
Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

framework Framework related issues or improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants