weight normalisation #36
Hi Simone, this is because the final normalization constant may depend on external factors. In many cases the normalization constant does not play a significant role (e.g. when computing efficiencies or ROC curves, or training classifiers); when it does matter, you should compute it yourself. Explanation: the absence of normalization in reweighters guarantees consistency of predictions. For example, whether you predict weights for a large sample at once, or predict a weight for each event separately and concatenate the predictions, the result is the same. If the weights were normalized internally, the second case would obviously give a wrong result.
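If the Sum w_i = N convention is needed, the rescaling is straightforward to apply after prediction. Here is a minimal numpy sketch; `normalize_weights` is an illustrative helper, not part of hep_ml's API, and the `raw` array stands in for the output of a reweighter's `predict_weights` call.

```python
import numpy as np

def normalize_weights(weights, target_yield=None):
    """Rescale weights so their sum equals target_yield.

    By default target_yield is the number of events, i.e. the
    common convention Sum w_i = N."""
    weights = np.asarray(weights, dtype=float)
    if target_yield is None:
        target_yield = len(weights)
    return weights * (target_yield / weights.sum())

# stand-in for e.g. reweighter.predict_weights(mc_sample)
raw = np.array([0.2, 0.5, 1.3, 2.0])
w = normalize_weights(raw)  # now w.sum() == 4.0
```

Note that this rescaling must be applied once to the full sample; applying it to chunks separately reintroduces exactly the inconsistency described above.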
Hi, related to this question: I'm trying to compare a single reweighter trained and tested on the entire dataset against several reweighters trained on individual bins of the data. What I'm trying to do is reconstruct the reweighted distributions over the whole data range from the binned reweighters. Is it therefore possible to obtain the normalization constant that was used, or can I normalize the reweighters externally? Thanks
@jcob95, you should renormalize externally. As I understand your case, you should first compute the expected number of samples in each bin, and then within each bin apply a normalization so that the total weight coincides with that expected count.
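The per-bin procedure described above can be sketched as follows. This is a minimal numpy illustration, not hep_ml code: `renormalize_per_bin` is a hypothetical helper, `bin_weights` stands for the raw weights each binned reweighter predicted for its own events, and `expected_counts` for the expected yields per bin.

```python
import numpy as np

def renormalize_per_bin(bin_weights, expected_counts):
    """Scale the weights of each bin so that the bin's total
    weight equals the expected number of events in that bin.

    bin_weights:     list of arrays, raw weights per bin
    expected_counts: expected yield in each bin, same order
    """
    return [np.asarray(w, dtype=float) * (n / np.sum(w))
            for w, n in zip(bin_weights, expected_counts)]

# two bins with raw (unnormalized) weights and expected yields
out = renormalize_per_bin([[0.5, 0.5], [2.0, 1.0, 1.0]], [10.0, 6.0])
```

After this step, concatenating the per-bin weighted samples reproduces a distribution with the intended yield in every bin.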
I have used hep_ml in the past weeks to reweight MC distributions and stumbled upon the following issue.
When determining weights as the data/MC ratio of normalised distributions, the computed weights are normalised such that Sum w_i = N.
However, I noticed this is not the case for weights obtained using hep_ml.reweight.
Is this expected, or am I missing something?