Random behavior of GBReweighter and UGradientBoostingClassifier #55

Open
arogozhnikov opened this issue Oct 30, 2018 · 0 comments
@arogozhnikov
Owner

(Leaving this open as an answer to a common question.)

Why do GBReweighter and UGradientBoostingClassifier produce different weights after each training?

Both algorithms are based on stochastic tree boosting. Settings like subsample and max_features lead to randomized tree building (i.e. each tree sees only a random subset of the training events or features), which is widely known to strengthen the ensemble by producing more diverse trees.
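To illustrate where the run-to-run variation comes from, here is a minimal standard-library sketch of row subsampling. It is not hep_ml code: `draw_subsamples` and its parameters are hypothetical names made up for this illustration, and real boosting also randomizes feature choice when max_features is set.

```python
import random


def draw_subsamples(n_samples, n_trees, subsample, seed=None):
    """Return, for each tree, the sorted row indices it would be trained on.

    Hypothetical helper for illustration only; it mimics the effect of a
    `subsample` setting, not the actual hep_ml implementation.
    """
    # seed=None lets Python seed the generator from OS entropy,
    # so every call produces a different sequence of draws.
    rng = random.Random(seed)
    k = int(subsample * n_samples)
    return [sorted(rng.sample(range(n_samples), k)) for _ in range(n_trees)]


# Same seed: every tree gets exactly the same rows on both runs.
fixed_a = draw_subsamples(1000, 5, 0.5, seed=42)
fixed_b = draw_subsamples(1000, 5, 0.5, seed=42)
print(fixed_a == fixed_b)

# No seed: the rows differ between runs, hence different trees and weights.
free_a = draw_subsamples(1000, 5, 0.5)
free_b = draw_subsamples(1000, 5, 0.5)
print(free_a == free_b)
```

With a fixed seed the two runs match exactly; without one, a match is possible in principle but vanishingly unlikely.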

hep_ml follows the sklearn convention of keeping random things random unless explicitly asked otherwise.

Reproducible behavior is achieved by setting random_state:

for boosting:
UGradientBoostingClassifier(<other settings here>, random_state=42)
for the reweighter:
GBReweighter(<other settings here>, gb_args={'random_state': 42, <other gb args>})
@arogozhnikov arogozhnikov changed the title Randomization of GBReweighter and UGradientBoostingClassifier Random behavior of GBReweighter and UGradientBoostingClassifier Oct 30, 2018