reproducible quantile regression predictions require R set.seed() before ranger() call #643

Open
mkb007 opened this issue Nov 6, 2022 · 3 comments

Comments

@mkb007

mkb007 commented Nov 6, 2022

The documentation for quantile regression should emphasize that predictions are only reproducible if R's set.seed() is called before training the forest with ranger(). At the end of ranger(), R's sample() function initializes the random.node.values of the ranger object, which predict.ranger(type = "quantiles") then uses. The confusion arises because there are two RNGs: one for R and another for C++.
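A minimal sketch of the workflow described above, assuming the ranger package is installed and using the built-in mtcars data purely for illustration. The helper name fit_and_predict, the seed value 42, and the chosen quantiles are hypothetical and not taken from the ranger documentation.

```r
library(ranger)

# Hypothetical helper: seed R's RNG, train a quantile regression forest,
# and return the predicted quantiles.
fit_and_predict <- function(r_seed) {
  # Seed R's RNG BEFORE ranger(), since sample() is called inside ranger()
  # to fill random.node.values.
  set.seed(r_seed)
  rf <- ranger(mpg ~ ., data = mtcars,
               quantreg = TRUE,   # prepare the forest for quantile prediction
               num.trees = 50,
               num.threads = 1)   # single thread, to rule out threading effects
  predict(rf, mtcars, type = "quantiles",
          quantiles = c(0.1, 0.5, 0.9))$predictions
}

p1 <- fit_and_predict(42)
p2 <- fit_and_predict(42)

# With the same R seed set before each ranger() call, the two prediction
# matrices are expected to match.
identical(p1, p2)
```

If the set.seed() call is dropped and only ranger()'s own seed argument is supplied, the quantile predictions may differ between runs, because random.node.values would then be drawn from an unseeded R RNG.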

See issue #637.

@mnwright
Member

Good point. I still think a refactor of the RNG would be good to improve flexibility, see e.g. here: #533 (comment).

@mkb007
Author

mkb007 commented Nov 16, 2022

Yes. The encapsulated RNG idea would help reproducibility and reduce some possible wasted effort with quantile regression.

@UnixJunkie

Great! Thanks for the fix!
