Using rng= keyword argument for NumPy randomness #29315

Open
thomasjpfan opened this issue Jun 20, 2024 · 6 comments

@thomasjpfan
Member

SPEC7 (scientific-python/specs#180) has two goals:

  1. Deprecate the use of RandomState and np.random.seed
  2. Standardize the use of rng for seeding.

For 1, according to NEP19, I do not think NumPy wants to deprecate np.random.seed because they see valid use cases.
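For readers less familiar with the two styles being contrasted, here is a minimal sketch using only public NumPy calls. The variable names are illustrative and nothing here is taken from SPEC7 itself: the legacy style seeds hidden global state through np.random.seed, while the Generator style passes an explicit object around.

```python
import numpy as np

# Legacy style: np.random.seed resets the hidden, module-level RandomState.
np.random.seed(0)
legacy_a = np.random.uniform(size=3)
np.random.seed(0)
legacy_b = np.random.uniform(size=3)  # same seed, same global stream

# Generator style: the random source is an explicit object, no global state.
rng = np.random.default_rng(0)
new_a = rng.uniform(size=3)
new_b = np.random.default_rng(0).uniform(size=3)  # fresh generator, same seed
```

Both styles are reproducible for a fixed seed; the difference is whether the state lives in a module-level global or in an object the caller controls.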

For 2, the primary reason for using rng instead of random_state is that it is a "better name" for NumPy's random Generator. I am okay with keeping random_state and not making users go through the pain of changing their code.

Currently, scikit-learn does not support generators because we tied it to scikit-learn/enhancement_proposals#88. We wanted to use generators to cleanly switch to a different RNG behavior compared to RandomState. For me, I think they can be decoupled. If we tackle scikit-learn/enhancement_proposals#88, we can fix it for both RandomState and Generators.
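To illustrate why RandomState and Generator are not drop-in replacements for each other, here is a small sketch, again with illustrative variable names only. The same integer seed yields different streams (the default bit generators differ), and some method names differ, for example randint versus integers.

```python
import numpy as np

legacy = np.random.RandomState(0)   # legacy API, Mersenne Twister based
modern = np.random.default_rng(0)   # Generator API, PCG64 by default

# Same seed, but the two objects do not produce the same numbers.
legacy_draw = legacy.random_sample(3)
modern_draw = modern.random(3)
same_stream = np.allclose(legacy_draw, modern_draw)

# Method names differ too: RandomState.randint vs Generator.integers.
legacy_ints = legacy.randint(0, 10, size=5)
modern_ints = modern.integers(0, 10, size=5)
```

Any code that accepts both kinds of object has to account for these naming differences, or restrict itself to methods the two classes share.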

@scikit-learn/core-devs What do you think of SPEC7's proposal?

@GaelVaroquaux
Member

GaelVaroquaux commented Jun 21, 2024 via email

@lesteve
Member

lesteve commented Jun 21, 2024

It was interesting to me to witness the reactions to numpy 2.0. Senior and visible people who had been vocal in criticizing historical choices in numpy and the slowness of numpy to move forward were also the same ones to complain about breakage across their stacks and environments.

Oh well, I guess it's not that easy to make people happy 😝

@adrinjalali
Member

The random_state discussion got quite convoluted, and at the last meeting where we talked about it, a while ago, I didn't get the impression that we had a way forward.

I personally don't mind renaming random_state to rng IF it resolves some of the backward compatibility concerns with fixing random_state. New name, new behavior. Otherwise I don't care much about changing the name because of the SPEC.

I also wouldn't deprecate RandomState's usage, but I'm happy to move all our own code / examples to the generators instead.

@betatim
Member

betatim commented Jun 24, 2024

I agree that we should provide a benefit to the user if we make them change from random_state to rng. I don't know what that benefit would be though, especially given that NumPy seems to have no plan to remove the old random number infrastructure. Basically, I am uneducated about the advantages, rather than aware of one but dismissive of it.

I like the idea of using the switch from random_state= to rng= to also fix the randomness ambiguities. However, I also think that fixing both at the same time has a lower chance of happening than if we tackle them independently. The price of tackling them independently is more user upheaval, I think.

> I am okay with keeping random_state and not making users go through the pain of changing their code.

Thomas, is this a proposal to add rng= without removing/deprecating random_state=?

betatim changed the title from "Using rng=kwargs for NumPy randomness" to "Using rng= keyword argument for NumPy randomness" on Jun 24, 2024
@thomasjpfan
Member Author

> Thomas, is this a proposal to add rng= without removing/deprecating random_state=?

No. The proposal is to not use rng at all and to continue using random_state even when we support generators.

@betatim
Member

betatim commented Jun 26, 2024

This means we'd also accept a random generator as an argument to random_state? What problem do we solve with that? One that I can think of is that people who use generators and expect them to work across the ecosystem will have a good experience.

That seems like a good enough reason to allow people to pass a generator as random_state=.
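As a rough sketch of what accepting a generator through random_state= could look like, here is a NumPy-only example. Both as_random_source and sample_indices are hypothetical names invented for illustration; this is not scikit-learn's actual validation code, and how None should be handled is an assumption.

```python
import numbers

import numpy as np


def as_random_source(random_state=None):
    """Hypothetical helper: normalize random_state into a random source."""
    # Pass existing RandomState or Generator instances through unchanged.
    if isinstance(random_state, (np.random.RandomState, np.random.Generator)):
        return random_state
    # Assumption: None and integer seeds keep mapping to the legacy
    # RandomState, so existing integer-seeded results would not change.
    if random_state is None or isinstance(random_state, numbers.Integral):
        return np.random.RandomState(random_state)
    raise ValueError(f"Cannot use {random_state!r} to seed a random source")


def sample_indices(n, random_state=None):
    """Hypothetical consumer: return a random permutation of range(n)."""
    source = as_random_source(random_state)
    # permutation() exists on both RandomState and Generator.
    return source.permutation(n)
```

With this shape, sample_indices(5, random_state=0) and sample_indices(5, random_state=np.random.default_rng(0)) both work, and the parameter name stays random_state.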
