-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for np.random.Generator #6566
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6566 +/- ##
=======================================
Coverage 97.82% 97.82%
=======================================
Files 1066 1068 +2
Lines 91864 91905 +41
=======================================
+ Hits 89862 89903 +41
Misses 2002 2002 ☔ View full report in Codecov by Sentry. |
sorry about the delay, I will do the review this afternoon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of introducing new type we should consider defining and extending RANDOM_STATE_OR_SEED_LIKE
and provide parsers that would express it as either a RandomState or Generator as needed. Eventually we may completely transition to Generator-s and deprecate/remove the parser to RandomState.
cirq-core/cirq/value/prng.py
Outdated
def parse_prng( | ||
prng_or_seed: PRNG_OR_SEED_LIKE, | ||
) -> Union[np.random.Generator, np.random.RandomState, _CUSTOM_PRNG_T]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type of different types tends to be a code smell. Such returned value is less useful for type checking. In addition, Generator and RandomState (not to mention _CUSTOM_PRNG_T) have different APIs so the parse_prng
caller would still need to do some isinstance
check to ascertain the actual type and figure what methods can be called.
I would propose an alternative approach:
(1) convert the RANDOM_STATE_OR_SEED_LIKE
type from Any to the Union of numpy types that can be converted to RandomState and the np.random.Generator type. Hopefully this can be done without too much hassle with typechecks, because the current Any type skips them completely.
(2) extend parse_random_state to accept a Generator object and convert it to RandomState.
Generator-s have bit_generator
attribute that can be used to create RandomState.
(3) add method parse_random_generator
to the cirq.value.random_state
module which would take RANDOM_STATE_OR_SEED_LIKE
argument and convert it to a Generator object.
np.random.RandomState()
has a _bit_generator
attribute that can be used for creating a Generator.
If in some configurations the _bit_generator
is not present, we can just use RandomState.randint to get a seed for the np.random.default_rng()
With these steps in place, we can keep all the existing interfaces that take RANDOM_STATE_OR_SEED_LIKE
and just start replacing its interpretation from parse_random_state
to parse_random_generator
as needed.
This would also avoid bifurcation between RANDOM_STATE_OR_SEED_LIKE and PRNG_OR_SEED_LIKE types that may need several major releases to clear up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ptal
@pavoljuhas I updated parse_random_state to accept a generator and turn it into a randomstate and parse_prng to always return a generator. however I think that we do need new type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If at all feasible we should avoid Any in the PRNG_OR_SEED_LIKE type and discourage the np.random module use as a parse_prng
argument.
Otherwise LGTM.
If is an integer or None, turns into a `np.random.Generator` seeded with that value. | ||
If is an instance of `np.random.Generator` or a subclass of it, return as is. | ||
If is an instance of `np.random.RandomState` or has a `randint` method, returns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If is an integer or None, turns into a `np.random.Generator` seeded with that value. | |
If is an instance of `np.random.Generator` or a subclass of it, return as is. | |
If is an instance of `np.random.RandomState` or has a `randint` method, returns | |
If an integer or None, turns into a `np.random.Generator` seeded with that value. | |
If an instance of `np.random.Generator` or a subclass of it, return as is. | |
If an instance of `np.random.RandomState` or has a `randint` method, returns |
|
||
|
||
def parse_prng( | ||
prng_or_seed: Union[PRNG_OR_SEED_LIKE, RANDOM_STATE_OR_SEED_LIKE] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just have only the PRNG_OR_SEED_LIKE
type?
RANDOM_STATE_OR_SEED_LIKE
is Any
so it turns off type checking of the argument.
if prng_or_seed is None or isinstance(prng_or_seed, numbers.Integral): | ||
return np.random.default_rng(prng_or_seed if prng_or_seed is None else int(prng_or_seed)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we return a singleton Generator object for None
?
The None
arg is going to be frequently used as a default for optional arguments.
Singleton would prevent creation of potentially large number of Generator objects.
if prng_or_seed is None or isinstance(prng_or_seed, numbers.Integral): | ||
return np.random.default_rng(prng_or_seed if prng_or_seed is None else int(prng_or_seed)) | ||
if isinstance(prng_or_seed, np.random.RandomState): | ||
return np.random.default_rng(prng_or_seed.randint(2**31)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can reuse the bit generator for a more genuine conversion.
return np.random.default_rng(prng_or_seed.randint(2**31)) | |
return np.random.default_rng(prng_or_seed._bit_generator) |
randint = getattr(prng_or_seed, "randint", None) | ||
if randint is not None: | ||
return np.random.default_rng(randint(2**31)) | ||
raise TypeError(f"{prng_or_seed} can't be converted to a pseudorandom number generator") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - maybe state the actual class here ?
raise TypeError(f"{prng_or_seed} can't be converted to a pseudorandom number generator") | |
raise TypeError(f"{prng_or_seed} cannot be converted to the numpy.random.Generator") |
def _sample(prng): | ||
return tuple(prng.random(10)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this. One output from random()
is enough to check if 2 generators are at the same seed.
# An `np.random.Generator` or a seed. | ||
group_inputs: List[Union[int, np.random.Generator]] = [42, np.random.default_rng(42)] | ||
group: List[np.random.Generator] = [cirq.value.parse_prng(s) for s in group_inputs] | ||
eq.add_equality_group(*[_sample(g) for g in group]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let us not check cross-group inequality. Following the test_parse_random_state
style is a bit more readable
# An `np.random.Generator` or a seed. | |
group_inputs: List[Union[int, np.random.Generator]] = [42, np.random.default_rng(42)] | |
group: List[np.random.Generator] = [cirq.value.parse_prng(s) for s in group_inputs] | |
eq.add_equality_group(*[_sample(g) for g in group]) | |
# An `np.random.Generator` or a seed. | |
prngs = [ | |
cirq.value.parse_prng(42), | |
cirq.value.parse_prng(np.int32(42)), | |
cirq.value.parse_prng(np.random.default_rng(42)), | |
] | |
vals = [prng.random() for prng in prngs] | |
eq = cirq.testing.EqualsTester() | |
eq.add_equality_group(*vals) |
|
||
# A None seed. | ||
prng = cirq.value.parse_prng(None) | ||
eq.add_equality_group(_sample(prng)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a noop check for a single value. Perhaps replace with
assert prng is cirq.value.parse_prng(None)
if you are OK with the previous suggestion to have a singleton generator for None.
# RandomState PRNG. | ||
prng = cirq.value.parse_prng(np.random.RandomState(42)) | ||
eq.add_equality_group(_sample(prng)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can check reproducibility here -
# RandomState PRNG. | |
prng = cirq.value.parse_prng(np.random.RandomState(42)) | |
eq.add_equality_group(_sample(prng)) | |
# RandomState PRNG. | |
prngs = [ | |
cirq.value.parse_prng(np.random.RandomState(42)), | |
cirq.value.parse_prng(np.random.RandomState(42)), | |
] | |
vals = [prng.random() for prng in prngs] | |
eq = cirq.testing.EqualsTester() | |
eq.add_equality_group(*vals) |
# np.random module | ||
prng = cirq.value.parse_prng(np.random) | ||
eq.add_equality_group(_sample(prng)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not support creation of generator from a module, not a good practice.
The use of np.random module was causing pickle havoc in #3717.
I don't quite see a need for it, users can pass None for a default generator.
I'd be open to throwing TypeError for a module argument.
# np.random module | |
prng = cirq.value.parse_prng(np.random) | |
eq.add_equality_group(_sample(prng)) |
This PR addresses #6531
I don't modify support for old
RANDOM_STATE_OR_SEED_LIKE
instead I create a newPRNG_OR_SEED_LIKE
type.The reasons for that are:
RANDOM_STATE_OR_SEED_LIKE
implies that the result will be a RandomState. This is enforced in the code by forcing a a cast toRandomState
in theparse_random_state
functionCirq/cirq-core/cirq/value/random_state.py
Line 60 in e1b03ef
parse_random_state
defaults tonp.random
when the seed isNone
which is problematic when usign multiprocessing/multithreading since the internal state of the np.random module becomes a shared state that will make the results of experiments correlated, it will also negatively impact the performance since the threads/processes will be blocked on write operations to the internal state of the module.Cirq/cirq-core/cirq/value/random_state.py
Line 56 in e1b03ef
np.random
is also problematic in other ways as demonstrated in Can't use cirq.Simulator() in a multiprocessing closure (unable to pickle) #3717RANDOM_STATE_OR_SEED_LIKE
type is just an alias ofAny
which makes using@overload
to specify the return type impossible.The solution
PRNG_OR_SEED_LIKE
has union ofNone, int, np.random.RandomState, np.random.Generator, _CUSTOM_PRNG_T
, the first two are types of seeds that will always give annp.random.Generator
, the rest are objects that will be returned as is. and using@overload
we specifiy the correct return type.deprecating
RANDOM_STATE_OR_SEED_LIKE
atm is not possible due to how widely it's used. instead we should start to prefer usingPRNG_OR_SEED_LIKE
and maybe deprecateRANDOM_STATE_OR_SEED_LIKE
atcirq 2.0
and remove it bycirq 3.0