-
Notifications
You must be signed in to change notification settings - Fork 373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modernize random number generation for NEST 3 #1440
Comments
An important issue is statistical independence of random number streams, which usually comes into play when generating multiple streams of random numbers. Many strategies that claim to produce this kind of streams don't in fact produce independent random number sequences (one example is PCG's use of additive constants as stream numbers). It's hard to determine whether a given strategy (for a given PRNG) will produce independent streams without testing it.
This is another issue on its own. Typically Finally, the random number engines currently available in C++ leave a lot to be desired. For example, See also my criteria for high-quality random generators. |
According to the paper "A new specification for std::generate_canonical", there are flaws in the current definition of |
@peteroupc Thank you for your input! Concerning the weakness of the Mersenne Twisters and similar generators: These weaknesses have been known for over a decade, but I have not yet seen any publication indicating that this weakness affects simulation experiments. Ferrenberg and Landau showed in 1992 that some "good old generators" distorted the results of physics simulations (Ferrenberg and Landau, Phys Rev Lett 69:3382, 1992), but I have not come across any comparable critique of MT generators. |
I know of one: "It Is High Time We Let Go of the Mersenne Twister", S. Vigna. |
@peteroupc Thanks for the pointer to the problem that Vigna's paper is interesting, especially the example of Sec 2.1. Unfortunately, he does not provide any estimates about at which matrix size one would expect to see problems for the full-sized MT19937. In NEST, we have at least since 2002 tried to follow Knuth's advice (TAOCP, Ch 3.6)
by making it easy for the user to change the RNG used. We will continue to do so and strongly encourage users to re-run simulations using generators from different families. |
@vigna, do you have a comment? |
|
See also https://bashtage.github.io/randomgen/index.html on a modern generator suite for NumPy. |
The overlap analysis including the r^1/3 rule of L'Ecuyer indicates that periods of 2^128 are too short to avoid overlaps. With a period of 2^256, the probability of overlap is 10^-30, which seems acceptable. I haven't seen (but not searched either) for generators with period 2^192.
Indeed, that's why I propose to include Random123 which provides such generators.
Random number generators are a small, if crucial, component of NEST. We do not want to engage in RNG development ourselves, but build on well established software, preferably with a wide user community (increases risk of weaknesses coming to light), active developer community (bugfixes, performance improvements, adaptation to evolving standards) and good development practices (code repo, issue tracker, code review, CI testing). I would not want to invest time in devising new schemes ourselves (we need to focus our resources where we are specialists). |
@heplesser there's no need to guess because I just published an exact computation of the overlap probability: http://vigna.di.unimi.it/ftp/papers/overlap.pdf If you have period P and n processors using L outputs the probability of overlap is bounded by n²L/P, and for n²L much smaller than P the bound is quite precise (the paper contains exacy upper and lower bounds). |
20000⨉20000. You just need it to be larger than the number of state bits. So, quite large. It takes more time to count the odd coefficients (probably a day or two per sample—an educated guess from the fact that the algorithm is cubic). If you want I can try in my spare time. |
I would like to put up a specific use case from a project by @sdasbach for discussion: It became necessary to assess the influence of different sources of randomization on the network dynamics. In particular, we independently randomized node parameters, the selection of connections to be created, and the parametrization of connections. What would be the preferred way to do that in the new framework? |
@vigna Thanks! |
@vigna I am slightly confused by this post. Was this meant for a different discussion? |
@jhnnsnk In NEST3 (as in NEST2) you will be able to re-seed or even exchange the RNGs at any time. So you could perform each step, then select a different RNG type (or keep the type but choose new seeds). In order to use different sources of randomness for which connections to create and which parameters to use for them, you would have to create the connections first and then set the parameters on them. But if the RNGs used are any good, you should not need to do this—a single stream of numbers per VP should suffice. I am very curious about what you and @sdasbach found (offline if you do not want to spill the news before a possible publication). |
@heplesser I was answering to
|
... and then libc++ breaks it all ...@hakonsbm points out that Clang will not accept generators wrapped by a base class, which is required to allow for flexible exchange of RNG types at runtime. The reason is subtle, entirely standard-conformant, most likely performance enhancing, but a deal-breaker for us. The standard defines in §29.6.1.3, Table 103, that the Of the three C++ Library implementations, libc++ exploits this fully in const size_t __logR = __log2<uint64_t, _URNG::max() - _URNG::min() + uint64_t(1)>::value; This obviously precludes any idea of passing generators via base-class pointers. Excluding Clang/libc++ is not an option, and we cannot exclude that the other C++ Lib implementations would change to static method access, too. I have a vague idea of how to work around this, but it is so complex I would not want to consider it. The only way out would be to select the RNG type at compile time, but this would make it far too difficult for users to adhere to the good practice of cross-checking results with different RNG types. So it seems that we cannot use the C++ Library random generators and distributions after all. |
@heplesser Thanks for your reply. In our previous work with NEST2 we used the option to re-seed before each step, but we had to use Python random number generators in addition which is not necessary any more. |
@heplesser @jougs By creating a wrapper class for the distributions, I have found a way to call the distribution stored in the distribution wrapper with the generator stored in the RNG wrapper, making all the compilers happy. The RNGs are still exchangeable at runtime, and (most of) the interface and flexibility is kept the same. |
@hakonsbm Sounds great! |
Rademacher FPL author here. Thank you for mentioning my library. The three relevant properties of a uniform random floating-point number generator for the range
For the interval For intervals If you need to draw from all floating-point values in |
The most recent implementation is now on my branch. Compiling with Clang, there are currently SLI tests
Python tests
|
This issue is a follow-up of #245, #349, #508, #1296, #1381, #1405.
This exposition is based on discussions with NEST developers, especially @jougs and @hakonsbm.
Background
The current NEST random number architecture dates back to 2002 and is thoroughly outdated today. The arrival of carefully standardized RNGs with the C++11 standard, the essentially complete transition to 64-bit CPUs, the availability of dedicated cryptographic RNG hardware (AES) and the transition from 100s to 100s of 1000s of parallel processes create an entirely different landscape for random number generation today.
Key weaknesses include
General considerations
NEST
Use of random numbers
Random numbers are consumed in NEST
In all cases except certain sub-cases of 2.i, randomization is tied to a node, which in turn is assigned to a specific virtual process, either because the node is concerned (cases 1, 3) or because the node is the target of a connection (case 2). NEST design guarantees that nodes assigned to different VPs can be updated independent of each other, and the same holds for connections if created or parameterized from a target-node perspective. Therefore, for all these purposes, one random number stream per VP is required and sufficient to ensure that different VPs can operate independently.
The only exception are certain sub-cases of 2.i in which all VPs need to draw identical random number streams and discard all numbers that are irrelevant. This pertains to
fixed_outdegree
andfixed_total_number
(initial step) connection patterns. For these purposes, a global random number stream is required.Quantity of random numbers required
NEST therefore requires in total N_VP+1 random number streams. With a view to future exascale computers, we need to prepare for
N_VP=O(10^7)
random streams. We can estimate the number of random numbers required per stream as follows:N = 1000
neurons per stream; this results in a total ofO(10^11)
neurons, on the scale of the human brain and thus a sensible upper limit.~1000 x 4 x 10^6 = 4 x 10^10
random numbers per stream (consumption during network construction will usually be negligible).Replicability guarantee
A NEST simulation compiled with the same compiler and library, on the same computer hardware and initialised with a fixed random seed, shall generate exactly the same output on each execution.
Exchangeable random number generator
The user shall in an easy way, in particular without recompilation, be able to perform the same simulation with different random number generators. This allows the user to test whether effects observed in simulations may be artefacts of a specific RNG type.
Parallel random number streams
Collision risk
L'Ecuyer et al (2017) provide a recent account of random number generation in highly parallel settings. The main approach to providing independent random number streams in parallel simulations is to use a high-quality random number generator with long period and seed it with a different seed for each stream. They then argue that if we have
s
streams of lengthl
and a generator with periodr
, the probability that two streams seeded with different seeds will overlap is given byWe have
so assuming an rng period of
r = 2^128
we arrive atwhile for a period of
r = 2^256
we obtainThe probability of stream collisions is thus negligibly small, provided we use random number generators with a period of at least
2^128
.L'Ecuyer (2012, Ch 3.6) points out that certain random number generators classes will show too much regularity in their output if more than
r^1/3
numbers are used (this relation is based on exercises in Knuth, TAOCP vol 2, see L'Ecuyer and Simard (2001)). While it is not certain that that analysis also applies to (short) subsequences of the period of an RNG, one might still re-consider the analysis above, but usingl^3 ~ 2^111
instead ofl
when computingp_overlap
. We then obtainPeriod
r
|p_overlap
2^128 ~ 10^38
|>> 1
2^256 ~ 10^77
|2^-99 ~ 10^-30
2^512 ~ 10^154
|2^-355 ~10^-107
Thus, for generators with periods of at least
2^256
, we have negligible collision probability also under this tightened criterium.Available seeds
Traditional
seed()
functions typically accept a 32-bit integer as input and initialize RNG state from this. This allows only for4 x 10^9
different seeds, so if a simulation requires10^7
streams, we can at most perform400
different simulations. This is not acceptable and generators with reliable, better seed generators are required.Available values
Many existing RNGs return 32-bit integers, which are converted into
[0, 1)-uniform
random numbers by simple division. This allows at most2^32=4 x 10^9
different values even for 53-bit-mantissa doubles (see also boostorg/random#61). Given that our streams consume10^11
numbers, we would expect every possible value to occur many times over during a simulation, which is not ideal. But keep in mind that while we will see the same value many times, it will occur in a different sequence of numbers every time provide the underlying generator has sufficiently long period.Random number and deviate generator libraries
The following random number and deviate generator libraries have been considered primarily:
GSL and Boost
The GSL random number facilities work strictly with 32 bit random numbers and seeds. This strongly limits there utility in view of the requirements listed above.
The Boost Library Random module uses only 32 bits in generating uniform-[0, 1)-distributed numbers, which form the basis of all random deviate generation. Many deviate generators also appear to rely heavily on constant or pre-computed tables (e.g. normal distribution), which seems little cache efficient and may even cause slow memory access if static data is local to a single CPU. Therefore, the Boost Random module also seems of limited value.
C++ Standard Library
The random number and distribution components of the C++ Standard Library are defined in great detail in §29.6 of the C++ Standard; all references to the standard in the following are to the (see C++17 Working Draft N4659, which is openly available).
Currently, three implementations of the C++ standard library are available
random
,bits/random.h
,bits/random.tcc
random
)random
)Where the standard leaves aspects to the implementations, checking these three implementations provides comprehensive answers for all practical purposes.
Advantages
double
(generate_canonical()
, §29.6.7.2). This is independent of whether the underlying generator is a 32 or 64 bit generator.*_distribution
classes in all three implementations inspected consume random numbers only viagenerate_canonical()
, i.e., using 53 bits of randomness.operator<<()
andoperator>>()
.Disadvantages
Other generators
A number of other random number generator libraries are available. They usually provide a small number of raw random number generators and would require adaptation to be usable by random distribution generators. Many of these libraries do not seem to follow modern open source code development best practices (e.g. code in open repositories with systematic bug tracking, code review and automated testing).
xoshiro
class of generators recently proposed by Blackman and Vigna looks interesting, although the paper on the generator is currently only available as preprint and seems not to have been published in peer-reviewed form yet.2^128
, too short for our purposes.Recommendations
Principles
user_seed
.seed_seq(user_seed, f(stream_no))
.Comments
random
, only the MT provides sufficient period and speed. It should be included in 32- and 64-bit variant. Theranlux
generators are painfully slow.generate_canonical()
in our base class to prevent 1 from being returned or even implement the more advanced solution proposed in ISO/IEC JTC1 SC22 WG21 P0952R0; see also P0952 R2 A new specification for std::generate_canonical cplusplus/papers#574.-march
flags as a CMake option to activate hardware support for AES-based generators.-march
).xoshiroXXX
withXXX >= 256
also seem interesting, but require wrapping.constexpr result_type min()
andmax()
. Sinceresult_type
can differ for different generators (32 or 64 bit), we need to define these methods with return typeuint_fast64_t
so they can hold all possible values, and protect against generators with even larger return type bystatic_assert()
. Integer promotion rules make this approach safe in light of thegenerate_canonical()
implementations in the three STL variants.user_seed
provided.user_seed
is never used "as is".stream_no
is identical to thread number for per-thread streams, andnum_threads+1
for the global stream.f(stream_no)
is still to be determined. It needs should not return zero and must be guarantee to map10^7
differentstream_no
values to different output values.std::seed_seq
may have weaknesses. But using this architecture, we could later replace the standard-defined seeding rule by a different one.The text was updated successfully, but these errors were encountered: