Correlations of different DXSM streams with increments coresidual to MULTIPLIER-1 #90
** UPDATE: The following code only fails PractRand when seeded with 0; @michaelni provided an example below that fails for an arbitrary seed.
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#define STATE_BITS 64
#define STATET uint64_t
#define OUTT uint32_t
#define MULTIPLIER 6364136223846793005ULL
static OUTT pcg_dxsm(STATET *state, STATET inc) {
    *state = *state * MULTIPLIER + inc;   /* advance the underlying LCG */
    OUTT h = *state >> STATE_BITS / 2;    /* take the high half of the state */
    h ^= h >> STATE_BITS / 4;             /* xorshift */
    h *= MULTIPLIER;                      /* multiply */
    h ^= h >> 3 * STATE_BITS / 8;         /* xorshift again */
    return h * (*state | 1);              /* multiply by (state | 1); the return type truncates to the low half */
}
int main(int argc, char **argv) {
    uint32_t buff[4096];
    uint64_t seed = 0; /* EDIT: it only fails PractRand quickly when seeded with 0 */
    STATET state1 = seed;
    STATET state2 = seed;
    while (1) {
        /* interleave two streams that share a seed but use different increments */
        for (int i = 0; i < 2048; i++) {
            buff[2 * i]     = pcg_dxsm(&state1, 1);
            buff[2 * i + 1] = pcg_dxsm(&state2, MULTIPLIER);
        }
        fwrite(buff, sizeof(uint32_t), 4096, stdout);
    }
}
This fails PractRand quickly.
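For anyone wanting to reproduce this, the usual way to feed such a program to PractRand is to pipe its raw output into the RNG_test tool in stdin32 mode (this assumes RNG_test is built and on your PATH; the file name pcg_dxsm_test.c is just my placeholder):

cc -O2 pcg_dxsm_test.c -o pcg_dxsm_test
./pcg_dxsm_test | RNG_test stdin32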
If you reset the seed inside the loop, it will generate the same 4096-entry block repeatedly; I assume that is unintended.
Good catch, I edited it a bit before pasting it here.
If you fix a bug in the code, you need to re-run the PractRand test. The PractRand failure was likely because of the bug. To see a real correlation with identical seeds:

STATET state1 = seed;
STATET state2 = seed;
STATET inc1 = 1 - seed * (MULTIPLIER - 1);
STATET inc2 = MULTIPLIER - seed * (MULTIPLIER - 1);
while (1) {
    for (int i = 0; i < 2048; i++) {
        buff[2 * i]     = pcg_dxsm(&state1, inc1);
        buff[2 * i + 1] = pcg_dxsm(&state2, inc2);
    }
    fwrite(buff, sizeof(uint32_t), 4096, stdout);
}
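If it helps, here is why that particular pair is correlated (my own derivation, not something stated in the thread; write a = MULTIPLIER, let state_n be the LCG state after n calls, all arithmetic mod 2^STATE_BITS). Unrolling the recurrence gives state_n = a^n * seed + inc * S_n, where S_n = a^(n-1) + ... + a + 1 and (a - 1) * S_n = a^n - 1. Plugging in inc1 = 1 - seed*(a-1) yields state1_n = seed + S_n, and inc2 = a - seed*(a-1) yields state2_n = seed + a*S_n. Since S_(n+1) = a*S_n + 1, it follows that state2_n = state1_(n+1) - 1: the second stream is the first stream advanced by one step, with every state shifted down by 1. Two states that differ by 1 have identical high halves unless a borrow crosses the half-word boundary, and because the output multiplies the mixed high half by (state | 1), the two outputs are bit-identical whenever the larger state of the pair is odd, i.e. about half the time. Interleaving the streams therefore produces a glaring pattern that PractRand picks up immediately. Note also that inc2 - inc1 = MULTIPLIER - 1, matching the issue title.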
@michaelni is right. I first seeded them with 0, and it failed PractRand badly. Later I edited the code to seed them from a timestamp, but I made a mistake that caused it to fail PractRand too; that failure was my mistake.
No one who uses PCG in a sane way ever has to worry about these issues, as you need carefully contrived related seeds to show any issue, and if you're in the business of contriving carefully related seeding pairs, you can claim issues for pretty much all PRNGs (assuming you're allowed to set the complete PRNG state through seeding).

I recommend folks read this 2017 blog post of mine on the topic of PCG's streams. In that post I describe what's going on when you change the LCG additive constant, and how you can make LCG outputs that only differ by one bit if you contrive the constants just right. (There's also the thread on the Numpy GitHub.)

Overall, all PRNGs exist in a space of trade-offs. We could work harder to scramble the LCG output so that even if a pair of LCGs have outputs that differ by just one bit, the final permuted output would still be radically different, but that scrambling costs CPU cycles. Since you only see an issue with carefully contrived related choices for the stream, it's not something that is worth trying to address with stronger scrambling.

One thing the PCG paper invites readers to do is design their own permutation functions. There's basically a cookbook showing you techniques you can apply and rules to follow (as it turns out, before I ever wrote the PCG paper, Bob Jenkins provided similar advice, which boils down to the idea that mixing must be reversible). So if my permutations aren't strong enough for you, feel free to make your own. If you can permute more strongly and just as fast as the existing PCG permutations, feel free to share a pull request.
First I'd like to clarify that I intended to report the issues I found in a somewhat more coherent way, but I haven't found the time yet. Someone who wanted to help opened this issue. This is not what or how I planned to report things ;)
I think this statement is a little dangerous, because "sane way" is not very well defined. But I am certainly interested in how to initialize PCG-DXSM in a sane way if I want only non-correlated streams. I was told the seed needs to be the same; now, for each given fixed seed, there is a wide range of differing increment pairs that show correlations.
Pairs that differ by one bit in the LCG output are a tiny subset of the cases that show correlations. What I found concerning is that it seems not well understood how much of the space in PCG-DXSM consists of correlated streams. That is what I tried to look into a bit, by looking at what kinds of correlations there are, and in that respect I found it interesting every time I found a new class of correlations. I have not deeply looked into how much this is a practical issue with randomly selected seeds and increments, but it seems there are in excess of 2^64 correlated streams for each randomly picked stream in pcg64-dxsm. One can see that as a lot or as little, depending on point of view. Thanks
The right thing to do (for any PRNG, not just PCG) is use random entropy to initialize PRNG state (in PCG's case, that's the current LCG-state value and the LCG increment). That's it, that's all. If you're not doing that, you're almost certainly doing it wrong, again for almost any PRNG.

For SplitMix, for example (which is a permuted counter-based PRNG), if we just switch the stream (a.k.a. its increment or "gamma" as they call it) to the wrong (hand-chosen) value, we may end up with the same generator going backwards, or going forwards at half or double speed, which will obviously create correlations. The standard SplitMix implementation makes such errors harder by usually using its own internal state to choose the stream when you split, but it's not hard to make it do the wrong thing if we put our minds to it. So what's going on with PCG's streams is similar to some other generators of the past. As another example, you'd find similar correlations in Marsaglia's XorWow for different settings of the counter variable. XorWow is worse, because it doesn't do any permutation at all after adding in its counter.

In PCG's case, a way to think of the stream selected is that in effect we're choosing a different output permutation for the base PRNG. Generally speaking, if you pick two output permutations at random, they'll look very different even when the underlying base PRNGs are as perfectly lined up as they can be, but if you are somehow weirdly unlucky, they might not be that different. Likewise, for the underlying LCG sequence, your randomly-chosen starting points will likely be quite different from each other and not at all aligned. But it's possible you could be catastrophically unlucky and pick two nearby starting points and overlap. If you pick a random starting point and a random "stream", you've made it even more unlikely that you'll somehow choose wrong.

And the other thing is that these streams are incredibly cheap; they cost almost nothing in terms of execution time. LCG or MCG, we're talking about 1.25 ns per 64-bit random number (with an AMD Ryzen 9 7900X, but pretty similar on current Apple CPUs) before doing anything clever like using vector instructions. (I think this is all pretty well covered by that 2017 blog post.)
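For concreteness, here is a minimal sketch of that advice applied to the toy 64-bit generator above (my illustration, not code from this thread): fill both the state and the increment from OS entropy. I use getentropy() here; on glibc it is declared in <unistd.h>, on macOS in <sys/random.h>, and arc4random_buf() or a /dev/urandom read would do just as well.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>   /* getentropy() on glibc; see <sys/random.h> on macOS */

int main(void) {
    uint64_t state, inc;
    /* fill both the LCG state and the increment from OS entropy */
    if (getentropy(&state, sizeof state) != 0 ||
        getentropy(&inc,   sizeof inc)   != 0) {
        perror("getentropy");
        return EXIT_FAILURE;
    }
    inc |= 1;   /* the LCG increment must be odd for a full-period stream */
    printf("state = %016llx  inc = %016llx\n",
           (unsigned long long)state, (unsigned long long)inc);
    return 0;
}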
OK, so if we randomly initialize a PRNG with 128+127 bits (that being seed and increment/stream selector) and draw, let's say, a billion values from it: how often can we do this before our set contains a correlated pair of streams? With an ideal/perfect PRNG, I think we would roughly need sqrt(2^(128-30) * 2^127) ~ 2^112 draws (simply the birthday paradox hitting overlapping sequences) for the 64-bit output / 128-bit state case, and 2^48 for the 32-bit output / 64-bit state case. What can we expect to be able to draw out of PCG-DXSM here before we see correlations? I am asking this because it is a clearer way to quantify the amount of correlations, and because I have not seen this stated anywhere; at the moment I have a feeling these values are lower than that, but no proof. (I don't want to waste my time trying to prove correlations occurring in a random set when that is already known/expected; on the other hand, if it is not known, that is an interesting exercise.)
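(To spell out where that 2^112 estimate comes from, as I read it: a 2^128-period stream contains about 2^128 / 2^30 = 2^98 disjoint billion-value windows, and with 2^127 usable increments that is roughly 2^98 * 2^127 = 2^225 windows in total; by the birthday bound, the first coincidence is expected after about sqrt(2^225) ≈ 2^112 draws. The small case is the same computation: sqrt(2^(64-30) * 2^63) ≈ 2^48.)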
First, let's just think about how much compute that is in this best case. Let's assume 1 ns per random number; that's (2^112 * 2^30) ns. And if you want to store all the results, it's (2^112 * 2^30 * 2^64) bits; if we use one atom of silicon per bit somehow, that's going to be about a million stellar masses' worth to store them all.

Let's suppose that somehow PCG DXSM were only half that, with you only needing to collect 2^56 separate outputs. Now you just need to devote 2.5 billion years of compute to your problem, and your one-atom-per-bit computer is just going to need 600 quadrillion tons of silicon, if you can somehow manage 1 atom per bit. There's enough silicon in Earth's crust to do that, but obviously the number goes up if you need more than one atom. Then you're going to run your algorithm to hunt for the correlations. Let's hope it's somehow O(n).

But then we have another interesting question. If we've got that much storage and that much compute, what would happen if we took other PRNGs and looked for (possibly weak) correlations in their outputs? What would we find? As I recall, some evidence during the Numpy discussion showed that (some) LFSRs can also be correlated even when not exactly overlapping.

You also mention using DXSM for smaller-state PRNGs. But that would be a bit silly. When the main parts of your state fit in just one 64-bit register, it's much easier to use RXSM. Part of the point of the 128-bit versions is to pick a permutation that is a bit weaker because it's good enough and cheaper.

Finally, let's switch gears, and imagine we have someone who uses some awesome PRNG, and decides to pass the output of
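A quick sanity check of the stellar-mass figure (my arithmetic, not from the comment): 2^112 * 2^30 * 2^64 = 2^206 ≈ 1.0 × 10^62 bits. One silicon atom has a mass of about 4.7 × 10^-26 kg, so one atom per bit comes to about 4.8 × 10^36 kg, and dividing by a solar mass (≈ 2.0 × 10^30 kg) gives roughly 2 × 10^6 Suns, so "about a million stellar masses" is indeed the right order of magnitude.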
Ultimately, a pencil and a sheet of paper would do. The goal is to find some bounds on the correlated sequences. And I think this is useful, as this PRNG is used by default in a few places.
There's a large number of weak PRNGs, and it takes effort to analyze each specific PRNG. (Because my galaxy-sized computer collapsed into a black hole, I cannot brute-force compute these correlations.)
People do interesting things; my bank encrypted my data sent over email with my birthday as the password, for example. I tried telling them that all living people don't have a whole lot of different birthdays, but they never replied.
@imneme's point was that the exploit you found was not something that people had to worry about (No one who uses PCG in a sane way ever has to worry about these issues, as you need carefully contrived related seeds to show any issue, and if you're in the business of contriving carefully related seeding pairs, you can claim issues for pretty much all PRNGs). I am curious: do you disagree with her point?
@lemire, I have not opened this issue. What I intended was to report my investigation of the stream correlations once I finished it. It was just some reader of my blog who opened this issue (and I am certain he just wanted to help, but it caused a bit of confusion, I think). The code posted here does not represent very well what I found, btw. I was quite surprised by the amount of correlations in PCG-DXSM, and I do not believe that all PRNGs have such issues. The thing people need to worry about is if they generate seeds and increments in some non-random way. What I will do is look more into this as my time permits, and I'll report what I find. Only after that can I say if there's anything people need to worry about. This is "work in progress".
@michaelni That's a fair answer and I think you are saying "maybe it matters".
I think more insight into how things work is overall a good thing, and investigating these kinds of questions is interesting, or even fascinating. So I think it's cool to see folks get interested in PRNGs and explore, and there are numerous rabbit holes to head down. But if I were making a list of things typical PRNG users "should know", that nuance would be pretty far down the list. As I mentioned earlier, far higher on my list would be having them understand that a single 32-bit number is inadequate to seed any PRNG, as would understanding you can't just use

One other thing is that while someone using the Frontier supercomputer possibly does need to think about potential issues when drawing more than 2^80 random numbers (since their machine might consume 2^60 numbers per second), most folks are never going to be in that situation, so it'd be a moot concern. Alas, that mootness might not be apparent; in my experience, people don't always come to sensible conclusions when given a possible issue to worry about. To give some other examples, some folks are fearful of randomized algorithms because they worry that infinitesimal probabilities "might happen". Told that there's a 3.8 × 10^-66 chance that random-pivot quicksort might use more than 100 stack slots when sorting 1000 numbers, some folks will say "oh, so there is a chance", when in practice this kind of probability equates to "don't worry about it, it'll never happen". Closer to home, I saw some fear-based thinking in your blog post where you wrote:
I recommend you read the analysis of JSF in my blog post Random Invertible Mapping Statistics. In particular, I show there would only be a short cycle in

So to my eye, you rejected a perfectly good generator that has withstood some of the most targeted testing of any out there. (Fun fact: according to David Blackman, a fair amount of effort in PRNG testing was done just to try to "bust this little PRNG".)

In short, often people can get obsessed with worrying about vanishingly unlikely things, while at the same time being blind to issues that might have real impact. But stepping back, the high-order bit is that I'm pleased to see your work, and I look forward to seeing what you find. PCG has always been considered a family with the idea that new family members can be added. If it turns out that some audiences want even stronger scrambling, we can see what we can come up with. We always know that we could resort to applying a maximal-avalanche hash function, but the question is always whether we can get what we need with less compute than that, because the big concern for many people is how much it costs to generate a number.
I don't know if this fear is baseless. There's CVE-2017-1000373 about the OpenBSD quicksort, for example.
I've removed all specific mentions of omitted PRNGs in that article and replaced them with generic and more concise text. I think that reduces some unintended ways of reading it.
I don't know what the definition of a random invertible mapping is. I would imagine it is, in this case, a list of 2²⁵⁶ distinct 256-bit values filled from a true random number source. I am not sure a simple PRNG sufficiently matches this definition.
I am sure there are many great PRNGs I did not list, because I don't know them, have not had the time to investigate them, or maybe even because I spent too much time investigating them :)
Perhaps you glanced over that CVE too quickly—it is specifically about a non-randomized quicksort.
First, a random mapping is another name for a (randomly constructed) endofunction, but typically viewed through the lens of a directed pseudoforest, where we see the function as a graph and consider questions like number of cycles, distance to first cycle, node in-degree, etc. These are discussed in the paper Random Mapping Statistics and the same authors' book, Analytic Combinatorics (section VII.3.3, pages 462-467). As in all discussions of functions, the word invertible means "the function has an inverse". In other words, a random invertible mapping is another name for a bijection/permutation, but again seen from more of a graph perspective, and each vertex has in-degree and out-degree of exactly one. For a PRNG, it is a graph connecting next/previous state.

But as it turns out, although my Google-fu had failed me when I wrote that article (causing me to just derive things myself), it didn't fail me tonight: turns out that I should have searched for Random Permutation Statistics (duh!). For the analysis of JSF, we just need the fact that "A random permutation of length at least m contains on average 1/m cycles of length m" (or an equivalent result) for it to be trivial to derive my claim that the "number of nodes on cycles < 2^k is 2^k". Hopefully that strengthens your sense of the proof.
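If you would rather check that cycle-count fact empirically than take the combinatorics on faith, here is a small self-contained simulation (my sketch, not from the blog post): it draws random permutations with a Fisher-Yates shuffle and averages the number of m-cycles, which should come out near 1/m.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define N       1024      /* permutation size */
#define TRIALS  20000     /* number of random permutations sampled */
#define MAXM    8         /* largest cycle length we tally */

int main(void) {
    static int perm[N];
    static unsigned char seen[N];
    static double count[MAXM];              /* count[m-1] = total m-cycles seen */
    srand(12345);
    for (int t = 0; t < TRIALS; t++) {
        for (int i = 0; i < N; i++) perm[i] = i;
        for (int i = N - 1; i > 0; i--) {   /* Fisher-Yates shuffle; rand()%n    */
            int j = rand() % (i + 1);       /* has slight modulo bias, fine here */
            int tmp = perm[i]; perm[i] = perm[j]; perm[j] = tmp;
        }
        memset(seen, 0, sizeof seen);
        for (int i = 0; i < N; i++) {       /* walk each cycle exactly once */
            if (seen[i]) continue;
            int len = 0;
            for (int j = i; !seen[j]; j = perm[j]) { seen[j] = 1; len++; }
            if (len <= MAXM) count[len - 1] += 1.0;
        }
    }
    for (int m = 1; m <= MAXM; m++)
        printf("m=%d: average #cycles %.4f (theory 1/m = %.4f)\n",
               m, count[m - 1] / TRIALS, 1.0 / m);
    return 0;
}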
JSF is worth knowing about, not least because Bob Jenkins has so much good stuff on his delightfully "old-school" website. Although his pages on PRNGs are most relevant to this conversation, the page helping people grasp orders of magnitude and polynomial vs. exponential is also good stuff. In particular, you might like his October 2007 talk on PRNG design where he says "no nice math". Specifically, his slides say "nice math causes long range patterns", so that might resonate with you, and perhaps leave you wondering how much self-similarity there might be in other PRNGs just waiting to be uncovered.
That looks interesting. If I find the time, I'll read these.
The problem I have is that a human-written PRNG is not a random sample from all possible "random invertible mappings". I can prove my point by simply pointing to an LFSR, an LCG, the simple identity function, or others. None of them follow the claimed probability particularly closely. So why would this work for JSF?
Back on the topic of PCG-DXSM, I have finished (or I hope I have) my investigation of the correlations, and summarized them on my blog: https://guru.multimedia.cx/correlations-in-randomly-initialized-pcg-dxsm/ In summary, there are many correlations, and it seems to me possible that this could affect specific use cases on supercomputers, or otherwise anything needing really a lot of random numbers. Also, it seems there are correlations within the 2^128 period of a single stream that one could hit if one cloned a stream and moved one of the two by a multiple of a large power of 2. Of course, I do hope I made a mistake and none of this affects anyone, or that my code was just buggy. But I think the issue is real.
You raise a good point, but I think this is also why Bob Jenkins had the rule "no nice math". The problem is that nice math does indeed introduce regularities, which is why you want the state-to-state transformation to be more chaotic. But, FWIW, as I mentioned in the PCG paper, the usual way to study this is to build smaller-scale versions of your PRNG and see what they do. If your small-scale versions pretty much always match theory, then you can expect larger-scale ones to do the same. This implementation I made has
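To make the scaled-down idea concrete in the notation of the code at the top of this thread, here is what a 16-bit-state / 8-bit-output DXSM might look like (my sketch; the multiplier 0x2D is an arbitrary odd constant chosen for illustration, not a constant from any published PCG variant). At this size, every (seed, increment) pair can be enumerated exhaustively and compared against theory.

#include <stdint.h>

/* 16-bit state, 8-bit output; every shift is the 64/32 version scaled by 4 */
static uint8_t pcg_dxsm_tiny(uint16_t *state, uint16_t inc) {
    *state = (uint16_t)(*state * 0x2D + inc);  /* tiny LCG step */
    uint8_t h = (uint8_t)(*state >> 8);        /* high half of the state */
    h ^= h >> 4;                               /* STATE_BITS/4 = 4    */
    h *= 0x2D;
    h ^= h >> 6;                               /* 3*STATE_BITS/8 = 6  */
    return (uint8_t)(h * (*state | 1));        /* times (low half | 1) */
}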
Thanks for your efforts on this. One thing that's a bit odd in your article is that you recommend XSM64, but that is described in the PractRand docs as follows:
To my eye, XSM64 is also a permuted [linear] congruential generator, just with a stronger output function. (FWIW, I was ignorant of PractRand and its XSM generators when I wrote the PCG paper; most of the academic stuff I found was focused on TestU01.)

The reality is that for any PRNG with only 128 bits of changing state, theory says you'll be able to observe a statistical anomaly after "only" 2^64 128-bit outputs, well within the reach of our supercomputer. You just use the birthday test, and you don't even need to store all 2^64 outputs, just compute them. Either you'll hit a failure to generate an expected repeat (which is a statistical issue) or your generator is not uniform (also a problem). It literally has to be one or the other. (You can make 128-bit outputs by pairing two 64-bit outputs X and Y to make XY and YX.) If you check the literature (e.g., various articles by L'Ecuyer), in general people say don't use more than the square root of the period from any PRNG. In practice you can get away with more, but it's always going to be situational: if you give me a supercomputer and any 2^128-period PRNG, I'll happily fail it.

As I think I've said, it's always all about trade-offs. You can have more changing state or stronger output permutations, but it costs compute. What gives you the most bang for the compute buck is always something to be considering, especially as it changes over time. And if you do need to generate more than 2^64 random numbers, you should probably do some homework on those trade-offs.
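Spelling out the birthday numbers (my arithmetic): among N independent uniform 128-bit values, the expected number of exact repeats is about N*(N-1)/2 / 2^128, so N = 2^64 gives roughly 0.5 expected repeats and N = 2^66 about 8. If the map from state to 128-bit output pair happens to be injective, a generator stepping through a single 2^128-cycle can never repeat within the sample, and observing 0 repeats where 8 are expected happens by chance with probability about e^-8 ≈ 3 × 10^-4; if the map is not injective, the outputs are not uniform over the period. Either way the test bites.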
First I'd like to clarify that I am not really recommending any specific PRNG as a solution here. But
So I don't think a sqrt(seedspace) birthday attack will work.
XSM64 seems to use (1<<64) + 1 as the multiplier in its LCG, which in an implementation with 64-bit variables simplifies to a single addition. So it is an LCG, yes. Also, I think XSM64 placing the complexity in the mixer and not in the LCG multiplier is a wise choice. I really think XSM64 should be looked at before creating a new LCG+mixer.
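My reading of that simplification, as a sketch (this is the arithmetic identity only, not PractRand's actual XSM64 code): multiplying a 128-bit state held in two 64-bit halves by (1<<64)+1 gives (s << 64) + s mod 2^128, so the multiply leaves the low half untouched and just adds the low half into the high half; with the increment folded in, the whole LCG step is a handful of additions.

#include <stdint.h>

typedef struct { uint64_t hi, lo; } u128;

/* one step of s = s * ((1<<64)+1) + inc  (mod 2^128) */
static void lcg_step(u128 *s, u128 inc) {
    uint64_t new_lo = s->lo + inc.lo;
    uint64_t carry  = new_lo < s->lo;   /* carry out of the low half  */
    s->hi += s->lo + inc.hi + carry;    /* the multiply became an add */
    s->lo  = new_lo;
}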
Michael developed some exploits against the DXSM output scrambler.
See: https://guru.multimedia.cx/how-correlated-is-pcg-dxsm-output/