I have noticed that the PyTorch dataset implementations rely on np.random.RandomState for their randomness.
I wonder whether this might compromise pseudo-randomness in multiprocess scenarios:
PyTorch uses multiprocessing to load data in parallel. The worker processes are created with the fork start method, so each worker inherits all resources of the parent, including the state of NumPy's global random number generator. As a result, every worker starts from the same RNG state and draws the same "random" sequence, e.g. identical augmentations across workers.
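To make this concrete, here is a minimal sketch (my own, not taken from any PyTorch dataset) of a Dataset that draws items from NumPy's global RNG; on a platform where workers are forked, two workers produce pairwise-identical values:

```python
import numpy as np
from torch.utils.data import Dataset, DataLoader


class RandomDataset(Dataset):
    """A toy dataset whose items come from NumPy's global RNG."""

    def __len__(self):
        return 4

    def __getitem__(self, idx):
        # Under the fork start method, every worker inherits the parent's
        # RNG state, so this call yields the same sequence in each worker.
        return np.random.random()


if __name__ == "__main__":
    loader = DataLoader(RandomDataset(), num_workers=2)
    print([batch.item() for batch in loader])
    # With two workers, items pair up: both workers start from the
    # identical inherited state, so each value appears twice.
```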
A detailed explanation of the problem I have in mind is available here.
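One common mitigation, sketched below as a suggestion rather than anything the current dataset implementations do, is to re-seed NumPy in each worker via DataLoader's worker_init_fn. Inside a worker, torch.initial_seed() returns the DataLoader's base seed plus the worker id, so each worker gets a distinct NumPy stream (RandomDataset is the toy dataset from the sketch above):

```python
import numpy as np
import torch
from torch.utils.data import DataLoader


def worker_init_fn(worker_id):
    # torch.initial_seed() inside a worker is base_seed + worker_id,
    # so each worker re-seeds NumPy with a distinct 32-bit value.
    np.random.seed(torch.initial_seed() % 2**32)


loader = DataLoader(RandomDataset(), num_workers=2,
                    worker_init_fn=worker_init_fn)
```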