def test_mpi_allreduce_cpu(self):
    """Test on CPU that the allreduce correctly sums 1D, 2D, 3D tensors."""
    with mpi.Session() as session:
        size = session.run(mpi.size())
        dtypes = [tf.int32, tf.float32]
        dims = [1, 2, 3]
        for dtype, dim in itertools.product(dtypes, dims):
            tf.set_random_seed(1234)
            tensor = tf.random_uniform([17] * dim, -100, 100, dtype=dtype)
            summed = mpi.allreduce(tensor, average=False)
            multiplied = tensor * size
            max_difference = tf.reduce_max(tf.abs(summed - multiplied))

            # Threshold for floating point equality depends on number of
            # ranks, since we're comparing against precise multiplication.
            if size <= 3:
                threshold = 0
            elif size < 10:
                threshold = 1e-4
            elif size < 15:
                threshold = 5e-4
            else:
                break

            diff = session.run(max_difference)
            self.assertTrue(diff <= threshold,
                            "mpi.allreduce produces incorrect results")
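The threshold comment in the test above can be illustrated without MPI or TensorFlow. The following is only a sketch under stated assumptions: `to_f32`, `simulated_allreduce_sum`, and `max_difference` are hypothetical helpers (not part of `mpi` or `tf`), every rank is assumed to hold the same values because the test fixes the random seed, and the sum is assumed to be accumulated one rank at a time in float32. A real allreduce may combine ranks in a different order.

```python
import random
import struct


def to_f32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def simulated_allreduce_sum(values, size):
    """Sum `size` identical copies of `values`, rounding to float32 after each add.

    Hypothetical stand-in for mpi.allreduce(tensor, average=False).
    """
    total = list(values)
    for _ in range(size - 1):
        total = [to_f32(t + v) for t, v in zip(total, values)]
    return total


def max_difference(size, count=17, seed=1234):
    """Largest |summed - value * size| over `count` random float32 values."""
    rng = random.Random(seed)
    values = [to_f32(rng.uniform(-100, 100)) for _ in range(count)]
    summed = simulated_allreduce_sum(values, size)
    multiplied = [to_f32(v * size) for v in values]
    return max(abs(s - m) for s, m in zip(summed, multiplied))


# Print the observed difference for a few simulated rank counts.
for size in (1, 2, 3, 4, 8, 12):
    print(size, max_difference(size))
```

Under these assumptions the difference is exactly zero for up to three ranks: `x + x` is exact, and `(x + x) + x` rounds the exact value `3x` once, just as `x * 3` does. From four ranks onward, each further addition can introduce its own rounding error, which is consistent with the test allowing a nonzero threshold there.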
mpirun -n 1 python allgather.py
True True True True True True ...
mpirun -n 2 python allgather.py
2017-10-25 13:13:34.376886: E tensorflow/stream_executor/cuda/cuda_driver.cc:924] failed to allocate 998.75M (1047265280 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-10-25 13:13:34.379151: E tensorflow/stream_executor/cuda/cuda_driver.cc:924] failed to allocate 898.88M (942538752 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-10-25 13:13:34.381419: E tensorflow/stream_executor/cuda/cuda_driver.cc:924] failed to allocate 808.99M (848284928 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-10-25 13:13:34.383677: E tensorflow/stream_executor/cuda/cuda_driver.cc:924] failed to allocate 728.09M (763456512 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-10-25 13:13:34.385962: E tensorflow/stream_executor/cuda/cuda_driver.cc:924] failed to allocate 655.28M (687110912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2017-10-25 13:13:34.388247: E tensorflow/stream_executor/cuda/cuda_driver.cc:924] failed to allocate 589.75M (618400000 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
False False False False False False False False False False