-
Notifications
You must be signed in to change notification settings - Fork 102
Open
Description
When benchmarking an operation that ends up returning a CuArray
, the REPL becomes irresponsive for several seconds after the output of the benchmark was printed, only for printing the resulting array.
This is what I see in the example below:
- Computing and printing the product of two arrays is fast, as expected.
- Now when I try to benchmark the same operation:
2. 1. Until printing1000×1000 CuArray{Float64, 2, CUDA.DeviceMemory}:
everything is normal.
2. 2. It takes 2 minutes to print the resulting array, after 2.1.
julia> using CUDA, BenchmarkTools
julia> @time CuArray(rand(1000,1000)) * CuArray(rand(1000,1000))
0.021615 seconds (81 allocations: 15.261 MiB, 14.13% gc time)
1000×1000 CuArray{Float64, 2, CUDA.DeviceMemory}:
250.421 253.464 258.525 246.037 … 244.642 236.536 258.177
254.117 262.279 273.359 250.726 248.223 247.411 268.022
251.798 257.8 259.521 252.819 244.667 250.107 261.909
254.53 259.457 264.3 250.033 252.791 248.546 267.495
⋮ ⋱
250.403 260.72 258.866 243.41 240.832 244.33 259.653
251.661 260.73 260.969 246.357 240.858 242.686 264.053
259.919 266.168 266.078 248.596 249.35 249.275 270.664
251.824 255.709 263.872 253.726 242.164 241.979 260.17
julia> @time @btime $(CuArray(rand(1000,1000))) * $(CuArray(rand(1000,1000)))
28.111 μs (50 allocations: 1.17 KiB)
12.173401 seconds (73.87 k allocations: 17.424 MiB, 11.67% gc time, 0.18% compilation time)
1000×1000 CuArray{Float64, 2, CUDA.DeviceMemory}: ## <<< 2 minutes after this to print the array
251.765 244.449 255.83 235.117 … 253.868 246.983 250.364
249.094 242.495 253.292 237.452 251.398 248.822 250.646
250.96 247.65 259.761 235.696 258.37 250.399 254.729
251.986 242.604 259.55 234.619 253.424 248.207 250.921
⋮ ⋱
250.439 253.406 264.648 244.909 263.801 253.172 254.22
247.734 240.944 257.84 234.611 252.111 242.606 250.686
246.776 237.175 257.676 233.803 246.963 242.613 247.432
257.374 244.904 259.754 238.249 257.431 246.478 248.384
julia> versioninfo()
Julia Version 1.10.6
Commit 67dffc4a8ae (2024-10-28 12:23 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 8 × Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
JULIA_EDITOR = vim
JULIA_PKG_PRESERVE_TIERED_INSTALLED = true
(jl_1hRjVB) pkg> st
Status `/tmp/jl_1hRjVB/Project.toml`
[6e4b80f9] BenchmarkTools v1.5.0
[052768ef] CUDA v5.5.2
Metadata
Metadata
Assignees
Labels
No labels