forked from mondus/com4521
Open
Labels
bug: Something isn't working
Description
The CUDA Makefiles
throughout the many, many branches of this repository only embed SM_35
PTX; they do not compile for any "real" architectures.
NVCC_FLAGS= -gencode arch=compute_35,code=compute_35
This means they will always perform PTX JIT compilation, even when running on an SM_35 device, resulting in wait time and potentially less useful error messages (e.g. if too much constant cache is requested, the error is just "a PTX JIT compilation failed").
Instead, IMO it should always request a real and a virtual arch as a minimum:
e.g.
NVCC_FLAGS= -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35
or
NVCC_FLAGS= -gencode arch=compute_35,code=[sm_35,compute_35]
are some of the many ways this could be achieved.
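As a rough sketch of how the first form might look when parameterised in a Makefile: the `SM` variable below is a hypothetical name, not something taken from the repository, and the default of 35 simply mirrors the existing flags.

```make
# Hypothetical variable: SM selects the target compute capability (default 35).
SM ?= 35

# Embed machine code (SASS) for the real architecture, so no JIT is needed on
# a matching device, plus PTX for the virtual architecture so newer devices
# can still JIT-compile the kernels.
NVCC_FLAGS = -gencode arch=compute_$(SM),code=sm_$(SM) \
             -gencode arch=compute_$(SM),code=compute_$(SM)
```

With something like this, `make SM=35` would reproduce the flags suggested above. Running `cuobjdump --list-elf` and `cuobjdump --list-ptx` on the built executable should then show which real and virtual architectures were actually embedded.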
Robadob