Makefile -gencode only embed ptx  #9

@ptheywood

The CUDA Makefiles throughout the many, many branches of this repository only embed SM_35 PTX; they do not compile for any "real" architectures.

NVCC_FLAGS= -gencode arch=compute_35,code=compute_35

This means they will always perform PTX JIT compilation, even when running on an SM_35 device, resulting in a wait at startup and potentially less useful error messages (e.g. if too much constant cache is requested, the error is just that PTX JIT compilation failed).

Instead, IMO it should always request a real and a virtual arch as a minimum:

e.g.

NVCC_FLAGS= -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35

or

NVCC_FLAGS= -gencode arch=compute_35,code=[sm_35,compute_35]

are some of the many ways this could be achieved.
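
As a rough sketch of how this could be generalised, the fragment below builds the flags from a list of architectures, assuming GNU make. The variable names SM_ARCHS, PTX_ARCH and GENCODE_FLAGS are hypothetical and do not exist in the repository's Makefiles; only NVCC_FLAGS comes from them. It emits one real (SASS) target per listed architecture, plus PTX for one architecture so that newer devices can still JIT.

```makefile
# Hypothetical variables, not taken from the repository's Makefiles.
# Real architectures to compile SASS for (space-separated, e.g. "35 60").
SM_ARCHS ?= 35
# Architecture to embed PTX for; assumes SM_ARCHS is listed in ascending order.
PTX_ARCH ?= $(lastword $(SM_ARCHS))

# One -gencode per real architecture: compute_XX front end, sm_XX machine code.
GENCODE_FLAGS := $(foreach sm,$(SM_ARCHS),-gencode arch=compute_$(sm),code=sm_$(sm))
# Also embed PTX so devices newer than those listed can fall back to JIT.
GENCODE_FLAGS += -gencode arch=compute_$(PTX_ARCH),code=compute_$(PTX_ARCH)

NVCC_FLAGS = $(GENCODE_FLAGS)
```

With the default SM_ARCHS this expands to the same two -gencode options as the first example above; something like `make SM_ARCHS="35 60"` would add further real targets without editing the file.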

NVCC docs
