apaszke
Collaborator

@apaszke apaszke commented Nov 15, 2021

We sometimes emit kernels that require lots of registers and cannot be
scheduled in 1024-sized blocks. This uses a CUDA driver API to query for
a good block size. In the future we might want to cache this number to
avoid any driver-related overheads.
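
To illustrate why a register-heavy kernel cannot launch with 1024 threads per block, here is a rough back-of-the-envelope sketch. It is not the driver's actual algorithm, which also accounts for shared memory and allocation granularity. The limits below (65536 registers per block, 1024 threads per block, warp size 32) are assumptions typical of recent NVIDIA GPUs, and `max_block_size` is a hypothetical helper, not something from this PR.

```python
# Simplified register-pressure estimate. This is NOT what the CUDA driver
# computes; it only shows why 1024-thread blocks can be unschedulable.
# Assumed hardware limits (check the values for your actual device):
REGS_PER_BLOCK = 65536        # assumed 32-bit registers available to one block
MAX_THREADS_PER_BLOCK = 1024  # assumed hard cap on threads per block
WARP_SIZE = 32                # threads are scheduled in groups of this size


def max_block_size(regs_per_thread: int) -> int:
    """Largest warp-multiple block size whose register demand fits in one block."""
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    # Threads that fit if registers were the only constraint.
    by_registers = REGS_PER_BLOCK // regs_per_thread
    limit = min(by_registers, MAX_THREADS_PER_BLOCK)
    # Round down to a whole number of warps.
    return (limit // WARP_SIZE) * WARP_SIZE


if __name__ == "__main__":
    for regs in (64, 96, 128):
        print(f"{regs} regs/thread -> block size {max_block_size(regs)}")
```

Under these assumptions, a kernel using 96 registers per thread would need 96 * 1024 = 98304 registers for a full 1024-thread block, which exceeds the assumed per-block budget, so a smaller block size has to be chosen.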

@google-cla google-cla bot added the cla: yes label Nov 15, 2021
@apaszke apaszke added the kokoro:force-run Trigger for GPU CI label Nov 17, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Trigger for GPU CI label Nov 17, 2021