Use CUDA driver APIs to avoid scheduling too large blocks #690

apaszke · 2021-11-15T12:50:50Z

We sometimes emit kernels that require lots of registers and cannot be
scheduled in blocks of 1024 threads. This change uses a CUDA driver API
to query for a suitable block size. In the future we might want to cache
this number to avoid any driver-related overheads.
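
To illustrate why a register-heavy kernel cannot always run with 1024 threads per block, here is a rough sketch of the arithmetic. It is not the code in this PR, which asks the driver directly (an occupancy query such as `cuOccupancyMaxPotentialBlockSize` is one real option, though whether this PR uses that call is an assumption). The helper names `max_block_size` and `grid_size`, the 65536-registers-per-block budget, and the 1024-thread hardware cap are all illustrative assumptions; the driver also accounts for shared memory and register allocation granularity, which this ignores.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def max_block_size(regs_per_thread, regs_per_block=65536, hw_limit=1024):
    """Estimate the largest launchable block size for a kernel.

    Hypothetical helper: regs_per_block and hw_limit are assumed device
    limits, not values queried from a real device.
    """
    if regs_per_thread <= 0:
        raise ValueError("regs_per_thread must be positive")
    # Threads that fit in the per-block register budget.
    by_registers = regs_per_block // regs_per_thread
    # Never exceed the hardware cap on threads per block.
    capped = min(by_registers, hw_limit)
    # Round down to a whole number of warps.
    return (capped // WARP_SIZE) * WARP_SIZE


def grid_size(num_elements, block_size):
    """Number of blocks needed to cover num_elements (ceiling division)."""
    return -(-num_elements // block_size)


if __name__ == "__main__":
    for regs in (32, 64, 128, 255):
        block = max_block_size(regs)
        print(regs, block, grid_size(1_000_000, block))
```

Under these assumptions, a kernel using 64 registers per thread still fits in a 1024-thread block, while one using 128 registers per thread has to drop to a smaller block and launch more blocks to cover the same work.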

google-cla bot added the cla: yes label Nov 15, 2021

apaszke added the kokoro:force-run (Trigger for GPU CI) label Nov 17, 2021

kokoro-team removed the kokoro:force-run (Trigger for GPU CI) label Nov 17, 2021

apaszke force-pushed the main branch from 46b8727 to 8db43fc May 13, 2022 14:51
