This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

More updates to front page description of blocked/striped arrangements
Former-commit-id: afe927e
dumerrill committed Mar 8, 2013
1 parent f864c1f commit a7d47f8
Showing 1 changed file with 8 additions and 8 deletions.
16 changes: 8 additions & 8 deletions cub/cub.cuh
@@ -366,6 +366,8 @@
* - <b><em>Blocked arrangement</em></b>. The aggregate tile of items is partitioned
* evenly across threads in "blocked" fashion with thread<sub><em>i</em></sub>
* owning the <em>i</em><sup>th</sup> segment of consecutive elements.
* Blocked arrangements are often desirable for algorithmic benefits (where
* long sequences of items can be processed sequentially within each thread).
* </td>
* <td>
* \par
@@ -377,7 +379,10 @@
* \par
* - <b><em>Striped arrangement</em></b>. The aggregate tile of items is partitioned across
* threads in "striped" fashion, i.e., the \p ITEMS_PER_THREAD items owned by
* each thread have logical stride \p BLOCK_THREADS between them.
* each thread have logical stride \p BLOCK_THREADS between them. Striped arrangements
* are often desirable for data movement through global memory (where
* [read/write coalescing](http://docs.nvidia.com/cuda/cuda-c-best-practices-guide/#coalesced-access-global-memory)</a>
* is an important performance consideration).
* </td>
* <td>
* \par
@@ -398,13 +403,8 @@
* facilitates greater ILP for improved throughput and utilization.
*
* \par
* Furthermore, cub::BlockExchange provides operations for converting between blocked
* and striped arrangements. Blocked arrangements are often desirable for
* algorithmic benefits (where long sequences of items can be processed sequentially
* within each thread). Striped arrangements are often desirable for data movement
* through global memory (where
* [read/write coalescing](http://docs.nvidia.com/cuda/cuda-c-best-practices-guide/#coalesced-access-global-memory)</a>
* is an important performance consideration).
* Finally, cub::BlockExchange provides operations for converting between blocked
* and striped arrangements.
*
* \section sec7 (7) Contributors
*
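The blocked and striped arrangements described in the comment text above can be illustrated with a small host-side sketch of the index arithmetic. This is not code from the commit and none of the helper names below are CUB APIs; `blocked_offset`, `striped_offset`, and `owned_offsets` are hypothetical names introduced only to show which tile offsets each thread would own, assuming a tile of `block_threads * items_per_thread` items in linear order.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helpers for illustration only; none of these names are CUB APIs.

// Blocked arrangement: a thread owns a run of `items_per_thread` consecutive
// tile elements, so its items sit at adjacent offsets.
constexpr int blocked_offset(int thread, int item, int items_per_thread) {
    return thread * items_per_thread + item;
}

// Striped arrangement: the items owned by one thread are `block_threads`
// apart, so for a fixed `item` adjacent threads touch adjacent offsets.
constexpr int striped_offset(int thread, int item, int block_threads) {
    return item * block_threads + thread;
}

// Lists the tile offsets owned by `thread` under either arrangement.
std::vector<int> owned_offsets(bool striped, int thread,
                               int block_threads, int items_per_thread) {
    std::vector<int> offsets;
    for (int item = 0; item < items_per_thread; ++item) {
        offsets.push_back(striped
            ? striped_offset(thread, item, block_threads)
            : blocked_offset(thread, item, items_per_thread));
    }
    return offsets;
}

// Compile-time checks for 4 threads with 2 items each.
static_assert(blocked_offset(1, 1, 2) == 3, "blocked: thread 1, item 1");
static_assert(striped_offset(1, 1, 4) == 5, "striped: thread 1, item 1");
```

With 4 threads and 2 items per thread, thread 1 owns offsets 2 and 3 in the blocked arrangement and offsets 1 and 5 in the striped arrangement. The striped mapping is the one in which neighboring threads access neighboring offsets at each step, which is the access pattern the comment associates with coalescing.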
