rocPRIM v0.3.0

Pre-release
VincentSC released this 16 Apr 09:16

Milestone 3

All functions needed for Caffe2 and TensorFlow 1.3 are now implemented. Only selected functions have been optimized so far; the remaining optimizations should arrive with milestone 4.

Done in milestones 1 and 2:

  • Scan, reduce and sort algorithms (warp, block, device)
  • Block and thread I/O primitives
  • Block data exchange primitives
  • Reduce-by-key, transform (device)
  • Discontinuity algorithm (block)

Added in this milestone:

  • Fancy iterators
  • Segmented reduction, scan and sort (device)
  • Select (copy if) and unique operations (device)
  • Histogram algorithm (block, device)
  • Run length encode algorithm (device)
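
For readers unfamiliar with two of the operations listed above, here is a minimal sequential sketch of what run length encoding and a segmented reduction (here a sum) compute. It is plain host-side C++ and does not use or reproduce the rocPRIM API: the function names `run_length_encode_ref` and `segmented_sum_ref` are hypothetical, and describing segments by an offsets array with one more entry than there are segments is an assumption made for this illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical reference helper (not a rocPRIM function).
// Collapses each run of equal consecutive values into one (value, count) pair.
template <typename T>
std::vector<std::pair<T, std::size_t>> run_length_encode_ref(const std::vector<T>& input)
{
    std::vector<std::pair<T, std::size_t>> runs;
    for (const T& value : input)
    {
        if (!runs.empty() && runs.back().first == value)
            ++runs.back().second;                    // extend the current run
        else
            runs.emplace_back(value, std::size_t{1}); // start a new run
    }
    return runs;
}

// Hypothetical reference helper (not a rocPRIM function).
// Assumed convention: segment s covers input[offsets[s]] up to, but not
// including, input[offsets[s + 1]]. An empty segment yields T().
template <typename T>
std::vector<T> segmented_sum_ref(const std::vector<T>& input,
                                 const std::vector<std::size_t>& offsets)
{
    std::vector<T> sums;
    for (std::size_t s = 0; s + 1 < offsets.size(); ++s)
    {
        T sum = T();
        for (std::size_t i = offsets[s]; i < offsets[s + 1]; ++i)
            sum += input[i];
        sums.push_back(sum);
    }
    return sums;
}
```

With the input `{1, 1, 2, 3, 3, 3}`, the first helper produces the pairs (1, 2), (2, 1), (3, 3). With the input `{1, 2, 3, 4, 5}` and offsets `{0, 2, 2, 5}`, the second produces `{3, 0, 12}`; the middle segment is empty. The device-level versions perform the same computation in parallel on the GPU.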

Not yet finished:

  • Partition algorithm (device)
  • Comparison sort (warp, block, device), merge (device)