
Releases: Tencent/PatrickStar

v0.4.6

23 Dec 11:38
b1265ee

Evaluated on 8 nodes of SuperPod. Fixed bugs in the multi-GPU memory tracer.

v0.4.5

13 Dec 06:51
d2a5e1d

Refactored the example files and added chunk size searching.

v0.4.4

08 Dec 03:19
f5fee95

PatrickStar has been successfully evaluated on a multi-node system.
The benchmark scripts are integrated with memory-centric tiling borrowed from DeepSpeed.
It trains an 18B model on WeChat Yard.

v0.4.3

27 Nov 13:10
b4e755f

PatrickStar is evaluated on an 8xA100 SuperNode.

  1. Fix async copy bug in chunk move.
  2. Add memory allocation cache.
  3. Add memory-saving communication.

v0.4.2

24 Nov 07:24

Refactored memory tracer.

v0.3.0

08 Nov 02:58

The initial open source version 🎉🎉🎉

v0.1.0

10 May 08:04
Pre-release

Single-node, single-GPU version. Uses eager mode for chunk schema scheduling. Performance is poor because of the large CPU-GPU data movement overhead.