Releases · Dao-AILab/flash-attention
v2.2.5
Bump to v2.2.5
v2.2.4.post1
Re-enable compilation for Hopper
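A minimal sketch of what re-enabling a Hopper build target can look like in a PyTorch `CUDAExtension` build, assuming a setup.py-style script; the module name and source files below are hypothetical and the project's actual build script may differ.

```python
# Sketch only (hypothetical names/sources): add a Hopper (sm_90) target
# alongside Ampere (sm_80) in the nvcc flags of a CUDAExtension build.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

nvcc_flags = [
    "-O3",
    "-gencode", "arch=compute_80,code=sm_80",  # Ampere
    # Re-enabling Hopper compilation means emitting sm_90 code again.
    "-gencode", "arch=compute_90,code=sm_90",
]

setup(
    name="flash_attn_sketch",  # hypothetical package name
    ext_modules=[
        CUDAExtension(
            name="flash_attn_sketch_cuda",
            sources=["flash_attn_sketch.cpp", "flash_attn_sketch_kernel.cu"],
            extra_compile_args={"cxx": ["-O3"], "nvcc": nvcc_flags},
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```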
v2.2.4
Bump to v2.2.4
v2.2.3.post2
Don't compile for PyTorch 2.1 on CUDA 12.1 due to nvcc segfaults
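A minimal sketch of how a build script might guard against that specific toolchain combination; the helper name is hypothetical and the project's actual logic may differ.

```python
# Sketch only: skip building the CUDA extension when the installed
# PyTorch 2.1 is paired with CUDA 12.1, where nvcc was observed to segfault.
import torch
from packaging.version import parse


def should_build_cuda_ext() -> bool:
    torch_version = parse(torch.__version__)          # e.g. "2.1.0+cu121"
    cuda_str = torch.version.cuda                     # e.g. "12.1", None on CPU-only builds
    if cuda_str is None:
        return False
    cuda_version = parse(cuda_str)
    bad_combo = (
        torch_version.major == 2 and torch_version.minor == 1
        and cuda_version.major == 12 and cuda_version.minor == 1
    )
    return not bad_combo
```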
v2.2.3.post1
Set block size to 64 x 64 for the KV-cache kernel to avoid nvcc segfaults
v2.2.3
Bump to v2.2.3
v2.2.2
Bump to v2.2.2
v2.2.1
Bump to v2.2.1
v2.2.0
Bump to v2.2.0
v2.1.2.post3
Set single-threaded compilation for CUDA 12.2 so CI doesn't OOM
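A minimal sketch of one way to do this, assuming the build passes nvcc's real `--threads` flag through the extension's compile arguments; the function name is hypothetical and the project's CI may instead limit parallelism elsewhere (e.g. via `MAX_JOBS`).

```python
# Sketch only: choose nvcc's --threads value based on the CUDA version,
# dropping to a single compilation thread on CUDA 12.2 so memory-constrained
# CI machines don't run out of memory.
import torch


def nvcc_thread_flags() -> list[str]:
    cuda = torch.version.cuda  # e.g. "12.2"; None for CPU-only PyTorch
    if cuda is None:
        return []
    if cuda.startswith("12.2"):
        return ["--threads", "1"]  # single-threaded nvcc to cap peak memory
    return ["--threads", "4"]
```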