
Commit c555642: Bump to v2.7.0
Parent: 6ffeb57

File tree

2 files changed (+5, -1 lines)

README.md

Lines changed: 4 additions & 0 deletions
@@ -373,6 +373,10 @@ Thanks to @beginlner for this contribution.
 Support attention with softcapping, as used in Gemma-2 and Grok models.
 Thanks to @Narsil and @lucidrains for this contribution.
 
+### 2.7: Compatibility with torch compile
+
+Thanks to @ani300 for this contribution.
+
 ## Performance
 
 We present expected speedup (combined forward + backward pass) and memory savings from using FlashAttention against PyTorch standard attention, depending on sequence length, on different GPUs (speedup depends on memory bandwidth - we see more speedup on slower GPU memory).
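As a rough illustration of the torch compile compatibility noted in the new README entry (not part of this commit), the sketch below wraps flash_attn_func in a torch.compile'd function. The tensor shapes, dtype, and causal flag are illustrative assumptions; a CUDA GPU with fp16/bf16 support is required.

```python
# Minimal sketch (illustrative, not from this commit): calling FlashAttention
# inside a torch.compile'd function. Assumes a CUDA GPU and fp16 inputs with
# shape (batch, seqlen, nheads, headdim), which is what flash_attn_func expects.
import torch
from flash_attn import flash_attn_func

@torch.compile
def attention(q, k, v):
    # Causal self-attention via the FlashAttention kernel
    return flash_attn_func(q, k, v, causal=True)

q, k, v = (torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
           for _ in range(3))
out = attention(q, k, v)  # same shape as q: (2, 1024, 8, 64)
```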

flash_attn/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-__version__ = "2.6.3"
+__version__ = "2.7.0"
 
 from flash_attn.flash_attn_interface import (
     flash_attn_func,
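A quick way to confirm the bump after installing this release (a hypothetical check, not part of the commit) is to read the version string back from the package:

```python
# Hypothetical check, not part of this commit: confirm the installed
# flash-attn package reports the bumped version string.
import flash_attn

print(flash_attn.__version__)  # expected to print "2.7.0" after this release
```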

0 commit comments
