
Commit c555642: Bump to v2.7.0
Parent: 6ffeb57

File tree

2 files changed (+5, -1 lines)

README.md

Lines changed: 4 additions & 0 deletions
@@ -373,6 +373,10 @@ Thanks to @beginlner for this contribution.
 Support attention with softcapping, as used in Gemma-2 and Grok models.
 Thanks to @Narsil and @lucidrains for this contribution.
 
+### 2.7: Compatibility with torch compile
+
+Thanks to @ani300 for this contribution.
+
 ## Performance
 
 We present expected speedup (combined forward + backward pass) and memory savings from using FlashAttention against PyTorch standard attention, depending on sequence length, on different GPUs (speedup depends on memory bandwidth - we see more speedup on slower GPU memory).
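As a rough illustration of the torch compile compatibility noted in the new README entry (not part of this commit), the sketch below wraps flash_attn_func in a torch.compile'd function. The tensor shapes, dtype, and causal flag are illustrative assumptions; a CUDA GPU with fp16/bf16 support is required.

```python
# Minimal sketch (illustrative, not from this commit): calling FlashAttention
# inside a torch.compile'd function. Assumes a CUDA GPU and fp16 inputs with
# shape (batch, seqlen, nheads, headdim), which is what flash_attn_func expects.
import torch
from flash_attn import flash_attn_func

@torch.compile
def attention(q, k, v):
    # Causal self-attention via the FlashAttention kernel
    return flash_attn_func(q, k, v, causal=True)

q, k, v = (torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
           for _ in range(3))
out = attention(q, k, v)  # same shape as q: (2, 1024, 8, 64)
```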

flash_attn/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-__version__ = "2.6.3"
+__version__ = "2.7.0"
 
 from flash_attn.flash_attn_interface import (
     flash_attn_func,
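A quick way to confirm the bump after installing this release (a hypothetical check, not part of the commit) is to read the version string back from the package:

```python
# Hypothetical check, not part of this commit: confirm the installed
# flash-attn package reports the bumped version string.
import flash_attn

print(flash_attn.__version__)  # expected to print "2.7.0" after this release
```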

0 commit comments
