
flash attention error on instruction tune llama-2 tutorial on Sagemaker notebook #40

Open
matthewchung74 opened this issue Oct 25, 2023 · 2 comments


@matthewchung74

Thank you for the excellent blogs!

When running https://github.com/philschmid/deep-learning-pytorch-huggingface/blob/main/training/instruction-tune-llama-2-int4.ipynb

I am trying to enable flash attention in a SageMaker notebook on an ml.g5.2xlarge instance, and nvidia-smi tells me I am on CUDA Version 12.0, but

os.environ["MAX_JOBS"] = "4"
!pip install flash-attn --no-build-isolation

gives this error:

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [12 lines of output]
      fatal: not a git repository (or any of the parent directories): .git
      
      
      torch.__version__  = 2.1.0+cu121
      
      
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-1b2ql47d/flash-attn_2180596c15514b7d9e4d004796412440/setup.py", line 117, in <module>
          raise RuntimeError(
      RuntimeError: FlashAttention is only supported on CUDA 11.6 and above.  Note: make sure nvcc has a supported version by running nvcc -V.
      [end of output]

Is this something you've seen?

@philschmid
Owner

philschmid commented Oct 25, 2023

I am not sure if CUDA 12.0 is supported yet; that's what the error says as well.

@matthewchung74
Author

I do see that, but when I run this:

(base) [ec2-user@ip-172-16-30-64 notebooks]$ nvidia-smi
Thu Oct 26 00:27:41 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A10G         On   | 00000000:00:1E.0 Off |                    0 |

it looks like CUDA 12.0 is installed.
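One thing worth noting: the version nvidia-smi prints and the version the installer checks come from different places. nvidia-smi reports the highest CUDA version the *driver* supports, while flash-attn's setup.py (per the error's hint to run nvcc -V) checks the CUDA *toolkit* that compiles the extension, and the two can disagree on the same machine. A minimal sketch of that distinction, using hypothetical helper names rather than flash-attn's actual setup code:

```python
# Illustrative only: mimic a ">= 11.6" toolkit-version check like the one
# that raises the RuntimeError above. The hypothetical input would be the
# version parsed from `nvcc -V` output, NOT the one shown by nvidia-smi.

def parse_version(v: str) -> tuple[int, int]:
    """Turn a 'major.minor' string like '12.0' into a comparable tuple."""
    major, minor = v.split(".")[:2]
    return int(major), int(minor)

def toolkit_supported(nvcc_version: str, minimum: str = "11.6") -> bool:
    """Return True if the toolkit version meets the minimum."""
    return parse_version(nvcc_version) >= parse_version(minimum)

# The driver-side "CUDA Version: 12.0" from nvidia-smi passing this check
# says nothing about the toolkit-side version `nvcc -V` would report:
print(toolkit_supported("12.0"))  # True
print(toolkit_supported("11.4"))  # False
```

So even with a 12.0-capable driver, an older (or missing) nvcc on the PATH inside the notebook environment would still trip the check.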
