
Misc. bug: Vulkan\Llama-server.exe (b7064+) hangs during prompt processing if "--flash-attn on" #17297

@alan-l


Name and Version

Vulkan\Llama-server.exe on Windows b7064+
Last good version: b7063

Operating systems

Windows

Which llama.cpp modules do you know to be affected?

llama-server

Command line

--flash-attn on

Problem description & steps to reproduce

b7063 with "--flash-attn on": prompts are processed and results appear in the WebUI.
b7064+ with "--flash-attn on": the server hangs during prompt processing, and Ctrl+C has to be pressed several times to kill the program.
b7064+ with "--flash-attn off": prompt processing proceeds normally.

Vulkan on Intel Iris Xe GPU.

First Bad Commit

b7063 works fine; b7064 through b7079 all show the same freeze.

Relevant log output
