Silent crash when loading model with llama.cpp on certain Qemu virtual CPUs #6712

josbraden · 2025-01-29T19:05:57Z

Describe the bug

The web UI silently crashes with exit code 0 when loading a GGUF model with the llama.cpp loader on virtual machines that use certain QEMU virtual CPUs.

Workaround: switch the VM to a supported CPU type

Non-working tested CPUs:

x86-64-v2-AES (default on Proxmox 8.3.2)
qemu64
kvm64

Working tested CPUs:

host (Intel Core i7-8700 CPU in this case)
x86-64-v3

Mostly just posting this in case someone else runs into it, but it would be good to get a non-silent crash.
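
To check a guest before loading a model, a minimal Python sketch like the one below reads the `flags` line from /proc/cpuinfo and reports which of a chosen set of flags are absent. The set used here is an assumption: it is taken from flags that the working x86-64-v3 guest in this issue reports and the failing x86-64-v2-AES guest does not. The issue does not confirm which of them the llama.cpp build actually needs, and the helper names are hypothetical.

```python
from pathlib import Path

# Assumption: flags that the working x86-64-v3 guest in this issue reports
# and the failing x86-64-v2-AES guest does not. Which ones the llama.cpp
# build really needs is not confirmed here.
SUSPECT_FLAGS = ("avx", "avx2", "fma", "f16c", "bmi1", "bmi2")


def parse_flags(cpuinfo_text: str) -> set[str]:
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Exact match, so the separate 'vmx flags' line is ignored.
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str, wanted=SUSPECT_FLAGS) -> list[str]:
    """List the wanted flags that the CPU does not advertise, in the given order."""
    present = parse_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in present]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():  # Linux only
        absent = missing_flags(cpuinfo.read_text())
        print("missing:", ", ".join(absent) if absent else "none")
```

If the script lists missing flags on a guest that crashes, switching the virtual CPU type as described in the workaround is the first thing to try.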

Is there an existing issue for this?

I have searched the existing issues

Reproduction

1. Install a fresh Ubuntu 24.04.1 VM in Proxmox 8.3.2 with the default CPU architecture (x86-64-v2-AES)
2. Clone the repo and install with start_linux.sh
   - Tested with both Nvidia CUDA 12.4 and CPU only
3. Download tinyllama-1.1b-chat-v1.0.Q4_0.gguf from https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF (tested with multiple models)
4. Load the model with default settings
5. Confirm that the workaround works by switching the virtual CPU architecture to "host" or x86-64-v3

Screenshot

Logs

See screenshot, exit 0
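
Since the log shows nothing, one way to surface the cause could be to run the load step in a child process and inspect how it ended. The sketch below is not part of text-generation-webui, and `run_and_report` is a hypothetical helper. It relies on two documented standard-library behaviors: on POSIX, `subprocess` reports a negative return code when the child was killed by a signal, and `-X faulthandler` makes Python dump a traceback on fatal signals such as SIGILL. Whether the crash here is actually an illegal-instruction fault is an assumption, not something this issue confirms.

```python
import signal
import subprocess
import sys


def run_and_report(code: str) -> str:
    """Run Python code in a child process and describe how it ended."""
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code is the negated signal number.
        name = signal.Signals(-proc.returncode).name
        return f"killed by {name}"
    return f"exited with code {proc.returncode}"


if __name__ == "__main__":
    # Placeholder child that only prints; replace with the real load step.
    print(run_and_report("print('model load goes here')"))
```

A result such as "killed by SIGILL" would point at an unsupported instruction, while a plain exit code would suggest the process is shutting down on its own.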

System Info

Ubuntu 24.04.1 LTS on Proxmox 8.3.2

/proc/cpuinfo:

x86-64-v2-AES (causes crash)


processor	: 0
vendor_id	: GenuineIntel
cpu family	: 15
model		: 107
model name	: QEMU Virtual CPU version 2.5+
stepping	: 1
microcode	: 0x1
cpu MHz		: 3191.998
cache size	: 16384 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc nopl xtopology cpuid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm cpuid_fault pti
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown bhi
bogomips	: 6383.99
clflush size	: 64
cache_alignment	: 128
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 15
model		: 107
model name	: QEMU Virtual CPU version 2.5+
stepping	: 1
microcode	: 0x1
cpu MHz		: 3191.998
cache size	: 16384 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc nopl xtopology cpuid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm cpuid_fault pti
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown bhi
bogomips	: 6383.99
clflush size	: 64
cache_alignment	: 128
address sizes	: 40 bits physical, 48 bits virtual
power management:


Host (working):


processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
stepping	: 10
microcode	: 0xb4
cpu MHz		: 3191.998
cache size	: 16384 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves arat vnmi umip md_clear flush_l1d arch_capabilities
vmx flags	: vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest shadow_vmcs pml
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa srbds mmio_stale_data retbleed gds bhi
bogomips	: 6383.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 158
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
stepping	: 10
microcode	: 0xb4
cpu MHz		: 3191.998
cache size	: 16384 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq vmx ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves arat vnmi umip md_clear flush_l1d arch_capabilities
vmx flags	: vnmi preemption_timer invvpid ept_x_only ept_ad ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest shadow_vmcs pml
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa srbds mmio_stale_data retbleed gds bhi
bogomips	: 6383.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:



x86-64-v3 (working):


processor	: 0
vendor_id	: GenuineIntel
cpu family	: 15
model		: 107
model name	: QEMU Virtual CPU version 2.5+
stepping	: 1
microcode	: 0x1
cpu MHz		: 3191.998
cache size	: 16384 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc nopl xtopology cpuid tsc_known_freq pni ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c hypervisor lahf_lm abm cpuid_fault pti bmi1 avx2 bmi2
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown bhi
bogomips	: 6383.99
clflush size	: 64
cache_alignment	: 128
address sizes	: 40 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 15
model		: 107
model name	: QEMU Virtual CPU version 2.5+
stepping	: 1
microcode	: 0x1
cpu MHz		: 3191.998
cache size	: 16384 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc nopl xtopology cpuid tsc_known_freq pni ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c hypervisor lahf_lm abm cpuid_fault pti bmi1 avx2 bmi2
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown bhi
bogomips	: 6383.99
clflush size	: 64
cache_alignment	: 128
address sizes	: 40 bits physical, 48 bits virtual
power management:
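
The dumps above are easier to compare as sets. In this minimal Python sketch the two `flags` lines are copied verbatim from the x86-64-v2-AES and x86-64-v3 dumps in this post, and the variable names are only illustrative.

```python
# Flags lines copied from the /proc/cpuinfo dumps in this issue.
V2_AES = (
    "fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc nopl xtopology "
    "cpuid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes "
    "hypervisor lahf_lm cpuid_fault pti"
)
V3 = (
    "fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc nopl xtopology "
    "cpuid tsc_known_freq pni ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe "
    "popcnt aes xsave avx f16c hypervisor lahf_lm abm cpuid_fault pti bmi1 "
    "avx2 bmi2"
)

# Set difference in both directions: what v3 adds, and what it drops.
added = sorted(set(V3.split()) - set(V2_AES.split()))
removed = sorted(set(V2_AES.split()) - set(V3.split()))

print("only in x86-64-v3:", " ".join(added))
print("only in x86-64-v2-AES:", " ".join(removed) or "(none)")
```

On these two dumps the working CPU adds abm, avx, avx2, bmi1, bmi2, f16c, fma, movbe and xsave, and the failing CPU reports nothing that the working one lacks. Which of these flags the crash depends on is not established here.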

B0rner · 2025-01-31T12:12:59Z

Exactly the same problem here. I'm running text-generation-webui in a container that was created following the Docker documentation ( https://github.com/oobabooga/text-generation-webui/wiki/09-‐-Docker ). The container runs on a VM provided by Proxmox.

I'm running text-generation-webui 2.3 or 2.4 (not sure how to determine the current version).

I'm using the CUDA 12.4 driver (in the VM) with the requirements.txt provided by this repo, which installed the Python CUDA 12.1 packages.

My cpuinfo looks like this:

processor	: 23
vendor_id	: GenuineIntel
cpu family	: 15
model		: 107
model name	: QEMU Virtual CPU version 2.5+
stepping	: 1
microcode	: 0x1
cpu MHz		: 2900.000
cache size	: 16384 KB
physical id	: 0
siblings	: 24
core id		: 23
cpu cores	: 24
apicid		: 23
initial apicid	: 23
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx lm constant_tsc nopl xtopology cpuid tsc_known_freq pni ssse3 cx16 sse4_1 sse4_2 x2apic popcnt aes hypervisor lahf_lm cpuid_fault pti
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown bhi
bogomips	: 5800.00
clflush size	: 64
cache_alignment	: 128
address sizes	: 40 bits physical, 48 bits virtual
power management:

At first I thought this was because my CPU lacks the AVX2 flag, but AVX2 does not seem to be the point, because @josbraden has the same problem with an AVX2-supported CPU.

josbraden added the bug label Jan 29, 2025
