
Segmentation fault (only) with 13B model. #45

Closed
WhoSayIn opened this issue Mar 18, 2023 · 17 comments

@WhoSayIn

```
~/alpaca# ./chat -m ggml-alpaca-13b-q4.bin
main: seed = 1679150968
llama_model_load: loading model from 'ggml-alpaca-13b-q4.bin' - please wait ...
llama_model_load: ggml ctx size = 10959.49 MB
Segmentation fault
```

I just downloaded the 13B model from the torrent (ggml-alpaca-13b-q4.bin), pulled the latest master, and compiled. It works absolutely fine with the 7B model, but I get the segmentation fault shown above with the 13B model.

Checksum (md5) of the 13B model: 66f3554e700bd06104a4a5753e5f3b5b

I'm running Ubuntu under WSL on Windows.

@barry163

barry163 commented Mar 18, 2023

I have the same result. I also ran it under WSL on Windows: it works with the 7B model but not with the 13B model, and the md5sum is the same. The same thing happens with ggerganov/llama.cpp, from which this project is forked; it gives a more detailed error message:

```
llama_model_load: llama_model_load: tensor 'tok_embeddings.weight' has wrong size in model file
main: failed to load model from 'ggml-alpaca-13b-q4.bin'
```

I did not get the 7B model from the torrent but from the download URL in this repo; did you do the same? Perhaps the 13B model has to be converted to an appropriate format before it can be used in this project?

@WhoSayIn
Author

> I have the same result. I also ran it under WSL on Windows: it works with the 7B model but not with the 13B model, and the md5sum is the same. The same thing happens with ggerganov/llama.cpp, from which this project is forked; it gives a more detailed error message:
>
> llama_model_load: llama_model_load: tensor 'tok_embeddings.weight' has wrong size in model file
> main: failed to load model from 'ggml-alpaca-13b-q4.bin'
>
> I did not get the 7B model from the torrent but from the download URL in this repo; did you do the same? Perhaps the 13B model has to be converted to an appropriate format before it can be used in this project?

Yes, I downloaded the 7B file from the direct link in the readme and used the magnet link in the readme for the 13B file. I haven't seen anything about converting the downloaded 13B file.

@PriNova

PriNova commented Mar 18, 2023

It has nothing to do with conversion. main.cpp thinks this is a multi-part file: the 13B model is usually split into two files, but here we have only one file.
In main.cpp of the llama.cpp upstream I changed (hacked) line 130 to `n_parts = 1; // LLAMA_N_PARTS.at(hparams.n_embd);`, which let me load the model.
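
For reference, a minimal sketch of what that lookup does and where the hack bites. This is not the real loader; the part counts follow upstream's LLAMA_N_PARTS table as I remember it, so treat the values as an assumption if your checkout differs:

```cpp
#include <cstdio>
#include <map>

// Upstream llama.cpp picks the number of model files from the embedding
// size; the 13B model (n_embd = 5120) is normally shipped as 2 parts.
static const std::map<int, int> LLAMA_N_PARTS = {
    {4096, 1}, // 7B
    {5120, 2}, // 13B
    {6656, 4}, // 30B
    {8192, 8}, // 65B
};

int main() {
    int n_embd  = 5120;                      // 13B model
    int n_parts = LLAMA_N_PARTS.at(n_embd);  // upstream default: 2 parts
    n_parts = 1; // the hack: the torrent ships one pre-merged file
    std::printf("n_embd=%d -> n_parts=%d\n", n_embd, n_parts);
    return 0;
}
```

With the default of 2 parts, the loader expects each tensor to hold only half the data, which is consistent with the "tensor 'tok_embeddings.weight' has wrong size in model file" error above.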

@antimatter15
Owner

Make sure to compile it again using the latest version of the source code.

@WhoSayIn
Author

@PriNova I tried that, changed the equivalent line in chat.cpp, and compiled again; unfortunately it didn't help :(

@antimatter15 I compiled the latest master; unfortunately that didn't help either :(

@WhoSayIn
Author

Well, just by doing some basic printf debugging, I can see the segfault is happening at this line: https://github.com/antimatter15/alpaca.cpp/blob/b64ca1c07cb4ff0637f48d85178b7a99ffd09d20/chat.cpp#LL254C22-L254C22

```cpp
model.tok_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, n_vocab);
```

No idea how to proceed from here though :(
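
If the context buffer allocation inside ggml fails silently, the first tensor write lands in a NULL buffer, and the crash would surface at exactly this line. A minimal sketch of a guard (hypothetical helper name; assumes the ggml_init / ggml_new_tensor_2d signatures from this repo's ggml.h):

```cpp
#include <cstdio>
#include <cstdlib>
#include "ggml.h" // from this repo

// Sketch: allocate the context buffer ourselves and check it, so an
// out-of-memory condition fails loudly instead of segfaulting later.
static bool load_embeddings_checked(size_t ctx_size, int n_embd, int n_vocab) {
    void * buf = std::malloc(ctx_size); // ~10959 MB for 13B, per the log above
    if (buf == NULL) {
        std::fprintf(stderr, "failed to allocate %zu bytes for ggml ctx\n", ctx_size);
        return false;
    }
    struct ggml_init_params params = {
        /*.mem_size   =*/ ctx_size,
        /*.mem_buffer =*/ buf,
    };
    struct ggml_context * ctx = ggml_init(params);
    if (ctx == NULL) {
        std::fprintf(stderr, "ggml_init() failed\n");
        std::free(buf);
        return false;
    }
    // Only now is the tensor creation safe:
    struct ggml_tensor * tok_embeddings =
        ggml_new_tensor_2d(ctx, GGML_TYPE_Q4_0, n_embd, n_vocab);
    return tok_embeddings != NULL;
}
```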

@kaz9112

kaz9112 commented Mar 19, 2023

> Well, just by doing some basic printf debugging, I can see the segfault is happening at this line: https://github.com/antimatter15/alpaca.cpp/blob/b64ca1c07cb4ff0637f48d85178b7a99ffd09d20/chat.cpp#LL254C22-L254C22
>
> model.tok_embeddings = ggml_new_tensor_2d(ctx, wtype, n_embd, n_vocab);
>
> No idea how to proceed from here though :(

I used this:

```
./chat -c 1024 -m ggml-alpaca-13b-q4.bin
```

It loaded for me, but it wouldn't reply to anything I asked. Maybe it will work for you.

@progressionnetwork

Try providing a full path to the model, like `./chat -m D:/alpaca/13b/ggml-alpaca-13b-q4.bin`.

@externvoid

In my case, ggml-alpaca-13b-q4.bin works. I referred to this tweet, which includes several corrections to chat.cpp:
https://twitter.com/andy_matuschak/status/1636769182066053120

@JCharante

Changing `n_parts = 1` worked perfectly for me; my md5sum is the same as WhoSayIn's. Keep in mind the latest commit in llama.cpp changed the model format, so I've been running the version of llama.cpp right before that change (`git checkout 5cb63e2493c49bc2c3b9b355696e8dc26cdd0380`).

@james1236

> Changing `n_parts = 1` worked perfectly for me; my md5sum is the same as WhoSayIn's. Keep in mind the latest commit in llama.cpp changed the model format, so I've been running the version of llama.cpp right before that change (`git checkout 5cb63e2493c49bc2c3b9b355696e8dc26cdd0380`).

How can you change that line if there isn't a main.cpp file in the alpaca.cpp folder?

@PriNova

PriNova commented Mar 21, 2023

> > Changing `n_parts = 1` worked perfectly for me; my md5sum is the same as WhoSayIn's. Keep in mind the latest commit in llama.cpp changed the model format, so I've been running the version of llama.cpp right before that change (`git checkout 5cb63e2493c49bc2c3b9b355696e8dc26cdd0380`).
>
> How can you change that line if there isn't a main.cpp file in the alpaca.cpp folder?

It is already fixed in the chat.cpp file, at line 34 in the model parameters.

@eshahrabani

I fixed this issue by troubleshooting on my own machine!

I had the same issue running under WSL. The segmentation fault is due to not having enough RAM. I have 32 GB of RAM and was able to run up to the 13B model, but not the 30B, under WSL. I then tried building and running it under Windows instead of WSL, and it worked. It seems WSL cannot use the page file properly, at least for this project. 30B is slow but works for me now!

@PriNova

PriNova commented Mar 22, 2023

> I fixed this issue by troubleshooting on my own machine!
>
> I had the same issue running under WSL. The segmentation fault is due to not having enough RAM. I have 32 GB of RAM and was able to run up to the 13B model, but not the 30B, under WSL. I then tried building and running it under Windows instead of WSL, and it worked. It seems WSL cannot use the page file properly, at least for this project. 30B is slow but works for me now!

You can increase your page file globally or locally; I increased mine to 32 GB. Here are the steps:

https://learn.microsoft.com/en-us/windows/wsl/wsl-config#configuration-setting-for-wslconfig
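
For example, a sketch of `%UserProfile%\.wslconfig`; `memory` and `swap` are the documented [wsl2] keys, and the sizes are just examples (swap is the 32 GB I set):

```
[wsl2]
# cap on RAM assigned to the WSL2 VM (example value)
memory=16GB
# size of the VM's swap / page file
swap=32GB
```

After editing the file, restart WSL (`wsl --shutdown`) for the settings to take effect.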

@eshahrabani

> > I fixed this issue by troubleshooting on my own machine!
> >
> > I had the same issue running under WSL. The segmentation fault is due to not having enough RAM. I have 32 GB of RAM and was able to run up to the 13B model, but not the 30B, under WSL. I then tried building and running it under Windows instead of WSL, and it worked. It seems WSL cannot use the page file properly, at least for this project. 30B is slow but works for me now!
>
> You can increase your page file globally or locally; I increased mine to 32 GB. Here are the steps:
>
> https://learn.microsoft.com/en-us/windows/wsl/wsl-config#configuration-setting-for-wslconfig

Good to know you can do it for WSL! I found it easier to just do it on Windows and compile/run it outside of WSL.

@ahmetkca

This should add better error handling for when the memory buffer allocation fails. Currently the memory allocation is assumed to always succeed, which leads to unexpected errors like these segmentation faults. This PR should at least add better error handling and an error message: #142

@sachinspanicker

> ~/alpaca# ./chat -m ggml-alpaca-13b-q4.bin
> main: seed = 1679150968
> llama_model_load: loading model from 'ggml-alpaca-13b-q4.bin' - please wait ...
> llama_model_load: ggml ctx size = 10959.49 MB
> Segmentation fault
>
> I just downloaded the 13B model from the torrent (ggml-alpaca-13b-q4.bin), pulled the latest master, and compiled. It works absolutely fine with the 7B model, but I get the segmentation fault shown above with the 13B model.
>
> Checksum (md5) of the 13B model: 66f3554e700bd06104a4a5753e5f3b5b
>
> I'm running Ubuntu under WSL on Windows.

Hi, where did you get the ggml-alpaca-13b-q4.bin file from? I can't seem to find it anywhere to download.
