Support for Llama-3_1-Nemotron-51B #10669
Conversation
I wonder if it's also a better idea not to group this with the normal llama archs; since it requires so many changes, it may be better to make it its own model type?
I think src/llama.cpp doesn't change that much, but convert_hf_to_gguf.py does have more changes. Anyway, I can make another fork to make it a separate model type and submit another pull request. What do you think about the vocab.py problem? Should I just leave the original vocab.py as is and ask people to fix tokenizer_config.json instead?
Created a separate Deci model. This version doesn't change vocab.py and relies on people manually fixing the 51B model's tokenizer_config.json.
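(For context: making it its own model type on the C++ side usually comes down to adding a new entry to the architecture enum and its name table in src/llama.cpp. The sketch below is illustrative only; LLM_ARCH_DECI and "deci" are assumed names for the example, not necessarily what this PR uses.)

```cpp
// Illustrative sketch, not taken from this PR: a separate model type is
// typically registered by extending the architecture enum and its name table
// in src/llama.cpp. LLM_ARCH_DECI and "deci" are assumed names for the example.
#include <map>

enum llm_arch {
    LLM_ARCH_LLAMA,
    LLM_ARCH_DECI,   // hypothetical separate arch for Llama-3_1-Nemotron-51B
    LLM_ARCH_UNKNOWN,
};

static const std::map<llm_arch, const char *> LLM_ARCH_NAMES = {
    { LLM_ARCH_LLAMA, "llama" },
    { LLM_ARCH_DECI,  "deci"  },
};
```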
Any updates?
src/llama.cpp
Outdated
if (n_head == 0) // attention-free layer of Llama-3_1-Nemotron-51B
    cur = inpL;
else {
Suggested change:
if (n_head == 0) { // attention-free layer of Llama-3_1-Nemotron-51B
    cur = inpL;
} else {
src/llama.cpp
Outdated
} else if (n_head > 0)
// self-attention
{
Suggested change:
} else if (n_head > 0) {
    // self-attention
To fix the editorconfig / flake8 checks, you need to modify your source code to remove trailing spaces and add the missing newline at the end of the file. And to fix the server CI, you need to merge the latest commits from the master branch.
Can someone approve the workflows? |
Yay! Finally passed all checks! :) |
src/llama.cpp
Outdated
    cur = llm_build_lora_mm(lctx, ctx0, model.layers[il].wo, cur);
    cb(cur, "wo", il);
} else if (n_head > 0) {
    // self-attention
Suggested change:
// self-attention
src/llama.cpp
Outdated
const int64_t n_head_kv = hparams.n_head_kv(il);
const int64_t n_head    = hparams.n_head(il);

if (n_head == 0) { // attention-free layer of Llama-3_1-Nemotron-51B
Suggested change:
if (n_head == 0) {
    // attention-free layer of Llama-3_1-Nemotron-51B
src/llama.cpp
Outdated
cb(cur, "attn_norm", il); | ||
} | ||
|
||
if (n_head > 0 && n_head_kv == 0) { // "linear attention" of Llama-3_1-Nemotron-51B |
Suggested change:
if (n_head > 0 && n_head_kv == 0) {
    // "linear attention" of Llama-3_1-Nemotron-51B
More details are here:
#10648
It seems like my changes in vocab.py don't really break the CI tests.
It might be a better idea not to modify vocab.py but instead ask the user to fix tokenizer_config.json. In that case, you can ignore the changes I made in vocab.py.