docs: add model architecture section to mamba.md
danbev committed Aug 12, 2024 · commit ab56485 (1 parent 9594959)

notes/mamba.md: 57 additions, 0 deletions

### Model Architecture
Let's take a look at a Mamba model by inspecting a GGUF file:
```console
$ ./inspect-model.sh models/mamba-2.8b-q3_k_m.gguf
INFO:gguf-dump:* Loading: models/mamba-2.8b-q3_k_m.gguf
* File is LITTLE endian, script is running on a LITTLE endian host.
* Dumping 25 key/value pair(s)
1: UINT32 | 1 | GGUF.version = 3
2: UINT64 | 1 | GGUF.tensor_count = 642
3: UINT64 | 1 | GGUF.kv_count = 22
4: STRING | 1 | general.architecture = 'mamba'
5: STRING | 1 | general.name = 'mamba-2.8b-hf'
6: UINT32 | 1 | mamba.context_length = 1048576
7: UINT32 | 1 | mamba.embedding_length = 2560
8: UINT32 | 1 | mamba.feed_forward_length = 0
9: UINT32 | 1 | mamba.attention.head_count = 0
10: UINT32 | 1 | mamba.block_count = 64
11: UINT32 | 1 | mamba.ssm.conv_kernel = 4
12: UINT32 | 1 | mamba.ssm.inner_size = 5120
13: UINT32 | 1 | mamba.ssm.state_size = 16
14: UINT32 | 1 | mamba.ssm.time_step_rank = 160
15: FLOAT32 | 1 | mamba.attention.layer_norm_rms_epsilon = 9.999999747378752e-06
16: UINT32 | 1 | general.file_type = 12
17: STRING | 1 | tokenizer.ggml.model = 'gpt2'
18: [STRING] | 50280 | tokenizer.ggml.tokens
19: [INT32] | 50280 | tokenizer.ggml.token_type
20: [STRING] | 50009 | tokenizer.ggml.merges
21: UINT32 | 1 | tokenizer.ggml.bos_token_id = 0
22: UINT32 | 1 | tokenizer.ggml.eos_token_id = 0
23: UINT32 | 1 | tokenizer.ggml.unknown_token_id = 0
24: UINT32 | 1 | tokenizer.ggml.padding_token_id = 0
25: UINT32 | 1 | general.quantization_version = 2
* Dumping 642 tensor(s)
1: 128716800 | 2560, 50280, 1, 1 | Q6_K | token_embd.weight
2: 81920 | 16, 5120, 1, 1 | F32 | blk.0.ssm_a
3: 5120 | 5120, 1, 1, 1 | F32 | blk.0.ssm_d
4: 5120 | 5120, 1, 1, 1 | F32 | blk.0.ssm_conv1d.bias
5: 20480 | 4, 5120, 1, 1 | F32 | blk.0.ssm_conv1d.weight
6: 5120 | 5120, 1, 1, 1 | F32 | blk.0.ssm_dt.bias
7: 819200 | 160, 5120, 1, 1 | F32 | blk.0.ssm_dt.weight
8: 26214400 | 2560, 10240, 1, 1 | Q3_K | blk.0.ssm_in.weight
9: 13107200 | 5120, 2560, 1, 1 | Q3_K | blk.0.ssm_out.weight
10: 983040 | 5120, 192, 1, 1 | F32 | blk.0.ssm_x.weight
11: 2560 | 2560, 1, 1, 1 | F32 | blk.0.attn_norm.weight
12: 81920 | 16, 5120, 1, 1 | F32 | blk.1.ssm_a
13: 5120 | 5120, 1, 1, 1 | F32 | blk.1.ssm_d
14: 5120 | 5120, 1, 1, 1 | F32 | blk.1.ssm_conv1d.bias
15: 20480 | 4, 5120, 1, 1 | F32 | blk.1.ssm_conv1d.weight
16: 5120 | 5120, 1, 1, 1 | F32 | blk.1.ssm_dt.bias
17: 819200 | 160, 5120, 1, 1 | F32 | blk.1.ssm_dt.weight
18: 26214400 | 2560, 10240, 1, 1 | Q3_K | blk.1.ssm_in.weight
19: 13107200 | 5120, 2560, 1, 1 | Q3_K | blk.1.ssm_out.weight
20: 983040 | 5120, 192, 1, 1 | F32 | blk.1.ssm_x.weight
21: 2560 | 2560, 1, 1, 1 | F32 | blk.1.attn_norm.weight
...
```
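
From this dump we can read off the main hyperparameters of the model: the
embedding size is 2560, the inner SSM dimension is 5120 (2 * 2560), the state
size is 16, the convolution kernel size is 4, the time step (Δ) rank is 160
(2560 / 16), and there are 64 blocks. This also matches the tensor shapes
above: `blk.0.ssm_in` projects from 2560 to 10240 (the `x` and `z` branches,
each of size 5120), `blk.0.ssm_x` projects from 5120 down to
192 = 160 + 2 * 16 (Δ, B, and C), `blk.0.ssm_dt` expands Δ from rank 160 back
to 5120, `blk.0.ssm_a` holds a 16-element state row for each of the 5120 inner
dimensions, and `blk.0.ssm_out` projects back to the embedding size 2560. Note
also that `feed_forward_length` and `attention.head_count` are 0, since a
Mamba block has neither a feed-forward network nor attention heads.

The `inspect-model.sh` script above wraps gguf-dump, but the same information
can be read programmatically. Below is a minimal sketch, assuming the `gguf`
Python package from llama.cpp's gguf-py directory is installed
(`pip install gguf`); the key names and the tensor listing come straight from
the dump above:
```python
from gguf import GGUFReader

reader = GGUFReader('models/mamba-2.8b-q3_k_m.gguf')

# Each metadata entry is a ReaderField; parts[data[0]] holds the raw value(s).
for name in ('mamba.embedding_length', 'mamba.ssm.inner_size',
             'mamba.ssm.state_size', 'mamba.ssm.time_step_rank',
             'mamba.ssm.conv_kernel', 'mamba.block_count'):
    field = reader.fields[name]
    print(f'{name} = {field.parts[field.data[0]][0]}')

# Tensor entries carry the name, shape, and quantization type seen above.
for tensor in reader.tensors[:5]:
    print(tensor.name, list(tensor.shape), tensor.tensor_type.name)
```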
