docs: fix typo in transformers.md
danbev committed Dec 10, 2024
1 parent b725a5e commit ca0a6fd
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion notes/architectures/transformers.md
@@ -183,7 +183,7 @@ Standard attention uses 3 matrices: a query matrix, a key matrix, and a value
matrix.
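
For reference, these three are combined using the scaled dot-product attention
from the original paper:
```
Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
```
where `d_k` is the dimension of the key vectors.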

Let's start with the following input sentence "Dan loves icecream". The first
step it split this into tokens, so we will have might get 4 tokens:
step is to split this into tokens, so we might get 4 tokens:
```
["Dan", "loves", "ice", "cream"]
```
@@ -204,6 +204,10 @@ be used for each occurrence. So there is currently no context or association
between these words/token embeddings. They only contain information about each
word/token itself, and nothing about the context in which it appears.

This mapping can happen using something like `ggml_get_rows`, which takes a
tensor that contains the embeddings for each token in the vocabulary, and an
index tensor which contains the token ids. The index tensor is used to index
into the embeddings.
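
As a rough sketch (the sizes and token ids below are made up, and the exact
graph/compute calls can differ between ggml versions), the lookup could look
something like this:
```c
#include "ggml.h"

int main(void) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    const int n_embd  = 4; // toy embedding dimension
    const int n_vocab = 8; // toy vocabulary size

    // The embedding matrix: one row of n_embd floats per token in the
    // vocabulary. In a real model these values come from the model weights.
    struct ggml_tensor * tok_embd = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, n_embd, n_vocab);

    // The token ids for ["Dan", "loves", "ice", "cream"] (made up ids).
    struct ggml_tensor * tokens = ggml_new_tensor_1d(ctx, GGML_TYPE_I32, 4);
    ((int32_t *) tokens->data)[0] = 1;
    ((int32_t *) tokens->data)[1] = 5;
    ((int32_t *) tokens->data)[2] = 2;
    ((int32_t *) tokens->data)[3] = 7;

    // Select the rows of tok_embd given by the token ids, producing one
    // embedding (row) per input token.
    struct ggml_tensor * embd = ggml_get_rows(ctx, tok_embd, tokens);

    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, embd);
    ggml_graph_compute_with_ctx(ctx, gf, 1);

    ggml_free(ctx);
    return 0;
}
```
The resulting `embd` tensor contains one embedding row per input token, and is
what the positional encoding below gets added to.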

So with these embeddings, the first thing the model does is to add a
positional encoding to each of the embeddings. In the original paper this used
absolute position encoding. I've written about this in
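
For reference, the sinusoidal encoding from the original paper is defined, for
position `pos` and embedding dimension index `i`, as:
```
PE(pos, 2i)   = sin(pos / 10000^(2i/d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
```
and these values are added element-wise to the token embeddings.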
