diff --git a/notes/architectures/transformers.md b/notes/architectures/transformers.md
index cdb0019..9c8d69b 100644
--- a/notes/architectures/transformers.md
+++ b/notes/architectures/transformers.md
@@ -211,7 +211,7 @@ contains the token ids. The index tensor is used to index into the embeddings.
 So with these embeddings the first thing in the model does is to add a
 positional encoding to each of the embeddings. In the original paper this used
 absolute position encoding. I've written about this is
-[embeddings.md](./embeddings.md).
+[embeddings.md](../position-embeddings/embeddings.md).
 
 So we have our input matrix which in our case is a 4x512 matrix, where each
 entry is one of the tokens in the input sentence. Notice that we in this case