Transformer from Scratch with PyTorch

This repository contains a notebook-first, from-scratch implementation of the Transformer architecture in PyTorch, based on the paper "Attention Is All You Need" (Vaswani et al., 2017).

Paper: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf

Notebook coverage

The notebook (transformer_from_scratch_with_PyTorch.ipynb) walks through:

Input embedding and positional embedding
Layer normalization and feed-forward blocks
Multi-head attention
Residual connections
Encoder and decoder blocks
Projection layer and full Transformer assembly
Dataset preparation
Training loop setup
Inference
Attention visualization
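
As a rough illustration of the first item, the sketch below builds the fixed sinusoidal position table described in the paper, using only the Python standard library. It assumes the notebook follows the paper's sinusoidal scheme rather than a learned embedding; the function name `sinusoidal_positional_encoding` is hypothetical and is not taken from the notebook.

```python
import math


def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> list[list[float]]:
    """Return a seq_len x d_model table of sinusoidal position values.

    Hypothetical helper for illustration only. Follows the paper's formulas:
        PE(pos, 2k)   = sin(pos / 10000^(2k / d_model))
        PE(pos, 2k+1) = cos(pos / 10000^(2k / d_model))
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even for paired sin/cos columns")

    table = []
    for pos in range(seq_len):
        row = []
        # i walks over the even column indices 0, 2, 4, ... (i.e. i = 2k).
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row.append(math.sin(angle))  # even column
            row.append(math.cos(angle))  # odd column
        table.append(row)
    return table


# Small table: 4 positions, model width 6.
pe = sinusoidal_positional_encoding(seq_len=4, d_model=6)
```

In a PyTorch model, a table like this would typically be converted to a tensor once and added to the token embeddings before the first encoder or decoder block.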

Dependencies used in the notebook

PyTorch
Hugging Face datasets
tokenizers
NumPy
Pandas
Altair
TensorBoard (torch.utils.tensorboard)
tqdm
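
Before running the notebook locally, it can help to confirm that these packages are importable. The snippet below is a minimal sketch using only the standard library. The module names are the usual import names for the packages listed above, and the helper `missing_modules` is hypothetical, not part of the notebook.

```python
from importlib.util import find_spec

# Usual top-level import names for the dependencies listed above.
REQUIRED_MODULES = [
    "torch",        # PyTorch
    "datasets",     # Hugging Face datasets
    "tokenizers",
    "numpy",
    "pandas",
    "altair",
    "tensorboard",  # backend needed by torch.utils.tensorboard
    "tqdm",
]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```

Any names reported as missing can then be installed with the package manager of your choice before executing the notebook.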

How to run

Open the notebook in Google Colab using the badge above, or run it locally in Jupyter.
Execute cells in order, from model components through training and inference.

Project status

🚧 Work in progress / educational implementation.

