Small-scale LLM for autocompletion on an open text corpus (currently Tiny Shakespeare, 1.1M tokens)
Change max_length in the generate function to set how many tokens to generate (the context fed to the model is always cropped to block_length regardless), and adjust nb_iter if you want to train.
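For reference, a minimal sketch of what such a generation loop typically looks like in PyTorch; the model returning a (logits, loss) pair, the argument names, and the plain multinomial sampling are assumptions, not taken from this repo:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_length, block_length):
    # idx is a (batch, time) tensor of token ids used as the prompt.
    for _ in range(max_length):
        # Crop the conditioning context to the last block_length tokens,
        # since the model only ever saw windows of that size during training.
        idx_cond = idx[:, -block_length:]
        logits, _ = model(idx_cond)            # assumed to return (logits, loss)
        logits = logits[:, -1, :]              # keep only the last time step
        probs = F.softmax(logits, dim=-1)
        idx_next = torch.multinomial(probs, num_samples=1)  # sample one token
        idx = torch.cat((idx, idx_next), dim=1)             # append and continue
    return idx
```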
Character-level tokenizer:
Training Loss: 1.33
Validation Loss: 1.55
Byte-Pair Encoding tokenizer:
Training Loss: 1.56
Validation Loss: 3.34
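As a point of comparison between the two tokenizers, a character-level tokenizer for Tiny Shakespeare can be as small as the sketch below (the file name input.txt and the variable names are illustrative assumptions, not the repo's):

```python
# Build the vocabulary from the unique characters in the corpus.
with open('input.txt', 'r', encoding='utf-8') as f:
    text = f.read()

chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}   # character -> integer id
itos = {i: ch for ch, i in stoi.items()}       # integer id -> character

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: ''.join(itos[i] for i in ids)

print(len(chars))                 # vocabulary size (65 unique characters for Tiny Shakespeare)
print(decode(encode("hello")))    # round-trips back to "hello"
```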
- Add device support (CUDA: move the model and all batches to the GPU); see the training-loop sketch after this list
- Add weight decay (covered in the same sketch)
- Add Multi-head Latent Attention (MLA); a simplified sketch follows the list
- Gradient accumulation (covered in the training-loop sketch)
- Gradient clipping (covered in the training-loop sketch)
- RoPE; a sketch follows the list
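A hedged sketch of how the device support, weight decay, gradient accumulation, and gradient clipping items above usually fit together in a PyTorch training loop; model, get_batch, nb_iter, and the hyperparameter values are assumptions about the surrounding code:

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)  # device support: move all parameters to the GPU when available

# Weight decay: AdamW applies decoupled weight decay to the parameters.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)

accum_steps = 4        # gradient accumulation: emulate a 4x larger batch
max_grad_norm = 1.0    # gradient clipping threshold

for it in range(nb_iter):
    optimizer.zero_grad(set_to_none=True)
    for _ in range(accum_steps):
        xb, yb = get_batch('train')             # assumed helper that returns a batch
        xb, yb = xb.to(device), yb.to(device)   # move the batch to the same device
        logits, loss = model(xb, yb)
        (loss / accum_steps).backward()         # scale so gradients average over micro-batches
    # Clip the global gradient norm before the update to avoid exploding gradients.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
```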
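For the Multi-head Latent Attention item, here is a heavily simplified sketch of the core idea from the DeepSeek papers: keys and values are reconstructed from a shared low-rank latent (which is what a KV cache would store) instead of being projected directly from the hidden states. The class name, its dimensions, and the omission of the decoupled RoPE dimensions are all simplifications, not DeepSeek-V3's exact formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplifiedLatentAttention(nn.Module):
    def __init__(self, d_model, n_heads, latent_dim):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.head_dim = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model, bias=False)
        self.kv_down = nn.Linear(d_model, latent_dim, bias=False)  # compress to the latent
        self.k_up = nn.Linear(latent_dim, d_model, bias=False)     # latent -> keys
        self.v_up = nn.Linear(latent_dim, d_model, bias=False)     # latent -> values
        self.out_proj = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x):
        B, T, C = x.shape
        latent = self.kv_down(x)  # (B, T, latent_dim): the only thing a KV cache would keep
        q = self.q_proj(x).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_up(latent).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        v = self.v_up(latent).view(B, T, self.n_heads, self.head_dim).transpose(1, 2)
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)  # causal attention
        return self.out_proj(y.transpose(1, 2).contiguous().view(B, T, C))
```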
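And for the RoPE item, a sketch of rotary position embeddings applied to the query/key tensors inside attention; the tensor shapes, the base of 10000, and the function names are assumptions:

```python
import torch

def rope_angles(head_dim, seq_len, base=10000.0, device='cpu'):
    # One rotation frequency per pair of channels.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2, device=device).float() / head_dim))
    positions = torch.arange(seq_len, device=device).float()
    angles = torch.outer(positions, inv_freq)          # (seq_len, head_dim // 2)
    return angles.cos(), angles.sin()

def apply_rope(x, cos, sin):
    # x: (batch, n_heads, seq_len, head_dim); rotate each even/odd channel pair
    # by an angle that depends on the token position and the pair index.
    x_even, x_odd = x[..., 0::2], x[..., 1::2]
    rotated = torch.stack(
        (x_even * cos - x_odd * sin, x_even * sin + x_odd * cos), dim=-1
    )
    return rotated.flatten(-2)

# Typical use inside the attention block, before computing attention scores:
# cos, sin = rope_angles(head_dim, T, device=x.device)
# q, k = apply_rope(q, cos, sin), apply_rope(k, cos, sin)
```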
References: Andrej Karpathy, the DeepSeek-V3 Technical Report, the DeepSeek-V3 GitHub repository (and many, many YouTube videos & Medium articles)