Dedale LLM

General Description

Dedale is a large language model (LLM) that combines the Transformer architecture with a Mixture of Experts.

The main idea is to have a large number of distinct Transformer blocks and a router. At each step, the router chooses which block the hidden state passes through next; this is repeated a fixed number of times before a final feed-forward network (FFN) produces the next token. A sketch of this loop is shown below.
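To make the idea concrete, here is a minimal PyTorch sketch of the routing loop. It is not the code from lib.py or model.py: the names (Router, DedaleSketch), the pooled argmax routing, the number of hops, and all dimensions are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn


class Router(nn.Module):
    """Scores each expert block for the current hidden state and picks one."""

    def __init__(self, dim: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pool over the sequence, then score the experts: (batch, num_experts).
        scores = self.gate(x.mean(dim=1))
        # Hard argmax is not differentiable; training would need e.g. soft
        # gating or a straight-through estimator.
        return scores.argmax(dim=-1)


class DedaleSketch(nn.Module):
    """Hypothetical routed mixture of Transformer blocks with an FFN head."""

    def __init__(self, vocab_size: int, dim: int = 64, num_experts: int = 4,
                 num_hops: int = 3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.router = Router(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            for _ in range(num_experts)
        )
        self.num_hops = num_hops
        self.head = nn.Linear(dim, vocab_size)  # final FFN producing logits

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        x = self.embed(tokens)  # (batch, seq, dim)
        for _ in range(self.num_hops):
            # Route the whole batch through one expert per hop (simplified:
            # a real model would route each sequence independently and use a
            # causal attention mask inside the blocks).
            expert_idx = int(self.router(x)[0])
            x = self.experts[expert_idx](x)
        return self.head(x)  # next-token logits: (batch, seq, vocab)


logits = DedaleSketch(vocab_size=100)(torch.randint(0, 100, (2, 8)))
print(logits.shape)  # torch.Size([2, 8, 100])
```

Because the same pool of experts is reused at every hop, the router effectively chooses a path through the blocks rather than a fixed layer stack, which is the distinguishing idea here compared to a standard per-layer Mixture of Experts.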

Warning: The model is not yet trained. I will first train it on very small datasets, and if I can obtain the resources, I will try to train it on a large dataset.

Description of this Code

The model source code is split into the lib.py and model.py files. It uses the PyTorch library.
