Dedale LLM

General Description

Dedale is a large language model (LLM) that combines the Transformer architecture with a Mixture of Experts.

The main idea is to have a large number of distinct Transformer blocks and a router. At each step, the router chooses which block the hidden state passes through next; this is repeated a fixed number of times before a final feed-forward network (FFN) produces the next token. A sketch of this loop is shown below.
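To make the idea concrete, here is a minimal PyTorch sketch of the routing loop. It is not the code from lib.py or model.py: the names (Router, DedaleSketch), the pooled argmax routing, the number of hops, and all dimensions are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn


class Router(nn.Module):
    """Scores each expert block for the current hidden state and picks one."""

    def __init__(self, dim: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pool over the sequence, then score the experts: (batch, num_experts).
        scores = self.gate(x.mean(dim=1))
        # Hard argmax is not differentiable; training would need e.g. soft
        # gating or a straight-through estimator.
        return scores.argmax(dim=-1)


class DedaleSketch(nn.Module):
    """Hypothetical routed mixture of Transformer blocks with an FFN head."""

    def __init__(self, vocab_size: int, dim: int = 64, num_experts: int = 4,
                 num_hops: int = 3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.router = Router(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
            for _ in range(num_experts)
        )
        self.num_hops = num_hops
        self.head = nn.Linear(dim, vocab_size)  # final FFN producing logits

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        x = self.embed(tokens)  # (batch, seq, dim)
        for _ in range(self.num_hops):
            # Route the whole batch through one expert per hop (simplified:
            # a real model would route each sequence independently and use a
            # causal attention mask inside the blocks).
            expert_idx = int(self.router(x)[0])
            x = self.experts[expert_idx](x)
        return self.head(x)  # next-token logits: (batch, seq, vocab)


logits = DedaleSketch(vocab_size=100)(torch.randint(0, 100, (2, 8)))
print(logits.shape)  # torch.Size([2, 8, 100])
```

Because the same pool of experts is reused at every hop, the router effectively chooses a path through the blocks rather than a fixed layer stack, which is the distinguishing idea here compared to a standard per-layer Mixture of Experts.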

Warning: The model is not yet trained. I will first train it on very small datasets, and if I can obtain the resources, I will try to train it on a large dataset.

Description of this Code

The model source code is split into the lib.py and model.py files. It uses the PyTorch library.
