SMCTM: Self-Modifying Continuous Thought Machines

A self-modifying continuous thought machine.

Setup

Requirements can be installed with:

uv sync

Data for the FewShotCIFAR and FewShotMiniImageNet experiments should be set up as follows:

FewShotCIFAR: Download the data from Google Drive:

uv run gdown 12V7qi-AjrYi6OoJdYcN_k502BM_jcP8D -O data/raw_data/
unzip data/raw_data/CIFAR-FS.zip -d data/cifar_fs/

FewShotMiniImageNet: Download the data from Google Drive:

uv run gdown 1GjGMI0q3bgcpcB_CjI40fX54WgLPuTpS -O data/raw_data/
unzip data/raw_data/miniImageNet.zip -d data/miniImageNet/

Running Experiments

Run the training script:

uv run main.py task=<TASK> model=<MODEL>

Parameters:

  • TASK: Choose from Copy, FewShotCIFAR, FewShotMiniImageNet
  • MODEL: Choose from PlasticCTM, CTM, LSTM, PlasticLSTM, HyperLSTM, STPN

Examples:

# Train PlasticCTM on FewShotCIFAR
uv run main.py task=FewShotCIFAR model=PlasticCTM

# Train LSTM on MiniImageNet
uv run main.py task=FewShotMiniImageNet model=LSTM

Development

Multiruns

Multiple runs can be triggered with the -m flag. For example, to train a PlasticCTM and a CTM sequentially on the Copy task, run:

uv run python main.py -m task=Copy model=PlasticCTM,CTM

If multiple GPUs are available, the runs can be launched in parallel with the Slurm launcher:

uv run python main.py -m task=Copy model=PlasticCTM,CTM hydra/launcher=submitit_slurm

For more details on parallel multiruns see conf/hydra/launcher/submitit_slurm.yaml.
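Hydra multiruns expand comma-separated values as a Cartesian product, so tasks and models can be swept in a single command. A minimal sketch, assuming every task/model combination in the sweep is valid:

```shell
# Sweep two tasks and two models in one multirun: Hydra expands
# the Cartesian product, launching 2 x 2 = 4 runs in total.
uv run python main.py -m task=Copy,FewShotCIFAR model=PlasticCTM,CTM
```

Appending hydra/launcher=submitit_slurm to the same command distributes the four runs across Slurm jobs instead of executing them sequentially.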

Hyperparameter Optimization

Hyperparameter optimization can be run by adding -m --config-name search to the command:

uv run python main.py -m --config-name search task=Copy model=CTM
uv run python main.py -m --config-name search task=Copy model=CTM hydra/launcher=submitit_slurm
