steerling

Repository for post-block low-rank steering, described in https://sprocketlab.github.io/posts/2025/11/actvweight/

Environment Setup

The repo ships with setup_env.sh, which provisions a Conda environment named steerling (Python 3.10) and installs all runtime dependencies.

# Create or update the Conda env
bash setup_env.sh

# Activate it before running any scripts
conda activate steerling

The script installs GPU-enabled PyTorch wheels by default. Set TORCH_CUDA_CHANNEL=https://download.pytorch.org/whl/cpu before running it if you are on CPU-only hardware.
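After activating the environment, it can be useful to confirm which PyTorch build was installed. The snippet below is a minimal sanity check and is not part of the repo; the helper name describe_torch_backend is hypothetical. It relies only on torch.cuda.is_available().

```python
def describe_torch_backend() -> str:
    """Report whether the installed PyTorch build can see a CUDA device.

    Hypothetical helper for illustration; not shipped with this repo.
    """
    try:
        import torch
    except ImportError:
        # The environment was not activated, or setup_env.sh has not been run.
        return "torch not installed"
    # True only when a CUDA-enabled build is installed and a GPU is visible.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(describe_torch_backend())
```

If this reports "cpu" on a GPU machine, the CPU wheels were probably installed, or the driver is not visible to PyTorch.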

Training

train.py fine-tunes adapters or fixed vectors on any dataset defined under utils.dataset_dict.

conda activate steerling
python train.py \
  --model-name meta-llama/Meta-Llama-3-8B \
  --dataset-name ListOps \
  --lr 1e-3 \
  --bs 8

The defaults match the method described in the blog post (adapters applied to all tokens). To experiment with variants, pass any of the following:

--no-adapter – use fixed steering vectors (fewer trainable params, less expressive).
--intervene-last – restrict steering to the final token at every generation step; incurs an extra forward pass per generation step.
--submodules – inject adapters into attention/MLP submodules rather than the entire block.

Performance under these alternate configurations is not guaranteed.
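To make the difference between these options concrete, here is a minimal NumPy sketch of the general idea. It is not the code in steerer.py: the function names, the shapes, the h + (h @ down) @ up form, and the zero initialization are all assumptions chosen for illustration.

```python
import numpy as np


def adapter_steer(h, down, up):
    """Low-rank, input-dependent update: h + (h @ down) @ up.

    h: (seq_len, d) block output; down: (d, r); up: (r, d), with r << d.
    Assumed form for illustration, not taken from the repo.
    """
    return h + (h @ down) @ up


def vector_steer(h, v):
    """Fixed steering vector: the same offset v (shape (d,)) for every token."""
    return h + v


def steer_last_only(h, steer_fn):
    """Apply steer_fn to the final token only; earlier positions are untouched."""
    out = h.copy()
    out[-1:] = steer_fn(h[-1:])
    return out


rng = np.random.default_rng(0)
seq_len, d, r = 4, 8, 2
h = rng.normal(size=(seq_len, d))
down = rng.normal(size=(d, r))
up = np.zeros((r, d))  # assumed zero init, so the adapter starts as a no-op
v = rng.normal(size=d)

steered_all = adapter_steer(h, down, up)                        # default-style: all tokens
steered_last = steer_last_only(h, lambda x: vector_steer(x, v))  # last token, fixed vector

adapter_params = down.size + up.size  # 2 * d * r trainable values
vector_params = v.size                # d trainable values
```

In this sketch the adapter has 2 * d * r parameters and its update depends on the activation, while the fixed vector has only d parameters and adds the same offset everywhere, which mirrors the "fewer trainable params, less expressive" trade-off noted for --no-adapter.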

train.py --help lists every option.

Inference / Evaluation

Given a trained checkpoint directory, run:

conda activate steerling
python inference.py \
  --model-name [the model used for training] \
  --dataset-name [your chosen dataset] \
  --ckpt-path [the checkpoint directory saved by train.py] \
  --split test

Outputs are written to outputs/<project>/<split>.json unless --output-file is provided.
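For post-processing, the results file can be located and loaded with the standard library. The sketch below is illustrative only: the helper names are hypothetical, the project argument stands for whatever value the scripts substitute for <project>, and nothing is assumed about the JSON schema.

```python
import json
from pathlib import Path


def default_output_path(project: str, split: str, root: str = "outputs") -> Path:
    """Build the default location outputs/<project>/<split>.json.

    Hypothetical helper; `project` is whatever name the scripts use for <project>.
    """
    return Path(root) / project / f"{split}.json"


def load_results(path):
    """Load the results file as parsed JSON, without assuming its structure."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

When --output-file is passed to inference.py, give that path to load_results directly instead of building the default one.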

