praxis

Praxis is the process by which a theory, lesson, or skill is enacted, embodied, realized, applied, or put into practice.

what we're building

The Praxis swarm is a decentralized, peer-to-peer, always-online, continuously learning AI, with Hivemind integrated directly into the core layers of the model itself. The goal is to build an expert model that is small and simple, easy to parallelize, and performant at a scale of hundreds or thousands of peers. We will do this via a sparse mixture of experts, curated routing, algorithmic switching, and weighted self-modeling of remotely hosted peers.
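
For readers new to sparse mixtures of experts, here is a minimal sketch of top-k routing in plain Python. It is not the Praxis implementation: the names `route_top_k` and `sparse_forward`, the toy experts, and the router scores are all assumptions made for illustration. The point is only that each input is sent to a few experts, and their outputs are blended by renormalized router weights.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def sparse_forward(x, experts, router_scores, k=2):
    """Evaluate only the selected experts and blend their outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Hypothetical experts: stand-ins for local or remotely hosted layers.
experts = [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: -x, lambda x: x ** 2]
scores = [0.1, 2.0, -1.0, 1.5]  # illustrative router scores for one input

selected = route_top_k(scores, k=2)
output = sparse_forward(3.0, experts, scores, k=2)
print(selected, output)
```

In a swarm setting, the unselected experts are never called, which is what keeps the per-request cost roughly constant as more peers join.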

join us

install

From a Linux shell, run these commands:

# Set up a virtual environment
source make-venv.sh

# Install core model dependencies
pip install .

# Install training dependencies
pip install .[all]

contribute to the swarm

To donate your compute:

python run.py

Supported arguments:

python run.py \
  --seed 42 \                    # Set a global seed for random number generation.
  --device cuda:0 \              # Specify your GPU's index. Omit this argument to use CPU.
  --batch_size 8 \               # Set the batch size to use for training.
  --depth 7 \                    # The number of layers to host.
  --dense \                      # Run as a fully-connected (dense) model. (default: False)
  --sparse \                     # Run as a sparse model. (default: True)
  --no_dashboard \               # Disables the CLI interface.
  --no_tokenizer \               # Replace the LLaMA-2 tokenizer with a T-FREE variant.
  --data_path /path/to/my/data \ # Train on a local directory of data.
  --phi \                        # Supplement with expert data.
  --dev \                        # Launch with settings that bootstrap faster (3 layers, a smaller dataset, etc.)
  --reset                        # Delete your checkpoints and start over.

do inference

Send a JSON-encoded payload via POST to:

http://localhost:5000/input

This payload should support all arguments in the Transformers text generation API.

Example request:

import requests

url = "http://localhost:5000/input"
payload = {"prompt": "Once upon a time, ", "do_sample": True, "temperature": 0.7}

response = requests.post(url, json=payload)

print(response.status_code)
print(response.json())

to register with transformers

from transformers import AutoConfig, AutoModel, AutoModelForCausalLM, AutoTokenizer
from praxis import PraxisConfig, PraxisForCausalLM, PraxisModel

AutoConfig.register("praxis", PraxisConfig)
AutoModel.register(PraxisConfig, PraxisModel)
AutoModelForCausalLM.register(PraxisConfig, PraxisForCausalLM)

config = PraxisConfig(
    n_emb=512,
    n_dim=384,
    n_layer=6,
    n_head=8,
    device_map="cuda:0",
)

tokenizer_model = "NousResearch/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(tokenizer_model)

model = AutoModelForCausalLM.from_config(config)

# generate() expects a tensor of token IDs, not a Python list
input_ids = tokenizer.encode("The quick brown fox ", return_tensors="pt")

outputs = model.generate(input_ids, do_sample=True)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# --> The quick brown fox jumped over a lazy dog.

tasks

  • a global swarm
  • leverage self-modeling to focus learning on remote peers
  • experts with a stack of attention/feedforward blocks
  • commit to yourself
  • if an expert is composed of multiple transformer blocks, rather than a single layer, then the network might learn to dynamically route through deeper subnetworks, or it could learn to relay/ensemble information across multiple peers, or it could learn that "no relay is needed" at all, and just return a simple prediction to the requester.
  • treat every peer as an experiment in hyperparameter search; publish results to the DHT, and ensure that better-performing hparams are assigned more often
  • build connectors, allowing people to integrate their nodes with personal data
  • Soft Merging of Experts with Adaptive Routing?
  • Mixture of a Million Experts?
  • Mixture of Depths.
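
The hyperparameter-search task above could be sketched roughly as follows. This is an assumption-laden illustration, not a committed design: a plain dict stands in for results published to the DHT, the function names `assignment_weights` and `assign_hparams` are hypothetical, and a temperature-scaled softmax over validation loss is just one possible way to make better-performing settings get assigned more often.

```python
import math
import random

def assignment_weights(results, temperature=1.0):
    """Map each config's loss to a sampling weight (lower loss, higher weight)."""
    names = list(results)
    best = min(results.values())
    raw = [math.exp(-(results[n] - best) / temperature) for n in names]
    total = sum(raw)
    return {n: r / total for n, r in zip(names, raw)}

def assign_hparams(results, rng, temperature=1.0):
    """Sample one config for a joining peer, favoring better performers."""
    weights = assignment_weights(results, temperature)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Hypothetical results (config -> validation loss), standing in for DHT entries.
published = {"lr=1e-3": 2.9, "lr=3e-4": 2.4, "lr=1e-4": 2.7}

weights = assignment_weights(published, temperature=0.2)
rng = random.Random(0)
draws = [assign_hparams(published, rng, temperature=0.2) for _ in range(1000)]
print(weights)
print({name: draws.count(name) for name in published})
```

A lower temperature concentrates assignments on the current best config, while a higher one keeps exploring weaker settings; every config keeps a nonzero chance of being tried again.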

tbd

  • a proper and robust DHT
  • central and persistent relay peers, to act as global initial_peers
  • a routing algorithm with multi-hop support (ping -> pang -> pong -> ping)
  • helix, octopi, pyramids
  • multi-level experts
  • peer validation (zero knowledge proofs)
  • self-modeling of remote experts

won't do

  • cryptocurrency (donations are appreciated, though!)