- Zurich, Switzerland
- https://nestler.sh/
-
HeavyBall Public
Efficient optimizers
-
kron_torch Public
Forked from evanatyourservice/kron_torch
An implementation of PSGD Kron second-order optimizer for PyTorch
Python · Creative Commons Attribution 4.0 International · Updated Jan 19, 2025
-
adaptive-muon Public
Forked from leloykun/adaptive-muon
A version of @KellerJordan's Muon that adapts to the scale of the gradients
Jupyter Notebook · Updated Jan 2, 2025
-
PufferLib Public
Forked from PufferAI/PufferLib
Simplifying reinforcement learning for complex game environments
-
entropix Public
Forked from xjdr-alt/entropix
Entropy Based Sampling and Parallel CoT Decoding
Python · Apache License 2.0 · Updated Dec 14, 2024
-
fsdp_optimizers Public
Forked from ethansmith2000/fsdp_optimizers
Supporting PyTorch FSDP for optimizers
Python · Apache License 2.0 · Updated Dec 8, 2024
-
stochastic_round_cuda Public
Forked from ethansmith2000/stochastic_round_cuda
-
pytorch Public
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Python · Other · Updated Nov 7, 2024
-
schedule_free Public
Forked from facebookresearch/schedule_free
Schedule-Free Optimization in PyTorch
-
oracle-head-gpt Public
Forked from SonicCodes/oracle-head-gpt
Probe for predicting future hidden states on gpt-2 vibes....
Python · Updated Oct 28, 2024
-
modded-nanogpt Public
Forked from KellerJordan/modded-nanogpt
NanoGPT (124M) quality in 3.25B tokens
Python · Updated Oct 13, 2024
-
flux-fp8-api Public
Forked from aredden/flux-fp8-api
Flux diffusion model implementation using quantized fp8 matmul; the remaining layers use faster half-precision accumulate, which is ~2x faster on consumer devices.
Python · Updated Sep 4, 2024
-
guided-diffusion Public
Forked from kostarion/guided-diffusion
Python · MIT License · Updated Jul 23, 2024
-
FABRAG Public
FABRIC.. but RAG!
-
memory-transformer-pt4 Public
Forked from Avelina9X/memory-transformer-pt4
Jupyter Notebook · Updated Feb 24, 2024
-
TrueGrad Public
PyTorch interface for TrueGrad Optimizers
-
AgileRL Public
Forked from AgileRL/AgileRL
Streamlining reinforcement learning with RLOps
Python · Apache License 2.0 · Updated Jul 14, 2023
-
pytorch-image-models Public
Forked from huggingface/pytorch-image-models
PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
Python · Apache License 2.0 · Updated Jun 13, 2023
-
SharedUtils Public
Easy usage of Python's new SharedMemory for reduced memory and CPU cost
-
tpucare Public
Automatically take good care of your preemptible TPUs
-
pytorch-center-loss Public
Forked from KaiyangZhou/pytorch-center-loss
PyTorch implementation of Center Loss
Python · MIT License · Updated Feb 19, 2023
-
HomebrewNLP-MTF Public
Forked from HomebrewNLP/HomebrewNLP-MTF
HomebrewNLP in Mesh-TensorFlow flavour for distributed TPU training
Python · BSD 2-Clause "Simplified" License · Updated Feb 1, 2023
-
locoprop-torch Public
Forked from dvruette/locoprop-torch
LocoProp implementation in PyTorch. (https://proceedings.mlr.press/v151/amid22a/amid22a.pdf)
Jupyter Notebook · MIT License · Updated Jan 15, 2023
-
PerfTorch Public
High-performance PyTorch modules