# zeusgrad

A scalar-valued autograd engine and neural net library in Rust, inspired by Karpathy's micrograd. Same idea, Rust idioms. Zero external dependencies.
## Features

- **Automatic differentiation**: build a computation graph of scalar values, then call `.backward()` to compute gradients via reverse-mode autodiff (backpropagation).
- **Neural networks**: composable `Neuron`, `Layer`, and `MLP` building blocks with ReLU activations and SGD training.
- **Zero dependencies**: no external crates; pure Rust standard library.
## Project layout

```text
src/
  lib.rs    -- public API + 40 tests
  value.rs  -- Value: the autograd scalar (Rc<RefCell<ValueInner>>)
  nn.rs     -- Neuron, Layer, MLP
examples/
  demo.rs   -- binary classification with ASCII loss curve
```
## Values

A `Value` wraps an `f64` and tracks the computation graph for automatic differentiation.

```rust
use zeusgrad::value::Value;

let a = Value::new(2.0);
let b = Value::new(-3.0);
let c = Value::new(10.0);
let d = &a * &b + &c; // d = 2*(-3) + 10 = 4
d.backward();

println!("a.grad = {}", a.grad()); // -3.0 (dd/da = b)
println!("b.grad = {}", b.grad()); // 2.0 (dd/db = a)
println!("c.grad = {}", c.grad()); // 1.0 (dd/dc = 1)
```

### Supported operations

| Operation | Syntax |
|---|---|
| Add | `&a + &b`, `a + 2.0` |
| Subtract | `&a - &b`, `a - 2.0` |
| Multiply | `&a * &b`, `a * 2.0` |
| Divide | `&a / &b`, `a / 2.0` |
| Power | `a.pow(2.0)` |
| Negate | `-&a` |
| ReLU | `a.relu()` |
| Tanh | `a.tanh()` |
| Exp | `a.exp()` |
All operations track gradients. Mixed `Value`/`f64` arithmetic is supported.
## Neural networks

```rust
use zeusgrad::nn::MLP;
use zeusgrad::value::Value;

// 2 inputs, two hidden layers of 16, 1 output
let model = MLP::new(2, &[16, 16, 1], 42);
println!("Parameters: {}", model.param_count()); // 337

let x = vec![Value::new(1.0), Value::new(-1.0)];
let pred = model.forward(&x);

// Compute loss, backprop, SGD
let target = Value::new(1.0);
let loss = (&pred[0] - &target).pow(2.0);
loss.backward();

let lr = 0.01;
for p in model.parameters() {
    let new_val = p.data() - lr * p.grad();
    p.set_data(new_val);
}
model.zero_grad();
```

The building blocks:

- **Neuron**: `w · x + b` with optional ReLU
- **Layer**: a vector of neurons
- **MLP**: a stack of layers; hidden layers use ReLU, the final layer is linear
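The parameter count of 337 can be checked by hand: each neuron holds one weight per input plus one bias, so `MLP(2, [16, 16, 1])` has 16·(2+1) + 16·(16+1) + 1·(16+1) = 48 + 272 + 17 = 337. A standalone sketch of that arithmetic (plain Rust, independent of the crate):

```rust
fn main() {
    // Layer sizes for MLP(2, [16, 16, 1]): input, hidden, hidden, output.
    let sizes = [2usize, 16, 16, 1];
    // Each neuron in a layer has one weight per input plus one bias,
    // so a layer mapping n inputs to m outputs holds m * (n + 1) parameters.
    let params: usize = sizes.windows(2).map(|w| w[1] * (w[0] + 1)).sum();
    println!("{}", params); // 337
}
```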
## Demo

Run the binary classification example:

```sh
cargo run --example demo
```

This trains a 337-parameter MLP on a synthetic circle dataset using hinge loss + L2 regularization, and prints an ASCII loss curve:
```text
=== zeusgrad demo: binary classification ===
Dataset: 100 points (circle boundary classification)
Model: MLP(2, [16, 16, 1]) (337 parameters)

Epoch  0 | loss: 0.7739 | accuracy: 67.0%
Epoch 50 | loss: 0.7033 | accuracy: 69.0%
Epoch 99 | loss: 0.6594 | accuracy: 70.0%

--- Loss Curve ---
0.7739 |*
       | **
       |   ***
       |      ***
       |         ****
       |             *****
       |                  *****
0.7167 |                       ******
       |                            ******
       |                                 *******
       |                                      ******
       |                                          ******
       |                                              *****
       |                                                 ******
0.6594 |                                                       ***
       +------------------------------------------------------------
```
## Tests

```sh
cargo test
```

40 tests covering:
- Forward pass correctness (add, mul, sub, div, neg, pow, relu, tanh, exp)
- Backprop gradient correctness for every operation
- Numerical gradient checking (finite differences) for polynomials, tanh, exp, and composed expressions
- Shared node gradients (a + a, a * a)
- Zero-grad propagation
- Neural net forward/backward, parameter counting, zero-grad
- SGD step reduces loss
- Multi-epoch training convergence
- f64 operator convenience overloads
- Doc tests
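The numerical gradient checks compare an analytic derivative against a central finite difference. A minimal standalone sketch of that technique (plain Rust, not the crate's actual test code):

```rust
fn main() {
    // f(x) = tanh(x); its analytic derivative is 1 - tanh(x)^2.
    let f = |x: f64| x.tanh();
    let analytic = |x: f64| 1.0 - x.tanh().powi(2);

    let x = 0.7;
    let h = 1e-5;
    // Central difference: (f(x+h) - f(x-h)) / 2h, with error O(h^2).
    let numeric = (f(x + h) - f(x - h)) / (2.0 * h);

    assert!((numeric - analytic(x)).abs() < 1e-8);
    println!("numeric {:.8} vs analytic {:.8}", numeric, analytic(x));
}
```

The same check applied to every backward closure catches sign and chain-rule mistakes that exact unit tests can miss.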
## How it works

`Value` uses `Rc<RefCell<ValueInner>>` for shared ownership in the compute graph. Each operation creates a new `Value` node that stores:

- its computed `data`
- a `grad` field (accumulated during backward)
- a `backward` closure that propagates gradients to its parents
- references to parent `Value` nodes
Calling `backward()` on the output:

1. Topologically sorts the DAG
2. Seeds the output gradient to 1.0
3. Walks the sorted list in reverse, calling each node's backward closure
This is the same algorithm as PyTorch's autograd, just on scalars instead of tensors.
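As a sketch of that algorithm, here is a hypothetical, minimal re-implementation supporting only addition and multiplication (not the crate's actual code, which stores a backward closure per node instead of explicit local derivatives):

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Inner {
    data: f64,
    grad: f64,
    // (parent node, local derivative d(out)/d(parent))
    parents: Vec<(Node, f64)>,
}

#[derive(Clone)]
struct Node(Rc<RefCell<Inner>>);

impl Node {
    fn new(data: f64) -> Self {
        Node(Rc::new(RefCell::new(Inner { data, grad: 0.0, parents: Vec::new() })))
    }

    fn add(&self, other: &Node) -> Node {
        let out = Node::new(self.0.borrow().data + other.0.borrow().data);
        out.0.borrow_mut().parents = vec![(self.clone(), 1.0), (other.clone(), 1.0)];
        out
    }

    fn mul(&self, other: &Node) -> Node {
        let (a, b) = (self.0.borrow().data, other.0.borrow().data);
        let out = Node::new(a * b);
        // d(a*b)/da = b, d(a*b)/db = a
        out.0.borrow_mut().parents = vec![(self.clone(), b), (other.clone(), a)];
        out
    }

    fn backward(&self) {
        // 1. Topologically sort the DAG rooted at this node.
        fn visit(n: &Node, seen: &mut Vec<*mut Inner>, order: &mut Vec<Node>) {
            let ptr = n.0.as_ptr();
            if seen.contains(&ptr) {
                return;
            }
            seen.push(ptr);
            let parents: Vec<Node> =
                n.0.borrow().parents.iter().map(|(p, _)| p.clone()).collect();
            for p in &parents {
                visit(p, seen, order);
            }
            order.push(n.clone());
        }
        let mut order = Vec::new();
        visit(self, &mut Vec::new(), &mut order);

        // 2. Seed the output gradient to 1.0.
        self.0.borrow_mut().grad = 1.0;

        // 3. Walk the sorted list in reverse, applying the chain rule:
        //    each parent accumulates (node grad) * (local derivative).
        for n in order.iter().rev() {
            let (g, parents): (f64, Vec<(Node, f64)>) = {
                let inner = n.0.borrow();
                (inner.grad, inner.parents.clone())
            };
            for (p, local) in parents {
                p.0.borrow_mut().grad += g * local;
            }
        }
    }
}

fn main() {
    // d = a * b + c, matching the README example above.
    let a = Node::new(2.0);
    let b = Node::new(-3.0);
    let c = Node::new(10.0);
    let d = a.mul(&b).add(&c);
    d.backward();
    println!("a {} b {} c {}", a.0.borrow().grad, b.0.borrow().grad, c.0.borrow().grad);
}
```

Because gradients are accumulated with `+=`, a node that feeds into several operations (e.g. `a + a`) receives the correct summed gradient.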
## License

MIT