
Neural Networks: Zero to Hero

This project is a Rust implementation of the YouTube series Neural Networks: Zero to Hero by Andrej Karpathy. The goal is to implement the code from each video in Rust in a way that allows the viewer to follow along, but instead of looking at Python code they can look at the Rust code in this repository.

In the series Mr. Karpathy uses notebook-style exploratory coding, which is very helpful for learning. I'll try to provide the same thing in the code itself, or as debugging sessions where applicable.

Install

This part is not optimal: to see plots we depend on Python's matplotlib and numpy being available. This is only for plotting; we will only be using Rust libraries for everything else. But visualizing the data is an important part of the learning process, so I wanted to provide something, and I hope to replace it with a pure Rust solution in the future.

Create a Python virtual environment and install the requirements:

$ python3 -m venv zeroh
$ source zeroh/bin/activate
(zeroh) $ pip install -r requirements.txt

We also need to install the lapack and openblas development libraries:

$ sudo dnf install lapack-devel openblas-devel

Part1: Building micrograd

The first [part] of the series is called The spelled-out intro to neural networks and backpropagation: building micrograd. Since the idea is to follow along with the series, I won't describe what it does here, as that might take away from the learning experience.

The Rust code for this first part of the series can be found in part1 and can be run with the following command:

(zeroh) $ cargo run -p part1

The following plot shows the function that will be used in the intro section:

image
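If you want to poke at this in isolation, here is a small standalone sketch (not part of the repository), assuming the same quadratic used in the video, f(x) = 3x² − 4x + 5, together with the numerical slope estimate that the intro builds on:

```rust
// Standalone sketch (not part of the repo): the quadratic from the intro
// and a numerical estimate of its slope, (f(x + h) - f(x)) / h.
fn f(x: f64) -> f64 {
    3.0 * x.powi(2) - 4.0 * x + 5.0
}

fn main() {
    let x = 3.0;
    let h = 0.0001;
    let slope = (f(x + h) - f(x)) / h;
    // Analytically df/dx = 6x - 4, so the slope at x = 3 should be close to 14.
    println!("f({x}) = {}, slope ≈ {slope}", f(x));
}
```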

The first diagram/graph from the intro section looks like this:

image
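For orientation, this is roughly the kind of expression the graph depicts, using the values from the video. The repository builds it out of its own graph-node type rather than plain f64, so the gradients can be tracked:

```rust
// Plain-f64 version of the toy expression from the video; the actual code
// wraps each value in a graph node so gradients can flow backwards later.
fn main() {
    let a = 2.0;
    let b = -3.0;
    let c = 10.0;
    let e = a * b; // -6
    let d = e + c; //  4
    let f = -2.0;
    let l = d * f; // -8
    println!("L = {l}");
}
```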

The second diagram is of the single neuron network before gradients have been calculated:

image

Tanh graph for reference:

image
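For reference, tanh is defined as

$$\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1},$$

which squashes its input into the range (-1, 1).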

After adding the tanh activation function:

image
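As a quick sanity check, here is a plain-f64 sketch of the forward pass shown in the diagram, using the same inputs, weights and bias as the video (the repository's version tracks the computation graph instead):

```rust
// Forward pass of the single neuron with tanh, using plain f64 values
// (the repository wraps these in graph nodes instead).
fn main() {
    let (x1, x2) = (2.0, 0.0);      // inputs
    let (w1, w2) = (-3.0, 1.0);     // weights
    let b = 6.8813735870195432_f64; // bias
    let n = x1 * w1 + x2 * w2 + b;  // weighted sum
    let o = n.tanh();               // activation, ~0.7071
    println!("n = {n}, o = {o}");
}
```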

After manually calculating the gradients:

image
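The manual step boils down to the chain rule. For example, for the output o = tanh(n) of the neuron above:

$$\frac{\partial o}{\partial n} = 1 - \tanh^2(n) = 1 - o^2, \qquad \frac{\partial o}{\partial w_1} = \frac{\partial o}{\partial n}\cdot\frac{\partial n}{\partial w_1} = (1 - o^2)\,x_1.$$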

Before performing backpropagation using the backward function:

image

After performing backpropagation manually using the explicit backward function:

image

After performing backpropagation in topological order (by calling the backward function):

image
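The topological ordering itself is a small depth-first search: a node is appended only after all of its children, and backprop then walks the list in reverse. Here is a hypothetical, self-contained sketch of that ordering step (the repository's node type and backward closures look different):

```rust
use std::collections::HashSet;

// Minimal expression node for illustration: just the indices of the nodes
// it was computed from.
struct Node {
    children: Vec<usize>,
}

// Depth-first search that appends a node only after all of its children,
// producing a topological order. Backprop visits this order in reverse.
fn topo_sort(nodes: &[Node], root: usize, visited: &mut HashSet<usize>, order: &mut Vec<usize>) {
    if visited.insert(root) {
        for &child in &nodes[root].children {
            topo_sort(nodes, child, visited, order);
        }
        order.push(root);
    }
}

fn main() {
    // e = a * b; d = e + c  ->  indices: a=0, b=1, c=2, e=3, d=4
    let nodes = vec![
        Node { children: vec![] },
        Node { children: vec![] },
        Node { children: vec![] },
        Node { children: vec![0, 1] },
        Node { children: vec![3, 2] },
    ];
    let (mut visited, mut order) = (HashSet::new(), Vec::new());
    topo_sort(&nodes, 4, &mut visited, &mut order);
    // Gradients would be propagated by visiting `order` in reverse.
    println!("{order:?}"); // [0, 1, 3, 2, 4]
}
```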

After performing backpropagation in topological order (by calling the backward function) with the "decomposed" tanh function:

image
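"Decomposed" here means expressing tanh through more primitive operations that each already have a local backward pass:

$$\tanh(n) = \frac{e^{2n} - 1}{e^{2n} + 1},$$

so only exp, addition, subtraction and division need their own local derivatives.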

Next we have a multi-layer perceptron (MLP) with two hidden layers, before backpropagation:

image

Then we have the same multi-layer perceptron (MLP) with two hidden layers, after backpropagation:

image
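As a rough mental model (a hypothetical sketch, not the repository's API), an MLP is just layers of neurons, each doing the weighted-sum-plus-tanh step from above:

```rust
// Hypothetical plain-f64 MLP forward pass; the repository's version stores
// graph nodes so gradients can flow back through every weight.
struct Neuron {
    weights: Vec<f64>,
    bias: f64,
}

impl Neuron {
    fn forward(&self, inputs: &[f64]) -> f64 {
        let sum: f64 = self
            .weights
            .iter()
            .zip(inputs)
            .map(|(w, x)| w * x)
            .sum::<f64>()
            + self.bias;
        sum.tanh()
    }
}

struct Layer {
    neurons: Vec<Neuron>,
}

impl Layer {
    fn forward(&self, inputs: &[f64]) -> Vec<f64> {
        self.neurons.iter().map(|n| n.forward(inputs)).collect()
    }
}

fn main() {
    // Two tiny layers with fixed weights, just to show the data flow.
    let layers = vec![
        Layer { neurons: vec![
            Neuron { weights: vec![0.5, -0.3], bias: 0.1 },
            Neuron { weights: vec![-0.2, 0.8], bias: 0.0 },
        ]},
        Layer { neurons: vec![
            Neuron { weights: vec![1.0, -1.0], bias: 0.0 },
        ]},
    ];
    let mut activations = vec![2.0, 3.0];
    for layer in &layers {
        activations = layer.forward(&activations);
    }
    println!("output = {activations:?}");
}
```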

And then we have...

image

Part2: Building makemore

The second [part] of the series is called The spelled-out intro to language modeling: building makemore.

This part uses the tch crate, so you'll need to download libtorch and extract it into the current directory:

$ wget https://download.pytorch.org/libtorch/cpu/libtorch-cxx11-abi-shared-with-deps-2.0.1%2Bcpu.zip
$ unzip libtorch-cxx11-abi-shared-with-deps-2.0.1+cpu.zip
$ export LIBTORCH=$PWD/libtorch
$ export LD_LIBRARY_PATH=$PWD/libtorch/lib:$LD_LIBRARY_PATH
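If the environment variables are set correctly, a minimal tch program should run. A hypothetical smoke test (not part of the repo); depending on the tch version the constructor is Tensor::of_slice or Tensor::from_slice:

```rust
// Hypothetical smoke test for the libtorch setup; assumes tch is a dependency.
use tch::Tensor;

fn main() {
    let t = Tensor::of_slice(&[1i64, 2, 3]);
    let doubled = &t + &t;
    doubled.print(); // should print a tensor containing 2, 4, 6
}
```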

We also need fontconfig-devel for plotters:

$ sudo dnf install fontconfig-devel

The Rust code for this part of the series can be found in part2 and can be run with the following command:

(zeroh) $ cargo run -p part2

The first diagram/graph shows the distribution of word counts:

image
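For context, the early steps of makemore revolve around counting character bigrams in the names.txt dataset. Here is a rough standalone sketch of that counting step using std collections instead of tensors (file name and format assumed; the repository's version differs):

```rust
use std::collections::HashMap;
use std::fs;

// Standalone sketch: count character bigrams in the names dataset, with '.'
// marking the start and end of each word, as in the video.
fn main() {
    let text = fs::read_to_string("names.txt").expect("names.txt not found");
    let mut counts: HashMap<(char, char), u32> = HashMap::new();
    for word in text.lines() {
        let chars: Vec<char> = std::iter::once('.')
            .chain(word.chars())
            .chain(std::iter::once('.'))
            .collect();
        for pair in chars.windows(2) {
            *counts.entry((pair[0], pair[1])).or_insert(0) += 1;
        }
    }
    // Print the ten most frequent bigrams.
    let mut sorted: Vec<_> = counts.into_iter().collect();
    sorted.sort_by(|a, b| b.1.cmp(&a.1));
    for ((a, b), n) in sorted.iter().take(10) {
        println!("{a}{b}: {n}");
    }
}
```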

It's still a work-in-progress!