DeepC is a lightweight neural network library written in C, developed as part of the Introduction to Deep Learning (CSE 599g1) course in Fall 2018. The `src/` subfolder contains the header and source files. `uwnet.py` is a wrapper module that provides a pure Python interface to our C library. `trycifar.py` and `trymnist.py` are Python scripts that load the data, define the network, and run the train and test functions.
`matrix` is the most fundamental data structure in this framework. It contains the number of rows and columns of the matrix and an array of floats holding the data in row-major order. The `matrix` struct is declared in `matrix.h`, which also declares functions to manipulate and copy matrices; those functions are defined in `matrix.c`.
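A minimal sketch of the struct following that description (the exact declaration lives in `matrix.h`):

```c
typedef struct {
    int rows;     /* number of rows */
    int cols;     /* number of columns */
    float *data;  /* rows*cols floats in row-major order */
} matrix;

/* In row-major order, element (i, j) lives at data[i*cols + j]. */
```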
The `data` struct is declared in `uwnet.h`. It contains the input matrix `X` and the output matrix `y` on which the neural network is trained.
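Again a sketch following that description (see `uwnet.h` for the exact declaration):

```c
typedef struct {
    matrix X;  /* input examples, one per row */
    matrix y;  /* corresponding labels, one per row */
} data;
```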
`activations.c` defines `activate_matrix` and `gradient_matrix` for the different activation functions. Right now it supports the Logistic, RELU, LRELU, and Softmax activation functions.

`activate_matrix` takes a matrix `m` and an activation function `a`, given as a constant string, and returns f(m), where f is the function selected by `a`. f is applied elementwise to `m`.

Given the output of a layer, i.e. f(m), and the activation `a` that defines f, `gradient_matrix` returns f'(m).
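To illustrate the elementwise convention, here is a RELU-only sketch; the real functions dispatch on the activation argument rather than being split per activation:

```c
#include <stdlib.h>

/* RELU applied elementwise: out[i] = max(0, m[i]). */
matrix activate_relu(matrix m)
{
    matrix out = m;
    out.data = malloc(sizeof(float) * m.rows * m.cols);
    for (int i = 0; i < m.rows * m.cols; ++i)
        out.data[i] = m.data[i] > 0 ? m.data[i] : 0;
    return out;
}

/* RELU gradient computed from the layer output f(m):
 * f'(x) = 1 where x > 0 and 0 elsewhere, and f(m) > 0 exactly
 * where m > 0, so the output alone is enough. */
matrix gradient_relu(matrix fm)
{
    matrix grad = fm;
    grad.data = malloc(sizeof(float) * fm.rows * fm.cols);
    for (int i = 0; i < fm.rows * fm.cols; ++i)
        grad.data[i] = fm.data[i] > 0 ? 1.0f : 0.0f;
    return grad;
}
```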
The `layer` struct contains the input and output of a layer. It also stores the weights and biases, and it has function pointers for forward propagation, for backpropagation, and for updating the weights and biases.
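An illustrative sketch of such a struct; the field names and pointer signatures are assumptions, not the exact contents of the header:

```c
typedef struct layer {
    matrix in;               /* input seen on the last forward pass */
    matrix out;              /* output of the last forward pass */
    matrix w, b;             /* weights and biases */
    matrix dw, db;           /* gradients filled in by backpropagation */
    const char *activation;  /* activation name, e.g. "relu" */

    matrix (*forward)(struct layer *l, matrix in);
    matrix (*backward)(struct layer *l, matrix grad);
    void (*update)(struct layer *l, float rate, float momentum, float decay);
} layer;
```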
`connected_layer.c` defines the fully connected layer. It has three functions: `forward_connected_layer`, `backward_connected_layer`, and `update_connected_layer`.
`forward_connected_layer` outputs the matrix f(in*w + b), where `in` is the input, `w` is the weight matrix, `b` is the bias, and f is the activation function.
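A minimal sketch of that computation with the matrix multiply written out explicitly; `make_matrix` is assumed to be one of the allocation helpers declared in `matrix.h`, and the `layer` fields follow the illustrative struct above:

```c
/* out = f(in*w + b): in is (batch x inputs), w is (inputs x outputs),
 * b is (1 x outputs), and f is applied elementwise by activate_matrix. */
matrix forward_connected_sketch(layer *l, matrix in)
{
    matrix out = make_matrix(in.rows, l->w.cols);  /* assumed allocator */
    for (int i = 0; i < in.rows; ++i) {
        for (int j = 0; j < l->w.cols; ++j) {
            float sum = l->b.data[j];              /* start from the bias */
            for (int k = 0; k < in.cols; ++k)
                sum += in.data[i*in.cols + k] * l->w.data[k*l->w.cols + j];
            out.data[i*out.cols + j] = sum;
        }
    }
    l->in = in;                                    /* saved for backward */
    l->out = activate_matrix(out, l->activation);
    return l->out;
}
```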
`backward_connected_layer` takes dL/df(in*w+b) (the derivative of the loss wrt the output) as its argument and produces dL/dw (derivative of the loss wrt the layer weights), dL/db (derivative of the loss wrt the biases), and dL/din (derivative of the loss wrt the input).
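In matrix form, with `d` = dL/dout multiplied elementwise by f'(in*w+b) from `gradient_matrix`, the three results are dL/dw = in^T * d, dL/db = the column sums of d, and dL/din = d * w^T. A sketch with explicit loops (names are illustrative):

```c
/* Shapes: in is (batch x inputs), w is (inputs x outputs),
 * d is (batch x outputs); dw, db, din must be pre-allocated. */
void backward_connected_sketch(matrix in, matrix w, matrix d,
                               matrix dw, matrix db, matrix din)
{
    /* dL/dw = in^T * d */
    for (int k = 0; k < in.cols; ++k)
        for (int j = 0; j < d.cols; ++j) {
            float sum = 0;
            for (int i = 0; i < in.rows; ++i)
                sum += in.data[i*in.cols + k] * d.data[i*d.cols + j];
            dw.data[k*dw.cols + j] = sum;
        }

    /* dL/db = column sums of d */
    for (int j = 0; j < d.cols; ++j) {
        float sum = 0;
        for (int i = 0; i < d.rows; ++i)
            sum += d.data[i*d.cols + j];
        db.data[j] = sum;
    }

    /* dL/din = d * w^T */
    for (int i = 0; i < d.rows; ++i)
        for (int k = 0; k < w.rows; ++k) {
            float sum = 0;
            for (int j = 0; j < d.cols; ++j)
                sum += d.data[i*d.cols + j] * w.data[k*w.cols + j];
            din.data[i*din.cols + k] = sum;
        }
}
```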
`update_connected_layer` updates the weights and biases using SGD with momentum and weight decay.
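A common formulation of that update, as a sketch (the library's exact ordering and sign conventions may differ); the convolutional layer below reuses the same rule:

```c
/* v <- momentum * v + dL/dw + decay * w
 * w <- w - rate * v
 * Biases follow the same rule, usually without the decay term. */
void sgd_momentum_step(matrix w, matrix dw, matrix v,
                       float rate, float momentum, float decay)
{
    int n = w.rows * w.cols;
    for (int i = 0; i < n; ++i) {
        v.data[i] = momentum * v.data[i] + dw.data[i] + decay * w.data[i];
        w.data[i] -= rate * v.data[i];
    }
}
```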
`convolutional_layer.c` defines the convolutional layer. It contains the following major functions: `im2col`, `col2im`, `forward_convolutional_layer`, `backward_convolutional_layer`, and `update_convolutional_layer`.
`im2col` takes spatial blocks of the image and puts them into columns of a matrix, handling kernel size, stride, padding, etc.
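A sketch of the indexing for a single image stored channels x height x width in row-major order, with a square kernel `k`, stride `s`, and zero padding `p` (the name and argument list are illustrative):

```c
/* Each output column is one k*k*channels patch, so convolving the image
 * reduces to a single matrix multiply against the filter matrix. */
void im2col_sketch(const float *im, int channels, int h, int w,
                   int k, int s, int p, float *col)
{
    int out_h = (h + 2*p - k) / s + 1;
    int out_w = (w + 2*p - k) / s + 1;
    int rows = channels * k * k;       /* one row per (channel, ky, kx) */

    for (int r = 0; r < rows; ++r) {
        int c  = r / (k * k);          /* source channel */
        int ky = (r / k) % k;          /* offset inside the kernel */
        int kx = r % k;
        for (int y = 0; y < out_h; ++y) {
            for (int x = 0; x < out_w; ++x) {
                int iy = y*s + ky - p; /* source pixel, may fall in padding */
                int ix = x*s + kx - p;
                float val = 0;         /* zero padding */
                if (iy >= 0 && iy < h && ix >= 0 && ix < w)
                    val = im[(c*h + iy)*w + ix];
                col[r*(out_h*out_w) + y*out_w + x] = val;
            }
        }
    }
}
```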
`col2im` does the exact opposite of `im2col`, scattering the columns back into the image and summing where patches overlap, which restores the image structure.
`forward_convolutional_layer` multiplies the output of `im2col` by the filter matrix, where each filter is a separate row of the matrix stored in row-major order. The resulting matrix is then passed through the `activate_matrix` function.
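Putting the pieces together, a sketch of that forward pass built on `im2col_sketch` above, with RELU standing in for the activation and one filter per row of `filters`:

```c
/* col must hold (channels*k*k) * (out_h*out_w) floats,
 * out must hold n_filters * (out_h*out_w) floats. */
void forward_conv_sketch(const float *im, int channels, int h, int w,
                         const float *filters, const float *bias,
                         int n_filters, int k, int s, int p,
                         float *col, float *out)
{
    int out_h = (h + 2*p - k) / s + 1;
    int out_w = (w + 2*p - k) / s + 1;
    int cols = out_h * out_w;
    int rows = channels * k * k;

    im2col_sketch(im, channels, h, w, k, s, p, col);

    for (int f = 0; f < n_filters; ++f)
        for (int j = 0; j < cols; ++j) {
            float sum = bias[f];       /* one bias per filter */
            for (int r = 0; r < rows; ++r)
                sum += filters[f*rows + r] * col[r*cols + j];
            out[f*cols + j] = sum > 0 ? sum : 0;  /* RELU as example f */
        }
}
```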
`backward_convolutional_layer` is similar to `backward_connected_layer`: it calculates the derivative of the loss wrt the weights, biases, and inputs.

`update_convolutional_layer` updates the weights (kernel values) and biases using SGD with momentum and weight decay, just like the connected layer.
Maxpooling is another core building block of convolutional neural network architectures. Implementing the maxpool layer is similar to implementing the convolutional layer in some ways: we iterate over the image, process a window of pixels of some fixed size, and put the maximum value in that window into the output. It is defined in `maxpool_layer.c`.
The forward method finds the maximum value in each window of the given size, moving by the stride between applications. Maxpooling happens on each channel independently.
The backward method is similar to the forward one. Even though the window may be large, only one element contributed to the error in the prediction, so we only backpropagate our deltas to a single element of the input per window. Thus, we iterate through the window again, find the maximum value, and backpropagate the error to the element of dL/din corresponding to the position of that maximum.
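A sketch of both passes for a single channel (the real layer also loops over channels and images in the batch; names are illustrative):

```c
/* Forward: write the max of each k x k window, stepping by stride s. */
void maxpool_forward_sketch(const float *in, int h, int w,
                            int k, int s, float *out)
{
    int out_h = (h - k) / s + 1, out_w = (w - k) / s + 1;
    for (int y = 0; y < out_h; ++y)
        for (int x = 0; x < out_w; ++x) {
            float best = in[(y*s)*w + (x*s)];
            for (int dy = 0; dy < k; ++dy)
                for (int dx = 0; dx < k; ++dx) {
                    float v = in[(y*s + dy)*w + (x*s + dx)];
                    if (v > best) best = v;
                }
            out[y*out_w + x] = best;
        }
}

/* Backward: route each output delta to the argmax position only;
 * din must be zero-initialized by the caller. */
void maxpool_backward_sketch(const float *in, int h, int w, int k, int s,
                             const float *dout, float *din)
{
    int out_h = (h - k) / s + 1, out_w = (w - k) / s + 1;
    for (int y = 0; y < out_h; ++y)
        for (int x = 0; x < out_w; ++x) {
            int best = (y*s)*w + (x*s);
            for (int dy = 0; dy < k; ++dy)
                for (int dx = 0; dx < k; ++dx) {
                    int i = (y*s + dy)*w + (x*s + dx);
                    if (in[i] > in[best]) best = i;
                }
            din[best] += dout[y*out_w + x];
        }
}
```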
The `net` struct is an array of layers.
`forward_net` runs the forward propagation function of each layer in the order the layers are stacked.

`backward_net` backpropagates the derivative of the loss through each layer by calling each layer's backpropagation function.

`update_net` updates the weights and biases of all the layers in the network.
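A sketch of the three passes over the layer array, reusing the illustrative `layer` struct from above (field names are assumptions):

```c
typedef struct {
    layer *layers;  /* the stacked layers, in order */
    int n;          /* number of layers */
} net;

matrix forward_net_sketch(net m, matrix X)
{
    for (int i = 0; i < m.n; ++i)
        X = m.layers[i].forward(&m.layers[i], X);
    return X;
}

matrix backward_net_sketch(net m, matrix d)
{
    for (int i = m.n - 1; i >= 0; --i)  /* reverse order */
        d = m.layers[i].backward(&m.layers[i], d);
    return d;
}

void update_net_sketch(net m, float rate, float momentum, float decay)
{
    for (int i = 0; i < m.n; ++i)
        m.layers[i].update(&m.layers[i], rate, momentum, decay);
}
```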
`classifier` has a `train_image_classifier` function that calls `forward_net`, `backward_net`, and `update_net`, in that order, for a specified number of iterations. It also has functions to calculate the cross-entropy loss and the accuracy of the network.
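A rough sketch of one training iteration as described; `random_batch` and `cross_entropy_gradient` are hypothetical stand-ins for the library's batching and loss helpers, and the `*_sketch` functions are the ones sketched above:

```c
void train_sketch(net m, data d, int iters,
                  float rate, float momentum, float decay)
{
    for (int it = 0; it < iters; ++it) {
        data batch = random_batch(d);             /* hypothetical helper */
        matrix pred = forward_net_sketch(m, batch.X);
        matrix dL = cross_entropy_gradient(pred, batch.y);  /* hypothetical */
        backward_net_sketch(m, dL);
        update_net_sketch(m, rate, momentum, decay);
    }
}
```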
Run `make` to compile the source files and make sure the executables and libraries are up to date. Then run either `python trymnist.py` or `python trycifar.py` to train the neural network on the MNIST or CIFAR-10 data, respectively. All data resides in individual subfolders of the `data/` folder.