Skip to content

rmeno12/cuda-toy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

68 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

cuda-toy

Check out the write up of this project.

The goal of this project is to implement a simple neural network in C++ that is GPU accelerated via CUDA.

Goals

  • [DONE] Implement non-accelerated net
  • [DONE] Optimize net architecture a little for MNIST data
  • [DONE] Implement CUDA-accelerated matrix functions
  • [DONE] Quantitatively compare accelerated vs non-accelerated performance

Results

CPU with std::vector

CPU_vec_graph Mean iteration time ~ 3430 ms

CPU with arrays

CPU_graph Mean iteration time ~ 428 ms

CPU with GPU matrix multiplication

GPU_matmul_graph Mean iteration time ~ 18 ms

CPU with GPU matrix arithmetic

GPU_graph Mean iteration time ~ 12 ms