
ReLU forward and backward implemented in train_gpt2_fp32.cu #1

Open · wants to merge 2 commits into master
Conversation

Hrancheng (Collaborator) commented:
Add ReLU forward and backward kernels and integrate them into the model; validation loss decreases during training.
Usage:
make train_gpt2fp32cu
Enable ReLU training:
./train_gpt2fp32cu -r 1
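
The diff below does not show how -r is wired up. As a rough illustration only, llm.c-style trainers read flags in a simple two-token argv loop; the variable name use_relu is an assumption for this sketch, not necessarily what the PR uses:

// Hypothetical sketch of parsing -r in the trainer's argv loop
// (atoi from <stdlib.h>); not the PR's actual code.
int use_relu = 0; // default 0: keep standard softmax attention
for (int i = 1; i < argc; i += 2) {
    if (argv[i][0] == '-' && argv[i][1] == 'r') { use_relu = atoi(argv[i + 1]); }
}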

@@ -299,6 +299,25 @@ __global__ void softmax_forward_kernel5(float* out, float inv_temperature, const
}
}

// ReLU forward kernel
__global__ void relu_forward_kernel(float* out, const float* inp, int N, int T) {
Collaborator commented:
I think we also need to pass the scale (temperature) into the kernel.
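
The diff above shows only the forward kernel's signature. Purely as a hedged sketch, not the PR's actual code, here is one way the forward/backward pair could look with the reviewer's suggestion folded in; the scale parameter, the fmaxf-based ReLU, and the backward rule are all assumptions:

// Sketch only: assumes the kernel applies a scaled elementwise ReLU over
// N rows of length T. The extra `scale` (temperature) parameter follows the
// review suggestion and is NOT in the PR's signature above.
__global__ void relu_forward_kernel(float* out, const float* inp, float scale, int N, int T) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N * T) {
        out[idx] = fmaxf(inp[idx] * scale, 0.0f); // scale first, then ReLU
    }
}

// Matching backward pass: d/dinp of relu(scale * inp) is scale where the
// scaled input was positive, else 0 (assumes scale > 0).
__global__ void relu_backward_kernel(float* dinp, const float* inp, const float* dout, float scale, int N, int T) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N * T) {
        dinp[idx] = (inp[idx] > 0.0f) ? dout[idx] * scale : 0.0f;
    }
}

A typical launch would cover all N * T elements with a 1D grid, e.g. grid size (N * T + 255) / 256 at block size 256.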
