
Backward register #423

Open

StrongSpoon wants to merge 20 commits into master from bwd
Conversation

@StrongSpoon (Collaborator) commented Jan 16, 2025

PR Category

Operator

Type of Change

New Feature

Description

Register backward functions as ATen interfaces.
Implement the threshold operator along the way.
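
A minimal sketch of the registration pattern, assuming the torch.library override mechanism is used; the eager threshold_backward below only illustrates the op's semantics and stands in for the PR's Triton kernel:

```python
import torch

# Minimal sketch, assuming registration goes through torch.library; the eager
# implementation below stands in for the PR's Triton kernel.
aten_lib = torch.library.Library("aten", "IMPL")


def threshold_backward(grad_output, self, threshold):
    # threshold(x) passes x through where x > threshold, so the gradient is
    # grad_output where the input exceeded the threshold and zero elsewhere.
    return torch.where(self > threshold, grad_output, torch.zeros_like(grad_output))


# Route aten::threshold_backward on the CUDA dispatch key to the override.
aten_lib.impl("threshold_backward", threshold_backward, "CUDA")
```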

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by a unit test (UT).

Performance

@StrongSpoon force-pushed the bwd branch 2 times, most recently from 9f79739 to 01bee17 on February 6, 2025 at 09:26
@StrongSpoon marked this pull request as ready for review on February 11, 2025 at 02:04
save_invstd=None,
train=False,
eps=1e-05,
output_mask=None,

Contributor:

The last argument should be grad_input_mask.
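
For reference, a small sketch of what that trailing mask controls, assuming the usual aten::native_batch_norm_backward schema; each of the three flags gates one of the returned gradients (grad_input, grad_weight, grad_bias):

```python
import torch

# Sketch only: exercise the three-element mask of native_batch_norm_backward.
N, C = 8, 4
x = torch.randn(N, C)
weight = torch.ones(C)
grad_out = torch.randn(N, C)

# Statistics as a training-mode forward would have saved them.
save_mean = x.mean(dim=0)
save_invstd = 1.0 / torch.sqrt(x.var(dim=0, unbiased=False) + 1e-5)

grad_input, grad_weight, grad_bias = torch.ops.aten.native_batch_norm_backward(
    grad_out, x, weight, None, None, save_mean, save_invstd,
    True, 1e-5, [True, True, False],  # grad_bias is masked out and not computed
)
print(grad_input.shape, grad_weight.shape)
```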

affine: tl.constexpr,
input_grad_mask: tl.constexpr,
weight_grad_mask: tl.constexpr,
bias_grad_mask: tl.constexpr,

Contributor:

The backward kernel may also need an is_train arg, to distinguish between the train and non-train cases.
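
A rough sketch of how such compile-time flags might look in a Triton kernel (indexing is flattened to 1-D for brevity; all names here are illustrative, not the PR's kernel):

```python
import triton
import triton.language as tl


@triton.jit
def bn_backward_sketch(
    grad_out_ptr, save_invstd_ptr, running_var_ptr, grad_in_ptr,
    N, eps,
    INPUT_GRAD_MASK: tl.constexpr,
    IS_TRAIN: tl.constexpr,
    BLOCK: tl.constexpr,
):
    # constexpr flags specialize the kernel at compile time, so masked-out
    # gradients and the unused statistics branch are eliminated entirely.
    pid = tl.program_id(0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < N
    if INPUT_GRAD_MASK:
        g = tl.load(grad_out_ptr + offs, mask=mask, other=0.0)
        if IS_TRAIN:
            # training: reuse the inverse std saved by the forward pass
            invstd = tl.load(save_invstd_ptr + offs, mask=mask, other=0.0)
        else:
            # eval: statistics come from the running buffers instead
            var = tl.load(running_var_ptr + offs, mask=mask, other=0.0)
            invstd = 1.0 / tl.sqrt(var + eps)
        tl.store(grad_in_ptr + offs, g * invstd, mask=mask)
```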

Contributor:

We can leave it for future work, though.

running_var=None,
save_mean=None,
save_invstd=None,
train=False,

Contributor:

The kernel should be able to handle the train=True case.
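
A hedged illustration of why the two cases differ: with training=False the statistics are constants (running buffers), so grad_input reduces to a per-channel rescale of grad_out, whereas with training=True the batch statistics depend on the input and contribute extra correction terms that the kernel must compute:

```python
import torch
import torch.nn.functional as F

# Illustration only: in eval mode the batch-norm backward is a plain rescale.
x = torch.randn(16, 3, requires_grad=True)
running_mean = torch.zeros(3)
running_var = torch.ones(3) * 2.0
weight = torch.ones(3) * 0.5
bias = torch.zeros(3)
eps = 1e-5

y = F.batch_norm(x, running_mean, running_var, weight, bias, training=False, eps=eps)
grad_out = torch.randn_like(y)
(grad_eval,) = torch.autograd.grad(y, x, grad_out)

expected = grad_out * weight / torch.sqrt(running_var + eps)
print(torch.allclose(grad_eval, expected, atol=1e-6))  # True
```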


def native_dropout(x, p=0.5, train=True):
    return NativeDropout.apply(x, p, train)
def dropout(input, p, train):

Contributor:

Arg train is optional.
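
For comparison, the functional API already gives both knobs defaults (F.dropout(input, p=0.5, training=True)), which is presumably what the wrapper's signature should mirror:

```python
import torch
import torch.nn.functional as F

# Sanity check of the defaulted signature: dropout is the identity when
# training is False.
x = torch.randn(3)
print(F.dropout(x, training=False).equal(x))  # True
```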

logging.debug("GEMS NATIVE DROPOUT FORWARD")
assert p > 0.0 and p < 1.0, "p must be in (0, 1)"
device = input.device
input = input.contiguous()

Contributor:

Add a note that we'll remove contiguous enforcement in the future.

Comment on lines +119 to +120
indices = indices.contiguous()
weight = weight.contiguous()

Contributor:

Refactor this in TODOs.

mean = mean.contiguous()
rstd = rstd.contiguous()
weight = None if weight is None else weight.contiguous()
group_size = C // group

Contributor:

cdiv?

BLOCK_GROUP_SIZE=triton.next_power_of_2(C // num_groups),
BLOCK_HW_SIZE=triton.next_power_of_2(HW),
HxW,
BLOCK_GROUP_SIZE=triton.next_power_of_2(C // group),

Contributor:

cdiv(C, group)?
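
A quick illustration of the difference the reviewer is pointing at: floor division drops the remainder when C is not a multiple of the group count, while triton.cdiv rounds up so the padded block covers every element:

```python
import triton

C, group = 18, 4
print(C // group)                                     # 4: remainder dropped
print(triton.cdiv(C, group))                          # 5: rounds up
print(triton.next_power_of_2(triton.cdiv(C, group)))  # 8: padded block size
```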


def native_dropout(x, p=0.5, train=True):
    return NativeDropout.apply(x, p, train)
def dropout(input, p, train):

Contributor:

I realized we didn't handle train=False correctly in the previous version. Let's fix that.
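
A hedged reference of the behavior being asked for, assuming aten::native_dropout's contract of returning the output together with a boolean keep-mask: when train is False the op is the identity and the mask is all ones:

```python
import torch

# Reference sketch (not the PR's Triton path): native_dropout returns the
# scaled output plus a boolean keep-mask; with train=False it degenerates to
# the identity and an all-ones mask, so no random numbers are needed.
def native_dropout_reference(x, p=0.5, train=True):
    if not train or p == 0.0:
        return x.clone(), torch.ones_like(x, dtype=torch.bool)
    keep = torch.rand_like(x) >= p
    return x * keep / (1.0 - p), keep


x = torch.randn(4)
out, mask = native_dropout_reference(x, p=0.3, train=False)
print(torch.equal(out, x), bool(mask.all()))  # True True
```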
