[FEATURE] Effective drop path. #1836

leng-yue · 2023-05-31T05:45:56Z

Is your feature request related to a problem? Please describe.

The current drop path implementation in TIMM doesn't save any compute: the residual branch still runs on every sample, and the dropped ones are only zeroed out afterwards. A true drop path that skips the dropped samples entirely would significantly speed up training when the drop path ratio is high (e.g. 0.3 or 0.4).

Describe the solution you'd like
Reference: https://github.com/facebookresearch/dinov2/blob/c3c2683a13cde94d4d99f523cf4170384b00c34c/dinov2/layers/block.py#L110

I've already implemented a modified Block that uses this function, and it gives me a huge performance improvement. I can add it to PR #1835 if that sounds like a good idea.
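To make the idea concrete, here is a minimal sketch of a sample-subset drop path in PyTorch. It is not the code from PR #1835 or from timm; the helper name `subset_drop_path_residual` and its signature are hypothetical, and it assumes the input has the batch dimension first.

```python
import torch
import torch.nn as nn


def subset_drop_path_residual(x, residual_fn, drop_ratio=0.0):
    """Return x plus a residual computed on a random subset of samples.

    Hypothetical helper for illustration; name and signature are assumptions.
    """
    batch = x.shape[0]
    # Number of samples that still go through the branch (at least one).
    keep = max(int(batch * (1.0 - drop_ratio)), 1)
    # Pick `keep` distinct batch indices at random.
    idx = torch.randperm(batch, device=x.device)[:keep]
    # Only the kept samples pay for the residual branch.
    residual = residual_fn(x[idx])
    # Rescale so the batch-averaged residual matches the no-drop case in expectation.
    scale = batch / keep
    out = torch.index_add(
        x.flatten(1), 0, idx, residual.flatten(1).to(x.dtype), alpha=scale
    )
    return out.view_as(x)


# Usage: a linear layer stands in for an attention or MLP branch.
torch.manual_seed(0)
branch = nn.Linear(8, 8)
tokens = torch.randn(8, 4, 8)  # (batch, tokens, dim)
out = subset_drop_path_residual(tokens, branch, drop_ratio=0.5)
print(out.shape)
```

Unlike a per-sample mask, this drops a fixed number of samples per step, so the branch always sees a smaller batch when the ratio is high.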

rwightman · 2023-05-31T19:04:03Z

@leng-yue I noticed that, but the comment `# the overhead is compensated only for a drop path rate larger than 0.1` suggests the impact isn't that great if you need at least 0.1 just to break even. It doesn't seem worth the added complexity. Can it be better quantified?

leng-yue · 2023-05-31T19:18:44Z

The ratio in dinov2's paper is 0.3 or 0.4. In our tests, using their Drop Path implementation made training 15% faster at a ratio of 0.3. I can run a benchmark if you want.

leng-yue · 2023-05-31T20:17:28Z

https://colab.research.google.com/drive/1ydeHogHNlGgVYCFBbgd4a5PYi9LZWHLH?usp=sharing

As this benchmark shows, when the drop path rate (dpr) is 0.4, it cuts training time by 41%.
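As a rough sanity check on figures like this, the idealized share of branch compute can be estimated from the subset size alone. This is only a sketch: it assumes the subset size is `max(int(B * (1 - ratio)), 1)`, that the residual branches dominate step time, and that indexing overhead is zero. The batch size of 64 is an arbitrary assumption, not the one used in the benchmark.

```python
def ideal_branch_fraction(batch_size: int, drop_ratio: float) -> float:
    """Fraction of the batch that still runs the residual branch (idealized)."""
    kept = max(int(batch_size * (1.0 - drop_ratio)), 1)
    return kept / batch_size


# Assumed batch size of 64; real savings also depend on overhead and model shape.
for dpr in (0.1, 0.3, 0.4):
    frac = ideal_branch_fraction(64, dpr)
    print(f"dpr={dpr}: branch runs on {frac:.1%} of the batch")
```

Measured speedups can land on either side of this estimate, which is why an actual benchmark is the better evidence.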

leng-yue · 2023-06-07T22:32:43Z

Any suggestions?

rwightman · 2023-06-08T20:25:20Z

@leng-yue sorry, I've got quite a few other tasks to plow through, so I haven't had a chance to look more closely at this. I do want to test and weigh the added complexity vs. benefit before making a final decision.

leng-yue · 2023-06-10T00:38:54Z

Maybe adding it as a new Block would be better, so that it doesn't have any negative effect on existing code?

rwightman · 2023-06-15T21:08:22Z

@leng-yue yeah, I suppose a new Block would mitigate risk concerns for now, and also avoid breaking other blocks that don't currently support it. Can figure out how to make it easier to select later...
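For illustration, a separate block along these lines might look like the sketch below. It is not timm's `Block` and not the code that was eventually submitted: the class name `SubsetDropPathBlock`, the `min_ratio` threshold (taken from the 0.1 break-even comment quoted earlier), and the use of `nn.MultiheadAttention` are all assumptions.

```python
import torch
import torch.nn as nn


class SubsetDropPathBlock(nn.Module):
    """Hypothetical pre-norm transformer block with a sample-subset drop path."""

    def __init__(self, dim, num_heads=4, mlp_ratio=4.0, drop_path=0.0, min_ratio=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        hidden = int(dim * mlp_ratio)
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
        self.drop_path = drop_path
        self.min_ratio = min_ratio  # assumed break-even threshold

    def _attn_branch(self, x):
        h = self.norm1(x)
        return self.attn(h, h, h, need_weights=False)[0]

    def _mlp_branch(self, x):
        return self.mlp(self.norm2(x))

    def _residual(self, x, branch):
        # Eval mode or no drop path: plain residual connection.
        if not self.training or self.drop_path == 0.0:
            return x + branch(x)
        # Low rates: conventional masked drop path (branch runs on all samples).
        if self.drop_path <= self.min_ratio:
            keep_prob = 1.0 - self.drop_path
            mask = x.new_empty(x.shape[0], *([1] * (x.ndim - 1))).bernoulli_(keep_prob)
            return x + branch(x) * mask / keep_prob
        # High rates: run the branch only on a random subset of samples.
        batch = x.shape[0]
        keep = max(int(batch * (1.0 - self.drop_path)), 1)
        idx = torch.randperm(batch, device=x.device)[:keep]
        res = branch(x[idx])
        out = torch.index_add(
            x.flatten(1), 0, idx, res.flatten(1).to(x.dtype), alpha=batch / keep
        )
        return out.view_as(x)

    def forward(self, x):
        x = self._residual(x, self._attn_branch)
        return self._residual(x, self._mlp_branch)


# Usage: the block keeps the input shape in both training and eval mode.
block = SubsetDropPathBlock(dim=32, drop_path=0.4)
block.train()
print(block(torch.randn(8, 5, 32)).shape)
```

Keeping this in its own class leaves existing blocks untouched, at the cost of duplicating the attention and MLP wiring.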

leng-yue · 2023-06-16T03:26:13Z

I will implement it later.

leng-yue · 2023-06-20T06:51:06Z

Updated.

leng-yue added the enhancement label May 31, 2023

leng-yue mentioned this issue Jun 3, 2023

Add drop path schedule #1835

Closed
