Layer Skip looks interesting #432
-
It does look interesting, but it requires models specifically trained (or at least finetuned) for early exit. And the speedup isn't amazing compared to using a tiny draft model, so it's questionable whether it's really worth finetuning the large models that would benefit most from it. One big drawback is that you still need to fill out the K/V cache entries of any skipped layers, so you can't just exit early and proceed, even if you already know halfway through the forward pass what token you're going to sample. You can exit early and then do a full, batched pass over some number of early-exit tokens, but at that point you run into the same limitations as other speculative methods, and the speedup ends up being very comparable.
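To make that concrete, here's a rough sketch of what an early-exit / self-speculative loop looks like: draft a few tokens with only the first N layers, then run one full, batched pass over those positions to verify them. This is not exllamav2's actual API; the `model.forward(ids, exit_layer=...)` interface is assumed for illustration, and the sketch recomputes the whole sequence each step instead of managing a K/V cache. In a cached implementation, the full verification pass is also what back-fills the K/V entries for the layers skipped while drafting.

```python
import torch

@torch.inference_mode()
def self_speculative_generate(model, input_ids, exit_layer, num_draft, max_new_tokens):
    # Hypothetical interface, not a real exllamav2 API:
    #   model.forward(ids, exit_layer=None) -> logits [batch, seq, vocab];
    #   passing exit_layer runs only the first `exit_layer` transformer layers.
    seq = input_ids
    while seq.shape[-1] - input_ids.shape[-1] < max_new_tokens:

        # 1) Draft: greedy-decode a few tokens using only the early layers.
        draft_seq = seq
        for _ in range(num_draft):
            logits = model.forward(draft_seq, exit_layer=exit_layer)
            next_tok = logits[:, -1:].argmax(dim=-1)
            draft_seq = torch.cat([draft_seq, next_tok], dim=-1)
        draft = draft_seq[:, seq.shape[-1]:]                        # [1, num_draft]

        # 2) Verify: one full, batched pass over the drafted positions, same as
        #    the verify step in any other speculative-decoding scheme. With a
        #    K/V cache, this pass would also fill entries for the skipped layers.
        full_logits = model.forward(draft_seq)
        target = full_logits[:, seq.shape[-1] - 1:].argmax(dim=-1)  # [1, num_draft + 1]

        # 3) Accept the longest matching prefix of the draft, plus the full
        #    model's token at the first mismatch (or a bonus token if all match).
        matches = (target[:, :-1] == draft).long().cumprod(dim=-1)
        n_accept = int(matches.sum().item())
        accepted = torch.cat([draft[:, :n_accept], target[:, n_accept:n_accept + 1]], dim=-1)
        seq = torch.cat([seq, accepted], dim=-1)

    return seq[:, :input_ids.shape[-1] + max_new_tokens]
```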
-
https://huggingface.co/papers/2404.16710
Hey! :)
I just found this, and the self-speculative decoding looks promising at first glance.
@turboderp What do you think about it?