Incorporation with llama 2? #8
Comments
I believe there is no problem with using the Flash-LLM kernel on llama 2. Flash-LLM mainly consists of a high-performance SpMM GPU kernel, which should be efficient for all existing LLM inference MatMul shapes.
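For reference, a rough sketch of the MatMul shapes such a kernel would need to cover for Llama-2-7B decoding, assuming the standard configuration (hidden size 4096, intermediate size 11008); these shapes are an illustration of the workload, not something taken from the Flash-LLM repo:

```python
# Illustrative only: per-layer weight shapes for Llama-2-7B (assumed standard config).
# During token-by-token decoding, each MatMul multiplies an N x K weight by a K x M
# activation, where M (the token batch) is small -- the skinny-MatMul regime where
# a sparse weight kernel is expected to pay off.
HIDDEN, INTERMEDIATE = 4096, 11008

LLAMA2_7B_MATMULS = {
    "q_proj":    (HIDDEN, HIDDEN),        # attention query projection
    "k_proj":    (HIDDEN, HIDDEN),        # attention key projection
    "v_proj":    (HIDDEN, HIDDEN),        # attention value projection
    "o_proj":    (HIDDEN, HIDDEN),        # attention output projection
    "gate_proj": (INTERMEDIATE, HIDDEN),  # MLP gate projection
    "up_proj":   (INTERMEDIATE, HIDDEN),  # MLP up projection
    "down_proj": (HIDDEN, INTERMEDIATE),  # MLP down projection
}

for name, (n, k) in LLAMA2_7B_MATMULS.items():
    print(f"{name:10s}: weight {n} x {k}")
```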
Does llama 2 have such high unstructured sparsity? And can our method be combined with quantization?
We do have ongoing research that achieves 70% unstructured sparsity on llama 2 with negligible accuracy loss, which is why we want to see the speed gain from removing those weights.
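As a minimal illustration of what 70% unstructured (element-wise) sparsity means for a single weight matrix, the sketch below uses simple magnitude pruning; this is only a stand-in for whatever pruning method that research actually uses:

```python
# Minimal sketch: prune a weight matrix to ~70% unstructured sparsity by magnitude.
# This is NOT the pruning method referenced above, just an illustration of the target.
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float = 0.70) -> torch.Tensor:
    """Zero out the smallest-magnitude entries so `sparsity` of them become zero."""
    k = int(weight.numel() * sparsity)
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight * (weight.abs() > threshold)

w = torch.randn(4096, 4096)          # e.g. one attention projection of Llama-2-7B
w_sparse = magnitude_prune(w, 0.70)
print(f"actual sparsity: {(w_sparse == 0).float().mean():.2%}")  # roughly 70%
```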
That's amazing! To my knowledge, the SparseGPT and Wanda methods significantly increase perplexity when llama is pruned to 70% sparsity.
That is amazing! For 70% unstructured sparsity, I believe even better performance can be achieved compared to our existing implementation. I am currently working on another project, but we can help you further optimize support for 70% unstructured sparsity later.
Is it possible to use this with llama 2? I'm interested in improving the inference speed, so the accuracy loss doesn't matter right now.