
How can I speed up llmlingua2? #175

Open
yyjabiding opened this issue Aug 6, 2024 · 1 comment
yyjabiding commented Aug 6, 2024

Describe the issue

I have a context length of about 100k tokens.
Are there any methods to speed up compressing it with llmlingua2 so that it finishes in a short time, e.g. under 2 seconds?
Thanks.

@yyjabiding yyjabiding added the question Further information is requested label Aug 6, 2024
@iofu728 iofu728 self-assigned this Aug 22, 2024
iofu728 commented Aug 22, 2024

Hi @yyjabiding, thanks for your interest in LLMLingua.

Although we haven't tested it, it seems possible. LLMLingua-2 forwards a BERT-level model chunk by chunk, so increasing the batch size could potentially reduce latency. You can check the implementation here: LLMLingua Prompt Compressor.
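The idea above can be sketched in a few lines. This is a hypothetical illustration of chunking and batching, not the actual LLMLingua-2 API: `chunk_text` and `batch_chunks` are made-up helper names, and the 512-token chunk size and batch size of 8 are assumed values chosen for the example. The point is that grouping chunks into larger batches reduces the number of forward passes through the BERT-level model, trading GPU memory for latency.

```python
# Hypothetical sketch (not the real LLMLingua-2 API): split a long
# context into fixed-size chunks, then group chunks into batches so
# each forward pass of the BERT-level classifier handles several
# chunks at once.

def chunk_text(tokens, chunk_size=512):
    """Split a token list into chunks of at most chunk_size tokens."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]

def batch_chunks(chunks, batch_size=8):
    """Group chunks into batches; a larger batch_size means fewer
    forward passes, at the cost of more GPU memory per pass."""
    return [chunks[i:i + batch_size] for i in range(0, len(chunks), batch_size)]

# For a ~100k-token context with 512-token chunks:
tokens = list(range(100_000))      # stand-in for real token IDs
chunks = chunk_text(tokens)
batches = batch_chunks(chunks)
print(len(chunks), len(batches))   # 196 chunks collapse into 25 batched passes
```

With these assumed sizes, a 100k-token context becomes 196 chunks; batching 8 at a time cuts 196 sequential forward passes down to 25, which is where most of the latency saving would come from.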
