I have a context of about 100k tokens.
Are there any methods to speed up compressing it with LLMLingua-2 so it finishes in a short time, e.g. under 2 seconds?
Thanks.
Hi @yyjabiding, thanks for your interest in LLMLingua.
Although we haven't tested it, it seems possible. LLMLingua-2 forwards a BERT-level model chunk by chunk, so increasing the batch size could potentially reduce latency. You can check the implementation here: LLMLingua Prompt Compressor.
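To make the batching suggestion concrete, here is a minimal sketch (not the actual LLMLingua implementation) of why batch size matters when a long context is forwarded chunk by chunk: the number of model forward passes drops roughly in proportion to the batch size. The chunk size of 512 and the helper names below are illustrative assumptions, not values from the LLMLingua source.

```python
# Illustrative sketch: chunking a ~100k-token context and counting
# how many forward passes are needed at different batch sizes.
# `split_into_chunks` and `num_forward_passes` are hypothetical helpers.

def split_into_chunks(tokens, chunk_size):
    """Split a token list into fixed-size chunks (last chunk may be shorter)."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]

def num_forward_passes(n_chunks, batch_size):
    """Each forward pass processes `batch_size` chunks; ceiling division."""
    return -(-n_chunks // batch_size)

tokens = list(range(100_000))           # stand-in for a ~100k-token context
chunks = split_into_chunks(tokens, 512)
print(len(chunks))                       # 196 chunks
print(num_forward_passes(len(chunks), 1))   # 196 passes at batch size 1
print(num_forward_passes(len(chunks), 32))  # 7 passes at batch size 32
```

If each pass has fixed per-call overhead, fewer, larger batches should cut end-to-end latency, subject to GPU memory limits; the actual batch size would need to be adjusted in the compressor code linked above.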