Add PruLong to KVPress #2

@alessiodevoto

Hi! After reading your paper, we think PruLong would be a great addition to the NVIDIA/KVPress library! KVPress is a library for implementing and benchmarking KV cache compression methods. It provides standardized benchmarks and makes new methods easier for the broader community to access and adopt. We've already received contributions for many recent works (including DuoAttention), and would be happy to receive yours as well 🙂

If you are interested in contributing an implementation, you can take a look at this notebook, which shows some ways to implement a new compression method.

Please feel free to open a PR when convenient, and don't hesitate to reach out for any help or guidance!
