From 561f646ded33d8a23fc44d258d3beb6658ea6f5a Mon Sep 17 00:00:00 2001
From: Zihao Ye <expye@outlook.com>
Date: Thu, 2 Jan 2025 22:47:46 -0800
Subject: [PATCH] misc: add bibtex reference (#712)

This pull request updates `README.md` with a new citation section. The key
addition is a BibTeX entry that users who find FlashInfer helpful in their
projects or research can use to cite the paper.

Documentation update:

* `README.md`: Added a new "Citation" section with a BibTeX entry for citing
the FlashInfer paper.
---
 README.md | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/README.md b/README.md
index 1f70dd38..a453b5f8 100644
--- a/README.md
+++ b/README.md
@@ -141,3 +141,40 @@ We are thrilled to share that FlashInfer is being adopted by many cutting-edge p
 ## Acknowledgement
 
 FlashInfer is inspired by [FlashAttention 1&2](https://github.com/dao-AILab/flash-attention/), [vLLM](https://github.com/vllm-project/vllm), [stream-K](https://arxiv.org/abs/2301.03598), [cutlass](https://github.com/nvidia/cutlass) and [AITemplate](https://github.com/facebookincubator/AITemplate) projects.
+
+## Citation
+
+If you find FlashInfer helpful in your project or research, please consider citing our [paper](https://arxiv.org/abs/2501.01005):
+
+```bibtex
+@article{ye2025flashinfer,
+    title = {FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving},
+    author = {
+      Ye, Zihao and
+      Chen, Lequn and
+      Lai, Ruihang and
+      Lin, Wuwei and
+      Zhang, Yineng and
+      Wang, Stephanie and
+      Chen, Tianqi and
+      Kasikci, Baris and
+      Grover, Vinod and
+      Krishnamurthy, Arvind and
+      Ceze, Luis
+    },
+    journal = {arXiv preprint arXiv:2501.01005},
+    year = {2025},
+    url = {https://arxiv.org/abs/2501.01005}
+}
+```
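+
+If you use LaTeX, a minimal sketch of citing the entry above (assuming it is saved in a `references.bib` file) looks like:
+
+```latex
+\documentclass{article}
+\begin{document}
+FlashInfer~\cite{ye2025flashinfer} is an efficient attention engine for LLM serving.
+\bibliographystyle{plain}
+\bibliography{references} % assumes the BibTeX entry above is stored in references.bib
+\end{document}
+```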