Milvus GPU vector index performance falls short of expectations; tests do not reach the advertised speedup #33873
Replies: 4 comments 10 replies
-
Is the number of queries per request (nq) equal to 1?
-
Reducing the segment size can also improve performance.
-
A GPU is a high-latency, high-throughput device, so we first need to agree on the recall requirement. After that, the concurrency level has to go up: in our experience it needs to be raised to 64 or even higher.
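To make the concurrency point concrete, here is a minimal sketch of a load loop that reports QPS at several concurrency levels. `measure_qps` and `fake_search` are hypothetical names introduced for illustration; `fake_search` only sleeps to mimic a fixed per-request latency and is not a Milvus call. In a real test it would be replaced by the client's search call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_qps(search_fn, queries, concurrency):
    """Run search_fn over all queries using `concurrency` worker threads.

    Returns (qps, results); results keep the order of `queries`.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(search_fn, queries))
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed, results


def fake_search(query):
    # Hypothetical stand-in for a real search request: sleeps to mimic
    # a fixed per-request latency, then echoes the query back.
    time.sleep(0.01)
    return query


if __name__ == "__main__":
    queries = list(range(128))
    for concurrency in (1, 8, 64):
        qps, _ = measure_qps(fake_search, queries, concurrency)
        print(f"concurrency={concurrency:3d}  qps={qps:.0f}")
```

With a latency-bound backend, QPS in this sketch grows roughly with the number of in-flight requests, which is why a low-concurrency test can understate what a high-throughput device delivers.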
-
May we know more details about your use case? We'd be glad to offer more help with the setup.
-
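Test background
Milvus 2.4 supports several GPU index types intended to speed up search, particularly in scenarios that call for high throughput, low latency, and high recall. The supported GPU index types are:
● GPU_CAGRA
● GPU_IVF_FLAT
Test purpose
To verify the GPU acceleration published on the Milvus website for index build and index search, we test on the Cohere vector dataset (1M vectors, 768 dimensions). At the same recall level, we compare build time, QPS, and related metrics for CPU HNSW, CPU IVF_FLAT, GPU IVF_FLAT, and GPU CAGRA.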
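Comparing indexes "at the same recall" requires measuring recall the same way for every index. A minimal sketch of recall@k follows, assuming the ground-truth neighbor IDs for each query are available; the function name and the data layout (one list of IDs per query) are illustrative, not taken from the benchmark tool.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Average fraction of the true top-k neighbors present in the retrieved top-k.

    retrieved, ground_truth: one list of neighbor IDs per query.
    """
    if len(retrieved) != len(ground_truth):
        raise ValueError("retrieved and ground_truth must cover the same queries")
    if not retrieved or k <= 0:
        raise ValueError("need at least one query and k > 0")
    total = 0.0
    for got, truth in zip(retrieved, ground_truth):
        # Count how many of the true top-k IDs appear in the returned top-k.
        total += len(set(got[:k]) & set(truth[:k])) / k
    return total / len(retrieved)


if __name__ == "__main__":
    retrieved = [[1, 2, 3], [4, 5, 6]]
    truth = [[1, 2, 9], [4, 5, 6]]
    print(recall_at_k(retrieved, truth, k=3))  # (2/3 + 3/3) / 2
```

Search parameters such as ef, nprobe, or itopk_size would then be tuned per index until this value matches across the candidates before QPS is compared.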
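Test machines
Test environment
Milvus standalone (CPU), single-node deployment: 8 cores, 32 GB RAM
Milvus standalone (GPU), single-node deployment: 8 cores, 31 GB RAM, plus one T4 GPU
Both environments were brought up with docker compose up -d
Test tool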
ann benchmark
Test plan
1. Index build
Index search:
Multi-concurrency test
Test dataset: Cohere
● 1M vectors, 768 dimensions
● Milvus CPU HNSW index build parameters: M: 16 efConstruction: 100
● Milvus GPU CAGRA index build parameters: intermediate_graph_degree: 64 graph_degree: 32 build_algo: NN_DESCENT
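The build parameters above can be written as index parameter dictionaries of the kind the Python client accepts. This is a sketch only: the metric type is an assumption, since the post does not state which metric was used, and the variable names are illustrative.

```python
# Assumption: the post does not say which metric was used; "L2" is a placeholder.
ASSUMED_METRIC = "L2"

# CPU HNSW build parameters as listed in the test plan.
hnsw_index = {
    "index_type": "HNSW",
    "metric_type": ASSUMED_METRIC,
    "params": {"M": 16, "efConstruction": 100},
}

# GPU CAGRA build parameters as listed in the test plan.
cagra_index = {
    "index_type": "GPU_CAGRA",
    "metric_type": ASSUMED_METRIC,
    "params": {
        "intermediate_graph_degree": 64,
        "graph_degree": 32,
        "build_algo": "NN_DESCENT",
    },
}

if __name__ == "__main__":
    for name, cfg in (("HNSW", hnsw_index), ("GPU_CAGRA", cagra_index)):
        print(name, cfg["params"])
```

With pymilvus, a dictionary like this would typically be passed as `index_params` to `Collection.create_index` on the vector field; the field name depends on the collection schema and is not given in the post.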
CPU HNSW: M = 16, efConstruction = 100, ef = 400. Results:
CPU IVF_FLAT: nprobe: 32. Results:
GPU CAGRA: intermediate_graph_degree: 64, graph_degree: 32, build_algo: NN_DESCENT, itopk_size: 128, search_width: 16. Results:
GPU IVF_FLAT: nprobe: 32. Results:
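For reference, the search-time settings of the four configurations can be collected in one table so every index is queried through the same code path. This is a sketch: `SEARCH_PARAMS` and `search_param` are hypothetical helpers, and the default metric is an assumption because the post does not state it.

```python
# Search-time parameters per index type, as listed in the post.
SEARCH_PARAMS = {
    "HNSW": {"ef": 400},
    "IVF_FLAT": {"nprobe": 32},
    "GPU_CAGRA": {"itopk_size": 128, "search_width": 16},
    "GPU_IVF_FLAT": {"nprobe": 32},
}


def search_param(index_type, metric_type="L2"):
    """Build a search-parameter dict for one index type.

    metric_type defaults to "L2" as an assumption; the post does not state it.
    """
    if index_type not in SEARCH_PARAMS:
        raise KeyError(f"no search parameters recorded for {index_type}")
    # Copy the inner dict so callers cannot mutate the shared table.
    return {"metric_type": metric_type, "params": dict(SEARCH_PARAMS[index_type])}


if __name__ == "__main__":
    for index_type in SEARCH_PARAMS:
        print(index_type, search_param(index_type))
```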
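Test conclusions:
Index build speed in Milvus: GPU CAGRA > CPU HNSW, but the speedup of CAGRA over HNSW is very limited.
Looking at query QPS alone, GPU CAGRA > HNSW. In the current test data, GPU CAGRA delivers 2.5x the QPS of HNSW, which is still far below the speedup claimed in the promotional material.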