Commit 933cdea

[BugFix] Don’t compute reorder threshold when there are no attention groups (vllm-project#27861)
1 parent 3933f18 commit 933cdea

1 file changed: +5 -0 lines changed

vllm/v1/worker/gpu_model_runner.py

Lines changed: 5 additions & 0 deletions
@@ -4149,6 +4149,11 @@ def calculate_reorder_batch_threshold(self) -> None:
             group.get_metadata_builder().reorder_batch_threshold
             for group in self._attn_group_iterator()
         ]
+        # If there are no attention groups (attention-free model) or no backend
+        # reports a threshold, leave reordering disabled.
+        if len(reorder_batch_thresholds) == 0:
+            self.reorder_batch_threshold = None
+            return
         self.reorder_batch_threshold = reduce(min_none_high, reorder_batch_thresholds)
 
     def _find_compatible_block_sizes(
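
The guard matters because `functools.reduce` raises `TypeError` when called on an empty sequence with no initial value, which is what happens when the attention-group iterator yields nothing. The sketch below reproduces that behavior outside of vLLM. `min_none_high` is not shown in this diff, so the version here is an assumption inferred from its name (a minimum in which `None` ranks above every integer), and `combine_thresholds` and `reduce_without_guard_fails` are hypothetical helpers added only for illustration.

```python
from functools import reduce
from typing import Optional, Sequence


def min_none_high(a: Optional[int], b: Optional[int]) -> Optional[int]:
    # Assumed semantics: None is treated as "higher" than any integer,
    # so a concrete threshold always wins over None.
    if a is None:
        return b
    if b is None:
        return a
    return min(a, b)


def combine_thresholds(thresholds: Sequence[Optional[int]]) -> Optional[int]:
    # Hypothetical standalone version of the guarded logic in the diff:
    # with no thresholds at all, reordering stays disabled (None).
    if len(thresholds) == 0:
        return None
    return reduce(min_none_high, thresholds)


def reduce_without_guard_fails() -> bool:
    # Without the guard, reduce() over an empty sequence with no
    # initial value raises TypeError.
    try:
        reduce(min_none_high, [])
    except TypeError:
        return True
    return False
```

An alternative would be to pass `None` as the initial value to `reduce`, which yields the same result under the assumed `min_none_high` semantics; the explicit early return used in the commit makes the attention-free case easier to see when reading the method.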
