[Feature]: Add support for attention score output #11365

WoutDeRijck · 2024-12-20T08:50:58Z

🚀 The feature, motivation and pitch

Problem

vLLM currently doesn't provide access to attention scores during inference, which are essential for model analysis and interpretability research. #11862

Feature Request

Add the ability to retrieve attention scores during model inference, similar to HuggingFace's output_attentions=True parameter.

Motivation

Need to analyze token-level relationships in model outputs
Required for building visualization tools and debugging model behavior
Critical for research into attention mechanisms

Alternatives

No response

Additional context

No response

Before submitting a new issue...

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Dineshkumar-Anandan-ZS0367 · 2025-01-06T10:02:27Z

You are asking for output_attentions=True or return_cross_attentions=True in order to get coordinates, right?

These are only provided by vision encoder-decoder models or cross-encoder models.

Which model?

WoutDeRijck · 2025-01-09T14:20:28Z

I am not trying to get coordinates. I am using Llama-3.1-8b. Say I want to extract data from the input context: I then need the attention scores to visualize where the model is looking. (Purely text-based, no vision.)

These are of course also present in decoder-only models.
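For readers unfamiliar with the term, here is a minimal sketch of what "attention scores" means in a decoder-only model, assuming standard single-head scaled dot-product attention with a causal mask. The function name `causal_attention_scores` and the toy vectors are hypothetical and for illustration only; this is neither vLLM's nor HuggingFace's API, and real models compute these per layer and per head.

```python
import math

def causal_attention_scores(queries, keys):
    """Return row-wise softmax(QK^T / sqrt(d)) with a causal mask.

    queries, keys: lists of equal-length float vectors, one per token.
    Row i holds the weights token i places on tokens 0..i; later tokens get 0.
    """
    d = len(queries[0])
    scores = []
    for i, q in enumerate(queries):
        # Scaled dot products against the current and earlier tokens only.
        logits = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        # Pad masked (future) positions with zero weight.
        row = [e / total for e in exps] + [0.0] * (len(keys) - i - 1)
        scores.append(row)
    return scores

# Toy example with made-up 2-dimensional query/key vectors (assumed values).
tokens = ["The", "cat", "sat"]
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attn = causal_attention_scores(q, k)
for tok, row in zip(tokens, attn):
    print(tok, [round(w, 3) for w in row])
```

Each printed row is the distribution a visualization tool would render for that token: how much weight it places on itself and on every earlier token in the context.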

Dineshkumar-Anandan-ZS0367 · 2025-01-10T10:32:05Z

Apologies for the mistake. I have already integrated a score using the tensor logits. Thanks!

WoutDeRijck · 2025-01-12T10:32:27Z

I do not need the logits either. I need the attention scores.

HuiSiqi · 2025-01-15T07:37:06Z

Any update on this? I also need to visualize the attention scores of decoder-based models.

WoutDeRijck added the feature request label Dec 20, 2024
