
Expose vLLM logprobs in model output #3491

Open
CoolFish88 opened this issue Oct 1, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@CoolFish88

Description

vLLM's sampling parameters include a richer set of options than those currently exposed, among which logprobs is particularly useful.

When testing by adding the logprobs option to the request payload, the model output schema was unchanged ({"generated_text": "model_output"}), suggesting the option was not propagated to the output.

Will this change the current API? How?

Probably by enriching the output schema.
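
A minimal sketch of what an enriched output could look like (field names are illustrative only, not an existing schema):

    {
      "generated_text": "model_output",
      "logprobs": [
        {"token": "model", "logprob": -0.12},
        {"token": "_output", "logprob": -1.05}
      ]
    }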

Who will benefit from this enhancement?

Anyone who wants log probabilities returned alongside model predictions.

References

  • This thread provides a starting point for tackling this issue.
CoolFish88 added the enhancement (New feature or request) label on Oct 1, 2024
@frankfliu (Contributor)

@sindhuvahinis

@CoolFish88 (Author)

Found this while looking into CloudWatch logs:

The following parameters are not supported by vllm with rolling batch: {'max_tokens', 'seed', 'logprobs', 'temperature'}

@siddvenk (Contributor) commented Oct 2, 2024

What is the payload you are using to invoke the endpoint?

We do expose generation parameters that can be included in the inference request. Details are in https://docs.djl.ai/master/docs/serving/serving/docs/lmi/user_guides/lmi_input_output_schema.html.

We have slightly different names for some of the generation/sampling parameters, because our API unifies different inference backends such as vllm, tensorrt-llm, huggingface accelerate, and transformers-neuronx.

If you want to use a different API schema, we provide documentation on writing your own input/output parsers: https://docs.djl.ai/master/docs/serving/serving/docs/lmi/user_guides/lmi_input_output_schema.html#custom-pre-and-post-processing.

We also support the OpenAI chat completions schema for chat-type models: https://docs.djl.ai/master/docs/serving/serving/docs/lmi/user_guides/chat_input_output_schema.html.
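
As a rough illustration (a sketch only; verify the exact field names and supported parameters in the linked schema doc for your container version), the unified schema takes sampling options under "parameters", e.g. max_new_tokens rather than vLLM's max_tokens:

    {
      "inputs": "What is Deep Java Library?",
      "parameters": {
        "max_new_tokens": 64,
        "temperature": 0.7,
        "details": true
      }
    }

Based on the linked schema doc, setting details to true should return token-level information, including per-token log probabilities, in a details object alongside generated_text; the exact fields are described there.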
