
Add vllm:request_max_num_generation_tokens metric #243

@mayabar

Description


vllm:request_max_num_generation_tokens - the maximum number of tokens a request is allowed to generate. This is the minimum of (max-model-len - prompt length) and max_tokens, when max_tokens is defined; otherwise it is just the remaining context window (max-model-len - prompt length).
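
A minimal sketch of how this value could be computed and recorded, assuming a prometheus_client Histogram; the function and parameter names here are illustrative, not the project's actual implementation:

```python
from typing import Optional

from prometheus_client import Histogram

# Hypothetical histogram for the proposed metric; real bucket boundaries
# would follow the project's existing metric conventions.
request_max_num_generation_tokens = Histogram(
    "vllm:request_max_num_generation_tokens",
    "Maximum number of tokens the request could generate.",
)

def observe_max_generation_tokens(max_model_len: int,
                                  prompt_len: int,
                                  max_tokens: Optional[int]) -> int:
    """Record min(max_model_len - prompt_len, max_tokens) for one request.

    If max_tokens is not set, the limit is just the remaining context window.
    """
    limit = max_model_len - prompt_len
    if max_tokens is not None:
        limit = min(limit, max_tokens)
    request_max_num_generation_tokens.observe(limit)
    return limit
```

For example, with max-model-len = 4096, a 1000-token prompt, and max_tokens = 512, the recorded value would be min(4096 - 1000, 512) = 512; without max_tokens it would be 3096.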
