@wangshangsam (Contributor) commented Oct 28, 2025

This is the first PR towards the VLM reference implementation for the v6.0 round.
This PR currently supports the Offline scenario in performance-only mode. The Server scenario and accuracy mode will be introduced in subsequent PRs.
The issue_query implementation adopts the purely asyncio-based design from the DSR1 reference implementation, but the code here is simpler, mostly because we only access the inference endpoint through OpenAI APIs.
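
For readers unfamiliar with this pattern, here is a minimal sketch of how an asyncio-based issue_query might be wired up. It is not the code in this PR: `BackgroundLoop`, `fake_inference` and `on_complete` are hypothetical names, and `fake_inference` merely stands in for an awaitable call to an OpenAI-compatible endpoint. The idea is to own an event loop in a dedicated thread, schedule one coroutine per sample from the calling thread, and report each completion from inside the coroutine.

```python
import asyncio
import threading


class BackgroundLoop:
    """Hypothetical helper: runs an asyncio event loop in a dedicated thread."""

    def __init__(self):
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self.thread.start()

    def submit(self, coro):
        # Thread-safe scheduling; returns a concurrent.futures.Future.
        return asyncio.run_coroutine_threadsafe(coro, self.loop)

    def stop(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self.thread.join()
        self.loop.close()


async def fake_inference(sample_id):
    # Assumption: stand-in for an awaitable request to the inference endpoint.
    await asyncio.sleep(0.01)
    return f"response-{sample_id}"


def issue_query(bg, sample_ids, on_complete):
    """Schedule one request per sample without blocking the caller."""

    async def run_one(sample_id):
        result = await fake_inference(sample_id)
        # Runs on the event loop thread once the response is available.
        on_complete(sample_id, result)

    return [bg.submit(run_one(sample_id)) for sample_id in sample_ids]


if __name__ == "__main__":
    bg = BackgroundLoop()
    responses = {}
    for future in issue_query(bg, [0, 1, 2], responses.__setitem__):
        future.result(timeout=5)
    bg.stop()
    print(responses)
```

Because `on_complete` is called inside the coroutine, each returned future resolves only after its completion has been reported, so waiting on the futures is enough to know every sample was handled.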

github-actions bot (Contributor) commented Oct 28, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@wangshangsam wangshangsam marked this pull request as ready for review November 4, 2025 08:31
@wangshangsam wangshangsam requested a review from a team as a code owner November 4, 2025 08:31
@wangshangsam wangshangsam changed the title VLM reference implementation [VLM] Offline scenario, performance-only mode for the reference implementation Nov 4, 2025
@wangshangsam wangshangsam changed the title [VLM] Offline scenario, performance-only mode for the reference implementation [VLM] Offline scenario, performance-only mode of the reference implementation Nov 4, 2025
Comment on lines +105 to +112

min_duration: Annotated[
    timedelta,
    Field(
        description="The minimum testing duration.",
    ),
] = timedelta(seconds=5)

Can we change this to float? It is currently a timedelta. If I try to enter
--settings.min_duration 60, I get:

Invalid value for _pydantic_settings_min_duration: Input should be a valid timedelta, "day" identifier in duration not correctly formatted

Contributor Author

Hmmm, this is not supposed to happen. The string format that this flag accepts is defined in https://docs.pydantic.dev/2.0/usage/types/datetime/:

timedelta fields will accept values of type:

  • str; the following formats are accepted:
    [-][DD ][HH:MM]SS[.ffffff]
    [±]P[DD]DT[HH]H[MM]M[SS]S (ISO 8601 format for timedelta)

I see, so the input will be something like: --settings.min_duration 00:01:00

Contributor Author

00:01:00 works, but what I was trying to say is that the flag is also supposed to accept a single integer (in which case it would mean "seconds").

When I entered 60, it threw the error above.
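
One possible workaround, sketched here under assumptions rather than taken from this PR: normalize bare numbers to seconds before the value reaches timedelta validation. `coerce_bare_seconds` is a hypothetical helper that uses only the standard library. It could presumably be attached to the field as a "before" validator, but that wiring is not shown and has not been verified against this code base.

```python
from datetime import timedelta


def coerce_bare_seconds(value):
    """Hypothetical helper: interpret a bare number as a duration in seconds.

    Numeric values and numeric strings such as "60" become a timedelta.
    Anything else (for example "00:01:00" or an ISO 8601 duration) is
    returned unchanged so the regular timedelta parsing can handle it.
    """
    if isinstance(value, bool):
        # bool is a subclass of int; do not treat True/False as seconds.
        return value
    if isinstance(value, (int, float)):
        return timedelta(seconds=value)
    if isinstance(value, str):
        try:
            return timedelta(seconds=float(value.strip()))
        except (ValueError, OverflowError):
            # Not a plain number: leave it for the downstream parser.
            return value
    return value


if __name__ == "__main__":
    print(coerce_bare_seconds("60"))
    print(coerce_bare_seconds("00:01:00"))
```

With this in place, --settings.min_duration 60 and --settings.min_duration 00:01:00 would both end up as a one-minute duration, assuming the second form is parsed downstream as usual.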

Comment on lines +121 to +122
settings.use_token_latencies = True
return settings

Here, when we set settings.use_token_latencies = True, the constraints become:

  /// Token latency parameters
  uint64_t server_ttft_latency = 100000000;
  uint64_t server_tpot_latency = 100000000;

If we set it to False, the constraint is:

  /// \brief The latency constraint for the Server scenario.
  uint64_t server_target_latency_ns = 100000000;

We may need to add more flags to let the user choose the constraints and their values.
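
To make the suggestion concrete, here is a rough sketch of what such flags could look like. `LatencyFlags`, `to_ns` and the `ttft` / `tpot` / `target_latency` fields are hypothetical names, and the 100 ms defaults simply mirror the values quoted above. The settings object is assumed to expose the attributes named in this thread; a `SimpleNamespace` stands in for it in the usage example.

```python
from dataclasses import dataclass
from datetime import timedelta
from types import SimpleNamespace


def to_ns(duration: timedelta) -> int:
    """Convert a timedelta to whole nanoseconds (timedelta resolution is 1 us)."""
    return (duration // timedelta(microseconds=1)) * 1000


@dataclass
class LatencyFlags:
    """Hypothetical user-facing flags for the Server latency constraints."""

    use_token_latencies: bool = True
    ttft: timedelta = timedelta(milliseconds=100)
    tpot: timedelta = timedelta(milliseconds=100)
    target_latency: timedelta = timedelta(milliseconds=100)

    def apply(self, settings):
        # Only set the constraints that are relevant for the chosen mode.
        settings.use_token_latencies = self.use_token_latencies
        if self.use_token_latencies:
            settings.server_ttft_latency = to_ns(self.ttft)
            settings.server_tpot_latency = to_ns(self.tpot)
        else:
            settings.server_target_latency_ns = to_ns(self.target_latency)
        return settings


if __name__ == "__main__":
    # SimpleNamespace is a stand-in for the real settings object.
    print(LatencyFlags(ttft=timedelta(seconds=2)).apply(SimpleNamespace()))
```

Exposing the durations as timedelta values keeps them consistent with min_duration, and the conversion to nanoseconds happens in one place.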

…the client, event loop and the event loop thread