Commits (23)
- 81b993d: Initial commit. (wangshangsam, Oct 7, 2025)
- 37f1f14: WIP (wangshangsam, Oct 9, 2025)
- d032513: [Automated Commit] Format Codebase (github-actions[bot], Oct 9, 2025)
- 41c94c4: misc (wangshangsam, Oct 23, 2025)
- 40e62bc: Merge branch 'wangshangsam/vlm-sut-prototype' of github.com:CentML/ml… (wangshangsam, Oct 23, 2025)
- a240d7c: adding pydantic_typer (wangshangsam, Oct 23, 2025)
- 990503c: offline WIP (wangshangsam, Oct 28, 2025)
- 0bc8773: [Automated Commit] Format Codebase (github-actions[bot], Oct 28, 2025)
- ed021c5: Merge branch 'master' into wangshangsam/vlm-sut-prototype (wangshangsam, Oct 28, 2025)
- 7a7c1bc: [Automated Commit] Format Codebase (github-actions[bot], Oct 28, 2025)
- ab7eeee: rename the notebook (wangshangsam, Oct 28, 2025)
- e4f5a7e: Merge branch 'wangshangsam/vlm-sut-prototype' of github.com:CentML/ml… (wangshangsam, Oct 28, 2025)
- 754207e: clean-up (wangshangsam, Nov 4, 2025)
- 83ccca4: [Automated Commit] Format Codebase (github-actions[bot], Nov 4, 2025)
- 36ba877: Downgrade from 3.13 to 3.12 (wangshangsam, Nov 4, 2025)
- a2176d2: [Automated Commit] Format Codebase (github-actions[bot], Nov 4, 2025)
- 126f945: send the response back to LoadGen one at a time (wangshangsam, Nov 5, 2025)
- b2400a0: Move the ownership of the AsyncOpenAI client into Task, and clean up … (wangshangsam, Nov 5, 2025)
- cdd0a4a: Merge branch 'wangshangsam/vlm-sut-prototype' of github.com:CentML/ml… (wangshangsam, Nov 5, 2025)
- 5ac23a5: [Automated Commit] Format Codebase (github-actions[bot], Nov 5, 2025)
- 0ff5f13: fixing typo (wangshangsam, Nov 5, 2025)
- 272e31d: Merge branch 'wangshangsam/vlm-sut-prototype' of github.com:CentML/ml… (wangshangsam, Nov 5, 2025)
- ecf95ed: [Automated Commit] Format Codebase (github-actions[bot], Nov 5, 2025)
3 changes: 3 additions & 0 deletions .gitignore
@@ -3,3 +3,6 @@ libmlperf_loadgen.a
__pycache__/
generated/
*.swp
*.egg-info/
*.so
.vscode/
170 changes: 170 additions & 0 deletions multimodal/vl2l/.gitignore
@@ -0,0 +1,170 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# IDE
.vscode

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# NPM
node_modules
package-lock.json
/package.json
130 changes: 130 additions & 0 deletions multimodal/vl2l/README.md
@@ -0,0 +1,130 @@
# Reference Implementation for the Vision-language-to-language (VL2L) Benchmark

## Quick Start

### Get the source code

Clone the MLPerf Inference repo via:

```bash
git clone --recurse-submodules https://github.com/mlcommons/inference.git mlperf-inference
```

Then enter the repo:

```bash
cd mlperf-inference/
```

### Create a Conda environment

Follow [this link](https://www.anaconda.com/docs/getting-started/miniconda/install#quickstart-install-instructions)
to install Miniconda on your host machine. Then create a new conda environment:

```bash
conda create -n mlperf-inf-mm-vl2l python=3.12
```
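
Activate the environment so that the following steps install into it:

```bash
conda activate mlperf-inf-mm-vl2l
```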

### Install LoadGen

Update `libstdc++` in the conda environment:

```bash
conda install -c conda-forge libstdcxx-ng
```

Install `absl-py` and `numpy`:

```bash
conda install absl-py numpy
```

Build and install LoadGen from source:

```bash
cd loadgen/
CFLAGS="-std=c++14 -O3" python -m pip install .
cd ../
```

Run a quick test to validate that LoadGen was installed correctly:

```bash
python loadgen/demos/token_metrics/py_demo_server.py
```

### Install the VL2L benchmarking CLI

For users, install `mlperf-inf-mm-vl2l` with:

```bash
pip install multimodal/vl2l/
```

For developers, install `mlperf-inf-mm-vl2l` and the development tools with:

```bash
pip install multimodal/vl2l/[dev]
```
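
> NOTE: On shells that expand square brackets (e.g., zsh), quote the path: `pip install 'multimodal/vl2l/[dev]'`.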

After installation, you can check the CLI flags that `mlperf-inf-mm-vl2l` can take with:

```bash
mlperf-inf-mm-vl2l --help
```

You can enable shell autocompletion for `mlperf-inf-mm-vl2l` with:

```bash
mlperf-inf-mm-vl2l --install-completion
```

> NOTE: Shell auto-completion will take effect once you restart the terminal.

### Start an inference endpoint on your local host machine with vLLM

Please refer to [this guide on how to launch vLLM for various Qwen3 VL MoE models](https://docs.vllm.ai/projects/recipes/en/latest/Qwen/Qwen3-VL.html).

```bash
# --gpus all: use all the GPUs on this host machine.
# -v: reuse the Hugging Face cache from your host machine.
# -p 8000:8000: this assumes the endpoint will use port 8000.
# --ipc=host: let the container access the host's IPC mechanisms (e.g., shared memory).
# You can also use the `:latest` image or a specific release instead of `:nightly`.
# --model: specifies the model for vLLM to deploy.
# --tensor-parallel-size 8: 8-way tensor-parallel inference across 8 GPUs.
# --limit-mm-per-prompt.video 0: the input requests will contain images only (i.e., no videos).
docker run --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:nightly \
  --model Qwen/Qwen3-VL-235B-A22B-Instruct \
  --tensor-parallel-size 8 \
  --limit-mm-per-prompt.video 0
```
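
Once the container is up, you can sanity-check that the endpoint is reachable (assuming port 8000, as configured above):

```bash
curl http://localhost:8000/v1/models
```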

### Run the benchmark for the Offline scenario

Performance only mode:

```bash
mlperf-inf-mm-vl2l --settings.scenario offline --settings.mode performance_only
```
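
> NOTE: LoadGen writes its results (e.g., `mlperf_log_summary.txt` and `mlperf_log_detail.txt`) to its configured output directory; check the summary file to confirm the run was valid.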

Accuracy only mode:

TBD

### Run the benchmark for the Server scenario

Performance only mode:

TBD

Accuracy only mode:

TBD

## Developer Guide

### Linting

You can lint the VL2L benchmark source code by running the following script:

```bash
bash multimodal/vl2l/scripts/linters.sh
```