# [tools] ReadME update and fixes #2819
Pull Request Overview
This PR updates the preprocessing for text reranker models and improves documentation across several tools. The main purpose is to simplify the text processing pipeline for Qwen3 models while maintaining backward compatibility and enhance README documentation.
- Streamlined preprocessing logic for Qwen3 reranker models by removing conditional architecture handling
- Added comprehensive documentation for reranker and embedding pipelines in the who_what_benchmark tool
- Enhanced llm_bench README with detailed use cases and parameter explanations
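For context on the preprocessing being simplified: Qwen3 reranker models expect each query/document pair to be wrapped in an instruction-style prompt before tokenization, as described in the model card linked in the diff. Below is a minimal sketch of that kind of prompt formatting — the template strings and function name are assumptions modeled on the Qwen3-Reranker model card, not code copied from this PR:

```python
def format_rerank_prompt(
    query: str,
    document: str,
    instruction: str = "Given a web search query, retrieve relevant passages that answer the query",
) -> str:
    """Build an instruction-style prompt for one query/document pair.

    The template is an assumption based on the Qwen3-Reranker model card;
    the PR's actual preprocessing may differ in detail.
    """
    return (
        f"<Instruct>: {instruction}\n"
        f"<Query>: {query}\n"
        f"<Document>: {document}"
    )

pairs = [
    ("What is OpenVINO?",
     "OpenVINO is an open-source toolkit for optimizing and deploying inference."),
]
prompts = [format_rerank_prompt(q, d) for q, d in pairs]
print(prompts[0])
```

The formatted prompts would then be tokenized in a single code path, which is what removing the conditional architecture handling amounts to.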
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| tools/who_what_benchmark/whowhatbench/reranking_evaluator.py | Simplified Qwen3 model preprocessing by removing conditional causal LM architecture handling |
| tools/who_what_benchmark/README.md | Added documentation for text reranking and text embedding model comparison workflows |
| tools/llm_bench/task/text_reranker.py | Simplified tokenization logic for Qwen3 models, removing conditional preprocessing |
| tools/llm_bench/benchmark.py | Fixed help text typos for command-line arguments |
| tools/llm_bench/README.md | Comprehensive documentation update with detailed use cases and parameter descriptions |
```python
# post/pre processing for qwen models added according to transformers Qwen3-Embedding-0.6B model card:
# https://huggingface.co/Qwen/Qwen3-Reranker-0.6B#transformers-usage
if model.config.model_type == "qwen3":
    print("NEW WAY")
```

**Copilot AI** (Oct 10, 2025): Debug print statement should be removed from production code.

Suggested change: delete the `print("NEW WAY")` line.
```sh
# Export model to OpenVINO
optimum-cli export openvino -m BAAI/bge-reranker-v2-m3 bge-reranker-v2-m3 --task text-classification

# Collect the references and save the mappling in the .csv file.
```

**Copilot AI** (Oct 10, 2025): Corrected spelling of 'mappling' to 'mapping'.
```sh
# Reference images will be stored in the "reference" subfolder under the same path with .csv.
wwb --base-model BAAI/bge-small-en-v1.5 --gt-data embed_test/gt.csv --model-type text-embedding --embeds_pooling mean --embeds_normalize --embeds_padding_side "left" --hf
# Compute the metric
# Target images will be stored in the "target" subfolder under the same path with .csv.
```

**Copilot AI** (Oct 10, 2025): This comment mentions 'images' but should refer to 'references' or 'data' since this is for text embedding models, not image processing.

Suggested change:

```sh
# Reference data will be stored in the "reference" subfolder under the same path with .csv.
wwb --base-model BAAI/bge-small-en-v1.5 --gt-data embed_test/gt.csv --model-type text-embedding --embeds_pooling mean --embeds_normalize --embeds_padding_side "left" --hf
# Compute the metric
# Target data will be stored in the "target" subfolder under the same path with .csv.
```
**Copilot AI** (Oct 10, 2025), on the same snippet: This comment mentions 'images' but should refer to 'targets' or 'data' since this is for text embedding models, not image processing.
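The `--embeds_pooling mean` and `--embeds_normalize` flags in the snippet above correspond to standard embedding post-processing: average the token vectors over non-padding positions, then scale the result to unit L2 norm. A rough, self-contained sketch of those two steps (plain Python, independent of how wwb actually implements them):

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors where attention_mask == 1 (mean pooling)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:
            count += 1
            for i, v in enumerate(vec):
                total[i] += v
    return [v / count for v in total]

def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]  # last position is padding
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)   # averages only the two unmasked tokens
embedding = l2_normalize(pooled)
print(embedding)
```

`--embeds_padding_side "left"` only changes where the padding positions sit in the mask; the pooling itself is unchanged.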
```sh
# prompt lookup decoding
python benchmark.py -m models/llama-2-7b-chat/ -p "What is openvino?" -n 2 --task text_gen --max_ngram_siz 3 --num_assistant_tokens 5
```

**Copilot AI** (Oct 10, 2025): Corrected spelling of 'max_ngram_siz' to 'max_ngram_size'.

Suggested change:

```sh
python benchmark.py -m models/llama-2-7b-chat/ -p "What is openvino?" -n 2 --task text_gen --max_ngram_size 3 --num_assistant_tokens 5
```
```sh
# load speaker embeddings
wget https://huggingface.co/datasets/Xenova/cmu-arctic-xvectors-extracted/resolve/main/cmu_us_awb_arctic-wav-arctic_a0001.bin
# run benchmark.py
python benchmark.py -m models/speecht5_tts/ -p "Hello OpenVINO GenAI" -n 2 --task speech_to_text --speaker_embeddings ./cmu_us_awb_arctic-wav-arctic_a0001.bin
```

**Copilot AI** (Oct 10, 2025): Incorrect task type: should be 'text_to_speech' instead of 'speech_to_text' for a TTS model.

Suggested change:

```sh
python benchmark.py -m models/speecht5_tts/ -p "Hello OpenVINO GenAI" -n 2 --task text_to_speech --speaker_embeddings ./cmu_us_awb_arctic-wav-arctic_a0001.bin
```
```sh
# load audio
wget https://storage.openvinotoolkit.org/models_contrib/speech/2021.2/librispeech_s5/how_are_you_doing_today.wav
# run benchmark.py
python benchmark.py -m models/whisper-base/ -p ./how_are_you_doing_today.wav -n 2 --task text_to_speech
```

**Copilot AI** (Oct 10, 2025): Incorrect task type: should be 'speech_to_text' instead of 'text_to_speech' for a Whisper STT model.

Suggested change:

```sh
python benchmark.py -m models/whisper-base/ -p ./how_are_you_doing_today.wav -n 2 --task speech_to_text
```
## 8. Memory constipation mode
Enables memory usage information collection mode. This mode is affect of execution time, so it is not recommended to run memory consumption and performance benchmarking at the same time. Effect on performance can be reduced by specifying a longer --memory_consumption_delay, but the impact is still expected.

```sh
# run benchmark.py in memory constipation mode
```

**Copilot AI** (Oct 10, 2025): Corrected spelling of 'constipation' to 'consumption'.

Suggested change:

> ## 8. Memory consumption mode
> Enables memory usage information collection mode. This mode affects execution time, so it is not recommended to run memory consumption and performance benchmarking at the same time. The effect on performance can be reduced by specifying a longer `--memory_consumption_delay`, but some impact is still expected.
> ```sh
> # run benchmark.py in memory consumption mode
> ```
**Copilot AI** (Oct 10, 2025), on the same snippet: Corrected spelling of 'constipation' to 'consumption'.

Suggested change:

```sh
# run benchmark.py in memory consumption mode
```
**Parameters:**
- `-mc, --memory_consumption`: Enables memory usage information collection mode. If the value is 1, output the maximum memory consumption in warm-up iterations. If the value is 2, output the maximum memory consumption in all iterations.
- `--memory_consumption_delay`: Delay for the memory consumption check in seconds; a smaller value gives a more precise measurement but may affect performance.
- `-mc_dir, --memory_consumption_dir`: Path to store memory consamption logs and chart.

**Copilot AI** (Oct 10, 2025): Corrected spelling of 'consamption' to 'consumption'.

Suggested change:

> - `-mc_dir, --memory_consumption_dir`: Path to store memory consumption logs and chart.
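As an illustration of the trade-off `--memory_consumption_delay` describes, here is a generic sketch of interval-based peak-memory sampling. This is not llm_bench's implementation — it is a stand-alone example using the stdlib `resource` module, whose `ru_maxrss` is available on Unix (and reported in kilobytes on Linux, bytes on macOS):

```python
import resource
import threading
import time

class PeakMemorySampler:
    """Poll the process's max RSS every `delay` seconds in a background thread.

    A longer delay means fewer interruptions (smaller performance impact)
    but a coarser view of memory behavior -- the trade-off described for
    --memory_consumption_delay. Hypothetical helper, not llm_bench code.
    """

    def __init__(self, delay: float = 0.1):
        self.delay = delay
        self.peak = 0
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            self.peak = max(self.peak, rss)
            time.sleep(self.delay)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()

with PeakMemorySampler(delay=0.05) as sampler:
    data = [bytearray(1024) for _ in range(10_000)]  # allocate roughly 10 MB
    time.sleep(0.2)                                  # let the sampler observe it

print(f"peak RSS: {sampler.peak} (units are platform-dependent)")
```

llm_bench additionally distinguishes warm-up iterations from the full run (`-mc 1` vs `-mc 2`); the sketch only shows the sampling loop itself.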
Description
- Simplify the preprocessing step for the text reranker pipeline with Qwen3 models (both the previous and the current approaches work)
- Update the wwb README with reranker/embedding pipeline documentation
- Add use cases to the llm_bench README

Checklist: