Add Customization Capabilities to Cache-Aware Models #14757
Conversation
Pull Request Overview
This PR refactors the cache-aware streaming script to unify its interface with other ASR scripts by replacing argparse with Hydra/OmegaConf configuration management and exposing decoding parameters for customization.
- Migrates from argparse to Hydra/OmegaConf configuration system
- Extracts common device and dtype selection utilities to shared module
- Adds support for customizable decoding parameters including language model and phrase boosting
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| nemo/collections/asr/parts/utils/transcribe_utils.py | Adds common device and dtype selection utilities |
| nemo/collections/asr/parts/submodules/tdt_beam_decoding.py | Adds NotImplementedError for unsupported partial hypotheses |
| nemo/collections/asr/parts/submodules/rnnt_beam_decoding.py | Adds NotImplementedError for unsupported partial hypotheses |
| examples/asr/transcribe_speech.py | Updates to use shared dtype selection utility |
| examples/asr/asr_chunked_inference/rnnt/speech_to_text_streaming_infer_rnnt.py | Updates to use shared device and dtype utilities |
| examples/asr/asr_cache_aware_streaming/speech_to_text_cache_aware_streaming_infer.py | Major refactor from argparse to Hydra configuration with dataclass config |
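For readers unfamiliar with the pattern the refactor adopts, here is a minimal sketch of a dataclass-based OmegaConf config with CLI overrides. The class and field names are illustrative assumptions, not the exact ones in the PR.

```python
from dataclasses import dataclass
from typing import List, Optional

from omegaconf import OmegaConf


@dataclass
class StreamingEvalConfig:
    """Illustrative config; field names are assumptions, not the PR's exact schema."""

    model_path: Optional[str] = None        # local .nemo checkpoint
    audio_file: Optional[str] = None        # single audio file to transcribe
    dataset_manifest: Optional[str] = None  # or a manifest of audio filepaths
    batch_size: int = 16
    att_context_size: Optional[List[int]] = None  # e.g. [70, 13] for multi-lookahead models
    compute_dtype: str = "float32"
    amp: bool = False


def main() -> None:
    # Merge structured defaults with CLI overrides such as
    #   model_path=model.nemo dataset_manifest=manifest.json att_context_size=[70,13]
    cfg = OmegaConf.merge(OmegaConf.structured(StreamingEvalConfig), OmegaConf.from_cli())
    print(OmegaConf.to_yaml(cfg))


if __name__ == "__main__":
    main()
```

This is the same override syntax used in the command examples quoted later in this review.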
## Multi-lookahead models
- For models which support multiple lookaheads, the default is the first one in the list of model.encoder.att_context_size. To change it, you may use --att_context_size, for example --att_context_size [70,1].
+ For models which support multiple lookaheads, the default is the first one in the list of model.encoder.att_context_size. To change it, you may use att_context_size, for example att_context_size=§[70,1].
Copilot AI · Sep 25, 2025
There is a typographical error: the '§' symbol before '[70,1]' should be removed.
- For models which support multiple lookaheads, the default is the first one in the list of model.encoder.att_context_size. To change it, you may use att_context_size, for example att_context_size=§[70,1].
+ For models which support multiple lookaheads, the default is the first one in the list of model.encoder.att_context_size. To change it, you may use att_context_size, for example att_context_size=[70,1].
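For reference, the corrected override would then be passed on the command line roughly like this (file paths are placeholders):

```
python speech_to_text_cache_aware_streaming_infer.py \
    model_path=model.nemo \
    audio_file=audio.wav \
    att_context_size=[70,1]
```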
processed_signal, processed_signal_length = streaming_buffer.get_all_audios()
processed_signal = processed_signal.to(compute_dtype)
Copilot AI · Sep 25, 2025
[nitpick] The processed_signal assignment on line 206 could be chained with line 205 to reduce redundant variable assignment: processed_signal = streaming_buffer.get_all_audios()[0].to(compute_dtype)
)
# keep_all_outputs needs to be True for the last step of streaming when model is trained with att_context_style=regular
# otherwise the last outputs would get dropped
chunk_audio = chunk_audio.to(compute_dtype)
Copilot AI · Sep 25, 2025
[nitpick] Similar to the offline case, this dtype conversion could be optimized by moving it inline or consolidating the conversion logic into a helper function if this pattern appears frequently.
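A hedged sketch of the helper this nitpick hints at, covering both the offline and the per-chunk cast sites quoted above. The function name and placement are assumptions for illustration, not part of the PR.

```python
import torch


def to_compute_dtype(signal: torch.Tensor, compute_dtype: torch.dtype) -> torch.Tensor:
    """Cast an audio/feature tensor to the configured compute dtype.

    Centralizing the cast keeps the offline (get_all_audios) and streaming
    (per-chunk) code paths consistent. Hypothetical helper, not in the PR.
    """
    return signal.to(compute_dtype)


# Usage at the two flagged call sites (sketch):
# processed_signal = to_compute_dtype(processed_signal, compute_dtype)
# chunk_audio = to_compute_dtype(chunk_audio, compute_dtype)
```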
LGTM. Neat work. Thanks
Shouldn't any CI/CD tests that rely on this script be updated? @artbataev
This is great work, thank you @artbataev! I tested it locally on my end and everything works well. Added minor comments and a few nitpicks, other than that LGTM
## To evaluate a model in cache-aware streaming mode on a single audio file:
python speech_to_text_streaming_infer.py \
need to change this to speech_to_text_cache_aware_streaming_infer.py
Fixed
## To evaluate a model in cache-aware streaming mode on a manifest file:
python speech_to_text_streaming_infer.py \
Same as above, change to speech_to_text_cache_aware_streaming_infer.py
Fixed
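For reference, both README examples would then invoke the renamed script; for instance, the manifest form might look roughly like this (arguments mirror others quoted in this review and may not match the final README exactly):

```
python speech_to_text_cache_aware_streaming_infer.py \
    model_path=model.nemo \
    dataset_manifest=manifest.json \
    batch_size=16 \
    compare_vs_offline=true
```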
compare_vs_offline=true \
amp=true \
debug_mode=true
Could we add an example here to show how to do word boosting and LM rescoring?
I added an example and also a link to the documentation. Documentation about word boosting will be updated shortly in #14800.
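For orientation, such an invocation might look roughly like the sketch below. The decoding-config keys are illustrative placeholders only and are not verified against the final script; see the linked documentation and #14800 for the exact names.

```
python speech_to_text_cache_aware_streaming_infer.py \
    model_path=model.nemo \
    dataset_manifest=manifest.json \
    rnnt_decoding.greedy.ngram_lm_model=lm.nemo \
    rnnt_decoding.greedy.ngram_lm_alpha=0.3 \
    rnnt_decoding.greedy.boosting_tree.key_phrases_file=phrases.txt
```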
allow_mps: bool = False  # allow to select MPS device (Apple Silicon M-series GPU)
amp: bool = False
amp_dtype: str = "float16"  # can be set to "float16" or "bfloat16" when using amp
compute_dtype: Optional[str] = (
Should we set compute_dtype to float32 by default? Currently compute_dtype defaults to None and amp to false, so if a user calls the inference script as follows:
python examples/asr/asr_cache_aware_streaming/speech_to_text_cache_aware_streaming_infer.py model_path=model.nemo dataset_manifest=manifest.json batch_size=16 att_context_size=[70,13]
they'll run into the NotImplementedError on line 310: "Compute dtype None is not yet supported for cache-aware models, use float32 instead".
Thanks, good catch! Fixed (and added comments).
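A minimal sketch of the resolution described above: default compute_dtype to float32 so plain invocations (no amp, no explicit dtype) avoid the NotImplementedError. The field names mirror the hunk, but the surrounding class is illustrative, not the PR's exact code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InferenceConfig:
    """Illustrative subset of the streaming-inference config discussed above."""

    allow_mps: bool = False     # allow selecting the MPS device (Apple Silicon M-series GPU)
    amp: bool = False
    amp_dtype: str = "float16"  # "float16" or "bfloat16" when amp is enabled
    # Default to float32 so that invocations without amp or an explicit dtype
    # do not hit the NotImplementedError mentioned in the comment above.
    compute_dtype: Optional[str] = "float32"
```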
@nithinraok I do not see tests for streaming inference scripts in …
[🤖]: Hi @artbataev 👋, We wanted to let you know that a CICD pipeline for this PR just finished successfully. So it might be time to merge this PR or get some approvals.
LGTM, thank you!
Yes, please add one for cache-aware as we are prioritizing cache-aware models.
I made a separate PR with tests, will finalize it shortly: #14823
Charlie Truong <[email protected]> * QWEN2.5-VL 7B FP8 Recipe (#14801) * QWEN2.5-VL FP8 Recipe Signed-off-by: Lifu Zhang <[email protected]> * Apply isort and black reformatting Signed-off-by: tomlifu <[email protected]> * add model configs Signed-off-by: Lifu Zhang <[email protected]> --------- Signed-off-by: Lifu Zhang <[email protected]> Signed-off-by: tomlifu <[email protected]> Co-authored-by: tomlifu <[email protected]> * disk space management: nemo install test (#14822) * Add Customization Capabilities to Cache-Aware Models (#14757) * Add Customization Capabilities to Cache-Aware Models Signed-off-by: Vladimir Bataev <[email protected]> * Unify params with other transcription scripts Signed-off-by: Vladimir Bataev <[email protected]> * Fix usage with manifests containing relative paths Signed-off-by: Vladimir Bataev <[email protected]> * Fix decoding config setup Signed-off-by: Vladimir Bataev <[email protected]> * Return back output_path Signed-off-by: Vladimir Bataev <[email protected]> * Raise not implemented error if batched beam search performed with partial hypotheses Signed-off-by: Vladimir Bataev <[email protected]> * Raise not implemented error if batched beam search in transducer performed with partial hypotheses Signed-off-by: Vladimir Bataev <[email protected]> * Fix after merge Signed-off-by: Vladimir Bataev <[email protected]> * Fix att_context_size param Signed-off-by: Vladimir Bataev <[email protected]> * Use optional for left_chunks Signed-off-by: Vladimir Bataev <[email protected]> * Apply isort and black reformatting Signed-off-by: artbataev <[email protected]> * Unify parameters with transcribe_speech Signed-off-by: Vladimir Bataev <[email protected]> * Fix docstring Signed-off-by: Vladimir Bataev <[email protected]> * Unify dtype selection Signed-off-by: Vladimir Bataev <[email protected]> * Fix unused variables Signed-off-by: Vladimir Bataev <[email protected]> * Enhance inline documentation. Set compute_dtype=float32 by default. 
Signed-off-by: Vladimir Bataev <[email protected]> --------- Signed-off-by: Vladimir Bataev <[email protected]> Signed-off-by: artbataev <[email protected]> Co-authored-by: artbataev <[email protected]> * Evo2 address rare over-masking in 1m context dataset (#14821) * Address problems where sometimes in 1m dataset there are very large masked segments Signed-off-by: John St John <[email protected]> * only flip the tag extra if the segment length is too long Signed-off-by: John St John <[email protected]> * Undo the change to the pre commit config Signed-off-by: John St John <[email protected]> * Add clarifying comments about the state flipping logic Signed-off-by: John St John <[email protected]> --------- Signed-off-by: John St John <[email protected]> * Update cherry-pick workflow to use version 0.63.0 (#14832) * Update cherry-pick workflow to use version 0.63.0 Signed-off-by: Pablo Garay <[email protected]> * Update cherry-pick workflow version tag Signed-off-by: Pablo Garay <[email protected]> --------- Signed-off-by: Pablo Garay <[email protected]> * docs: Removing automodel items (#14840) Signed-off-by: Andrew Schilling <[email protected]> * update docs per guidance (#14841) * Update changelog for `v2.4.1` (#14828) * beep boop: Update changelog Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Fix changelog for 2.4.1 Signed-off-by: Charlie Truong <[email protected]> --------- Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Charlie Truong <[email protected]> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Charlie Truong <[email protected]> * Fi…
Important
The "Update branch" button must only be pressed on very rare occasions. An outdated branch is never blocking the merge of a PR.
Please reach out to the automation team before pressing that button.
What does this PR do ?
Refactors cache-aware streaming script:
- Unify the interface with `speech_to_text_eval.py` and `speech_to_text_streaming_infer_rnnt.py` (using Hydra/OmegaConf instead of argparse).

Known issues: currently, cache-aware models do not support `torch.bfloat16`/`float16` directly, only via AMP. This should be fixed in the future.
Collection: [ASR]
Changelog
Usage
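With the switch to Hydra/OmegaConf, the script is driven by config overrides on the command line rather than argparse flags. A minimal invocation sketch, assuming transcribe-style option names (`pretrained_name`, `dataset_manifest`, `output_filename`, `compute_dtype`) carried over from the unified interface; the exact fields are defined by the script's config dataclass and may differ:

```bash
# Hypothetical sketch: option names are assumptions based on the unified
# transcription-script interface, not a confirmed CLI reference.
python examples/asr/asr_cache_aware_streaming/speech_to_text_cache_aware_streaming_infer.py \
    pretrained_name=<cache-aware streaming model> \
    dataset_manifest=<path to input manifest> \
    output_filename=<path to output manifest> \
    compute_dtype=float32
```

The placeholders above are illustrative; consult the script's config dataclass for the authoritative field names. `compute_dtype=float32` reflects the float32-only limitation noted in the known issues.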
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information