Skip to content

Conversation

@bdevorem
Copy link
Contributor

@bdevorem bdevorem commented Oct 17, 2025

First round in addressing #4334

Motivation

rocMLIR has implemented split kv and GQA, which enables us to implement flash decoding. Now we need to add this to the migraphx side.

Technical Details

This supports rewriting attention with a specified number of splits along the KV sequence length dimension, such that those chunks of the attention can be computed in parallel. Afterwards, the results are rescaled and combined. In practice, for migraphx, that rewriting looks like:

Q -> [B, M, k]
K -> [B, k, N]
V -> [B, N, D]

S = dot(Q, K)
P = softmax(S)
O = dot(P, V) # [B, M, D]

rewritten into

Q -> [B, G, M, k] 
K -> [B, G, k, N/G]
V -> [B, G, N/G, D]

# first kernel
S = dot(Q, K)
P = softmax(S, axis=-1)
L = LSE(S) # [B, G, M, 1]
O' = dot(P, V) # [B, G, M, D]

# second kernel
scale = softmax(L, axis=1) # [B, G, M, 1]
R = mul(O', broadcast(scale)) # [B, G, M, D]
O = sum(R, axis=1) # [B, 1, M, D]

This PR also supports multi-headed attention, i.e. starting with Batch and Heads dimensions, instead of just Batch.

I have added an env var MIGRAPHX_ENABLE_FLASH_DECODING where we can set explicitly the number of splits, e.g. MIGRAPHX_ENABLE_FLASH_DECODING=2, but also for testing, the splits can be directly set through the fuse_attention class.

Currently, this does not rewrite attention if it is LSE, nor if there is input fusion (which might not even happen at this stage, I've saved this problem for later). It derives the starting and rewritten shapes based on the gemm sizes, and then does the following:

  1. inserts reshapes/broadcasts for Q, K, V
  2. creates a new attention submodule with the new (reshaped) inputs
  3. rebuilds the original attention submodule with updated reduction operators
  4. inserts LSE at the end of the new submodule
  5. returns both the partial out and the LSE
  6. adds in the second kernel after unpacking the new submodule results
  7. reshapes the final output back to what is expected

The logic for rebuilding the submodule could use some fine-tuning.

Future work

    • if input fusion is allowed on attention, then need to handle this
    • if this needs to work for LSE attention, then need to handle this
    • split-kv case

Changelog Category

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

@bdevorem bdevorem self-assigned this Oct 17, 2025
@bdevorem bdevorem changed the title [DRAFT] flash decoding round 1 flash decoding round 1 Oct 20, 2025
@codecov
Copy link

codecov bot commented Oct 20, 2025

Codecov Report

❌ Patch coverage is 97.28261% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/fuse_attention.cpp 97.27% 5 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4393      +/-   ##
===========================================
+ Coverage    92.17%   92.21%   +0.04%     
===========================================
  Files          560      560              
  Lines        26609    26796     +187     
===========================================
+ Hits         24526    24708     +182     
- Misses        2083     2088       +5     
Files with missing lines Coverage Δ
src/include/migraphx/fuse_attention.hpp 100.00% <100.00%> (ø)
src/fuse_attention.cpp 97.45% <97.27%> (-0.37%) ⬇️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@bdevorem bdevorem changed the title flash decoding round 1 Flash decoding round 1 Oct 22, 2025
@bdevorem bdevorem marked this pull request as ready for review October 22, 2025 10:08
@bdevorem bdevorem requested review from a team and causten as code owners October 22, 2025 10:08
@bdevorem bdevorem requested a review from pfultz2 October 22, 2025 10:08
@migraphx-bot
Copy link
Collaborator

Test Batch Rate new
af612a
Rate old
0b9f80
Diff Compare
torchvision-resnet50 64 3,162.87 3,132.39 0.97%
torchvision-resnet50_fp16 64 6,548.94 6,526.99 0.34%
torchvision-densenet121 32 2,416.12 2,399.32 0.70%
torchvision-densenet121_fp16 32 4,045.90 4,025.31 0.51%
torchvision-inceptionv3 32 1,655.83 1,646.23 0.58%
torchvision-inceptionv3_fp16 32 2,503.34 2,497.02 0.25%
cadene-inceptionv4 16 791.24 789.54 0.21%
cadene-resnext64x4 16 804.45 791.55 1.63%
slim-mobilenet 64 8,179.52 8,104.05 0.93%
slim-nasnetalarge 64 221.85 218.39 1.58%
slim-resnet50v2 64 3,277.68 3,252.50 0.77%
bert-mrpc-onnx 8 1,138.60 1,137.74 0.08%
bert-mrpc-tf 1 493.31 472.44 4.42% 🔆
pytorch-examples-wlang-gru 1 332.42 334.54 -0.63%
pytorch-examples-wlang-lstm 1 466.74 456.66 2.21%
torchvision-resnet50_1 1 776.14 725.16 7.03% 🔆
cadene-dpn92_1 1 439.41 422.92 3.90% 🔆
cadene-resnext101_1 1 357.05 354.77 0.64%
onnx-taau-downsample 1 312.52 313.53 -0.32%
dlrm-criteoterabyte 1 31.94 31.53 1.31%
dlrm-criteoterabyte_fp16 1 51.23 51.20 0.06%
agentmodel 1 10,909.67 9,850.47 10.75% 🔆
unet_fp16 2 59.36 59.37 -0.02%
resnet50v1_fp16 1 1,011.72 986.79 2.53%
resnet50v1_int8 1 996.99 948.55 5.11% 🔆
bert_base_cased_fp16 64 1,110.18 1,109.24 0.08%
bert_large_uncased_fp16 32 344.24 344.17 0.02%
bert_large_fp16 1 197.56 197.47 0.05%
distilgpt2_fp16 16 2,101.42 2,100.80 0.03%
yolov5s 1 583.52 559.32 4.33% 🔆
tinyllama 1 43.80 43.84 -0.10%
vicuna-fastchat 1 45.23 45.22 0.03%
whisper-tiny-encoder 1 411.33 409.82 0.37%
whisper-tiny-decoder 1 414.86 402.69 3.02% 🔆
llama2_7b 1 19.09 19.09 -0.01%
qwen1.5-7b 1 23.41 23.41 0.01%
phi3-3.8b 1 26.59 26.61 -0.08%
mask-rcnn 1 12.23 12.29 -0.54%
llama3-8b 1 21.65 21.66 -0.07%
whisper-large-encoder 1 10.17 10.17 0.03%
whisper-large-decoder 1 101.07 100.90 0.17%
mistral-7b 1 23.67 23.66 0.04%
FLUX.1-schnell 1 761.17 720.58 5.63% 🔆
nan nan nan nan nan%

This build is not recommended to merge 🔴

@migraphx-bot
Copy link
Collaborator


     ✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

❌bert-mrpc-tf: ERROR - check error outputTraceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 295, in main
import tensorflow as tf
File "/usr/local/lib/python3.10/dist-packages/tensorflow/init.py", line 40, in
from tensorflow.python import pywrap_tensorflow as _pywrap_tensorflow # pylint: disable=unused-import
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 34, in
self_check.preload_check()
File "/usr/local/lib/python3.10/dist-packages/tensorflow/python/platform/self_check.py", line 63, in preload_check
from tensorflow.python.platform import _pywrap_cpu_feature_guard
ImportError: libamdhip64.so.6: cannot open shared object file: No such file or directory


     ✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

     ✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

     ✅ agentmodel: PASSED: MIGraphX meets tolerance

     ✅ unet: PASSED: MIGraphX meets tolerance

     ✅ resnet50v1: PASSED: MIGraphX meets tolerance

❌bert_base_cased_fp16: ERROR - check error output�[1;31m2025-10-22 18:37:38.054540615 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185711, index: 63, mask: {64, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064041216 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185740, index: 92, mask: {93, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068046483 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185734, index: 86, mask: {87, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068064718 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185720, index: 72, mask: {73, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068078123 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185722, index: 74, mask: {75, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064048330 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185730, index: 82, mask: {83, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064062807 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185735, index: 87, mask: {88, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064060513 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185729, index: 81, mask: {82, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064064911 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185748, index: 100, mask: {101, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064065712 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185741, index: 93, mask: {94, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064074780 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185750, index: 102, mask: {103, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064078356 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185742, index: 94, mask: {95, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064079989 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185737, index: 89, mask: {90, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064081843 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185747, index: 99, mask: {100, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064084047 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185736, index: 88, mask: {89, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064091321 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185749, index: 101, mask: {102, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068048727 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185712, index: 64, mask: {65, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068053026 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185715, index: 67, mask: {68, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068057604 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185728, index: 80, mask: {81, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068058666 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185724, index: 76, mask: {77, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068059468 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185727, index: 79, mask: {80, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068055190 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185716, index: 68, mask: {69, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068063676 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185723, index: 75, mask: {76, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068068395 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185719, index: 71, mask: {72, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068068865 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185718, index: 70, mask: {71, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068069797 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185755, index: 107, mask: {108, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068072102 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185754, index: 106, mask: {107, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068068495 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185725, index: 77, mask: {78, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068068685 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185731, index: 83, mask: {84, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068072592 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185751, index: 103, mask: {104, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068072973 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185745, index: 97, mask: {98, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068075538 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185738, index: 90, mask: {91, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064048300 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185744, index: 96, mask: {97, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068076730 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185753, index: 105, mask: {106, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068078153 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185713, index: 65, mask: {66, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068072633 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185714, index: 66, mask: {67, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068079365 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185743, index: 95, mask: {96, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068079345 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185746, index: 98, mask: {99, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068079395 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185721, index: 73, mask: {74, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068079305 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185717, index: 69, mask: {70, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068083894 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185732, index: 84, mask: {85, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.068084395 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185733, index: 85, mask: {86, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.080666894 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185774, index: 126, mask: {127, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.080958764 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185766, index: 118, mask: {119, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.080977730 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185773, index: 125, mask: {126, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.081373045 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185765, index: 117, mask: {118, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.081416206 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185772, index: 124, mask: {125, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.081482391 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185771, index: 123, mask: {124, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.081902132 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185770, index: 122, mask: {123, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082095967 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185769, index: 121, mask: {122, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082261288 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185768, index: 120, mask: {121, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082456526 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185764, index: 116, mask: {117, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082638349 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185761, index: 113, mask: {114, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082675879 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185762, index: 114, mask: {115, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082714282 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185757, index: 109, mask: {110, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082727176 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185758, index: 110, mask: {111, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082767342 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185759, index: 111, mask: {112, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082785747 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185760, index: 112, mask: {113, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082827816 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185756, index: 108, mask: {109, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.082978420 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185767, index: 119, mask: {120, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064055824 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185739, index: 91, mask: {92, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.064041286 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185726, index: 78, mask: {79, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.076033323 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185752, index: 104, mask: {105, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:38.096029983 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 185763, index: 115, mask: {116, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:37:39.464159051 [E:onnxruntime:, inference_session.cc:2544 operator()] Exception during initialization: /onnxruntime_src/onnxruntime/core/graph/graph_utils.cc:29 int onnxruntime::graph_utils::GetIndexFromName(const onnxruntime::Node&, const std::string&, bool) itr != node_args.end() was false. Attempting to get index by a name which does not exist:InsertedPrecisionFreeCast_onnx::Pow_1623for node: Mul_53/SimplifiedLayerNormFusion/
�[m
Traceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 278, in main
sess = ort.InferenceSession(model_name,
File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 485, in init
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 584, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime_src/onnxruntime/core/graph/graph_utils.cc:29 int onnxruntime::graph_utils::GetIndexFromName(const onnxruntime::Node&, const std::string&, bool) itr != node_args.end() was false. Attempting to get index by a name which does not exist:InsertedPrecisionFreeCast_onnx::Pow_1623for node: Mul_53/SimplifiedLayerNormFusion/


❌bert_large_uncased_fp16: ERROR - check error output�[1;31m2025-10-22 18:38:22.236084421 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192715, index: 81, mask: {82, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236084852 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192698, index: 64, mask: {65, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.238242848 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192697, index: 63, mask: {64, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236093048 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192701, index: 67, mask: {68, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236094420 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192704, index: 70, mask: {71, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.238362704 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192708, index: 74, mask: {75, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.238385016 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192714, index: 80, mask: {81, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.238393172 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192717, index: 83, mask: {84, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.238479074 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192721, index: 87, mask: {88, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.238564284 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192728, index: 94, mask: {95, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.238582118 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192735, index: 101, mask: {102, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239113930 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192747, index: 113, mask: {114, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239143015 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192720, index: 86, mask: {87, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239159375 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192742, index: 108, mask: {109, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239164996 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192713, index: 79, mask: {80, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236092246 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192760, index: 126, mask: {127, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236093599 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192738, index: 104, mask: {105, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236097877 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192756, index: 122, mask: {123, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236100762 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192750, index: 116, mask: {117, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236103808 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192736, index: 102, mask: {103, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236105591 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192741, index: 107, mask: {108, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236111122 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192749, index: 115, mask: {116, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236112124 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192757, index: 123, mask: {124, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236112544 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192743, index: 109, mask: {110, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236112815 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192729, index: 95, mask: {96, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236115209 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192744, index: 110, mask: {111, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236115340 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192748, index: 114, mask: {115, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236115280 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192702, index: 68, mask: {69, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236116742 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192722, index: 88, mask: {89, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239558357 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192753, index: 119, mask: {120, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239602641 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192725, index: 91, mask: {92, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239671570 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192739, index: 105, mask: {106, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239805062 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192712, index: 78, mask: {79, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239873451 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192731, index: 97, mask: {98, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239932362 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192745, index: 111, mask: {112, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239953241 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192711, index: 77, mask: {78, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239983769 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192752, index: 118, mask: {119, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239999078 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192723, index: 89, mask: {90, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.240036077 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192716, index: 82, mask: {83, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.240049362 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192710, index: 76, mask: {77, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236121231 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192705, index: 71, mask: {72, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236123505 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192758, index: 124, mask: {125, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236119287 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192726, index: 92, mask: {93, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236119337 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192754, index: 120, mask: {121, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236119357 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192740, index: 106, mask: {107, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236119317 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192719, index: 85, mask: {86, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236127713 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192703, index: 69, mask: {70, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236123094 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192718, index: 84, mask: {85, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236123124 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192732, index: 98, mask: {99, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236123064 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192746, index: 112, mask: {113, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236134085 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192700, index: 66, mask: {67, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236118025 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192706, index: 72, mask: {73, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236139726 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192755, index: 121, mask: {122, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236140036 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192707, index: 73, mask: {74, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236095973 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192737, index: 103, mask: {104, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239144437 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192734, index: 100, mask: {101, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236092206 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192751, index: 117, mask: {118, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236135327 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192699, index: 65, mask: {66, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236097606 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192724, index: 90, mask: {91, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239177830 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192759, index: 125, mask: {126, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236094160 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192727, index: 93, mask: {94, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.239246149 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192733, index: 99, mask: {100, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.240039684 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192709, index: 75, mask: {76, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:22.236118736 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 192730, index: 96, mask: {97, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:38:26.417173862 [E:onnxruntime:, inference_session.cc:2544 operator()] Exception during initialization: /onnxruntime_src/onnxruntime/core/graph/graph_utils.cc:29 int onnxruntime::graph_utils::GetIndexFromName(const onnxruntime::Node&, const std::string&, bool) itr != node_args.end() was false. Attempting to get index by a name which does not exist:InsertedPrecisionFreeCast_onnx::Pow_3183for node: Mul_53/SimplifiedLayerNormFusion/
�[m
Traceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 278, in main
sess = ort.InferenceSession(model_name,
File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 485, in init
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 584, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime_src/onnxruntime/core/graph/graph_utils.cc:29 int onnxruntime::graph_utils::GetIndexFromName(const onnxruntime::Node&, const std::string&, bool) itr != node_args.end() was false. Attempting to get index by a name which does not exist:InsertedPrecisionFreeCast_onnx::Pow_3183for node: Mul_53/SimplifiedLayerNormFusion/


     ✅ bert_large: PASSED: MIGraphX meets tolerance

     ✅ yolov5s: PASSED: MIGraphX meets tolerance

     ✅ tinyllama: PASSED: MIGraphX meets tolerance

     ✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

❌distilgpt2_fp16: ERROR - check error output�[1;31m2025-10-22 18:41:26.695011338 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208702, index: 64, mask: {65, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.704039870 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208701, index: 63, mask: {64, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708039196 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208706, index: 68, mask: {69, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708055918 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208713, index: 75, mask: {76, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708063883 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208718, index: 80, mask: {81, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.704058906 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208711, index: 73, mask: {74, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.704059638 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208712, index: 74, mask: {75, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.704060419 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208704, index: 66, mask: {67, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708039266 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208707, index: 69, mask: {70, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708040549 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208708, index: 70, mask: {71, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708041240 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208717, index: 79, mask: {80, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708042733 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208722, index: 84, mask: {85, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708054204 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208728, index: 90, mask: {91, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.712039334 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208731, index: 93, mask: {94, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708053784 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208727, index: 89, mask: {90, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708054024 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208715, index: 77, mask: {78, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708054735 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208719, index: 81, mask: {82, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708054705 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208726, index: 88, mask: {89, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708061939 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208725, index: 87, mask: {88, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708061909 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208729, index: 91, mask: {92, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708061879 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208714, index: 76, mask: {77, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708063752 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208724, index: 86, mask: {87, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708063722 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208720, index: 82, mask: {83, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.704044790 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208709, index: 71, mask: {72, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.708069974 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208716, index: 78, mask: {79, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.712039424 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208723, index: 85, mask: {86, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.712043272 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208721, index: 83, mask: {84, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.712044354 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208730, index: 92, mask: {93, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.704039920 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208703, index: 65, mask: {66, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.704044239 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208705, index: 67, mask: {68, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.720951898 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208746, index: 108, mask: {109, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721045033 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208758, index: 120, mask: {121, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721354005 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208733, index: 95, mask: {96, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721425480 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208738, index: 100, mask: {101, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721446129 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208744, index: 106, mask: {107, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721485533 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208749, index: 111, mask: {112, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721531620 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208755, index: 117, mask: {118, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721569031 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208761, index: 123, mask: {124, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721854889 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208736, index: 98, mask: {99, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721864988 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208742, index: 104, mask: {105, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721896337 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208748, index: 110, mask: {111, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721922887 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208754, index: 116, mask: {117, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721954908 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208760, index: 122, mask: {123, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722093889 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208759, index: 121, mask: {122, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722263689 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208741, index: 103, mask: {104, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722283627 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208747, index: 109, mask: {110, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722333671 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208753, index: 115, mask: {116, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722386561 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208735, index: 97, mask: {98, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722740598 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208734, index: 96, mask: {97, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722803837 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208739, index: 101, mask: {102, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722824867 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208751, index: 113, mask: {114, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722874791 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208757, index: 119, mask: {120, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722960102 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208764, index: 126, mask: {127, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.722965993 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208745, index: 107, mask: {108, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.723225281 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208756, index: 118, mask: {119, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.723259335 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208762, index: 124, mask: {125, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.723326352 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208732, index: 94, mask: {95, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.723374132 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208737, index: 99, mask: {100, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721081362 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208752, index: 114, mask: {115, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.723387487 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208743, index: 105, mask: {106, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.723406793 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208750, index: 112, mask: {113, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.723488217 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208740, index: 102, mask: {103, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.716033230 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208710, index: 72, mask: {73, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:26.721065582 [E:onnxruntime:Default, env.cc:226 ThreadMain] pthread_setaffinity_np failed for thread: 208763, index: 125, mask: {126, }, error code: 22 error msg: Invalid argument. Specify the number of threads explicitly so the affinity is not set.�[m
�[1;31m2025-10-22 18:41:27.579270294 [E:onnxruntime:, inference_session.cc:2544 operator()] Exception during initialization: /onnxruntime_src/onnxruntime/core/graph/graph_utils.cc:29 int onnxruntime::graph_utils::GetIndexFromName(const onnxruntime::Node&, const std::string&, bool) itr != node_args.end() was false. Attempting to get index by a name which does not exist:InsertedPrecisionFreeCast_onnx::Pow_1551for node: Mul_33/SimplifiedLayerNormFusion/
�[m
Traceback (most recent call last):
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 359, in
main()
File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 278, in main
sess = ort.InferenceSession(model_name,
File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 485, in init
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/usr/local/lib/python3.10/dist-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 584, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime_src/onnxruntime/core/graph/graph_utils.cc:29 int onnxruntime::graph_utils::GetIndexFromName(const onnxruntime::Node&, const std::string&, bool) itr != node_args.end() was false. Attempting to get index by a name which does not exist:InsertedPrecisionFreeCast_onnx::Pow_1551for node: Mul_33/SimplifiedLayerNormFusion/


     ✅ llama2_7b: PASSED: MIGraphX meets tolerance

     ✅ qwen1.5-7b: PASSED: MIGraphX meets tolerance

     ✅ phi3-3.8b: PASSED: MIGraphX meets tolerance

🔴mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output


     ✅ llama3-8b: PASSED: MIGraphX meets tolerance

     ✅ whisper-large-decoder: PASSED: MIGraphX meets tolerance

     ✅ mistral-7b: PASSED: MIGraphX meets tolerance

     ✅ FLUX.1-schnell: PASSED: MIGraphX meets tolerance


std::size_t get_num_splits()
{
std::string value = string_value_of(MIGRAPHX_ENABLE_FLASH_DECODING{}, "0");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value_of function can be used to get an integer.

auto expected_inputs = submod->get_inputs(map_param_to_main);
assert(expected_inputs == group_inputs and "Mapped inputs don't match group inputs");
return map_param_to_main;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use get_ins_param_map for this function.

std::unordered_map<instruction_ref, instruction_ref> map_old_to_new = param_map;
std::unordered_map<std::string, instruction_ref> softmax_parts;

for(auto it = source_mod.begin(); it != source_mod.end(); ++it)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use insert_instructions or add_instructions to copy the instructions over. It will take care of the mapping and traversal. There is an inserter function that can be used to change the operators as you insert them in.

auto partial_output_o_prime = map_old_to_new.at(orig_return_ins);

// calculate LSE = max(S) + log(sum(exp(S - max(S))))
assert(contains(softmax_parts, "max") and contains(softmax_parts, "sum_exp"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just do a find_if for reduce_max and then insert the LSE operators. Then we can apply CSE to remove the duplicate calculations with softmax. This should be simpler then finding all the parts of softmax.

map_old_params_to_new[v_param] = new_v_param;

// don't simply fuse previous attn submod, need to rebuild all the ops
rebuild_attention_submodule(m_flash_decode, *submod, map_old_params_to_new);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually you can use the fuse method here. The inserter function can be used to change the operators to handle the extra dimension, and the parameters will be inserted correctly based on the inputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah that's much better, thanks will do

std::optional<std::size_t> flash_decoding_num_splits = std::nullopt;

fuse_attention() = default;
explicit fuse_attention(std::size_t num_splits) : flash_decoding_num_splits(num_splits) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If its possible, I would remove the constructor here, so we can use the name to initialize as it can be clearer.

migraphx::run_passes(p, {migraphx::fuse_attention{}, migraphx::dead_code_elimination{}});
}

static void run_pass(migraphx::program& p, std::size_t num_splits)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for an extra overload. I would just add the pass as a default parameter:

static void run_pass(migraphx::program& p, migraphx::fuse_attention fa = {})
{
    migraphx::run_passes(p, {fa, migraphx::dead_code_elimination{}});
}

And then you can run this with run_pass(p1, {.flash_decoding_num_splits=2}).

{{"axes", {static_cast<int64_t>(new_input_shape.lens().size() - 1)}}});
}
// TODO make less reliant on ops around it
else if(op.name() == "multibroadcast")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We know which axis we inserted the groups so we can just insert that into the out_lens and that should work most of the time. For more robust solution is to either rewrite this to use 2 args version and then remove it afterwards or add an optional parameter to the inserter for the old_ins so we can find the second argument on the fly.

@causten causten requested a review from Copilot October 28, 2025 21:06
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds flash decoding support for attention fusion operations. Flash decoding is a technique to optimize attention computation by splitting the key-value sequence dimension into multiple groups and processing them in parallel.

Key Changes:

  • Added a configurable num_splits parameter to the fuse_attention pass to enable flash decoding
  • Implemented find_flash_decoding matcher that transforms fused attention operations into flash decoding variants
  • Added environment variable MIGRAPHX_ENABLE_FLASH_DECODING to control flash decoding at runtime

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.

File Description
src/include/migraphx/fuse_attention.hpp Added optional flash_decoding_num_splits member and constructor overload to accept num_splits parameter
src/fuse_attention.cpp Implemented flash decoding transformation logic with find_flash_decoding struct and environment variable handling
test/fuse_attention.cpp Added three test cases covering flash decoding scenarios: 4D shapes, 3D shapes, and disabled flash decoding
docs/reference/MIGraphX-dev-env-vars.rst Documented the new MIGRAPHX_ENABLE_FLASH_DECODING environment variable

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +367 to +370
assert(original_axes.front() ==
static_cast<int64_t>(ins->inputs().front()->get_shape().lens().size() -
1) or
original_axes.front() == -1);
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion condition at lines 367-370 is complex and difficult to understand. Consider extracting this logic into a helper function with a descriptive name like is_last_axis_reduction() or adding a comment explaining why these specific conditions validate the reduction axis.

Copilot uses AI. Check for mistakes.
Comment on lines +164 to +169
| Turns on flash decoding for attention fusion and sets the number of splits.
- | ``0``: flash decoding is turned off (i.e. number of splits is 0)
| Number of splits must be an integer.
| Default: flash decoding is turned off.
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation is incomplete. It only describes the disabled case (0) but doesn't explain what valid positive integer values do or provide examples (e.g., 2, 4, etc.). Consider adding: 'N (where N > 1): Enables flash decoding with N splits along the key-value sequence dimension.'

Suggested change
| Turns on flash decoding for attention fusion and sets the number of splits.
- | ``0``: flash decoding is turned off (i.e. number of splits is 0)
| Number of splits must be an integer.
| Default: flash decoding is turned off.
| Turns on flash decoding for attention fusion and sets the number of splits along the key-value sequence dimension.
- | ``0``: Flash decoding is turned off (i.e., number of splits is 0).
| ``N`` (where N > 1): Enables flash decoding with N splits along the key-value sequence dimension. For example, ``2`` enables flash decoding with 2 splits, ``4`` with 4 splits, etc.
| Default: Flash decoding is turned off.

Copilot uses AI. Check for mistakes.
size_t g = groups;

// TODO: do we wanna support this differently?
assert(n % g == 0 and "N must be divisible by G");
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion message uses generic variable names 'N' and 'G' that may not be immediately clear to users. Consider using more descriptive terms like 'Key-value sequence length must be divisible by number of splits' to improve debuggability.

Suggested change
assert(n % g == 0 and "N must be divisible by G");
assert(n % g == 0 and "Key-value sequence length must be divisible by number of splits/groups");

Copilot uses AI. Check for mistakes.
// Default behavior: read from the env var (for non-test use)
num_splits = get_num_splits();
}
if(num_splits > 1)
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition num_splits > 1 means flash decoding is skipped when num_splits equals 1, but this case (splitting into 1 group) could still be valid and shouldn't silently skip the transformation. Consider changing to num_splits > 0 and adding logic to handle the num_splits == 1 case appropriately, or document why 1 is treated as disabled.

Suggested change
if(num_splits > 1)
// Apply flash decoding transformation when num_splits > 0.
// This includes the case where num_splits == 1 (i.e., no actual split),
// which may be a valid configuration.
if(num_splits > 0)

Copilot uses AI. Check for mistakes.
if(it->name() == "dot")
gemms.push_back(it);
}
assert(gemms.size() == 2 and "Expected exactly 2 gemm operations in attention submodule");
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion message could be more helpful by indicating what was actually found. Consider: 'Expected exactly 2 gemm operations in attention submodule, found ' + std::to_string(gemms.size())

Copilot uses AI. Check for mistakes.
op.from_value(
{{"axes", {static_cast<int64_t>(new_input_shape.lens().size() - 1)}}});
}
// TODO make less reliant on ops around it
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This TODO comment is vague and doesn't provide actionable context. Consider expanding it to explain what specific dependency needs to be removed and why, e.g., 'TODO: Eliminate dependency on parent operator type by computing broadcast shape directly from tensor dimensions instead of relying on 'sub'/'div' sibling inspection.'

Suggested change
// TODO make less reliant on ops around it
// TODO: Eliminate dependency on parent 'sub'/'div' operator by computing broadcast target shape directly from tensor dimensions, rather than inspecting sibling inputs. This will make the code more robust to graph structure changes.

Copilot uses AI. Check for mistakes.
auto attn_group_ins = r.instructions["group"];
auto* submod = attn_group_ins->module_inputs().front();

// TODO: for this pass of flash decoding, if LSE attn, do not do flash decoding
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This TODO comment is unclear. What does 'LSE attn' mean? Consider clarifying: 'TODO: Skip flash decoding if the attention module already returns LSE (log-sum-exp) values, as this indicates the attention is already in a format compatible with flash decoding.'

Suggested change
// TODO: for this pass of flash decoding, if LSE attn, do not do flash decoding
// TODO: Skip flash decoding if the attention module already returns LSE (log-sum-exp) values,
// as this indicates the attention is already in a format compatible with flash decoding.

Copilot uses AI. Check for mistakes.
// get gemm1 and gemm2
auto [gemm1, gemm2] = get_gemms(submod);

// TODO: for this first pass of flash decoding, assuming no input fusion / not suppporting
Copy link

Copilot AI Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling of 'suppporting' to 'supporting'.

Suggested change
// TODO: for this first pass of flash decoding, assuming no input fusion / not suppporting
// TODO: for this first pass of flash decoding, assuming no input fusion / not supporting

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants