Draft
72 commits
36c0dc1
plugin TRT EP init
chilo-ms Jun 23, 2025
ed65a9f
clean up GetCapabilityImpl and make it pass compiler for now
chilo-ms Jun 23, 2025
3269f73
Clean up CompileImpl
chilo-ms Jun 23, 2025
4da9f90
update ep factory
chilo-ms Jun 25, 2025
1928767
update ep factory
chilo-ms Jun 25, 2025
4f5ffcb
update ep factory
chilo-ms Jun 25, 2025
bc64bdc
clean up and add back onnx_ctx_model_helper.cc
chilo-ms Jun 25, 2025
c4437a2
clean up
chilo-ms Jun 25, 2025
a5a294e
remove onnxruntime namespace
chilo-ms Jun 25, 2025
f990a7b
update
chilo-ms Jun 25, 2025
7851a1c
Add TRTEpNodeComputeInfo
chilo-ms Jun 28, 2025
be453b1
add allocator and data transfer
chilo-ms Jul 2, 2025
3d6fa57
fix a lot of compile errors
chilo-ms Jul 2, 2025
c8e3d6f
call EpDevice_AddAllocatorInfo in GetSupportedDevicesImpl
chilo-ms Jul 2, 2025
3c43029
temporary way to get provider option without proper API
chilo-ms Jul 3, 2025
549b29d
Clean up cmake file to remove dependencies that built with ORT
chilo-ms Jul 7, 2025
3ad7736
Update CompileImpl
chilo-ms Jul 8, 2025
3ced4cf
add ort_graph_to_proto.h and leverage OrtGraphToProto utilities
chilo-ms Jul 8, 2025
081de36
update EP context model helper
chilo-ms Jul 10, 2025
75240a4
Convert onnxruntime::Status to OrtStatus
chilo-ms Jul 10, 2025
f73420f
remove unused files
chilo-ms Jul 10, 2025
938a3fe
use GetSessionOptionsConfigEntries to get provider options
chilo-ms Jul 10, 2025
731ed72
fix a bunch of compile errors
chilo-ms Jul 11, 2025
30e0f91
update memory info and data transfer in TRT EP's factor to accommodat…
chilo-ms Jul 14, 2025
f443a33
update cuda/pinned allocator to make compiler happy
chilo-ms Jul 14, 2025
95dd71e
add GetVersionImpl in factory
chilo-ms Jul 14, 2025
35b0cf1
update data transfer initialization in TRT EP
chilo-ms Jul 14, 2025
a65908f
Fix compile errors/issues
chilo-ms Jul 14, 2025
c77391f
fix to use correct API
chilo-ms Jul 14, 2025
c5363e6
fix bug for gpu data transfer implementation
chilo-ms Jul 15, 2025
09138ee
clean up
chilo-ms Jul 15, 2025
a8dde45
remove unnecessary files
chilo-ms Jul 15, 2025
b911754
Temporarily manually creates cudaStream to run
chilo-ms Jul 15, 2025
0c817ac
Temporary make plugin TRT links against the protobuf, onnx, flatbuffe…
chilo-ms Jul 15, 2025
da729f9
fix the issue of error LNK2038: mismatch detected for 'RuntimeLibrary…
chilo-ms Jul 15, 2025
6fd38c3
refactor memory info stored in factory
chilo-ms Jul 15, 2025
7467c65
update as onnxruntime_ep_c_api.h changes
chilo-ms Jul 16, 2025
da0f9c6
Add support for dump and run EP Context model
chilo-ms Jul 23, 2025
ccf20da
update and sync with latest ep c api
chilo-ms Jul 23, 2025
cca956d
remove delete resource in TRTEpDataTransfer::ReleaseImpl
chilo-ms Jul 24, 2025
404cd4e
update cmake file to force dynamic release CRT globally for all depen…
chilo-ms Jul 29, 2025
c58130b
use updated Value_GetMemoryDevice API
chilo-ms Aug 11, 2025
5828e10
update ort to graph util
chilo-ms Aug 11, 2025
832a7f4
Add EP API Stream support
chilo-ms Aug 11, 2025
edd4b34
Update CMakeLists.txt
chilo-ms Aug 11, 2025
5f46b68
fix mem leak for OrtAllocator
chilo-ms Aug 12, 2025
e81d395
add missing header file
chilo-ms Aug 19, 2025
1211cd6
fix build issue on Linux
chilo-ms Aug 20, 2025
0a8be0d
lintrunner -a
chilo-ms Aug 22, 2025
e4c2405
Update to use new API OpAttr_GetTensorAttributeAsOrtValue
chilo-ms Aug 29, 2025
2472a15
remove unnecessary files
chilo-ms Sep 8, 2025
ab8cd70
Add default logger for TRT logger
chilo-ms Sep 10, 2025
12d2306
Add default logger for TRT EP
chilo-ms Sep 10, 2025
c6ae7b6
update include path in utility function header
chilo-ms Sep 10, 2025
6b180a4
Add default logger for TRT EP (cont.)
chilo-ms Sep 10, 2025
b3ac797
put code under namespace trt_ep
chilo-ms Sep 16, 2025
632d224
remove unnecessary files
chilo-ms Sep 18, 2025
4d32867
update GetCapabilityImpl()
chilo-ms Sep 25, 2025
ae9686f
Add code for updating cache path for EPContext node
chilo-ms Sep 26, 2025
c8a6ae6
add onnx_external_data_bytestream support for refitting the engine
chilo-ms Sep 26, 2025
5f17a2b
address reviewer's comments
chilo-ms Sep 30, 2025
c103394
Add try/catch for c++ API that throws Ort::Exception
chilo-ms Sep 30, 2025
bd3899d
Set node_fusion_options.drop_constant_initializers to true for node_f…
chilo-ms Sep 30, 2025
6fd05ef
remove unused code
chilo-ms Sep 30, 2025
0b5c65c
add missing trt_ep namespace
chilo-ms Sep 30, 2025
c69dd60
remove the remaining commented code
chilo-ms Sep 30, 2025
5ab50ac
address reviewer's comments
chilo-ms Oct 6, 2025
486fa63
Add initial test infra
chilo-ms Oct 8, 2025
e745237
add script to build plugin TRT EP
chilo-ms Oct 13, 2025
86e6056
copy windows_tensorrt.yml from onnxruntime repo
chilo-ms Oct 13, 2025
340a7ce
Upate test
chilo-ms Oct 13, 2025
471fe0b
update tensorrt ep pipeline YAML to test
chilo-ms Oct 15, 2025
242 changes: 242 additions & 0 deletions .github/workflows/windows_tensorrt.yml
@@ -0,0 +1,242 @@
name: Windows GPU TensorRT CI Pipeline

on:
  push:
    branches:
      - main
  pull_request:

concurrency:
  group: ${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  build:
    name: Windows GPU TensorRT CI Pipeline
    runs-on: windows-2022
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0
          submodules: 'none'

      - uses: actions/setup-python@v6
        with:
          python-version: '3.12'
          architecture: x64

      - name: Download CUDA SDK v12.2
        working-directory: ${{ runner.temp }}
        run: |
          azcopy.exe cp --recursive "https://lotusscus.blob.core.windows.net/models/cuda_sdk/v12.2" .
          dir
        shell: pwsh

      - name: Download TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8
        run: 'azcopy.exe cp --recursive "https://lotusscus.blob.core.windows.net/models/local/TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8" ${{ runner.temp }}'
        shell: pwsh

      - name: Add CUDA to PATH
        shell: powershell
        run: |
          Write-Host "Adding CUDA to PATH"
          Write-Host "CUDA Path: $env:RUNNER_TEMP\v12.2\bin"
          Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\v12.2\bin"
          Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\v12.2\extras\CUPTI\lib64"
          Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8\lib"

      - uses: actions/setup-node@v5
        with:
          node-version: '20.x'

      - uses: actions/setup-java@v5
        with:
          distribution: 'temurin'
          java-version: '17'
          architecture: x64

      - uses: actions/cache@v4
        id: onnx-node-tests-cache
        with:
          path: ${{ github.workspace }}/js/test/
          key: onnxnodetests-${{ hashFiles('js/scripts/prepare-onnx-node-tests.ts') }}

      - name: API Documentation Check and generate
        run: |
          set ORT_DOXY_SRC=${{ github.workspace }}
          set ORT_DOXY_OUT=${{ runner.temp }}\build\RelWithDebInfo\RelWithDebInfo
          mkdir %ORT_DOXY_SRC%
          mkdir %ORT_DOXY_OUT%
          "C:\Program Files\doxygen\bin\doxygen.exe" ${{ github.workspace }}\tools\ci_build\github\Doxyfile_csharp.cfg
        working-directory: ${{ github.workspace }}
        shell: cmd

      - uses: actions/setup-dotnet@v5
        env:
          PROCESSOR_ARCHITECTURE: x64
        with:
          dotnet-version: '8.x'

      - name: Use Nuget 6.x
        uses: nuget/setup-nuget@v2
        with:
          nuget-version: '6.x'

      - name: NuGet restore
        run: nuget restore ${{ github.workspace }}\packages.config -ConfigFile ${{ github.workspace }}\NuGet.config -PackagesDirectory ${{ runner.temp }}\build\RelWithDebInfo
        shell: cmd

      - name: Set OnnxRuntimeBuildDirectory
        shell: pwsh
        run: |
          $buildDir = Join-Path ${{ runner.temp }} "build"
          echo "OnnxRuntimeBuildDirectory=$buildDir" >> $env:GITHUB_ENV

      - name: Build and Clean Binaries
        working-directory: ${{ runner.temp }}
        run: |
          npm install -g typescript
          if ($lastExitCode -ne 0) {
            exit $lastExitCode
          }
          # Execute the build process
          python ${{ github.workspace }}\tools\ci_build\build.py --config RelWithDebInfo --parallel --use_binskim_compliant_compile_flags --build_dir build --skip_submodule_sync --build_shared_lib --build --update --cmake_generator "Visual Studio 17 2022" --build_wheel --enable_onnx_tests --use_tensorrt --tensorrt_home="${{ runner.temp }}\TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8" --cuda_home="${{ runner.temp }}\v12.2" --use_vcpkg --use_vcpkg_ms_internal_asset_cache --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=86
          if ($lastExitCode -ne 0) {
            exit $lastExitCode
          }

          # Clean up the output directory before uploading artifacts
          $outputDir = "${{ runner.temp }}\build\RelWithDebInfo"
          Write-Host "Cleaning up files from $outputDir..."

          Remove-Item -Path "$outputDir\onnxruntime" -Recurse -Force -ErrorAction SilentlyContinue
          Remove-Item -Path "$outputDir\pybind11" -Recurse -Force -ErrorAction SilentlyContinue
          Remove-Item -Path "$outputDir\models" -Recurse -Force -ErrorAction SilentlyContinue
          Remove-Item -Path "$outputDir\vcpkg_installed" -Recurse -Force -ErrorAction SilentlyContinue
          Remove-Item -Path "$outputDir\_deps" -Recurse -Force -ErrorAction SilentlyContinue
          Remove-Item -Path "$outputDir\CMakeCache.txt" -Force -ErrorAction SilentlyContinue
          Remove-Item -Path "$outputDir\CMakeFiles" -Recurse -Force -ErrorAction SilentlyContinue
          # Remove intermediate object files as in the original script
          Remove-Item -Path $outputDir -Include "*.obj" -Recurse
        shell: pwsh

      - name: Upload build artifacts
        uses: actions/upload-artifact@v4
        with:
          name: build-artifacts
          path: ${{ runner.temp }}\build
    env:
      OrtPackageId: Microsoft.ML.OnnxRuntime.Gpu
      DOTNET_SKIP_FIRST_TIME_EXPERIENCE: true
      setVcvars: true
      ALLOW_RELEASED_ONNX_OPSET_ONLY: '0'
      DocUpdateNeeded: false
      ONNXRUNTIME_TEST_GPU_DEVICE_ID: '0'
      AZCOPY_AUTO_LOGIN_TYPE: MSI
      AZCOPY_MSI_CLIENT_ID: 63b63039-6328-442f-954b-5a64d124e5b4

  test:
Comment on lines +15 to +138

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 13 days ago

To fix the problem, add an explicit permissions block at the root of the workflow file (.github/workflows/windows_tensorrt.yml); it applies to all jobs unless overridden at the job level. Following the principle of least privilege, start with the minimum required: for most CI workflows that touch repository contents only to check out code, contents: read is usually sufficient. If actions in the workflow need additional permissions (for example, to create issues or pull requests), add those explicitly; based on the provided steps, only contents: read is needed. Insert the following block after the workflow name, before on:

permissions:
  contents: read

No imports or new package installations are required for this YAML change.

Suggested changeset 1
.github/workflows/windows_tensorrt.yml

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/windows_tensorrt.yml b/.github/workflows/windows_tensorrt.yml
--- a/.github/workflows/windows_tensorrt.yml
+++ b/.github/workflows/windows_tensorrt.yml
@@ -1,4 +1,6 @@
 name: Windows GPU TensorRT CI Pipeline
+permissions:
+  contents: read
 
 on:
   push:
EOF
    name: Windows GPU TensorRT CI Pipeline Test Job
    needs: build
    timeout-minutes: 300
    runs-on: ["self-hosted", "1ES.Pool=onnxruntime-github-Win2022-GPU-A10"]
    steps:
      - uses: actions/checkout@v5
        with:
          fetch-depth: 0
          submodules: 'none'

      - name: Download build artifacts
        uses: actions/download-artifact@v5
        with:
          name: build-artifacts
          path: ${{ runner.temp }}\build

      - uses: actions/setup-python@v6
        with:
          python-version: '3.12'
          architecture: x64

      - uses: actions/setup-node@v5
        with:
          node-version: '20.x'

      - uses: actions/setup-java@v5
        with:
          distribution: 'temurin'
          java-version: '17'
          architecture: x64

      - name: Locate vcvarsall and Setup Env
        uses: ./.github/actions/locate-vcvarsall-and-setup-env
        with:
          architecture: x64

      - name: Install python modules
        run: python -m pip install -r .\tools\ci_build\github\windows\python\requirements.txt
        working-directory: ${{ github.workspace }}
        shell: cmd

      - name: Download CUDA SDK v12.2
        working-directory: ${{ runner.temp }}
        run: |
          azcopy.exe cp --recursive "https://lotusscus.blob.core.windows.net/models/cuda_sdk/v12.2" .
          dir
        shell: pwsh

      - name: Download TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8
        run: 'azcopy.exe cp --recursive "https://lotusscus.blob.core.windows.net/models/local/TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8" ${{ runner.temp }}'
        shell: pwsh

      - name: Add CUDA to PATH
        shell: powershell
        run: |
          Write-Host "Adding CUDA to PATH"
          Write-Host "CUDA Path: $env:RUNNER_TEMP\v12.2\bin"
          Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\v12.2\bin"
          Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\v12.2\extras\CUPTI\lib64"
          Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8\lib"

      - name: Set OnnxRuntimeBuildDirectory
        shell: pwsh
        run: |
          $buildDir = Join-Path ${{ runner.temp }} "build"
          echo "OnnxRuntimeBuildDirectory=$buildDir" >> $env:GITHUB_ENV

      - name: Install ONNX Runtime Wheel
        uses: ./.github/actions/install-onnxruntime-wheel
        with:
          whl-directory: ${{ runner.temp }}\build\RelWithDebInfo\RelWithDebInfo\dist

      - name: Run Tests
        working-directory: ${{ runner.temp }}
        run: |
          npm install -g typescript
          if ($lastExitCode -ne 0) {
            exit $lastExitCode
          }

          python.exe ${{ github.workspace }}\tools\python\update_ctest_path.py "${{ runner.temp }}\build\RelWithDebInfo\CTestTestfile.cmake" "${{ runner.temp }}\build\RelWithDebInfo"
          if ($lastExitCode -ne 0) {
            exit $lastExitCode
          }

          python ${{ github.workspace }}\tools\ci_build\build.py --config RelWithDebInfo --parallel --use_binskim_compliant_compile_flags --build_dir build --skip_submodule_sync --build_shared_lib --test --cmake_generator "Visual Studio 17 2022" --build_wheel --enable_onnx_tests --use_tensorrt --tensorrt_home="${{ runner.temp }}\TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8" --cuda_home="${{ runner.temp }}\v12.2" --use_vcpkg --use_vcpkg_ms_internal_asset_cache --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=86
          if ($lastExitCode -ne 0) {
            exit $lastExitCode
          }
        shell: pwsh

      - name: Validate C# native delegates
        run: python tools\ValidateNativeDelegateAttributes.py
        working-directory: ${{ github.workspace }}\csharp
        shell: cmd
    env:
      OrtPackageId: Microsoft.ML.OnnxRuntime.Gpu
      DOTNET_SKIP_FIRST_TIME_EXPERIENCE: true
      setVcvars: true
      ALLOW_RELEASED_ONNX_OPSET_ONLY: '0'
      DocUpdateNeeded: false
      ONNXRUNTIME_TEST_GPU_DEVICE_ID: '0'
      AZCOPY_AUTO_LOGIN_TYPE: MSI
      AZCOPY_MSI_CLIENT_ID: 63b63039-6328-442f-954b-5a64d124e5b4
Comment on lines +139 to +242

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
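
For reviewers who want to reproduce this kind of warning locally before pushing, here is a minimal sketch of the check. It is not CodeQL's actual query: `jobs_missing_permissions` is a hypothetical helper, it uses only the Python standard library, and it assumes the workflow file uses conventional two-space YAML indentation (jobs at two spaces, job keys at four). It reports jobs that have no `permissions:` key when the workflow also has no top-level `permissions:` block.

```python
import re


def jobs_missing_permissions(workflow_text: str) -> list[str]:
    """Return names of jobs with no `permissions:` key.

    Hypothetical, line-based approximation (not a YAML parser): assumes
    two-space indentation and returns [] when a top-level block exists.
    """
    lines = workflow_text.splitlines()

    # A top-level `permissions:` key applies to every job in the workflow.
    if any(re.match(r"permissions\s*:", line) for line in lines):
        return []

    missing: list[str] = []
    current = None       # name of the job being scanned
    has_perm = False     # whether that job declared `permissions:`
    in_jobs = False      # whether we are inside the top-level `jobs:` map

    for line in lines:
        if re.match(r"\S", line):
            # Any unindented line starts a new top-level section.
            if current is not None and not has_perm:
                missing.append(current)
            current = None
            in_jobs = line.startswith("jobs:")
            continue
        if not in_jobs:
            continue
        job = re.match(r"  ([A-Za-z_][\w-]*)\s*:\s*$", line)
        if job:
            # A key at exactly two spaces under `jobs:` is a job id.
            if current is not None and not has_perm:
                missing.append(current)
            current, has_perm = job.group(1), False
        elif re.match(r"    permissions\s*:", line):
            # A key at exactly four spaces belongs to the current job.
            has_perm = True

    if current is not None and not has_perm:
        missing.append(current)
    return missing


SAMPLE = """name: CI
on:
  push:
jobs:
  build:
    runs-on: windows-2022
    steps:
      - uses: actions/checkout@v5
  test:
    needs: build
    permissions:
      contents: read
    steps:
      - run: echo hi
"""

if __name__ == "__main__":
    print(jobs_missing_permissions(SAMPLE))
```

Under these assumptions, running the helper on the properly indented windows_tensorrt.yml from this PR would be expected to flag both the build and test jobs, which lines up with the two CodeQL warnings; adding the root-level block from the Copilot Autofix suggestion would make it return an empty list.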