[Plugin TRT EP] Add pipelines to build and test plugin TRT EP #540
Draft: chilo-ms wants to merge 72 commits into main from chi/plugin_trt_ep_test
72 commits
36c0dc1 plugin TRT EP init (chilo-ms)
ed65a9f clean up GetCapabilityImpl and make it pass compiler for now (chilo-ms)
3269f73 Clean up CompileImpl (chilo-ms)
4da9f90 update ep factory (chilo-ms)
1928767 update ep factory (chilo-ms)
4f5ffcb update ep factory (chilo-ms)
bc64bdc clean up and add back onnx_ctx_model_helper.cc (chilo-ms)
c4437a2 clean up (chilo-ms)
a5a294e remove onnxruntime namespace (chilo-ms)
f990a7b update (chilo-ms)
7851a1c Add TRTEpNodeComputeInfo (chilo-ms)
be453b1 add allocator and data transfer (chilo-ms)
3d6fa57 fix a lot of compile errors (chilo-ms)
c8e3d6f call EpDevice_AddAllocatorInfo in GetSupportedDevicesImpl (chilo-ms)
3c43029 temporary way to get provider option without proper API (chilo-ms)
549b29d Clean up cmake file to remove dependencies that built with ORT (chilo-ms)
3ad7736 Update CompileImpl (chilo-ms)
3ced4cf add ort_graph_to_proto.h and leverage OrtGraphToProto utilities (chilo-ms)
081de36 update EP context model helper (chilo-ms)
75240a4 Convert onnxruntime::Status to OrtStatus (chilo-ms)
f73420f remove unused files (chilo-ms)
938a3fe use GetSessionOptionsConfigEntries to get provider options (chilo-ms)
731ed72 fix a bunch of compile errors (chilo-ms)
30e0f91 update memory info and data transfer in TRT EP's factor to accommodat… (chilo-ms)
f443a33 update cuda/pinned allocator to make compiler happy (chilo-ms)
95dd71e add GetVersionImpl in factory (chilo-ms)
35b0cf1 update data transfer initialization in TRT EP (chilo-ms)
a65908f Fix compile errors/issues (chilo-ms)
c77391f fix to use correct API (chilo-ms)
c5363e6 fix bug for gpu data transfer implementation (chilo-ms)
09138ee clean up (chilo-ms)
a8dde45 remove unnecessary files (chilo-ms)
b911754 Temporarily manually creates cudaStream to run (chilo-ms)
0c817ac Temporary make plugin TRT links against the protobuf, onnx, flatbuffe… (chilo-ms)
da729f9 fix the issue of error LNK2038: mismatch detected for 'RuntimeLibrary… (chilo-ms)
6fd38c3 refactor memory info stored in factory (chilo-ms)
7467c65 update as onnxruntime_ep_c_api.h changes (chilo-ms)
da0f9c6 Add support for dump and run EP Context model (chilo-ms)
ccf20da update and sync with latest ep c api (chilo-ms)
cca956d remove delete resource in TRTEpDataTransfer::ReleaseImpl (chilo-ms)
404cd4e update cmake file to force dynamic release CRT globally for all depen… (chilo-ms)
c58130b use updated Value_GetMemoryDevice API (chilo-ms)
5828e10 update ort to graph util (chilo-ms)
832a7f4 Add EP API Stream support (chilo-ms)
edd4b34 Update CMakeLists.txt (chilo-ms)
5f46b68 fix mem leak for OrtAllocator (chilo-ms)
e81d395 add missing header file (chilo-ms)
1211cd6 fix build issue on Linux (chilo-ms)
0a8be0d lintrunner -a (chilo-ms)
e4c2405 Update to use new API OpAttr_GetTensorAttributeAsOrtValue (chilo-ms)
2472a15 remove unnecessary files (chilo-ms)
ab8cd70 Add default logger for TRT logger (chilo-ms)
12d2306 Add default logger for TRT EP (chilo-ms)
c6ae7b6 update include path in utility function header (chilo-ms)
6b180a4 Add default logger for TRT EP (cont.) (chilo-ms)
b3ac797 put code under namespace trt_ep (chilo-ms)
632d224 remove unnecessary files (chilo-ms)
4d32867 update GetCapabilityImpl() (chilo-ms)
ae9686f Add code for updating cache path for EPContext node (chilo-ms)
c8a6ae6 add onnx_external_data_bytestream support for refitting the engine (chilo-ms)
5f17a2b address reviewer's comments (chilo-ms)
c103394 Add try/catch for c++ API that throws Ort::Exception (chilo-ms)
bd3899d Set node_fusion_options.drop_constant_initializers to true for node_f… (chilo-ms)
6fd05ef remove unused code (chilo-ms)
0b5c65c add missing trt_ep namespace (chilo-ms)
c69dd60 remove the remaining commented code (chilo-ms)
5ab50ac address reviewer's comments (chilo-ms)
486fa63 Add initial test infra (chilo-ms)
e745237 add script to build plugin TRT EP (chilo-ms)
86e6056 copy windows_tensorrt.yml from onnxruntime repo (chilo-ms)
340a7ce Upate test (chilo-ms)
471fe0b update tensorrt ep pipeline YAML to test (chilo-ms)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,242 @@ | ||
| name: Windows GPU TensorRT CI Pipeline | ||
|
|
||
| on: | ||
| push: | ||
| branches: | ||
| - main | ||
| pull_request: | ||
|
|
||
| concurrency: | ||
| group: ${{ github.head_ref || github.run_id }} | ||
| cancel-in-progress: true | ||
|
|
||
| jobs: | ||
| build: | ||
| name: Windows GPU TensorRT CI Pipeline | ||
| runs-on: windows-2022 | ||
| steps: | ||
| - uses: actions/checkout@v5 | ||
| with: | ||
| fetch-depth: 0 | ||
| submodules: 'none' | ||
|
|
||
| - uses: actions/setup-python@v6 | ||
| with: | ||
| python-version: '3.12' | ||
| architecture: x64 | ||
|
|
||
| - name: Download CUDA SDK v12.2 | ||
| working-directory: ${{ runner.temp }} | ||
| run: | | ||
| azcopy.exe cp --recursive "https://lotusscus.blob.core.windows.net/models/cuda_sdk/v12.2" . | ||
| dir | ||
| shell: pwsh | ||
|
|
||
| - name: Download TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8 | ||
| run: 'azcopy.exe cp --recursive "https://lotusscus.blob.core.windows.net/models/local/TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8" ${{ runner.temp }}' | ||
| shell: pwsh | ||
|
|
||
| - name: Add CUDA to PATH | ||
| shell: powershell | ||
| run: | | ||
| Write-Host "Adding CUDA to PATH" | ||
| Write-Host "CUDA Path: $env:RUNNER_TEMP\v12.2\bin" | ||
| Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\v12.2\bin" | ||
| Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\v12.2\extras\CUPTI\lib64" | ||
| Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8\lib" | ||
|
|
||
| - uses: actions/setup-node@v5 | ||
| with: | ||
| node-version: '20.x' | ||
|
|
||
| - uses: actions/setup-java@v5 | ||
| with: | ||
| distribution: 'temurin' | ||
| java-version: '17' | ||
| architecture: x64 | ||
|
|
||
| - uses: actions/cache@v4 | ||
| id: onnx-node-tests-cache | ||
| with: | ||
| path: ${{ github.workspace }}/js/test/ | ||
| key: onnxnodetests-${{ hashFiles('js/scripts/prepare-onnx-node-tests.ts') }} | ||
|
|
||
| - name: API Documentation Check and generate | ||
| run: | | ||
| set ORT_DOXY_SRC=${{ github.workspace }} | ||
| set ORT_DOXY_OUT=${{ runner.temp }}\build\RelWithDebInfo\RelWithDebInfo | ||
| mkdir %ORT_DOXY_SRC% | ||
| mkdir %ORT_DOXY_OUT% | ||
| "C:\Program Files\doxygen\bin\doxygen.exe" ${{ github.workspace }}\tools\ci_build\github\Doxyfile_csharp.cfg | ||
| working-directory: ${{ github.workspace }} | ||
| shell: cmd | ||
|
|
||
| - uses: actions/setup-dotnet@v5 | ||
| env: | ||
| PROCESSOR_ARCHITECTURE: x64 | ||
| with: | ||
| dotnet-version: '8.x' | ||
|
|
||
| - name: Use Nuget 6.x | ||
| uses: nuget/setup-nuget@v2 | ||
| with: | ||
| nuget-version: '6.x' | ||
|
|
||
| - name: NuGet restore | ||
| run: nuget restore ${{ github.workspace }}\packages.config -ConfigFile ${{ github.workspace }}\NuGet.config -PackagesDirectory ${{ runner.temp }}\build\RelWithDebInfo | ||
| shell: cmd | ||
|
|
||
| - name: Set OnnxRuntimeBuildDirectory | ||
| shell: pwsh | ||
| run: | | ||
| $buildDir = Join-Path ${{ runner.temp }} "build" | ||
| echo "OnnxRuntimeBuildDirectory=$buildDir" >> $env:GITHUB_ENV | ||
|
|
||
| - name: Build and Clean Binaries | ||
| working-directory: ${{ runner.temp }} | ||
| run: | | ||
| npm install -g typescript | ||
| if ($lastExitCode -ne 0) { | ||
| exit $lastExitCode | ||
| } | ||
| # Execute the build process | ||
| python ${{ github.workspace }}\tools\ci_build\build.py --config RelWithDebInfo --parallel --use_binskim_compliant_compile_flags --build_dir build --skip_submodule_sync --build_shared_lib --build --update --cmake_generator "Visual Studio 17 2022" --build_wheel --enable_onnx_tests --use_tensorrt --tensorrt_home="${{ runner.temp }}\TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8" --cuda_home="${{ runner.temp }}\v12.2" --use_vcpkg --use_vcpkg_ms_internal_asset_cache --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=86 | ||
| if ($lastExitCode -ne 0) { | ||
| exit $lastExitCode | ||
| } | ||
|
|
||
| # Clean up the output directory before uploading artifacts | ||
| $outputDir = "${{ runner.temp }}\build\RelWithDebInfo" | ||
| Write-Host "Cleaning up files from $outputDir..." | ||
|
|
||
| Remove-Item -Path "$outputDir\onnxruntime" -Recurse -Force -ErrorAction SilentlyContinue | ||
| Remove-Item -Path "$outputDir\pybind11" -Recurse -Force -ErrorAction SilentlyContinue | ||
| Remove-Item -Path "$outputDir\models" -Recurse -Force -ErrorAction SilentlyContinue | ||
| Remove-Item -Path "$outputDir\vcpkg_installed" -Recurse -Force -ErrorAction SilentlyContinue | ||
| Remove-Item -Path "$outputDir\_deps" -Recurse -Force -ErrorAction SilentlyContinue | ||
| Remove-Item -Path "$outputDir\CMakeCache.txt" -Force -ErrorAction SilentlyContinue | ||
| Remove-Item -Path "$outputDir\CMakeFiles" -Recurse -Force -ErrorAction SilentlyContinue | ||
| # Remove intermediate object files as in the original script | ||
| Remove-Item -Path $outputDir -Include "*.obj" -Recurse | ||
| shell: pwsh | ||
|
|
||
| - name: Upload build artifacts | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: build-artifacts | ||
| path: ${{ runner.temp }}\build | ||
| env: | ||
| OrtPackageId: Microsoft.ML.OnnxRuntime.Gpu | ||
| DOTNET_SKIP_FIRST_TIME_EXPERIENCE: true | ||
| setVcvars: true | ||
| ALLOW_RELEASED_ONNX_OPSET_ONLY: '0' | ||
| DocUpdateNeeded: false | ||
| ONNXRUNTIME_TEST_GPU_DEVICE_ID: '0' | ||
| AZCOPY_AUTO_LOGIN_TYPE: MSI | ||
| AZCOPY_MSI_CLIENT_ID: 63b63039-6328-442f-954b-5a64d124e5b4 | ||
|
|
||
| test: | ||
| name: Windows GPU TensorRT CI Pipeline Test Job | ||
| needs: build | ||
| timeout-minutes: 300 | ||
| runs-on: ["self-hosted", "1ES.Pool=onnxruntime-github-Win2022-GPU-A10"] | ||
| steps: | ||
| - uses: actions/checkout@v5 | ||
| with: | ||
| fetch-depth: 0 | ||
| submodules: 'none' | ||
|
|
||
| - name: Download build artifacts | ||
| uses: actions/download-artifact@v5 | ||
| with: | ||
| name: build-artifacts | ||
| path: ${{ runner.temp }}\build | ||
|
|
||
| - uses: actions/setup-python@v6 | ||
| with: | ||
| python-version: '3.12' | ||
| architecture: x64 | ||
|
|
||
| - uses: actions/setup-node@v5 | ||
| with: | ||
| node-version: '20.x' | ||
|
|
||
| - uses: actions/setup-java@v5 | ||
| with: | ||
| distribution: 'temurin' | ||
| java-version: '17' | ||
| architecture: x64 | ||
|
|
||
| - name: Locate vcvarsall and Setup Env | ||
| uses: ./.github/actions/locate-vcvarsall-and-setup-env | ||
| with: | ||
| architecture: x64 | ||
|
|
||
| - name: Install python modules | ||
| run: python -m pip install -r .\tools\ci_build\github\windows\python\requirements.txt | ||
| working-directory: ${{ github.workspace }} | ||
| shell: cmd | ||
|
|
||
| - name: Download CUDA SDK v12.2 | ||
| working-directory: ${{ runner.temp }} | ||
| run: | | ||
| azcopy.exe cp --recursive "https://lotusscus.blob.core.windows.net/models/cuda_sdk/v12.2" . | ||
| dir | ||
| shell: pwsh | ||
|
|
||
| - name: Download TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8 | ||
| run: 'azcopy.exe cp --recursive "https://lotusscus.blob.core.windows.net/models/local/TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8" ${{ runner.temp }}' | ||
| shell: pwsh | ||
|
|
||
| - name: Add CUDA to PATH | ||
| shell: powershell | ||
| run: | | ||
| Write-Host "Adding CUDA to PATH" | ||
| Write-Host "CUDA Path: $env:RUNNER_TEMP\v12.2\bin" | ||
| Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\v12.2\bin" | ||
| Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\v12.2\extras\CUPTI\lib64" | ||
| Add-Content -Path $env:GITHUB_PATH -Value "$env:RUNNER_TEMP\TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8\lib" | ||
|
|
||
| - name: Set OnnxRuntimeBuildDirectory | ||
| shell: pwsh | ||
| run: | | ||
| $buildDir = Join-Path ${{ runner.temp }} "build" | ||
| echo "OnnxRuntimeBuildDirectory=$buildDir" >> $env:GITHUB_ENV | ||
|
|
||
| - name: Install ONNX Runtime Wheel | ||
| uses: ./.github/actions/install-onnxruntime-wheel | ||
| with: | ||
| whl-directory: ${{ runner.temp }}\build\RelWithDebInfo\RelWithDebInfo\dist | ||
|
|
||
| - name: Run Tests | ||
| working-directory: ${{ runner.temp }} | ||
| run: | | ||
| npm install -g typescript | ||
| if ($lastExitCode -ne 0) { | ||
| exit $lastExitCode | ||
| } | ||
|
|
||
| python.exe ${{ github.workspace }}\tools\python\update_ctest_path.py "${{ runner.temp }}\build\RelWithDebInfo\CTestTestfile.cmake" "${{ runner.temp }}\build\RelWithDebInfo" | ||
| if ($lastExitCode -ne 0) { | ||
| exit $lastExitCode | ||
| } | ||
|
|
||
| python ${{ github.workspace }}\tools\ci_build\build.py --config RelWithDebInfo --parallel --use_binskim_compliant_compile_flags --build_dir build --skip_submodule_sync --build_shared_lib --test --cmake_generator "Visual Studio 17 2022" --build_wheel --enable_onnx_tests --use_tensorrt --tensorrt_home="${{ runner.temp }}\TensorRT-10.9.0.34.Windows10.x86_64.cuda-12.8" --cuda_home="${{ runner.temp }}\v12.2" --use_vcpkg --use_vcpkg_ms_internal_asset_cache --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=86 | ||
| if ($lastExitCode -ne 0) { | ||
| exit $lastExitCode | ||
| } | ||
| shell: pwsh | ||
|
|
||
| - name: Validate C# native delegates | ||
| run: python tools\ValidateNativeDelegateAttributes.py | ||
| working-directory: ${{ github.workspace }}\csharp | ||
| shell: cmd | ||
| env: | ||
| OrtPackageId: Microsoft.ML.OnnxRuntime.Gpu | ||
| DOTNET_SKIP_FIRST_TIME_EXPERIENCE: true | ||
| setVcvars: true | ||
| ALLOW_RELEASED_ONNX_OPSET_ONLY: '0' | ||
| DocUpdateNeeded: false | ||
| ONNXRUNTIME_TEST_GPU_DEVICE_ID: '0' | ||
| AZCOPY_AUTO_LOGIN_TYPE: MSI | ||
| AZCOPY_MSI_CLIENT_ID: 63b63039-6328-442f-954b-5a64d124e5b4 | ||
|
Comment on lines +139 to +242
Check warning: Code scanning / CodeQL
Workflow does not contain permissions (Medium)
Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
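The flagged range (+139 to +242) appears to cover the body of the `test` job, so one way to address the warning is a job-level `permissions` key. The fragment below is a minimal sketch, assuming the job needs nothing beyond read access to repository contents for `actions/checkout`; only the `permissions` lines are new, and the other keys are copied from the diff above.

```yaml
jobs:
  test:
    name: Windows GPU TensorRT CI Pipeline Test Job
    needs: build
    timeout-minutes: 300
    runs-on: ["self-hosted", "1ES.Pool=onnxruntime-github-Win2022-GPU-A10"]
    # Assumption: read-only access to repository contents is enough for
    # this job; grant further scopes explicitly if a step needs them.
    permissions:
      contents: read
    # steps: unchanged from the diff above
```

A job-level key only limits the token for that job, so the `build` job would need the same treatment unless a workflow-level default is set instead.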
Copilot Autofix
AI 13 days ago
To fix the problem, you should add an explicit `permissions` block at the root of the workflow file (`.github/workflows/windows_tensorrt.yml`), which will apply to all jobs unless overridden at the job level. The principle of least privilege dictates starting with the minimum required, and for most CI workflows that do not interact with repository contents except for checking out code, `contents: read` is usually sufficient. If actions in your workflow require additional permissions (like creating issues or pull requests), you can add those explicitly, but based on the provided steps, only `contents: read` is needed. Insert a `permissions` block containing `contents: read` after the workflow name, before `on:`. No imports or new package installations are required for this YAML change.
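The suggested change can be sketched as follows. This is a reconstruction from the description above rather than the exact suggestion text, and it assumes the top of the file matches the diff shown earlier; the `permissions` block and its comment are the only additions.

```yaml
name: Windows GPU TensorRT CI Pipeline

# Workflow-level default for every job; individual jobs can override it.
permissions:
  contents: read

on:
  push:
    branches:
      - main
  pull_request:
```

Placing the block at the root covers both the `build` and `test` jobs with a single declaration.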