LLVM and SPIRV-LLVM-Translator pulldown (WW36 2025) #19997
It will get expanded into MOVPRFX_ZZ and EXT_ZZI by the AArch64ExpandPseudo pass. This instruction takes a single Z register as input, as opposed to the existing destructive EXT_ZZI instruction. Note this patch only defines the pseudo, it isn't used in any ISel pattern yet. It will later be used for vector.extract.
…584) This reverts commit 14cd133. The buildbot failure seems to have been a cmake issue which has been discussed in more detail in this Discourse post: https://discourse.llvm.org/t/cmake-doesnt-regenerate-all-tablegen-target-files/87901 If any buildbots fail to select arbitrary intrinsics with this patch, it's worth considering using clean builds with ccache instead of incremental builds, as recommended here: https://llvm.org/docs/HowToAddABuilder.html#:~:text=Use%20CCache%20and%20NOT%20incremental%20builds The original commit message for this patch: Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave functions. This will take as its first argument the callee with the amdgpu_gfx_whole_wave calling convention, followed by the call parameters which must match the signature of the callee except for the first function argument (the i1 original EXEC mask, which doesn't need to be passed in). Indirect calls are not allowed. Make direct calls to amdgpu_gfx_whole_wave functions a verifier error. Tail calls are handled in a future patch.
…orms (#153381) This currently generates many linker warnings of this form, due to defining mem(cpy|move|set) in every object file:
```
ld: warning: '.../build/projects/compiler-rt/lib/interception/CMakeFiles/RTInterception.ios.dir/interception_linux.cpp.o' has malformed LC_DYSYMTAB, expected 6 undefined symbols to start at index 1, found 3 undefined symbols starting at index 1
```
In order for this to actually replace these symbols on mach-o, they would need a leading underscore, e.g. `.set _memcpy, ___sanitizer_internal_memcpy`. However doing so does not fix the warnings, and furthermore it ends up replacing `REAL(memcpy)` calls with `__sanitizer_internal_memcpy` in places such as `__asan::Allocator::Reallocate`. There is no way on Apple platforms to recreate the intended behaviour, so let's just disable this on them to reduce warning noise. rdar://123771479
…1134) Similarly to llvm/llvm-project#131538, we can also try and check if a predicate is known to wrap given the backedge taken count. For now, this just checks directly when we try to create predicated AddRecs. This both helps to avoid spending compile-time on optimizations where we know the predicate is false, and can also help to allow additional vectorization (e.g. by deciding to scalarize memory accesses when otherwise we would try to create a predicated AddRec with a predicate that's always false). The initial version is quite restricted, but can be extended in follow-ups to cover more cases. PR: llvm/llvm-project#151134
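As a rough illustration of the idea (not SCEV's actual implementation), consider an unsigned AddRec `{start,+,step}` in a loop with a known backedge-taken count. If the value after the last backedge cannot fit in the bit width, a "does not wrap" predicate is known to be false, so creating the predicated AddRec would be pointless. The function name and the plain-integer model below are assumptions made for this example.

```python
def nuw_predicate_known_false(start: int, step: int, btc: int, bits: int = 32) -> bool:
    """Return True if {start,+,step} must wrap (unsigned) within btc backedges.

    Simplified model: start and step are non-negative values that fit in
    `bits` bits, and btc is the exact backedge-taken count. Python integers
    do not overflow, so the comparison below is exact.
    """
    unsigned_max = (1 << bits) - 1
    # Value of the recurrence after the last backedge is taken.
    last_value = start + step * btc
    return last_value > unsigned_max


# An 8-bit induction variable stepping by 4 for 100 backedges reaches 400,
# which exceeds 255, so a no-wrap predicate can never hold.
print(nuw_predicate_known_false(start=0, step=4, btc=100, bits=8))
# With only 10 backedges the last value is 40, which fits in 8 bits.
print(nuw_predicate_known_false(start=0, step=4, btc=10, bits=8))
```

The real analysis works on symbolic expressions and ranges rather than concrete integers; the sketch only shows why a known backedge-taken count can settle the question early.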
Add an `olLaunchHostFunction` method that allows enqueueing host work to the stream.
…64-bit test coverage
First step in introducing the wasm-import target to mlir-translate. This is the first PR to introduce the pass; it provides very little support for the actual WebAssembly language and mostly introduces the skeleton of the importer. A follow-up will add support for a wider range of operators. The work was split to make it easier to review, since it is a good chunk of work.
Co-authored-by: Luc Forget
Co-authored-by: Ferdinand Lemaire
Co-authored-by: Jessica Paquette
…153379) Add a new unit attribute to allow for unsigned integer comparison. Example:
```mlir
scf.for unsigned %iv_32 = %lb_32 to %ub_32 step %step_32 : i32 {
  // body
}
```
Discussion: https://discourse.llvm.org/t/scf-should-scf-for-support-unsigned-comparison/84655
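To see why the comparison kind matters, the sketch below models a 32-bit `for iv = lb; iv < ub; iv += step` loop and counts its iterations under both interpretations of the same bit patterns. It is a plain model of the arithmetic, not of the MLIR implementation; `trip_count` and `as_signed32` are hypothetical helper names, and overflow of the induction variable is ignored.

```python
MASK32 = 0xFFFFFFFF


def as_signed32(bits: int) -> int:
    """Reinterpret a 32-bit pattern as a two's-complement signed value."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def trip_count(lb: int, ub: int, step: int, unsigned: bool) -> int:
    """Iterations of `iv = lb; iv < ub; iv += step` on 32-bit bit patterns."""
    if step <= 0:
        raise ValueError("step must be positive")
    lb &= MASK32
    ub &= MASK32
    if not unsigned:
        # Signed comparison: the same bits may denote negative numbers.
        lb, ub = as_signed32(lb), as_signed32(ub)
    if lb >= ub:
        return 0
    # Ceiling division of the distance by the step.
    return (ub - lb + step - 1) // step


# An upper bound with the top bit set is a large value when unsigned,
# but negative when signed, so the signed loop never runs.
print(trip_count(0, 0x80000000, 0x10000000, unsigned=True))
print(trip_count(0, 0x80000000, 0x10000000, unsigned=False))
```

For bounds below 2^31 both interpretations agree, which is why the difference only shows up once the top bit of a bound is set.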
These variants require a different exception table that requires a bit of initialisation. This allows us to enable testing for these variants downstream.
…ilure (#153605) Prior to this PR, the default behaviour of a conversion pattern which receives operands of a 1:N type conversion is to abort the compilation. This has historically been useful when the 1:N type conversion got merged into the dialect conversion, as it allowed us to easily find patterns that should be capable of handling 1:N type conversions but didn't. However, this behaviour has the disadvantage of being non-composable: while the pattern in question cannot handle the 1:N type conversion, another pattern in the set might, but doesn't get the chance because compilation is aborted. This PR fixes this behaviour by failing to match instead of aborting, giving other patterns the chance to legalize an op. The implementation uses a reusable function called `dispatchTo1To1` to allow derived conversion patterns to also implement the behaviour.
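The composability argument can be sketched with a toy driver. This is not the MLIR API: operands are modeled as lists of converted values (one value for 1:1, several for 1:N), and `one_to_one_only`, `one_to_n_aware`, and `legalize` are hypothetical names chosen for the example.

```python
from typing import Callable, List, Optional, Sequence

# One operand after type conversion: a single value (1:1) or several (1:N).
Operand = Sequence[str]
Pattern = Callable[[Sequence[Operand]], bool]


def one_to_one_only(operands: Sequence[Operand]) -> bool:
    """A pattern that only understands 1:1 converted operands."""
    if any(len(values) != 1 for values in operands):
        # New behaviour: report a match failure instead of aborting.
        return False
    return True


def one_to_n_aware(operands: Sequence[Operand]) -> bool:
    """A pattern that can handle any number of values per operand."""
    return True


def legalize(operands: Sequence[Operand], patterns: List[Pattern]) -> Optional[str]:
    """Try patterns in order; return the name of the first one that matches."""
    for pattern in patterns:
        if pattern(operands):
            return pattern.__name__
    return None


# The second operand was converted 1:N, so the first pattern declines and
# the driver moves on to the next one.
print(legalize([["a"], ["b0", "b1"]], [one_to_one_only, one_to_n_aware]))
```

Had `one_to_one_only` aborted instead of returning `False`, `one_to_n_aware` would never have been tried.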
Pull the logic to compute bit attributes from `filterProcessor()` to its caller to avoid recomputing them on the second call.
If there is a relocation for a particular FDE, print it as well. This is mainly meant for human consumption (otherwise, there's no way to tell which function a given (relocatable) FDE refers to). For testing of relocation generation, I'd still recommend using the regular relocation dumper, as this code will not detect (e.g.) any superfluous relocations. I've considered handling relocations inside the SFrameParser class, but I couldn't find an elegant way to do that. Right now, I don't have a use case for resolving relocations there as lldb (my other use case for SFrameParser) will always operate on linked objects.
The script copies `ReleaseNotesTemplate.txt` to corresponding `ReleaseNotes.rst`/`.md` to clear release notes. The suffix of `ReleaseNotesTemplate.txt` must be `.txt`. If it is `.rst`/`.md`, it will be treated as a documentation source file when building documentation.
…153616) Call `recordInliningWithCalleeDeleted` before dropping the contents of the Callee. Otherwise the handlers don't have access to e.g. the DebugLoc, so the Callee DebugLoc was missing in inlining remarks for functions with internal linkage. The test is the same as `optimization-remarks-passed-yaml.ll` except that the function `foo` has internal linkage instead of external linkage.
Inspired by #151893
…ins.c - add C/C++ and 32/64-bit test coverage
They use extract shuffles for fixed vectors, and llvm.vector.splice intrinsics for scalable vectors. In the previous tests using ld+extract+st, the extract was optimized away and replaced by a smaller load at the right offset. This meant we didn't really test the vector_splice ISD node.
This test has been flaky on our bot: https://lab.llvm.org/buildbot/#/builders/18/builds/20410
```
======================================================================
FAIL: test_extra_launch_commands (TestDAP_launch.TestDAP_launch)
Tests the "launchCommands" with extra launching settings
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py", line 482, in test_extra_launch_commands
    self.verify_commands("stopCommands", output, stopCommands)
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py", line 228, in verify_commands
    self.assertTrue(
AssertionError: False is not true : verify 'frame variable' found in console output for 'stopCommands'
Config=arm-/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang
----------------------------------------------------------------------
```
Likely a timing issue waiting for the command output on a slower machine. General tracking issue - llvm/llvm-project#137660
Use a debug log instead for now.
Fixes 4f65345 Yet again I forgot it's skip[I]f.
…ompressstore}` (#153063)
* Add `requiresArgsAndResultsAttr` to `LLVM_OneResultIntrOp`
* Add `args_attrs` to `llvm.intr.masked.{expandload,compressstore}`

The LLVM intrinsics [`llvm.intr.masked.expandload`](https://llvm.org/docs/LangRef.html#llvm-masked-expandload-intrinsics) and [`llvm.intr.masked.compressstore`](https://llvm.org/docs/LangRef.html#llvm-masked-compressstore-intrinsics) both allow an optional align parameter attribute to be set, which defaults to one. Inlining the documentation below for [`llvm.intr.masked.expandload`'s](https://llvm.org/docs/LangRef.html#id1522) and [`llvm.intr.masked.compressstore`'s](https://llvm.org/docs/LangRef.html#id1522) arguments respectively:

> The `align` parameter attribute can be provided for the first argument. The pointer alignment defaults to 1.

> The `align` parameter attribute can be provided for the second argument. The pointer alignment defaults to 1.
Pulled out of #151893 to show 32/64-bit target coverage
Helps check quality of legality codegen (all we had was x86 i64 handling)
In current DebugLoc coverage builds, the output for any reasonably large build can become very large if any missing DebugLocs are present; this happens because single errors in LLVM may result in many errors being reported in the output report. The main cause of this is that the empty locations attached to instructions may be propagated to other instructions in later passes, which will each be reported as new errors. This patch prevents this by adding an "unknown" annotation to instructions after reporting them once, ensuring that any other DebugLocs copied or derived from the original empty location will not be marked as new errors. As a separate but related change, this patch updates the report generation script to deduplicate results using the recorded stacktrace if it is available, instead of the pass+instruction combination. This deduplicates fewer entries than before, but makes the deduplication highly reliable, as the stacktrace allows us to very precisely identify when two bugs have originated from the same place.
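The deduplication rule described above can be sketched as follows. This is not the actual report script: the record layout and the field names (`pass`, `instruction`, `stacktrace`) are assumptions made for the example.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Bug = Dict[str, Optional[str]]


def dedup_key(bug: Bug) -> Tuple[Hashable, ...]:
    """Prefer the recorded stacktrace; fall back to pass+instruction."""
    trace = bug.get("stacktrace")
    if trace:
        return ("stacktrace", trace)
    return ("pass+instruction", bug["pass"], bug["instruction"])


def deduplicate(bugs: List[Bug]) -> List[Bug]:
    """Keep the first report seen for each deduplication key."""
    seen = set()
    unique = []
    for bug in bugs:
        key = dedup_key(bug)
        if key not in seen:
            seen.add(key)
            unique.append(bug)
    return unique


reports = [
    {"pass": "instcombine", "instruction": "add", "stacktrace": "frameA;frameB"},
    # Different pass and instruction, same origin: merged by stacktrace.
    {"pass": "gvn", "instruction": "load", "stacktrace": "frameA;frameB"},
    # Same pass and instruction, different origin: kept separate.
    {"pass": "instcombine", "instruction": "add", "stacktrace": "frameC"},
    # No stacktrace recorded: fall back to pass+instruction.
    {"pass": "licm", "instruction": "mul", "stacktrace": None},
    {"pass": "licm", "instruction": "mul", "stacktrace": None},
]
print(len(deduplicate(reports)))
```

Keying on the stacktrace separates reports that share a pass and instruction but come from different code paths, and merges reports that differ superficially but share an origin.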
…0.0 tests. NFC" This reverts commit 16314eb as the test cases are failing under EXPENSIVE_CHECKS. Scalar vecreduce.fadd operations are not valid in GISel.
The patch adds patterns to select the EXT_ZZI_CONSTRUCTIVE pseudo instead of the EXT_ZZI destructive instruction for vector_splice. This only works when the two inputs to vector_splice are identical. Given that registers aren't tied anymore, this gives the register allocator more freedom and a lot of MOVs get replaced with MOVPRFX. In some cases however, we could have just chosen the same input and output register, but regalloc preferred not to. This means we end up with some test cases now having more instructions: there is now a MOVPRFX while no MOV was previously needed.
CUDA/HIP fixes and XFAILS: @intel/llvm-reviewers-cuda
Driver XFAILS: @intel/dpcpp-clang-driver-reviewers
LLVM-SPIRV fixes and XFAILS: @intel/dpcpp-spirv-reviewers
The failed Assert/assert_in_multiple_tus.cpp test should NOT have been executed in sycl-rel-6_2 (https://github.com/intel/llvm/actions/runs/18040835304/job/51343001440?pr=19997), since it is marked "UNSUPPORTED: level_zero" on line 13.
`-fsycl-targets=` is now an alias for `--offload-targets=`, so `--offload-targets=` must also be added to the unsupported argument list, because the Clang driver code checks for a matching ID for `--offload-targets=` as well. Fixes #20127
@intel/llvm-gatekeepers I think this is ready for merge. The remaining 2 failures are common to other PRs.
/merge
Fri 26 Sep 2025 10:04:13 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.
Fri 26 Sep 2025 10:15:20 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.
LLVM: llvm/llvm-project@76bb987
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@c8eef9121a7f26b