iclsrc
Collaborator

@iclsrc iclsrc commented Sep 6, 2025

gbossu and others added 30 commits August 15, 2025 09:05
It will get expanded into MOVPRFX_ZZ and EXT_ZZI by the
AArch64ExpandPseudo pass. This instruction takes a single Z register as
input, as opposed to the existing destructive EXT_ZZI instruction.

Note this patch only defines the pseudo, it isn't used in any ISel
pattern yet. It will later be used for vector.extract.
…584)

This reverts commit 14cd133. The
buildbot failure seems to have been a cmake issue which has been
discussed in more detail in this Discourse post:

https://discourse.llvm.org/t/cmake-doesnt-regenerate-all-tablegen-target-files/87901

If any buildbots fail to select arbitrary intrinsics with this patch,
it's worth considering using clean builds with ccache instead of
incremental builds, as recommended here:

https://llvm.org/docs/HowToAddABuilder.html#:~:text=Use%20CCache%20and%20NOT%20incremental%20builds

The original commit message for this patch:
Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave
functions. This will take as its first argument the callee with the
amdgpu_gfx_whole_wave calling convention, followed by the call
parameters which must match the signature of the callee except for the
first function argument (the i1 original EXEC mask, which doesn't need
to be passed in). Indirect calls are not allowed.

Make direct calls to amdgpu_gfx_whole_wave functions a verifier error.

Tail calls are handled in a future patch.
…orms (#153381)

This currently generates many linker warnings of this form, due to
defining mem(cpy|move|set) in every object file:
```
ld: warning: '.../build/projects/compiler-rt/lib/interception/CMakeFiles/RTInterception.ios.dir/interception_linux.cpp.o' has malformed LC_DYSYMTAB, expected 6 undefined symbols to start at index 1, found 3 undefined symbols starting at index 1
```

In order for this to actually replace these symbols on mach-o, they
would need a leading underscore, e.g. `.set _memcpy,
___sanitizer_internal_memcpy`. However doing so does not fix the
warnings, and furthermore it ends up replacing `REAL(memcpy)` calls with
`__sanitizer_internal_memcpy` in places such as
`__asan::Allocator::Reallocate`. There is no way on Apple platforms to
recreate the intended behaviour, so let's just disable this on them to
reduce warning noise.

rdar://123771479
…1134)

Similarly to llvm/llvm-project#131538, we can
also try and check if a predicate is known to wrap given the backedge
taken count.

For now, this just checks directly when we try to create predicated
AddRecs. This both helps to avoid spending compile-time on optimizations
where we know the predicate is false, and can also help to allow
additional vectorization (e.g. by deciding to scalarize memory accesses
when otherwise we would try to create a predicated AddRec with a
predicate that's always false).

The initial version is quite restricted, but can be extended in
follow-ups to cover more cases.

PR: llvm/llvm-project#151134
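
As a rough illustration of the idea (not ScalarEvolution's actual implementation), the sketch below checks whether a no-unsigned-wrap predicate on an AddRec `{start,+,step}` is known to be false once the backedge-taken count is known. The function name, the unsigned-only model, and the non-negative `step` are assumptions made for this example.

```python
def nusw_predicate_known_false(start: int, step: int, btc: int, bits: int) -> bool:
    """Return True if {start,+,step} must wrap (unsigned) within `btc` backedges.

    Toy model only: values are unsigned `bits`-wide integers and `step`
    is assumed to be non-negative.
    """
    max_val = (1 << bits) - 1
    # Value reached after taking the backedge `btc` times, computed
    # with unbounded integers so that any overflow is visible.
    last = start + step * btc
    return last > max_val
```

For an `i8` recurrence starting at 0 with step 1, a backedge-taken count of 300 means the predicate can never hold, so there is no point in creating the predicated AddRec.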
Add an `olLaunchHostFunction` method that allows enqueueing host work
to the stream.
First step in introducing the wasm-import target to mlir-translate.
This is the first PR to introduce the pass. With this PR, there is very
little support for the actual WebAssembly language; it is mostly there to
introduce the skeleton of the importer. A follow-up will come with
support for a wider range of operators. It was split to make it easier
to review, since it's a good chunk of work.

---------

Co-authored-by: Luc Forget <[email protected]>
Co-authored-by: Ferdinand Lemaire <[email protected]>
Co-authored-by: Jessica Paquette <[email protected]>
Co-authored-by: Luc Forget <[email protected]>
…153379)

Add a new unit attribute to allow for unsigned integer comparison.

Example:
```mlir
scf.for unsigned %iv_32 = %lb_32 to %ub_32 step %step_32 : i32 {
  // body
}
```

Discussion:
https://discourse.llvm.org/t/scf-should-scf-for-support-unsigned-comparison/84655
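
To see why the comparison kind matters, here is a small sketch that models the trip count of a loop over fixed-width bounds under signed and unsigned comparison. The helper `trip_count` is hypothetical and ignores induction-variable overflow; it only illustrates how the same bit patterns give different iteration counts.

```python
def trip_count(lb_bits: int, ub_bits: int, step: int, *, unsigned: bool,
               width: int = 32) -> int:
    """Number of iterations of `for iv = lb to ub step step` (step > 0).

    `lb_bits` and `ub_bits` are raw bit patterns; `unsigned` selects how
    they are interpreted for the `iv < ub` comparison.
    """
    mask = (1 << width) - 1
    lb, ub = lb_bits & mask, ub_bits & mask
    if not unsigned:
        # Reinterpret the bit patterns as two's-complement signed values.
        sign = 1 << (width - 1)
        lb = lb - (1 << width) if lb & sign else lb
        ub = ub - (1 << width) if ub & sign else ub
    if lb >= ub:
        return 0
    # Ceiling division of the distance by the step.
    return (ub - lb + step - 1) // step
```

With `lb = 0`, `ub = 0xFFFFFFFF` and a step of `1 << 30`, the signed interpretation sees an upper bound of -1 and never enters the loop, while the unsigned interpretation runs it four times.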
These variants require a different exception table that requires a bit
of initialisation.

This allows us to enable testing for these variants downstream.
…ilure (#153605)

Prior to this PR, the default behaviour of a conversion pattern which
receives operands of a 1:N type conversion is to abort the compilation. This has
historically been useful when the 1:N type conversion got merged into
the dialect conversion as it allowed us to easily find patterns that
should be capable of handling 1:N type conversions but didn't.

However, this behaviour has the disadvantage of being non-composable:
While the pattern in question cannot handle the 1:N type conversion,
another pattern part of the set might, but doesn't get the chance as
compilation is aborted.

This PR fixes this behaviour by failing to match instead of
aborting, giving other patterns the chance to legalize an op. The
implementation uses a reusable function called `dispatchTo1To1` to allow
derived conversion patterns to also implement the behaviour.
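
The composability argument can be sketched with a toy model. This is not the MLIR dialect conversion API: the pattern functions, the `legalize` driver, and the operand representation (each original operand mapped to a list of converted values) are all assumptions for illustration.

```python
from typing import Callable, Optional, Sequence

# Each original operand maps to a list of converted values (1:N).
Operands = Sequence[Sequence[str]]
Pattern = Callable[[Operands], Optional[str]]


def one_to_one_only(operands: Operands) -> Optional[str]:
    # A pattern written for 1:1 conversions: fail to match (return None)
    # instead of aborting when any operand was converted to != 1 value.
    if any(len(values) != 1 for values in operands):
        return None
    return "1:1 pattern applied to " + ", ".join(v[0] for v in operands)


def one_to_n_aware(operands: Operands) -> Optional[str]:
    # A pattern that can handle 1:N conversions by flattening the values.
    flat = [v for values in operands for v in values]
    return "1:N pattern applied to " + ", ".join(flat)


def legalize(operands: Operands, patterns: Sequence[Pattern]) -> Optional[str]:
    # Try patterns in order; a match failure lets the next pattern run.
    for pattern in patterns:
        result = pattern(operands)
        if result is not None:
            return result
    return None
```

In this model, an op whose second operand was converted to two values is rejected by `one_to_one_only` and then legalized by `one_to_n_aware`, which would never have run if the first pattern had aborted.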
Pull the logic to compute bit attributes from `filterProcessor()` to its
caller to avoid recomputing them on the second call.
If there is a relocation for a particular FDE, print it as well. This is
mainly meant for human consumption (otherwise, there's no way to tell
which function a given (relocatable) FDE refers to). For testing of
relocation generation, I'd still recommend using the regular relocation
dumper, as this code will not detect (e.g.) any superfluous relocations.

I've considered handling relocations inside the SFrameParser class, but
I couldn't find an elegant way to do that. Right now, I don't have a use
case for resolving relocations there as lldb (my other use case for
SFrameParser) will always operate on linked objects.
The script copies `ReleaseNotesTemplate.txt` to corresponding
`ReleaseNotes.rst`/`.md` to clear release notes.

The suffix of `ReleaseNotesTemplate.txt` must be `.txt`. If it is
`.rst`/`.md`, it will be treated as a documentation source file when
building documentation.
…153616)

Call `recordInliningWithCalleeDeleted` before dropping the contents of
the Callee. Otherwise the handlers don't have access to e.g. the
DebugLoc, so the Callee DebugLoc was missing in inlining remarks for
functions with internal linkage.

The test is the same as `optimization-remarks-passed-yaml.ll` except
that the function `foo` has internal linkage instead of external linkage.
…ins.c - add C/C++ and 32/64-bit test coverage
They use extract shuffles for fixed vectors, and
llvm.vector.splice intrinsics for scalable vectors.

In the previous tests using ld+extract+st, the extract was optimized
away and replaced by a smaller load at the right offset. This meant
we didn't really test the vector_splice ISD node.
This test has been flaky on our bot:
https://lab.llvm.org/buildbot/#/builders/18/builds/20410

```
======================================================================
FAIL: test_extra_launch_commands (TestDAP_launch.TestDAP_launch)
    Tests the "launchCommands" with extra launching settings
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py", line 482, in test_extra_launch_commands
    self.verify_commands("stopCommands", output, stopCommands)
  File "/home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/lldbdap_testcase.py", line 228, in verify_commands
    self.assertTrue(
AssertionError: False is not true : verify 'frame variable' found in console output for 'stopCommands'
Config=arm-/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/bin/clang
----------------------------------------------------------------------
```

Likely a timing issue waiting for the command output on a slower
machine.

General tracking issue - llvm/llvm-project#137660
Fixes 4f65345

Yet again I forgot it's skip[I]f.
…ompressstore}` (#153063)

* Add `requiresArgsAndResultsAttr` to `LLVM_OneResultIntrOp`
* Add `args_attrs` to `llvm.intr.masked.{expandload,compressstore}`

The LLVM intrinsics
[`llvm.intr.masked.expandload`](https://llvm.org/docs/LangRef.html#llvm-masked-expandload-intrinsics)
and
[`llvm.intr.masked.compressstore`](https://llvm.org/docs/LangRef.html#llvm-masked-compressstore-intrinsics)
both allow an optional align parameter attribute to be set which
defaults to one.

Inlining the documentation below for
[`llvm.intr.masked.expandload`'s](https://llvm.org/docs/LangRef.html#id1522) and
[`llvm.intr.masked.compressstore`'s](https://llvm.org/docs/LangRef.html#id1522)
arguments respectively:

> The `align` parameter attribute can be provided for the first
argument. The pointer alignment defaults to 1.

> The `align` parameter attribute can be provided for the second
argument. The pointer alignment defaults to 1.
Pulled out of #151893 to show 32/64-bit target coverage
Helps check quality of legality codegen (all we had was x86 i64 handling)
In current DebugLoc coverage builds, the output for any reasonably large
build can become very large if any missing DebugLocs are present; this
happens because single errors in LLVM may result in many errors being
reported in the output report. The main cause of this is that the empty
locations attached to instructions may be propagated to other
instructions in later passes, which will each be reported as new errors.
This patch prevents this by adding an "unknown" annotation to
instructions after reporting them once, ensuring that any other
DebugLocs copied or derived from the original empty location will not be
marked as new errors.

As a separate but related change, this patch updates the report
generation script to deduplicate results using the recorded stacktraces
if they are available, instead of the pass+instruction combination. This
shrinks the report less than before, but makes the deduplication highly
reliable, as the stacktrace allows us to very precisely identify when
two bugs have originated from the same place.
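
A minimal sketch of that deduplication rule is shown below. It is not the actual report generation script; the record layout (dicts with `pass`, `instruction`, and an optional `stacktrace` field) and the function names are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List


def dedup_key(bug: Dict[str, str]) -> Hashable:
    # Prefer the recorded stacktrace when it is available; otherwise
    # fall back to the pass+instruction combination.
    stack = bug.get("stacktrace")
    if stack:
        return ("stacktrace", stack)
    return ("pass+instruction", bug["pass"], bug["instruction"])


def deduplicate(bugs: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    # Keep the first report seen for each key, preserving input order.
    seen = set()
    unique = []
    for bug in bugs:
        key = dedup_key(bug)
        if key not in seen:
            seen.add(key)
            unique.append(bug)
    return unique
```

Two reports from different passes that share a stacktrace collapse into one entry, while reports with distinct stacktraces stay separate even if the pass and instruction match.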
…0.0 tests. NFC"

This reverts commit 16314eb as the test cases
are failing under EXPENSIVE_CHECKS. Scalar vecreduce.fadd are not valid in
GISel.
The patch adds patterns to select the EXT_ZZI_CONSTRUCTIVE pseudo
instead of the EXT_ZZI destructive instruction for vector_splice. This
only works when the two inputs to vector_splice are identical.

Given that registers aren't tied anymore, this gives the register
allocator more freedom and a lot of MOVs get replaced with MOVPRFX.

In some cases however, we could have just chosen the same input and
output register, but regalloc preferred not to. This means we end up
with some test cases now having more instructions: there is now a
MOVPRFX while no MOV was previously needed.
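
For intuition about why the pattern only applies when both inputs are identical, the sketch below models the byte-level behaviour of an extract over two concatenated vectors. The Python functions are illustrative stand-ins, not the instruction definitions, and vectors are modelled as plain lists of bytes.

```python
from typing import List


def ext(zn: List[int], zm: List[int], imm: int) -> List[int]:
    """Model of an extract: take len(zn) bytes from zn ++ zm at offset imm."""
    n = len(zn)
    if len(zm) != n or not 0 <= imm < n:
        raise ValueError("inputs must have equal length and 0 <= imm < length")
    return (zn + zm)[imm:imm + n]


def ext_constructive(z: List[int], imm: int) -> List[int]:
    """Single-input form: both halves of the concatenation are the same vector."""
    return ext(z, z, imm)
```

With a single input the operation is a rotation of the vector, so the result can be written to any register rather than being tied to the first source.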
@fzou1
Contributor

fzou1 commented Sep 26, 2025

The failing Assert/assert_in_multiple_tus.cpp test should NOT have been executed in sycl-rel-6_2 (https://github.com/intel/llvm/actions/runs/18040835304/job/51343001440?pr=19997), since it is marked as "UNSUPPORTED: level_zero" on line 13.

// L0 does not currently abort after synchronizing with a failing kernel.
// UNSUPPORTED: level_zero
// UNSUPPORTED-TRACKER: GSD-11097

Log

FAIL: SYCL :: Assert/assert_in_multiple_tus.cpp (100 of 2223)
  ******************** TEST 'SYCL :: Assert/assert_in_multiple_tus.cpp' FAILED ********************
  Exit Code: 1

  Command Output (stderr):
  --
  RUN: at line 16: env ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out &> /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.txt ; /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp --input-file /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.txt
  + env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out
  /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.script: line 1:  3067 Aborted                 (core dumped) env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.out &> /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.txt
  + /__w/llvm/llvm/toolchain/bin/FileCheck /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp --input-file /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.txt
  /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp:23:15: error: CHECK-NOT: excluded string found in input
  // CHECK-NOT: this message from file2
                ^
  /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.txt:4:154: note: found here
  /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs/kernels_in_file2.cpp:19: void check_nil(int): global id: [0,0,0], local id: [0,0,0] Assertion `value && "this message from file2"` failed.
                                                                                                                                                           ^~~~~~~~~~~~~~~~~~~~~~~

  Input file: /__w/llvm/llvm/build-e2e/Assert/Output/assert_in_multiple_tus.cpp.tmp.txt
  Check file: /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.cpp

  -dump-input=help explains the following input dump.

  Input was:
  <<<<<<
          1: AssertHandler::printMessage
          2: /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs/kernels_in_file2.cpp:15: int calculus(int): global id: [5,0,0], local id: [1,0,0] Assertion `X && "this message from calculus"` failed.
          3: /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/assert_in_multiple_tus.hpp:23: int checkFunction(): global id: [5,0,0], local id: [1,0,0] Assertion `X && "Nil in result"` failed.
          4: /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs/kernels_in_file2.cpp:19: void check_nil(int): global id: [0,0,0], local id: [0,0,0] Assertion `value && "this message from file2"` failed.
  not:23                                                                                                                                                              !~~~~~~~~~~~~~~~~~~~~~~            error: no match expected
          5: /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs/kernels_in_file2.cpp:19: void check_nil(int): global id: [1,0,0], local id: [1,0,0] Assertion `value && "this message from file2"` failed.
          6: /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs/kernels_in_file2.cpp:19: void check_nil(int): global id: [2,0,0], local id: [2,0,0] Assertion `value && "this message from file2"` failed.
          7: /__w/llvm/llvm/llvm/sycl/test-e2e/Assert/Inputs/kernels_in_file2.cpp:19: void check_nil(int): global id: [3,0,0], local id: [3,0,0] Assertion `value && "this message from file2"` failed.
  >>>>>>

`-fsycl-targets=` is now an alias for `--offload-targets=` and hence
`--offload-targets=` must also be added to the unsupported arg list as
the Clang driver code checks for matching ID for `--offload-targets=` as
well.

Fixes #20127
@jsji
Contributor

jsji commented Sep 26, 2025

@intel/llvm-gatekeepers I think this is ready to merge. The remaining 2 failures are also seen in other PRs.

@uditagarwal97
Contributor

/merge

@bb-sycl
Contributor

bb-sycl commented Sep 26, 2025

Fri 26 Sep 2025 10:04:13 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl
Contributor

bb-sycl commented Sep 26, 2025

Fri 26 Sep 2025 10:15:20 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.

@bb-sycl bb-sycl merged commit 36363e6 into sycl Sep 26, 2025
80 of 84 checks passed
Labels
disable-lint Skip linter check step and proceed with build jobs