Skip to content

Conversation

@pull
Copy link

@pull pull bot commented Oct 31, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot locked and limited conversation to collaborators Oct 31, 2025
@pull pull bot added the ⤵️ pull label Oct 31, 2025
philnik777 and others added 28 commits November 2, 2025 10:27
…is_within_lifetime`) (#165243)

<https://wg21.link/P2641R4>

Implements the C++26 function in `<type_traits>` [meta.const.eval] (and
the corresponding feature test macro `__cpp_lib_is_within_lifetime`)

```c++
template<class T>
  consteval bool is_within_lifetime(const T*) noexcept;
```

This is done with the `__builtin_is_within_lifetime` builtin added to
Clang 20 by #91895 / 2a07509. This is
not (currently) available with GCC.

This implementation has provisions for LWG4138
<https://cplusplus.github.io/LWG/issue4138> where it is ill-formed to
instantiate `is_within_lifetime<T>` with a function type `T`.

Closes #105381

Co-authored-by: Mital Ashok <[email protected]>
…#163021)

At the moment, the effectivness of guards that contain divisibility
information (A % B == 0 ) depends on the order of the conditions.

This patch makes using divisibility information independent of the
order, by collecting and applying the divisibility information
separately.

We first collect all conditions in a vector, then collect the
divisibility information from all guards.

When processing other guards, we apply divisibility info collected
earlier.

After all guards have been processed, we add the divisibility info,
rewriting the existing rewrite. This ensures we apply the divisibility
info to the largest rewrite expression.

This helps to improve results in a few cases, one in
dtcxzyw/llvm-opt-benchmark#2921 and another one
in a different large C/C++ based IR corpus.

PR: #163021
Suggest the initializer_list overload instead.

For more context, see #163117.
Identified with modernize-use-equals-default.
In C++17, static constexpr members are implicitly inline, so they no
longer require an out-of-line definition.

This patch removes the unnecessary "const" from the in-class
declaration as well.

Identified with readability-redundant-declaration.
Identified with readability-redundant-typename.
Previously this only checked for OpBuilder usage, but it could also be
invoked via pointer. Also change how call range is calculated to avoid
false overlaps which limits rewriting builder calls inside arguments of
builder calls.

---------

Co-authored-by: EugeneZelenko <[email protected]>
Co-authored-by: Victor Chernyakin <[email protected]>
This reverts commit 2253413.

This caused a bunch of buildbot failures:
1. https://lab.llvm.org/buildbot/#/builders/17/builds/12328 - fixed-shadow.c
uses parantheses, almost seems flaky.
2. https://lab.llvm.org/buildbot/#/builders/186/builds/13651 - log-path_test.cpp
is unresolved on some builders.
3. https://lab.llvm.org/buildbot/#/builders/64/builds/6289 - unset is not found
4. https://lab.llvm.org/buildbot/#/builders/42/builds/6809 - ulimit issues
5. https://lab.llvm.org/buildbot/#/builders/169/builds/16628 - flaky JIT failures
that are now a lot more frequent.
Macos sincos stret was not well covered, and 64-bit wasn't
checked for vector sincos cases.
BranchOnCount and BranchOnCond do not read memory, but cannot be moved.

Mark them as having side-effects, but not reading/writing memory, which
more accurately models that above. This allows removing some special
checking for branches both in the current code and future patches.
In C++17, static constexpr members are implicitly inline, so they no
longer require an out-of-line definition.

Identified with readability-redundant-declaration.
In C++17, static constexpr members are implicitly inline, so they no
longer require an out-of-line definition.

Identified with readability-redundant-declaration.
Identified with readability-redundant-declaration.
Identified with readability-redundant-typename.
This was probably left over from one of my debugging sessions. Remove it
to clean up the code slightly.
jayfoad and others added 30 commits November 4, 2025 12:00
Also add a corresponding intrinsic property that can be used to mark
intrinsics that do not introduce poison, for example simple arithmetic
intrinsics that propagate poison just like a simple arithmetic
instruction.

As a smoke test this patch adds the new property to
llvm.amdgcn.fmul.legacy.
Fails with:
```
RUN: at line 4
split-file C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp
executed command: split-file 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp'
note: command had no output on stdout or stderr
RUN: at line 6
c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe --target=specify-a-target-or-use-a-_host-substitution --target=aarch64-pc-windows-msvc -fmodules-cache-path=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/module-cache-clang\lldb-shell C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/main.cpp -g -o C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/cpp.out
executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe' --target=specify-a-target-or-use-a-_host-substitution --target=aarch64-pc-windows-msvc '-fmodules-cache-path=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/module-cache-clang\lldb-shell' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/main.cpp' -g -o 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/cpp.out'
.---command stderr------------
| clang: warning: argument unused during compilation: '-fmodules-cache-path=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/module-cache-clang\lldb-shell' [-Wunused-command-line-argument]
`-----------------------------
RUN: at line 7
c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\lldb.exe --no-lldbinit -S C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb\test\Shell\lit-lldb-init-quiet -b -s C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/commands.input C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/cpp.out | c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\filecheck.exe C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test
executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\lldb.exe' --no-lldbinit -S 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb\test\Shell\lit-lldb-init-quiet' -b -s 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/commands.input' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/cpp.out'
note: command had no output on stdout or stderr
executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\filecheck.exe' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test'
.---command stderr------------
| C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test:45:10: error: CHECK: expected string not found in input
| # CHECK: Assert StackFrame Recognizer
|          ^
| <stdin>:20:38: note: scanning from here
| 1: Verbose Trap StackFrame Recognizer, demangled symbol regex ^__clang_trap_msg
|                                      ^
| <stdin>:34:10: note: possible intended match here
| 3: Verbose Trap StackFrame Recognizer, demangled symbol regex ^__clang_trap_msg
|          ^
|
| Input file: <stdin>
| Check file: C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test
|
| -dump-input=help explains the following input dump.
```
…t if all extracts would lead to UB on poison. (#164683)

This change aims to avoid inserting a freeze instruction between the
load and bitcast when scalarizing extend-extract. This is particularly
useful in combination with
#164682, which can then
potentially further scalarize, provided there is no freeze.

alive2 proof: https://alive2.llvm.org/ce/z/W-GD88
AMDGCN flavoured SPIR-V has slightly different defaults from what the BE
adopts: it assumes all extensions are enabled, and expects nonsemantic
debug info to be generated. Furthermore, it is necessary to encode in
the resulting SPIR-V binary that what was generated was AMDGCN
flavoured, which we do by setting the Generator Version to `UINT16_MAX`
(which matches what we expect to see at reverse translation). We will
register this generator version at
<https://github.com/KhronosGroup/SPIRV-Headers>. This is a preliminary
patch out of a series of patches that are needed for adopting the BE for
AMDGCN flavoured SPIR-V generation.
…ansfer_write to new offsets syntax (#162095)

Changes the `VectorToXeGPU` pass to generate `xegpu.load_nd/store_nd`
ops using new syntax with where offsets are specified at the load/store
ops level.
```mlir
// from this
%desc = xegpu.create_nd_tdesc %src[%off1, %off2]: memref<8x16xf16> -> !xegpu.tensor_desc<8x16xf16>
%res = xegpu.load_nd %desc : !xegpu.tensor_desc<8x16xf16> -> vector<8x16xf16>

// to this
%desc = xegpu.create_nd_tdesc %src: memref<8x16xf16> -> !xegpu.tensor_desc<8x16xf16>
%res = xegpu.load_nd %desc[%off1, %off2] : !xegpu.tensor_desc<8x16xf16> -> vector<8x16xf16>
```

In order to support cases with dimension reduction at the
`create_nd_tdesc` level (e.g. `memref<8x8x16xf16> ->
tensor_desc<8x16xf16>` it was decided to insert a memref.subview that
collapses the source shape to 2d, for example:

```mlir
// input:
%0 = vector.load %source[%off0, %off1, %off2] : memref<8x16x32xf32>, vector<8x16xf32>

// --vector-to-xegpu (old)
%tdesc = xegpu.create_nd_tdesc %source[%off0, %off1, %off2] : memref<8x16x32xf32> -> tdesc<8x32xf32>
%vec = xegpu.load_nd %tdesc

// --vector-to-xegpu (new)
%collapsed = memref.subview %source[%off0, 0, 0] [1, 16, 32] [1, 1, 1] :
    memref<8x16x32xf32> -> memref<16x32xf32, strided<[32, 1], offset: ?>>
%tdesc = xegpu.create_nd_tdesc %collapsed : memref<16x32xf32, ...> -> tdesc<8x32xf32>
%vec = xegpu.load_nd %tdesc[%off1, %off2]
```

<details><summary>Why we need to change that?</summary>

```mlir
// reduce dim and apply all 3 offsets at load_nd
%desc = xegpu.create_nd_tdesc %source : memref<8x16x32xf32> -> !xegpu.tensor_desc<16x32xf32>
// error: xegpu.load_nd len(offsets) != desc.rank
%res = xegpu.load_nd %desc[%off, %off, %off] : !xegpu.tensor_desc<16x32xf32> -> vector<8x16xf32>
```

</details>

---------

Signed-off-by: dchigarev <[email protected]>
…n of conditions (#165748)

In simplifycfg/cvp/sccp, we eliminate dead edges of switches according
to the knownbits/range info of conditions. However, these approximations
may not meet the real-world needs when the domain of condition values is
sparse. For example, if the condition can only be either -3 or 3, we
cannot prove that the condition never evaluates to 1 (knownbits:
???????1, range: [-3, 4)).
This patch adds a helper function `collectPossibleValues` to enumerate
all the possible values of V. To fix the motivating issue,
`eliminateDeadSwitchCases` will use the result to remove dead edges.

Note: In
https://discourse.llvm.org/t/missed-optimization-due-to-overflow-check/88700
I proposed a new value lattice kind to represent such values. But I find
it hard to apply because the transition becomes much complicated.

Compile-time impact looks neutral:
https://llvm-compile-time-tracker.com/compare.php?from=32d6b2139a6c8f79e074e8c6cfe0cc9e79c4c0c8&to=e47c26e3f1bf9eb062684dda4fafce58438e994b&stat=instructions:u
This patch removes many dead error-handling codes:
dtcxzyw/llvm-opt-benchmark#3012
Closes #165179.
In #165720 we started using a
DWARF API (`llvm::dwarf::getTag`) from `BinaryFormat`. This patch makes
dwarfdump link against the necessary LLVM component.

This fixes following linker error that started occurring on some of the
bots:
```
[7758/8172] Linking CXX executable bin/llvm-dwarfdump
FAILED: bin/llvm-dwarfdump
: && /usr/bin/c++ -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-array-bounds -Wno-stringop-overread -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/./lib  -Wl,--gc-sections tools/llvm-dwarfdump/CMakeFiles/llvm-dwarfdump.dir/SectionSizes.cpp.o tools/llvm-dwarfdump/CMakeFiles/llvm-dwarfdump.dir/Statistics.cpp.o tools/llvm-dwarfdump/CMakeFiles/llvm-dwarfdump.dir/llvm-dwarfdump.cpp.o -o bin/llvm-dwarfdump  -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/lib:"  lib/libLLVMAMDGPUDesc.so.22.0git  lib/libLLVMSPIRVDesc.so.22.0git  lib/libLLVMX86Desc.so.22.0git  lib/libLLVMAMDGPUInfo.so.22.0git  lib/libLLVMSPIRVInfo.so.22.0git  lib/libLLVMX86Info.so.22.0git  lib/libLLVMDebugInfoDWARF.so.22.0git  lib/libLLVMObject.so.22.0git  lib/libLLVMMC.so.22.0git  lib/libLLVMDebugInfoDWARFLowLevel.so.22.0git  lib/libLLVMTargetParser.so.22.0git  lib/libLLVMSupport.so.22.0git  -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/lib && :
/usr/bin/ld: tools/llvm-dwarfdump/CMakeFiles/llvm-dwarfdump.dir/llvm-dwarfdump.cpp.o: undefined reference to symbol '_ZN4llvm5dwarf6getTagENS_9StringRefE'
/usr/bin/ld: /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/./lib/libLLVMBinaryFormat.so.22.0git: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status

```
-MG is supposed to suppress "file not found" diagnostics and instead
treat those as generated files for purposes of dependency scanning.
Clang was previously emitting the diagnostic instead of emitting the
name of the embedded file.

Fixes #165632
…ent (#163816)

This was left out of the original patch (#142392) to simplify the
initial implementation. However, after refactoring the SVE
prologue/epilogue code in #162253, it's not much of an extension to
support this case.

The main change here is when restoring the SP from the FP for the SVE
restores, we may need an additional frame offset to move from the start
of the ZPR callee-saves to the start of the PPR callee-saves.

This patch also fixes a previously latent bug where we'd add the
`RealignmentPadding` when allocating the PPR locals, then again for the
ZPR locals. This was unnecessary as the stack only needs to be realigned
after all SVE allocations.
Add a customizable `visitBlockTransfer` method to dense forward and
backward dataflow analyses, allowing subclasses to customize lattice
propagation behavior along control flow edges between blocks. Default
implementation preserves existing join/meet semantics.

This change mirrors the exiting structure of both dense dataflow
classes, where `RegionBranchOpInterface` and callables are allowed to be
customized by subclasses.

The use case motivating this change is dense liveness analysis.
Currently, without the customization hook the block transfer function
produces incorrect results. The issue is the current logic doesn't
remove the successor block arguments from the live set, as it only meets
the successor state with the predecessor state (ie. set union).
With this change is now possible to compute the correct result by
specifying the correct logic in `visitBlockTransfer`.

Signed-off-by: Fabian Mora <[email protected]>
The verifier of `xegpu.{load/store/prefetch}_nd` op fails if `offset` a
mix of static and dynamic values, e.g. `offset = [0, %c0]`. In this case
the length of dynamic offsets is 1 and the check `offsetSize !=
tDescRank` (=2) fails. Instead, we should check the length of
`getMixedOffsets()`.
…5791)

Update MMA tests to add run line for `cpu=future` to ensure MMA
functionality is not broken with the new `wacc` register classes
introduced. Previous commit have added def for using the new `wacc`
registers, this just add in testing and fixes a few patterns that was
missing .
…kernel_attributes` (#165891)

This adds BE support for the
[`SPV_INTEL_kernel_attributes`](https://github.khronos.org/SPIRV-Registry/extensions/INTEL/SPV_INTEL_kernel_attributes.html)
extension. The extension is necessary to encode the rather useful
`max_work_group_size` kernel attribute, via `OpExecutionMode
MaxWorkgroupSizeINTEL`, which is the only Execution Mode added by the
extension that this patch adds full processing for. Future patches will
add the other Execution Modes and Capabilities. The test is adapted from
the equivalent Translator test; it depends on #165815.
…165606)

`clang -x hip foo.c --offload-arch=amdgcnspirv --offload-new-driver
-save-temps` was crashing with the following error:

```
/usr/bin/ld: input file 'foo-x86_64-unknown-linux-gnu.o' is the same as output file
build/bin/clang-linker-wrapper: error: 'ld' failed
```

The `LinkerWrapperJobAction` [is
created](https://github.com/llvm/llvm-project/blob/957598f71bd8baa029d886e59ed9aed60e6e9bb9/clang/lib/Driver/Driver.cpp#L4888)
with `types::TY_Object` which makes `Driver::GetNamedOutputPath` assign
the same name as the assembler's output and thus causing the crash.
Replace the current `ADRRelaxationPass` with `AArch64RelaxationPass`,
which, besides the existing ADR relaxation, will also run LDR relaxation
that for now only handles these two forms of LDR instructions:
`ldr Xt, [label]` and `ldr Wt, [label]`.
According to discussion of
#153600 (comment)

add LLVM_ABI to function getMemcmp declaration
NDD ADD is only supported on 64 bit, but `LEA32` has
`Requires<[Not64BitMode]>`. The reason it doesnt fail upstream is that
the predicates check is commented out on `X86MCInstLower.cpp`:

```
  // FIXME: Enable feature predicate checks once all the test pass.
  // X86_MC::verifyInstructionPredicates(MI->getOpcode(),
  //                                     Subtarget->getFeatureBits());

```

Introduced by: #158254
Emit empty line after a namespace scope is opened and before its closed.
Adjust DirectiveEmitter code empty line emission in response to this to
avoid lot of unit test changes.
This patch moves llvm::to_address to STLForwardCompat.h, a collection
of backports from C++20 and beyond.
In C++17, static constexpr members are implicitly inline, so they no
longer require an out-of-line definition.

Once we remove the redundant declarations, Minidump.cpp becomes
effectively empty, so this patch removes the file.

Identified with readability-redundant-declaration.
This patch replaces:

  using Foo = enum { A, B, C };

with the more conventional:

  enum Foo { A, B, C };

These two enum declaration styles are not identical, but their
difference does not matter in these .cpp files.  With the "using Foo"
style, the enum is unnamed and cannot be forward-declared, whereas the
conventional style creates a named enum that can be.  Since these
changes are confined to .cpp files, this distinction has no practical
impact here.
Some followups after #131687 switched to the "runtimes build".

- The `check-sanitizer` build target doesn't exist in the runtimes build; use `check-runtimes` instead.
- ASan is not supported on 32-bit windows. Pass `-DCOMPILER_RT_BUILD_SANITIZERS=OFF`
- `check-runtimes` includes the orcjit tests, which never passed on windows; build with `-DCOMPILER_RT_BUILD_ORC=OFF`
- Various asan and libfuzzer tests fail; suppress them with `LIT_FILTER_OUT`
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.