forked from llvm/llvm-project
    
        
        - 
                Notifications
    
You must be signed in to change notification settings  - Fork 2
 
[pull] main from llvm:main #5651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
          
     Open
      
      
            pull
  wants to merge
  484
  commits into
  Ericsson:main
  
    
      
        
          
  
    
      Choose a base branch
      
     
    
      
        
      
      
        
          
          
        
        
          
            
              
              
              
  
           
        
        
          
            
              
              
           
        
       
     
  
        
          
            
          
            
          
        
       
    
      
from
llvm:main
  
      
      
   
  
    
  
  
  
 
  
      
    base: main
Could not load branches
            
              
  
    Branch not found: {{ refName }}
  
            
                
      Loading
              
            Could not load tags
            
            
              Nothing to show
            
              
  
            
                
      Loading
              
            Are you sure you want to change the base?
            Some commits from the old base branch may be removed from the timeline,
            and old review comments may become outdated.
          
          
                
     Open
            
            
          
      
        
          +90,340
        
        
          −25,155
        
        
          
        
      
    
  
Conversation
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
    …is_within_lifetime`) (#165243) <https://wg21.link/P2641R4> Implements the C++26 function in `<type_traits>` [meta.const.eval] (and the corresponding feature test macro `__cpp_lib_is_within_lifetime`) ```c++ template<class T> consteval bool is_within_lifetime(const T*) noexcept; ``` This is done with the `__builtin_is_within_lifetime` builtin added to Clang 20 by #91895 / 2a07509. This is not (currently) available with GCC. This implementation has provisions for LWG4138 <https://cplusplus.github.io/LWG/issue4138> where it is ill-formed to instantiate `is_within_lifetime<T>` with a function type `T`. Closes #105381 Co-authored-by: Mital Ashok <[email protected]>
…#163021) At the moment, the effectivness of guards that contain divisibility information (A % B == 0 ) depends on the order of the conditions. This patch makes using divisibility information independent of the order, by collecting and applying the divisibility information separately. We first collect all conditions in a vector, then collect the divisibility information from all guards. When processing other guards, we apply divisibility info collected earlier. After all guards have been processed, we add the divisibility info, rewriting the existing rewrite. This ensures we apply the divisibility info to the largest rewrite expression. This helps to improve results in a few cases, one in dtcxzyw/llvm-opt-benchmark#2921 and another one in a different large C/C++ based IR corpus. PR: #163021
Suggest the initializer_list overload instead. For more context, see #163117.
Identified with modernize-use-equals-default.
In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. This patch removes the unnecessary "const" from the in-class declaration as well. Identified with readability-redundant-declaration.
Identified with readability-redundant-typename.
Previously this only checked for OpBuilder usage, but it could also be invoked via pointer. Also change how call range is calculated to avoid false overlaps which limits rewriting builder calls inside arguments of builder calls. --------- Co-authored-by: EugeneZelenko <[email protected]> Co-authored-by: Victor Chernyakin <[email protected]>
This reverts commit 2253413. This caused a bunch of buildbot failures: 1. https://lab.llvm.org/buildbot/#/builders/17/builds/12328 - fixed-shadow.c uses parantheses, almost seems flaky. 2. https://lab.llvm.org/buildbot/#/builders/186/builds/13651 - log-path_test.cpp is unresolved on some builders. 3. https://lab.llvm.org/buildbot/#/builders/64/builds/6289 - unset is not found 4. https://lab.llvm.org/buildbot/#/builders/42/builds/6809 - ulimit issues 5. https://lab.llvm.org/buildbot/#/builders/169/builds/16628 - flaky JIT failures that are now a lot more frequent.
Tested with `man lld/docs/ld.lld.1`
Macos sincos stret was not well covered, and 64-bit wasn't checked for vector sincos cases.
BranchOnCount and BranchOnCond do not read memory, but cannot be moved. Mark them as having side-effects, but not reading/writing memory, which more accurately models that above. This allows removing some special checking for branches both in the current code and future patches.
In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. Identified with readability-redundant-declaration.
In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. Identified with readability-redundant-declaration.
Identified with readability-redundant-declaration.
Identified with readability-redundant-typename.
This was probably left over from one of my debugging sessions. Remove it to clean up the code slightly.
Also add a corresponding intrinsic property that can be used to mark intrinsics that do not introduce poison, for example simple arithmetic intrinsics that propagate poison just like a simple arithmetic instruction. As a smoke test this patch adds the new property to llvm.amdgcn.fmul.legacy.
To follow-up on a post-commit review.
Fails with: ``` RUN: at line 4 split-file C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp executed command: split-file 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp' note: command had no output on stdout or stderr RUN: at line 6 c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe --target=specify-a-target-or-use-a-_host-substitution --target=aarch64-pc-windows-msvc -fmodules-cache-path=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/module-cache-clang\lldb-shell C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/main.cpp -g -o C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/cpp.out executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe' --target=specify-a-target-or-use-a-_host-substitution --target=aarch64-pc-windows-msvc '-fmodules-cache-path=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/module-cache-clang\lldb-shell' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/main.cpp' -g -o 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/cpp.out' .---command stderr------------ | clang: warning: argument unused during compilation: '-fmodules-cache-path=C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/module-cache-clang\lldb-shell' [-Wunused-command-line-argument] `----------------------------- RUN: at line 7 c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\lldb.exe --no-lldbinit -S C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb\test\Shell\lit-lldb-init-quiet -b -s C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/commands.input C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/cpp.out | c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\filecheck.exe C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\lldb.exe' --no-lldbinit -S 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb\test\Shell\lit-lldb-init-quiet' -b -s 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/commands.input' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\test\Shell\Recognizer\Output\registration-unique.test.tmp/cpp.out' note: command had no output on stdout or stderr executed command: 'c:\users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\filecheck.exe' 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test' .---command stderr------------ | C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test:45:10: error: CHECK: expected string not found in input | # CHECK: Assert StackFrame Recognizer | ^ | <stdin>:20:38: note: scanning from here | 1: Verbose Trap StackFrame Recognizer, demangled symbol regex ^__clang_trap_msg | ^ | <stdin>:34:10: note: possible intended match here | 3: Verbose Trap StackFrame Recognizer, demangled symbol regex ^__clang_trap_msg | ^ | | Input file: <stdin> | Check file: C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\Shell\Recognizer\registration-unique.test | | -dump-input=help explains the following input dump. ```
…t if all extracts would lead to UB on poison. (#164683) This change aims to avoid inserting a freeze instruction between the load and bitcast when scalarizing extend-extract. This is particularly useful in combination with #164682, which can then potentially further scalarize, provided there is no freeze. alive2 proof: https://alive2.llvm.org/ce/z/W-GD88
AMDGCN flavoured SPIR-V has slightly different defaults from what the BE adopts: it assumes all extensions are enabled, and expects nonsemantic debug info to be generated. Furthermore, it is necessary to encode in the resulting SPIR-V binary that what was generated was AMDGCN flavoured, which we do by setting the Generator Version to `UINT16_MAX` (which matches what we expect to see at reverse translation). We will register this generator version at <https://github.com/KhronosGroup/SPIRV-Headers>. This is a preliminary patch out of a series of patches that are needed for adopting the BE for AMDGCN flavoured SPIR-V generation.
…ansfer_write to new offsets syntax (#162095) Changes the `VectorToXeGPU` pass to generate `xegpu.load_nd/store_nd` ops using new syntax with where offsets are specified at the load/store ops level. ```mlir // from this %desc = xegpu.create_nd_tdesc %src[%off1, %off2]: memref<8x16xf16> -> !xegpu.tensor_desc<8x16xf16> %res = xegpu.load_nd %desc : !xegpu.tensor_desc<8x16xf16> -> vector<8x16xf16> // to this %desc = xegpu.create_nd_tdesc %src: memref<8x16xf16> -> !xegpu.tensor_desc<8x16xf16> %res = xegpu.load_nd %desc[%off1, %off2] : !xegpu.tensor_desc<8x16xf16> -> vector<8x16xf16> ``` In order to support cases with dimension reduction at the `create_nd_tdesc` level (e.g. `memref<8x8x16xf16> -> tensor_desc<8x16xf16>` it was decided to insert a memref.subview that collapses the source shape to 2d, for example: ```mlir // input: %0 = vector.load %source[%off0, %off1, %off2] : memref<8x16x32xf32>, vector<8x16xf32> // --vector-to-xegpu (old) %tdesc = xegpu.create_nd_tdesc %source[%off0, %off1, %off2] : memref<8x16x32xf32> -> tdesc<8x32xf32> %vec = xegpu.load_nd %tdesc // --vector-to-xegpu (new) %collapsed = memref.subview %source[%off0, 0, 0] [1, 16, 32] [1, 1, 1] : memref<8x16x32xf32> -> memref<16x32xf32, strided<[32, 1], offset: ?>> %tdesc = xegpu.create_nd_tdesc %collapsed : memref<16x32xf32, ...> -> tdesc<8x32xf32> %vec = xegpu.load_nd %tdesc[%off1, %off2] ``` <details><summary>Why we need to change that?</summary> ```mlir // reduce dim and apply all 3 offsets at load_nd %desc = xegpu.create_nd_tdesc %source : memref<8x16x32xf32> -> !xegpu.tensor_desc<16x32xf32> // error: xegpu.load_nd len(offsets) != desc.rank %res = xegpu.load_nd %desc[%off, %off, %off] : !xegpu.tensor_desc<16x32xf32> -> vector<8x16xf32> ``` </details> --------- Signed-off-by: dchigarev <[email protected]>
…n of conditions (#165748) In simplifycfg/cvp/sccp, we eliminate dead edges of switches according to the knownbits/range info of conditions. However, these approximations may not meet the real-world needs when the domain of condition values is sparse. For example, if the condition can only be either -3 or 3, we cannot prove that the condition never evaluates to 1 (knownbits: ???????1, range: [-3, 4)). This patch adds a helper function `collectPossibleValues` to enumerate all the possible values of V. To fix the motivating issue, `eliminateDeadSwitchCases` will use the result to remove dead edges. Note: In https://discourse.llvm.org/t/missed-optimization-due-to-overflow-check/88700 I proposed a new value lattice kind to represent such values. But I find it hard to apply because the transition becomes much complicated. Compile-time impact looks neutral: https://llvm-compile-time-tracker.com/compare.php?from=32d6b2139a6c8f79e074e8c6cfe0cc9e79c4c0c8&to=e47c26e3f1bf9eb062684dda4fafce58438e994b&stat=instructions:u This patch removes many dead error-handling codes: dtcxzyw/llvm-opt-benchmark#3012 Closes #165179.
In #165720 we started using a DWARF API (`llvm::dwarf::getTag`) from `BinaryFormat`. This patch makes dwarfdump link against the necessary LLVM component. This fixes following linker error that started occurring on some of the bots: ``` [7758/8172] Linking CXX executable bin/llvm-dwarfdump FAILED: bin/llvm-dwarfdump : && /usr/bin/c++ -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-array-bounds -Wno-stringop-overread -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/./lib -Wl,--gc-sections tools/llvm-dwarfdump/CMakeFiles/llvm-dwarfdump.dir/SectionSizes.cpp.o tools/llvm-dwarfdump/CMakeFiles/llvm-dwarfdump.dir/Statistics.cpp.o tools/llvm-dwarfdump/CMakeFiles/llvm-dwarfdump.dir/llvm-dwarfdump.cpp.o -o bin/llvm-dwarfdump -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/lib:" lib/libLLVMAMDGPUDesc.so.22.0git lib/libLLVMSPIRVDesc.so.22.0git lib/libLLVMX86Desc.so.22.0git lib/libLLVMAMDGPUInfo.so.22.0git lib/libLLVMSPIRVInfo.so.22.0git lib/libLLVMX86Info.so.22.0git lib/libLLVMDebugInfoDWARF.so.22.0git lib/libLLVMObject.so.22.0git lib/libLLVMMC.so.22.0git lib/libLLVMDebugInfoDWARFLowLevel.so.22.0git lib/libLLVMTargetParser.so.22.0git lib/libLLVMSupport.so.22.0git -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/lib && : /usr/bin/ld: tools/llvm-dwarfdump/CMakeFiles/llvm-dwarfdump.dir/llvm-dwarfdump.cpp.o: undefined reference to symbol '_ZN4llvm5dwarf6getTagENS_9StringRefE' /usr/bin/ld: /home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/./lib/libLLVMBinaryFormat.so.22.0git: error adding symbols: DSO missing from command line collect2: error: ld returned 1 exit status ```
-MG is supposed to suppress "file not found" diagnostics and instead treat those as generated files for purposes of dependency scanning. Clang was previously emitting the diagnostic instead of emitting the name of the embedded file. Fixes #165632
…RMW store chain AND its stored value (#166366)
…ent (#163816) This was left out of the original patch (#142392) to simplify the initial implementation. However, after refactoring the SVE prologue/epilogue code in #162253, it's not much of an extension to support this case. The main change here is when restoring the SP from the FP for the SVE restores, we may need an additional frame offset to move from the start of the ZPR callee-saves to the start of the PPR callee-saves. This patch also fixes a previously latent bug where we'd add the `RealignmentPadding` when allocating the PPR locals, then again for the ZPR locals. This was unnecessary as the stack only needs to be realigned after all SVE allocations.
Add a customizable `visitBlockTransfer` method to dense forward and backward dataflow analyses, allowing subclasses to customize lattice propagation behavior along control flow edges between blocks. Default implementation preserves existing join/meet semantics. This change mirrors the exiting structure of both dense dataflow classes, where `RegionBranchOpInterface` and callables are allowed to be customized by subclasses. The use case motivating this change is dense liveness analysis. Currently, without the customization hook the block transfer function produces incorrect results. The issue is the current logic doesn't remove the successor block arguments from the live set, as it only meets the successor state with the predecessor state (ie. set union). With this change is now possible to compute the correct result by specifying the correct logic in `visitBlockTransfer`. Signed-off-by: Fabian Mora <[email protected]>
The verifier of `xegpu.{load/store/prefetch}_nd` op fails if `offset` a
mix of static and dynamic values, e.g. `offset = [0, %c0]`. In this case
the length of dynamic offsets is 1 and the check `offsetSize !=
tDescRank` (=2) fails. Instead, we should check the length of
`getMixedOffsets()`.
    …5791) Update MMA tests to add run line for `cpu=future` to ensure MMA functionality is not broken with the new `wacc` register classes introduced. Previous commit have added def for using the new `wacc` registers, this just add in testing and fixes a few patterns that was missing .
…kernel_attributes` (#165891) This adds BE support for the [`SPV_INTEL_kernel_attributes`](https://github.khronos.org/SPIRV-Registry/extensions/INTEL/SPV_INTEL_kernel_attributes.html) extension. The extension is necessary to encode the rather useful `max_work_group_size` kernel attribute, via `OpExecutionMode MaxWorkgroupSizeINTEL`, which is the only Execution Mode added by the extension that this patch adds full processing for. Future patches will add the other Execution Modes and Capabilities. The test is adapted from the equivalent Translator test; it depends on #165815.
…165606) `clang -x hip foo.c --offload-arch=amdgcnspirv --offload-new-driver -save-temps` was crashing with the following error: ``` /usr/bin/ld: input file 'foo-x86_64-unknown-linux-gnu.o' is the same as output file build/bin/clang-linker-wrapper: error: 'ld' failed ``` The `LinkerWrapperJobAction` [is created](https://github.com/llvm/llvm-project/blob/957598f71bd8baa029d886e59ed9aed60e6e9bb9/clang/lib/Driver/Driver.cpp#L4888) with `types::TY_Object` which makes `Driver::GetNamedOutputPath` assign the same name as the assembler's output and thus causing the crash.
Replace the current `ADRRelaxationPass` with `AArch64RelaxationPass`, which, besides the existing ADR relaxation, will also run LDR relaxation that for now only handles these two forms of LDR instructions: `ldr Xt, [label]` and `ldr Wt, [label]`.
According to discussion of #153600 (comment) add LLVM_ABI to function getMemcmp declaration
NDD ADD is only supported on 64 bit, but `LEA32` has `Requires<[Not64BitMode]>`. The reason it doesnt fail upstream is that the predicates check is commented out on `X86MCInstLower.cpp`: ``` // FIXME: Enable feature predicate checks once all the test pass. // X86_MC::verifyInstructionPredicates(MI->getOpcode(), // Subtarget->getFeatureBits()); ``` Introduced by: #158254
Emit empty line after a namespace scope is opened and before its closed. Adjust DirectiveEmitter code empty line emission in response to this to avoid lot of unit test changes.
This patch moves llvm::to_address to STLForwardCompat.h, a collection of backports from C++20 and beyond.
In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. Once we remove the redundant declarations, Minidump.cpp becomes effectively empty, so this patch removes the file. Identified with readability-redundant-declaration.
This patch replaces:
  using Foo = enum { A, B, C };
with the more conventional:
  enum Foo { A, B, C };
These two enum declaration styles are not identical, but their
difference does not matter in these .cpp files.  With the "using Foo"
style, the enum is unnamed and cannot be forward-declared, whereas the
conventional style creates a named enum that can be.  Since these
changes are confined to .cpp files, this distinction has no practical
impact here.
    Some followups after #131687 switched to the "runtimes build". - The `check-sanitizer` build target doesn't exist in the runtimes build; use `check-runtimes` instead. - ASan is not supported on 32-bit windows. Pass `-DCOMPILER_RT_BUILD_SANITIZERS=OFF` - `check-runtimes` includes the orcjit tests, which never passed on windows; build with `-DCOMPILER_RT_BUILD_ORC=OFF` - Various asan and libfuzzer tests fail; suppress them with `LIT_FILTER_OUT`
  
      Sign up for free
      to subscribe to this conversation on GitHub.
      Already have an account?
      Sign in.
  
      
  Add this suggestion to a batch that can be applied as a single commit.
  This suggestion is invalid because no changes were made to the code.
  Suggestions cannot be applied while the pull request is closed.
  Suggestions cannot be applied while viewing a subset of changes.
  Only one suggestion per line can be applied in a batch.
  Add this suggestion to a batch that can be applied as a single commit.
  Applying suggestions on deleted lines is not supported.
  You must change the existing code in this line in order to create a valid suggestion.
  Outdated suggestions cannot be applied.
  This suggestion has been applied or marked resolved.
  Suggestions cannot be applied from pending reviews.
  Suggestions cannot be applied on multi-line comments.
  Suggestions cannot be applied while the pull request is queued to merge.
  Suggestion cannot be applied right now. Please check back later.
  
    
  
    
See Commits and Changes for more details.
Created by
 pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )