forked from llvm/llvm-project
[pull] main from llvm:main #5634
Open: pull wants to merge 735 commits into Ericsson:main from llvm:main
+104,417
−59,191
This PR refactors alignment validation in MLIR's MemRef and SPIRV dialects:
- Use `IntValidAlignment` for consistent type safety across MemRef and SPIRV dialects
- Eliminate duplicate validation logic in `MemRefOps.cpp`
- Adjust error messages in `invalid.mlir` to match improved validation

This is the first of two PRs addressing issue #155677.
Summary: Currently we have this `__tgt_device_image` indirection which just takes a reference to some pointers. This was all fine and good when the only usage of this was from a section of GPU code that came from an ELF constant section. However, we have expanded beyond that and now need to worry about managing lifetimes. We have code that references the image even after it was loaded internally. This patch changes the implementation to instead copy the memory buffer and manage it locally. This PR reworks the JIT and other image handling to directly manage its own memory. We now don't need to duplicate this behavior externally at the Offload API level. Also we actually free these if the user unloads them. Upside, less likely to crash and burn. Downside, more latency when loading an image.
Extension of #158152 for MLIR. --------- Signed-off-by: Sarnie, Nick <[email protected]>
…rn a string (NFC) (#159089) These functions will see more uses in a future patch. This also resolves a FIXME.
…onDAG (#155256) Based on a comment in #153600, add a helper function isTailCall for getting libcall in SelectionDAG.
Fixes an issue in commit 3946c50, PR #135349. The DebugSSAUpdater class performs raw pointer allocations. It frees these properly in reset(), but does not do so in its destructor - as an immediate fix, this patch adds a destructor which frees the allocations correctly. I'll be merging this immediately to fix the issue, but will be open to post-commit review and/or producing a better fix in a follow-up commit.
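The kind of leak described above can be illustrated with a minimal sketch. The `Node` and `Updater` types below are hypothetical stand-ins, not the real `DebugSSAUpdater` code; the sketch only assumes a class that owns raw-pointer allocations and already frees them in `reset()`, so the immediate fix is a destructor that reuses it.

```cpp
#include <vector>

// Hypothetical stand-in for a heap-allocated helper object.
// The counter tracks how many instances are currently alive.
struct Node {
  static int &live() {
    static int Count = 0;
    return Count;
  }
  Node() { ++live(); }
  ~Node() { --live(); }
};

// Hypothetical owner of raw-pointer allocations.
class Updater {
  std::vector<Node *> Nodes;

public:
  Updater() = default;
  // Copying would lead to double frees, so forbid it.
  Updater(const Updater &) = delete;
  Updater &operator=(const Updater &) = delete;

  // The fix: free the allocations on destruction as well.
  ~Updater() { reset(); }

  Node *create() {
    Nodes.push_back(new Node());
    return Nodes.back();
  }

  // Frees every allocation and leaves the object reusable.
  void reset() {
    for (Node *N : Nodes)
      delete N;
    Nodes.clear();
  }
};
```

A follow-up along the lines the message hints at could store `std::unique_ptr<Node>` instead, which would make the explicit destructor unnecessary.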
The S_NOP instruction has an immediate operand which is one less than the number of cycles to delay for. The maximum value that may be encoded in this field was increased in GFX8 and again in GFX12.
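As a rough illustration of the "one less than the number of cycles" encoding, the sketch below splits a requested delay into a sequence of immediates. The function name `encodeNops` and the `MaxImm` parameter are assumptions made for this example; the real per-generation limits are not stated here and are passed in by the caller.

```cpp
#include <algorithm>
#include <vector>

// Split a delay of `Cycles` cycles into S_NOP-style immediates.
// Each immediate I encodes a delay of I + 1 cycles, and no immediate
// may exceed MaxImm (a target-dependent limit supplied by the caller).
std::vector<unsigned> encodeNops(unsigned Cycles, unsigned MaxImm) {
  std::vector<unsigned> Imms;
  while (Cycles > 0) {
    // The longest delay one instruction can express is MaxImm + 1 cycles.
    unsigned Chunk = std::min(Cycles, MaxImm + 1);
    Imms.push_back(Chunk - 1);
    Cycles -= Chunk;
  }
  return Imms;
}
```

With an illustrative limit of 15, a 20-cycle delay becomes two instructions with immediates 15 and 3 (16 + 4 cycles); a larger limit lets the same delay fit in one instruction.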
…58026) IR has the `contract` to indicate that contraction is allowed. Testing shouldn't rely on global flag to perform contraction. This is a prerequisite before making backends rely only on the IR to perform contraction. See more here: https://discourse.llvm.org/t/allowfpopfusion-vs-sdnodeflags-hasallowcontract/80909/5
…oROCDLOps.cpp (NFC)
…lf (#158832) gpu.subgroup_mma_elementwise supports mulf op type. Add conversion for it.
) After replacing VGPR MFMAs with the AGPR form, we've alleviated VGPR pressure which may have triggered spills during allocation. Identify these spill slots, and try to reassign them to newly freed VGPRs, and replace the spill instructions with copies. Fixes #154260
This PR is a part of the effort to make the VFS used in the compiler more explicit and consistent. Instead of creating the VFS deep within the compiler (in `CompilerInstance::createFileManager()`), clients are now required to explicitly call `CompilerInstance::createVirtualFileSystem()` and provide the base VFS from the outside. This PR also helps in breaking up the dependency cycle where creating a properly configured `DiagnosticsEngine` requires a properly configured VFS, but creating a properly configured VFS requires the `DiagnosticsEngine`. Both `CompilerInstance::create{FileManager,Diagnostics}()` now just use the VFS already in `CompilerInstance` instead of taking one as a parameter, making the VFS consistent across the instance sub-object.
This will be used to build hexagon-builtins for baremetal. Signed-off-by: Kushal Pal <[email protected]>
When the ARRAY has polymorphic type, its element type may not match the element type of BOUNDARY. Fixes #158382.
…), C)) (#155141) Hi, I compared the following LLVM IR with GCC and Clang, and there is a small difference between the two. The LLVM IR is:
```
define i64 @test_smin_neg_one(i64 %a) {
  %1 = tail call i64 @llvm.smin.i64(i64 %a, i64 -1)
  %retval.0 = xor i64 %1, -1
  ret i64 %retval.0
}
```
GCC generates:
```
cmp x0, 0
csinv x0, xzr, x0, ge
ret
```
Clang generates:
```
cmn x0, #1
csinv x8, x0, xzr, lt
mvn x0, x8
ret
```
Clang keeps flipping x0 through x8 unnecessarily. So I added the following folds to DAGCombiner:
fold (xor (smax(x, C), C)) -> select (x > C), xor(x, C), 0
fold (xor (smin(x, C), C)) -> select (x < C), xor(x, C), 0
alive2: https://alive2.llvm.org/ce/z/gffoir
---------
Co-authored-by: Yui5427 <[email protected]>
Co-authored-by: Matt Arsenault <[email protected]>
Co-authored-by: Simon Pilgrim <[email protected]>
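The two folds can be sanity-checked outside the compiler with a small exhaustive test. The sketch below is not DAGCombiner code; it models the operations on 8-bit signed integers (an assumption made to keep the search space small) and compares the original and folded forms for every pair of inputs. All function names are made up for this example.

```cpp
#include <algorithm>
#include <cstdint>

// Original form: xor (smax(x, C), C)
int8_t xorSmaxOriginal(int8_t X, int8_t C) {
  return static_cast<int8_t>(std::max(X, C) ^ C);
}

// Folded form: select (x > C), xor(x, C), 0
int8_t xorSmaxFolded(int8_t X, int8_t C) {
  return X > C ? static_cast<int8_t>(X ^ C) : int8_t(0);
}

// Original form: xor (smin(x, C), C)
int8_t xorSminOriginal(int8_t X, int8_t C) {
  return static_cast<int8_t>(std::min(X, C) ^ C);
}

// Folded form: select (x < C), xor(x, C), 0
int8_t xorSminFolded(int8_t X, int8_t C) {
  return X < C ? static_cast<int8_t>(X ^ C) : int8_t(0);
}

// Returns true if both folds agree with the original expressions
// for every pair of 8-bit signed values.
bool foldsHoldForAllI8() {
  for (int XI = -128; XI <= 127; ++XI) {
    for (int CI = -128; CI <= 127; ++CI) {
      int8_t X = static_cast<int8_t>(XI);
      int8_t C = static_cast<int8_t>(CI);
      if (xorSmaxOriginal(X, C) != xorSmaxFolded(X, C))
        return false;
      if (xorSminOriginal(X, C) != xorSminFolded(X, C))
        return false;
    }
  }
  return true;
}
```

The reasoning behind the fold is visible in the code: when the min/max selects `C`, the xor with `C` is zero, and otherwise it is just `xor(x, C)`.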
…159108) SmallSetVector is too optimistic, there are usually more than 16 unique decoders and predicates. Modernize `typedef` to `using` while here.
Elide bitcast combine to build_vector in case of i64 immediate that can be materialized through 64b mov
Ensure alias analyses mask out `errnomem` location, refining the resulting modref info, when the given access/location does not alias errno. This may occur either when TBAA proves there is no alias with errno (e.g., float TBAA for the same root would be disjoint with the int-only compatible TBAA node for errno); or if the memory access size is larger than the integer size, or when the underlying object is a potentially-escaping alloca. Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
…sdot (#158310) This allows dot products with scalable 8xi16 vectors (and fixed-length vectors which are converted into a scalable vector) accumulating into a 4xi32 vector to lower into a single instruction (`udot`/`sdot`), rather than a sequence of `umlalb`s and `umlalt`s.
There are a number of attributes that are expected to be set on functions by default. This patch implements setting more such attributes on the FMV resolver functions generated by Clang. On AArch64, this makes the resolver functions use the default PAC and BTI hardening settings.
…gs. NFC. (#159327) This avoids the following kind of warning with GCC: warning: control reaches end of non-void function [-Wreturn-type]
#159337) This fixes the following warning when compiled with GCC: ../lib/Target/AArch64/AArch64ISelLowering.cpp: In function ‘bool shouldLowerTailCallStackArg(const llvm::MachineFunction&, const llvm::CCValAssign&, llvm::SDValue, llvm::ISD::ArgFlagsTy, int)’: ../lib/Target/AArch64/AArch64ISelLowering.cpp:9310: warning: comparison of integer expressions of different signedness: ‘uint64_t’ {aka ‘long unsigned int’} and ‘int64_t’ {aka ‘long int’} [-Wsign-compare] 9310 | if (SizeInBits / 8 != MFI.getObjectSize(FI)) |
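For readers unfamiliar with `-Wsign-compare`, the sketch below shows one common way to compare an unsigned and a signed 64-bit value without triggering it. This is an illustration only and not necessarily what the patch does; the helper name `sizesDiffer` and its parameters are hypothetical.

```cpp
#include <cstdint>

// Compare a size in bits (unsigned) against an object size in bytes
// (signed) without mixing signedness in the comparison itself.
bool sizesDiffer(uint64_t SizeInBits, int64_t ObjectSizeInBytes) {
  // A negative size can never equal an unsigned quantity.
  if (ObjectSizeInBytes < 0)
    return true;
  // Safe: the value is known to be non-negative at this point.
  return SizeInBits / 8 != static_cast<uint64_t>(ObjectSizeInBytes);
}
```

Handling the negative case explicitly keeps the cast from silently turning a negative size into a huge unsigned value.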
This avoids the following warnings from Clang: ../../lldb/source/Host/windows/Host.cpp:324:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default] 324 | default: | ^ ../../lldb/source/Host/common/File.cpp:662:26: warning: cast from 'const void *' to 'char *' drops const qualifier [-Wcast-qual] 662 | .write((char *)buf, num_bytes); | ^
…ngs. NFC. (#159330) This avoids the following warnings: ../../clang/lib/AST/ExprConstant.cpp: In member function ‘bool {anonymous}::IntExprEvaluator::VisitBuiltinCallExpr(const clang::CallExpr*, unsigned int)’: ../../clang/lib/AST/ExprConstant.cpp:14104:3: warning: this statement may fall through [-Wimplicit-fallthrough=] 14104 | } | ^ ../../clang/lib/AST/ExprConstant.cpp:14105:3: note: here 14105 | case Builtin::BIstrlen: | ^~~~ ../../clang/lib/Driver/ToolChains/CommonArgs.cpp: In function ‘std::string clang::driver::tools::complexRangeKindToStr(clang::LangOptionsBase::ComplexRangeKind)’: ../../clang/lib/Driver/ToolChains/CommonArgs.cpp:3523:1: warning: control reaches end of non-void function [-Wreturn-type] 3523 | } | ^
This avoids the following kind of warning with GCC: ../tools/llvm-lipo/llvm-lipo.cpp: In function ‘void printInfo(llvm::LLVMContext&, llvm::ArrayRef<llvm::object::OwningBinary<llvm::object::Binary> >)’: ../tools/llvm-lipo/llvm-lipo.cpp:464:34: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses] 464 | Binary->isArchive() && "expected MachO binary"); | ~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
…154410) Co-authored-by: Jay Foad <[email protected]> Co-authored-by: Jay Foad <[email protected]>
While debugging #145206 I found that a possible cause for the problem is the call to printf, which is variadic. In a musl environment VarArgs are treated like *non* VarArgs. The handling of this special case was bypassed by the commit a4f8551. The reason is that the argument `TreatAsVarArg` is only set to `true` in a *non*-musl environment. `TreatAsVarArg` determines the result of `isVarArg()`. The `CCIfArgVarArg` class only checks each individual variable, but not whether `isVarArg()` is true. Without the special case, the unnamed arguments are always passed on the stack, as is the standard calling convention. But with musl, unnamed arguments can also be passed in registers. Possibly, this fixes #145206.
…subprogram DIEs (#159104) With this change, construction of abstract subprogram DIEs is split in two stages/functions: creation of DIE (in DwarfCompileUnit::getOrCreateAbstractSubprogramDIE) and its population with children (in DwarfCompileUnit::constructAbstractSubprogramScopeDIE). With that, abstract subprograms can be created/referenced from DwarfDebug::beginModule, which should solve the issue with static local variables DIE creation of inlined functions with optimized-out definitions. It fixes #29985. LexicalScopes class now stores mapping from DISubprograms to their corresponding llvm::Function's. It is supposed to be built before processing of each function (so, now LexicalScopes class has a method for "module initialization" alongside the method for "function initialization"). It is used by DwarfCompileUnit to determine whether a DISubprogram needs an abstract DIE before DwarfDebug::beginFunction is invoked. DwarfCompileUnit::getOrCreateSubprogramDIE method is added, which can create an abstract or a concrete DIE for a subprogram. It accepts llvm::Function* argument to determine whether a concrete DIE must be created. This is a temporary fix for #29985. Ideally, it will be fixed by moving global variables and types emission to DwarfDebug::endModule (https://reviews.llvm.org/D144007, https://reviews.llvm.org/D144005). Some code proposed by Ellis Hoag <[email protected]> in #90523 was taken for this commit.
Recognize an MTE tag fault Mach exception. A tag fault is an error reported by Arm's Memory Tagging Extension (MTE) when a memory access attempts to use a pointer with a tag that doesn't match the tag stored with the memory. LLDB will print the tag and address to make the issue easier to diagnose. This was hand tested by debugging an MTE enabled binary on an iPhone 17 running iOS 26. rdar://113575216
…144744) Fix `llvm::concat_iterator` for the case of `ValueT` being a pointer to a common base class to which the result of dereferencing any iterator in `ItersT` can be cast.
There is no RISCV isel for bitcast between f16 and bf16, which triggers a "cannot select" fatal error. Co-authored-by: Ying Wang <[email protected]>
…#159218) Also turn the method into a static function so it can be used without an instance of the class.
…55431) The current chip guard fails to prevent scaling_extf/truncf patterns from being applied on gfx1100 which does not have scaling support. --------- Signed-off-by: Muzammiluddin Syed <[email protected]>
…C. (#159338) This avoids the following kind of warning when built with GCC: ../../clang/lib/Sema/SemaStmtAttr.cpp: In function ‘clang::Attr* ProcessStmtAttribute(clang::Sema&, clang::Stmt*, const clang::ParsedAttr&, clang::SourceRange)’: ../../clang/lib/Sema/SemaStmtAttr.cpp:677:30: warning: enumerated mismatch in conditional expression: ‘clang::diag::<unnamed enum>’ vs ‘clang::diag::<unnamed enum>’ [-Wenum-compare] 676 | S.Diag(A.getLoc(), A.isRegularKeywordAttribute() | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 677 | ? diag::err_keyword_not_supported_on_targe | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 678 | : diag::warn_unhandled_ms_attribute_ignore ) | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ These enums are non-overlapping, but they are defined in different enum scopes due to how they are generated with tablegen.
Without this patch, we are doing a roundtrip on types. Specifically, if decltype(...) is well formed, std::is_same_v evaluates to a boolean value. We then pass the boolean value to std::enable_if_t, go through the sizeof(char)/sizeof(double) trick, and then come back to a boolean value. This patch simplifies all this by having test() return std::is_same<...>. The "caller" attaches ::value, so effectively we are using std::is_same<...>::value when decltype(...) is well formed, bypassing std::enable_if_t and the sizeof(char)/sizeof(double) trick. If we did not care about the return type of the shift operator, we could use llvm::is_detected, but the return type check doesn't allow us to simplify things that far.
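The simplified scheme can be sketched as a standalone trait. This is not the code from the patch: the trait name `ShiftReturnsStreamRef` and the `Sink` and `NoShift` types are made up for this example. It only shows the idea of having `test()` return `std::is_same<...>` directly, with a fallback overload returning `std::false_type` when `decltype(...)` is ill-formed.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stream-like type with two shift operators:
// one returns the stream itself, the other returns something else.
struct Sink {};
inline Sink &operator<<(Sink &S, int) { return S; }
inline int operator<<(Sink &, double) { return 0; }

// Hypothetical type with no shift operator at all.
struct NoShift {};

template <typename StreamT, typename T> struct ShiftReturnsStreamRef {
  // Chosen when `stream << value` is well formed; the return type
  // already says whether the result is `StreamT &`.
  template <typename S, typename U>
  static std::is_same<
      decltype(std::declval<S &>() << std::declval<const U &>()), S &>
  test(int);

  // Fallback when the shift expression does not compile (SFINAE).
  template <typename, typename> static std::false_type test(...);

  // The "caller" attaches ::value.
  using type = decltype(test<StreamT, T>(0));
  static constexpr bool value = type::value;
};
```

Here `ShiftReturnsStreamRef<Sink, int>::value` is true, while the `double` case (wrong return type) and the `NoShift` case (no operator) are both false, with no `std::enable_if_t` or `sizeof` comparison involved.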
Summary: Turns out the new CUDA ABI now applies retroactively to all the other SMs if you upgrade to CUDA 13.0. This patch changes the scheme, keeping all the SM flags consistent but using an offset. Fixes: #159088
A common idiom is the usage of the PatternMatch match function within a functional algorithm like all_of. Introduce a match functor to shorten this idiom. Co-authored-by: Luke Lau <[email protected]>
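The idiom can be illustrated without LLVM using toy types. Everything below is a hypothetical sketch: `IsEven`, `match`, `MatchFunctor`, and `matchFn` are names invented for this example and are not the PatternMatch API. The sketch only shows how wrapping a pattern in a functor removes the need for a hand-written lambda in algorithms such as `all_of`.

```cpp
#include <algorithm>
#include <vector>

// Toy pattern: matches even integers.
struct IsEven {
  bool match(int V) const { return V % 2 == 0; }
};

// Toy equivalent of a free `match(value, pattern)` function.
template <typename ValT, typename PatternT>
bool match(const ValT &V, const PatternT &P) {
  return P.match(V);
}

// Functor that captures a pattern and applies it to each value.
template <typename PatternT> struct MatchFunctor {
  PatternT P;
  template <typename ValT> bool operator()(const ValT &V) const {
    return match(V, P);
  }
};

// Helper that deduces the pattern type.
template <typename PatternT> MatchFunctor<PatternT> matchFn(PatternT P) {
  return {P};
}

// Before: a lambda that only forwards to match().
inline bool allEvenWithLambda(const std::vector<int> &Vals) {
  return std::all_of(Vals.begin(), Vals.end(),
                     [](int V) { return match(V, IsEven{}); });
}

// After: the functor is passed to the algorithm directly.
inline bool allEvenWithFunctor(const std::vector<int> &Vals) {
  return std::all_of(Vals.begin(), Vals.end(), matchFn(IsEven{}));
}
```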
Created by
pull[bot] (v2.0.0-alpha.3)
Can you help keep this open source service alive? 💖 Please sponsor : )