[IR] Drop poison-generating return attributes when necessary (#89138)
Rename has/dropPoisonGeneratingFlagsOrMetadata to
has/dropPoisonGeneratingAnnotations and make it also handle
nonnull, align and range return attributes on calls, similar
to the existing handling for !nonnull, !align and !range metadata.
[libclc] Allow building with pre-built tools
Building the libclc project in-tree with debug tools can be very slow.
This commit adds an option for a user to specify a dierctory from which
to import (e.g., release-built) tools. All tools required by the project
must be imported from the same location, for simplicity.
Original patch downstream authored by @jchlanda.
[Docs] Fix FAQ and Lexicon links under design overview (#89027)
This patch updates the FAQ and lexicon links under design overview to
actually work instead of being incomplete and thus completely missing
from the output.
[mlir][gpu] Improve `gpu.shuffle` documentation. NFC. (#89168)
* Make the wording around lanes / threads / work items more consistent.
* Add examples for all shufle modes.
* Also clean up `gpu.subgroup_reduce`.
[RISCV] Use vnclip for scalable vector saturating truncation. (#88648)
Similar to #75145, but for scalable vectors.
Specifically, this patch works for the below optimization case:
## Source Code
```
define void @trunc_sat_i8i16_maxmin(ptr %x, ptr %y) {
%1 = load <vscale x 4 x i16>, ptr %x, align 16
%2 = tail call <vscale x 4 x i16> @llvm.smax.v4i16(<vscale x 4 x i16> %1, <vscale x 4 x i16> splat (i16 -128))
%3 = tail call <vscale x 4 x i16> @llvm.smin.v4i16(<vscale x 4 x i16> %2, <vscale x 4 x i16> splat (i16 127))
%4 = trunc <vscale x 4 x i16> %3 to <vscale x 4 x i8>
store <vscale x 4 x i8> %4, ptr %y, align 8
ret void
}
```
## Before this patch
[Compiler Explorer](https://godbolt.org/z/EKc9eGvo8)
[21 lines not shown]
[libc++][NFC] Centralize test for support of == and != in ranges (#78481)
Previously, tests for whether comparison using == was supported by
iterators derived from ranges adaptors was spread throughout the testing
codebase. This PR centralizes the implementation of those tests.
[libc++][NFC] Add additional tests for begin/end of std::ranges::take_view (#79085)
Add additional tests for `begin`/`end` of `std::ranges::take_view`.
In partial fulfillment of #72406.
[RISCV] Speed up RISCVRegisterInfo::needsFrameBaseReg when frame pointer isn't used. NFC (#89163)
The callee saved size is only used if there is a frame pointer. Sink the
code onto the frame pointer only path.
[C++20] [Modules] Avoid writing untouched DeclUpdates from GMF in
Reduced BMI
Mitigate https://github.com/llvm/llvm-project/issues/61447
The root cause of the above problem is that when we write a declaration,
we need to lookup all the redeclarations in the imported modules. Then
it will be pretty slow if there are too many redeclarations in different
modules. This patch doesn't solve the porblem.
What the patchs mitigated is, when we writing a named module, we shouldn't
write the declarations from GMF if it is unreferenced **in current
module unit**. The difference here is that, if the declaration is used
in the imported modules, we used to emit it as an update. But we
definitely want to avoid that after this patch.
For that reproducer in
https://github.com/llvm/llvm-project/issues/61447, it used to take 2.5s
to compile and now it only takes 0.49s to compile, which is a big win.
[clang-format] Revert breaking stream operators to previous default (#89016)
Reverts commit d68826dfbd98, which changes the previous default behavior
of always breaking before a stream insertion operator `<<` if both
operands are string literals.
Also reverts the related commits 27f547968cce and bf05be5b87fc.
See the discussion in #88483.
[PowerPC] `ANDI_rec_1_*` should define CR0 (#89034)
These pseudo instructions finally copy the result to CR0 so they should
define `CR0` in their definitions.