- 28 Jul, 2020 3 commits
-
-
Simon Pilgrim authored
An initial backend patch towards fixing the various poor HADD combines (PR34724, PR41813, PR45747 etc.). This extends isHorizontalBinOp to check if we have per-element horizontal ops (odd+even element pairs), but not in the expected serial order - in which case we build a "post shuffle mask" that we can apply to the HOP result, assuming we have fast-hops/optsize etc. The next step will be to extend the SHUFFLE(HOP(X,Y)) combines as suggested on PR41813 - accepting more post-shuffle masks even on slow-hop targets if we can fold it into another shuffle. Differential Revision: https://reviews.llvm.org/D83789 (cherry picked from commit 18211177)
-
Simon Pilgrim authored
(cherry picked from commit bfc4294e)
-
Craig Topper authored
[X86] Detect if EFLAGs is live across XBEGIN pseudo instruction. Add it as livein to the basic blocks created when expanding the pseudo XBEGIN causes several based blocks to be inserted. If flags are live across it we need to make eflags live in the new basic blocks to avoid machine verifier errors. Fixes PR46827 Reviewed By: ivanbaev Differential Revision: https://reviews.llvm.org/D84479 (cherry picked from commit 647e861e)
-
- 27 Jul, 2020 17 commits
-
-
Hans Wennborg authored
-
David Green authored
As shown in D82998, the basic-aa-recphi option can cause miscompiles for gep's with negative constants. The option checks for recursive phi, that recurse through a contant gep. If it finds one, it performs aliasing calculations using the other phi operands with an unknown size, to specify that an unknown number of elements after the initial value are potentially accessed. This works fine expect where the constant is negative, as the size is still considered to be positive. So this patch expands the check to make sure that the constant is also positive. Differential Revision: https://reviews.llvm.org/D83576 (cherry picked from commit 311fafd2)
-
David Green authored
(cherry picked from commit 30fa5766)
-
Martin Storsjö authored
For a weak symbol func in a comdat, the actual leader symbol ends up named like .weak.func.default*. Likewise, for stdcall on i386, the symbol may be named _func@4, while the section suffix only is "func", which the previous implementation didn't handle. This fixes unwinding through weak functions when using -ffunction-sections in mingw environments. Differential Revision: https://reviews.llvm.org/D84607 (cherry picked from commit 343ffa70)
-
Martin Storsjö authored
Previously, the test could pass with one part of c3b1d730 removed. (cherry picked from commit 8dc82039)
-
Roman Lebedev authored
SplitBlockPredecessors() can not split blocks that have such terminators, and in two other places we already ensure that we don't end up calling SplitBlockPredecessors() on such blocks. Do so in one more place. Fixes https://bugs.llvm.org/show_bug.cgi?id=46857 (cherry picked from commit 1da98345)
-
Nemanja Ivanovic authored
I mixed up the precedence of operators in the assert and thought I had it right since there was no compiler warning. This just adds the parentheses in the expression as needed. (cherry picked from commit cdead4f8)
-
Nemanja Ivanovic authored
Unfortunately this is another regression from my canonicalization patch (1fed1316). The patch contained two implicit assumptions: 1. That we would have a permuted load only if we are loading a partial vector 2. That a partial vector load would necessarily be as wide as the splat However, assumption 2 is not correct since it is possible to do a wider load and only splat a half of it. This patch corrects this assumption by simply checking if the load is permuted and adjusting the offset if it is. (cherry picked from commit 7d076e19)
-
Craig Topper authored
[LegalizeTypes] Teach DAGTypeLegalizer::GenWidenVectorLoads to pad with undef if needed when concatenating small or loads to match a larger load In the included test case the align 16 allowed the v23f32 load to handled as load v16f32, load v4f32, and load v4f32(one element not used). These loads all need to be concatenated together into a final vector. In this case we tried to concatenate the two v4f32 loads to match the type of the v16f32 load so we could do a second concat_vectors, but those loads alone only add up to v8f32. So we need to two v4f32 undefs to pad it. It appears we've tried to hack around a similar issue in this code before by adding undef padding to loads in one of the earlier loops in this function. Originally in r147964 by padding all loads narrower than previous loads to the same size. Later modifed to only the last load in r293088. This patch removes that earlier code and just handles it on demand where we know we need it. Fixes PR46820 Differential Revision: https://reviews.llvm.org/D84463 (cherry picked from commit 8131e190)
-
Alexey Bataev authored
Summary: If the variable is constrcutible, its copy is created by calling a constructor. Such variables are duplicated and thus, must be captured. Reviewers: jdoerfert Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D83909 (cherry picked from commit 9840208d)
-
Martin Storsjö authored
This fixes PR 42837. Differential Revision: https://reviews.llvm.org/D84465 (cherry picked from commit 4d09ed95)
-
Martin Storsjö authored
For comdats (e.g. caused by -ffunction-sections), Section is already set here; make sure it's null, for the weak external symbol to be undefined. This fixes PR46779. Differential Revision: https://reviews.llvm.org/D84507 (cherry picked from commit 9e81d8bb)
-
lewis-revill authored
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the ternary subset (zbt subextension) of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79875 (cherry picked from commit c9c955ad)
-
lewis-revill authored
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the single-bit subset (zbs subextension) of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79874 (cherry picked from commit d4be3337)
-
lewis-revill authored
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions belonging to both the permutation and the base subsets of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79873 (cherry picked from commit 6144f0a1)
-
lewis-revill authored
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the permutation subset (zbp subextension) of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79871 (cherry picked from commit 31b52b43)
-
lewis-revill authored
This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single block patterns of instructions to specific bit-manip instructions from the base subset (zbb subextension) of the experimental B extension of RISC-V. It adds also the correspondent codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISCV: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79870 (cherry picked from commit e2692f0e)
-
- 24 Jul, 2020 2 commits
-
-
David Goldman authored
Summary: We need to detect when certain TypoExprs are not being transformed due to invalid trees, otherwise we risk endlessly trying to fix it. Reviewers: rsmith Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D84067 (cherry picked from commit dde98c82)
-
Tobias Hieta authored
Differential Revision: https://reviews.llvm.org/D81385 (cherry picked from commit a41af6e4)
-
- 23 Jul, 2020 7 commits
-
-
Simon Pilgrim authored
getTargetShuffleMask is used by the various "SimplifyDemanded" folds so we can't assume that the bypassed extract_subvector can be safely simplified - getFauxShuffleMask performs a more general decode that allows us to more safely catch many of these cases so the impact is minimal. (cherry picked from commit 5b5dc244)
-
Nikita Popov authored
Fixes https://bugs.llvm.org/show_bug.cgi?id=46680. Just like insertions through IRBuilder, InsertNewInstBefore() should be using the deferred worklist mechanism, so that processing of newly added instructions is prioritized. There's one side-effect of the worklist order change which could be classified as a regression. An add op gets pushed through a select that at the time is not a umax. We could add a reverse transform that tries to push adds in the reverse direction to restore a min/max, but that seems like a sure way of getting infinite loops... Seems like something that should best wait on min/max intrinsics. Differential Revision: https://reviews.llvm.org/D84109 (cherry picked from commit d12ec0f7)
-
Nikita Popov authored
(cherry picked from commit 13ae440d)
-
Hans Wennborg authored
since it's failing.
-
Max Kazantsev authored
This assert was added to verify assumption that GEP's SCEV will be of pointer type, basing on fact that it should be a SCEVAddExpr with (at least) last operand being pointer. Two notes: - GEP's SCEV does not have to be a SCEVAddExpr after all simplifications; - In current state, GEP's SCEV does not have to have at least one pointer operands (all of them can become int during the transforms). However, we might want to be at a point where it is true. We are currently removing this assert and will try to enumerate the cases where "is pointer" notion might be lost during the transforms. When all of them are fixed, we can return it. Differential Revision: https://reviews.llvm.org/D84294 Reviewed By: lebedev.ri (cherry picked from commit b96114c1)
-
Luboš Luňák authored
Using -fmodules-* options for PCHs is a bit confusing, so add -fpch-* variants. Having extra options also makes it simple to do a configure check for the feature. Also document the options in the release notes. Differential Revision: https://reviews.llvm.org/D83623 (cherry picked from commit 54eea612)
-
Luboš Luňák authored
This way should be the same like with a.pcm for modules. An alternative way is 'clang++ -c empty.cpp -include-pch a.pch -o a.o -Xclang -building-pch-with-obj', which is what clang-cl's /Yc does internally. Differential Revision: https://reviews.llvm.org/D83716 (cherry picked from commit 3895466e)
-
- 22 Jul, 2020 4 commits
-
-
Kai Luo authored
Current powerpc backend generates wrong code sequence if stack pointer has to realign if `-fstack-clash-protection` enabled. When probing dynamic stack allocation, current `PREPARE_PROBED_ALLOCA` takes `NegSizeReg` as input and returns `FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated correctly, however code following `PREPARE_PROBED_ALLOCA` still uses value of `NegSizeReg`, which does not contain `ActualNegSize` if `MaxAlign > TargetAlign`, to calculate loop trip count and residual number of bytes. This patch is part of fix of https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84152 (cherry picked from commit c3f9697f)
-
Kai Luo authored
Current powerpc backend generates wrong code sequence if stack pointer has to realign if -fstack-clash-protection enabled. When probing in prologue, backend should generate a subtraction instruction rather than a `stux` instruction to realign the stack pointer. This patch is part of fix of https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84218 (cherry picked from commit 89122522)
-
Sylvain Audi authored
The "undefined symbol" error message from lld-link displays up to 3 references to that symbol, and the number of extra references not shown. This patch removes the computation of the strings for those extra references. It fixes a freeze of lld-link we accidentally encountered when activating asan on a large project, without linking with the asan library. In that case, __asan_report_load8 was referenced more than 2 million times, causing the computation of that many display strings, of which only 3 were used. Differential Revision: https://reviews.llvm.org/D83510 (cherry picked from commit 3a108ab2)
-
- 21 Jul, 2020 4 commits
-
-
Fangrui Song authored
(cherry picked from commit aa830e97)
-
Fangrui Song authored
This matches LLD and fixes https://sourceware.org/bugzilla/show_bug.cgi?id=26262#c1 .o is a bad choice for save-temps output because it is easy to override the bitcode file (*.o) ``` # Use bfd for the example, -fuse-ld=gold is similar. clang -flto -c a.c # generate bitcode file a.o clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps # override a.o # The user repeats the command but get surprised, because a.o is now a combined module. clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps ``` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D84132 (cherry picked from commit 55fa315b)
-
Jameson Nash authored
The getAllOnesValue can only handle things that are bitcast from a ConstantInt, while here we bitcast through a pointer, so we may see more complex objects (like Array or Struct). Differential Revision: https://reviews.llvm.org/D83870 (cherry picked from commit 8b354cc8)
-
Martin Storsjö authored
Differential Revision: https://reviews.llvm.org/D84070 (cherry picked from commit f07ddbc9)
-
- 20 Jul, 2020 3 commits
-
-
Hans Wennborg authored
The test fails in 32-bit Windows builds for unclear reasons: ld.lld: error: failed to open C:\src\llvm_package_1100-rc1\build32_stage0\tools\lld\test\ELF\Output\arm-exidx-range.s.tmp: The parameter is incorrect. (cherry picked from commit 8a197e0b)
-
Eric Astor authored
Summary: Remove unused function Reviewed By: lbenes Differential Revision: https://reviews.llvm.org/D83898 (cherry picked from commit 47a3b85a)
-
Craig Topper authored
This matches GNU assembler behavior. Operand size is determined only from the destination register. (cherry picked from commit 71b49aa4)
-