- 09 Dec, 2014 15 commits
-
-
Hal Finkel authored
TM.getSubtargetImpl()->getRegisterInfo() needs to be TM.getRegisterInfo() in 3.5. llvm-svn: 223749
-
Hal Finkel authored
```--------------------------------------------------------------------- r223708 | hfinkel | 2014-12-08 22:54:22 +0000 (Mon, 08 Dec 2014) | 13 lines [PowerPC] Don't use a non-allocatable register to implement the 'cc' alias GCC accepts 'cc' as an alias for 'cr0', and we need to do the same when processing inline asm constraints. This had previously been implemented using a non-allocatable register, named 'cc', that was listed as an alias of 'cr0', but the infrastructure does not seem to support this properly (neither the register allocator nor the scheduler properly accounts for the alias). Instead, we can just process this as a naming alias inside of the inline asm constraint-processing code, so we'll do that instead. There are two regression tests, one where the post-RA scheduler did the wrong thing with the non-allocatable alias, and one where the register allocator did the wrong thing. Fixes PR21742. ``` --------------------------------------------------------------------- llvm-svn: 223748
-
Hal Finkel authored
```--------------------------------------------------------------------- r223328 | hfinkel | 2014-12-04 00:46:20 +0000 (Thu, 04 Dec 2014) | 8 lines [PowerPC] 'cc' should be an alias only to 'cr0' We had mistakenly believed that GCC's 'cc' referred to the entire condition-code register (cr0 through cr7) -- and implemented this in r205630 to fix PR19326, but 'cc' is actually an alias only to 'cr0'. This is causing LLVM to clobber too much with legacy code with inline asm using the 'cc' clobber. Fixes PR21451. ``` --------------------------------------------------------------------- llvm-svn: 223747
-
Hal Finkel authored
```--------------------------------------------------------------------- r223318 | hfinkel | 2014-12-03 23:40:13 +0000 (Wed, 03 Dec 2014) | 12 lines [PowerPC] Fix inline asm memory operands not to use r0 On PowerPC, inline asm memory operands might be expanded as 0($r), where $r is a register containing the address. As a result, this register cannot be r0, and we need to enforce this register subclass constraint to prevent miscompiling the code (we'd get this constraint for free with the usual instruction definitions, but that scheme has no knowledge of how we end up printing inline asm memory operands, and so here we need to do it 'by hand'). We can accomplish this within the current address-mode selection framework by introducing an explicit COPY_TO_REGCLASS node. Fixes PR21443. ``` --------------------------------------------------------------------- llvm-svn: 223746
-
Hal Finkel authored
```--------------------------------------------------------------------- r223220 | hfinkel | 2014-12-03 09:37:50 +0000 (Wed, 03 Dec 2014) | 23 lines [PowerPC] Print all inline-asm consts as signed numbers Almost all immediates in PowerPC assembly (both 32-bit and 64-bit) are signed numbers, and it is important that we print them as such. To make sure that happens, we change PPCTargetLowering::LowerAsmOperandForConstraint so that it does all intermediate checks on a signed-extended int64_t value, and then creates the resulting target constant using MVT::i64. This will ensure that all negative values are printed as negative values (mirroring what is done in other backends to achieve the same sign-extension effect). This came up in the context of inline assembly like this: "add%I2 %0,%0,%2", ..., "Ir"(-1ll) where we used to print: addi 3,3,4294967295 and gcc would print: addi 3,3,-1 and gas accepts both forms, but our builtin assembler (correctly) does not. Now we print -1 like gcc does. While here, I replaced a bunch of custom integer checks with isInt<16> and friends from MathExtras.h. Thanks to Paul Hargrove for the bug report. ``` --------------------------------------------------------------------- llvm-svn: 223745
-
Hal Finkel authored
```--------------------------------------------------------------------- r222996 | foad | 2014-12-01 09:42:32 +0000 (Mon, 01 Dec 2014) | 19 lines [PowerPC] Fix unwind info with dynamic stack realignment Summary: PowerPC DWARF unwind info defined CFA as SP + offset even in a function where the stack had been dynamically realigned. This clearly doesn't work because the offset from SP to CFA is not a constant. Fix it by defining CFA as BP instead. This was causing the AddressSanitizer null_deref test to fail 50% of the time, depending on whether SP happened to be 32-byte aligned on entry to a particular function or not. Reviewers: willschm, uweigand, hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6410 ``` --------------------------------------------------------------------- llvm-svn: 223744
-
Hal Finkel authored
```--------------------------------------------------------------------- r222672 | uweigand | 2014-11-24 18:09:47 +0000 (Mon, 24 Nov 2014) | 10 lines [PowerPC] Fix PR 21652 - copy st_other bits on symbol assignment When processing an assignment in the integrated assembler that sets a symbol to the value of another symbol, we need to copy the st_other bits that encode the local entry point offset. Modeled after MipsTargetELFStreamer::emitAssignment handling of the ELF::STO_MIPS_MICROMIPS flag. ``` --------------------------------------------------------------------- llvm-svn: 223743
-
Hal Finkel authored
```--------------------------------------------------------------------- r221703 | wschmidt | 2014-11-11 20:44:09 +0000 (Tue, 11 Nov 2014) | 48 lines [PowerPC] Replace foul hackery with real calls to __tls_get_addr My original support for the general dynamic and local dynamic TLS models contained some fairly obtuse hacks to generate calls to __tls_get_addr when lowering a TargetGlobalAddress. Rather than generating real calls, special GET_TLS_ADDR nodes were used to wrap the calls and only reveal them at assembly time. I attempted to provide correct parameter and return values by chaining CopyToReg and CopyFromReg nodes onto the GET_TLS_ADDR nodes, but this was also not fully correct. Problems were seen with two back-to-back stores to TLS variables, where the call sequences ended up overlapping with unhappy results. Additionally, since these weren't real calls, the proper register side effects of a call were not recorded, so clobbered values were kept live across the calls. The proper thing to do is to lower these into calls in the first place. This is relatively straightforward; see the changes to PPCTargetLowering::LowerGlobalTLSAddress() in PPCISelLowering.cpp. The changes here are standard call lowering, except that we need to track the fact that these calls will require a relocation. This is done by adding a machine operand flag of MO_TLSLD or MO_TLSGD to the TargetGlobalAddress operand that appears earlier in the sequence. The calls to LowerCallTo() eventually find their way to LowerCall_64SVR4() or LowerCall_32SVR4(), which call FinishCall(), which calls PrepareCall(). In PrepareCall(), we detect the calls to __tls_get_addr and immediately snag the TargetGlobalTLSAddress with the annotated relocation information. This becomes an extra operand on the call following the callee, which is expected for nodes of type tlscall. We change the call opcode to CALL_TLS for this case. Back in FinishCall(), we change it again to CALL_NOP_TLS for 64-bit only, since we require a TOC-restore nop following the call for the 64-bit ABIs. During selection, patterns in PPCInstrInfo.td and PPCInstr64Bit.td convert the CALL_TLS nodes into BL_TLS nodes, and convert the CALL_NOP_TLS nodes into BL8_NOP_TLS nodes. This replaces the code removed from PPCAsmPrinter.cpp, as the BL_TLS or BL8_NOP_TLS nodes can now be emitted normally using their patterns and the associated printTLSCall print method. Finally, as a result of these changes, all references to get-tls-addr in its various guises are no longer used, so they have been removed. There are existing TLS tests to verify the changes haven't messed anything up). I've added one new test that verifies that the problem with the original code has been fixed. ``` --------------------------------------------------------------------- llvm-svn: 223742
-
Hal Finkel authored
```--------------------------------------------------------------------- r220959 | uweigand | 2014-10-31 10:33:14 +0000 (Fri, 31 Oct 2014) | 13 lines [PowerPC] Load BlockAddress values from the TOC in 64-bit SVR4 code Since block address values can be larger than 2GB in 64-bit code, they cannot be loaded simply using an @l / @ha pair, but instead must be loaded from the TOC, just like GlobalAddress, ConstantPool, and JumpTable values are. The commit also fixes a bug in PPCLinuxAsmPrinter::doFinalization where temporary labels could not be used as TOC values, since code would attempt (and fail) to use GetOrCreateSymbol to create a symbol of the same name as the temporary label. ``` --------------------------------------------------------------------- llvm-svn: 223741
-
Hal Finkel authored
```--------------------------------------------------------------------- r219441 | sfantao | 2014-10-09 20:42:56 +0000 (Thu, 09 Oct 2014) | 3 lines Fix bug in GPR to FPR moves in PPC64LE. The current implementation of GPR->FPR register moves uses a stack slot. This mechanism writes a double word and reads a word. In big-endian the load address must be displaced by 4-bytes in order to get the right value. In little endian this is no longer required. This patch fixes the issue and adds LE regression tests to fast-isel-conversion which currently expose this problem. ``` --------------------------------------------------------------------- llvm-svn: 223740
-
Hal Finkel authored
```--------------------------------------------------------------------- r217993 | sfantao | 2014-09-17 23:25:06 +0000 (Wed, 17 Sep 2014) | 5 lines Fix FastISel bug in boolean returns for PowerPC. For PPC targets, FastISel does not take the sign extension information into account when selecting return instructions whose operands are constants. A consequence of this is that the return of boolean values is not correct. This patch fixes the problem by evaluating the sign extension information also for constants, forwarding this information to PPCMaterializeInt which takes this information to drive the sign extension during the materialization. ``` --------------------------------------------------------------------- llvm-svn: 223739
-
Hal Finkel authored
```--------------------------------------------------------------------- r216917 | samsonov | 2014-09-02 17:38:34 +0000 (Tue, 02 Sep 2014) | 4 lines Fix signed integer overflow in PPCInstPrinter. This bug was reported by UBSan. ``` --------------------------------------------------------------------- llvm-svn: 223738
-
Hal Finkel authored
```--------------------------------------------------------------------- r214517 | uweigand | 2014-08-01 14:35:58 +0000 (Fri, 01 Aug 2014) | 8 lines [PowerPC] PR20280 - Slots for byval parameters are not immutable Found by inspection while looking at PR20280: code would mark slots in the parameter save area where a byval parameter is passed as "immutable". This is not correct since code is allowed to modify byval parameters in place in the parameter save area. ``` --------------------------------------------------------------------- llvm-svn: 223736
-
Hal Finkel authored
```--------------------------------------------------------------------- r215793 | hfinkel | 2014-08-16 00:16:29 +0000 (Sat, 16 Aug 2014) | 9 lines [PowerPC] Darwin byval arguments are not immutable On PPC/Darwin, byval arguments occur at fixed stack offsets in the callee's frame, but are not immutable -- the pointer value is directly available to the higher-level code as the address of the argument, and the value of the byval argument can be modified at the IR level. This is necessary, but not sufficient, to fix PR20280. When PR20280 is fixed in a follow-up commit, its test case will cover this change. ``` --------------------------------------------------------------------- llvm-svn: 223735
-
Hal Finkel authored
```--------------------------------------------------------------------- r213960 | hfinkel | 2014-07-25 17:47:22 +0000 (Fri, 25 Jul 2014) | 3 lines [PowerPC] Support TLS on PPC32/ELF Patch by Justin Hibbits! ``` --------------------------------------------------------------------- llvm-svn: 223734
-
- 08 Dec, 2014 18 commits
-
-
Tom Stellard authored
```--------------------------------------------------------------------- r221832 | richard-llvm | 2014-11-12 18:38:38 -0500 (Wed, 12 Nov 2014) | 5 lines PR19372: Keep checking template arguments after we see an argument pack expansion into a parameter pack; we know that we're still filling in that parameter's arguments. Previously, if we hit this case for an alias template, we'd try to substitute using non-canonical template arguments. ``` --------------------------------------------------------------------- llvm-svn: 223719
-
Tom Stellard authored
```--------------------------------------------------------------------- r221748 | richard-llvm | 2014-11-11 20:43:45 -0500 (Tue, 11 Nov 2014) | 4 lines PR21536: Fix a corner case where we'd get confused by a pack expanding into the penultimate parameter of a template parameter list, where the last parameter is itself a pack, and build a bogus empty final pack argument. ``` --------------------------------------------------------------------- llvm-svn: 223718
-
Duncan P. N. Exon Smith authored
```--------------------------------------------------------------------- r223500 | dexonsmith | 2014-12-05 11:13:42 -0800 (Fri, 05 Dec 2014) | 9 lines BFI: Saturate when combining edges to a successor When a loop gets bundled up, its outgoing edges are quite large, and can just barely overflow 64-bits. If one successor has multiple incoming edges -- and that successor is getting all the incoming mass -- combining just its edges can overflow. Handle that by saturating rather than asserting. This fixes PR21622. ``` --------------------------------------------------------------------- llvm-svn: 223716
-
Duncan P. N. Exon Smith authored
llvm-svn: 223713
-
Duncan P. N. Exon Smith authored
```--------------------------------------------------------------------- r223500 | dexonsmith | 2014-12-05 11:13:42 -0800 (Fri, 05 Dec 2014) | 9 lines BFI: Saturate when combining edges to a successor When a loop gets bundled up, its outgoing edges are quite large, and can just barely overflow 64-bits. If one successor has multiple incoming edges -- and that successor is getting all the incoming mass -- combining just its edges can overflow. Handle that by saturating rather than asserting. This fixes PR21622. ``` --------------------------------------------------------------------- llvm-svn: 223712
-
David Majnemer authored
```--------------------------------------------------------------------- r217104 | majnemer | 2014-09-03 16:20:58 -0700 (Wed, 03 Sep 2014) | 1 line Adjust test to handle fallout from r217102. ``` --------------------------------------------------------------------- llvm-svn: 223703
-
David Majnemer authored
```--------------------------------------------------------------------- r221318 | majnemer | 2014-11-04 15:49:08 -0800 (Tue, 04 Nov 2014) | 10 lines Analysis: Make isSafeToSpeculativelyExecute fire less for divides Divides and remainder operations do not behave like other operations when they are given poison: they turn into undefined behavior. It's really hard to know if the operands going into a div are or are not poison. Because of this, we should only choose to speculate if there are constant operands which we can easily reason about. This fixes PR21412. ``` --------------------------------------------------------------------- llvm-svn: 223647
-
David Majnemer authored
llvm-svn: 223646
-
David Majnemer authored
```--------------------------------------------------------------------- r215818 | majnemer | 2014-08-16 02:23:42 -0700 (Sat, 16 Aug 2014) | 12 lines InstCombine: Fix a potential bug in 0 - (X sdiv C) -> (X sdiv -C) While *most* (X sdiv 1) operations will get caught by InstSimplify, it is still possible for a sdiv to appear in the worklist which hasn't been simplified yet. This means that it is possible for 0 - (X sdiv 1) to get transformed into (X sdiv -1); dividing by -1 can make the transform produce undef values instead of the proper result. Sorry for the lack of testcase, it's a bit problematic because it relies on the exact order of operations in the worklist. ``` --------------------------------------------------------------------- llvm-svn: 223645
-
David Majnemer authored
```--------------------------------------------------------------------- r214385 | majnemer | 2014-07-30 21:49:29 -0700 (Wed, 30 Jul 2014) | 9 lines InstCombine: Correctly propagate NSW/NUW for x-(-A) -> x+A We can only propagate the nsw bits if both subtraction instructions are marked with the appropriate bit. N.B. We only propagate the nsw bit in InstCombine because the nuw case is already handled in InstSimplify. This fixes PR20189. ``` --------------------------------------------------------------------- llvm-svn: 223644
-
David Majnemer authored
llvm-svn: 223643
-
David Majnemer authored
```--------------------------------------------------------------------- r222500 | majnemer | 2014-11-20 18:37:38 -0800 (Thu, 20 Nov 2014) | 1 line This Reassociate change unintentionally slipped in r222499 ``` --------------------------------------------------------------------- llvm-svn: 223640
-
David Majnemer authored
```--------------------------------------------------------------------- r216891 | majnemer | 2014-09-01 14:20:14 -0700 (Mon, 01 Sep 2014) | 12 lines SROA: Don't insert instructions before a PHI SROA may decide that it needs to insert a bitcast and would set it's insertion point before a PHI. This will create an invalid module right quick. Instead, choose the first insertion point in the basic block that holds our PHI. This fixes PR20822. Differential Revision: http://reviews.llvm.org/D5141 ``` --------------------------------------------------------------------- llvm-svn: 223639
-
David Majnemer authored
```--------------------------------------------------------------------- r217115 | majnemer | 2014-09-03 17:23:13 -0700 (Wed, 03 Sep 2014) | 3 lines IndVarSimplify: Address review comments for r217102 No functional change intended, just some cleanups and comments added. ``` --------------------------------------------------------------------- llvm-svn: 223638
-
David Majnemer authored
```--------------------------------------------------------------------- r217102 | majnemer | 2014-09-03 16:03:18 -0700 (Wed, 03 Sep 2014) | 11 lines IndVarSimplify: Don't let LFTR compare against a poison value LinearFunctionTestReplace tries to use the *next* indvar to compare against when possible. However, it may be the case that the calculation for the next indvar has NUW/NSW flags and that it may only be safely used inside the loop. Using it in a comparison to calculate the exit condition could result in observing poison. This fixes PR20680. Differential Revision: http://reviews.llvm.org/D5174 ``` --------------------------------------------------------------------- llvm-svn: 223637
-
David Majnemer authored
```--------------------------------------------------------------------- r221501 | majnemer | 2014-11-06 16:31:14 -0800 (Thu, 06 Nov 2014) | 7 lines LoopVectorize: Don't assume pointees are sized A pointer's pointee might not be sized: the pointee could be a function. Report this as IK_NoInduction when calculating isInductionVariable. This fixes PR21508. ``` --------------------------------------------------------------------- llvm-svn: 223636
-
David Majnemer authored
```--------------------------------------------------------------------- r222376 | majnemer | 2014-11-19 11:36:18 -0800 (Wed, 19 Nov 2014) | 3 lines AliasSet: Simplify mergeSetIn No functional change intended. ``` --------------------------------------------------------------------- llvm-svn: 223635
-
David Majnemer authored
```--------------------------------------------------------------------- r222338 | majnemer | 2014-11-19 01:41:05 -0800 (Wed, 19 Nov 2014) | 16 lines AliasSetTracker: UnknownInsts should contribute to the refcount AliasSetTracker::addUnknown may create an AliasSet devoid of pointers just to contain an instruction if no suitable AliasSet already exists. It will then AliasSet::addUnknownInst and we will be done. However, it's possible for addUnknown to choose an existing AliasSet to addUnknownInst. If this were to occur, we are in a bit of a pickle: removing pointers from the AliasSet can cause the entire AliasSet to become destroyed, taking our unknown instructions out with them. Instead, keep track whether or not our AliasSet has any unknown instructions. This fixes PR21582. ``` --------------------------------------------------------------------- llvm-svn: 223627
-
- 06 Dec, 2014 2 commits
-
-
Daniel Sanders authored
```--------------------------------------------------------------------- r215169 | ghoflehner | 2014-08-08 00:19:55 +0100 (Fri, 08 Aug 2014) | 2 lines Fix for multi-line comment warning ``` --------------------------------------------------------------------- llvm-svn: 223582
-
Daniel Sanders authored
```--------------------------------------------------------------------- r220360 | filcab | 2014-10-22 03:16:06 +0100 (Wed, 22 Oct 2014) | 5 lines Silence gcc's -Wcomment gcc's (4.7, I think) -Wcomment warning is not "as smart" as clang's and warns even if the line right after the backslash-newline sequence only has a line comment that starts at the beginning of the line. ``` --------------------------------------------------------------------- llvm-svn: 223581
-
- 05 Dec, 2014 5 commits
-
-
Michael Zolotukhin authored
PR21302. Vectorize only bottom-tested loops. rdar://problem/18886083 llvm-svn: 223535
-
Michael Zolotukhin authored
Apply loop-rotate to several vectorizer tests. Such loops shouldn't be vectorized due to the loops form. After applying loop-rotate (+simplifycfg) the tests again start to check what they are intended to check. llvm-svn: 223534
-
Michael Zolotukhin authored
Correctly update dom-tree after loop vectorizer. llvm-svn: 223531
-
Daniel Sanders authored
llvm-svn: 223459
-
Daniel Sanders authored
```--------------------------------------------------------------------- r223148 | dsanders | 2014-12-02 20:40:27 +0000 (Tue, 02 Dec 2014) | 17 lines [mips] Fix passing of small structures for big-endian O32. Summary: Like N32/N64, they must be passed in the upper bits of the register. The new code could be merged with the existing if-statements but I've refrained from doing this since it will make porting the O32 implementation to tablegen harder later. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6463 ``` --------------------------------------------------------------------- llvm-svn: 223457
-