
  1. 11 Apr, 2014 17 commits
    • Merging r203581: · f909ae3f
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r203581 | hans | 2014-03-11 11:49:24 -0400 (Tue, 11 Mar 2014) | 7 lines
      
      X86: Don't generate 64-bit movd after cmpneqsd in 32-bit mode (PR19059)
      
      This fixes the bug where we would bitcast the 64-bit floating point result
      of cmpneqsd to a 64-bit integer even on 32-bit targets.
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D3009
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206071
    • Merging r196981: · 7c40da76
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r196981 | fang | 2013-12-10 17:51:25 -0500 (Tue, 10 Dec 2013) | 2 lines
      
      darwin asm driver: suppress -Q for -no-integrated-as on darwin<11
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206067
    • Merging r204742: · 1b4f99d6
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r204742 | benny.kra | 2014-03-25 14:02:07 -0400 (Tue, 25 Mar 2014) | 10 lines
      
      Fix a logic error in the clang driver preventing crtfastmath.o from linking when -Ofast is used without -ffast-math

      In gcc using -Ofast forces linking of crtfastmath.o.
      In the current clang crtfastmath.o is only linked when -ffast-math/-funsafe-math-optimizations is passed. This can lead to performance issues when using only -Ofast without an explicit -ffast-math (I ran into this).
      My patch fixes the inconsistency with gcc behaviour and also adds a few tests for it.
      
      Patch by Zinovy Nis!
      
      Differential Revision: http://llvm-reviews.chandlerc.com/D3114
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206066
    • Merging r198937: · 330ff7f9
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r198937 | kristof.beyls | 2014-01-10 08:41:49 -0500 (Fri, 10 Jan 2014) | 2 lines
      
      Make sure -use-init-array has intended effect on all AArch64 ELF targets, not just linux.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206065
    • Merging r198940: · 18959c31
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r198940 | kristof.beyls | 2014-01-10 08:44:34 -0500 (Fri, 10 Jan 2014) | 2 lines
      
      Enable -fuse-init-array for all AArch64 ELF targets by default, not just linux.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206064
    • Merging r202774: · 7ed7738a
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r202774 | reid | 2014-03-03 19:33:17 -0500 (Mon, 03 Mar 2014) | 7 lines
      
      MC: Fix Intel assembly parser for [global + offset]
      
      We were dropping the displacement on the floor if we also had some
      immediate offset.
      
      Should fix PR19033.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206061
    • Merging r203007: · f0092f48
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r203007 | rafael.espindola | 2014-03-05 16:04:41 -0500 (Wed, 05 Mar 2014) | 4 lines
      
      Don't produce an alias between destructors with different calling conventions.
      
      Fixes pr19007.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206059
    • Merging r205144: · 2c2b1619
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r205144 | hfinkel | 2014-03-30 09:00:06 -0400 (Sun, 30 Mar 2014) | 7 lines
      
      [PowerPC] Make -pg generate calls to _mcount not mcount
      
      At least on REL6 (Linux/glibc 2.12), the proper symbol for generating gprof
      data is _mcount, not mcount. Prior to this change, compiling with -pg would
      generate linking errors (because of unresolved references to mcount); after
      this change -pg seems at least minimally functional.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206058
    • Merging r201126: · c26b5e82
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r201126 | craig.topper | 2014-02-10 23:05:33 -0500 (Mon, 10 Feb 2014) | 2 lines
      
      Changed attributes of all gather intrinsics from IntrReadMem to IntrReadArgMem as they access only memory based on argument. Patch by Robert Khasanov.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206057
    • Merging r201507: · 89439d07
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r201507 | craig.topper | 2014-02-17 05:03:43 -0500 (Mon, 17 Feb 2014) | 2 lines
      
      Fix disassembler handling of rex.b when mod=00/01/10 and bbb=101. Mod=00 should ignore the base register entirely. Mod=01/10 should treat this as R13 plus displacement. Fixes PR18860.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206056
    • Merging r205067: · d51319cf
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r205067 | ahatanaka | 2014-03-28 19:28:07 -0400 (Fri, 28 Mar 2014) | 7 lines
      
      [x86] Fix printing of register operands with q modifier.
      
      Emit 32-bit register names instead of 64-bit register names if the target does
      not have 64-bit general purpose registers.
      
      <rdar://problem/14653996>
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206055
    • Merging r200028: · 74311623
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r200028 | benny.kra | 2014-01-24 14:02:37 -0500 (Fri, 24 Jan 2014) | 4 lines
      
      InstCombine: Don't try to use aggregate elements of ConstantExprs.
      
      PR18600.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206054
    • Merging r199351: · 221dfdbe
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r199351 | aschwaighofer | 2014-01-15 23:53:18 -0500 (Wed, 15 Jan 2014) | 5 lines
      
      BasicAA: We need to check both access sizes when comparing a gep and an
      underlying object of unknown size.
      
      Fixes PR18460.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206053
    • Merging r198400: · e2151aa2
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r198400 | aschwaighofer | 2014-01-03 00:47:03 -0500 (Fri, 03 Jan 2014) | 18 lines
      
      BasicAA: Use reachability instead of dominance for checking value equality in phi
      cycles
      
      This allows the value equality check to work even if we don't have a dominator
      tree. Also add some more comments.
      
      I was worried about compile time impacts and did not implement reachability but
      used the dominance check in the initial patch. The trade-off was that the
      dominator tree was required.
      The llvm utility function isPotentiallyReachable cuts off the recursive search
      after 32 visits. Testing did not show any compile time regressions, showing my
      worries were unjustified.
      
      No compile time or performance regressions at O3 -flto -mavx on test-suite +
      externals.
      
      Addresses review comments from r198290.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206052
    • Merging r198290: · 2840a1c0
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r198290 | aschwaighofer | 2014-01-01 22:31:36 -0500 (Wed, 01 Jan 2014) | 23 lines
      
      BasicAA: Fix value equality and phi cycles
      
      When there are cycles in the value graph we have to be careful interpreting
      "Value*" identity as "value" equivalence. We interpret the value of a phi node
      as the value of its operands.
      When we check for value equivalence now we make sure that the "Value*" dominates
      all cycles (phis).
      
      %0 = phi [%noaliasval, %addr2]
      %l = load %ptr
      %addr1 = gep @a, 0, %l
      %addr2 = gep @a, 0, (%l + 1)
      store %ptr ...
      
      Before this patch we would return NoAlias for (%0, %addr1) which is wrong
      because the value of the load is from different iterations of the loop.
      
      Tested on x86_64 -mavx at O3 and O3 -flto with no performance or compile time
      regressions.
      
      PR18068
      radar://15653794
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206051
    • Merging r196970: · 16004dc2
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r196970 | fang | 2013-12-10 16:37:41 -0500 (Tue, 10 Dec 2013) | 3 lines
      
      on darwin<10, fallback to .weak_definition (PPC,X86)
      .weak_def_can_be_hidden was not yet supported by the system assembler
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206050
    • Merging r195971: · aa29e1bd
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r195971 | juergen | 2013-11-29 22:07:16 -0500 (Fri, 29 Nov 2013) | 2 lines
      
      Force CPU type to unbreak unit tests on Haswell machines.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 206049
  2. 09 Apr, 2014 14 commits
    • Merging r200705: · 06809a95
      Tom Stellard authored
      ------------------------------------------------------------------------
      r200705 | hfinkel | 2014-02-03 12:27:25 -0500 (Mon, 03 Feb 2014) | 5 lines
      
      Expand vector bswap in LegalizeVectorOps
      
      ISD::BSWAP was missing from the list of node types that should be expanded
      element-wise.
      
      llvm-svn: 205910
    • Merging r205630: · c499fcb1
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r205630 | hfinkel | 2014-04-04 11:15:57 -0400 (Fri, 04 Apr 2014) | 6 lines
      
      [PowerPC] Add a full condition code register to make the "cc" clobber work
      
      gcc inline asm supports specifying "cc" as a clobber of all condition
      registers. Add just enough modeling of the full register to make this work.
      Fixed PR19326.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205908
    • Merging r204304: · 3bf56a8a
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r204304 | Hao.Liu | 2014-03-20 01:36:59 -0400 (Thu, 20 Mar 2014) | 2 lines
      
      [ARM]Fix an assertion failure in A15SDOptimizer about DPair reg class by treating DPair as QPR.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205904
    • Merging r201841: · ac9f8f51
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r201841 | Kevin.Qin | 2014-02-21 02:45:48 -0500 (Fri, 21 Feb 2014) | 2 lines
      
      [AArch64] Add register constraints to avoid generating STLXR and STXR with unpredictable behavior.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205903
    • Merging r201541: · 2bf16f00
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r201541 | jiangning.liu | 2014-02-17 21:37:42 -0500 (Mon, 17 Feb 2014) | 2 lines
      
      Fix a typo about lowering AArch64 va_copy.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205902
    • Merging r199369: · f9962e40
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r199369 | jiangning.liu | 2014-01-16 04:16:13 -0500 (Thu, 16 Jan 2014) | 2 lines
      
      For ARM, fix assertion failures for some ld/st 3/4 instructions with writeback.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205901
    • Merging r204155: · 4258f9ed
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r204155 | wschmidt | 2014-03-18 10:32:50 -0400 (Tue, 18 Mar 2014) | 16 lines
      
      Fix PR19144: Incorrect offset generated for int-to-fp conversion at -O0.
      
      When converting a signed 32-bit integer to double-precision floating point on
      hardware without a lfiwax instruction, we have to instead use a lfd followed
      by fcfid.  We were erroneously offsetting the address by 4 bytes in
      preparation for either a lfiwax or lfiwzx when generating the lfd.  This fixes
      that silly error.
      
      This was not caught in the test suite since the conversion tests were run with
      -mcpu=pwr7, which implies availability of lfiwax.  I've added another test
      case for older hardware that checks the code we expect in the absence of
      lfiwax and other flavors of fcfid.  There are fewer tests in this test case
      because we punt to DAG selection in more cases on older hardware.  (We must
      generate complex fiddly sequences in those cases, and there is marginal
      benefit in duplicating that logic in fast-isel.)
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205824
    • Merging r203054: · 8315f9b1
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r203054 | hfinkel | 2014-03-05 20:28:23 -0500 (Wed, 05 Mar 2014) | 7 lines
      
      The PPC global base register cannot be r0
      
      The global base register cannot be r0 because it might end up as the first
      argument to addi or addis. Fixes PR18316.
      
      I don't have a small stable test case.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205823
    • Merging r202192: · f5240be8
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r202192 | hfinkel | 2014-02-25 15:51:50 -0500 (Tue, 25 Feb 2014) | 5 lines
      
      Account for 128-bit integer operations in PPCCTRLoops
      
      We need to abort the formation of counter-register-based loops where there are
      128-bit integer operations that might become function calls.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205822
    • Merging r200288: · 2d0bbf7e
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r200288 | hfinkel | 2014-01-28 00:32:58 -0500 (Tue, 28 Jan 2014) | 5 lines
      
      Handle spilling the PPC GPRC_NOR0 register class
      
      GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo
      register). As a result, we also need to check for it in the spilling code.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205821
    • Merging r199832: · 9b9af8e0
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r199832 | rafael.espindola | 2014-01-22 15:20:52 -0500 (Wed, 22 Jan 2014) | 11 lines
      
      Fix pr18515.
      
      My understanding (from reading just the llvm code) is that
      * most ppc cpus have a "sync n" instruction and an msync alias that is
        "sync 0".
      * "book e" cpus instead have an msync instruction and not the more
        general "sync n"

      This patch reflects that in the .td files, allowing a single codepath
      for asm and obj streamers, and incidentally fixes a crash when
      EmitRawText was called on an obj streamer.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205820
    • Merging r199763: · afbacceb
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r199763 | hfinkel | 2014-01-21 15:15:58 -0500 (Tue, 21 Jan 2014) | 9 lines
      
      Fix pointer info on PPC byval stores
      
      For PPC64 SVR (and Darwin), the stores that take byval aggregate parameters
      from registers into the stack frame had MachinePointerInfo objects with
      incorrect offsets. These offsets are relative to the object itself, not to the
      stack frame base.
      
      This fixes self hosting on PPC64 when compiling with -enable-aa-sched-mi.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205819
    • Merging r199570: · 270303dc
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r199570 | aschwaighofer | 2014-01-18 22:18:31 -0500 (Sat, 18 Jan 2014) | 11 lines
      
      LoopVectorizer: A reduction that has multiple uses of the reduction value is not
      a reduction.
      
      Really. Under certain circumstances (the use list of an instruction has to be
      set up right - hence the extra pass in the test case) we would not recognize
      when a value in a potential reduction cycle was used multiple times by the
      reduction cycle.
      
      Fixes PR18526.
      radar://15851149
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205818
    • Merging r198425: · 5e1625b0
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r198425 | dpeixott | 2014-01-03 12:20:01 -0500 (Fri, 03 Jan 2014) | 33 lines
      
      Fix loop rerolling pass failure with non-constant loop lower bound
      
      The loop rerolling pass was failing with an assertion failure from a
      failed cast on loops like this:
      
        void foo(int *A, int *B, int m, int n) {
          for (int i = m; i < n; i+=4) {
            A[i+0] = B[i+0] * 4;
            A[i+1] = B[i+1] * 4;
            A[i+2] = B[i+2] * 4;
            A[i+3] = B[i+3] * 4;
          }
        }
      
      The code was casting the SCEV-expanded code for the new
      induction variable to a phi-node. When the loop had a non-constant
      lower bound, the SCEV expander would end the code expansion with an
      add instead of a phi node and the cast would fail.
      
      It looks like the cast to a phi node was only needed to get the
      induction variable value coming from the backedge to compute the end
      of loop condition. This patch changes the loop reroller to compare
      the induction variable to the number of times the backedge is taken
      instead of the iteration count of the loop. In other words, we stop
      the loop when the current value of the induction variable ==
      IterationCount-1. Previously, the comparison was comparing the
      induction variable value from the next iteration == IterationCount.
      
      This problem only seems to occur on 32-bit targets. For some reason,
      the loop is not rerolled on 64-bit targets.
      
      PR18290
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205817
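The exit-test change described in r198425 above can be sketched in a few lines of Python. This is not LLVM code: the function names are hypothetical, and a counter that starts at zero and steps by one is an assumption made only for illustration. The sketch shows that testing the current induction value against the backedge-taken count (the iteration count minus one) leaves the loop after the same number of trips as testing the next value against the iteration count.

```python
def trips_with_next_value_test(iteration_count):
    """Old-style exit test: compare the NEXT induction value to the iteration count."""
    iv = 0
    trips = 0
    while True:
        trips += 1                 # the loop body runs once per trip
        iv_next = iv + 1           # value that would arrive over the backedge
        if iv_next == iteration_count:
            break
        iv = iv_next
    return trips


def trips_with_current_value_test(iteration_count):
    """New-style exit test: compare the CURRENT induction value to the backedge-taken count."""
    backedge_taken_count = iteration_count - 1
    iv = 0
    trips = 0
    while True:
        trips += 1                 # the loop body runs once per trip
        if iv == backedge_taken_count:
            break
        iv += 1
    return trips


for n in (1, 2, 5):
    print(n, trips_with_next_value_test(n), trips_with_current_value_test(n))
```

Both forms run the body the same number of times for any iteration count of at least one. The second form never needs the value coming from the backedge, which is what the failing phi-node cast was used to obtain.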
  3. 08 Apr, 2014 9 commits
    • Merging r203146: · d0fe0204
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r203146 | reid | 2014-03-06 14:19:12 -0500 (Thu, 06 Mar 2014) | 6 lines
      
      MS asm: The initial dot in struct access is optional
      
      Fixes PR18994.
      
      Tests, once again, in that other repository.  =P
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205814
    • Merging r205738: · dcd1b1bd
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r205738 | atrick | 2014-04-07 17:29:22 -0400 (Mon, 07 Apr 2014) | 16 lines
      
      Put a limit on ScheduleDAGSDNodes::ClusterNeighboringLoads to avoid blowing up compile time.
      
      Fixes PR16365 - Extremely slow compilation in -O1 and -O2.
      
      The SD scheduler has a quadratic implementation of load clustering
      which absolutely blows up compile time for large blocks with constant
      pool loads. The MI scheduler has a better implementation of load
      clustering. However, we have not done the work yet to completely
      eliminate the SD scheduler. Some benchmarks still seem to benefit from
      early load clustering, although maybe by chance.
      
      As an intermediate term fix, I just put a nice limit on the number of
      DAG users to search before finding a match. With this limit there are no
      binary differences in the LLVM test suite, and the PR16365 test case
      does not suffer any compile time impact from this routine.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205808
    • Merging r200202: · 67415166
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r200202 | stpworld | 2014-01-27 04:43:10 -0500 (Mon, 27 Jan 2014) | 2 lines
      
      Additional fix for r200201: because the test depends on bitwidth, it was moved to the X86 directory.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205807
    • Merging r200201: · 6467e67e
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r200201 | stpworld | 2014-01-27 04:18:31 -0500 (Mon, 27 Jan 2014) | 31 lines
      
      Fix for PR18102.
      
      The issue stems from DAGCombiner::MergeConsequtiveStores, more precisely from
      mem-ops sequence sorting.

      Consider how MergeConsequtiveStores works for the following example:
      
      store i8 1, a[0]
      store i8 2, a[1]
      store i8 3, a[1]   ; a[1] again.
      return   ; DAG starts here
      
      1. Method will collect all the 3 stores.
      2. It sorts them by distance from the base pointer (farthest with highest
      index).
      3. It takes first consecutive non-overlapping stores and (if possible) replaces
      them with a single store instruction.
      
      The point is, we can't determine here which 'store' instruction
      would be the second after sorting ('store 2' or 'store 3').
      It happens that 'store 3' would be the second, and 'store 2' would be the third.
      
      So after merging we have the following result:
      
      store i16 (1 | 3 << 8), base   ; is a[0] but bit-casted to i16
      store i8 2, a[1]
      
      So actually we swapped 'store 3' and 'store 2' and got wrong contents in a[1].
      
      Fix: in the sort routine, also take the mem-op sequence number into account.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205806
    • Merging r203725: · 1e5d9382
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r203725 | rafael.espindola | 2014-03-12 18:03:43 -0400 (Wed, 12 Mar 2014) | 2 lines
      
      This test needs the X86 backend, move it to the X86 sub directory.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205798
    • Merging r203719: · 81396a23
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r203719 | mzolotukhin | 2014-03-12 17:31:05 -0400 (Wed, 12 Mar 2014) | 4 lines
      
      PR17473:
      Don't normalize an expression during postinc transformation unless it's
      invertible.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205797
    • Merging r202273: · 335a9ef3
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r202273 | atrick | 2014-02-26 11:31:56 -0500 (Wed, 26 Feb 2014) | 4 lines
      
      Fix PR18165: LSR must avoid scaling factors that exceed the limit on truncated use.
      
      Patch by Michael Zolotukhin!
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205796
    • Merging r201104: · 5da6a378
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r201104 | chandlerc | 2014-02-10 14:39:35 -0500 (Mon, 10 Feb 2014) | 26 lines
      
      [LPM] A terribly simple fix to a terribly complex bug: PR18773.
      
      The crux of the issue is that LCSSA doesn't preserve stateful alias
      analyses. Before r200067, LICM didn't cause LCSSA to run in the LTO pass
      manager, where LICM runs essentially without any of the other loop
      passes. As a consequence the globalmodref-aa pass run before that loop
      pass manager was able to survive the loop pass manager and be used by
      DSE to eliminate stores in the function called from the loop body in
      Adobe-C++/loop_unroll (and similar patterns in other benchmarks).
      
      When LICM was taught to preserve LCSSA it had to require it as well.
      This caused it to be run in the loop pass manager and because it did not
      preserve AA, the stateful AA was lost. Most of LLVM's AA isn't stateful
      and so this didn't manifest in most cases. Also, in most cases LCSSA was
      already running, and so there was no interesting change.
      
      The real kicker is that LCSSA by its definition (injecting PHI nodes
      only) trivially preserves AA! All we need to do is mark it, and then
      everything goes back to working as intended. It probably was blocking
      some other weird cases of stateful AA but the only one I have is
      a 1000-line IR test case from loop_unroll, so I don't really have a good
      test case here.
      
      Hopefully this fixes the regressions on performance that have been seen
      since that revision.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205795
    • Merging r198863: · ff32fb24
      Tom Stellard authored
      ```
      ---------------------------------------------------------------------
      r198863 | stpworld | 2014-01-09 07:26:12 -0500 (Thu, 09 Jan 2014) | 6 lines
      
      Fixed an old typo in ScalarEvolution that caused a wrong SCEV zext operation.
      Detailed description is here:
      http://llvm.org/bugs/show_bug.cgi?id=18000#c16
      
      Special thanks to David Wiberg for participating in the bugfix process.
      ---------------------------------------------------------------------
      ```

      llvm-svn: 205794
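The sorting problem described in the r200201 entry above (PR18102) can be illustrated with a short Python sketch. It is not the DAGCombiner code: the `Store` record, its field names, and the `last_writer` helper are hypothetical, and the backwards collection order is an assumption. Python's `sorted` is stable, so sorting on the offset alone simply keeps whatever order the stores were collected in, which stands in here for the unspecified order the commit message describes.

```python
from typing import NamedTuple


class Store(NamedTuple):
    seq: int     # position in program order (hypothetical field)
    offset: int  # distance from the base pointer
    value: int   # byte being stored


def last_writer(ordered_stores, offset):
    """Return the value left at `offset` if the stores run in the given order."""
    value = None
    for store in ordered_stores:
        if store.offset == offset:
            value = store.value
    return value


# The three stores from the example, collected walking backwards from the
# return (the collection order is an assumption for illustration).
collected = [Store(2, 1, 3), Store(1, 1, 2), Store(0, 0, 1)]

# Sorting on offset alone leaves the order of the two a[1] stores
# dependent on the collection order.
by_offset_only = sorted(collected, key=lambda s: s.offset)

# Adding the sequence number as a tie-breaker restores program order.
by_offset_then_seq = sorted(collected, key=lambda s: (s.offset, s.seq))

print(last_writer(by_offset_only, 1))      # prints 2: 'store 2' wrongly ends up last
print(last_writer(by_offset_then_seq, 1))  # prints 3: matches the original program
```

With the tie-breaker, stores to the same address keep their program order regardless of how they were collected, which is the property the fix relies on.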