- 30 Jan, 2015 5 commits
-
-
Hans Wennborg authored
and ThreadLocal out of the pretty stack tracing code." The patch has been having trouble on trunk and doesn't seem ready for 3.6. Reverting to get it out of the branch before tagging rc2. llvm-svn: 227646
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227584 | compnerd | 2015-01-30 09:58:25 -0800 (Fri, 30 Jan 2015) | 10 lines ARM: correct handling of .fpu directive The FPU directive permits the user to switch the target FPU, enabling instructions that would be otherwise unavailable. However, when configuring the new subtarget features, we would not enable the implied functions for newer FPUs. This would result in invalid rejection of valid input. Ensure that we inherit the implied FPU functionality when enabling newer versions of the FPU. Fortunately, these are mostly hierarchical, unlike the CPUs. Addresses PR22395. ``` --------------------------------------------------------------------- llvm-svn: 227637
-
Tom Stellard authored
```--------------------------------------------------------------------- r227462 | thomas.stellard | 2015-01-29 11:55:28 -0500 (Thu, 29 Jan 2015) | 2 lines R600/SI: Remove stray debug statements ``` --------------------------------------------------------------------- llvm-svn: 227597
-
Tom Stellard authored
The schedule model is not complete yet, and could be improved. This is a partial merge of r227461. The difference is that it does not enable the machine scheduler by default. llvm-svn: 227596
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227491 | spatel | 2015-01-29 12:51:49 -0800 (Thu, 29 Jan 2015) | 13 lines [GVN] don't propagate equality comparisons of FP zero (PR22376) In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities to allow some simple value range optimizations. But that introduced a bug when comparing to -0.0 or 0.0: these compare equal even though they are not bitwise identical. This patch disallows propagating zero constants in equality comparisons. Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376 Differential Revision: http://reviews.llvm.org/D7257 ``` --------------------------------------------------------------------- llvm-svn: 227537
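The hazard described in r227491 can be reproduced outside the compiler. The sketch below is not GVN code; the helper names (`bitsOf`, `comparesEqual`, `bitwiseIdentical`, `signOf`, `demoZeros`) are made up for illustration, and it assumes an IEEE-754 64-bit `double`. It shows that `0.0` and `-0.0` compare equal while remaining distinguishable, which is why substituting one for the other after an equality test is unsafe.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of a double (assumes a 64-bit double).
std::uint64_t bitsOf(double d) {
  static_assert(sizeof(std::uint64_t) == sizeof(double),
                "this sketch expects a 64-bit double");
  std::uint64_t bits;
  std::memcpy(&bits, &d, sizeof(bits));
  return bits;
}

// True when the two values compare equal under operator==.
bool comparesEqual(double a, double b) { return a == b; }

// True when the two values have the same bit pattern.
bool bitwiseIdentical(double a, double b) { return bitsOf(a) == bitsOf(b); }

// A use that tells the two zeros apart: yields 1.0 or -1.0 with x's sign.
double signOf(double x) { return std::copysign(1.0, x); }

// If an optimizer replaced x with the constant 0.0 because "x == 0.0" held,
// signOf(x) would change from -1.0 to 1.0.
void demoZeros() {
  double x = -0.0;
  assert(comparesEqual(x, 0.0));
  assert(!bitwiseIdentical(x, 0.0));
  assert(signOf(x) == -1.0);
  assert(signOf(0.0) == 1.0);
}
```

Calling `demoZeros()` from a `main` function exercises all four checks.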
-
- 29 Jan, 2015 3 commits
-
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227290 | dblaikie | 2015-01-27 18:34:53 -0800 (Tue, 27 Jan 2015) | 1 line PR22356: DebugInfo: Handle the size of a member where the type of that member is a typedef (or other sugar) of a declaration. ``` --------------------------------------------------------------------- llvm-svn: 227492
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227339 | bsteinbr | 2015-01-28 10:32:31 -0800 (Wed, 28 Jan 2015) | 3 lines Fix build breakage caused by memory leaks in llvm-c-test I accidentally introduced those in r227319. ``` --------------------------------------------------------------------- llvm-svn: 227477
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227319 | bsteinbr | 2015-01-28 08:35:59 -0800 (Wed, 28 Jan 2015) | 10 lines Fix LLVMSetMetadata and LLVMAddNamedMetadataOperand for single value MDNodes Summary: MetadataAsValue uses a canonical format that strips the MDNode if it contains only a single constant value. This triggers an assertion when trying to cast the value to a MDNode. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7165 ``` --------------------------------------------------------------------- llvm-svn: 227475
-
- 28 Jan, 2015 15 commits
-
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227393 | joerg | 2015-01-28 15:30:39 -0800 (Wed, 28 Jan 2015) | 5 lines For the --be8 flag, check explicitly for pre-v7 / pre-v6m cores. Those used the old Big Endian support on ARM and don't need flags. Refactor the logic in a separate common function, which also looks at -march. Add corresponding logic for the Linux toolchain. ``` --------------------------------------------------------------------- llvm-svn: 227398
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227088 | joerg | 2015-01-26 04:30:16 -0800 (Mon, 26 Jan 2015) | 3 lines For NetBSD/ARM-EB, link with --be8. Support for the older BE32 is currently not planned. ``` --------------------------------------------------------------------- llvm-svn: 227396
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227085 | joerg | 2015-01-26 03:41:48 -0800 (Mon, 26 Jan 2015) | 13 lines The canonical CPU variant for ARM according to config.guess uses a suffix it seems: # ./config.guess earmv7hfeb-unknown-netbsd7.99.4 Extend the triple parsing to support this. Avoid running the ARM parser multiple times because StringSwitch is not lazy. Reviewers: Renato Golin, Tim Northover Differential Revision: http://reviews.llvm.org/D7166 ``` --------------------------------------------------------------------- llvm-svn: 227394
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227368 | rikka | 2015-01-28 13:10:46 -0800 (Wed, 28 Jan 2015) | 9 lines Revert a change from r222797 that is no longer needed and can cause infinite recursion. Also guard against said infinite recursion by adding an assert that will trigger if CorrectDelayedTyposInExpr is called before a previous call to CorrectDelayedTyposInExpr returns (i.e. if the TreeTransform run by CorrectDelayedTyposInExpr calls a sequence of methods that end up calling CorrectDelayedTyposInExpr, as the new test case had done prior to this commit). Fixes PR22292. ``` --------------------------------------------------------------------- llvm-svn: 227375
-
Tom Stellard authored
```--------------------------------------------------------------------- r226970 | thomas.stellard | 2015-01-23 18:59:08 -0500 (Fri, 23 Jan 2015) | 2 lines R600/SI: Emit .hsa.version section for amdhsa OS ``` --------------------------------------------------------------------- llvm-svn: 227365
-
Tom Stellard authored
```--------------------------------------------------------------------- r226945 | thomas.stellard | 2015-01-23 17:05:45 -0500 (Fri, 23 Jan 2015) | 9 lines R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select() We used to do this promotion during DAG legalization, but this caused an infinite loop in ExpandUnalignedLoad() because it assumed that i64 loads were legal if i64 was a legal type. It also seems better to report i64 loads as legal, since they actually are and we were just promoting them to simplify our tablegen files. ``` --------------------------------------------------------------------- llvm-svn: 227364
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227295 | majnemer | 2015-01-27 21:48:06 -0800 (Tue, 27 Jan 2015) | 5 lines Sema: Ensure that __c11_atomic_fetch_add has a pointer to complete type Pointer arithmetic only makes sense if the pointee type is complete. This fixes PR22361. ``` --------------------------------------------------------------------- llvm-svn: 227349
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227250 | ab | 2015-01-27 13:52:16 -0800 (Tue, 27 Jan 2015) | 31 lines [SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk. This was introduced in a faulty refactoring (r225640, mea culpa): the tests weren't testing the return values, so, for both __strcpy_chk and __stpcpy_chk, we would return the end of the buffer (matching stpcpy) instead of the beginning (for strcpy). The root cause was the prefix "__" being ignored when comparing, which made us always pick LibFunc::stpcpy_chk. Pass the LibFunc::Func directly to avoid this kind of error. Also, make the testcases as explicit as possible to prevent this. The now-useful testcases expose another, entangled, stpcpy problem, with the further simplification. This was introduced in a refactoring (r225640) to match the original behavior. However, this leads to problems when successive simplifications generate several similar instructions, none of which are removed by the custom replaceAllUsesWith. For instance, InstCombine (the main user) doesn't erase the instruction in its custom RAUW. When trying to simplify say __stpcpy_chk: - first, an stpcpy is created (fortified simplifier), - second, a memcpy is created (normal simplifier), but the stpcpy call isn't removed. - third, InstCombine later revisits the instructions, and simplifies the first stpcpy to a memcpy. We now have two memcpys. ``` --------------------------------------------------------------------- llvm-svn: 227346
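The return-value difference that r227250 refers to is visible with the plain library functions. The sketch below does not touch the fortified `__strcpy_chk` / `__stpcpy_chk` variants or SimplifyLibCalls; it only contrasts `strcpy` with POSIX `stpcpy`. The helper names and the 64-byte buffer size are assumptions made for illustration, and `stpcpy` is assumed to be declared by `<string.h>` on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string.h>  // stpcpy is POSIX, not ISO C++

// Offset (from the start of the destination) of the pointer strcpy returns.
std::size_t strcpyReturnOffset(const char *src) {
  char buf[64];
  assert(std::strlen(src) < sizeof(buf));
  char *ret = std::strcpy(buf, src);  // returns the destination itself
  return static_cast<std::size_t>(ret - buf);
}

// Offset (from the start of the destination) of the pointer stpcpy returns.
std::size_t stpcpyReturnOffset(const char *src) {
  char buf[64];
  assert(std::strlen(src) < sizeof(buf));
  char *ret = stpcpy(buf, src);  // returns a pointer to the terminating NUL
  return static_cast<std::size_t>(ret - buf);
}

// A test that ignores the return value cannot tell the two functions apart;
// checking the offsets can.
void demoCopyReturns() {
  assert(strcpyReturnOffset("hello") == 0);
  assert(stpcpyReturnOffset("hello") == 5);
}
```

Both helpers copy the same bytes, so only the returned pointer distinguishes them, which matches the commit's remark that the earlier tests missed the bug because they did not test the return values.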
-
Rafael Espindola authored
This fixes pr22351. Original messages: r226026: Fix handling of extern_weak. This was broken by r225983 r226031: Fix linking of shared libraries. In shared libraries the plugin can see non-weak declarations that are still undefined. llvm-svn: 227344
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227300 | chandlerc | 2015-01-28 01:52:14 -0800 (Wed, 28 Jan 2015) | 34 lines [LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack tracing code. Managed static was just insane overhead for this. We took memory fences and external function calls in every path that pushed a pretty stack frame. This includes a multitude of layers setting up and tearing down passes, the parser in Clang, everywhere. For the regression test suite or low-overhead JITs, this was contributing to really significant overhead. Even the LLVM ThreadLocal is really overkill here because it uses pthread_{set,get}_specific logic, and has careful code to both allocate and delete the thread local data. We don't actually want any of that, and this code in particular has problems coping with deallocation. What we want is a single TLS pointer that is valid to use during global construction and during global destruction, any time we want. That is exactly what every host compiler and OS we use has implemented for a long time, and what was standardized in C++11. Even though not all of our host compilers support the thread_local keyword, we can directly use the platform-specific keywords to get the minimal functionality needed. Provided this limited trial survives the build bots, I will move this to Compiler.h so it is more widely available as a light weight if limited alternative to the ThreadLocal class. Many thanks to David Majnemer for helping me think through the implications across platforms and craft the MSVC-compatible syntax. The end result is *substantially* faster. When running llc in a tight loop over a small IR file targeting the aarch64 backend, this improves its performance by over 10% for me. It also seems likely to fix the remaining regressions seen by JIT users with threading enabled. 
This may actually have more impact on real-world compile times due to the use of the pretty stack tracing utility throughout the rest of Clang or LLVM, but I've not collected any detailed measurements. ``` --------------------------------------------------------------------- llvm-svn: 227332
-
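The "single TLS pointer" idea from r227300 can be sketched with the C++11 `thread_local` keyword. This is not LLVM's PrettyStackTrace code: `TraceEntry`, `render`, and `demoTrace` are hypothetical names, and the commit itself notes that some host compilers needed platform-specific keywords instead of `thread_local`. The sketch keeps one raw pointer per thread as the head of an intrusive stack, so pushing and popping a frame needs no lock, no allocation, and no registered destructor.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a pretty-stack-trace entry; not LLVM's class.
class TraceEntry {
public:
  // Push this entry onto the current thread's stack.
  explicit TraceEntry(const char *msg) : Msg(msg), Next(Head) { Head = this; }
  // Pop it again; entries must be destroyed in reverse order of creation.
  ~TraceEntry() { Head = Next; }
  TraceEntry(const TraceEntry &) = delete;
  TraceEntry &operator=(const TraceEntry &) = delete;

  // Renders the current thread's stack, innermost entry first.
  static std::string render() {
    std::string out;
    for (const TraceEntry *e = Head; e; e = e->Next) {
      out += e->Msg;
      out += '\n';
    }
    return out;
  }

private:
  const char *Msg;
  TraceEntry *Next;
  // One raw pointer per thread, constant-initialized to null.
  static thread_local TraceEntry *Head;
};

thread_local TraceEntry *TraceEntry::Head = nullptr;

void demoTrace() {
  assert(TraceEntry::render().empty());
  {
    TraceEntry outer("outer");
    {
      TraceEntry inner("inner");
      assert(TraceEntry::render() == "inner\nouter\n");
    }
    assert(TraceEntry::render() == "outer\n");
  }
  assert(TraceEntry::render().empty());
}
```

Because the pointer is constant-initialized and has a trivial type, it needs no dynamic initialization or destruction, which is the property the commit asks for when it wants the pointer to be usable during global construction and destruction.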
Hans Wennborg authored
```--------------------------------------------------------------------- r227299 | chandlerc | 2015-01-28 01:47:21 -0800 (Wed, 28 Jan 2015) | 48 lines [LPM] A targeted but somewhat horrible fix to the legacy pass manager's querying of the pass registry. The pass manager relies on the static registry of PassInfo objects to perform all manner of its functionality. I don't understand why it does much of this. My very vague understanding is that this registry is touched both during static initialization *and* while each pass is being constructed. As a consequence it is hard to make accessing it not require acquiring some lock. This lock ends up in the hot path of setting up, tearing down, and invalidating analyses in the legacy pass manager. On most systems you can observe this as a non-trivial % of the time spent in 'ninja check-llvm'. However, I haven't really seen it be more than 1% in extreme cases of compiling more real-world software, including LTO. Unfortunately, some of the GPU JITs are seeing this taking essentially all of their time because they have very small IR running through a small pass pipeline very many times (at least, this is the vague understanding I have of it). This patch tries to minimize the cost of looking up PassInfo objects by leveraging the fact that the objects themselves are immutable and they are allocated separately on the heap and so don't have their address change. It also requires a change I made the last time I tried to debug this problem which removed the ability to de-register a pass from the registry. This patch creates a single access path to these objects inside the PMTopLevelManager which memoizes the result of querying the registry. This is somewhat gross as I don't really know if PMTopLevelManager is the *right* place to put it, and I dislike using a mutable member to memoize things, but it seems to work.
For long-lived pass managers this should completely eliminate the cost of acquiring locks to look into the pass registry once the memoized cache is warm. For 'ninja check' I measured about 1.5% reduction in CPU time and in total time on a machine with 32 hardware threads. For normal compilation, I don't know how much this will help, sadly. We will still pay the cost while we populate the memoized cache. I don't think it will hurt though, and for LTO or compiles with many small functions it should still be a win. However, for tight loops around a pass manager with many passes and small modules, this will help tremendously. On the AArch64 backend I saw nearly 50% reductions in time to complete 2000 cycles of spinning up and tearing down the pipeline. Measurements from Owen of an actual long-lived pass manager show more along the lines of 10% improvements. Differential Revision: http://reviews.llvm.org/D7213 ``` --------------------------------------------------------------------- llvm-svn: 227331
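The memoization described in r227299 can be sketched in isolation. Everything below is hypothetical and is not the PMTopLevelManager or PassRegistry code: `Info`, `Registry`, `Manager`, integer IDs as keys, and the lookup counter are all assumptions chosen to keep the example small. It relies on the same two properties the commit names: entries are immutable and separately heap-allocated (so their addresses are stable), and nothing is ever de-registered.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for an immutable registry entry.
struct Info {
  std::string Name;
};

// A registry whose every lookup takes a lock.
class Registry {
public:
  // Registers an entry; existing IDs are never replaced or removed.
  void add(int id, std::string name) {
    std::lock_guard<std::mutex> guard(Lock);
    Entries.emplace(id,
                    std::unique_ptr<const Info>(new Info{std::move(name)}));
  }

  const Info *lookup(int id) const {
    std::lock_guard<std::mutex> guard(Lock);
    ++LockedLookups;
    auto it = Entries.find(id);
    return it == Entries.end() ? nullptr : it->second.get();
  }

  int lockedLookups() const {
    std::lock_guard<std::mutex> guard(Lock);
    return LockedLookups;
  }

private:
  mutable std::mutex Lock;
  mutable int LockedLookups = 0;
  std::map<int, std::unique_ptr<const Info>> Entries;
};

// A single access path that memoizes successful registry lookups.
class Manager {
public:
  explicit Manager(const Registry &reg) : Reg(reg) {}

  const Info *find(int id) const {
    auto it = Cache.find(id);
    if (it != Cache.end())
      return it->second;  // warm path: no lock taken
    const Info *info = Reg.lookup(id);
    if (info)
      Cache.emplace(id, info);  // misses are not cached
    return info;
  }

private:
  const Registry &Reg;
  mutable std::unordered_map<int, const Info *> Cache;
};

void demoMemoizedLookup() {
  Registry reg;
  reg.add(1, "gvn");
  Manager mgr(reg);
  const Info *first = mgr.find(1);
  assert(first && first->Name == "gvn");
  for (int i = 0; i < 1000; ++i)
    assert(mgr.find(1) == first);
  assert(reg.lockedLookups() == 1);  // only the first call hit the registry
}
```

The cache is a `mutable` member, mirroring the trade-off the commit message mentions; as written, the cache itself is not synchronized, so this sketch assumes each `Manager` is used from one thread at a time.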
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227294 | chandlerc | 2015-01-27 20:57:56 -0800 (Tue, 27 Jan 2015) | 23 lines [LPM] Stop using the string based preservation API. It is an abomination. For starters, this API is incredibly slow. In order to lookup the name of a pass it must take a memory fence to acquire a pointer to the managed static pass registry, and then potentially acquire locks while it consults this registry for information about what passes exist by that name. This stops the world of LLVMs in your process no matter how little they cared about the result. To make this more joyful, you'll note that we are preserving many passes which *do not exist* any more, or are not even analyses which one might wish to have be preserved. This means we do all the work only to say "nope" with no error to the user. String-based APIs are a *bad idea*. String-based APIs that cannot produce any meaningful error are an even worse idea. =/ I have a patch that simply removes this API completely, but I'm hesitant to commit it as I don't really want to perniciously break out-of-tree users of the old pass manager. I'd rather they just have to migrate to the new one at some point. If others disagree and would like me to kill it with fire, just say the word. =] ``` --------------------------------------------------------------------- llvm-svn: 227328
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227261 | compnerd | 2015-01-27 14:57:39 -0800 (Tue, 27 Jan 2015) | 6 lines SymbolRewriter: allow rewriting with comdats COMDATs must be identically named to the symbol. When support for COMDATs was introduced, the symbol rewriter was not updated, resulting in rewriting failing for symbols which were placed into COMDATs. This corrects the behaviour and adds test cases for this. ``` --------------------------------------------------------------------- llvm-svn: 227324
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227260 | compnerd | 2015-01-27 14:57:35 -0800 (Tue, 27 Jan 2015) | 4 lines SymbolRewriter: prevent unnecessary rewrite The rewrite for the pattern based rewrite is unnecessary if the existing name matches the pattern. ``` --------------------------------------------------------------------- llvm-svn: 227323
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227278 | rikka | 2015-01-27 16:46:09 -0800 (Tue, 27 Jan 2015) | 6 lines Use the real CXXScopeSpec when setting the correction SourceRange. Otherwise, in the most important case and the only case where SS and TempSS are different (which is when the CXXScopeSpec should be dropped, and TempSS is NULL) the wrong SourceRange will be used in the fixit for the typo correction. Fixes the remaining issue in PR20626. ``` --------------------------------------------------------------------- llvm-svn: 227280
-
- 27 Jan, 2015 10 commits
-
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227251 | rikka | 2015-01-27 14:01:39 -0800 (Tue, 27 Jan 2015) | 17 lines Fix a think-o in handling ambiguous corrections for a TypoExpr. Under certain circumstances, the identifier mentioned in the diagnostic won't match the intended correction even though the replacement expression and the note pointing to the decl are both correct. Basically, the TreeTransform assumes the TypoExpr's Consumer points to the correct TypoCorrection, but the handling of typos that appear to be ambiguous from the point of view of TransformTypoExpr would cause that assumption to be violated by altering the Consumer's correction stream. This fix allows the Consumer's correction stream to be reset to the right TypoCorrection after successfully resolving the perceived ambiguity. Included is a fix to suppress correcting the RHS of an assignment to the LHS of that assignment for non-C++ code, to prevent a regression in test/SemaObjC/provisional-ivar-lookup.m. This fixes PR22297. ``` --------------------------------------------------------------------- llvm-svn: 227266
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227220 | rikka | 2015-01-27 10:26:18 -0800 (Tue, 27 Jan 2015) | 7 lines Properly handle typos in the conditional of ?: expressions in C. In particular, remove the OpaqueExpr transformation from r225389 and move the correction of the conditional from CheckConditionalOperands to ActOnConditionalOp before the OpaqueExpr is created. This fixes the typo correction behavior in C code that uses the GNU extension for a binary ?: (without an expression between the "?" and the ":"). ``` --------------------------------------------------------------------- llvm-svn: 227243
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226824 | logan | 2015-01-22 05:40:16 -0800 (Thu, 22 Jan 2015) | 2 lines Enable backtrace_test for ARM. ``` --------------------------------------------------------------------- llvm-svn: 227239
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226823 | logan | 2015-01-22 05:39:08 -0800 (Thu, 22 Jan 2015) | 6 lines Add -funwind-tables to CMAKE_C_FLAGS. Without -funwind-tables, the compiler won't generate the unwinding table for these C functions. However, the functions in libunwind, such as `_Unwind_Backtrace()`, WILL unwind stack to get the backtrace. ``` --------------------------------------------------------------------- llvm-svn: 227238
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226822 | logan | 2015-01-22 05:38:11 -0800 (Thu, 22 Jan 2015) | 16 lines Force unwind frame with user-defined personality. If libcxxabi is compiled as a shared library, and the executable references the user-defined personality routines (e.g. __gxx_personality_v0), then the pointer comparison in Unwind-EHABI.cpp won't work. This is due to the fact that the PREL31 will point to the PLT stubs for the personality routines (in the executable), while the __gxx_personality_v0 symbol reference is yet another (different) PLT stub (in libunwind). This will cause _Unwind_Backtrace() to stop unwinding the frame whenever it reaches __gxx_personality_v0(). This CL fixes the problem by calling the user-defined personality routines with an undocumented API for force unwinding. ``` --------------------------------------------------------------------- llvm-svn: 227237
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226820 | logan | 2015-01-22 05:28:39 -0800 (Thu, 22 Jan 2015) | 5 lines Fix _Unwind_Backtrace for libc++abi built with libgcc. Implement an undocumented _US_FORCE_UNWIND flag for force unwinding. ``` --------------------------------------------------------------------- llvm-svn: 227236
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226819 | logan | 2015-01-22 05:27:36 -0800 (Thu, 22 Jan 2015) | 9 lines Allow libc++abi to be built without unwinder. This CL adds a new compilation flag, LIBCXXABI_USE_LLVM_UNWINDER, to specify whether the LLVM unwinder is enabled. In addition, all unwinder-specific code is guarded by this definition. Now, libc++abi will be able to use the unwinding routine from libgcc when LIBCXXABI_USE_LLVM_UNWINDER is disabled. ``` --------------------------------------------------------------------- llvm-svn: 227235
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226818 | logan | 2015-01-22 05:25:55 -0800 (Thu, 22 Jan 2015) | 12 lines Remove _Unwind_{Get,Set}{GR,IP} from ARM EHABI build. This commit partially reverts r219629. These functions are not part of the ARM EHABI specification, and AFAIK, the de facto implementation does not export these functions. Without this change, any programs compiled with this unwind.h will be incompatible with other implementations due to linkage errors. ``` --------------------------------------------------------------------- llvm-svn: 227234
-
Daniel Sanders authored
```--------------------------------------------------------------------- r227005 | dsanders | 2015-01-24 14:35:11 +0000 (Sat, 24 Jan 2015) | 38 lines [mips] Fix 'jumpy' debug line info around calls. Summary: At the moment, address calculation is taking the debug line info from the address node (e.g. TargetGlobalAddress). When a function is called multiple times, this results in output of the form: .loc $first_call_location .. address calculation .. .. function call .. .. address calculation .. .loc $second_call_location .. function call .. .loc $first_call_location .. address calculation .. .loc $third_call_location .. function call .. This patch makes address calculations for function calls take the debug line info for the call node and results in output of the form: .loc $first_call_location .. address calculation .. .. function call .. .loc $second_call_location .. address calculation .. .. function call .. .loc $third_call_location .. address calculation .. .. function call .. All other address calculations continue to use the address node. Test Plan: Fixes test/DebugInfo/multiline.ll on a mips host. Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D7050 ``` --------------------------------------------------------------------- llvm-svn: 227193
-
Pekka Jaaskelainen authored
llvm-svn: 227188
-
- 26 Jan, 2015 3 commits
-
-
Hans Wennborg authored
```--------------------------------------------------------------------- r227062 | rengolin | 2015-01-25 15:17:48 -0800 (Sun, 25 Jan 2015) | 10 lines Allows Clang to use LLVM's fixes-x18 option This patch allows clang to have llvm reserve the x18 platform register on AArch64. FreeBSD will use this in the kernel for per-cpu data but has no need to reserve this register in userland so will need this flag to reserve it. This uses llvm r226664 to allow this register to be reserved. Patch by Andrew Turner. ``` --------------------------------------------------------------------- llvm-svn: 227151
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226664 | tnorthover | 2015-01-21 07:43:31 -0800 (Wed, 21 Jan 2015) | 7 lines AArch64: add backend option to reserve x18 (platform register) AAPCS64 says that it's up to the platform to specify whether x18 is reserved, and a first step on that way is to add a flag controlling it. From: Andrew Turner <andrew@fubar.geek.nz> ``` --------------------------------------------------------------------- llvm-svn: 227150
-
Reid Kleckner authored
llvm-svn: 227128
-
- 23 Jan, 2015 3 commits
-
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226711 | jroelofs | 2015-01-21 14:39:43 -0800 (Wed, 21 Jan 2015) | 8 lines Fix load-store optimizer on thumbv4t Thumbv4t does not have lo->lo copies other than MOVS, and that can't be predicated. So emit MOVS when needed and bail if there's a predicate. http://reviews.llvm.org/D6592 ``` --------------------------------------------------------------------- llvm-svn: 226918
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226863 | joerg | 2015-01-22 13:01:00 -0800 (Thu, 22 Jan 2015) | 3 lines When reporting constraints that should be constant, the type doesn't really help. Improve diagnostics. ``` --------------------------------------------------------------------- llvm-svn: 226916
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226877 | atanasyan | 2015-01-22 15:16:48 -0800 (Thu, 22 Jan 2015) | 3 lines [Mips] Fix type of 64-bit integer in case of MIPS N64 ABI Differential Revision: http://reviews.llvm.org/D7127 ``` --------------------------------------------------------------------- llvm-svn: 226894
-
- 22 Jan, 2015 1 commit
-
-
Hans Wennborg authored
```--------------------------------------------------------------------- r226847 | marshall | 2015-01-22 10:33:29 -0800 (Thu, 22 Jan 2015) | 1 line Fix PR#22284. Add a new overload to deque::insert to handle forward iterators. Update tests to exercise this case. ``` --------------------------------------------------------------------- llvm-svn: 226859
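The case r226847 addresses is a range `deque::insert` whose source offers only forward iterators. The sketch below is not the libc++ test; it uses `std::forward_list` as one readily available forward-iterator source, and the helper names `insertFromForwardList` and `demoDequeInsert` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <forward_list>

// Inserts every element of src into a copy of d, before index pos.
// forward_list iterators are forward iterators only (no decrement,
// no random access), so this exercises the forward-iterator path.
std::deque<int> insertFromForwardList(std::deque<int> d, std::size_t pos,
                                      const std::forward_list<int> &src) {
  assert(pos <= d.size());
  auto where = d.begin() + static_cast<std::deque<int>::difference_type>(pos);
  d.insert(where, src.begin(), src.end());
  return d;
}

void demoDequeInsert() {
  const std::deque<int> base = {1, 2, 6};
  const std::forward_list<int> middle = {3, 4, 5};
  const std::deque<int> expected = {1, 2, 3, 4, 5, 6};
  assert(insertFromForwardList(base, 2, middle) == expected);
}
```

Inserting at the front, in the middle, and at the back are all worth checking, since a deque may grow from either end.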
-