Compile performance with large constant arrays: Crash, quadratic behaviour


#1

Hi!

I wanted to draw some attention to a problem with compile times. I am using a 50.000 element constant array of integers:

let x: [UInt] = [1, 2, 3, 4, 5, ..., 50000]

Compile time for this in the minutes. For more than 70.000 elements Swift crashes. This bugs me for quite some time. The array is used for distributing samples via Sobol matrices similar to "https://github.com/mmp/pbrt-v3/blob/master/src/core/sobolmatrices.cpp".

This simple test case already resulted in several bugs fixed:

SR-7632 also got a nice counter checking complexity as mentioned in @Graydon_Hoare's https://github.com/apple/swift/blob/master/docs/CompilerPerformance.md: (https://github.com/apple/swift/pull/16560).

Bugs not fixed are:

  • Slow SILVerifier (SR-7702).
  • Quadratic complexity in redundant load elimination (SR-7703).
  • Compiler crash (SR-9291).

I am kind of satisfied for now since I run my home-grown (Linux, release, no-assertions) Swift compiler where I disabled redundant load elimination to get compile times from minutes down to a second again. The SILVerifier is disabled by no-assertions. And I have less than 70.000 elements so I'm not affected by SR-9291.
XCode on the other hand is not usable. so long term I'd like to express some wishes/recommendations.

  • Fix the crash for arrays with more than 70.000 elements (as @Andrew_Trick pointed out, a param index of 16 bit might be the problem.
  • Fix quadratic complexity in RLE.
  • Speed up the SILVerifier for big arrays.

Additionally, what would be nice to have is:

  • A counter for the SILInliner and RLE to make sure they stay linear once they are fixed.
  • A validation test à la "validation-test/Sema/type_checker_perf/slow" that checks that compilation time for a, say, 10.000 array stays below one second. You may want to have a look at this diagram: http://bit.ly/2Dqiwx1
  • Change the Linux releases to no-assertions like XCode until SILVerifier is fixed.

To put everything in context I did a little survey on the most used languages according to the TIOBE index (excluding SQL). Here are my results with a 50.000 element array. "No problem" means compilation time is less than a second and execution time for printing one random element is less than a second.

  • Java: Doesn't work: "code too large". With workaround*: No problem.
  • C: No problem
  • C++: No problem
  • Python: No problem
  • Visual Basic .NET: Compilation slow but works
  • C#: No problem
  • Javascript: No problem
  • PHP: No problem
  • Go: No problem
  • Objective-C: No problem

*The Java workaround is to split the array into smaller arrays of 8000 elements into different classes.

So Swift right now is the only language that crashes/asserts/is very slow.

If I can help out with stack traces, better bug reports or anything please let me know.

Andreas

P.S.: Without specifying the type (let x = [1, 2, 3, ..., 5000]) compile times are even worse. The main offender is swift::constraints::ConstraintGraph::gatherConstraints.