Why is Swift so slow (timeout) in compiling this code?

I wrote a simple 6 lines quick sort function for benchmarking Rust (1.65), Swift 5.7 and Python3.11.

The array length is 999,999.
Rust compiled in about a minute and the runtime was 2 secs:dash:.
Python3.11's runtime was 5 secs:eyes:.
And Swift... it's been compiling for more than an hour now🫣... Why is Swift so slow... I'm on M1 MBP (2020).

When I'd tried for 99,999 numbers; it took 4:30 mins to compile and the runtime was 0.51 ms (compared to Python's 0.29 and rust's 0.12ms) rust compiled in 2 secs btw.

Here's the Github gist for the code... can someone try as well and confirm is it just my machine or Swift really isn't as fast as I expected🤔...

1 Like

The compile time problem is not your 6 line quick sort function, is the huge literal vector. In short is because compiler has to infer it as [Int] but the only thing it knows about each element literal in this array is that it conforms to ExpressibleByIntegerLiteral so it would probably attempt UInt8,Int8, Int16, Int32 ... every integer type in stdlib that conforms to that and possibly user defined types that also conform to it which takes a lot of time.
And also it has to ensure that every element matches this requirement so for this huge literal array this end taking even more time. If you try to explicit tell the contextual type for it like let vector: [Int] = [1,2, 3, ...] that will probably help the compile time of that code.

7 Likes

When your source code is so long that it brings GitHub, your browser, and probably your text editor to a grinding halt, perhaps that's a decent indication "something ain't right".

I wouldn't be surprised if "parse and compile 8MB of integer literals" isn't a very well optimized code path in the compiler, because nobody is parsing and compiling 8MB of integer literals outside of artificial exercises like this :stuck_out_tongue:

That said, it's not just the literal type-checking slowing things down here. I added a type annotation (let vector: [Int] = [ ... ]) and it still doesn't compile within a few mins

6 Likes

It is probably because even filtering only by Int, each one of the 99999 elements has to be convertible to Int. Because situations like let vector: [Int] = [1, 1, ..., "a"] has to fail...
But bottom line is the compiler bottom neck is definitely due to the literal expression. You could try to constraint each element (with find and replace in editor) to let vector: [Int] = [1 as Int, 2 as Int, ...] and see if there is a better result, my guess is that it will reduce, but is definetely less guessing work for the typechecker.

It's not that unrealistic of a scenario, even if there are better ways to do it. And 8MB is pretty small. There's no good reason why we can't do better.

I found a trace of the compiler kind of surprising; the time is not all being spent in the type checker, as you observed:

9 Likes

Yup agree that the input size is too large... but the other two test languages do handle the situation and Swift, unfortunately doesn't.

Thanks for mentioning the internal workings of the compiler. I've been trying your suggestion for the past hour... but unfortunately it's STILL compiling; Rust compiler is inferring the type much faster; so seems some flaw in Swift itself.

As @scanon mentioned there is something else going on other than type checker later on pipeline (if you try swiftc -dump-ast your-file.swift which only parse and typechecks and you see that is reasonable after adding type annotation. Although it will take sometime to actually print the AST typechecking is done in a reasonable time)

I think is unfair say is a "flaw"... it is just that swift has a different set of constraints then Rust when it comes to inference of literals because of some language features.
But in this case specifically as @scanon pointed out we may be dealing with a compiler bug.

Any misunderstanding is deeply regretted; but by "much faster", I meant to say that the Swift compiler is taking forever to do the thing and still not giving any output (I kept it going for more that hour and still no output; consuming significant power in the process) unlike it's counterpart Rust (which does it under a minute). I understand there can be design constraints and I respect it; but I suspect there is something really fishy going on underneath so I said it to be a "flaw".

That said; I have huge admiration for the Swift community and firstly hope that it's really not a flaw; and if it is, it get's sorted out🤝.

4 Likes

FWIW these are the results I got when moved that array into a json file:

build time: 0.6 sec
999999 999999
load data from file:  0.012182950973510742 // try! Data(contentsOf: file)
decode json from data:  0.2169790267944336 // try! JSONSerialization.jsonObject(with: data) as! [Int]
quicksort array:  5.1729899644851685 // quicksort(vector)
standard sort array:  2.165964961051941 // vector.sorted(by: <)
total time using quicksort:  5.402151942253113
total time using standard sort:  2.3951269388198853

With a quick & dirty counted sort implementation:

counted sort array:  0.6888679265975952
total time using counted sort:  0.9205169677734375

You can strip 0.2 sec from total time by storing Ints in a binary file instead of Json to make total time ~0.7 sec. No doubt with similar changes Rust will be faster.

PS. interestingly and unexpectedly JSONSerialization was faster than JSONDecoder
try! JSONSerialization.jsonObject(with: data) as! [Int]
decode json from data with JSONSerialization:  0.21882307529449463

try! JSONDecoder().decode([Int].self, from: data)
decode json from data with JSONDecoder:  1.128619909286499
8 Likes

JSONDecoder is (currently) implemented as a layer over JSONSerialization, so it being slower makes some sense. That said, the magnitude of the difference there is interesting, I'd be somewhat interested in taking a look at an Instruments time profile of that.

4 Likes

Last I knew, JSONDecoder uses JSONSerialization under the hood and then does a bunch of dynamic casting and decoder work, so it's pretty much guaranteed to be slow. There are alternative JSON decoders which are much faster, when it matters. But Decoder's fundamental requirements basically guarantee it's slow no matter how much work is put in. Last figures I saw put it at 50 - 60 MB/s under the most optimal implementation.

Even if this isn't a common scenario, this sort of input makes a good stress test which can provide clear areas of improvement for the compiler. Swift's behavior here (large literals) has been poor from beginning but reports were often met with the same resistance seen in this thread.

9 Likes

I filed a number of bugs some time ago about this issue:

In particular (some are somewhat fixed):

You can see the quadratic complexity in this diagram:

It's quite common in rendering to include precomputed arrays:

I'd be interested in hearing about those. I can think of two:

  1. Read JSON as above. Disadvantages: a) Has to be done every time b) Can't be deployed as a single binary.
  2. Use another language (C, C++). Disadvantage: Clumsy.

Since the bugs are years old I wouldn't bet my hat on it.

4 Likes

Personally, I would put them in a binary file and mmap it, but a .h would be perfectly reasonable as well (and to my mind less clumsy, because who needs to have huge data buffers cluttering up their source).

This also lets you more easily put the content in an arbitrary executable section using build tools, which may or may not be a good thing depending on your access pattern. (Possibly a good idea for data that will be copied or made mutable; probably a bad idea for constants referred to by nearby code.)

Interestingly more time was spent in the type casting than JSON decoding itself:

let o = try! JSONSerialization.jsonObject(with: data) // 0.075 sec
let vector = o as! [Int] // 0.156 sec
  1. Use a binary format for the external file. Advantages - no parsing required (perhaps an endian conversion the file is intended being readable from differently endian devices). Disadvantages: a) Not convenient (harder to read by human / change) b) Can't be deployed as a single binary

  2. Base64 encoded String (if it's faster). Disadvantages: a) Not convenient to read / change.

2 - I do every now and then on the as needed basis. E.g. when I need to be sure that struct has a particular binary layout or when I need to drop down to manual reference counting or when I need to patch method implementation (method_exchangeImplementation), or for an ultimate performance and/or crossplatfomability. In this case if I needed to ship a single binary executable without extra files I'd use (2) or (4).

Makes sense that this would be primarily a bridging issue. I have a few ideas for speeding up Array bridging in general but I’m curious if very long Arrays like this hit anything unusual. Unfortunately I have my hands full with other tasks right now :frowning:

1 Like

not only that, it also avoids the need to link 12.5 MB of libFoundation.so.

Another option: convert array of ints to a string: "12858, 964801,... ". Compilation time is instant in this case (a fraction of a second). Runtime is a bit slower than the already mentioned alternatives.

let intsString = "12858, 964801, .... 767751, 764160" // one million integers in a string
let components = intsString.components(separatedBy: ", ") // 0.24 sec
let vector = components.map { Int($0)! } // 0.21 sec
let l = quicksort(vector) // 5.1 sec
total time:  5.5 sec
A slight optimization to your quicksort algorithm to cut its time from 5 seconds to 3 by doing filtering once.
func quicksort(_ arr: [Int]) -> [Int] {
    guard arr.count > 1 else { return arr }
    let pivot = arr[0]
    var leftInts: [Int] = []
    var rightInts: [Int] = []
    arr.forEach { v in
        if v < pivot {
            leftInts.append(v)
        } else if v > pivot {
            rightInts.append(v)
        }
    }
    let left = quicksort(leftInts)
    let right = quicksort(rightInts)
    return left + [pivot] + right
}
3 Likes

We obviously don't agree here. :slight_smile:
I'd say the two-liner

let a = [ ... 10.000 floats ... ]
let x = a[i]

is less clumsy than

  1. Generating low-discrepancy numbers
  2. Put them into an external binary file
  3. mmap them at runtime
    My guess would be that you can't do this in one line of code.

Reasoning:

  1. A compiler is made to convert source code to binaries.
  2. Every other language I tried (C, C++, Rust, even Python) gets this right.

But I totally understand if it's not top priority.

4 Likes