"Outlined init with copy/take of protocol" using 50% of CPU time in for loop over protocol array

It seems that iterating over an array of structs, as opposed to an array of protocols implemented by those structs, is 5 times faster.

The sample code below takes about 6 seconds when run in Xcode Instruments.

struct Event {
    let a = 0
    let b = 1
}

let array: [Event] = Array(repeating: Event(), count: 1000000000)

for (i, element) in array.enumerated() {
    callback(i) { event in
        if event.a != 0 {
            preconditionFailure()
        }
    }
}

func callback(_ index: Int, block: (_ event: Event) -> Void) {
    block(array[index])
}

The following sample code, on the other hand, takes 5 times as long, about 30 seconds:

protocol EventProtocol {
    var a: Int { get }
    var b: Int { get }
}

struct Event: EventProtocol {
    let a = 0
    let b = 1
}

let array: [EventProtocol] = Array(repeating: Event(), count: 1000000000)

for (i, element) in array.enumerated() {
    callback(i) { event in
        if event.a != 0 {
            preconditionFailure()
        }
    }
}

func callback(_ index: Int, block: (_ event: EventProtocol) -> Void) {
    block(array[index])
}

30% of the time is spent in outlined init with copy of EventProtocol and 18% is spent in outlined init with take of (offset: Int, element: EventProtocol)?.

Why does a for loop over an array of protocols have such an impact on performance?

This outlined init… As a follow-up to the original topic, is there any information on this outlined init behavior: why does the compiler decide to outline, and what impact does it have? I roughly understand what is happening, but could not find information on the actual mechanism and the reasons for such behavior (or how to prevent it). It also causes crashes about once a month…

As a general rule, working with concrete types is going to be more efficient, yes.

Your execution time is dominated by the cost of first copying the original 16GB array of Events into a 40GB array of any EventProtocols and then copying the individual any EventProtocols out as part of the iteration. It'd be nice to avoid that second copy, and I think we do in some cases, and we're working on things to do that more. But mostly, yeah, going from an array of concretely-typed elements to an array of dynamically-typed protocol values is going to be slower on a micro level, and when you've got literally a billion elements (and your benchmark does basically nothing else but those copies), that micro level is going to dominate.
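
To make the 16GB and 40GB figures concrete, here is a minimal sketch that asks the compiler for the per-element footprint of both array types. It assumes a 64-bit platform and Swift 5.6 or later (for the `any` spelling); the variable names are only illustrative.

```swift
protocol EventProtocol {
    var a: Int { get }
    var b: Int { get }
}

struct Event: EventProtocol {
    let a = 0
    let b = 1
}

// Bytes each element occupies inside an array (64-bit platform assumed).
// Event is two Ints; the existential is a 3-word inline buffer plus a
// type metadata pointer plus one witness table pointer.
let concreteStride = MemoryLayout<Event>.stride
let existentialStride = MemoryLayout<any EventProtocol>.stride

// Scale up to the element count used in the benchmark.
let elementCount = 1_000_000_000
let concreteGB = concreteStride * elementCount / 1_000_000_000
let existentialGB = existentialStride * elementCount / 1_000_000_000

print("[Event]: \(concreteStride) bytes per element, about \(concreteGB) GB in total")
print("[any EventProtocol]: \(existentialStride) bytes per element, about \(existentialGB) GB in total")
```

Every element copied out of the protocol-typed array moves the whole container and has to go through the type's value witnesses, which is roughly what the outlined init with copy entries in the profile correspond to.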

In fact, changing EventProtocol from a protocol to a class changes execution time from 30 seconds to 16 seconds. Does that mean that one should use class inheritance rather than protocols whenever possible?

Why do you say "dominated"? From the Instruments screenshot I understand that the two most significant parts, outlined init with copy of EventProtocol and outlined init with take of (offset: Int, element: EventProtocol)?, are part of the for loop.

The usual recommendation is to use generic constraints with protocols, rather than protocol existentials.
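
As a rough sketch of what that looks like for the code in this thread: the array keeps its concrete [Event] type, and only the function that consumes it is written against the protocol. The function name `countNonZero(in:)` is hypothetical and not part of the original benchmark.

```swift
protocol EventProtocol {
    var a: Int { get }
    var b: Int { get }
}

struct Event: EventProtocol {
    let a = 0
    let b = 1
}

// Generic over the element type: the caller passes a plain [Event], so no
// existential containers are created, and the compiler is free to
// specialize this function for Event.
func countNonZero<E: EventProtocol>(in events: [E]) -> Int {
    var count = 0
    for event in events where event.a != 0 {
        count += 1
    }
    return count
}

// A much smaller array than the benchmark, just to show the call site.
let events = Array(repeating: Event(), count: 1_000)
let nonZero = countNonZero(in: events)
print(nonZero)
```

This only works when all elements share one concrete type. If the array really has to hold different conforming types at once, an existential (or an enum) is still needed.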

No? There isn't a single answer here. Abstraction has a cost, and if you're going to work with large data sets, you need to actually think about those costs.

Right. Allowing the array to support elements of different types makes elements larger and slower to work with. You have to make good decisions about whether that's okay. Swift provides plenty of options here, and they have tradeoffs.

For curiosity's sake, what happens if you keep the protocol, but constrain it to the AnyObject layout?

Runs in 6 seconds, so faster than class inheritance.
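
A minimal sketch of that variant, assuming Event was changed from a struct to a class (a struct cannot conform to an AnyObject-constrained protocol) and assuming a 64-bit platform. The protocol name `EventObject` is made up for this example.

```swift
// Class-constrained protocol: only classes may conform.
protocol EventObject: AnyObject {
    var a: Int { get }
    var b: Int { get }
}

final class Event: EventObject {
    let a = 0
    let b = 1
}

// A class-bound existential stores just the object reference and one
// witness table pointer, with no inline value buffer.
let classBoundSize = MemoryLayout<any EventObject>.size
print("any EventObject:", classBoundSize, "bytes")

// Array(repeating:count:) stores the same reference in every slot, so a
// benchmark built this way allocates a single Event object.
let events: [any EventObject] = Array(repeating: Event(), count: 1_000)
print(events[0] === events[999])
```

Copying such an element is a retain plus a two-word copy, which is one plausible reason it lands close to the concrete struct timing here.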

You're using an existential container for any P, and it's even slower when your actual concrete type is larger than 3 words, because the value then has to be allocated on the heap.
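
The 3-word limit can be illustrated with a small sketch. It assumes a 64-bit platform, and the types `Small` and `Large` are hypothetical. The size comparison is only the main criterion: whether a value is actually stored inline also depends on its alignment and on the type being bitwise-takable.

```swift
protocol P {}

struct Small: P { var x = 0, y = 0, z = 0 }          // 3 words
struct Large: P { var x = 0, y = 0, z = 0, w = 0 }   // 4 words

// The existential container reserves 3 words of inline storage.
let inlineBufferSize = 3 * MemoryLayout<Int>.size

let smallFits = MemoryLayout<Small>.size <= inlineBufferSize
let largeFits = MemoryLayout<Large>.size <= inlineBufferSize

// The container itself has the same size either way; a value that does
// not fit is boxed on the heap and the buffer holds a pointer to the box.
let containerSize = MemoryLayout<any P>.size

print("Small fits inline:", smallFits)
print("Large fits inline:", largeFits)
print("any P:", containerSize, "bytes")
```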

For large datasets, always choose concrete types. Abstraction has a cost.