Weird protocol behaviour?

Consider the following example:

protocol A {}
protocol B: A {}

struct X: B {}

func getArrayOfX() -> [X] {
    return [X(), X()]
}

var array = [A]()

// this works fine
array += [X(), X()]

// error: Binary operator '+=(_:_:)' cannot be applied to operands of type '[A]' and '[X]'
array += getArrayOfX()


print(type(of: [X(), X()]) == type(of: getArrayOfX())) // -> true

I'm having trouble understanding why the first += invocation works fine, while the second one results in a compile time error.

It seems to me that the type of the rhs array should be [X] in both cases, regardless of how it was created (literal or via the function call).

Is there a reason why only the first one compiles fine?

Thank you!

Try this:

print(type(of: array) == type(of: getArrayOfX())) // -> false

See, the array literal [X(), X()] has several potential types, e.g. [X], [A], [Any], etc. The compiler infers which one you mean by evaluating the expression in which the literal appears and picking the type that makes sense in that context.

So, in your array += [X(), X()] line, the literal is inferred to be of type [A], since this makes the expression valid. In the type(of: [X(), X()]) line, the context imposes no other type, so the literal is inferred to be [X].
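To make the context dependence concrete, here is a minimal sketch that re-declares the types from the question so it stands alone. The constant names `inferred` and `contextual` are made up for illustration.

```swift
protocol A {}
protocol B: A {}
struct X: B {}

// No contextual type: the literal takes the type of its elements.
let inferred = [X(), X()]
print(type(of: inferred) == [X].self)    // true

// A contextual type of [A]: the very same literal is built as an [A],
// with each element wrapped as an A while the array is constructed.
let contextual: [A] = [X(), X()]
print(type(of: contextual) == [A].self)  // true

// The += case from the question: the left-hand side supplies the context.
var array = [A]()
array += [X(), X()]
print(array.count)                       // 2
```

In each case the literal is constructed directly with the required element type, so no separate array conversion is involved.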

However, I'm not exactly sure why [X] isn't considered to be an [A] when X is an A.

You can work around this by doing this:

array += getArrayOfX() as [A]

To the compiler experts: shouldn't this work? I thought collections were the special case of covariant generics.

Seems like a bug to me.

It is a change in representation and so handling an [X] as an [A] is a (slightly) non-trivial operation, not just a compile-time thing.

This is because Swift has value types, and arrays that support them. An Array<T> (aka [T]) of length n essentially consists of an allocation that is exactly the right size to store n values of type T in sequence (i.e. an array has an allocation of size at least n * MemoryLayout<T>.stride). In this case, X's stride is 1 (that is, each X is stored using a single byte), but a protocol existential like A has stride 40 (on 64-bit platforms)! This means that, at its simplest, an [A] needs a buffer for its values that is 40 times larger than that of an [X] of the same length.
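The size difference can be checked with MemoryLayout. This is a small sketch rather than a statement about any particular platform: the existential's stride depends on pointer width, so the code prints it instead of hard-coding a number. The `count` constant is only there for illustration.

```swift
protocol A {}
struct X: A {}

// An empty struct occupies no storage, but its stride is still 1 so that
// consecutive array elements have distinct addresses.
print(MemoryLayout<X>.size)    // 0
print(MemoryLayout<X>.stride)  // 1

// The existential box for A carries inline storage plus type information,
// so it is several machine words wide regardless of what it holds.
print(MemoryLayout<A>.stride)

// Minimum buffer sizes for arrays of the same length.
let count = 1_000
print(count * MemoryLayout<X>.stride)
print(count * MemoryLayout<A>.stride)
```

The ratio between the last two numbers is the growth in storage that a conversion from [X] to [A] has to account for.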

All this means that "upcasting" from an [X] to an [A] is a linear time operation, essentially the same as getArrayOfX().map { $0 as A }, which creates a whole new array with an appropriately sized allocation, casting each element. Fortunately, you can achieve the upcast with just as [A], as @DevAndArtist says, instead of having to write that manual map, but it still requires some sort of explicit note that a linear time operation is happening.

(I call out value types in particular, because this behaviour differs from languages where all values are pointers, and so the arrays and their allocations are always storing elements of a single size: one pointer. This means that the upcast can be "free" and just be purely a type-system thing, as the representation of the underlying values in the array doesn't need to change.)
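As a sketch of that equivalence, the following appends the same elements twice, once via as [A] and once via the manual map. The `id` requirement is a hypothetical addition, not part of the original example; it is there only so that the results can be inspected.

```swift
// `id` is a hypothetical requirement added for illustration.
protocol A { var id: Int { get } }
struct X: A { let id: Int }

func getArrayOfX() -> [X] {
    return [X(id: 1), X(id: 2)]
}

var array = [A]()

// Explicit upcast of the whole array.
array += getArrayOfX() as [A]

// The manual spelling: a new array in which each element is wrapped as an A.
array += getArrayOfX().map { $0 as A }

print(array.map { $0.id })  // [1, 2, 1, 2]
```

Both spellings yield an [A] with the same elements; as [A] is simply the more concise way to ask for the conversion.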

Personally, I wish it worked this way everywhere instead of sometimes being implicit:

func takeArrayOfA(_ a: [A]) {
    print(a)
}

let xs: [X] = [X()]
takeArrayOfA(xs)  // this compiles fine
takeArrayOfA(xs as [A])  // this would be clearer
takeArrayOfA(xs.map({ $0 as A }))  // because this is what’s really happening

I know people have the intuition that this kind of covariance should “just work” but as Huon points out, it isn’t free in Swift’s case and so doing it implicitly is unhealthy IMO. Unfortunately I don’t think that’s something we can reverse course on at this point.

@Ben_Cohen if I understand you correctly, you're saying that covariance on collections should always be explicit, and that takeArrayOfA(xs) should be an error?

Other than that, do you think there is a chance that Swift might, at some point in the future, introduce explicit covariance capabilities for generic types like other languages do?

Given I dislike the current feature, my personal opinion is: I hope not.

Unless someone comes up with an ingenious solution to the problems the current feature has, that reconciles the different “this should just work” and “this can’t just work, it’s more complicated than that” viewpoints, that is.

That makes sense. On the other hand, I would have written the explicit “as [A]” cast without knowing that it changes the performance characteristics of the code. I would simply dismiss it as a compiler glitch. So do we really gain anything here? (In other words: if the missing implicit conversion is a compiler cue saying I’m doing it wrong, it’s quite a subtle cue and prone to be missed.)

Hmm, personally I wish there were some kind of solution to the current state of affairs: collections are covariant, while every other generic type is invariant. This can become really annoying in some situations.

protocol A {}
struct B : A {}
struct C : A {}
struct D {
  var b = B()
  var c = C()
}

let paths: [KeyPath<D, A>] = [
  \.b, // Key path value type 'B' cannot be converted to contextual type 'A'
  \.c  // Key path value type 'C' cannot be converted to contextual type 'A'
]

There is a workaround for that, but it's pretty ugly IMHO.

I'm also in this boat. I had no idea that as [A] effectively translates to a map of the array. I wonder how many people don't realize this, and whether there is even any benefit from being required to be explicit in our casts. I bet most people just chalk this error and its fixit up to one of Swift's quirks and apply the fixit without giving it a second thought (like myself).

Is there some way people can be warned that this may affect the performance of their program? The fixit makes it seem so simple and, much like the fixit to add a force unwrap, people don't realize/understand the implications until they learn a lot more about Swift.

I'm hesitant to even suggest possibly changing the fixit to the full .map({ $0 as A }) instead of just as [A] because it's not nearly as concise (and beginners may be confused by it) even if it does explicitly state what's actually going on. I would just like to know that what I'm doing could have a potential performance impact (depending on the size of the array).

If it's an issue then it will appear in a profile, which is the only reliable way of reasoning about performance, so it doesn't seem worth making the fixit more verbose. Perhaps some documentation that someone might find while searching for “slow array casts swift” or similar could detail the reason.
