What happened to Collection protocol recently that this no longer compiles?

Howdy friends! I'm updating the Slang library, which builds on top of SourceKitten to allow querying. For the love of God, I cannot figure out what has changed in the past year or so that the following no longer compiles:

import Foundation

protocol MyCollection: Collection where Element == Self, Index == Int, SubSequence == Self {
}

extension MyCollection {
    public var startIndex: Index { fatalError()  }
    public var endIndex: Index { fatalError()  }
    public func index(after i: Index) -> Index { fatalError()  }
    public subscript(position: Index) -> Element { fatalError() }
    public subscript(bounds: Range<Index>) -> SubSequence { fatalError() }
}

class MyClass: MyCollection {
}

// Playground execution failed:
// 
// expression failed to parse:
// error: Playground.playground:14:7: error: 'MyCollection' requires the types 'MyClass' and 'Slice<MyClass>' be equivalent
// class MyClass: MyCollection {
//       ^
// 
// Playground.playground:14:7: note: requirement specified as 'Self' == 'Self.Element' [with Self = MyClass]
// class MyClass: MyCollection {
//       ^

However, explicitly adding the bounds subscript to the implementing class fixes everything:

class MyClass: MyCollection {
    public subscript(bounds: Range<Int>) -> MyClass { fatalError() }
}

A more complete source code lives here, for example. The idea of this code is to return the same type for both subscripts.

I looked at older Swift versions and it seems the SubSequence has defaulted to a Slice type for a long, long time, so it doesn't look like anything changed there. But for some reason the protocol extension used to do the job very neatly and now it no longer does. Trying to understand why.
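For reference, here is a minimal sketch of where Slice comes from. The Countdown type is hypothetical and exists only for illustration: a conformance that never mentions SubSequence picks up the standard library default, Slice<Self>.

```swift
// Hypothetical minimal collection, used only to show the default SubSequence.
struct Countdown: Collection {
    let start: Int
    var startIndex: Int { 0 }
    var endIndex: Int { start }
    func index(after i: Int) -> Int { i + 1 }
    subscript(position: Int) -> Int { start - position }
}

// No SubSequence is declared above, so the default Slice<Countdown> applies.
let usesDefaultSlice = Countdown.SubSequence.self == Slice<Countdown>.self
print(usesDefaultSlice)
```

That default is what shows up in the error message when the compiler does not pick Self as the SubSequence.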


@xwu Thank you, this is super-super informative! :pray: So, it's the default @available(*, unavailable) extension that is causing these troubles, right?

extension Collection {
  // This unavailable default implementation of `subscript(bounds: Range<_>)`
  // prevents incomplete Collection implementations from satisfying the
  // protocol through the use of the generic convenience implementation
  // `subscript<R: RangeExpression>(r: R)`. If that were the case, at
  // runtime the generic implementation would call itself
  // in an infinite recursion because of the absence of a better option.
  @available(*, unavailable)
  @_alwaysEmitIntoClient
  public subscript(bounds: Range<Index>) -> SubSequence { fatalError() }
}

I'm wondering if there's an easy way for a protocol extension to "shadow" the default one, so that I can avoid adding the subscript to every implementing type. I'm playing around with it but haven't found a workaround yet.

The extension itself isn't a problem—in fact, it's a pretty important advance to catch an infinite recursion error.

The problem lies elsewhere. You may be aware that class MyClass: MyCollection { } relies on a particularly brittle feature in Swift known as associated type inference. Removing the feature altogether was attempted in SE-0108, but that proposal was rejected. However, the following description remains true to this day, as subsequent attempts to rationalize and improve the feature haven't materialized:


Rife with bugs, ouch. Fortunately, there is a fairly straightforward workaround. Again, I could just give you the solution, but I'll tell you how I arrived at it—skip to the bottom if you just want the bottom line.

To start, I played around with a few variations of your workaround:

First, I checked whether declaring MyClass as a final class changes things, since returning MyClass and returning SubSequence (aka Self) aren't the same thing for a non-final class (which can have subclasses). However, even a final class still needs your workaround. I also tried declaring a struct MyStruct: MyCollection { }, but still no luck. To simplify the rest of my troubleshooting, I proceeded with the struct, since it eliminates any confounding issues with class inheritance.
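To illustrate the point about non-final classes, here is a small sketch. Base and Derived are hypothetical names that have nothing to do with Slang. It shows that a member returning Self follows the subclass, while a member returning the class name does not.

```swift
class Base {
    required init() {}
    // Returns the dynamic type of the receiver, so a subclass gets its own type back.
    func makeSelf() -> Self { Self() }
    // Always returns exactly Base, even when called on a subclass instance.
    func makeBase() -> Base { Base() }
}

final class Derived: Base {}

let derived = Derived()
let viaSelf = derived.makeSelf()
let viaBase = derived.makeBase()
print(type(of: viaSelf) == Derived.self)  // true
print(type(of: viaBase) == Base.self)     // true
```

For a final class or a struct there are no subclasses, so the two spellings name the same type.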

Next, I tried spelling the declaration of the relevant protocol extension member as subscript(bounds: Range<Int>) -> Self instead of subscript(bounds: Range<Index>) -> SubSequence, in case there was something about overload resolution that caused the unavailable implementation to be favored over that one. It didn't remove the need for your workaround, so that's not the case (and it certainly shouldn't be the case given how unavailable overloads are supposed to be treated). I also tried using a where clause for the protocol extension declaration to see if a constrained extension would cause the subscript to be favored differently, and it did not.

So, finally, I tried something else similar to the previous step: I changed the spelling of the workaround. Instead of declaring subscript(bounds: Range<Int>) -> MyStruct (remember, I'm still working with the struct), I declared subscript(bounds: Range<Index>) -> MyStruct. Suddenly, your workaround stopped working! This started to raise suspicion in my mind that there's some funniness with associated type inference going on.

I could fix this new problem by declaring typealias Index = Int in the struct. However, I then tried deleting the subscript declaration and the error came back. Could there be some other associated type that isn't being inferred as we expect, such as the SubSequence that the subscript is supposed to return? Certainly, the error message seems to corroborate that, since it's referring to Slice, which it must have inferred. This gives us the final diagnosis and most efficient workaround (which boils down to not relying on associated type inference), which works for both MyStruct and MyClass:

class MyClass: MyCollection {
    typealias Index = Int
    typealias SubSequence = MyClass
}

(I'm still suspicious of this implementation for non-final classes and would urge you to stick with using this only with final classes in your actual code, as it seems you are already.)
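Here is a slightly fuller sketch of the same workaround, with real implementations in place of fatalError(). The children property, the init(children:) requirement, and the Node type are assumptions added for illustration and are not part of the original code. Given how brittle inference is, treat this as a sketch and not a guarantee for every toolchain.

```swift
protocol MyCollection: Collection where Element == Self, Index == Int, SubSequence == Self {
    // Hypothetical storage requirements, added only for this example.
    var children: [Self] { get }
    init(children: [Self])
}

extension MyCollection {
    var startIndex: Index { children.startIndex }
    var endIndex: Index { children.endIndex }
    func index(after i: Index) -> Index { i + 1 }
    subscript(position: Index) -> Element { children[position] }
    // Slicing wraps the selected children in a new value of the same type.
    subscript(bounds: Range<Index>) -> SubSequence {
        Self(children: Array(children[bounds]))
    }
}

struct Node: MyCollection {
    // Spell out the associated types so nothing is left to inference.
    typealias Index = Int
    typealias SubSequence = Node

    var children: [Node]
    init(children: [Node]) { self.children = children }
}

let leaf = Node(children: [])
let root = Node(children: [leaf, Node(children: [leaf])])
let firstOnly = root[0..<1]

print(root.count)
print(type(of: firstOnly) == Node.self)
```

Both subscripts return Node here, which is the behaviour the original code was after.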

So you see, what happened to Collection recently to cause your code not to compile (your original question) isn't the same as what the underlying problem is that makes it not compile!
