Flattening protocol towers with bottom types

I've been watching the issues with conditional conformances and generic programming coming to light in threads like these by @dabrahams and others:

It occurs to me that, for closely related protocol hierarchies like the *Collection protocols, it seems like we really want a tighter relationship among these protocols than Swift expresses today. The capabilities a type has as a collection are really a fundamental part of its Collection conformance; however, the language treats protocols as more independent than that. If a type is declared as only a Collection, in theory anyone else could extend it to be a BidirectionalCollection or RandomAccessCollection. It is extremely unlikely that someone would be able to usefully retroactively do so, but the possibility baked into the language semantics makes reasoning about whether something that's a Collection is also a BidirectionalCollection harder than it should be for the optimizer, witness dispatch, algorithm specialization, and other desirable conditional behavior to take advantage of, since we can never with certainty statically assume a type conforms to a protocol, and dynamically checking whether something conforms to a protocol is unsound in the face of potentially multiple retroactive conformances in a process.

We may be able to contain some of this complexity by providing means to explicitly mark optional conformances and group together related conditional conformances in a way that we can correctly select default implementations and allow for efficient capability testing within protocol hierarchies. However, I got to thinking about whether we could capture the desired effect of the Collection protocol tower in a simpler way—if the additional capabilities of BidirectionalCollection, RandomAccessCollection, and friends are a fundamental part of their Collection-ness, and feed back into each other in such intricate ways, could there possibly be one conformance that spans the entire set of behavior? It turns out that there can be! There are some obvious shortcomings with the approach, so this is still more "interesting idea" than a concrete proposal for designing systems this way, but it has some nice properties that make it worthy of exploration.

Although we don't have formal conditional requirements in protocols, there is another means for selectively disabling parts of an interface—uninhabited "bottom" types like Never. A type with no valid values cannot be instantiated, and a method that takes a type with no valid values can never be called. We can leverage this fact to express a single protocol that supports varying levels of capabilities by using possibly-bottom associated types to "turn off" requirements some conforming types don't support. For example, we could express a protocol that supports both uni- and bi-directional collections like this:

protocol Kollection {
    associatedtype Element
    associatedtype Index: Comparable

    // Core API
    subscript(_: Index) -> Element { get }
    var startIndex: Index { get }
    var endIndex: Index { get }
    func index(after: Index) -> Index

    // Bidirectional API
    associatedtype BiIndex: Comparable // should either == Index or == Never
    func index(before: BiIndex) -> BiIndex

    // Upgrade/downgrade a bidirectional index
    func biIndex(for: Index) -> BiIndex?
    func index(for: BiIndex) -> Index
}

// A type that supports bidirectional indexing trivially supports
// upgrading its indexes to BiIndex
extension Kollection where BiIndex == Index {
    func biIndex(for i: Index) -> BiIndex? { return i }
    func index(for i: BiIndex) -> Index { return i }
}

// A type that doesn't support bidirectional indexing
// declares Never as its index type, and trivially implements
// the requirements with no body
extension Kollection where BiIndex == Never {
    func biIndex(for i: Index) -> BiIndex? { return nil }
    func index(for i: BiIndex) -> Index { /*unreachable*/ }
    func index(before: BiIndex) -> BiIndex { /*unreachable*/ }
}

Unfortunately, the language does not let us directly express the "BiIndex == Index or Never" constraint we'd like to, which imposes some of the most glaring ergonomic problems with the design—within a generic context, mixing Index and BiIndex requires unsightly manual conversion calls using the biIndex(for:) and index(for:) method requirements. However, a conforming type doesn't have to directly mess with those requirements, whether it be a unidirectional collection:

struct AscendingKollection: Kollection {
    subscript(element: Int) -> Int { return element }
    var startIndex: Int { return 0 }
    var endIndex: Int { return Int.max }
    func index(after i: Int) -> Int { return i + 1 }

    // No bidirectional capability
    typealias BiIndex = Never
}

or bidirectional:

// Array already implements the real BidirectionalCollection protocol's requirements, which mirror our
// impostor protocol's
extension Array: Kollection {}

Things get more promising when dealing with wrapper collections. Writing a wrapper collection more often than not requires tediously writing a tower of conditional conformances to reflect the underlying collection's capabilities. However, with bottom types, we're using the core type system to propagate the availability of capabilities, so in many cases the right capabilities fall out from simply wrapping the underlying collection's behavior in a straightforward way:

// A kollection that only exposes elements from the wrapped kollection that match a 
// filter predicate
struct FilterKollection<T: Kollection>: Kollection {
    var wrapped: T
    var filter: (T.Element) -> Bool

    // Forward indexing interface
    typealias Element = T.Element
    typealias Index = T.Index

    subscript(i: Index) -> Element { return wrapped[i] }

    var startIndex: Index {
        var result = wrapped.startIndex
        while !filter(wrapped[result]) && result != endIndex {
            result = wrapped.index(after: result)
        }
        return result
    }
    var endIndex: Index { return wrapped.endIndex }

    typealias BiIndex = T.BiIndex
    func biIndex(for i: Index) -> BiIndex? { return wrapped.biIndex(for: i) }
    func index(for i: BiIndex) -> Index { return wrapped.index(for: i) }

    func index(after i: Index) -> Index {
        var result = i
        while result != wrapped.endIndex {
            result = wrapped.index(after: result)
            if filter(wrapped[result]) {
                break
            }
        }
        return result
    }

    func index(before i: BiIndex) -> BiIndex {
        var result = i
        let start = biIndex(for: startIndex)!
        while result != start {
            result = wrapped.index(before: result)
            if filter(wrapped[index(for: result)]) {
                break
            }
        }
        return start
    }
}

This approach also builds the ability to query the capabilities of a collection directly into the protocol requirements. Algorithms that want to provide more specialized behavior over more capable collections are able to do so without relying on the whims of overload resolution, or the overhead of dynamic casts:

extension Kollection {
    var last: Element? {
        // Empty collection: return nil
        if startIndex == endIndex {
            return nil
        }
        // Bidi collection: get index before endIndex
        if let endBiIndex = biIndex(for: endIndex) {
            return self[index(for: index(before: endBiIndex))]
        }
        // Otherwise, O(n) search
        var prevIndex = startIndex
        var i = index(after: startIndex)
        repeat {
            prevIndex = i
            i = index(after: i)
        } while i != endIndex
        return self[prevIndex]
    }
}

Now, we're not going to break the world by replacing the Collection hierarchy anytime soon, and this approach still has ergonomic and expressivity shortcomings (particularly because it requires up-front design to integrate the expected span of capabilities into one protocol; it doesn't allow you to take requirements from existing protocols and mash them together very well). If you're designing a library around protocol towers and running into difficulties with conditional conformances, default implementation selection, and algorithm specialization, it might be worth exploring this technique to see if it leads to something simpler and more robust.

17 Likes

This is a very interesting idea!

I guess it all comes down to this:

Is this faster or easier to optimise than dynamic dispatch? If it is, it seems to me that it achieves that performance by making the API return an optional and then per-emptively conforming and returning nil, cutting off retroactive conformances.

Is there a way that we could more easily harmonise that with the existing model? Perhaps some kind of "negative conformances"? (i.e. String could declare that it is explicitly not a RandomAccessCollection, and you couldn't make it one retroactively).

1 Like

Yeah, this is assuming that retroactive conformance is actively undesirable among the tower of capabilities being described. This should be easier to optimize than a dynamic cast, because specialization should reveal the implementations of biIndex(for:) and index(for:) specific to the collection being operated on, and the implementations should be trivial enough to inline. Unlike a dynamic protocol conformance check, this wouldn't need to do a global search for potential retroactive conformances.

Good question. You could conceivably use the same technique with a possibly-bottom associated type as a way to build a "back reference" from a base protocol to its refinements in a closed system:

protocol Collection {
  // == Self if bidirectional, == Never if not
  associatedtype SelfAsBidirectional: BidirectionalCollection

  var asBidirectional: SelfAsBidirectional? { get }
}

As a more formal language feature, this might look like an explicit conditional requirement, which would be a signal to the compiler to try to find a conformance to the given protocol at the point a conformance is declared:

protocol Collection where Self ?: BidirectionalCollection, Self ?: RandomAccessCollection {
  ...
  /* This would do a lookup of the conditional requirement in the witness table,
     instead of the global environment */
  if self is BidirectionalCollection { ... }
}

These still have the disadvantage of requiring up-front design by the protocol author, though.