Rename protocols that use Self or associated types to “constraints”, and declare them as such

anthonylatsis · April 17, 2018, 5:22pm

For what it's worth, shouldn't a generic protocol be more than sugar for an associated type? Generic protocols imply multiple heterogeneous conformances and usage as values*, while protocols with associated types are restricted to a single conformance (disjoint conditional conformances aside) and need special treatment to be used as values*. In short, they have different advantages and a reason to coexist.

Joe_Groff · April 17, 2018, 5:26pm

Part of the “generalized existentials” work should include providing a mechanism for protocols with contravariant Self requirements to generalize their operations for the existential case. Equatable could implement its own == requirement by considering two values of different types to be unequal, or otherwise compare two values of the same dynamic type using their == operation, for instance. That's something we'd want anyway since people naturally want to make their protocols conform to Equatable and Hashable and then use the protocol type in a heterogeneous dictionary or set. We have AnyHashable as a stopgap for this use case, but we really ought to integrate its behavior into the generalized existential model.
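To make the Equatable idea above concrete, here is a minimal sketch of "different dynamic types are unequal, same dynamic type uses its own ==". The function name `dynamicallyEqual` is hypothetical and is not a standard-library API; the second half only shows how the existing AnyHashable stopgap behaves for an Int and a String.

```swift
// Hypothetical helper for illustration; not a standard-library API.
// Values of different dynamic types compare as unequal; values that can be
// viewed as the same type are compared with that type's own ==.
func dynamicallyEqual<T: Equatable>(_ lhs: T, _ rhs: Any) -> Bool {
    guard let typedRhs = rhs as? T else { return false }
    return lhs == typedRhs
}

let sameTypeSameValue = dynamicallyEqual(1, 1)     // same type, equal values
let differentTypes = dynamicallyEqual(1, "1")      // Int vs. String

// AnyHashable, the current stopgap, allows a heterogeneous set today.
// The duplicate Int key collapses; the String key stays distinct.
let mixedKeys = Set<AnyHashable>([AnyHashable(1), AnyHashable("one"), AnyHashable(1)])
```

Note that the `as? T` cast also succeeds for subclasses of a class type `T`, so this sketch does not insist on identical dynamic types.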

Even for protocols where there isn't a clear contravariant generalization, it's still useful to work with existentials thereof by opening the dynamic type of the value inside them, so that you can form new values of the dynamic Self type or of associated types thereof and work with them as static types. For example, you could say:

// Syntax subject to design
let collection: Collection // Given a dynamically typed Collection,
let x<T: Collection> = collection // bind its dynamic type to T
x.first // get its first value as a T.Element?
x.startIndex // get its start index as a T.Index

let y<U: Collection> = someOtherCollection

if U.Element == T.Element {
  // do something if two collections dynamically have the same element type
}
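Without any new syntax, the closest approximation of this binding is a generic function, where the generic parameter plays the role of the opened type. The sketch below assumes concrete (non-existential) arguments, and the function names `summarize` and `haveSameElementType` are made up for illustration.

```swift
// Inside a generic function, T names the concrete collection type, so
// T.Element and T.Index are available as static types.
func summarize<T: Collection>(_ x: T) -> (first: T.Element?, start: T.Index) {
    return (x.first, x.startIndex)
}

// Compares the element types of two collections at run time by comparing
// their metatypes.
func haveSameElementType<T: Collection, U: Collection>(_ x: T, _ y: U) -> Bool {
    return T.Element.self == U.Element.self
}

let summary = summarize([10, 20, 30])                      // first is an Int?, start is an Int index
let intAndInt = haveSameElementType([1, 2], Set([3]))      // both have Element == Int
let intAndCharacter = haveSameElementType([1, 2], "ab")    // Int vs. Character
```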

JoeyKL · April 17, 2018, 5:33pm

That works for Equatable, but doesn't work for e.g. Comparable, so the confusion still exists, and is even less consistent.

But for what it's worth, I'd be fine with generalized existentials instead of my proposal. My proposal was meant as a super simple fix for a super big source of confusion, but any move towards greater consistency in the type model is good. At the very least, we can all agree that the Language Guide shouldn't contain lies, and right now it does.

Joe_Groff · April 17, 2018, 5:36pm

Whether a type can be allowed to conform to a protocol multiple times is a separable concern from the syntax used to describe associated types in the conformance. Rust took the approach you described, and I'm not sure that was the right choice, since many of the situations where you really want the Foo<T> sugar, such as for Collection<T>, are also cases where you really also want the functional dependency between Self and T for type inference to work. It also isn't clear to me that the syntax fundamentally implies that you can have multiple conformances with different associated types. In non-protocol contexts, for example, declaring a generic struct Bar<U> doesn't imply that a value can be both a Bar<Int> and a Bar<String> at the same time.

Associated types don't need any different special treatment as existential values; the implementation model is more or less the same for Foo where AssocType == T regardless of whether a concrete type can conform to Foo multiple times with different AssocTypes. That impacts the type system, but not the runtime model.
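As a small illustration of the functional dependency between Self and an associated type, the sketch below shows the compiler inferring `Element` purely from the conforming type. The function name `firstElement` is hypothetical.

```swift
// C determines C.Element, so callers never spell the element type out.
func firstElement<C: Collection>(_ collection: C) -> C.Element? {
    return collection.first
}

let number = firstElement([10, 20, 30])   // inferred as Int?
let character = firstElement("swift")     // inferred as Character?
```

If a type could conform to Collection several times with different element types, the return type of these calls would no longer be determined by the argument alone.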

Paul_Cantrell · April 17, 2018, 8:21pm

This family of features is one of the gaps in Swift I most want to see filled. It’s the reason Siesta has Resource instead of Resource<T>, which IMO is the library’s primary design flaw. The current burden of hand-coding AnyHashable-like containers is a rough one!

hlovatt · April 18, 2018, 2:32am

This is so unwieldy compared to:

let x: Collection<T> = collection
x.first // get its first value as a T.Element?
x.startIndex // get its start index as a T.Index

let y: Collection<T> = someOtherCollection

// No need to test element types, just use x and y. Replaces a runtime test with a compile-time check :)

Joe_Groff · April 18, 2018, 2:33am

Of course you could express that x and y statically have the same Element type too, it was just an example. Note too that, if not for associated types, you would need to have declared it as Collection<Self, Index, Element, SubSequence, etc., etc.>, not only Collection<T>, to capture all of the type relations in Swift collections. Associated types and existential opening give more flexibility as to how much type information you can provide.

hlovatt · April 18, 2018, 2:37am

The intermediate type is needed. For example a collector that appends strings together in Java would have the type Collector<String, StringBuffer, String> since StringBuffer is a lot faster than String when mutating. But the user doesn't need such detail so the public type would be Collector<String, ?, String> and if they had generalized existentials it could be Collector<String, String>.

JoeyKL · April 18, 2018, 2:39am

Hmm, I'm still confused. Why does the user even need to be aware that an intermediate type exists at all, even if they don't know specifically what it is?

Joe_Groff · April 18, 2018, 3:07am

Borrowing another idea from Scala, path-dependent types, the "opening" operation could be made implicit:

let x: Collection
x.first // get first value as x.Element?
x.startIndex // get start index as x.Index

let y: Collection where x.Element == y.Element // Or Collection<x.Element> for short, maybe

JoeyKL · April 18, 2018, 3:26am

I think having a distinction between types and type constraints (like Haskell does) is a good thing, but opinions will differ.

hlovatt · April 18, 2018, 3:57am

The user doesn't, but the compiler does. Apologies in advance, the following is very rough.

In Swift you might write Collector as:

struct Collector<E, I, R> {
    let intermediateFactory: () -> I
    let elementProcessor: (inout I, E) -> Void
    let intermediateCombiner: (inout I, inout I) -> Void
    let resultGenerator: (inout I) -> R
}

Then in pseudo code collect would be:

extension Collection {
    func collect<I, R>(c: Collector<Element, I, R>) -> R {
        for each processor in parallel
           use c.intermediateFactory to make intermediate storage
           partition off self.count / numberOfProcessor elements and use c.elementProcessor to put processed elements into the intermediate storage
        Combine in parallel all the intermediates into one intermediate using c.intermediateCombiner
        return c.resultGenerator(&finalIntermediateStorage)
    }
}

So the collect function needs the type of the intermediate storage, I, but the user of collect doesn't care about it. An associatedtype would therefore be ideal, and Collector would become:

struct Collector<E, R> {
    associatedtype I // hypothetical: structs cannot declare associated types today
    let intermediateFactory: () -> I
    let elementProcessor: (inout I, E) -> Void
    let intermediateCombiner: (inout I, inout I) -> Void
    let resultGenerator: (inout I) -> R
}
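For reference, here is a runnable sketch of the same idea in current Swift. It is an approximation under stated assumptions: the chunks are processed sequentially rather than in parallel, the `chunks` parameter and the `AnyCollector` wrapper are hypothetical names, and the intermediate type is hidden by closure-based type erasure rather than by an associated type on a struct.

```swift
struct Collector<E, I, R> {
    let intermediateFactory: () -> I
    let elementProcessor: (inout I, E) -> Void
    let intermediateCombiner: (inout I, inout I) -> Void
    let resultGenerator: (inout I) -> R
}

extension Collection {
    // Sequential stand-in for the parallel pseudocode: split into chunks,
    // fill one intermediate per chunk, then combine them in order.
    func collect<I, R>(_ c: Collector<Element, I, R>, chunks: Int = 2) -> R {
        let chunkSize = Swift.max(1, (count + chunks - 1) / chunks)
        var partials: [I] = []
        var start = startIndex
        while start != endIndex {
            let end = index(start, offsetBy: chunkSize, limitedBy: endIndex) ?? endIndex
            var storage = c.intermediateFactory()
            for element in self[start..<end] {
                c.elementProcessor(&storage, element)
            }
            partials.append(storage)
            start = end
        }
        var combined = c.intermediateFactory()
        for var partial in partials {
            c.intermediateCombiner(&combined, &partial)
        }
        return c.resultGenerator(&combined)
    }
}

// Hypothetical wrapper that hides I from the public type.
struct AnyCollector<E, R> {
    private let collectAll: ([E]) -> R

    init<I>(_ base: Collector<E, I, R>) {
        collectAll = { elements in elements.collect(base) }
    }

    func collect(_ elements: [E]) -> R {
        return collectAll(elements)
    }
}

// A string joiner whose intermediate storage is an array of strings.
let joiner = Collector<String, [String], String>(
    intermediateFactory: { [] },
    elementProcessor: { buffer, element in buffer.append(element) },
    intermediateCombiner: { into, from in into.append(contentsOf: from) },
    resultGenerator: { buffer in buffer.joined() }
)

let joined = ["a", "b", "c"].collect(joiner)
let erased = AnyCollector(joiner)   // AnyCollector<String, String>
```

The wrapper gives callers the `Collector<String, String>` shape discussed above, at the cost of writing the erasure by hand.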

Joe_Groff · April 18, 2018, 4:24am

The distinction exists in the language model, even though the surface syntax tries to obscure it, perhaps unsuccessfully. All a protocol inherently defines is a new constraint, like a typeclass in Haskell. If you name a protocol or protocol composition P1 & P2 in type position, the builtin existential type gets instantiated with those constraints. If you’re familiar with the ConstraintKinds GHC extension, you could say that there’s an implicit conversion from Constraint to * kind done by instantiating an existential type with that constraint.
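A minimal sketch of the two roles, using a made-up `Shape` protocol: first as a constraint on a generic parameter, then named in type position, where it becomes an existential that can hold values of different conforming types.

```swift
protocol Shape {
    func area() -> Double
}

struct Square: Shape {
    let side: Double
    func area() -> Double { return side * side }
}

struct Rectangle: Shape {
    let width: Double
    let height: Double
    func area() -> Double { return width * height }
}

// Shape used as a constraint: every element has the same concrete type S.
func totalArea<S: Shape>(of shapes: [S]) -> Double {
    return shapes.reduce(0) { $0 + $1.area() }
}

// Shape used in type position: each element is an existential box that may
// hold a different conforming type.
func totalArea(ofMixed shapes: [Shape]) -> Double {
    return shapes.reduce(0) { $0 + $1.area() }
}

let squares = [Square(side: 1), Square(side: 2)]
let mixed: [Shape] = [Square(side: 2), Rectangle(width: 3, height: 1)]

let homogeneousTotal = totalArea(of: squares)      // 1 + 4
let heterogeneousTotal = totalArea(ofMixed: mixed) // 4 + 3
```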

JoeyKL · April 18, 2018, 4:45am

I'd say trying to hide it is a good thing, considering the Language Guide chapter on protocols doesn't even mention associated types. The very last heading in the chapter does mention conditional extensions that work based on associated types, but it neither explains nor references associatedtype. To me this is good evidence that the normal notion of "protocol," something that can be mixed heterogeneously without fear, is entrenched in the conceptual notion of what a Swift protocol is.

JoeyKL · April 18, 2018, 5:12am

Unfortunately I do not have the knowhow to use the source compatibility suite, but I would be enormously grateful if someone who does could check how common protocols that use an associated type or Self (or inherit from one that does) are, if that's easy to check. My gut says the median is one per project, with some using none and a few using tons.

JoeyKL · April 19, 2018, 3:36am

I have thrown together a proposal draft. Thoughts and criticisms?

Joe_Groff · April 19, 2018, 3:50am

In all honesty, this is extremely unlikely to change at this point. Breaking source compatibility is not something we do without an extremely good reason anymore. If the language guide is misleading, I'd suggest filing bugs to improve it. All Swift protocols are constraints first, and some can be syntactically used as types second (which will eventually be generalized to all, hopefully), and I think that's a better way of thinking about protocols than as primarily types. Swift took the name protocol from Objective-C, and in Objective-C, protocols are not even types at all by themselves but constraints you attach to types.

JoeyKL · April 19, 2018, 3:57am

I will file a bug report. However, I don't think the wrong documentation is the primary source of confusion. The confusion and attention around this topic is enormous, to the point where I would consider it an "extremely good reason", though of course others may disagree.

I suspect (without evidence) that this is not how the majority of people think about protocols. It is definitely not how protocols are taught. If we are going to embrace this conceptual model of protocols, then they should be taught completely differently.

I did not know this! Interesting.

Karl · April 19, 2018, 6:20am

All protocols in Swift (with/without associated types) are "bags of constraints". The current differences are language/implementation limitations, and as Joe said, will be filled with a generalised existential model.

The reason you can't just do x == y with 2 random Equatables has nothing to do with the protocol, and everything to do with the == operator, which requires that both its operands have the same dynamic type.

For example, you should be able to write:

let x: Collection = ...
let y: Collection = ...
assert(x.count == y.count)

And in doing so call .count on each Collection without knowing their dynamic types or binding them to any generic parameters, in the same way you'd directly call x.toggle() if X is Toggelable (to reuse a previous example)

JoeyKL · April 19, 2018, 6:48am

Well not exactly (with apologies for pedantry):

class A: Equatable {
  static func ==(_:A, _:A) -> Bool {return true}
}

class B: A {}

A() == B() // defined, but different dynamic types

Karl:

For example, you should be able to write:
let x: Collection = ...
let y: Collection = ...
assert(x.count == y.count)
And in doing so call .count on each Collection without knowing their dynamic types or binding them to any generic parameters, in the same way you’d directly call x.toggle() if X is Toggelable (to reuse a previous example)

This is such a profoundly good example that you have totally changed my mind. Right now, the solution would be to create a Countable protocol that Collection inherits from, but that is cruft and needless hierarchy.
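For comparison, the Countable workaround can be sketched as follows. Since Collection itself cannot be changed from outside the standard library, this version assumes retroactive conformances on the concrete types instead; `Countable` is the hypothetical protocol named above.

```swift
// A protocol with no Self or associated type requirements, so it can be
// used directly as a type.
protocol Countable {
    var count: Int { get }
}

// Array and Set already have a matching count property.
extension Array: Countable {}
extension Set: Countable {}

let numbers = [1, 2, 3]
let words: Set<String> = ["a", "b", "c"]

let x: Countable = numbers
let y: Countable = words
let sameCount = x.count == y.count
```

Every conforming type needs its own extension here, which is the cruft referred to above.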

It is entirely reasonable to say constraint protocols should be able to be used as types, but only for their methods and properties which do not use Self or an associated type (barring typecasting, of course).

That said! There will still be confusion whenever generalized existentials come out, and we will have to do our best to curb it with quality explanations of the concepts. The documentation is still wrong in the meantime.