Honestly, I think we're discussing the wrong problem with protocol generics. We agree that the type system should support generic protocol existentials without type erasure, but this discussion has mostly centered around making their associated types existential as well (being able to declare, "I want any instances of this protocol, each with any instances of its associated types", such as with AnyHashable), whereas I think it'd be more useful and practical to start with what I'm going to call, because I don't know the proper term, "fully qualified" associated types (being able to declare, "I want any instances of this protocol, each with these specific implementations for its associated types", such as with AnySequence<T>). The latter situation is something I find myself needing much more often than the former.
All the examples described above can be more easily expressed with these existential protocols with universal associated types, and we can later add syntactic sugar to "simplify" the syntax to something more akin to what's being discussed.
I'm going to demonstrate this concept using the syntax Sequence«Element: T» to denote the proposed non-type-erased equivalent of AnySequence<T>. I am not advocating for the use of this syntax; I'm just using it as an arbitrary notation for my examples.
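To make the two situations concrete with what exists today, here is a minimal sketch using the standard library's type-erased wrappers. AnyHashable says nothing about the wrapped value beyond it being Hashable, while AnySequence<Int> hides the concrete sequence but pins Element to Int. The variable names are just for illustration.

```swift
// "Any instance of the protocol": the wrapped values are of unrelated types.
let mixed: [AnyHashable] = [1, "two", 3.0]

// "Any implementation, but with this specific associated type":
// the concrete sequences differ, yet Element is always Int.
let intSequences: [AnySequence<Int>] = [
    AnySequence([1, 2, 3]),   // wraps an Array<Int>
    AnySequence(4...6),       // wraps a ClosedRange<Int>
    AnySequence(Set([7])),    // wraps a Set<Int>
]

// Because Element is known to be Int, the elements can be summed directly.
let total = intSequences.reduce(0) { sum, seq in sum + seq.reduce(0, +) }
print(mixed.count, total)
```

The second array is the shape I keep reaching for: I don't care which sequence I was handed, only that it yields Ints.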
One of the first examples of a proposed generic protocol syntax is:
func bar(x: Collection, y: Collection) -> [Collection] { ... }
The problem with this syntax, it's argued, is that the Element and Index associated types are not necessarily the same for both parameters passed into bar(x:y:), nor for the return type of the function itself.
Another example is then used to show that partially constraining the protocol to a specific Element type still leaves the Index type unconstrained, leaving us with the same problem.
typealias CollectionOf<T> = Collection where Self.Element == T
func bar<T>(x: CollectionOf<T>, y: CollectionOf<T>) -> [CollectionOf<T>] { ... }
This problem can be solved by requiring that existential protocols have "fully qualified" associated types:
func bar<T, U>(x: Collection«Element: T, Index: U», y: Collection«Element: T, Index: U»)
-> Collection«Element: T, Index: U»
Here, x, y, and the return value all have the same Element and Index types. I'd also argue that Sequence would be better suited for this specific example, but I understand it's intended as a demonstration with existential associated types.
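For comparison, the closest I can get in today's Swift is a generic function with same-type constraints. This is only a sketch under my own assumptions: the body is invented, and the return type uses AnyCollection, which (unlike the notation above) erases the concrete type and replaces Index with AnyIndex.

```swift
// x and y may be different concrete types, but the where clause
// forces them to share both Element and Index.
func bar<X: Collection, Y: Collection>(x: X, y: Y) -> [AnyCollection<X.Element>]
    where X.Element == Y.Element, X.Index == Y.Index
{
    // Illustrative body: just hand both collections back, type-erased.
    return [AnyCollection(x), AnyCollection(y)]
}

let array = [1, 2, 3]
let slice = array[1...]   // ArraySlice<Int>, also indexed by Int
let result = bar(x: array, y: slice)
print(result.map { Array($0) })
```

The parameters get what I'm after, since Array<Int> and ArraySlice<Int> are different implementations with matching Element and Index. The return type is where it falls short, because the shared Index is lost once the values are wrapped.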
Sidenote: The way I'm imagining this system, the remaining associated types Indices, Iterator, and SubSequence inherit that "full qualification", resolving Indices as Collection«Element: U, Index: U», Iterator as IteratorProtocol«Element: T», and SubSequence as Sequence«Element: T» (with its Iterator resolved to IteratorProtocol«Element: T») respectively. I notice that this specific example recursively references Collection, causing an infinite type definition. Perhaps this sort of "qualified" type wouldn't be allowed until the compiler is able to reason that the qualified type of Indices.Indices is equivalent to the type of Indices, and can handle it somehow. I'm not sure how the type system is implemented, so maybe this won't ever be possible, but I think this behavior should at least be valid for non-recursive types like Sequence.
The second section discusses Swift's ability to allow the caller of a function to specify its return type.
func zim<T: P>() -> T { ... }
let x: Int = zim() // T == Int chosen by caller
let y: String = zim() // T == String chosen by caller
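The snippet above leaves P and the body of zim() elided. A runnable version could look like the sketch below, where the init() requirement on P and the conformances for Int and String are my own assumptions, added only so there is something to return.

```swift
// Hypothetical stand-in for P: the only requirement is an empty
// initializer, so zim() can build a value of whatever type is requested.
protocol P {
    init()
}

extension Int: P {}
extension String: P {}

func zim<T: P>() -> T {
    return T()
}

let x: Int = zim()      // T == Int chosen by caller
let y: String = zim()   // T == String chosen by caller
print(x, y.isEmpty)
```

The type annotation at the call site is the only thing that decides T here; the function itself never names a concrete type.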
It's argued that existential generic protocols can't allow for this behavior because the caller can't specify the Element of a returned Collection without first defining the returned collection itself.
Note: I've adjusted the function signature of the following examples to use the BinaryInteger protocol instead of the Int structure. This will let the caller better define the return type later on.
func evenValues<C: Collection, I: BinaryInteger>(in collection: C) -> Collection
where C.Element == I, I: ExpressibleByIntegerLiteral {
return collection.lazy.filter { $0.isMultiple(of: 2) }
}
let x = evenValues(in: [1, 2, 3, 4]) // What is type(of: x).Element?
I think it should be noted that these specific examples shouldn't technically work since lazy is defined on LazyCollectionProtocol, not Collection, so I've modified the following examples to take a LazyCollectionProtocol parameter instead.
The second example illustrates what it would look like for the caller to specify the return type themselves.
func evenValues<C: LazyCollectionProtocol, I: BinaryInteger, Output: Collection>(in collection: C) -> Output
where C.Element == I, Output.Element == I, I: ExpressibleByIntegerLiteral {
return collection.lazy.filter { $0.isMultiple(of: 2) }
}
let x: LazyFilterSequence<[Int]> = evenValues(in: [1, 2, 3, 4]) // Wait... Why do I need to know the return type?
The proposal for opaque result types attempts to address this issue, but an opaque result can only resolve to a single, specific underlying type, which isn't the existential behavior we're looking for here.
"Fully qualified" associated types can be used here to address both problems—needing to know the return type, and having non-existential behavior for opaque result types.
func evenValues<C: LazyCollectionProtocol, I: BinaryInteger>(in collection: C) -> Collection«Element: I, Index: C.Index»
where C.Element == I, I: ExpressibleByIntegerLiteral {
return collection.lazy.filter { $0.isMultiple(of: 2) }
}
let x: Collection«Element: Int, Index: Int» = evenValues(in: [1, 2, 3, 4]) // Don't care what it is as long as it has Ints
let y: Collection«Element: UInt, Index: Int» = evenValues(in: [5, 6, 7, 8]) // Don't care what it is as long as it has UInts
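The nearest thing I can write in current Swift is a version that returns AnyCollection. Treat this as a rough sketch: it fixes Element but erases Index to AnyIndex, so it is only half of the "fully qualified" type above. I've also used a plain Collection parameter on the assumption that lazy is reachable from any Sequence in your toolchain (which I believe is the case in the current standard library); if it isn't, swap in LazyCollectionProtocol as in the examples above.

```swift
// Returns the even values of any integer collection, type-erased.
// AnyCollection keeps Element but replaces the Index with AnyIndex.
func evenValues<C: Collection>(in collection: C) -> AnyCollection<C.Element>
    where C.Element: BinaryInteger
{
    return AnyCollection(collection.lazy.filter { $0.isMultiple(of: 2) })
}

let evens = evenValues(in: [1, 2, 3, 4])                    // AnyCollection<Int>
let unsignedEvens = evenValues(in: [5, 6, 7, 8] as [UInt])  // AnyCollection<UInt>
print(Array(evens), Array(unsignedEvens))
```

Neither call site has to name the filtered collection's real type, which is the part I want to keep. What's missing is that the caller can no longer subscript the result with an Int.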
A proposed syntax related to the opaque result type is based on Rust's impl keyword.
func concatenate(a: some Collection, b: some Collection) -> some Collection { ... }
If I'm understanding the intent behind this syntax correctly, I think this would just be sugar for the following "fully qualified" example.
func concatenate<T, U>(a: Collection«Element: T, Index: U», b: Collection«Element: T, Index: U»)
-> Collection«Element: T, Index: U» { ... }
Which can be read as, "Give me any two (possibly different) Collection implementations that both have Elements of T and Indexes of U, and then I'll return some other Collection implementation with both of those associated types as well."
@Joe_Groff, you had this to say about the need to include both the Element and Index in the existential type of Collection:
Swift's design is aimed at enabling a more expressive type system to capture more interesting type-level relationships between values. The C# design would become more cumbersome if you tried to implement something like Swift's Collection hierarchy in it, since you'd need to define a type ICollection<Index, Element> and carry the index around with you everywhere. The type relationship between collections and indexes is what allows Swift's collections to approach "zero cost" in specialized code, since for instance, you know an Array is always indexed by Ints, and that a String is always indexed by valid code unit offsets represented by String.Index. Although you could express that relationship in C#, it would make ICollection not very useful as a dynamic interface type, since the Index generic argument is usually specific to a single collection family, so for instance ICollection<Int, T> would effectively be a type that can only hold Arrays. By using associated types, Swift allows you to express relationships between Collections using only the relevant associated types; you only need to refer to Index when indexing. With more flexible existential types, you'd also be able to refer to any Collection<.Element == T> to abstract over collections of a certain element type without confining yourself to a specific index. The goal of associated types is to allow for greater flexibility and expressivity, admittedly at the cost of some shorter-term awkwardness since we're missing so many key features still.
I am of the opinion that keeping a Collection without knowing its Index excludes its use for Collection-specific functionality, which is mostly centered around the ability to index its elements. It would be more appropriate to keep a Sequence in this case. Since there wouldn't be any type erasure, it'll be possible for some client to inspect its type and cast it back into a Collection if they really wanted to get that indexing behavior back.
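Here's a small sketch of what I mean by casting back, using only what Swift has today. The describe(_:) helper is hypothetical. A value passed through a generic Sequence parameter keeps its dynamic type, so a cast can recover the indexing behavior, whereas the same value wrapped in AnySequence cannot be recovered this way.

```swift
// Hypothetical helper: accepts any sequence of Ints and tries to
// recover Collection-style indexing by casting to a concrete type.
func describe<S: Sequence>(_ sequence: S) -> String where S.Element == Int {
    // The dynamic cast succeeds only if the value really is an Array<Int>.
    if let array = sequence as? [Int] {
        return "array with indices \(array.startIndex)..<\(array.endIndex)"
    }
    return "opaque sequence"
}

let direct = describe([10, 20, 30])
let erased = describe(AnySequence([10, 20, 30]))
print(direct)
print(erased)
```

The first call gets its indices back and the second does not, which is the cost of type erasure I'd like a non-erased existential to avoid.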
I'm not saying that using a Collection without knowing its index is never appropriate; there are definitely legitimate use cases, even if I can't think of one off the top of my head. I just think the more common use cases are the ones where you don't need to know the underlying implementation of a protocol but do need it to have specific associated types. This is definitely the case in my experience.
tl;dr I really think we should be focusing on creating a non-type-erased "Any*" type along the lines of AnySequence<T> before introducing non-type-erased existential protocols such as AnyHashable.
PS: If you know the formal terms for anything I talked about, let me know! Doubly so if I misunderstand some type theory concept or basic axiom of the Swift type system or philosophy.