Questions about the future of Protocols with Associated Types

Hey everyone, I've got some questions and concerns about PATs and what the roadmap for them is. I've asked about them before, but I can't find that thread nor do I really get what's going on. When I've brought stuff like this up before, words like "higher-kinded types" or "generalized existentials" get thrown around that just make me feel stupid and don't really do anything to contribute to the conversation nor enlighten those of us who don't actually write our own compilers.

To give a bit of clarification about what's causing these questions, I was recently trying to implement some generic predicate code using protocols and was running in to serious issues.

protocol Predicate {
    associatedtype Element
    func contains(_ other: Element) -> Bool
}

On the surface, this is a very very simple protocol, and you can see how this protocol could be easily adopted by NSPredicate, IndexSet, or any SetAlgebra type.

For the sake of continuing examples, we're going to focus on two implementations: IndexSet and a custom set of structs where the Element == Int, and the desire to union two sets.

To do this, we'll want a struct OrPredicate<Int> type that does the unioning logic.

The obvious thinking might be to do this:

protocol Predicate {
    associatedtype Element
    func contains(_ other: Element) -> Bool
    func union(_ other: Self) -> Self
}

The problem with this approach, however, is that we cannot provide a default implementation of union() that uses our OrPredicate<T> implementation. The use of Self imposes a same-type requirement that is great for IndexSet, but broken for everything else. For example, you could imagine wanting to be able to union() a Set<Int> and an IndexSet, and get back some sort of Predicate that contains Int values. However, using Self disallows that.

So, since we don't necessarily want the same-type requirement, we next turn to generics:

protocol Predicate {
    associatedtype Element
    func contains(_ other: Element) -> Bool
    func union<P: Predicate, R: Predicate>(_ other: P) -> R where P.Element == Element, R.Element == Element
}

This breaks in lots of ways, and IndexSet no longer fully implements the protocol when it has this definition, nor can you really imagine how it could implement this.

Next we'll turn to associatedtypes:

protocol Predicate {
    associatedtype Element
    associatedtype UnionResult: Predicate where UnionResult == Element
    func contains(_ other: Element) -> Bool
    func union<P: Predicate>(_ other: P) -> UnionResult where P.Element == Element
}

This is better, but we still run in to situations where requirements aren't satisfied. Additionally, we're limited to a single return type. If we union an IndexSet with an IndexSet, it makes sense we'd want an IndexSet back. But if we union a Set<Int> with an IndexSet, we'd want a UnionPredicate<Int> back. The associatedtype does not allow this.

What I really want to do is this:

protocol Predicate<Element> {
    func contains(_ other: Element) -> Bool
    func union(_ other: Predicate<Element>) -> Predicate<Element>
}

Of course, the compiler doesn't let me do this.

I want to express the idea that union() can take any Predicate-conforming value with the same element type, and you're going to get back some sort of Predicate-conforming value that contains elements of that type.

Then in specific situations where I can provide a much more efficient implementation (like the IndexSet + IndexSet = IndexSet scenario), I want to provide those explicit overrides for that method, and otherwise default back to the implementations I provide in my protocol extension (which happen to return an OrPredicate<Int>)

I guess my questions are:

  1. Why don't we actually have generic protocols?
  2. Will we ever get generic protocols?
  3. Is what I'm wanting to do something that will ever be possible in Swift, or are we kind of screwed and stuck with how things are now?
10 Likes

When we support generalized existentials, you'd be able to use Predicate where Element == T as a type. Declaring a protocol as Predicate<T> could be reasonable sugar for it, but it would most likely be shorthand for declaring a protocol with an associated type. Associated types serve the same purpose as generic arguments on interfaces in other languages.

15 Likes

In the context of Swift, an 'existential' is, for most intents and purposes, a protocol type (for example, if you declared a variable to be type CustomStringConvertible, rather than e.g. Int).

You can't use protocol types in some places right now; the compiler will complain if you try to declare let x: Sequence, for example. "Generalized existentials" just refers to adding in the features necessary for you to be able to use protocol types everywhere you'd be able to use a 'normal' type (in other words, 'generalizing' their usefulness).

Now, imagine a function that takes in any sequence of Ints and turns them into an Array<String>. You can write that function's signature today: func myFunc<T: Sequence>(_ argument: T) -> [String] where T.Element == Int.

Say you wanted to refactor that function so that, no matter what sort of sequence you passed into it, you'd get the same kind of sequence out. So if I passed in an Array<Int>, I'd get out an Array<String>. If I passed in a LinkedList<Int>, I'd get out a LinkedList<String>, and so forth. It turns out you can't really write such a function in Swift today. "Support for higher-kinded types" is the feature that would make doing such a thing possible.

Hope that helps!

18 Likes

An additional comment I’d like to add is that generalised existentials would basically make AnySequence, AnyCollection, and so on redundant (or at least be type-aliased to their GET equivalents) because you’d be able to use any PAT without having to define a Any… type first. The net result is easier protocol-oriented programming without the necessary boilerplate when declaring associated types.

I found the Generics Manifesto quite useful.

1 Like

Because they wouldn't do what you want to do here. If we supported generic protocols, your type wouldn't conform to Predicate; it would conform to Predicate<Int> or Predicate<String> or whatever, and it could conform to both at the same time. This is the same as how, if you have a generic class MyClass, you don't write class Subclass: MyClass; you write class Subclass<T>: MyClass<T>.

What I think you actually want here is the generalized existential feature. "Generalized existential" is a ten-dollar term for a ten-cent idea: "Swift should have built-in type-erasing wrappers for PATs, much like it has them for ordinary protocols." Their syntax might be slightly different—they'd probably use a where clause instead of a generic parameter list—but they would do the thing you want to do here.

We probably won't get the kind of generic protocols I described; they're a little too niche. But there is a ton of demand for generalized existentials and I think we're planning to do them eventually.

I very much hope we aren't screwed and stuck.

7 Likes

Thank you for this question and the detailed answers. I was trying to wrap my head around this as well and now I am looking forward to it getting much more intuitive in the Future<Swift>.

I would like to link this thread post to the current topic. Unfortunately there wasn't any update on GPs since we had this conversation. cc to @regexident

1 Like

I just …

  • added a bunch example snippets
  • extended the paragraph on HKTs
  • added a paragraph on "Protocols with Generic Associated Types"
    (in case you weren't confused enough :sweat_smile:)

… to the post you linked.

2 Likes

And "Higher Kinded Types" is another ten-dollar term for a ten-cent idea: nested generics, or to put it another way, generics of generics.

3 Likes

An update to the Generics Manifesto would be useful to understand the correct state-of-play.

Last I remember hearing, HKTs we’re dead in the water so it would be interesting to hear if that’s no longer the case.

I’m not sure if it’s the slightly less than accessible vocabulary, or the fact that the manifesto hasn’t been fully implemented, but I often find myself going through a sequence of events, such as:

  1. Planning to store a bunch of general types in an array and using a standard obj-c protocol
  2. Realising my needed type is actually a PAT.
  3. Looking for alternatives so I don’t need to write a load of type eraser boilerplate
  4. Writing a load of boilerplate

My understanding is Generalised Existentials will help this.

I also wonder if a protocol vs a PAT should have had alternative names. The ‘associated type requirement’ completely changes the capability of the type and I find that’s probably the hardest thing in gaining a solid understanding. It must be mind boggling for a beginner. Even forcing comprehension with an @objc protocol qualifier might help make the distinction more explicit.

They are different but because they’re named the same it adds a significant weight in cognitive load.

But yes, an update to the manifesto would be appreciated. I’m still burning a candle for HKTs.

1 Like

I don’t think this is true in the long run, they’re just not a priority when things like ABI stability and a concurrency model are still in progress. There has been some skepticism expressed about how important HKT and the usual FP abstractions are. My sense is that HKT is pretty likely as a language feature someday but we’re unlikely to see even Functor added to the standard library without pretty strong motivation demonstrating concrete problems it solves in Swift.

There is a thread on the topic that had a lot of activity recently you may wish to read: Higher Kinded Types (Monads, Functors, etc.)

If you haven’t seen my gist showing how to encode HKT in Swift today you may also be interested in that: Emulating HKT in Swift · GitHub.

2 Likes

At the moment you can use code generation to do this. See my article here for example: [Superpowered Sequences. Sequences and Collections in Swift are… | by Jonathan @ Strictly Swift | Medium] ... which makes heavy use of @ananabits HKT gist and Sourcery.

They're both protocol kinds. The lack of capabilities when you introduce associated types or Self constraints is a limitation in the current system, one that is planned to be addressed. So it doesn't really make sense to have them be different kinds. Now if ever introduce a mixin that behaves like a protocol, but can have stored properties in them, that I would argue could be a new kind.

1 Like

In that other post, there is a problem with this section:

Because it should be more like:

protocol Bindable<T> {
  // …
  associatedtype Flattenable: Bindable = Self
  func flatMap(_ transform: (Self.Element) throws -> Flattenable) rethrows → Output<Flattenable.Element>
}

Ok, yes, I think I see what you mean - that’s reassuring. It certainly feels like a ‘gap’ in the implementation that comes up frequently.

So once we have Generalised Existentials we’ll be able to use PATs in the same way as we can currently use objc protocols. Type Erasers are just a bridge until that point.

I think having that gap filled will go some ways to smoothing over comprehension for many people.

It takes a long time to ‘get’ that the error message ‘type has associated type or self requirements’ actually means, this is a gap in the generics implementation and you probably want a Type Eraser in the mean time! :)

The core team is well aware of this issue. I believe @Douglas_Gregor has indicated that he considers this to be the top priority for significant enhancements to the generics system. It isn't clear when it will make it to the top of the overall priority list yet though.

I consider it a high priority because I don't like that the addition of an associated type to a protocol, which is often required to describe the protocol's interface well, makes it impossible to form a value of that protocol's type. We should fix that.

Generalized existentials encompasses the solution to this problem. I appreciate @beccadax's framing of this term as a "ten-dollar term for a ten-cent idea," because the term itself is very off-putting. I think it's made worse because discussions of generalized existentials can quickly veer into really complicated examples and terminology. There are complications here, and a proposal to introduce generalized existentials will have to deal with them.

However, I think we (as a community) haven't done a great job of showing how generalized existentials would address real-world problems like the ones @davedelong is referring to. Partly that's because we don't have a full design yet, and haven't really addressed the question "how can we teach this?"

Doug

14 Likes

What I'm really hoping to get out of all of this is the Swift equivalent of Objective-C's "class clusters", which I'll refer to as "protocol clusters" here.

With a cluster, I can simply define an interface, and then return any value that conforms to that interface. This is, ideally, what a protocol is for. Then under-the-hood, I can have specific implementations for various scenarios that can implement things in a more efficient manner. For example, IndexSet can be a whole lot more efficient than a Set<Int>, because IndexSet can store a bunch of Range<Int> values to encompass larger ranges than are feasible to represent by discrete Int values in a Set.

We're halfway there today. I can have a function that returns a CustomStringConvertible, and can still call .description on the result, even though I have no idea what the underlying type is.

However, once we want a generic type in there, this all goes out the window. There's no way to say "this function returns some unknown concrete type that conforms to this particular protocol, of this generic type". Like, I can't return "some collection of Ints" or "some set of strings" (which might be a Set<String> or a Trie, depending on internal decisions).

I want protocol clusters, so my code can decide for itself what the "best" implementation is, and then return that value without burdening the caller with all the implementation details, like PATs currently require.

8 Likes

I'd certainly like something more like Obj-C'c class clusters, too. Even the stdlib types like Array, String and Dictionary are really clusters (using an enum under-the-hood).

Writing these kinds of types is pretty tedious at the moment, though. Really, a "protocol cluster" should just be a public interface wrapper around an existential:

private protocol DictionaryImplementationProtocol { ... }

struct Dictionary<Key, Value> {
  private var storage: DictionaryImplementationProtocol

  // define public API.
}

And when that protocol is non-public or closed (when protocols are allowed to be closed...), the compiler might convert that existential to an enum of all known implementations, if it felt like it.

Hi Dave, forgive me if this is stating the obvious, but isn't that exactly the definition of a Generalised Existential/Type Eraser?

Can you not do something like:

extension Set: Predicate where Element == Int {
    func union<P: Predicate>(_ other: P) -> AnyPredicate<Element> where P.Element == Element {
        return AnyPredicate(IndexSet(filter { other.contains($0) }))
    }
}

extension IndexSet: Predicate {
    func union<P: Predicate>(_ other: P) -> AnyPredicate<Element> where P.Element == Element {
        guard let otherIndexSet = other as? IndexSet else {
            return AnyPredicate(IndexSet(filter { other.contains($0) }))
        }
        return AnyPredicate(otherIndexSet.union(otherIndexSet))
    }
}

Obviously, this is a contrived example and so maybe IndexSet isn't going to be the optimised backing store for every case, but perhaps instead of using IndexSet you could define a type that would allow for more flexible optimisations.

Edit: Scrap most of that – it only works on items that haven't been boxed in an AnyPredicate. Interesting problem, though.