Sealed protocols

Karl · January 8, 2019, 10:56pm

My point was that (for teaching purposes), if somebody writes a data structure and then decides "oh, I could write some kind of optimised storage for this particular case; I need to talk about this type in a more abstract way", or decides that some algorithm could return an optimised result for a particular case, they are doing type erasure and should go for a protocol (or a class).

Even if it's technically possible with an enum, the protocol approach is just easier to write and maintain, and should have no performance consequences (assuming the types are not public; cross-module is more complicated).

It's probably easier to discuss with some code.

enum MyEnum {
  case a(A)
  case b(B)

  func aReq() -> Int {
    switch self {
    case .a(let a): return a.aReq()
    case .b(let b): return b.aReq()
    }
  }
}

extension A {
  func aReq() -> Int { /* ... */ }
}
extension B {
  func aReq() -> Int { /* ... */ }
}

======= vs =======

protocol MyProto {
  func aReq() -> Int
}
extension A: MyProto {
  func aReq() -> Int { /* ... */ }
}
extension B: MyProto {
  func aReq() -> Int { /* ... */ }
}

yxckjhasdkjh · January 9, 2019, 2:05pm

The way I see it, enums are just much easier to reason about because you know in advance all the possible cases there are (e.g. a Result type). The information about how to deal with the different cases is encapsulated in each function / method that uses the enum, while with protocols that information is to some extent delegated to conforming types. And even with sealed protocols, your different conformances could be spread out over a large module.

Protocols open up a lot of flexibility (which is needed for highly generic code or when you know the list of possible conformances is not somehow naturally limited etc.), but they also add indirection and thus the code is not as easy to inspect (even though the compiler does prevent any obvious bugs).

I see the discussion as being somewhat similar to using sum types vs. typeclasses in Haskell, so it's probably helpful to read up a bit on the Haskell community consensus about when to use which (of course, there are some differences; for example, you can't use Haskell typeclasses directly to build heterogeneous lists, etc.).

Joe_Groff · January 9, 2019, 5:58pm

This doesn't feel like an idiomatic use of enums to me; the individual methods on A and B are superfluous and could just as well be written inline in the switch. As you said, if it's interesting for A and B to both share the API individually and as a single abstraction, then a protocol makes much more sense.

Setting aside temporary limitations, I think it'd help to look at the fundamental difference between enums and protocols to help derive guidance about when each is ideal. For a protocol, the "cases" are always types with individual identity, and the protocol itself does not directly have a type identity. (Existential types exist, sure, but from a model perspective I think it makes sense to consider these structural types derivable from generic constraints rather than something fundamentally tied to protocols.) On the other hand, an enum is the reverse—the cases have no type identity, but the sum of them does. That suggests that enums are at least the best tool for generic abstract sums like Optional and Result, since a Some<T> or Success<T> type would have little use outside of the enum, and for enums with lots of no-payload cases, since each empty case is likewise of little use as a type independent of the collection of cases.
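To make the identity distinction concrete, here is a minimal sketch. The names Shape, Circle, Square, and Lookup are hypothetical and chosen only for illustration; they do not come from the thread or the proposal.

```swift
// Protocol: every conformer is a type in its own right.
protocol Shape {
    func area() -> Double
}

struct Circle: Shape {
    var radius: Double
    func area() -> Double { return Double.pi * radius * radius }
}

struct Square: Shape {
    var side: Double
    func area() -> Double { return side * side }
}

// A function can ask for one specific "case", because Circle is a type.
func circumference(of circle: Circle) -> Double {
    return 2 * Double.pi * circle.radius
}

// Enum: the sum is the type; `.found` and `.missing` are only values of it.
enum Lookup<T> {
    case found(T)
    case missing
}

// There is no way to declare a parameter of type "Lookup.found",
// so the function takes the whole sum and switches over it.
func describe(_ lookup: Lookup<Int>) -> String {
    switch lookup {
    case .found(let value): return "found \(value)"
    case .missing: return "missing"
    }
}
```

Here Circle is useful on its own (circumference takes it directly), whereas a standalone "found" type would have little purpose outside Lookup.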

mpangburn · January 9, 2019, 6:10pm

In alignment with the notion of identity you're describing, protocols have the advantage of describing subsets of types via inheritance:

sealed protocol ABC { }
struct A: ABC { }
struct B: ABC { }
struct C: ABC { }

func takesABorC(_ abc: ABC) { /* ... */ }

protocol AB: ABC { }
extension A: AB { }
extension B: AB { } 

func takesAorB(_ ab: AB) { /* ... */ }

With enums, there's no simple way to express restriction among values without creating a separate type and a function to convert between the two.

enum ABC { case a, b, c }

func takesABorC(_ abc: ABC) { /* ... */ }

enum AB {
    case a, b

    var asABC: ABC {
        switch self {
            case .a: return .a
            case .b: return .b
        }
    }
}

func takesAorB(_ ab: AB) {
    /* requires a call to `asABC` to do `ABC` things with `ab` */
}

anandabits · January 9, 2019, 6:27pm

This is a great way of looking at it. The guideline would then be to use enums when the cases don't have a meaningful independent identity and the API offered is defined in terms of the entire sum. When the "cases" do have a meaningful independent identity a protocol should be used, even when the sum itself also has a meaningful identity (i.e. the existential).

In order to make this guideline as viable as possible we need to remove the current limitations of protocols. Generalized existentials will help to make protocols more viable in cases where a generic enum might be a better choice today. Exhaustive switch over (sealed and non-public) existentials will allow protocols to be used without sacrificing the ability to organize code by method rather than by type, or to switch inline as part of a larger operation if necessary. Ideally, existentials of sealed and non-public protocols would also receive a representation that is just as optimized as an enum's (which would hopefully avoid the boxing existentials require in many cases).
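For reference, this is roughly what switching over an existential looks like in today's Swift. It is only a sketch with hypothetical types (Node, Leaf, Branch); the point is the `default` branch, which the compiler requires because it cannot prove the casts cover every conformer.

```swift
protocol Node {}

struct Leaf: Node {
    var value: Int
}

struct Branch: Node {
    var children: [Node]
}

// Organised by operation rather than by type, much like a method on an enum.
func sum(_ node: Node) -> Int {
    switch node {
    case let leaf as Leaf:
        return leaf.value
    case let branch as Branch:
        return branch.children.reduce(0) { $0 + sum($1) }
    default:
        // Required today: the cast patterns are not known to be exhaustive.
        fatalError("unexpected conformer: \(type(of: node))")
    }
}
```

An exhaustive switch over a sealed or non-public protocol, as wished for above, would let the `default` branch be dropped and turn a forgotten conformer into a compile-time error instead of a runtime trap.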

Joe_Groff · January 9, 2019, 7:58pm

It should also be noted that making a protocol sealed as proposed is not by itself enough to have exhaustive knowledge of the types conforming without significant implementation complexity and compile-time cost, so this proposal probably should not promise optimizations or language features that depend on this knowledge. Private types and local types within a module are not normally visible outside of their scope, and if we were going to base language features like exhaustive pattern matching or optimizations on this knowledge, it would require that every translation unit exhaustively scan the source of the program to find any private or local types that might be conforming, which would not normally be necessary.
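A small sketch of the kind of conformance that makes this hard. The names Token, Word, and makePlaceholder are hypothetical; the local struct stands in for any private or local conformer.

```swift
protocol Token {
    var text: String { get }
}

struct Word: Token {
    let text: String
}

func makePlaceholder() -> Token {
    // A local type: no code outside this function can name it,
    // yet it is a perfectly valid conformer of Token.
    struct Placeholder: Token {
        let text = "<placeholder>"
    }
    return Placeholder()
}

// Code elsewhere in the module that assumes Word is the only conformer
// would be wrong, and nothing visible at this point says so.
func isWord(_ token: Token) -> Bool {
    return token is Word
}
```

To treat the set of conformers as closed, the compiler would have to discover Placeholder by looking inside the body of makePlaceholder, even when compiling a different file.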

Furthermore, layout optimizations on existentials are not as trivial as they may seem, since existentials are structural types, and significant amounts of runtime code expects existentials to have certain layouts. Existential-specific layout optimizations could also end up pessimizing conversions between related existential types, since their optimized forms could have significantly different layouts. An optimization pass could conceivably replace an existential by a synthesized enum in places where it's a concrete type, but we would likely have to reabstract to the generic existential representation any time we dynamically manipulate the existential type. It would also be impossible to lay out existentials for sealed public protocols differently without breaking ABI if we want it to be resilient to removing sealed-ness.
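The gap between the two representations can be observed with MemoryLayout. This is just an illustrative sketch with hypothetical types; exact sizes depend on the platform and compiler version, so the code only compares them.

```swift
protocol Shape {}

struct Circle: Shape {
    var radius: Double
}

struct Square: Shape {
    var side: Double
}

// A hand-written sum of the same two payloads.
enum ShapeEnum {
    case circle(Circle)
    case square(Square)
}

// The existential has one generic layout that must work for any conformer;
// the enum is sized for exactly the payloads it declares.
let existentialSize = MemoryLayout<Shape>.size
let enumSize = MemoryLayout<ShapeEnum>.size
print("existential:", existentialSize, "bytes; enum:", enumSize, "bytes")
```

On current implementations the existential should come out larger, since it carries type metadata and a witness table reference next to an inline value buffer, and runtime code relies on that fixed shape.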

There are other benefits to this proposal for sure, but if exhaustive knowledge of conforming types is an important goal of this proposal, it would need some adjustment to achieve that goal, such as possibly restricting the conformance of private or local types. Otherwise, the proposal text should not take this for granted and should probably not promise anything about layout optimizations or exhaustive pattern matching.

anandabits · January 9, 2019, 8:14pm

I didn't mean to imply that this proposal would promise any of that. @Karl has already made it clear that it is out of scope for this proposal. sealed protocols have plenty of merit without those features. But I would like to see these directions explored in the future as they would reduce the number of tradeoffs we face when making design decisions. It would be fine if they were subject to limitations (such as all conformances must be visible in order to switch exhaustively).

I don't know too much about layout optimizations but it would be nice to be able to choose the logical semantics of protocols and existentials without having to pay for boxing. Conversions between related existentials are not always necessary so that cost isn't always a factor. It would sometimes be acceptable as a tradeoff if we had the ability to influence the choice of layout. re: resilience, I think it would be fine to require @frozen sealed for layout optimization of public protocols.

Joe_Groff · January 9, 2019, 8:39pm

More specifically, the proposal draft (sealed protocols by karwa, Pull Request #972, apple/swift-evolution on GitHub) says:

Similarly, when the compiler has knowledge about the conforming types, it can use optimised operations to handle existentials.
Currently, we advise to make protocols which are only conformed-to by classes inherit AnyObject, but this then becomes part of the protocol's ABI
and clients may depend on it. sealed protocols have the possibility to lower this to an 'informal' optimisation within the declaring module,
and support more patterns between conforming types.

and makes passing reference to optimizability in other places, which is not really possible as proposed.
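For context on the AnyObject advice quoted from the draft, here is a sketch of the effect that constraint has today. Unconstrained, ClassBound, and Model are hypothetical names; sizes are compared rather than printed as fixed numbers because they vary by platform.

```swift
protocol Unconstrained {}
protocol ClassBound: AnyObject {}

final class Model: Unconstrained, ClassBound {}

// Both protocols are only ever conformed to by a class here, but only the
// AnyObject constraint tells the compiler that as part of the declaration.
let generalSize = MemoryLayout<Unconstrained>.size
let classBoundSize = MemoryLayout<ClassBound>.size
print("unconstrained:", generalSize, "bytes; class-bound:", classBoundSize, "bytes")
```

The class-bound existential is smaller because it only needs an object reference plus a witness table, which is exactly the kind of knowledge that becomes part of the protocol's ABI once it is written down.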

anandabits · January 9, 2019, 8:45pm

I see. Definitely makes sense to remove this text then. I think the feature has sufficient motivation without discussing optimizations.

Chris_Lattner3 · January 10, 2019, 6:20am

I agree with Joe. I fear it will bring up many resilience and access control discussions which are not really the important problems to be facing at this point in Swift's evolution. I'm personally very -1 on this feature, from an "it adds a bunch of complexity for very little gain" perspective.

Edit: The complexity I'm concerned about here is "language and conceptual complexity", not compiler complexity.

-Chris

anandabits · January 10, 2019, 2:23pm

Can you elaborate on this? Just the basic proposal without any of the enhancements that have been discussed would be enough to eliminate the need for this awful hack that prevents conformances from being added outside a module. I have needed to use that hack in a few places and would very much like to get rid of it.

The fact is that there are important library design techniques that don’t work if conformances can be added outside the library. I think enabling them is an important goal and reduces complexity for both libraries and users by allowing the library to more clearly state its intent within the language itself.

Karl · January 10, 2019, 5:24pm

Yes, please elaborate. There has been plenty of discussion on this topic over the years, and every time, people who write Swift libraries and applications every day share stories about the hacks and tricks they use to emulate this feature. You can't just drop a bomb like that and walk off ;)

The only access control discussion I can see is whether or not protocols should be sealed by default and made open instead of the reverse. It's worth having that discussion as soon as possible, but I think everybody understands that it's unlikely. I also don't see any "resilience" impact (in the @_fixed_contents/@_frozen sense); it's valuable even for source libraries to declare protocols as sealed. Again, writing good protocols is hard.

ben-cohen · January 10, 2019, 6:11pm

The alternative to that hack is to document that this protocol is not to be conformed to, and leave it at that. Protocols aren't just bags of syntax. You must always read and understand the documented requirements of a protocol when you implement it – or else you haven't actually implemented the protocol.

In this case, if you read the requirements and they are "don't implement this", things are pretty clear. Proposing a new keyword is no small thing, and the case would need to be made that this proposal delivers significant benefit over this approach.

gwendal.roue · January 10, 2019, 6:24pm

Yes. This is what the stdlib does today, and I don't hear much fuss about people trying to conform to StringProtocol.

I mean, the status quo is a very serious option, and nobody should feel bound to adopt the various hacks described in this thread.

This is especially true now that it has been clearly stated above that the pitch should not make any promise about possible compiler optimizations or extensions such as exhaustive switches.

beccadax · January 10, 2019, 6:36pm

It's not in the standard library proper, but I would hope SwiftSyntax would seal the Syntax protocol instead of having an internal _SyntaxBase to serve as the "real" protocol for conformers.

anandabits · January 10, 2019, 6:57pm

Some of us adopt a different philosophy. The hack is awful, but it doesn’t take that much code (which can even be generated making it almost as concise as using sealed). It provides an ironclad guarantee that the library is not abused. It is quite reasonable for a library to take relatively small measures like that to prevent abuse.

IMO, this is the most responsible approach to library development. Yes, protocols are also about semantics and users should read documentation and use them accordingly, but that doesn’t mean the language shouldn’t provide tools to prevent abuse.

Further, the reality is that not everyone bothers to read documentation. We shouldn’t modify our designs around that bad behavior, but we should still strive to offer the best experience possible for all Swift users. Every abuse a library is able to prevent by construction is a win in this regard. (I am not speaking of abuse with malicious intent here, primarily abuse by accident and / or naivety).

Tino · January 10, 2019, 7:00pm

I don't know about the actual use cases for sealed, but I guess hiding a protocol might be sufficient in some situations.
In this context, the meaning of "hiding" would be using an internal protocol in a public method, with the effect that the module would expose overloads of that method for each type that conforms to the protocol (while keeping the connecting protocol secret).
This wouldn't need new keywords, and afaics has no impact on backwards compatibility.
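Written out by hand in today's Swift, the effect described here would look something like the sketch below. All names (Renderable, Label, Divider, drawAny, draw) are hypothetical, and the idea as pitched would have the compiler generate the public overloads rather than the author.

```swift
// The connecting protocol stays internal to the module.
internal protocol Renderable {
    func render() -> String
}

public struct Label: Renderable {
    public var text: String
    public init(text: String) { self.text = text }
    func render() -> String { return "Label(\(text))" }
}

public struct Divider: Renderable {
    public init() {}
    func render() -> String { return "----" }
}

// Shared implementation, written once against the internal protocol.
internal func drawAny<T: Renderable>(_ item: T) -> String {
    return "[" + item.render() + "]"
}

// Public surface: one overload per conforming type, forwarding to the
// shared implementation. Clients never see Renderable.
public func draw(_ label: Label) -> String { return drawAny(label) }
public func draw(_ divider: Divider) -> String { return drawAny(divider) }
```

Clients can call draw with a Label or a Divider but cannot add a third type, because they cannot name the protocol at all; the flip side is that they also cannot write generic code over it.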

xwu · January 10, 2019, 7:07pm

anandabits:

Some of us adopt a different philosophy. The hack is awful, but it doesn’t take that much code (which can even be generated making it almost as concise as using sealed). It provides an ironclad guarantee that the library is not abused. It is quite reasonable for a library to take relatively small measures like that to prevent abuse.

IMO, this is the most responsible approach to library development. Yes, protocols are also about semantics and users should read documentation and use them accordingly, but that doesn’t mean the language shouldn’t provide tools to prevent abuse.

Further, the reality is that not everyone bothers to read documentation. We shouldn’t modify our designs around that bad behavior, but we should still strive to offer the best experience possible for all Swift users. Every abuse a library is able to prevent by construction is a win in this regard. (I am not speaking of abuse with malicious intent here, primarily abuse by accident and / or naivety).

Agree with this philosophy generally, but if this is to be the tentpole advantage of sealed over the status quo then we are modifying our designs--of Swift itself no less!--around bad behavior.

anandabits · January 10, 2019, 7:15pm

This is not how I see it. The tentpole advantage is that it allows us to express our design in the language itself. Preventing conformances outside the module is a relatively common design constraint and it is unfortunate that it cannot be expressed in the language.

Expressing our designs clearly in the language is an important goal IMO. This is one of the advantages of Swift-style protocols over the duck-typed generics that some languages have. If we want libraries to be able to define their public contracts as clearly as possible in the language then we need to support features like this and not just fall back on RTFM.

anandabits · January 10, 2019, 7:16pm

This would not meet any of the use cases I have. The protocol and conformances to it must be visible to users of the library. They are just not allowed to add new conformances.