Introducing role keywords to reduce hard-to-find bugs

QuinceyMorris · March 29, 2018, 11:36pm

OK, here’s an idea I’m setting up as a target to be shot down:

1. Let’s ignore (for now) protocols that have an associated type.
2. Let’s avoid fixating on whether changes are source-breaking, but circle back around to that issue later if necessary.
3. Let’s stipulate that we eliminate the distinction between a protocol declaration and a protocol extension: a protocol is an amalgamation of itself and all its extensions. (I’m not sure of all of the implications of this.)
4. A member declaration ("requirement") in a protocol is by default both a behavior (visible to clients using conforming types) and a customization point (implementable by a conforming type). Such a declaration may or may not have a default implementation in the protocol itself. (If it doesn’t, all conformers must implement it, as now.)
5. A member declaration may be marked final, meaning that it is a protocol behavior (visible to clients) but not a customization point (not implementable by conformers). It must have a default implementation in the protocol.

Edit: This doesn't imply that the member is statically dispatched. It just means that the default implementation can't be supplanted. (Think "overridden", but let's not actually use that word except with classes.)

6. A member declaration may be marked private, meaning that it is not visible to clients, but may be implemented or used by conformers. (This is slightly abusing the meaning of private but not by much, I think.)
7. A member declaration may be marked both final and private, meaning that it is a "utility" property or method usable by conformers, but not implementable by them.
8. Conformers providing an implementation for a customization point must name their declaration P.m or P.m(…), where P is the name of the protocol with property or method m as a customization point. To ease the transition, the compiler should accept just m in the conformer declaration, with a warning that the P. should be specified, provided there’s no ambiguity between conformances to two different protocols. (If there’s ambiguity, the warning should be an error instead.)

Edit: It's not clear to me whether there's a use case for a single declaration that satisfies two unrelated protocols P and Q simultaneously. If so, then there's a potential syntax in (P&Q).m which is horrible but at least viable.

The point of #8 is to avoid adding a keyword to indicate that the member is intended to be a conformance. Any benefit in disambiguating multiple protocol conformances is just a side-effect.

9. At use sites, protocol members may be referred to as someVariable.P.m or someVariable.P.m(…), but if there is no ambiguity with other protocols, the more usual syntax someVariable.m or someVariable.m(…) may be used.

Edit: I hate this someVariable.P.m syntax. Something like it is only needed if the variable's type has separate conformances to unrelated protocols, as in #8.

10. A conformer is not allowed to declare a member with the same name as a member of a protocol to which it conforms unless that member is a conformance.
11. [Some rule that resolves the problem where subclass override methods are ambiguous with regard to protocol methods for which a superclass provides an implementation that overrides a default implementation (if this isn’t somehow covered by the other rules).]

Tino · March 29, 2018, 11:39pm

I'm not aware of such discussion — but people not being aware of important other posts is one major problem here ;-)
Pure "consumption protocols" are imho an odd thing, so I can't think of any benefit a split would have.

But to refine that classification, I see two motivations for default implementations:
making methods optional, and providing functions built on top of other parts of the protocol.

Looks like we have similar feelings ;-)

Tino · March 30, 2018, 12:26am

What about types that already have such an implementation? Would it be impossible for them to conform?

There's one thing that has huge impact on many protocol-ideas:
It's considered to be important that you can add conformance to a type that isn't even aware of the protocol in question, and afaics, it's quite common to design protocols after existing Cocoa types...

Paul_Cantrell · March 30, 2018, 1:13am

I’ve pondered this a lot over the last few years; here are my thoughts. Brain dump follows, with apologies for the length.

I see Swift’s protocols as being unified around the idea of “set of behaviors that a type may have,” with “behavior” defined in Barbara Liskov’s sense of the word. We use the presence of members with particular signatures as a proxy for (and a sanity check of) those desired behaviors, but a protocol is fundamentally a set of behavioral assertions, a semantic description, and not just a checklist of structural requirements.

Some of those behaviors may be deducible from others: “because any P has behaviors X, Y, …, it also has behavior Z.” For example, “because a Collection has an element count, we can ask if it is empty.” These behaviors are the things we can define in protocol extensions.

Some behaviors may have better (but logically equivalent) implementations: e.g. “because it has a first and last index, we can ask if it is empty,” and that may perform better if count is O(n) but first & last index are O(1).
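A minimal sketch of those two ideas in today's Swift, not taken from the standard library: the names Countable, Numbers and Span are hypothetical, and the O(n) count is simulated. isEmpty is listed in the protocol, so the default can be replaced by a logically equivalent implementation.

```swift
// Hypothetical protocol for illustration; not the real Collection.
protocol Countable {
    var count: Int { get }
    // Listed in the protocol itself, so conformers may supply their own version.
    var isEmpty: Bool { get }
}

extension Countable {
    // Deduced behavior: anything with a count can say whether it is empty.
    var isEmpty: Bool { return count == 0 }
}

struct Numbers: Countable {
    let values: [Int]
    var count: Int { return values.count }
    // No isEmpty here: the default implementation is used.
}

struct Span: Countable {
    let lower: Int
    let upper: Int   // assumed to be >= lower
    // Simulates a type whose count has to walk every element.
    var count: Int { return (lower..<upper).reduce(0) { n, _ in n + 1 } }
    // Logically equivalent to the default, but only compares the two ends.
    var isEmpty: Bool { return lower == upper }
}

func describe<T: Countable>(_ c: T) -> String {
    return c.isEmpty ? "empty" : "\(c.count) element(s)"
}
```

Because isEmpty is a requirement, the call inside describe reaches Span's own implementation even though describe only knows its argument as a Countable.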

I therefore don’t quite agree with this:

All of the things a protocol describes, whether in an extension or not, whether explicitly enumerated as members or implicitly described by the protocol’s name and documentation, all of them are guaranteed behaviors. It’s just that some of these guarantees we can prove are satisfied using only other behaviors of the protocol itself (extension members), whereas others we can only prove are satisfied on a type-by-type basis (and thus must be implemented by conforming types).

Conditional conformances are a natural extension of this perspective: “we can prove any P has behavior X if we can make a few additional assumptions about the specific type that implements it.”
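As a rough illustration of that reading of conditional conformance (Swift 4.1 or later), with a hypothetical Describable protocol: Array gains the behavior only under the extra assumption that its elements have it.

```swift
// Hypothetical protocol for illustration.
protocol Describable {
    func describe() -> String
}

extension Int: Describable {
    func describe() -> String { return "Int(\(self))" }
}

// "We can prove any Array is Describable if we may assume its Element is."
extension Array: Describable where Element: Describable {
    func describe() -> String {
        return "[" + map { $0.describe() }.joined(separator: ", ") + "]"
    }
}

let flat: [Int] = [1, 2, 3]
let nested: [[Int]] = [[1], [2, 3]]
let flatText = flat.describe()
let nestedText = nested.describe()
```

An array of a non-Describable element type, such as [String] here, simply has no describe() member.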

This is also more or less in the spirit of your #3 above, Quincey.

All of the above makes perfect sense so far. Swift’s protocols just get into trouble when static dispatch runs into name collisions. It really just boils down to that.

Swift is not Smalltalk, and for various implementation reasons, some related to performance but others related to how witnesses etc work, it's not acceptable for Swift to use fully dynamic message-based dispatch under the hood for everything. I got schooled on this in swift-evo a couple of years ago by people who understand the compiler; I’ll let them speak to it if they care to.

What this means in terms of the above is that (1) methods defined only in (possibly retroactive) extensions and not in the protocol itself must be statically dispatched, and (2) statically dispatched extension methods can’t just say “because this value is a P, it can X;” they have to say the subtly, insidiously different “because you are using this value as a P, it can X.”

This appears to violate LSP: change the static type of a variable from P to Q, get a different behavior! The way to view this that preserves LSP is to think of each extension method that is not listed in the protocol itself as non-overridable, so if an extension to P and an extension to Q both happen to define something named foo, then P.foo and Q.foo are logically distinct.
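A small sketch of that situation as it behaves in current Swift. P, Q and foo follow the paragraph above; the struct X and the String return values are assumptions added so the result is observable.

```swift
protocol P {}
protocol Q {}

// Neither foo is listed in its protocol, so both are extension-only methods.
extension P {
    func foo() -> String { return "P.foo" }
}

extension Q {
    func foo() -> String { return "Q.foo" }
}

struct X: P, Q {}

let x = X()

// x.foo() is rejected as ambiguous, so a static type has to be chosen.
let viaP = (x as P).foo()
let viaQ = (x as Q).foo()
```

The same value yields "P.foo" or "Q.foo" depending only on the static type it is viewed through, which is the sense in which the two methods are logically distinct.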

With the vexing questions thus cut down to size, the only problem is that it’s hard to tell when you’re in this dangerous situation, and the compiler doesn’t help much. This proposal is an attempt to remedy that.

Quincey, there have been past discussions on swift-evo very much along the lines you’re thinking of here:

I once proposed a P::x syntax, i.e. a.P::x chooses P’s extension’s implementation of x, but it didn’t carry its weight since (a as P).x already serves the same purpose.

@beccadax made a heroic effort to resolve the P vs. Q conflict at module import time, but it got hideously complex.

Neither of the above made it out of the pitch phase.

There was a long discussion of perhaps using final to make static dispatch clear at the point of declaration, but that didn't get much love precisely because of the “snowdrifts of modifiers” problem you mention: lots of required “final” modifiers in extensions. Also, can/should “final” mean something in the protocol itself? That discussion got messy, IIRC.

The role keywords proposal we have here, at least as I understand it, represents the best attempt to date to mitigate the pain of all the above with minimal language impact. It looks to add the minimal surface area to the language necessary for the compiler to help by flagging situations where programmer intent doesn’t match actual behavior in extension member lookup.

It is a snowdrift-aware proposal. It’s not possible to generate the warnings we’d want without a little more information from the developer about what they intend. The current state of the proposal, however, tries to minimize that additional information.

I’ve wanted something along these lines, for a practical reason: a protocol might say “if you have internal state X, then you can have behavior Y.” For example, you might say “if you have fooObservers: [FooObserver], then you can also have a notifyFooObservers(…) method.”

Unfortunately, the way protocols work now, if FooObservable is a public protocol, then both fooObservers and notifyFooObservers must be public. It would be nice to apply access restrictions to both of those things so that clients of implementing types can’t mess with the internal state of the observers.
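A single-file sketch of the observer example. FooObservable, FooObserver, fooObservers and notifyFooObservers come from the paragraphs above; Model, Recorder and runDemo are hypothetical, and everything uses default (internal) access, so this only imitates what a client of a public protocol could do.

```swift
protocol FooObserver: AnyObject {
    func fooChanged(to value: Int)
}

protocol FooObservable: AnyObject {
    // Internal state the protocol needs, but which clients can also reach.
    var fooObservers: [FooObserver] { get set }
}

extension FooObservable {
    // "If you have fooObservers, then you can also notify them."
    func notifyFooObservers(_ value: Int) {
        for observer in fooObservers {
            observer.fooChanged(to: value)
        }
    }
}

final class Model: FooObservable {
    var fooObservers: [FooObserver] = []
    var foo = 0 {
        didSet { notifyFooObservers(foo) }
    }
}

final class Recorder: FooObserver {
    var seen: [Int] = []
    func fooChanged(to value: Int) { seen.append(value) }
}

func runDemo() -> [Int] {
    let model = Model()
    let recorder = Recorder()
    model.fooObservers.append(recorder)
    model.foo = 1                   // legitimate notification
    model.notifyFooObservers(99)    // a client faking a change
    model.fooObservers.removeAll()  // a client wiping the observer list
    model.foo = 2                   // nobody hears about this one
    return recorder.seen
}
```

Nothing in the protocol stops the last three client-side calls, which is the access-control gap described above.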

However, this is an orthogonal question to this proposal. I’d love to take this problem on in another thread. It gets sticky!

QuinceyMorris · March 30, 2018, 2:53am

I guess the answer is that conformances imposed on a type retroactively (which, perhaps, means "across module boundaries"??) would accept conforming implementations without warnings? What does Erica's proposal imply about this situation? I didn't get a clear sense from reading it that it ruled on the matter, but maybe it did.

They're not a requirement on conformer implementations, because of the default implementation. I probably should have just stayed away from the word "requirement".

I need to go away and think about what you said here and in the rest of your response. I have to admit I don't understand the internal design of the compiler well enough to reason about the effect of behavioral promises (to consumers) or implementation requirements (on conformers) on what comes out dynamically dispatched or statically dispatched. It would be nice to be able to talk about syntax, orthogonal to such implications.

Edit: Are we talking about static dispatch because I used final? I don't think of that as meaning static dispatch. Rather, in type declarations, the compiler can infer static dispatch from final sometimes but not always. In protocol declarations, final would just mean the default implementation is the only one (can't be shadowed or "overridden"). Dispatch would be whatever default protocol implementations do today.

Absolutely! In my mind, I think of this as non-public conformance to public protocols. I agree it's highly desirable, and I agree it's orthogonal to this proposal.

Paul_Cantrell · March 30, 2018, 3:04pm

A defining feature of Swift, I think, is that it always keeps an eye on C-like compilation, or at least the possibility thereof, but aims for the highest-level programmer model it can within that constraint. It’s willing to pay some costs for its abstractions (cf Rust), but keeping runtime abstraction cost low and deterministic is a hard constraint when it comes down to the wire (cf Haskell, Java, Ruby). The language mostly feels like it sits at a C# or ML level of abstraction, but the compilation concerns do leak through sometimes. (For example, it’s notable that a concrete Array and not an abstract List protocol is the bread-and-butter list type.)

What all that means for the discussion at hand is that when the compiler authors say there are runtime limits we have to design around, then we design around them; we can’t have the luxury of always thinking about syntax or semantics orthogonal to implementation. That doesn’t mean the Swift programmer must be aware of the underlying language implementation all the time (heaven forbid!). It means that we look for a mental model for programmers that fits the constraints but doesn’t require understanding them, that is sensible and pleasant to work with, and that comes with a culture and set of practices that make the model play out well in practice.

That’s the line of thinking behind my long, rambling post above. I'm looking for a sensible and pleasant way of conceptualizing protocols that fits the implementation constraints.

By “static dispatch,” I mean that changing the compile-time type of an expression without changing its value can cause different code to execute. That is something a programmer needs to have in their mental model even if they’re not thinking about implementation.
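A sketch of that definition in current Swift, using hypothetical names (Greeter, Pirate, greetGeneric). name() is a requirement; greeting() exists only in the extension, so a conformer's greeting() shadows it instead of customizing it.

```swift
protocol Greeter {
    func name() -> String   // requirement: a real customization point
}

extension Greeter {
    // Extension-only member: statically dispatched.
    func greeting() -> String { return "Hello, \(name())" }
}

struct Pirate: Greeter {
    func name() -> String { return "Anne" }
    // Same name as the extension method, but only visible when the
    // static type is Pirate.
    func greeting() -> String { return "Ahoy, \(name())" }
}

func greetGeneric<T: Greeter>(_ g: T) -> String {
    return g.greeting()
}

let pirate = Pirate()
let asConcrete = pirate.greeting()
let asProtocol = (pirate as Greeter).greeting()
let asGeneric = greetGeneric(pirate)
```

The value never changes, yet asConcrete is "Ahoy, Anne" while asProtocol and asGeneric are "Hello, Anne". Note that name() still reaches Pirate's implementation in every case, because it is listed in the protocol.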

I don’t mean static dispatch because the word final is present; I mean that some extension methods must use static dispatch, and the question is how best to make programmers aware that this is happening, help them reason about it, and flag potential errors.

Using the word final was something that kicked around in previous discussions, but it didn’t survive the pitch phase. One reason is that it’s not clear whether people will expect “can’t override” to mean “can’t shadow.” The push has consistently been toward (1) fewer keywords and new constructs, (2) more implicit behavior that keeps things tidy if the programmer does understand the dispatch rules at play, and (3) more compiler warnings to help them realize when they don’t understand.

DevAndArtist · March 30, 2018, 3:13pm

Hello @Erica_Sadun, could you provide a statement why the newer version of the proposal has such a dramatic shift of design?

I had a quick reading over the discussion until the very recent responses (which I have not read yet), and I had the impression that the support of the idea was towards a single new mandatory keyword design using default for default implementations (originally pitched by @hartbit).

I also think it is very important that we, the Swift community, find an agreement on the design first, before we ask someone to implement it. It would be really fatal to let someone implement such a huge change with lots of different keywords and rules, then throw everything away and re-implement it again because the design wasn't good enough.

I personally do not like this shift of the design. What I really would wish is for Swift to adopt a mandatory default keyword and that's about it. default perfectly describes that the extension member is supposed to be a default implementation. If it's mandatory then you get compile-time guarantees that the implementation matches a requirement. To this day I don't know whether default implementations with default parameters are real default implementations. It just works so I'm happy with that fact, but I don't have strong guarantees I can rely on. From the reader's perspective we also get information about the dispatch behaviour of the extension member just by checking whether the keyword is present or not.

required does not make any sense to me because there is nothing required in a sense of a default implementation, a protocol is not required to have any default implementations at all.
extended does not add anything new to the language (this has been mentioned in the very beginning of this thread) and can stay implicit.
I'm not sure I recall it correctly, but I have a feeling that override does not work because of retroactive conformances?! Can someone please clarify this?
final is completely orthogonal to this proposal because it adds new functionality instead of being a nice code guard like default would be. In fact, I pitched this idea 2.5 years ago but never really pursued it:
- Proposal: Finalization in protocol extensions and default implementations

Paul_Cantrell · March 30, 2018, 3:32pm

Oops! I’ve been looking at the old one.

Yes, that does seem like a lot of keywords. I see that there is explanation in the proposal of why it didn’t go with other alternatives. See especially the second item.

hartbit · March 30, 2018, 6:38pm

I totally agree with what @DevAndArtist said. I’m also worried about any solution which introduces too many keywords and would prefer the solution to mirror how override works today. If that means we introduce many warnings, then so be it. This is a lesser price to pay than extra design complexity IMHO.

hlovatt · March 31, 2018, 5:18am

Erica et al.’s write-up is great and spells out some of the problems with protocols and a possible solution.

My preferred options in order are:

1. Fix protocols properly with breaking changes that make the listed errors, errors.
2. Erica et al.’s proposal.
3. David Hart’s proposal that only the default keyword is added.
4. Do nothing.

I favour fixing protocols properly, option 1 above, because the longer they stay in the language as is, the more problems people will have, and let’s face it, these problems are unfortunately all too common. Yes, there will be pain now, but much less than the accumulated pain over the lifetime of Swift if they are not fixed. We are accumulating technical debt every day.

jawbroken · April 1, 2018, 3:32pm

I must admit that I really have no idea what “fixing protocols properly” means at this point, perhaps you could elaborate, but making these issues errors sounds like something that would break retroactive conformance and other code evolution. The difficulty in this space is that it's very easy to come up with many designs that work great in a single module example that you can post on the forums, but very difficult to design something that survives contact with the fact that the protocol and conformance can be in different modules, and those modules can both evolve separately in ways that shouldn't break.

Erica's new pitch seems pretty heavyweight to me, and I would probably prefer some carefully designed warnings (which obviously can't break code) and possibly the addition of a single keyword, perhaps to silence those warnings. Something like this already exists for near-misses in protocol conformances, or at least it was planned to.

xwu · April 1, 2018, 11:23pm

This topic is near and dear to my heart, as it's the very first one that got me involved in Swift Evolution to begin with.

I agree with what I think @jawbroken is saying, that "fixing protocols properly" by fundamentally changing Swift syntax for protocol conformance is almost certainly out of the question as it'd be massively source-breaking. Erica's pitch, with optional keywords, very nicely balances various needs such as source compatibility, retroactive conformance, and clarity.

In the many earlier discussions about this issue, the core team has mentioned that they were very well aware of real-world difficulties with conforming to protocols, and that they intended to improve the Swift compiler in ways that don't require syntax changes. Since last year, some of these changes have landed in the repository which change the real-world user experience. Because releases necessarily lag the top-of-tree, it's likely that many users haven't had much time to experience the new-and-improved diagnostics.

New near-miss diagnostics are quite clever (IMHO) and build on the increasingly accepted style of stating conformances in their own extension. When a user uses that style, near-misses are diagnosed with a compiler warning that can be silenced by moving the near-miss to a separate extension or making the implementation private:

protocol P {
  func foo(_: Int) -> Bool
}
extension P {
  func foo(_: Int) -> Bool { return true }
}

struct S { }
extension S : P {
  func foo(_: Int) { print("Hmm.") }
}
// Warning: Instance method 'foo' nearly matches defaulted requirement 'foo' of protocol 'P'

Note how the compiler is able to make this diagnosis without the use of any keywords. Moreover, we have more information than can be obtained from a required keyword: we can also see which specific protocol's requirement the author intended to implement but missed.

In the past few years, the compiler has also gained the ability to insert, via fixit, missing required members for a protocol conformance. It certainly doesn't work perfectly--yet: with complex protocol hierarchies, sometimes duplicate declarations are inserted and sometimes some are missing. However, with ongoing work in SwiftSyntax and other parts of the project, I would anticipate that this can only improve with time. My dream would be eventually to have diagnostics also for unintended mutual recursion when concrete implementations of some methods call defaulted implementations of others that rely on the first.

Between the new near-miss diagnostics and fixits for missing required members, I think the real-world experience of conforming to protocols has improved significantly, changing the pros and cons of additional keywords. As a result, I'd advocate for allowing users more time to experience these improvements in Swift 4+ before forging ahead with additional syntax-based solutions.

Douglas_Gregor · April 2, 2018, 5:16am

Xiaodi has captured my views on the subject fairly well, although I have a couple of comments to add.

Having only been turned on initially in Swift 4.1, the near-miss diagnostics still need a bit of tuning to see if they can address the problem "enough". I do think we're close enough that it doesn't make sense to add a new keyword (or several keywords!) to an already-crowded space. The tools could also do better at handling missing required members, but as Xiaodi says, this will improve over time.

I do think we should tackle the disambiguation problem when there are two protocols with the same requirement. I don't think it comes up often, but the lack of a disambiguation mechanism when it does happen bothers me. Specifically, consider this example:

protocol P {
  func foo()
}

protocol Q {
  func foo()
}

struct X { }
extension X: P {
  func foo() { ... }
}
extension X: Q {
  func foo() { ... } // error: redeclaration of 'foo', so I can't have P's foo and Q's foo be different
}
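For contrast, a sketch of what current Swift does accept: a single foo() that satisfies both requirements at once. The String return type is an assumption added here so the result can be observed (the example above returns nothing), and callQFoo is a hypothetical helper in the same style as a generic function constrained to P.

```swift
protocol P {
    func foo() -> String
}

protocol Q {
    func foo() -> String
}

struct X {}

// One declaration serves as the witness for both P.foo and Q.foo.
extension X: P, Q {
    func foo() -> String { return "X.foo" }
}

func callPFoo<T: P>(_ t: T) -> String { return t.foo() }
func callQFoo<T: Q>(_ t: T) -> String { return t.foo() }

let viaPRequirement = callPFoo(X())
let viaQRequirement = callQFoo(X())
```

Both calls reach the same method, so there is no way here to give P's foo and Q's foo different behavior.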

My proposal is that one can qualify the declaration to specify which requirement it is satisfying. Such a method is only referenceable via the protocol:

extension X: P {
  func P.foo() { ... } // only used to satisfy the "foo" requirement of P
}

extension X: Q {
  func Q.foo() { ... } // only used to satisfy the "foo" requirement of Q
}

Note that we've eliminated the ambiguity, but we've also provided a specific mechanism by which we can state our precise intent to satisfy a particular requirement. It's better than a keyword because we've stated which protocol we're referring to, so it's clearer. It also helps fill in the gaps where near-miss checking doesn't work so well. For example, near-miss checking depends somewhat on the convention that one writes a new extension for each protocol conformance. However, you might not be able to do that if your conformance relies on something that must be written in the main type definition, such as a stored property or a required initializer. For example:

protocol Initable {
  init()
}

protocol Name {
  var name: String { get set }
}

class C : Initable, Name {
  var name: String = ""  // can't move this to an extension!
  required init() { } // can't move this to an extension!
}

We can't benefit from near-miss checking when we can't move those conformances to extensions, but we could be more specific with my proposal:

class C : Initable, Name {
  var Name.name: String = ""  // can't move this to an extension!
  required Initable.init() { } // can't move this to an extension!
}

Doug

Jon_Shier · April 2, 2018, 5:51pm

It seems like this syntax could also be used in other cases, like extensions from multiple modules having the same name. The ability to disambiguate between Module1.String.camelCased and Module2.String.camelCased would be great.

Erica_Sadun · April 2, 2018, 7:44pm

The revision does two important things:

1. It showed the entire scope of what could be accomplished with role annotations.
2. It got people to actually finally pay attention to the proposal.

Proposals seem to go nowhere unless they get some team member on board (and even then they can still sit with a pull request, which this one does not yet have, without action or feedback). It can be extremely frustrating.

What I need to gauge is what parts to push and what to ditch. Experience suggests that "alternatives considered" have not ever been considered in the core review if they expand the proposal's scope. It's a better bet to over-propose and have the team cut away what they don't want.

I'm open to feedback as to what will give the biggest bang for the buck with the least disruption. That said, I'm leading with the full solution (which is what I have currently posted in the gist) because that one actually solves the most problems and proactively prevents the errors listed at compile time.

VladimirS · April 2, 2018, 9:27pm

As X is a struct, foo() is a method of X in the first place. So how can we then call .foo() on an X instance?
X.foo() // which foo() should be called?
X.Q.foo() // do you propose this syntax ?
What if X itself had a foo() method in its declaration block?

Tino · April 2, 2018, 9:32pm

The easiest answer is

let x = X()
(x as P).foo()

(and imho that case is not common enough for additional sugar on top)

Douglas_Gregor · April 2, 2018, 9:33pm

They'll be ambiguous, so you'll need to go through something that treats X as a P or a Q, e.g.,

func callPFoo<T: P>(_ t: T) { t.foo() }

It's worth considering some kind of scope resolution reference, yes. X.Q.foo() itself introduces other ambiguities, which is why X.Q::foo() has come up before as an unambiguous alternative.

I assume that it would hide P's foo and Q's foo.

Doug

VladimirS · April 2, 2018, 10:10pm

Probably you are right and I'm missing something, but FWIW I can't accept the assertion that instead of having strong tools (new keywords, syntax, attributes, etc.) to fix the problem (no, I don't understand the fear of a new keyword/syntax/attribute for such an important problem), we should rely on some magic "very intelligent" diagnostic that should (aha) warn you in all cases when "something is wrong".

The compiler just can't decide whether a method in a protocol extension block was really meant as a default implementation or is just an extension method. Only the author of the code can explicitly mark such a method to prevent hard-to-find bugs in the future (for example, when the method in the protocol declaration was renamed, but the method in the protocol extension was not). The same is IMO true for other problems related to protocols: no single diagnostic will guard against these hard-to-find bugs.
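A sketch of the rename scenario in current Swift. All names are hypothetical (Shape, Circle, describeAny), and the rename from describe() to summary() is assumed to have happened in the protocol declaration only.

```swift
protocol Shape {
    // Hypothetically renamed from describe(); the extension below was not updated.
    func summary() -> String
}

extension Shape {
    // Formerly the default implementation. After the rename it is silently
    // just an extension method, and nothing here says so.
    func describe() -> String { return "a shape" }
}

struct Circle: Shape {
    // Added because the compiler demanded the renamed requirement.
    func summary() -> String { return "a circle" }
    // Written when describe() was still a requirement; now it only shadows.
    func describe() -> String { return "a circle" }
}

func describeAny<T: Shape>(_ s: T) -> String {
    return s.describe()
}

let direct = Circle().describe()
let generic = describeAny(Circle())
```

Here direct is "a circle" but generic is "a shape": code written against the protocol stops seeing Circle's customization after the rename. How many of these cases diagnostics alone can flag is the question being debated in this thread.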

Yes, diagnostics can help in some concrete situations (as was said), and should additionally help to keep watch on code related to protocols, but we need a concrete and clear solution in the first place.

Strong +1 to Howard Lovatt regarding the preferred options.

VladimirS · April 2, 2018, 10:26pm

Yes, seems (x as P).foo() is a nice solution in this case. And seems like the proposed syntax can help with default implementations without new keyword:

protocol P {
  func foo()
}

extension P {
  func P.foo() {..} // default implementation of protocol requirement
  func fuu() {..} // "just" protocol extension method
}