Confusion when reading large open source library codebases

xwu · February 7, 2020, 2:29pm

Neither of these is even desirable to guarantee.

There will often be members semantically related to protocol requirements, sometimes refactored out of them, that are not themselves requirements. It makes sense to place them in proximity to the requirements to which they relate.

There can be members that are requirements of more than one protocol, where those protocols happen to compose well together. There can also be members that happen to be requirements of some protocol but are closely related, semantically, to other members of the type that have nothing to do with the protocol and that would exist on the type whether or not the conformance were implemented. It can make sense to organize these members elsewhere, apart from the other protocol requirements.

No, the reason one should use extensions to structure protocol conformances has to do with neither of these points.

Rather, this strategy allows you to build up a series of conformances to a protocol hierarchy in layers, and to have the completeness of your conformance at each step checked by the compiler. This is not just a nice-to-have; it is essential for avoiding inadvertent infinite recursion.

Consider the following toy protocol hierarchy:

// written freehand; pardon any typos
protocol A {
  var isFrobnicated: Bool { get }
}
protocol B : A { }
protocol C : B {
  func frobnicated() -> Int
}
extension C {
  func frobnicated() -> Int { isFrobnicated ? 42 : 0 }
}

Now consider this conformance:

struct S : C {
  var isFrobnicated: Bool { frobnicated() != 0 }
}

Yikes! Now you've got an infinite recursion. What went wrong?

Default implementations of protocol requirements are built from requirements that aren't defaulted. (It couldn't be otherwise, if you think about it.) You always run the risk of infinite recursion if you implement a non-defaulted protocol requirement by calling a default implementation provided in the same or a more refined protocol in the hierarchy. Even if the default implementation today doesn't call the requirement you're implementing recursively, a future version could.

So how do we decrease the risk of making this mistake unintentionally?

In my experience, it's usually pretty obvious when a requirement of A calls a default implementation of another requirement of A, possibly in part because these requirements are all at the forefront of your mind while you're working on the conformance to A.

It's much more difficult to reason about requirements that have default implementations elsewhere in the protocol hierarchy. It can easily slip your mind which members you've implemented yourself and which ones you haven't, particularly when requirements differ only by argument or return type. With this kind of overloading, you may think you're calling a member required by a less refined protocol, which you've already implemented, when in fact you're calling a member required by a more refined protocol, which you haven't implemented but which has a default implementation!

This is where it helps greatly to build up the conformance layer by layer. If first we ensure that S has a working conformance to A without trying to conform to B or C, and then we ensure a working conformance to B, and then to C, we can avoid this pitfall. This is because an implementation when conforming to A can't accidentally call a defaulted requirement of C while the type doesn't yet conform to C. The natural way to do this is to write extension S : A { ... } first, ensure that everything compiles and works as expected, then proceed to extension S : B { ... }, and so on.
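As a rough sketch of that layering, here is the toy hierarchy again with the conformance of S split into one extension per protocol. The stored property count is a hypothetical addition for illustration only; it is not part of the original example. Note that the protection comes from writing and compiling the layers one at a time: once all three extensions exist, nothing stops an edit to the first one from calling frobnicated().

```swift
// The hierarchy from the post, repeated so this sketch is self-contained.
protocol A {
  var isFrobnicated: Bool { get }
}
protocol B: A { }
protocol C: B {
  func frobnicated() -> Int
}
extension C {
  func frobnicated() -> Int { isFrobnicated ? 42 : 0 }
}

// Hypothetical stored state; an assumption made for this sketch.
struct S {
  var count: Int
}

// Layer 1: conform to A only. Until S conforms to C, a call to
// frobnicated() in this body would be rejected by the compiler.
extension S: A {
  var isFrobnicated: Bool { count > 0 }
}

// Layer 2: B adds no requirements of its own.
extension S: B { }

// Layer 3: C's requirement is satisfied by the default implementation,
// which bottoms out in the isFrobnicated written in layer 1.
extension S: C { }
```

With these definitions, S(count: 3).frobnicated() evaluates to 42 and S(count: 0).frobnicated() evaluates to 0, with no recursion.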

Does such unintentional infinite recursion actually happen? Yes! Does this strategy of building up conformances in layers actually help? Yes: only by adhering to it stringently was I able to avoid accidentally causing infinite recursion when implementing certain DoubleWidth functions (now sadly relegated to a prototype instead of being part of the standard library).

Now, could you follow this same strategy but in the end lump everything together again, without using a series of separate extensions? Sure, but that deprives your reader of the ability to follow along in the building-up of the type in the same way that helped you write the code.