Pitch: Protocols with private fields

Alvae · August 14, 2021, 11:45am

In a protocol, all fields (properties and methods) will get the access visibility of the conforming type. For instance, conforming to a protocol with a public type will prompt all of its requirements to be public.

public protocol TreeVisitor {
  mutating func visit(node: Node)
  mutating func visit(leaf: Leaf)
}

public struct TreePrinter: TreeVisitor {
  private var parentKeys: [Int] = []

  // 'visit(node:)' must be 'public'
  public mutating func visit(node: Node) {
    parentKeys.append(node.key)
    node.lhs.accept(&self)
    node.rhs.accept(&self)
    parentKeys.removeLast()
  }

  // 'visit(leaf:)' must be 'public'
  public mutating func visit(leaf: Leaf) {
    print(parentKeys.map(String.init(describing:)).joined(separator: " "), leaf.key)
  }
}

Private fields in protocols have already been discussed and I completely agree with the conclusions that have been drawn in the past:

A protocol describes an API, guaranteeing to its clients that a particular set of fields will be present in the conforming type. Thus, it makes no sense to hide parts of that API.
Conforming types must be able to "see" what requirements they are compelled to implement. If a protocol had fileprivate fields, for instance, then they would be technically "invisible" to a type declared in another file.

One problem, though, is that we can't define default implementations that rely on an encapsulated state (a.k.a. stateful mixins in other circles).

Imagine, for instance, that I would like to create a reusable implementation of a tree walker that simply traverses a tree, calls methods before and after visiting each node, and also records the keys it sees during the traversal. Clearly, that is simple to implement with a class:

open class TreeWalker {
  public private(set) final var parentKeys: [Int] = []

  open func willVisit(_ tree: Tree) -> Bool { true }
  open func didVisit(_ tree: Tree) -> Bool { true }

  public final func walk(_ tree: Tree) -> Bool {
    guard willVisit(tree) else { return true }
    return traverse(tree: tree) && didVisit(tree)
  }

  internal func traverse(tree: Tree) -> Bool {
    // The implementation is irrelevant to this discussion.
  }
}

Unfortunately, using a class prevents me from defining tree walkers with value semantics (well, to be fair, we could write classes that behave like values, but that's generally not straightforward).

If I wanted to turn this class into a protocol with a default implementation of the traversal logic, I wouldn't be able to make parentKeys a read-only property. That breaks the encapsulation principle, because clients of a TreeWalker need only to know about willVisit(_:), didVisit(_:) and walk(_:).
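To make the problem concrete, here is a minimal sketch of such a protocol in today's Swift. The `Tree` enum, the `KeyCollector` type and its `paths` property are hypothetical, introduced only for illustration; access modifiers are omitted so the sketch fits in a single file.

```swift
// Hypothetical minimal tree type, introduced only for this sketch.
indirect enum Tree {
  case leaf(key: Int)
  case node(key: Int, lhs: Tree, rhs: Tree)
}

protocol TreeWalker {
  // The default implementation below mutates this property, so the
  // requirement has to be settable, and the setter becomes part of the API.
  var parentKeys: [Int] { get set }
  mutating func willVisit(_ tree: Tree) -> Bool
  mutating func didVisit(_ tree: Tree) -> Bool
}

extension TreeWalker {
  mutating func walk(_ tree: Tree) -> Bool {
    guard willVisit(tree) else { return true }
    return traverse(tree: tree) && didVisit(tree)
  }

  mutating func traverse(tree: Tree) -> Bool {
    // Leaves have no children, so there is nothing to traverse.
    guard case .node(let key, let lhs, let rhs) = tree else { return true }
    parentKeys.append(key)
    defer { parentKeys.removeLast() }
    guard walk(lhs) else { return false }
    return walk(rhs)
  }
}

// Hypothetical conforming type that records the key path to every leaf.
struct KeyCollector: TreeWalker {
  var parentKeys: [Int] = []
  var paths: [[Int]] = []

  mutating func willVisit(_ tree: Tree) -> Bool {
    if case .leaf(let key) = tree { paths.append(parentKeys + [key]) }
    return true
  }
  mutating func didVisit(_ tree: Tree) -> Bool { true }
}

let tree = Tree.node(
  key: 1,
  lhs: .leaf(key: 2),
  rhs: .node(key: 3, lhs: .leaf(key: 4), rhs: .leaf(key: 5)))
var walker = KeyCollector()
_ = walker.walk(tree)
// walker.paths is now [[1, 2], [1, 3, 4], [1, 3, 5]]

// Nothing stops a client from overwriting the traversal state.
walker.parentKeys = [99]
```

The last line is the encapsulation leak: because the default implementation needs write access, every client gets it too.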

We can achieve some level of encapsulation by defining implementations in extensions, without matching requirements.

public protocol TreeWalker { ... }
extension TreeWalker {
  // 'traverse(tree:)' is not visible beyond the module boundary
  internal func traverse(tree: Tree) -> Bool { ... }
}

Unfortunately, that does not solve the issue for parentKeys, because of two problems:

I'd like that property to be publicly visible, just not writable. However, I can't distinguish between these two capabilities: I must either define an implementation that is completely hidden, in an extension, or expose both read and write access as part of the protocol's API.
I'd like to provide a default implementation of storage.

Note that I can't simply define parentKeys as a read-only requirement and provide a default, writable alternative in an extension, because of the second issue. There is no way to spare the conforming type from implementing public (up to its own visibility), writable storage, which breaks encapsulation.

I can imagine a solution for each of these problems. Both are orthogonal, but would work together, I think. The first idea is probably much more realistically implementable. If anything, I'd like that thread to focus on that one.

Interpret access modifiers as lower bounds

A simple way to address the visibility issue would be to state that access modifiers in protocols define a lower bound on the visibility of a particular requirement in the conforming type. The absence of any modifier would denote the same semantics as now.

protocol P {
  var foo: Int { get }
  var bar: Int { fileprivate(get) }
  internal func ham() -> Int
}

public struct S: P {
  // 'foo' must be at least 'public'
  public var foo: Int
  // 'bar' must be at least 'fileprivate'
  internal var bar: Int
  // 'ham()' must be at least 'internal'
  func ham() -> Int { 42 }
}

With that approach, I could write a default implementation that relies on fields that are not necessarily part of the conforming type's public API, and that can be encapsulated. Nonetheless, the client would still be allowed to decide the access level of any requirement, should they decide to expose it anyway. That last bit is important, because we wouldn't want conformance to a specific protocol to exclude conformance to another.

Here's how I could keep parentKeys encapsulated:

public protocol TreeWalker {
  var parentKeys: [Int] { get private(set) }
  /// ...
}

// In another module

public struct ClientWalker: TreeWalker {
  // read access must be 'public', because 'ClientWalker' is 'public'
  public private(set) var parentKeys: [Int] = []
  // ...
}

One minor problem with this approach is that it changes the meaning of an access modifier in the context of a protocol declaration, which might be counterintuitive. For instance, fileprivate would mean "only visible in the files defining the types that conform to this protocol", not "only visible to this file".

Encode state in the witness table

The previous feature would still require conforming types to provide an implementation of every stateful property. In the walker example, that means that we need to add a writable property parentKeys in all conforming types (as in the snippet above). While that is a minor inconvenience, it implies that we cannot provide default implementations that also take care of storage.
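As a point of comparison, one workaround available in today's Swift is to bundle the state into a single struct, so that conforming types declare one stored property instead of one per field. This is only a sketch and not part of the pitch: `WalkerState`, `Walker`, `enter(_:)` and `leave()` are hypothetical names, and the `fileprivate(set)` only protects the state if the protocol extension lives in the same file as `WalkerState`.

```swift
// Hypothetical bundle holding all the state the default implementation needs.
struct WalkerState {
  // Writable only from the file that declares the bundle and the extension.
  fileprivate(set) var parentKeys: [Int] = []
  init() {}
}

protocol Walker {
  // The single storage requirement that conforming types must provide.
  var state: WalkerState { get set }
}

extension Walker {
  // Read-only view of the bundled state.
  var parentKeys: [Int] { state.parentKeys }

  mutating func enter(_ key: Int) { state.parentKeys.append(key) }
  mutating func leave() { state.parentKeys.removeLast() }
}

struct ClientWalker: Walker {
  var state = WalkerState()
}

var w = ClientWalker()
w.enter(1)
w.enter(2)
w.leave()
// w.parentKeys is now [1]
```

The encapsulation is only partial: a client can still replace `state` wholesale with a fresh `WalkerState()`, and each conforming type still has to declare the stored property itself.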

It should be stressed that the restriction makes complete sense w.r.t. the way conformance is implemented (to the extent of what I think I know and understand). If S conforms to P, then the corresponding entry in S's protocol witness table is just a collection of pointers to functions wrapping the actual implementation of each requirement defined by P. No storage required.
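The idea of a witness table as a collection of function pointers can be modeled by hand. The sketch below is a simplified mental model, not the compiler's actual layout; `PWitness`, `sWitness` and `bumpTwice` are hypothetical names, and the requirements given to `P` are chosen only for illustration.

```swift
protocol P {
  var value: Int { get set }
  mutating func inc()
}

struct S: P {
  var value = 0
  mutating func inc() { value += 1 }
}

// Hand-written stand-in for the table describing how a type conforms to P:
// one function per requirement, and no storage.
struct PWitness<T> {
  let getValue: (T) -> Int
  let setValue: (inout T, Int) -> Void
  let inc: (inout T) -> Void
}

// The entries simply forward to S's own implementations.
let sWitness = PWitness<S>(
  getValue: { $0.value },
  setValue: { $0.value = $1 },
  inc: { $0.inc() }
)

// Roughly what a generic function over `T: P` works with:
// the value itself plus the table describing its conformance.
func bumpTwice<T>(_ subject: inout T, using witness: PWitness<T>) -> Int {
  witness.inc(&subject)
  witness.inc(&subject)
  return witness.getValue(subject)
}

var s = S()
let result = bumpTwice(&s, using: sWitness)
// result is 2, and s.value is 2 as well
```

All the state lives in `s`; the table only knows how to reach it, which is why a default implementation has nowhere to put storage of its own.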

We could imagine that state provided by default implementations be referred to in the witness table as well. We would add another entry: a pointer to a block of memory backing non-computed properties for which the conforming type provides no implementation.

More concretely, consider the following example. The keyword synthesizable indicates that conforming types need not provide their own implementation of that property.

protocol Counter {
  synthesizable var value: Int { get set }
  mutating func inc()
}
extension Counter {
  mutating func inc() { value += 1 }
}

struct S: Counter {}

The entry for Counter in S's protocol witness table would look like this (written in Swift for clarity):

struct SCounterWitnessTable {
  let getValue: FunctionRef
  let setValue: FunctionRef
  let modifyValue: FunctionRef
  let inc: FunctionRef

  // That would be new
  let defaultStorage: UnsafeMutablePointer<SCounterWitnessStorage>
}

struct SCounterWitnessStorage {
  var value: Int
}

Of course, the value witness table of S would also need to care about that default storage to properly copy and destroy an existential container, should the protocol contain synthesizable properties. Otherwise, there would be no impact on the current behavior, as all conforming types would still be compelled to provide an implementation.

With that feature, combined with the solution to the visibility problem above, I could rewrite TreeWalker as a protocol:

public protocol TreeWalker {
  synthesizable var parentKeys: [Int] { get private(set) }

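  mutating func willVisit(_ tree: Tree) -> Bool
  mutating func didVisit(_ tree: Tree) -> Bool
}

extension TreeWalker {
  // Default implementations belong in the extension, not in the protocol body.
  public mutating func willVisit(_ tree: Tree) -> Bool { true }
  public mutating func didVisit(_ tree: Tree) -> Bool { true }
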
  public mutating func walk(_ tree: Tree) -> Bool {
    guard willVisit(tree) else { return true }
    return traverse(tree: tree) && didVisit(tree)
  }

  internal mutating func traverse(tree: Tree) -> Bool {
    // The implementation is irrelevant to this discussion.
  }
}

Paul_Cantrell · August 25, 2021, 2:39am

Hi, Dimitri. I’ve also wanted something that’s quite a bit simpler than your many ideas here, but I think related. It consists of two closely related features:

Allow a protocol to specify a requirement that will be available to extension methods but not to clients of the protocol, and
allow types to expose specific members to specific protocol conformances.

In short, I’d like to be able to decouple “exposed to protocol implementation” from “exposed to protocol clients.”

Using your example of tree traversal, here is (1) in action:

protocol Tree {
  implementation var children: [Self] { get }  // ignore strawman keyword; just note the semantics

  func traverse(_ visit: (Self) -> Void)
}

extension Tree {
  func traverse(_ visitor: (Self) -> Void) {
    for child in children {  // ✅ `children` visible to protocol implementation
      child.traverse(visitor)
    }
    visitor(self)
  }
}

var foo: Tree = ...
foo.children  // ❌ `children` not visible to protocol clients

…and (2) in action:

struct Doodad: Tree {
  var children: [Doodad]  // not private, so nothing special needed here
}

struct Widget: Tree {
  @exposed(to: Tree, as: children)  // again, ignore strawman syntax and note the semantics
  private var subwidgets: [Widget]  // private, but explicitly allowed to participate in protocol conformance
}

…or maybe you expose members in the conformance, I don’t know; the syntax can be worked out if the bigger idea seems compelling:

struct Widget {
  private var subwidgets: [Widget]
}

extension Widget: Tree(exposing: children) {
  private var children: [Widget] { subwidgets }
}

I’ve run across a handful of situations where the ability to do this would have made code much less awkward.

It seems consistent to me with the philosophy of protocols, and of witness tables in particular, which say both (1) “T is a P”, and also (2) “here is how T is a P.” It makes sense that something could be hidden to 1, but visible to 2.

Dmitriy_Ignatyev · September 25, 2021, 7:26pm

Hello, thanks for writing this pitch. Let me share some of my thoughts and ideas.

We can add something new, similar to a protocol, but with some features and limitations. Let's call it ReifiedProtocol. It can be used as an implementation constraint or for reuse of common logic.

Example:

public protocol TreeVisitor {
  mutating func visit(node: Node)
  mutating func visit(leaf: Leaf)
}

reifiedProtocol TreeVisitorReified: TreeVisitor {
  var parentKeys: [Int] { get set }
  var someData: [String: String] { get set }

  func usefulFunc() -> String

  mutating func visit(node: Node) {
    parentKeys.append(node.key)
    node.lhs.accept(&self)
    node.rhs.accept(&self)
    parentKeys.removeLast()
    
    if let value = someData["key"] {
      print(value)
    }

    print(usefulFunc())
  }
}

struct TreeVisitorImp: TreeVisitorReified {
  private var parentKeys: [Int] = [] // Compiler requires to add this property
  private(set) var someData: [String: String] = [:] // Compiler requires to add this property

  private func usefulFunc() -> String {} // Compiler requires to implement this method declared in TreeVisitorReified

  mutating func visit(leaf: Leaf) {} // this method is only declared in protocol and not implemented in TreeVisitorReified, so we need to implement it here
}

The rules are:

We can't use ReifiedProtocol as a Type

let visitor: TreeVisitorReified = TreeVisitorImp() // compiler Error: TreeVisitorReified is not a Type, use TreeVisitor instead

The compiler requires conforming types to implement all properties and methods declared in a ReifiedProtocol, as with protocols.
In the implementation, we can use any access level for declarations from the ReifiedProtocol.
Protocol methods can be implemented in ReifiedProtocol.
Here we need to think:

they behave the same as protocol default implementations and can be overridden
or
they are treated as final implementation and can not be overridden

When implementing a concrete type, we have a choice:

use the TreeVisitor protocol and implement everything from the ground up
use TreeVisitorReified with default implementations

This theme also correlates with the deferred pitch for abstract classes, which have similar abilities but also have reference semantics.

What do you think about it?

Stan_Smida · November 29, 2021, 11:06pm

Can we maybe take a step back and go from this pitch(es) to a discussion?

When it comes to value types, I also feel that POP falls short of class inheritance in providing common implementations. It is obvious that subclassing is something completely different from adopting a protocol, yet the legendary Protocol-Oriented Programming in Swift talk from WWDC15 advertised that the need for a common implementation should no longer weigh in on reference vs. value type decisions.

Or should?

protocol Countable {
    var count: Int { get implementationSet }
    mutating func add(_ value: Int)
}

extension Countable {
    mutating func add(_ value: Int) {
        count += value
    }
}

Can this ever be legitimate? A humble (even heretic) thought in context of protocols while a matter of course in context of classes. What makes that difference?

Could a protocol ever define requirement exposable only to default implementation? Is there a problem with that in the language or in a paradigm?

Alvae · December 1, 2021, 4:02pm

It is obvious that subclassing is something completely different from adopting a protocol

In what ways?
I don't think I agree, but I am not sure I understand what you mean.

Can this ever be legitimate?

Why wouldn't it?

Obviously, adding elements to a presumably countable collection should modify its value. Why would you impose reference semantics?

Again, I'm afraid I'm not sure I follow.

Could a protocol ever define requirement exposable only to default implementation? Is there a problem with that in the language or in a paradigm?

I think a protocol could, which is why I proposed this pitch.
If you see any problem in the language or the paradigm that would prevent that, I'd love to know!

Avi · December 1, 2021, 4:30pm

One cannot refine the default implementation by conforming and then calling the default implementation from the conforming method.

Alvae · December 1, 2021, 4:37pm

Thanks for clarifying.

I still do not agree that subclassing is completely different from protocol conformance. A lot of class hierarchies are defined only for the purpose of polymorphism, not for the purpose of overriding the base class's behavior.

There are also ways to circumvent this apparent shortcoming. We could split part of a method's implementation so that "overridden" methods would be able to call the common behavior.

protocol P {
  func commonBehavior(arg: T) -> U
  func specializedBehavior(arg: T) -> U
}

extension P {
  func commonBehavior(arg: T) -> U {
    // some default implementation
  }
}

struct S: P {
  func specializedBehavior(arg: T) -> U {
    doSomethingSpecial()
    return commonBehavior(arg: arg)
  }
}
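Here is a small runnable variant of that pattern in today's Swift, with concrete types in place of `T` and `U`. The names `Greeter`, `Plain` and `Excited` are hypothetical and chosen only for illustration.

```swift
protocol Greeter {
  func commonGreeting(for name: String) -> String
  func greeting(for name: String) -> String
}

extension Greeter {
  // Shared behavior that "overriding" types can still reach by name.
  func commonGreeting(for name: String) -> String { "Hello, \(name)" }

  // Default for the customization point: just the shared behavior.
  func greeting(for name: String) -> String { commonGreeting(for: name) }
}

// Inherits both default implementations.
struct Plain: Greeter {}

// Customizes `greeting(for:)` while reusing the common part.
struct Excited: Greeter {
  func greeting(for name: String) -> String {
    commonGreeting(for: name) + "!"
  }
}

let plain = Plain().greeting(for: "Ada")      // "Hello, Ada"
let excited = Excited().greeting(for: "Ada")  // "Hello, Ada!"
```

The cost is that the split has to be designed into the protocol up front: each behavior that should stay reachable needs its own name.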

Finally, I do not see any obvious limitation preventing the language from offering a syntactic construct allowing us to access default implementations.

Avi · December 1, 2021, 4:52pm

Accessing the default implementation is comparable to the direct subclass of a superclass. How would you deal with chains or hierarchies of subclasses without reinventing subclassing?

Regardless of the merits of classes over PoP, there are, and will be, patterns that are trivial with one or the other.

Alvae · December 1, 2021, 4:59pm

I am not claiming that one should try to reimplement subclassing and I am not arguing against the concept of subclassing.

I believe protocols are not meant to replace class hierarchies; they are meant to provide an alternative strategy to achieve polymorphism. Chaining overridden implementations is a very specific pattern that is not essential to polymorphism.

Stan_Smida · December 1, 2021, 9:11pm

How can it?

A protocol just defines an interface. What would be the meaning of an interface that is defined yet not accessible?

For instance, this definition means that an adopter must implement foo, but allows the adopting type to restrict its accessibility to just within the file. Apparently it will be accessible within the protocol extension implementations.

protocol P {
    var foo: Int { fileprivateGet }
}

But what is the point of restricting access to foo in a conforming type, if anyone who has access to the protocol definition knows how to access it?

extension P {
    var hijackedFoo: Int {
        foo
    }
}

This is why protocol requirements cannot have a lower access level than the protocol itself.

When it comes to a common default implementation that needs fileprivate access levels, I'm afraid we are limited to putting that implementation and all adopting types in the same file.

This is an example of how var foo: Int { get fileprivateSet } can be achieved:

protocol P {
    var foo: Int { get }
    mutating func bar()
}

private protocol _P: P {
    var foo: Int { get set }
}

extension _P {
    // 'bar()' assigns to 'foo', so it must be 'mutating'
    mutating func bar() {
        foo = -1
    }
}

struct S: _P {
    fileprivate(set) var foo: Int
}

Now you can create var s = S(foo: 0) and call s.bar() outside of the file, and foo will be set to -1 via the default implementation.

Alvae · December 5, 2021, 3:36pm

Perhaps you misunderstood my pitch.

First, let me stress that I agree with your premise:

We disagree on the conclusion:

Currently, a protocol prescribes that all its requirements have at least the same access level as the whole conforming type. You can (although that's probably useless) implement the requirement with a higher access level. So, in fact, requirements have a lower bound defined by that of the protocol's access level.

The heart of my pitch is to allow protocols to specify other lower bounds on individual requirements.

So, in your example, fileprivate would have a different meaning than in the context of a standard type declaration. It would indicate that conforming types should at least provide a fileprivate implementation of that requirement. Consumers of the protocol would not be allowed to expect the requirement to be visible, unless they are declared in the same file as the protocol.

// in P.swift
public protocol P {
  var foo: Int { get fileprivate(set) }
  var bar: Int { get private(set) }
}

// In S.swift
public struct S: P {
  public internal(set) var foo: Int
  public fileprivate(set) var bar: Int
}

// In main.swift (same module)
var s = S(foo: 1, bar: 2)
print(s.foo) // OK, getter is public
s.foo = 3    // OK, setter is internal
s.bar = 3    // Error, setter is fileprivate

var p: P = s
print(p.foo) // OK, getter must be at least public
p.foo = 3    // Error, setter might be as low as fileprivate
p.bar = 3    // Error, setter might be as low as private

Notice that the conforming type is allowed to choose the access level with which it wants to expose its requirements, as long as it is at least as high as what the protocol prescribes.

internal protocol Q {
  var foo: Int { get set }
}

// The extension is well-typed because S chose to expose
// foo's setter as internal.
extension S: Q {}

What the conforming type cannot do is to implement the requirement with a lower access level than prescribed, as it would violate the assumptions that consumers of the protocol can make.

struct T: P {
  private var foo: Int // Error, `foo` must be at least fileprivate
}

I believe that we could encapsulate behavior in default implementations using that approach. I provided an example in the original post. I'll add another, using your example as a template:

// in Q.swift
public protocol Q {
  var foo: Int { private(get) private(set) }
}

extension Q {
  public mutating func count() -> Int {
    foo += 1 // OK, a type can always access its own
             // properties regardless of their access level
    return foo
  }
}

// In U.swift
public struct U: Q {
  // foo is invisible to the consumers of this type
  private var foo: Int = 0
}

// In main.swift
var q: Q = U()
print(q.foo)     // Error, getter might be as low as private
print(q.count()) // OK, prints 1
print(q.count()) // OK, prints 2

Stan_Smida · December 5, 2021, 4:02pm

Oh sorry, I can see it now. Yeah... I think I like it!

dabrahams · December 5, 2021, 5:11pm

Could you accomplish the same things with scoped conformances?

Alvae · December 5, 2021, 5:45pm

There is definitely some overlap, but I think the goals are a bit different.

IIUC, scoped conformances would only allow providing a conformance that does not need to be exposed outside of an access boundary. However, I would like to expose the conformance, only without having all internal details exposed (in particular w.r.t. mutation) at the same level.

One way to illustrate is to think of an AST library. Inside the library is defined a visitor protocol whose default implementation just walks an AST and calls one or several methods to interact with the visited nodes. Clearly, consumers of that library may need access to such a protocol, but they might not be interested in the internal shenanigans that the library does to configure the state of the walker.

In that specific example, maybe we can achieve a similar design with scoped conformance using two protocols. One Walker protocol describing the general public API and another _WalkerImpl protocol to deal with the "internal shenanigans". Specific walkers would conform to _WalkerImpl internally and expose their conformance to Walker.

In a more general setting, though, I think that requirement bounds are more flexible. The problem with _WalkerImpl is that it would be internal, preventing consumers from inheriting default implementations defined over there. If it was also exposed, then we would lose the advantage of trying to encapsulate behavior in the first place.

Further, requirement bounds might be slightly simpler. In particular, they would not change Swift's current conformance resolution strategy, AFAICT, and the dynamic example from the generics manifesto would not require the user to "think in scopes" to build their own mental model of how dispatch should behave.

That being said, scoped conformances would have one advantage over my approach: the ability to actually scope the conformance itself and the associated benefits of that feature.

dabrahams · December 5, 2021, 7:05pm

Sure. I just wonder if the same thing might not be accomplished by composing public protocols, but using one private or internal conformance. What I have in mind is,

// Module A
public protocol X { ... }
public protocol EasyXImpl {}

extension X where Self: EasyXImpl {
  // implementations of X requirements in terms of EasyXImpl requirements
}

// Module B
import A
public struct Y: X, private EasyXImpl {
  // EasyXImpl requirements
}

That would have a few advantages:

The protocol system would retain its simplicity
A protocol would have a single, well-understood meaning that doesn't change across access levels
We'd avoid adding another language feature, since I believe we need scoped conformances anyway.
You'd still have the option to create X conformance without EasyXImpl. In fact there might be several EasyXImpl variants for different means of achieving that conformance.

Stan_Smida · December 5, 2021, 10:52pm

No, unfortunately scoped conformances won't help. They allow limiting the visibility of a conformance. The problem we are talking about is quite orthogonal to this: we want to limit the visibility of a required property but expose it to the default implementation.

xwu · December 6, 2021, 12:15am

Actually, I think this use case would be served quite well with scoped conformances, and I agree with @dabrahams that scoped conformances would be a more expressive feature that additionally enables other use cases.

Here, you’d have the public API guaranteed by protocol P, and then the implementation-only property would be a requirement of a distinct protocol Impl: P to which the same type would have a scoped conformance, and the default implementation of P’s public requirements would be implemented in extensions of P where Self: Impl.

Stan_Smida · December 6, 2021, 7:22am

I don't see it there. Can you please rewrite the last example from this post to work with scoped conformances?

xwu · December 6, 2021, 9:06am

It's exactly as @dabrahams has just outlined above:

public protocol Q {
  mutating func count() -> Int
}

protocol QImpl { // Optionally, QImpl may refine Q.
  var foo: Int { get set }
}

extension Q where Self: QImpl {
  public mutating func count() -> Int {
    foo += 1
    return foo
  }
}

public struct U: Q, private QImpl {
  fileprivate var foo: Int = 0
}
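Since today's Swift has no `private` conformance syntax, the closest runnable approximation is an ordinary conformance to the implementation protocol. The sketch below assumes everything lives in a single file and drops the access modifiers for that reason; it keeps the names from the snippet above.

```swift
protocol Q {
  mutating func count() -> Int
}

// Implementation-only requirements live in a separate protocol.
protocol QImpl {
  var foo: Int { get set }
}

// Q's requirement gets a default implementation only for types
// that also conform to QImpl.
extension Q where Self: QImpl {
  mutating func count() -> Int {
    foo += 1
    return foo
  }
}

// Without scoped conformances, the QImpl conformance is as visible as U.
struct U: Q, QImpl {
  var foo: Int = 0
}

var u = U()
let first = u.count()   // 1
let second = u.count()  // 2

// Through the existential, only Q's API is available;
// `q.foo` would not compile because `foo` is not a requirement of Q.
var q: Q = U()
let viaExistential = q.count()  // 1
```

Clients holding a `Q` cannot reach `foo`, but clients holding a `U` can, and they can also see that `U` conforms to `QImpl`. That remaining exposure is what a scoped conformance would be expected to remove.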

gwendal.roue · December 6, 2021, 9:23am

Not quite: in the original outline from @dabrahams, his EasyXImpl (your QImpl) is public. This is a notable difference. The pitch is about private implementation details.

Can scoped conformances deal with an EasyXImpl / QImpl protocol which is not public? If not, could they become able to do it?