ValueSemantic protocol

anandabits · November 2, 2020, 4:56pm

I think it's crucial to observe at the outset of this discussion that this ValueSemantic protocol is a refinement of ActorSendable. The refinement introduces an additional semantic requirement internal serialization of mutation is not sufficient. Instead, copy-on-write must be used for all heap allocated data.

I have long maintained that there is a useful notion of "value" that is closely related to (but independent of) the notion of referentially transparent pure functions. As has already been pointed out, many "value" types include members that are not referentially transparent: Array.capacity, Date.init, UUID.init, Int.random(in:), etc. The relationship as I view it is that "value" types predominantly contain pure members. Impure members are the exception to the rule.

We must acknowledge and embrace this distinction between "value"-ness and purity if we are going to have any hope of providing clear semantics for this protocol. A correct conformance to a protocol can only depend on code that is under control of the author of the conformance. As has already been pointed out, anyone can add a trivial impure extension to any type. For this reason, purity of members cannot be considered a necessary condition of a valid conformance. Further, I argue that if the semantics of this protocol cannot depend on the definition of purity then it isn't necessary to provide a definition at this time (that can be defined separably in a future proposal to introduce support for pure functions).

Semantics of `ValueSemantic`

If the semantics of this protocol aren't (directly) about purity and referential transparency then what are they? To answer this question I'll begin by quoting @Chris_Lattner3's comments about ActorSendable:

At this point it is worth discussing why the requirement is named “unsafe”SendToActor. The reason is that manual implementations (which are rare, see below) could implement it incorrectly and break memory safety: for example, this would be unsafe (and very unwise!) to add to your codebase:
extension NSMutableString : ActorSendable {
  func unsafeSendToActor() -> Self { self }
}
This is why it is important to include the word “unsafe” in the method requirement. However, this power is also extremely useful -- it allows authors of powerful types to conform their types to this protocol, and thereby allow clients of their libraries to pass their types around safely. This is what allows the proposal to support immutable reference types, references to internally synchronized types that use lock free data structures, types that want to be deep copied, etc.

ActorSendable is about ensuring memory safety when references are moved across a serialization boundary. A type that includes stored properties of a non-ActorSendable type a manual conformance is necessary to ensure data races are not introduced. This manual conformance bears great responsibility to uphold the intended semantics of the protocol.

In my view a refining ValueSemantic protocol should be defined using a very similar strategy. The differences are that this time we're talking about a "boundary" that is crossed every time a value is copied. The property we seek to uphold is "no sharing of mutable data". Each instance of the type must be semantically independent. (note: independence of copies is subtly different than purity / referential transparency: Array.capacity is semantically independent but not referentially transparent)

In order to protect against mutable sharing when stored properties include references to heap data it is essential for these references to be encapsulated: they must not be non-public. When all references are encapsulated in this way it is possible for the author of the type to avoid mutable sharing using copy-on-write. Trivially, access to global mutable data is off limits: a type that stores and uses an Int index into global mutable data should not provide a conformance.

To summarize, in my view this protocol is really about controlling access and use of references to heap allocated data in order to avoid mutation of shared data. While this protocol does not introduce any new requirements, its semantics do support the default implementation of unsafeSendToActor that simply returns self.

Relationship to purity

When we speak of value semantics we are usually talking about more than just the definition provided above. The members of types conforming to the ValueSemantic protocol as defined above will usually be referentially transparent, especially when all parameters also meet the ValueSemantic requirement. In the latter case, as long as a member avoids access to global mutable data it will be referentially transparent (with rare exceptions such as Array.capacity).

Given this close relationship and the conventional assumptions made about value semantic types I think it's important to alert programmers when a member of a value semantic type breaks referential transparency. Because there is no language support, learning about these exceptions and why they matter is often difficult for programmers not already steeped in FP (and even for some who are).

I also think it would be good for the language to understand the referential transparency of members of value semantic types without requiring annotation beyond the syntax used today. A ValueSemantic conformance could imply purity of members, with an annotation required where non-referentially-transparent members are necessary and intended by the programmer. However, this is a topic for a separate thread (either now or in the future).

Conclusion

Even without language understanding of purity I believe a ValueSemantic protocol would aid significantly in everyday reasoning about code. Crucially, it would distinguish types intended to behave in the conventional value semantic way from value types that have nothing to do with value semantics. This distinction is one that is subtle and difficult to teach. This protocol, along with synthesized conformances would go a long way towards helping programmers understand when they are stepping outside the world of "trivially composable" value semantics in their type declarations.

The ValueSemantic protocol is not a replacement for language understanding of purity and referential transparency, but it is a valuable complement to it.

(Note: we could bikeshed on the name of this protocol if the community wants to preserve "value semantic" as referring to purity / referential transparency. Regardless of its name, I think a protocol with the semantics described above would be a good thing)

ValueSemantic protocol

Semantics of ValueSemantic

Relationship to purity

Conclusion

Semantics of `ValueSemantic`