Yes, my original proposal included unsafe in the name. That thing would only have to be used for reference types and aggregates that compose over other things that are not value semantic.
Purity of operations is a related concept to value semantics, but isn't directly aligned with it. Consider a few problems: many operations on a value semantic type are considered part of its behavior (e.g. you could have a method that goes out and touches global state... like a print method). Pure methods are mostly only useful on structs/tuples (which are already effectively covered by the proposal without them). When you start getting into reference types, you have aliasing and a bunch of other problems that pure doesn't help with.
The analogy with Equatable is important: we could try to invent compiler support to make sure that equatable is implemented correctly, but sometimes you want to ignore mutable state in your type, sometimes you want to look through pointers at unanalyzable stuff (think equatable on Array), etc. Trying to define this model is just not worth it IMO.
That said, I agree that safety and static checking are important :-), but I don't think you can define that at the level of ValueSemantics. I think you could have language support for subcases of this, though - e.g. language support for CoW types, and other subcases. I would rather we define the general model and then look at whether it is worth introducing language complexity to solve for these subsets of the problem.
I'm thinking about this from a user mental-model perspective more than from a formal-definition one.
I think VS has no properties or methods on its protocol, but should conform to some kind of Copyable protocol (which might also be used on class types to implement CoW).
It might be good to define VS in a purity sense as only having access to method parameters, and recursively all stored properties must also conform to VS.
To be more precise, you can also access newly constructed or static let VS instances; self is a method parameter, and properties are treated like methods.
With that definition the average user can build up types on existing types, and not need any complex mental model or complex static analysis. You can synthesise VS recursively like we do with Hashable. Some core types will need a trust-based conformance, or a more rigorous definition.
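As a sketch of that recursive building-up (ValueSemantic here is a hypothetical marker protocol, and the synthesized conformance is assumed, not something today's compiler does):

```swift
// Hypothetical marker protocol; nothing here is synthesized by today's compiler.
protocol ValueSemantic {}

// Trust-based conformances for some core types.
extension Int: ValueSemantic {}
extension String: ValueSemantic {}
extension Array: ValueSemantic where Element: ValueSemantic {}

// Under the proposed rule these conformances could be synthesized, because
// every stored property is itself ValueSemantic.
struct Point: ValueSemantic {
    var x: Int
    var y: Int
}

struct Polygon: ValueSemantic {
    var name: String
    var vertices: [Point]
}
```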
I think you'll need some kind of trust-based conformance for objects coming from other languages anyway.
VS conformance would have to be in the module where the type is defined, and extensions would need to be checked to ensure they are VS-pure.
Classes would be special-cased to not be allowed to conform, but you could add things like a RefBox, a CoW protocol (copy), and a Memoize to handle common use cases.
This doesn't formally define what VS actually means... but I think for most users you can get away with something loose, with examples using CoW and Equality.
So I guess I'm wondering if you can sidestep the issue of VS needing a robust generalised static analysis pass by just defining it in terms of itself, and have a few building blocks that are valid but whose implementation is not statically guaranteed.
I think VS has no properties or methods on its protocol, but should conform to some kind of Copyable protocol (which might also be used on class types to implement CoW).
Well, yeah, typealias Copyable = ActorSendable.
And I think Copyable or Clonable is a better name for ActorSendable, and copy() or clone() would be better names for unsafeSendToActor.
It might be good to define VS in a purity sense as only having access to method parameters, and recursively all stored properties must also conform to VS.
I don't think we can do that. It would break the ability to implement types with value semantics (such as Array) in terms of types without.
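For example, here is a rough sketch of the pattern (not Array's real implementation): the public struct has value semantics even though its only stored property is a class, which does not:

```swift
// Reference-semantic storage: not a value type.
final class IntStorage {
    var elements: [Int]                 // stand-in for a manually managed buffer
    init(_ elements: [Int]) { self.elements = elements }
}

// Value-semantic facade over that storage.
struct TinyIntArray {
    private var storage = IntStorage([])

    mutating func append(_ x: Int) {
        // Copy the shared storage before mutating; this is what preserves
        // value semantics despite the reference-typed stored property.
        if !isKnownUniquelyReferenced(&storage) {
            storage = IntStorage(storage.elements)
        }
        storage.elements.append(x)
    }

    var count: Int { storage.elements.count }
}
```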
With that definition the average user can build up types on existing types, and not need any complex mental model or complex static analysis. You can synthesise VS recursively like we do with Hashable. Some core types will need a trust-based conformance, or a more rigorous definition.
So I guess I'm wondering if you can sidestep the issue of VS needing a robust generalised static analysis pass by just defining it in terms of itself, and have a few building blocks that are valid but whose implementation is not statically guaranteed.
Having two definitions, one of which is not rigorous, is strictly worse than having one definition that always works. The definition I gave works without any "static analysis pass." It lets users compose ValueSemantic types from other ones without any complex mental model. So I don't understand why you're trying to come up with a different definition.
Chris very specifically wanted to allow for internally synchronized reference-semantic types. Those names wouldn't be appropriate for the semantics he intended. Do you disagree with those semantics?
Sorry, it might have been unclear from what I said. My intention was that the core language would provide types for adapting non-VS types into VS types.
I guess the difference is that mine only tries to say "VS types are made from VS types", while yours has a lot of other requirements, like around mutability. Perhaps my suggestion can be thought of as moving the semantics themselves into the implementation of a few core adapter types (where yours is primarily about defining those semantics).
Those semantics themselves, and likely the requirements of yours, are enforced through those adapter types (like CoW<T: Object + Copyable>).
Perhaps that's too constraining, but IIRC the implementation of things like Array is essentially a class wrapped in a CoW-like abstraction.
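A sketch of what such an adapter could look like (every name is hypothetical; CopyableRef stands in for the Object + Copyable constraint above):

```swift
// Hypothetical adapter; nothing here exists in the standard library.
protocol CopyableRef: AnyObject {
    func copy() -> Self
}

struct CoW<Ref: CopyableRef> {
    private var ref: Ref
    init(_ ref: Ref) { self.ref = ref }

    // Reads never copy.
    func read<R>(_ body: (Ref) -> R) -> R {
        body(ref)
    }

    // Writes copy the wrapped object first unless it is uniquely referenced,
    // so mutations through one CoW value are never visible through another.
    mutating func write<R>(_ body: (Ref) -> R) -> R {
        if !isKnownUniquelyReferenced(&ref) {
            ref = ref.copy()
        }
        return body(ref)
    }
}
```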
Edit: perhaps it's just that it doesn't solve that well. It wouldn't generally allow any non-VS type to be used in a VS type safely. It would just provide adapters for common patterns where you'd normally require a non-VS type.
Maybe I've missed something and am trying to solve a problem that doesn't exist, in which case sorry for the noise.
Yep. The shortest layman model of the properties is:
A type has value semantics when every variable of that type has an independent value that can't be observed or changed through other variables of that type
That glosses over the thread-safety property, though. While most types satisfying the above are as threadsafe as Int, a CoW type implemented with non-threadsafe reference counting is not. So the shortest complete layman model is:
A type has value semantics when:
1. Every variable of that type has an independent value that can't be observed or changed through other variables of that type, and
2. Two threads cannot race by reading or writing the values of distinct variables of that type.
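The first property in concrete terms (Array has it, a plain class does not):

```swift
var a = [1, 2, 3]
var b = a
b.append(4)
print(a)            // [1, 2, 3]: unaffected by the mutation through `b`

final class Box { var contents = [1, 2, 3] }
let c = Box()
let d = c
d.contents.append(4)
print(c.contents)   // [1, 2, 3, 4]: the change is observable through `c`
```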
That said… I don't think you can implement that non-threadsafe CoW type in Swift today, so maybe the simple model is everything we need.
That said, I think everyone, including the layman, also needs to understand the composability and other properties of a type with value semantics that we get as a consequence of that model, so I'm working on a separate article about that.
Ah, now I see what you're up to. Thanks for explaining. To do that I guess you need to identify all the value-semantic adapters needed to build all the value-semantic types we'll ever want. Sounds hard to me.
Perhaps that's too constraining, but IIRC the implementation of things like Array is essentially a class wrapped in a CoW-like abstraction.
Yeah, sort of… but I think you'll find if you try to encapsulate all the value semantics in adapter types, you can't efficiently implement lots of the mutations. CoW does not actually copy in all cases where the buffer is not uniquely referenced; if you do that for an insertion at the beginning of the array, for example, you end up writing the same memory twice.
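To make that concrete (a sketch, not Array's actual code): a generic "copy if shared, then mutate" wrapper must first duplicate every element to get unique storage and then shift every element again to open a slot at index 0, whereas the hand-written container can allocate once and write each element directly into its final position:

```swift
// Single-pass front insertion, as a hand-written CoW container can do it
// when its buffer is shared: allocate fresh storage and write each element
// exactly once, already in its shifted position.
func insertingAtFront(_ newElement: Int, into shared: [Int]) -> [Int] {
    var result: [Int] = []
    result.reserveCapacity(shared.count + 1)
    result.append(newElement)
    result.append(contentsOf: shared)
    return result
}

// A generic "make the storage unique, then mutate" adapter can't express
// that: it copies all of `shared` into fresh storage first, and the
// subsequent insert(at: 0) shifts every element a second time.
```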
Anyhow, I think it's an interesting exercise to try. Good luck!
Nope, I'm not proposing that. C++ interop may imply that we'll get eagerly-copied value types with dynamic storage eventually, but I'm not comfortable adding that to the core language as a byproduct of solving the value semantics problem.
Actually I'm curious: do we want to allow for things that are value semantic but not threadsafe? E.g. a CoW type that isn't implemented with atomic RC?
Not saying this is a good thing, but we could implement that model and have ValueSemantic be separate from ActorSendable to model this. Array would conform to both, but NonAtomicCoWArray could conform to VS but not AS.
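A sketch of what that split could look like (every name here is hypothetical, including both protocols):

```swift
// Hypothetical marker protocols for the sake of the example.
protocol ValueSemantic {}
protocol ActorSendable {}

// Atomically refcounted CoW storage: a value, and safe to hand between threads.
struct AtomicCoWArray<Element>: ValueSemantic, ActorSendable {
    // storage elided
}

// Non-atomic refcount: still a value in the single-threaded sense, but two
// threads copying distinct variables could race on the shared reference count,
// so it would not be ActorSendable.
struct NonAtomicCoWArray<Element>: ValueSemantic {
    // storage elided
}
```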
I don't think we do want to (though note this is not about what most people think of as "threadsafe types," whose instances can be shared across threads, not just copied or moved). Especially with the cost of uncontended atomic operations coming down, carving out a special category for types less threadsafe than Int doesn't make sense to me. The principle of concept requirement clustering says that, if we are going to have ActorSendable, these things go together.
I have some serious reservations about actors, but that's mostly a different story.
On further thought, the refinement of ActorSendable doesn't even really make sense as a definition of "value semantics", because the set of things that are sendable-as-self includes among other things actor references, references to safe concurrent data structures, and unique references to isolated object graphs, none of which probably meet our intuition of what "value type" normally means. If a refinement of ActorSendable makes sense as a way of enabling specialized behavior (such as collections sending as self when their elements send as self), then ValueSemantics would be the wrong concept to tie to that specialization.
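For example (a sketch using today's actor syntax): an actor reference is safe to send as itself, but two copies of the reference still denote one shared, internally synchronized object rather than independent values:

```swift
actor Counter {
    private var count = 0
    func increment() -> Int {
        count += 1
        return count
    }
}

func demo() async {
    let a = Counter()
    let b = a                      // same actor, not an independent value
    _ = await b.increment()
    let seen = await a.increment()
    print(seen)                    // 2: the mutation made through `b` is visible via `a`
}
```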
At least until Swift gains move-only types, these properties hold for every type. The value of a class type is the reference. It seems to me that the closest thing you can get to a type level "value semantics" concept in Swift is ultimately the "copyable" ability.
No, they do not, but I can't understand why you would think that implies that the refinement doesn't make sense. It seems as though maybe you're really just saying there should be an intermediate "sendable-as-self" protocol.
Not in many programmers' mental models, and definitely not for some immutable final classes. If that were the case we wouldn't have === distinct from ==, we wouldn't allow Equatable conformance for classes, etc.
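For instance, a sketch of such a class, whose == reflects contents rather than identity:

```swift
final class Name: Equatable {
    let first: String
    let last: String
    init(first: String, last: String) {
        self.first = first
        self.last = last
    }

    // The programmer's notion of this type's value is its contents...
    static func == (lhs: Name, rhs: Name) -> Bool {
        lhs.first == rhs.first && lhs.last == rhs.last
    }
}

let ada1 = Name(first: "Ada", last: "Lovelace")
let ada2 = Name(first: "Ada", last: "Lovelace")
print(ada1 == ada2)    // true: equal values
print(ada1 === ada2)   // false: ...while identity is a separate question
```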
As I say further along in that post, any type could conform to ValueSemantic, and the meaningfulness of that conformance depends on having a definition of "value" for the type. If you want to say every class "has value semantics" but its value is reflected only by ObjectIdentifier(x) and ===, that's your prerogative, but I don't think that will be a very useful definition for most people.
Well, that again gets to the point that you can't really define a notion of "value" independent of some set of operations that apply to that notional value, and that it's the properties of the operation that ultimately matter. As is, you can define a value over some set of operations for every type.
Yes, and what we've done is to tie "value" and "safety" together in such a way that they allow us to reason about the behavior of operations on compositions of types whose notional value has been defined by the programmer, without repeating the same mental work for every single new operation that's written. These are the common cases that it's important to be able to work with most easily. And I've explained how to extend the model to cover all compositions in a way that builds on the basic model.
Yes, I generally agree with you, but for a different reason. Swift runs on a lot of non-Apple platforms, and that is important to consider. My rationale for agreeing with you is just that it fragments the world too much and would cut against the goal of making this safe by default. We could definitely model this, but it isn't worth it IMO.
That's actually the same reason; the principle of concept requirement clustering is all about avoiding fragmentation. The granularity of concepts should be enough to make the important distinctions between models, not every conceivable distinction. I don't think the category of types that are less threadsafe than Int is important enough to warrant the complexity that introducing a separate concept adds. No matter what the cost of atomic refcounting is (and there are ways to mitigate those costs), people still want to program with threads, so we simply don't often hear of types in that category playing a crucial role.
P.S. (and this is really incidental) I assumed that whatever advances Apple silicon was making in the efficiency of atomics was likely to be occurring across the industry. Without asking you to reveal any trade secrets, of course, I suppose you might have some insight into that. Care to comment?
The statement above, as I originally wrote it, hasn't been sitting right with me, so I've made a correction. I mean this: if I managed to finally write down a usable definition for ValueSemantic, it's only because of the work we did here together.
In particular:
if @Chris_Lattner3 hadn't prodded me to start the thread, it wouldn't have happened.
@anandabits's post yielded the crucial insight that if value semantics was to be used for thread-safety then a large category of operations had to be considered unsafe, and that category included all the operations that expose reference semantics on types that would otherwise be considered values.
Everybody who engaged in the discussion provided the environment of focus and collective inquiry that I, at least, depend on for insight.
Lastly I want to thank everyone for an experience that reminded me of why I believe so strongly in collaborative open-source development.