ValueSemantic protocol

dabrahams · November 5, 2020, 9:14pm

I believe what I wrote accommodates such types without the need for any exceptions. If you see something in the document that makes you think otherwise, please explain.

True, these degenerate cases do exist. But what is the value in carving out a special category for them?

anandabits · November 5, 2020, 10:34pm

I believe so as well. I think your definition is a great start! Thank you for putting the effort into laying it out so clearly and for including pragmatic conventions.

If the community adopts this definition and we move ahead with a proposal to bring support into the language what do you think we do with function types? Would we do something similar to what @Chris_Lattner3 suggested here for ActorSendable?

I would argue that it should conform. Not having this conformance would break the use of Never as a bottom type for APIs that are constrained to ValueSemantic. I have written lots of generic code that uses Never in this way and would use ValueSemantic.

Chris_Lattner3 · November 5, 2020, 10:52pm

This is a nice proposal Dave, I think it is well considered. I think it would be useful to include some sort of high level conceptual framework so that there is a short "layman" mental model as well.

I thought there was a proposal that Never should conform to all protocols implicitly?

-Chris

xwu · November 5, 2020, 11:03pm

I didn't think I was articulating an exception, but what I understand your proposal would imply in the case of types that have value semantics but offer withUnsafe* APIs.

Again, I wasn't articulating an exception. My point was that, if I understand your rules correctly, then it would be reasonable to apply the reasoning in such a way as to define away entirely the question of whether Unsafe* types have value semantics thus defined.

In the practical sense, yes. I was referring to the question of how it informs our understanding of the semantics of value semantics. In that respect, I claim that the answer to the question of whether Never has value semantics doesn't particularly do much for us, because it is a degenerate case. Similarly, I claim that answering the question of whether UnsafePointer has value semantics doesn't do much for us either, because it is a degenerate case for another reason (that reason being that its raison d'etre, and therefore most if not all of its salient APIs, are unsafe).

dabrahams · November 5, 2020, 11:11pm

Thanks! I've tried and known I was failing at this so many times—it's a great relief to have finally come up with something I believe in. Hopefully it stands up to poking from the likes of @Joe_Groff.

If the community adopts this definition and we move ahead with a proposal to bring support into the language what do you think we do with function types?

I don't think a useful concept of “value” exists for functions in Swift. Unlike UnsafePointer, we can't even compare two values of function type for equality. That points pretty clearly toward functions not having value semantics. Also, when there is a plausible notion of “value” that could be reflected by what happens when you call the function, e.g.

var a = 1, next = { (a, a+=1).0 }, next1 = next

next and next1 clearly don't exhibit the value-independence that would be required. You can, however, use callAsFunction to define value semantic types that are callable.
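The contrast above can be sketched as follows. This is a minimal illustration, not code from the definition document; the `Counter` type and both function names are hypothetical.

```swift
/// Calls a closure and a copy of it, returning both results in call order.
func closureCopiesShareState() -> (Int, Int) {
    var a = 1
    let next: () -> Int = { a += 1; return a - 1 }
    let next1 = next              // copies the closure, not the captured variable
    let first = next()            // advances the shared `a`
    let second = next1()          // observes the advance made through `next`
    return (first, second)
}

/// A callable counter whose state is stored by value (hypothetical type).
struct Counter {
    var a = 1
    mutating func callAsFunction() -> Int {
        defer { a += 1 }
        return a
    }
}

/// Calls a `Counter` and a copy of it, returning both results in call order.
func structCopiesAreIndependent() -> (Int, Int) {
    var next = Counter()
    var next1 = next              // copies the stored state
    let first = next()
    let second = next1()          // unaffected by the call through `next`
    return (first, second)
}
```

The closure version returns `(1, 2)` because both names advance the same captured variable, while the struct version returns `(1, 1)` because each copy owns its state.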

You mentioned bringing “support into the language.” The one obvious way I see to do that would be to introduce a ValueSemantic protocol whose semantic requirements match the bullet list at the top of my definition. We should definitely have that [edit: and implicit conformance of struct/enum/tuple aggregates of ValueSemantic types]. Beyond that, I'm not yet sure what kind of language features might be used to support the concept of value semantics further. I guess it would be a shame if we somehow introduced a type-system representation of purity that didn't also help us prevent errors in the definition and/or use of ValueSemantic-conforming types.

Would we do something similar to what @Chris_Lattner3 suggested here for ActorSendable?

You mean create some refinement relationship? Yes, ValueSemantic should refine ActorSendable, as I have suggested elsewhere, because every type with value semantics has a trivial way to efficiently satisfy the ActorSendable requirements. I'd consider doing the same for Codable (the inverse of what @xwu suggested there), but:

The way Codable can satisfy the ActorSendable requirements ain't necessarily so efficient.
X: ValueSemantic, Codable would now create an ambiguous conformance.

Yeah, Never should conform to every protocol, or at least every one with no init or static requirements. If we want a less magical degenerate example, consider struct X {}. Making that conform to ValueSemantic is neither good nor bad, IMO, because while it has no useful notion of value, it also has no operations. As far as I'm concerned, go ahead if you want to.
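To make the bottom-type use case concrete, here is a rough sketch. It assumes `ValueSemantic` is a marker protocol with no structural requirements; `Outcome` and `unwrap` are hypothetical names invented for the illustration.

```swift
/// Stand-in for the proposed protocol; it has only semantic requirements.
protocol ValueSemantic {}

extension Int: ValueSemantic {}
extension Never: ValueSemantic {}   // possible because there are no init or static requirements

/// A degenerate type with no useful notion of value and no operations.
struct X: ValueSemantic {}

/// A hypothetical result-like type whose parameters are constrained to ValueSemantic.
enum Outcome<Success: ValueSemantic, Failure: ValueSemantic> {
    case success(Success)
    case failure(Failure)
}

/// With `Never` as the failure type, the failure case cannot be constructed,
/// so matching `.success` alone is exhaustive.
func unwrap<T>(_ outcome: Outcome<T, Never>) -> T {
    switch outcome {
    case .success(let value): return value
    }
}
```

Without the `Never` conformance, `Outcome<Int, Never>` would not type check at all, which is the breakage described earlier in the thread.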

dabrahams · November 5, 2020, 11:29pm

I don't think the fact that a type has Unsafe in its name implies that the type doesn't have value semantics, if that's what you mean. If there's a natural definition of “value” for the Unsafe type, such that it obeys the rules at the top of my post, declaring the type to have value semantics lets us use it safely in more ways without documenting an impracticable number of special cases.

We could say that UnsafePointer instances don't have a value, but I don't think that matches most programmers' intuition: instead we think the pointer's value is entirely defined by the bits of the pointer. If we take that commonly accepted view, UnsafePointer meets all the requirements for value semantics and we can and should declare it so.

I don't know what you are hoping it would do for us, but being able to easily reason about the threadsafe use (and single-threaded isolation properties) of Array<UnsafePointer<Int>> is super-valuable to me.
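A small sketch of that view, taking the pointer's value to be its address and nothing more. The function name is made up for the example, and the pointee is deliberately never read through the copies.

```swift
/// Copies an array of pointers, changes one element of the copy, and reports
/// whether the original kept its element and the copy received the new one.
func pointerCopiesAreIndependent() -> (Bool, Bool) {
    let storage = UnsafeMutablePointer<Int>.allocate(capacity: 2)
    storage.initialize(repeating: 0, count: 2)
    defer {
        storage.deinitialize(count: 2)
        storage.deallocate()
    }

    let p = UnsafePointer(storage)
    let a = [p, p + 1]
    var b = a                 // copies the pointer values (the addresses)
    b[0] = p + 1              // changes only b's element

    return (a[0] == p, b[0] == p + 1)
}
```

Both results are `true`: mutating `b` is invisible through `a`, which is the independence property being claimed for `Array<UnsafePointer<Int>>`. Dereferencing the pointers is a separate, unsafe operation that this sketch does not touch.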

anandabits · November 6, 2020, 1:56am

You must be thinking of SE-0215. Only Equatable, Hashable and Comparable were included. Never can’t conform to some protocols (such as ones with an initializer requirement) so the design work to generalize conformance was left as future work.

Yes, and the function type topic above is a great example. Functions that only capture immutable values of types that conform to ValueSemantic and have a safe implementation would be perfectly reasonable “values”. They could also conform to Equatable if all captures conform to Equatable as was discussed in the thread I linked previously.

In my experience, I often want a function that is expected to behave as a value but I have no way to express that in the type system. It would be great to be able to do that. Further, I often find function types viral in making other types unable to conform to Equatable. There are cases where I would be willing to limit the functions to only having Equatable captures in order to gain an Equatable conformance on the function value.

No, I was talking about how he suggested making function types conform when all captured values conform by using an @actorSendable attribute.

Right, but assuming that doesn’t automatically happen before ValueSemantic goes in it should get a conformance so we don’t have to add it outside the standard library.

dabrahams · November 6, 2020, 3:59am

Yes, if we could represent “only capture immutable values of types that conform to ValueSemantic and have a safe implementation” (actually you need stricter constraints than that—see below) in the type system, those types could be said to “have value semantics.” But then that would be a different type from an ordinary function type that we have today.

You can do that without any language features

/// A function of one argument that has value semantics.
///
/// The “value” of this function is observable via the semantics of
/// its `callAsFunction` method.
struct PureFunction<Argument, Result>: ValueSemantic {
  /// Creates an instance that uses `f` to implement its `callAsFunction` method.
  ///
  /// - Precondition: `f` does not mutate anything and is memory-safe.
  init(_ f: @escaping (Argument) -> Result) { self.implementation = f }
  
  /// Invokes the implementation.
  func callAsFunction(_ x: Argument) -> Result { implementation(x) }

  /// The function that implements `self`
  private let implementation: (Argument) -> Result
}

But a first-class language feature that makes a category of pure function types would be lots cleaner to use, would allow us to type check the composition of pure functions, and could handle inout.
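As a rough illustration of what the library-only approach already allows, the sketch below restates the wrapper in self-contained form and adds a composition helper. The marker protocol and the `then` method are assumptions made for the example, not part of any proposal.

```swift
/// Stand-in for the proposed protocol (assumed to have no structural requirements).
protocol ValueSemantic {}

/// A function of one argument that is documented to have value semantics.
struct PureFunction<Argument, Result>: ValueSemantic {
    /// - Precondition: `f` does not mutate anything and is memory-safe.
    init(_ f: @escaping (Argument) -> Result) { implementation = f }

    /// Invokes the implementation.
    func callAsFunction(_ x: Argument) -> Result { implementation(x) }

    private let implementation: (Argument) -> Result
}

extension PureFunction {
    /// Returns a function that applies `self` and then `next` (hypothetical helper).
    func then<Next>(_ next: PureFunction<Result, Next>) -> PureFunction<Argument, Next> {
        let first = self
        return PureFunction<Argument, Next> { x in next(first(x)) }
    }
}
```

Composition type checks only between wrapped functions, so an ordinary closure has to be wrapped explicitly before it can take part. The compiler still cannot verify the precondition on `init`, which is where a first-class feature would help.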

They could also conform to Equatable if all captures conform to Equatable as was discussed in the thread I linked previously.

Oh, I think not, unless you want to say the behavior of the function is independent of its value! There's no way to compare the behaviors of two functions for equality.

In my experience, I often want a function that is expected to behave as a value but I have no way to express that in the type system. It would be great to be able to do that.

Yeah, sure, design away! That sounds great. Could get messy though; you might need to represent safe in the type system too.

Further, I often find function types viral in making other types unable to conform to Equatable. There are cases where I would be willing to limit the functions to only having Equatable captures in order to gain an Equatable conformance on the function value.

Yeah, I really don't think that's good enough. You can define distinct Equatable callAsFunction-able types that have equatable properties and don't store a value of function type. But once you get to built-in function types in Swift there's no way to control the semantics that they expose, and I don't think we'll ever be able to compare them reliably.
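For instance, a callable type along these lines keeps its behavior tied to Equatable stored state. `AddConstant` and `Step` are hypothetical names used only for this sketch.

```swift
/// A callable type whose behavior is fully determined by its stored state.
struct AddConstant: Equatable {
    var amount: Int
    func callAsFunction(_ x: Int) -> Int { x + amount }
}

/// An enclosing type can synthesize Equatable because it stores no function value.
struct Step: Equatable {
    var name: String
    var transform: AddConstant
}
```

Two `Step` values compare equal exactly when their names and amounts match, and equal values behave identically when called. A stored closure property in place of `transform` would block the synthesized conformance.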

I was talking about how he suggested making function types conform when all captured values conform by using an @actorSendable attribute.

problematic, as noted above.

Never should conform to every protocol, or at least every one with no init or static requirements

Right, but assuming that doesn’t automatically happen before ValueSemantic goes in it should get a conformance so we don’t have to add it outside the standard library.

SGTM.

Chris_Lattner3 · November 6, 2020, 4:25am

You're right. It is more precise to say that Never should conform to protocols without init and static requirements. I think that ValueSemantics is an example of this.

I'd encourage us to think of ValueSemantics in two stages: provide a model that captures the core semantics in an opt in way, then possibly find a safe framework where such behavior can be inferred. The latter doesn't block the former, and we have some basis (structs, tuples, enums) that show that such composition doesn't require defining properties over all operations of the derived types.

-Chris

dabrahams · November 6, 2020, 4:25pm

To be explicit, I think you're suggesting:

Stage 1: Create a ValueSemantic protocol (including its documentation) and apply it to the standard library.
Stage 2: Allow ValueSemantic conformance to be inferred for tuples, structs, and enums of ValueSemantic-conforming types.

If that's what you mean, sure. I'm all for staging, although I don't see a huge advantage in this case because stage 2 is pretty trivial.
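The two stages might look roughly like this in source. The sketch assumes a marker protocol with no structural requirements; the conformances listed and the `Account` type are illustrative choices, not a proposed list.

```swift
// Stage 1: the protocol itself, applied to (a few) standard library types.
protocol ValueSemantic {}
extension Int: ValueSemantic {}
extension String: ValueSemantic {}
extension Array: ValueSemantic where Element: ValueSemantic {}
extension Optional: ValueSemantic where Wrapped: ValueSemantic {}

// Stage 2 would let the compiler infer this conformance from the stored
// properties. Without inference, the author declares it explicitly.
struct Account: ValueSemantic {
    var id: Int
    var tags: [String]
}

/// A generic entry point that accepts only value-semantic arguments.
func requiresValue<T: ValueSemantic>(_ x: T) -> T { x }
```

With only stage 1 in place the declaration on `Account` is a promise the compiler does not check: a class-typed stored property would be accepted without complaint.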

anandabits · November 6, 2020, 4:45pm

Chris was already talking about introducing something different than ordinary function types with @actorSendable.

I agree this would be great. Do you imagine that these functions would conform to ValueSemantic? Fwiw, inout would work fine with my suggestion. Only mutable captures would need to be banned.

Sure, but then you lose all of the language support for function types.

I don't see why it is a problem to say the value of the function is related to source identity. If we take the C++ approach and view closures as syntactic sugar for anonymous structs then equality becomes clear. It looks like @Joe_Groff thinks this is a reasonable idea.

What specific problems do you see with the above approach?

Unfortunately I don't have time for this exercise right now. I hope we can have ValueSemantic function values someday though.

Agree.

I also think this inference should be supported from the start. I'm hoping Chris has something more in mind related to enforcing safety.

dabrahams · November 6, 2020, 5:13pm

Assuming for the sake of discussion that the annotation used to distinguish these value-semantic functions is pure…

Yes, and my point is that when thinking about how nice this would be for you, don't overlook that it will introduce some type mismatches that you don't have to deal with currently if you're doing it with plain function types, e.g. you wouldn't be able to implicitly convert an ordinary function into a pure function.

Do you imagine that these functions would conform to ValueSemantic?

Yes.

Fwiw, inout would work fine with my suggestion. Only mutable captures would need to be banned.

Pretty sure that's not the whole list. Don't you need to ban access to mutable globals and calls to non-pure functions?

If Joe thinks it's reasonable maybe I'm mistaken, but I was under the impression that reabstraction thunking means that the source identity of some function values is inaccessible. And if it's useful for you to have a system where

{ x:Int in x } != { x:Int in x }

to say nothing of

{ x:Int in x + 1 } != { x:Int in 1 + x }

then I suppose Equatable conformance is okay. You have to say that the source identity of the function is notionally part of its value but otherwise invisible to the user. Although that's a semantically coherent point-of-view, it's weird, and IMO we should ask ourselves whether it's too counterintuitive for Swift before cargo-culting it from C++. It seems to me that in a system like this you can use equality of functions to do some manual optimizations (e.g. “don't rebuild the cache if the computation is known not to be changing”), but not for much else(?).

I also think this inference should be supported from the start.

To be perfectly clear, I am not taking a position on that point yet.

anandabits · November 6, 2020, 9:31pm

Yes of course not, that's the whole point of having the distinction in the first place!

If we're talking about pure then yes, of course. But that wasn't my original intent here. You specifically avoided entangling pure and ValueSemantic. I was just putting together your approach with Chris's approach to ActorSendable.

That said, as long as we get pure functions someday I'm on board with waiting for those. There is no name available to mark unsafe if the implementation isn't in fact safe so I can see the argument that existing function types shouldn't conform to ValueSemantic even if they only immutably capture ValueSemantic values.

dabrahams:

And if it's useful for you to have a system where
{ x:Int in x } != { x:Int in x }
to say nothing of
{ x:Int in x + 1 } != { x:Int in 1 + x }
then I suppose Equatable conformance is okay. You have to say that the source identity of the function is notionally part of its value but otherwise invisible to the user.

Yep, this is what I would find useful. I think the advantages would outweigh the disadvantages in practice. There are plenty of cases where the mismatch would not cause problems in practice.

Conceptually, I view functions as existentials of single-requirement protocols. I think this point of view (if followed through all the way with language support) leads to a more consistent and powerful semantics and type system. If you adopt that point of view then I don't think it's strange to have the view that different "conformances" will not be equal. If you don't like this point of view then I understand why it might seem weird.
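One way to picture that model with today's language is sketched below. It is an emulation under stated assumptions, not how Swift represents closures: `IntFunction`, `isEqual(to:)`, and the two structs are hypothetical, and each struct stands in for one closure expression in source.

```swift
/// A single-requirement protocol standing in for the type `(Int) -> Int`.
protocol IntFunction {
    func callAsFunction(_ x: Int) -> Int
    /// Compares `self` with another function value of possibly different type.
    func isEqual(to other: any IntFunction) -> Bool
}

extension IntFunction where Self: Equatable {
    // Values of different conforming types are never equal; otherwise defer to `==`.
    func isEqual(to other: any IntFunction) -> Bool {
        guard let same = other as? Self else { return false }
        return self == same
    }
}

// Each struct plays the role of one closure expression in source.
struct AddOne: IntFunction, Equatable {
    func callAsFunction(_ x: Int) -> Int { x + 1 }
}
struct OnePlus: IntFunction, Equatable {
    func callAsFunction(_ x: Int) -> Int { 1 + x }
}
```

Here `AddOne()` and `OnePlus()` compute the same results for every input yet compare unequal, because equality follows the conforming type (the source identity) rather than the behavior.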

Fwiw, I did not cargo-cult this from C++. Function types as they currently exist virally destroy our ability to conform to Equatable in a coherent way. I view this as a serious problem.

I'm well aware that (as with pure) supporting them introduces a new type that isn't compatible with all function values. They definitely shouldn't be reached for reflexively. But as with any code working against an abstraction, there are use cases where imposing the Equatable constraint would be very useful.

Fair enough. If there are no implementation challenges I can't imagine why inference would be deferred though. What would be the purpose of deferring this convenience?

dabrahams · November 6, 2020, 10:05pm

I can't say that I fully understand what you're trying to accomplish here, but I simply can't imagine any category of function values that, for its integrity, depends on banning mutable captures, but does not equally depend on banning the function from accessing other mutable state.

Conceptually, I view functions as existentials of single-requirement protocols. I think this point of view (if followed through all the way with language support) leads to a more consistent and powerful semantics and type system. If you adopt that point of view then I don't think it's strange to have the view that different "conformances" will not be equal. If you don't like this point of view then I understand why it might seem weird.

Well, there are a few inherent problems with it. The first is that today we can meaningfully describe the concept of existentials in terms of functions, but in your world that would become circular. But even if that doesn't bother you, your view implies something that I doubt will ever match up with reality in Swift: that there's a different underlying type, somewhere, for different functions having the same signature.

Fwiw, I did not cargo-cult this from C++.

Sorry, no offense intended. All I mean is that “the C++ approach” as you call it might not be appropriate for Swift, and we should think about it carefully.

Function types as they currently exist virally destroy our ability to conform to Equatable in a coherent way. I view this as a serious problem. I'm well aware that (as with pure) supporting them introduces a new type that isn't compatible with all function values. They definitely shouldn't be reached for reflexively. But as with any code working against an abstraction, there are use cases where imposing the Equatable constraint would be very useful.

Are there many types that contain function values for which it makes sense to propagate the source identity of the contained function into a notion of the outer type's value? I'd like to hear about some examples.

If there are no implementation challenges I can't imagine why inference would be deferred though. What would be the purpose of deferring this convenience?

One thing is that people might not be happy with proliferating implicitly-generated conformances for code size reasons. Another is that they might not be happy with what's needed to suppress the implicit conformance (embedding a non-ValueSemantic empty struct). We wouldn't want either consideration to block protocol ValueSemantic.

anandabits · November 6, 2020, 10:29pm

Fair enough.

There would certainly need to be a change to make it happen and you would know better than I how likely that is.

Sure.

I'll think about what would make a good example.

Sorry, I meant synthesis of declared conformances. I agree that implicit conformance is stickier and shouldn't block the protocol.

dabrahams · November 6, 2020, 10:44pm

Oh, that's much easier; there's nothing to synthesize. There would be no structural requirements for ValueSemantic, only semantic ones. If ActorSendable becomes a reality and we decide to make ValueSemantic refine it, the one requirement can come from a simple default implementation with the body { self }.
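A sketch of that refinement follows. The requirement name `unsafeSendToActor()` is an assumption made for illustration and may not match what an ActorSendable proposal ends up specifying; `Point` is a hypothetical example type.

```swift
/// Assumed shape of the protocol: one requirement that produces a value
/// safe to hand to another actor.
protocol ActorSendable {
    func unsafeSendToActor() -> Self
}

/// ValueSemantic adds no structural requirements of its own.
protocol ValueSemantic: ActorSendable {}

extension ValueSemantic {
    // A value-semantic instance is independent of every other copy,
    // so returning it unchanged is enough.
    func unsafeSendToActor() -> Self { self }
}

struct Point: ValueSemantic, Equatable {
    var x = 0
    var y = 0
}
```

Conforming types such as `Point` write no extra code, which is why there is nothing for the compiler to synthesize.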

anandabits · November 7, 2020, 12:00am

Oh, of course!

Chris_Lattner3 · November 7, 2020, 6:08am

I don't think we can or need to enforce safety for ValueSemantic. We don't do that for Hashable or Equatable after all. The definition of ValueSemantic is something that only makes sense at the human level of expectations, I don't think that adding type system machinery or imposing hard rules is going to be productive and worth the complexity. I'm totally willing to reconsider this on a case by case basis of course!

To be explicit, I don't think that a pure function annotation is necessary or sufficient (but I could misunderstand the intention here). You'd need a very complicated model to make it correct in practice for things like "capacity" and other non-trivial cases.

I agree with Dave that equality for functions is "hard" (for the examples he cites and many more) and probably not worth it. We could include a hash of the source code in the closure (or something) but I don't see how the complexity would be worth it. When do we need equality checks for closures? Given that we don't have them yet, don't we have evidence that those uses can be solved other ways?

In any case, this is a bit of a distraction: it is orthogonal to ActorSendable and ValueSemantic, so I'd recommend taking the discussion to a new thread (or resuscitating an old one).

I agree that this isn't obvious, and I'd rather default to being conservative on this. I mention a long list of caveats with automatic conformance in the 'alternatives considered' section of the ActorSendable proposal that talks about these issues.

-Chris

michelf · November 7, 2020, 4:32pm

I find this statement puzzling. What are the objectives pursued here?

If ValueSemantics ends up implying the value can be sent across actors, then it is a memory safety issue if not checked by the compiler. Mislabeling a type with ValueSemantics and sending it to actors could result in memory corruption if multiple threads access it concurrently. And if it does not allow sending values across actors, then I wonder what it is good for.

Or maybe the protocol should simply have ‘unsafe’ in its name?

If this is about thread safety, we don’t need to worry about capacity. It is perfectly thread safe to call Array.capacity. It might not be pure given some sense of pure, but does purity matter here?
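The `capacity` case can be seen in a few lines. This sketch only shows that two arrays can be equal while one of them has had storage reserved; the function name is hypothetical.

```swift
/// Reserves storage in a copy of an array and reports whether the two arrays
/// still compare equal and whether the reservation took effect.
func capacityIsNotPartOfTheValue() -> (equal: Bool, reserved: Bool) {
    let a = [1, 2, 3]
    var b = a
    b.reserveCapacity(100)     // changes b's storage, not its elements
    return (a == b, b.capacity >= 100)
}
```

Both results are `true`, so `capacity` is observable state that `==` ignores. That is the kind of operation any rule phrased in terms of "value" has to account for.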

I used to think a value semantics protocol did not matter and the only thing that mattered is what you do with the data. I think I was both right and wrong on this.

A model of restrictions at the operation level is useful to guarantee memory safety in the presence of concurrent access to the same data. But by itself it does not facilitate moving data between actors, it just validates what functions can be called safely when the current context does not have ownership of some data.

A model of value semantics is a model of things that are transferred by value, either by eager copy (like C++ containers or NSString (sort of)), CoW, or because everything is immutable. It’s a model for things where the data is passed in such a way you don’t need to worry about ownership: the current context can always assume to be the owner.
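As a minimal illustration of the difference in ownership, compare a copy-on-write array with a class instance. `Box` and `copyBehaviors` are hypothetical names for the example.

```swift
/// A reference type: copies of the reference share one object.
final class Box {
    var items = [1, 2, 3]
}

/// Mutates a copy of an array and an alias of a `Box`, then returns what
/// each original looks like afterwards.
func copyBehaviors() -> (original: [Int], copy: [Int], boxed: [Int]) {
    let a = [1, 2, 3]
    var b = a
    b.append(4)                // copy-on-write: `a` is unaffected

    let box = Box()
    let alias = box
    alias.items.append(4)      // visible through `box` as well

    return (a, b, box.items)
}
```

After the mutations `a` is still `[1, 2, 3]`, while `box.items` has become `[1, 2, 3, 4]`. Only the array lets the current context assume it is the sole owner of what it holds.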

I think value semantics can help make a restriction regime more manageable. If you pass something to another actor and the compiler checks everything for memory safety, in the general case that thing you passed needs to be flagged in the type system as having an external owner. And so those flags can become viral annotations: any function you pass the data to needs to promise it won’t dereference what isn’t owned by the current actor. However, if a type has value semantics there is no external owner to worry about and the flag can be safely dropped. This means the restriction regime has much less to manage and many annotations can be dropped.

I might be confusing ActorSendable with ValueSemantics here. To me it seems ActorSendable is basically the same as ValueSemantics except it can also apply to things like NSString where the object isn’t guaranteed to strictly be a value (could be a mutable string) but nevertheless represents a value that can be copied across safely. (The address of an NSString isn’t really what I’d call a salient attribute that needs to be preserved.) I somewhat doubt making the distinction between the two is very valuable.

dabrahams · November 7, 2020, 4:41pm

I agree about need; we can get a lot of value out of ValueSemantic without any language features. As for can, nerdsnipe accepted for sometime in the future. Please consider this an I.O.U.

I don't think that adding type system machinery or imposing hard rules is going to be productive and worth the complexity.

That may indeed turn out to be the case. I'm just not sure yet.

To be explicit, I don't think that a pure function annotation is necessary or sufficient (but I could misunderstand the intention here). You'd need a very complicated model to make it correct in practice for things like "capacity" and other non-trivial cases.

Depends what “pure” means, of course. If it meant “safely accesses only the value part of non-global ValueSemantic types,” for example, that could be enough (of course you need an escape clause so you can implement safe things with unsafe constructs). But I'm trying to put this off! (You're a very effective nerdsniper, Chris.)