ValueSemantic protocol

Responding to several points above:

I don't think there is any productive type-level definition of ValueSemantic that can take into consideration "every possible" function/extension that would use values of the type. We have to define the property based on the logical behavior of the type's implementation and contract itself.

To stress test this, I think it is fine for UnsafePointer to be ActorSendable (and ValueSemantic), for the same reason that Int is ActorSendable even if you use it to index into a global dictionary: the definition of the value itself obeys value semantics, even though some external operations (dereferencing, indexing) can depend on additional state. This may not seem obvious given that we want to preserve actor isolation here. However, observe that UnsafePointer is explicitly unsafe right there in the name. In contrast, a SafeReferenceSemanticArray should not be actor sendable.

I think the crux of the issue (as many people describe above) is "what is the encapsulated state represented by the value" and "does a copy (with equal sign) give value preserving semantics" over that state?

This is the difference between Array, UnsafePointer, and SafeReferenceSemanticArray: Array and SafeReferenceSemanticArray logically encapsulate the elements in the array. UnsafePointer is not an encapsulation of all global memory.
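The Int analogy above can be sketched as follows. This is only an illustration: `table`, `index`, and `copy` are hypothetical names, not anything from the standard library or from the posts.

```swift
// Hypothetical global state that an Int happens to index into.
var table: [Int: String] = [0: "zero", 1: "one"]

let index = 1
let copy = index            // copies the integer, not the table entry

table[1] = "uno"            // mutates external state

// The value of `index` (and of its copy) is still 1. Only the result of
// the external operation "look this index up in `table`" has changed.
let valuesStillEqual = (index == copy)
let lookedUp = table[index]
```

The lookup result depends on state that the Int does not encapsulate, which is the same relationship the post describes between an UnsafePointer and the memory it addresses.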

To be explicit, yes, I'm arguing that Array<UnsafePointer> should conform to ValueSemantic. :)

-Chris

You said above that you believe UnsafePointer should conform to ActorSendable but didn't specifically say it should conform to ValueSemantic. That seems to be implied by your comment about Array<UnsafePointer> here, is that correct?

Right - updated the post to make that explicit, thx

Gotcha. I trust your judgement on this. The argument that "unsafe" makes this ok seems reasonable.

This is a great way of looking at it.

If an Int value could be said to encapsulate the memory at a certain address, then it could equally be said to encapsulate the memory at any offset from that address and, by induction, all memory. That would not be useful.

Defining Value Semantics for Swift

(with thanks for editorial input to the Swift for Tensorflow Team, and to @anandbits for a crucial insight)

For years we've talked informally about certain types “having value semantics”
and used that statement to draw correct conclusions about the behavior of some
code. But until now, we've never really nailed down what it means for a type to
have value semantics. This document provides a clear definition on which
language and/or library extensions could be based, and explores its
implications.

Requirements of Value Semantic Types

When we say “type X has value semantics,” we mean:

  • Each variable of type X has an independent notional value.

  • A language-level copy (e.g., let b = a) of a variable of type X has an
    equivalent value.

  • Given a local variable a of type X, safe code cannot observe the value of a
    except via an expression that uses a.

  • Given a variable a of type X, safe code cannot alter the value of a
    except by one of the following means applied to a or to a property of a
    that reflects all or part of a's value:

    • assignment.
    • invocation of a mutating method.
    • invocation of a mutating accessor of a property or subscript.
    • passing the expression as an inout parameter.
  • Concurrent access to the values of distinct variables of type X cannot
    cause a data race.
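As a rough illustration of the first two requirements, the sketch below contrasts a struct whose copies are independent with a struct that stores a class reference and so shares state between copies. All type names here (`Temperature`, `Cell`, `LeakyWrapper`) are hypothetical.

```swift
// A struct of plain stored properties: each variable has its own value.
struct Temperature {
    var degrees: Int
}

let morning = Temperature(degrees: 15)
var noon = morning          // language-level copy
noon.degrees = 25           // alters only `noon`

// A struct that stores a class reference and exposes its state:
// copies share that state, so the independence requirement fails.
final class Cell {
    var value: Int
    init(_ value: Int) { self.value = value }
}

struct LeakyWrapper {
    let cell: Cell
}

let first = LeakyWrapper(cell: Cell(1))
let second = first          // copies the reference, not the Cell
second.cell.value = 99      // observable through `first` as well
```

Here `morning.degrees` is unaffected by the change to `noon`, while `first.cell.value` changes even though no expression that mutates `first` appears.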

Safety

Swift has an existing definition of safety, which just implies “memory safety.”
The above definition says that for value semantic types, some memory-safe
operations—those that can cause value access on distinct instances to race—are
also classified as unsafe. This constraint on the meaning of “safety” does not
break any existing code because no types have yet been classified as value semantic
under this definition.

Corollaries

From the above properties and Swift’s semantics, we can conclude that for a type
with value semantics:

  • The value of a let-bound instance is constant for all time.
  • All instances have the same properties as variables.
  • Assignment causes the left-hand side to have the same value as the right-hand
    side.
  • Any struct, enum, or tuple whose value is composed of the values of its
    stored value-semantic properties also has value semantics.
  • The type's value semantic properties are preserved by any new APIs
    (e.g. extensions) implemented using only safe operations.
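The composition and extension corollaries might look like this in practice. It is a sketch with hypothetical types (`Coordinate`, `Segment`), assuming Int is treated as value semantic.

```swift
struct Coordinate {
    var x: Int
    var y: Int
}

// Composed only of value-semantic stored properties.
struct Segment {
    var start: Coordinate
    var end: Coordinate
}

// A new API written using only safe operations on the stored values.
extension Segment {
    mutating func translate(dx: Int) {
        start.x += dx
        end.x += dx
    }
}

let fixed = Segment(start: Coordinate(x: 0, y: 0),
                    end: Coordinate(x: 3, y: 4))
var moved = fixed
moved.translate(dx: 5)      // mutating method applied to `moved` only
```

The let-bound `fixed` keeps its value after the copy is translated, as the first corollary predicts.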

Usefulness

Knowing that a type has value semantics is useful because:

  • Variables can be safely copied or moved across concurrency boundaries
    without introducing race conditions in code that otherwise appears to be
    safe.
  • The value of a non-global variable is immune from “spooky action at a distance.”
  • Value-semantic types support equational reasoning.

Defining Value

The statement that “X has value semantics” is obviously meaningless without a
concept of the value of an instance of X. This concept must be clear for each
type with value semantics. Likewise, to be useful, the operations that access a
value of an instance of X must be clear. In many ways, the semantics of those
operations constitute a definition of the type's value.

Depending on how narrowly one defines a type's value, any type can be said to
have value semantics. For example, UnsafeMutablePointer can have value
semantics if pointee and the data accessed by subscript (and the operations
that affect them) are excluded from its value but its other members are
included.
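One way to read the UnsafeMutablePointer example is that the pointer's value is its address bits, while `pointee` names memory outside that value. The sketch below assumes that reading; the variable names are illustrative only.

```swift
let storage = UnsafeMutablePointer<Int>.allocate(capacity: 1)
storage.initialize(to: 10)

let alias = storage                         // copies the address
let addressBefore = Int(bitPattern: storage)

alias.pointee = 20                          // writes through the copy

// Under the narrow definition, the value of `storage` did not change:
// its address bits are the same as before.
let addressAfter = Int(bitPattern: storage)

// The memory it refers to did change, which is why `pointee` has to be
// excluded from the value for the type to have value semantics.
let observed = storage.pointee

storage.deinitialize(count: 1)
storage.deallocate()
```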

Documenting Value

For types that are consumed only by their authors (e.g. in a quick script), a
clear concept of value can be maintained in the author's brain. For other types,
the value must be documented.

For most types with value semantics, it would be difficult to describe the type
without describing its value, but it could be cumbersome to explicitly and
separately document the value of every such type. Fortunately we can avoid
some of that description by acknowledging the patterns these types usually
follow. I suggest the following convention for types documented as having value
semantics:

Unless explicitly documented otherwise,

  • public and internal properties are implicitly part of a type's value.
  • Data accessed by public and internal subscripts is implicitly part of a
    type's value.
  • private and fileprivate properties and subscripts are implicitly not
    part of the type's value.
  • The results of witnesses for any Equatable, Comparable, Hashable, and
    Codable conformances depend entirely on the type's entire value.
  • Non-salient attributes of the type (like an Array's capacity) can be
    assumed to be safely accessible on distinct instances, across threads.
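Array's capacity gives a concrete picture of a non-salient attribute. In this small sketch two arrays compare equal even though their capacities may differ.

```swift
let original = [1, 2, 3]
var roomy = original
roomy.reserveCapacity(1_000)    // changes capacity, not the elements

// Equatable looks only at the value (the elements).
let sameValue = (original == roomy)

// Capacity is observable but is not part of that value.
let roomyHasSpace = roomy.capacity >= 1_000
```

Code that branches on `capacity` can therefore behave differently for two arrays that are equal, which is why the convention calls such attributes out separately.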

Other APIs probably don't need any special attention in documentation for value
semantics, since it is pretty much impossible to describe their semantics in
documentation without describing how they use, reflect, and/or affect the type's
value.

Deployment in Practice

It follows from the definition of “value semantics” that any APIs that don't
uphold the requirements value semantics imposes on safe operations are unsafe, and
should be labeled as such in public APIs, by the usual Swift convention.

It follows that “Unsafe” in a type name should mean that one is to assume that
operations are unsafe unless otherwise specified. For example, the safe
operations on UnsafePointer should be documented as such to support its claim of
value semantics.

It's important that clients of APIs exposing non-salient attributes know that
these attributes are not part of the instance's value.

Open Questions

I think UnsafePointer should be declared to have value semantics. Whether a
similar thing should be done for UnsafeBufferPointer is an interesting
question; it would imply that there are some Collections whose elements are
not part of their value. Whether that's OK may depend on how extant Collection
algorithms are documented.

I think this is a very workable definition at first blush.

Since you argue that value semantics confer additional "safety" above memory safety to types, I think it would be reasonable to consider unsafe operations and unsafe types generally as exceptions.

That is, a type can have value semantics and still offer withUnsafe* APIs that aren't thread-safe, and users will have to read the documentation for those operations to use them correctly across concurrency boundaries.

Similarly, since we don't have to name operations unsafe on types that are already themselves Unsafe*, it makes sense to say that those types neither have nor do not have value semantics. If every operation available on a type is unsafe, it would be a vacuous statement to reckon whether it has value semantics. (For the same reason, it'd be not very useful to argue whether Never has value semantics.)

I believe what I wrote accommodates such types without the need for any exceptions. If you see something in the document that makes you think otherwise, please explain.

True, these degenerate cases do exist. But what is the value in carving out a special category for them?

I believe so as well. I think your definition is a great start! Thank you for putting the effort into laying it out so clearly and for including pragmatic conventions.

If the community adopts this definition and we move ahead with a proposal to bring support into the language what do you think we do with function types? Would we do something similar to what @Chris_Lattner3 suggested here for ActorSendable?

I would argue that Never should conform. Not having this conformance would break the use of Never as a bottom type for APIs that are constrained to ValueSemantic. I have written lots of generic code that uses Never in this way and would use ValueSemantic.

This is a nice proposal Dave, I think it is well considered. I think it would be useful to include some sort of high level conceptual framework so that there is a short "layman" mental model as well.

I thought there was a proposal that Never should conform to all protocols implicitly?

-Chris

I didn't think I was articulating an exception, but rather stating what I understand your proposal to imply for types that have value semantics but offer withUnsafe* APIs.

Again, I wasn't articulating an exception. My point was that, if I understand your rules correctly, then it would be reasonable to apply the reasoning in such a way as to define away entirely the question of whether Unsafe* types have value semantics thus defined.

In the practical sense, yes. I was referring to the question of how it informs our understanding of the semantics of value semantics. In that respect, I claim that the answer to the question of whether Never has value semantics doesn't particularly do much for us, because it is a degenerate case. Similarly, I claim that answering the question of whether UnsafePointer has value semantics doesn't do much for us either, because it is a degenerate case for another reason (that reason being that its raison d'etre, and therefore most if not all of its salient APIs, are unsafe).

Thanks! I've tried and known I was failing at this so many times—it's a great relief to have finally come up with something I believe in. Hopefully it stands up to poking from the likes of @Joe_Groff.

If the community adopts this definition and we move ahead with a proposal to bring support into the language what do you think we do with function types?

I don't think a useful concept of “value” exists for functions in Swift. Unlike UnsafePointer, we can't even compare two values of function type for equality. That points pretty clearly toward functions not having value semantics. Also, when there is a plausible notion of “value” that could be reflected by what happens when you call the function, e.g.

var a = 1, next = { (a, a+=1).0 }, next1 = next

next and next1 clearly don't exhibit the value-independence that would be required. You can, however, use callAsFunction to define value semantic types that are callable.
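The same point can be spelled out with one statement per line. This is a sketch; `counter`, `first`, and `second` are illustrative names.

```swift
var counter = 1

// The closure reads and then mutates `counter`, which it shares with
// every copy of itself.
let next = { () -> Int in
    let current = counter
    counter += 1
    return current
}

let next1 = next            // copies the function value, not `counter`

let first = next()          // advances the shared counter
let second = next1()        // sees the effect of the call through `next`
```

Calling `next` changes what `next1` returns, so the two variables do not have independent values.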

You mentioned bringing “support into the language.” The one obvious way I see to do that would be to introduce a ValueSemantic protocol whose semantic requirements match the bullet list at the top of my definition. We should definitely have that [edit: and implicit conformance of struct/enum/tuple aggregates of ValueSemantic types]. Beyond that, I'm not yet sure what kind of language features might be used to support the concept of value semantics further. I guess it would be a shame if we somehow introduced a type-system representation of purity that didn't also help us prevent errors in the definition and/or use of ValueSemantic-conforming types.

Would we do something similar to what @Chris_Lattner3 suggested here for ActorSendable?

You mean create some refinement relationship? Yes, ValueSemantic should refine ActorSendable, as I have suggested elsewhere, because every type with value semantics has a trivial way to efficiently satisfy the ActorSendable requirements. I'd consider doing the same for Codable (the inverse of what @xwu suggested there), but:

  • The way Codable can satisfy the ActorSendable requirements ain't necessarily so efficient.
  • X: ValueSemantic, Codable would now create an ambiguous conformance.
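A minimal sketch of the refinement might look like the following. Neither protocol exists in the standard library; the `unsafeSendToActor()` requirement is an assumption about the shape of the ActorSendable pitch, used here only to show how a value-semantic type could satisfy it trivially.

```swift
// Assumed shape of the pitched protocol (hypothetical).
protocol ActorSendable {
    func unsafeSendToActor() -> Self
}

// Hypothetical marker protocol refining ActorSendable.
protocol ValueSemantic: ActorSendable {}

// Every value-semantic type can satisfy the requirement with a plain copy.
extension ValueSemantic {
    func unsafeSendToActor() -> Self { self }
}

struct Celsius: ValueSemantic {
    var degrees: Double
}

// Generic code constrained to ActorSendable accepts any ValueSemantic type.
func send<T: ActorSendable>(_ value: T) -> T {
    value.unsafeSendToActor()
}

let sent = send(Celsius(degrees: 21))
```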

Yeah, Never should conform to every protocol, or at least every one with no init or static requirements. If we want a less magical degenerate example, consider struct X {}. Making that conform to ValueSemantic is neither good nor bad, IMO, because while it has no useful notion of value, it also has no operations. As far as I'm concerned, go ahead if you want to ;)

I don't think the fact that a type has Unsafe in its name implies that the type doesn't have value semantics, if that's what you mean. If there's a natural definition of “value” for the Unsafe type, such that it obeys the rules at the top of my post, declaring the type to have value semantics lets us use it safely in more ways without documenting an impracticable number of special cases.

We could say that UnsafePointer instances don't have a value, but I don't think that matches most programmers' intuition: instead we think the pointer's value is entirely defined by the bits of the pointer. If we take that commonly accepted view, UnsafePointer meets all the requirements for value semantics and we can and should declare it so.

I don't know what you are hoping it would do for us, but being able to easily reason about the threadsafe use (and single-threaded isolation properties) of Array<UnsafePointer<Int>> is super-valuable to me.

You must be thinking of SE-0215. Only Equatable, Hashable and Comparable were included. Never can’t conform to some protocols (such as ones with an initializer requirement) so the design work to generalize conformance was left as future work.

Yes, and the function type topic above is a great example. Functions that only capture immutable values of types that conform to ValueSemantic and have a safe implementation would be perfectly reasonable “values”. They could also conform to Equatable if all captures conform to Equatable as was discussed in the thread I linked previously.

In my experience, I often want a function that is expected to behave as a value but I have no way to express that in the type system. It would be great to be able to do that. Further, I often find function types viral in making other types unable to conform to Equatable. There are cases where I would be willing to limit the functions to only having Equatable captures in order to gain an Equatable conformance on the function value.

No, I was talking about how he suggested making function types conform when all captured values conform by using an @actorSendable attribute.

Right, but assuming that doesn’t automatically happen before ValueSemantic goes in it should get a conformance so we don’t have to add it outside the standard library.

Yes, if we could represent “only capture immutable values of types that conform to ValueSemantic and have a safe implementation” (actually you need stricter constraints than that—see below) in the type system, those types could be said to “have value semantics.” But then that would be a different type from an ordinary function type that we have today.

You can do that without any language features:
/// A function of one argument that has value semantics.
///
/// The “value” of this function is observable via the semantics of
/// its `callAsFunction` method.
struct PureFunction<Argument, Result>: ValueSemantic {
  /// Creates an instance that uses `f` to implement its `callAsFunction` method.
  ///
  /// - Precondition: `f` does not mutate anything and is memory-safe.
  init(_ f: @escaping (Argument) -> Result) { self.implementation = f }
  
  /// Invokes the implementation.
  func callAsFunction(_ x: Argument) -> Result { implementation(x) }

  /// The function that implements `self`
  private let implementation: (Argument) -> Result
}
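A usage sketch, repeating the type compactly so that it stands alone. `ValueSemantic` is declared here as a hypothetical empty marker protocol, and `twice`, `sneaky`, and `calls` are illustrative names. The second half shows that the documented precondition is not checked by the compiler.

```swift
protocol ValueSemantic {}   // hypothetical marker protocol

struct PureFunction<Argument, Result>: ValueSemantic {
    init(_ f: @escaping (Argument) -> Result) { implementation = f }
    func callAsFunction(_ x: Argument) -> Result { implementation(x) }
    private let implementation: (Argument) -> Result
}

// Intended use: the closure mutates nothing.
let twice = PureFunction<Int, Int> { $0 * 2 }
let twiceCopy = twice
let pureResult = twiceCopy(21)

// Nothing stops a caller from violating the precondition.
var calls = 0
let sneaky = PureFunction<Int, Int> { x in
    calls += 1
    return x + calls
}
let sneakyCopy = sneaky
let firstCall = sneaky(0)
let secondCall = sneakyCopy(0)  // differs: the copies share `calls`
```

The wrapper documents intent, but only the caller's discipline keeps the claim of value semantics true.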

But a first-class language feature that makes a category of pure function types would be lots cleaner to use, would allow us to type check the composition of pure functions, and could handle inout.

They could also conform to Equatable if all captures conform to Equatable as was discussed in the thread I linked previously.

Oh, I think not, unless you want to say the behavior of the function is independent of its value! There's no way to compare the behaviors of two functions for equality.

In my experience, I often want a function that is expected to behave as a value but I have no way to express that in the type system. It would be great to be able to do that.

Yeah, sure, design away! That sounds great. Could get messy though; you might need to represent safe in the type system too.

Further, I often find function types viral in making other types unable to conform to Equatable. There are cases where I would be willing to limit the functions to only having Equatable captures in order to gain an Equatable conformance on the function value.

Yeah, I really don't think that's good enough. You can define distinct Equatable callAsFunction-able types that have equatable properties and don't store a value of function type. But once you get to built-in function types in Swift there's no way to control the semantics that they expose, and I don't think we'll ever be able to compare them reliably.
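As a sketch of that alternative, here is a callable type that is Equatable because it stores data rather than a function value. `AddConstant` and `Pipeline` are hypothetical names.

```swift
// The "captures" are ordinary stored properties, so Equatable can be
// synthesized and has an obvious meaning.
struct AddConstant: Equatable {
    var amount: Int
    func callAsFunction(_ x: Int) -> Int { x + amount }
}

// A type that contains the callable stays Equatable too.
struct Pipeline: Equatable {
    var name: String
    var step: AddConstant
}

let addOne = AddConstant(amount: 1)
let alsoAddOne = AddConstant(amount: 1)
let addTwo = AddConstant(amount: 2)

let result = addOne(41)
let sameStep = (addOne == alsoAddOne)
let differentStep = (addOne == addTwo)
let samePipeline = Pipeline(name: "p", step: addOne)
    == Pipeline(name: "p", step: alsoAddOne)
```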

I was talking about how he suggested making function types conform when all captured values conform by using an @actorSendable attribute.

Problematic, as noted above.

Never should conform to every protocol, or at least every one with no init or static requirements

Right, but assuming that doesn’t automatically happen before ValueSemantic goes in it should get a conformance so we don’t have to add it outside the standard library.

SGTM.

You're right. It is more precise to say that Never should conform to protocols without init and static requirements. I think that ValueSemantic is an example of this.

I'd encourage us to think of ValueSemantics in two stages: provide a model that captures the core semantics in an opt-in way, then possibly find a safe framework where such behavior can be inferred. The latter doesn't block the former, and we have some basis (structs, tuples, enums) that shows that such composition doesn't require defining properties over all operations of the derived types.

-Chris

To be explicit, I think you're suggesting:

Stage 1: Create a ValueSemantic protocol (including its documentation) and apply it to the standard library.
Stage 2: Allow ValueSemantic conformance to be inferred for tuples, structs, and enums of ValueSemantic-conforming types.

If that's what you mean, sure. I'm all for staging, although I don't see a huge advantage in this case because stage 2 is pretty trivial.
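Stage 1 could be approximated today with an ordinary marker protocol, as in the sketch below. `ValueSemantic`, `Account`, and `requireValueSemantic` are hypothetical names. The compiler does not verify the semantic requirements here; an explicit conformance like the one on `Account` is what stage 2 would infer.

```swift
// Stage 1: a documented marker protocol (hypothetical).
protocol ValueSemantic {}

// Conformances the standard library might declare.
extension Int: ValueSemantic {}
extension String: ValueSemantic {}
extension Array: ValueSemantic where Element: ValueSemantic {}
extension Optional: ValueSemantic where Wrapped: ValueSemantic {}

// Written by hand in stage 1; stage 2 would infer it because every
// stored property conforms.
struct Account: ValueSemantic {
    var id: Int
    var tags: [String]
    var nickname: String?
}

// Generic code can now require the property.
func requireValueSemantic<T: ValueSemantic>(_ value: T) -> T { value }

let account = requireValueSemantic(Account(id: 1, tags: ["a"], nickname: nil))
let numbers = requireValueSemantic([1, 2, 3])
```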

Chris was already talking about introducing something different than ordinary function types with @actorSendable.

I agree this would be great. Do you imagine that these functions would conform to ValueSemantic? Fwiw, inout would work fine with my suggestion. Only mutable captures would need to be banned.

Sure, but then you lose all of the language support for function types.

I don't see why it is a problem to say the value of the function is related to source identity. If we take the C++ approach and view closures as syntactic sugar for anonymous structs then equality becomes clear. It looks like @Joe_Groff thinks this is a reasonable idea.

What specific problems do you see with the above approach?

Unfortunately I don't have time for this exercise right now. I hope we can have ValueSemantic function values someday though.

Agree.

I also think this inference should be supported from the start. I'm hoping Chris has something more in mind related to enforcing safety.

Assuming for the sake of discussion that the annotation used to distinguish these value-semantic functions is pure

Yes, and my point is that when thinking about how nice this would be for you, don't overlook that it will introduce some type mismatches that you don't have to deal with currently if you're doing it with plain function types, e.g. you wouldn't be able to implicitly convert an ordinary function into a pure function.

Do you imagine that these functions would conform to ValueSemantic?

Yes.

Fwiw, inout would work fine with my suggestion. Only mutable captures would need to be banned.

Pretty sure that's not the whole list. Don't you need to ban access to mutable globals and calls to non-pure functions?
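A sketch of why banning mutable captures is not enough. When this runs as top-level code, `exchangeRate` is a hypothetical mutable global, so the closure reads it without capturing anything, yet its result still changes over time.

```swift
// Hypothetical mutable global (at top level this is not a capture).
var exchangeRate = 2

// No mutable local captures, but the result depends on global state.
let convert = { (amount: Int) -> Int in
    amount * exchangeRate
}

let before = convert(10)
exchangeRate = 3            // "spooky action at a distance"
let after = convert(10)
```

The same problem appears if the closure calls any other function that reads or writes such state, which is the second item in the question above.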

If Joe thinks it's reasonable maybe I'm mistaken, but I was under the impression that reabstraction thunking means that the source identity of some function values is inaccessible. And if it's useful for you to have a system where

{ (x: Int) in x } != { (x: Int) in x }

to say nothing of

{ (x: Int) in x + 1 } != { (x: Int) in 1 + x }

then I suppose Equatable conformance is okay. You have to say that the source identity of the function is notionally part of its value but otherwise invisible to the user. Although that's a semantically coherent point-of-view, it's weird, and IMO we should ask ourselves whether it's too counterintuitive for Swift before cargo-culting it from C. It seems to me that in a system like this you can use equality of functions to do some manual optimizations (e.g. “don't rebuild the cache if the computation is known not to be changing”), but not for much else(?).

I also think this inference should be supported from the start.

To be perfectly clear, I am not taking a position on that point yet.

Yes of course not, that's the whole point of having the distinction in the first place!

If we're talking about pure then yes, of course. But that wasn't my original intent here. You specifically avoided entangling pure and ValueSemantic. I was just putting together your approach with Chris's approach to ActorSendable.

That said, as long as we get pure functions someday I'm on board with waiting for those. A function type has no name that could be marked unsafe when its implementation isn't in fact safe, so I can see the argument that existing function types shouldn't conform to ValueSemantic even if they only immutably capture ValueSemantic values.

Yep, this is what I would find useful. I think the advantages would outweigh the disadvantages in practice. There are plenty of cases where the mismatch would not cause problems in practice.

Conceptually, I view functions as existentials of single-requirement protocols. I think this point of view (if followed through all the way with language support) leads to a more consistent and powerful semantics and type system. If you adopt that point of view then I don't think it's strange to have the view that different "conformances" will not be equal. If you don't like this point of view then I understand why it might seem weird.
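The "existential of a single-requirement protocol" view can be sketched by hand. This assumes Swift 5.6 or later for the `any` spelling; `IntTransform`, `isEqual(to:)`, and the three structs are hypothetical. Each struct stands in for one closure expression in source, so equality follows source identity plus captured values.

```swift
protocol IntTransform {
    func callAsFunction(_ x: Int) -> Int
    func isEqual(to other: any IntTransform) -> Bool
}

// Equal only if the other value has the same concrete type (the same
// "source identity") and equal stored state.
extension IntTransform where Self: Equatable {
    func isEqual(to other: any IntTransform) -> Bool {
        guard let other = other as? Self else { return false }
        return self == other
    }
}

// Two "closures" with identical behavior but distinct source identity.
struct IdentityA: IntTransform, Equatable {
    func callAsFunction(_ x: Int) -> Int { x }
}
struct IdentityB: IntTransform, Equatable {
    func callAsFunction(_ x: Int) -> Int { x }
}

// A "closure" with an Equatable capture.
struct Adder: IntTransform, Equatable {
    var amount: Int
    func callAsFunction(_ x: Int) -> Int { x + amount }
}

let f: any IntTransform = IdentityA()
let g: any IntTransform = IdentityB()
let sameBehavior = (f(7) == g(7))
let sameSource = f.isEqual(to: IdentityA())
let crossSource = f.isEqual(to: g)
let equalCaptures = Adder(amount: 1).isEqual(to: Adder(amount: 1))
```

Under this model `IdentityA` and `IdentityB` compare unequal despite behaving identically, which is the counterintuitive consequence raised earlier in the thread.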

Fwiw, I did not cargo-cult this from C. Function types as they currently exist virally destroy our ability to conform to Equatable in a coherent way. I view this as a serious problem.

I'm well aware that (as with pure) supporting them introduces a new type that isn't compatible with all function values. They definitely shouldn't be reached for reflexively. But as with any code working against an abstraction, there are use cases where imposing the Equatable constraint would be very useful.

Fair enough. If there are no implementation challenges I can't imagine why inference would be deferred though. What would be the purpose of deferring this convenience?
