Strict Value Semantics

DevAndArtist · January 22, 2019, 10:56pm

What do you mean by Foo.age? It seems like a typo to me. Did you mean x.age?

anandabits · January 22, 2019, 10:59pm

Yep, thanks for catching that mistake.

DaveZ · January 23, 2019, 12:20am

I can't find any precedent for protocols changing the behavior of "peer" protocols when they are a part of a composition, so I'm deeply reluctant to propose such behavior. This idea/goal feels separable.

Agreed.

What would the argument be for Array "preserving" non-strict value semantics? What problem would that solve?

Right. If your goal is a stronger functional programming model, then yes, my proposed "strict value semantics" aren't strict enough. And yes, as I outlined at the start of this thread, this proof-of-concept patch is a prerequisite for some research I'm doing into an atomicity/concurrency/reentrancy model, so if you want to describe this patch as being "data race free", then that's fair.

anandabits · January 23, 2019, 12:37am

AnyObject clarifies the semantics of other protocols when used in combination. A type that conforms to a protocol with a mutable property clearly has a reference-semantic conformance. It seems reasonable to take the same clarification of semantics in the other direction with AnyValue. This approach also seems like it would interact best with protocols that have already been designed, especially protocols where the issue of value vs reference semantics is intentionally left up to the conforming type.
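
As a minimal sketch of that last point, here is a protocol whose semantics are left up to the conforming type. The `Aged` protocol, both conforming types, and `bumpedCopy(of:)` are hypothetical names for illustration only; none of them come from the patch.

```swift
// Hypothetical protocol with a mutable property requirement.
protocol Aged {
    var age: Int { get set }
}

// Reference-semantic conformance: copies share one object.
final class Person: Aged {
    var age: Int
    init(age: Int) { self.age = age }
}

// Value-semantic conformance: copies are independent.
struct Snapshot: Aged {
    var age: Int
}

// Mutates only a local copy. Whether the caller can observe the change
// depends entirely on the conforming type.
func bumpedCopy<T: Aged>(of original: T) -> T {
    var copy = original
    copy.age += 1
    return copy
}

let person = Person(age: 30)
_ = bumpedCopy(of: person)
let personAgeAfter = person.age       // the shared object was mutated

let snapshot = Snapshot(age: 30)
_ = bumpedCopy(of: snapshot)
let snapshotAgeAfter = snapshot.age   // the original value is untouched
```

Nothing in the signature of `bumpedCopy(of:)` tells the caller which of the two behaviors applies, which is the ambiguity a constraint such as AnyObject (or the proposed AnyValue) would resolve.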

I don't really understand the question here. I said that I see Array as preserving value semantics when its Element has value semantics (using the stricter FP-like definition). If Element does not have value semantics then value semantics cannot be preserved because they don't exist in the primitive which is aggregated by an array value. So to say "preserving non-strict value semantics" doesn't fit my mental model as a sensible thing to say. The mental model I have here is analogous to rethrows, but for types / values instead of functions / effects.
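
A small sketch of that mental model in today's Swift, assuming a hypothetical `Counter` class as the reference-typed element:

```swift
// Hypothetical reference type used as an array element.
final class Counter {
    var count = 0
}

// Element has value semantics: the copied array is fully independent.
let numbers = [1, 2, 3]
var numbersCopy = numbers
numbersCopy[0] = 99

// Element has reference semantics: both arrays hold the same object,
// so a change made through one array is visible through the other.
let counters = [Counter()]
let countersCopy = counters
countersCopy[0].count += 1
```

After this runs, `numbers[0]` is unchanged while `counters[0].count` has been incremented, even though both arrays were copied in the same way.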

An FP-like model that embraces Swift's mutability model is indeed what I am most interested in. The concurrency model you are pursuing is of course also extremely valuable and could be a foundation for stricter semantics in the future.

I don't think we disagree as much as we have different motivations and therefore see slightly different things in the term "value semantics". Eventually if this moves forward the Swift community will need to choose a common definition for what "value semantics" means and perhaps use a different term for the other one of these models. FWIW, I don't have a strong opinion about how this is resolved.

DaveZ · January 23, 2019, 3:49am

I think the exceptions model (throws, etc) is both a good and bad analogy. It's a good comparison because the type system machinery is very similar; but it's a bad comparison because the implications are "backwards".

With exceptions, if a callee can throw, then the caller must deal with it or rethrow.
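
For reference, this is how that obligation looks with `rethrows` in current Swift. The names `twice`, `halve`, and `OddNumber` are made up for this sketch.

```swift
struct OddNumber: Error {}

// A callee that can throw.
func halve(_ n: Int) throws -> Int {
    guard n % 2 == 0 else { throw OddNumber() }
    return n / 2
}

// `rethrows`: this function throws only if the closure passed to it throws.
func twice<T>(_ x: T, _ f: (T) throws -> T) rethrows -> T {
    return try f(try f(x))
}

// Non-throwing closure: the caller has nothing to handle, so no `try`.
let eight = twice(2) { $0 * 2 }

// Throwing callee: the caller must deal with the error (here via `try?`).
let quarter = try? twice(12, halve)   // 12 -> 6 -> 3
let failed = try? twice(6, halve)     // 6 -> 3 -> throws
```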

With strict value semantics, the obligation flows in the opposite direction, from caller to callee. If the caller has strict value semantics, then the callee must have strict value semantics. If the opposite were true, then class methods couldn't use any value types, including integers.

Therefore, because Array never accesses members of the elements, the semantics of the elements don't matter. They can be strict value types, non-strict value types, or even reference types.

Joe_Groff · January 23, 2019, 4:00am

Right, "value semantics" is an "anti-effect"; it's really "doesn't have value semantics" that propagates like "throws", "async", or other effects would.

If you don't mind, I'd appreciate it if we could defer syntax discussions until the fundamentals are discussed first. Specifically, is this soft proposal even something the core team is open to or interested in?

You can express AnyValue with this patch like so: protocol AnyValue : !class {}

One fundamental thing IMO is this fact that @michelf pointed out—value semantics is a property of operations, not of types or protocols as a whole. As such, it doesn't really make sense to me to talk about protocols as having value semantics or not, or for AnyValue to be a type level concept—AnyValue is Any when you're in a pure value semantics function context. The individual member requirements of a protocol can individually require pureness (or not). There may be syntactic advantages to having certain defaults at the type level, but to suss out the underlying model, I think that thinking at the level of types or protocols is obscuring rather than enlightening.

Joe_Groff · January 23, 2019, 4:05am

It seems reasonable to say that things like capacity just aren't pure from the model's perspective, so they can't be used without explicitly escaping from the guarantees it provides.

DaveZ · January 23, 2019, 12:56pm

The design of the patch is mostly if not completely as you describe. I haven't tested the ability to mark individual protocol methods as having explicit reference semantics or strict value semantics, but it can work if people want that.

Personally, I think it will be hard to avoid talking and thinking about strict value semantics at the type level if structs/enums default to strict value semantics. Also, marking an entire protocol as being "!class" is convenient and less error prone.

DaveZ · January 23, 2019, 1:07pm

Right. This is also fairly precedented. Value equality does not imply identical underlying representations. Two equal "color" values might be represented as RGB and YUV respectively. Similarly, two equal strings might have different underlying encodings. The fact that two equal arrays might have different capacities is fine.
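
Both the array and the string case can be checked with the standard library as it exists today. This is only an illustrative sketch; the variable names are arbitrary.

```swift
// Two equal arrays with different capacities.
let compact = [1, 2, 3]
var roomy = [1, 2, 3]
roomy.reserveCapacity(100)
let arraysEqual = compact == roomy
let roomyCapacity = roomy.capacity

// Two equal strings with different underlying scalar sequences:
// a precomposed "é" versus "e" followed by a combining acute accent.
let precomposed = "\u{E9}"
let decomposed = "e\u{301}"
let stringsEqual = precomposed == decomposed
let scalarCounts = (precomposed.unicodeScalars.count,
                    decomposed.unicodeScalars.count)
```

In both cases `==` reports equality while an incidental detail (`capacity`, or the number of Unicode scalars) still differs between the two values.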

Also, I think we're talking about different definitions of "pure", right? GCC/clang/LLVM defines it as a "read only" attribute, but I get the sense that @anandabits means "pure" from a functional programming perspective. Am I mistaken?

anandabits · January 23, 2019, 4:34pm

They don't matter to the implementation of the array. The semantics of the elements certainly matter to the users of the array!

Maybe this gets back to definitions and goals then. I absolutely agree that the semantics of operations is crucial. But I also think the state stored by a type and how it relates to the meaning of the type matters a lot and I don't agree that it is obscuring rather than enlightening to think about this.

Some types are basically just plain old data (including pure functions!) combined with an interpretation. These types don't have an identity independent of equality. There is a clear delineation between essential and incidental parts of the state (defined by equality). Values of these types don't hold resources, don't listen and respond to messages as long as a reference to them is held, etc.

Not at all coincidentally, the types for which we would want different defaults for operations are exactly the types that have the above properties. We always say protocols are about more than just syntax, they are also about semantics. AnyValue models a very important set of semantics for a type and also provides a stricter default semantics for operations, along with a rationale for that stricter default: the type is just data with an interpretation as well as transformations of that data into other data (also with an interpretation).

Clearly defining the semantics of the state of the type itself allows us to rely on strong guarantees about the absence of effects when storing values of that type (we aren't storing references to any resources, etc).

I think it would also help in very pragmatic ways by informing programmers when they are creating a value that is not a trivial composition of other values and therefore requires a more sophisticated implementation. An error that tells you that your type isn't a strict value because it stores a reference to a UIView is a lot more helpful for most people than errors for every method saying it doesn't have value semantics. The latter may still appear in the AnyValue model, but you would also get an error message that gets closer to the root of the problem which I think has more explanatory power.

I have seen people write value types that clearly don't have the semantics they intended due to mixing of value semantics and reference semantics. When I see this happen the fundamental issue usually has to do with storing references in value types. Right now the language doesn't help people spot the potential for problems when they make these kinds of mistakes. If we had a clear line between strict value types and other value types people would be subtly encouraged to understand the semantics of their value types better.
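
A minimal sketch of the kind of mistake I mean, using hypothetical `Profile` and `Settings` types:

```swift
// Hypothetical reference type.
final class Settings {
    var theme = "light"
}

// Looks like a value type, but one of its stored properties is a reference.
struct Profile {
    var name: String
    var settings: Settings
}

let original = Profile(name: "A", settings: Settings())
var copy = original
copy.name = "B"                 // affects only the copy
copy.settings.theme = "dark"    // also changes what `original` sees
```

The author of `Profile` probably expected both assignments to leave `original` alone, but only the first one does. The compiler accepts this today without any diagnostic.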

How would the model accomplish this without distinguishing essential and incidental state?

I can't speak to the compilers but I definitely mean FP-like referential transparency. The difference from FP being that in Swift we can view things like inout and throws as not violating referential transparency and therefore allow "pure" functions to throw and to take inout parameters (including self in mutating methods). I haven't given much thought to how this might extend to future language features but it's possible there are other things that come up that are outside of the traditional FP notion of "pure" while still preserving referential transparency.
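
To make that concrete, here is a sketch of functions I would still consider "pure" in this sense. All names (`incremented`, `increment`, `Tally`, `divide`, `DivisionByZero`) are made up for illustration.

```swift
// A classically pure function.
func incremented(_ x: Int) -> Int {
    return x + 1
}

// The inout form is the same function: one value in, one value out.
func increment(_ x: inout Int) {
    x = incremented(x)
}

// A mutating method is the same idea with `self` as the inout parameter.
struct Tally {
    var value = 0
    mutating func bump() { value += 1 }
}

// A throwing function: the same inputs always give the same result
// or the same error.
struct DivisionByZero: Error {}

func divide(_ a: Int, by b: Int) throws -> Int {
    guard b != 0 else { throw DivisionByZero() }
    return a / b
}

var n = 41
increment(&n)

var tally = Tally()
tally.bump()

let quotient = try? divide(6, by: 3)
let undefined = try? divide(6, by: 0)
```

None of these read or write any state other than their parameters, so a call can be replaced by its result (or its error) without changing the program's meaning.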

DaveZ · January 23, 2019, 5:40pm

Of course, and when users of the array access the elements, they get the semantics they expect. What makes you think that users will observe surprising semantics with this patch?

Agreed. It seems to me that you want this patch to be about functional programming. It is not. This patch is about tightening up the definition of "value semantics" in Swift to the point where other features and optimizations can be enabled (in my case, a simple and efficient atomicity/concurrency/reentrancy model).

Again, I think you're projecting a functional programming perspective onto this patch. That is not my goal and therefore there is nothing wrong with a strict value type storing reference types within as long as the strict value type doesn't access the members of the reference type (a.k.a. dereferencing, calling methods, getting/setting properties, etc).
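
Roughly, in plain Swift without the patch's syntax, that looks like the sketch below. `Resource` and `Labeled` are hypothetical names, and the comments describe the rule as stated above rather than actual diagnostics from the patch.

```swift
// Hypothetical reference type.
final class Resource {
    var uses = 0
}

// A struct that stores a reference but never touches its members.
struct Labeled {
    var label: String
    let payload: Resource   // held and handed back, never dereferenced here

    // Only value-typed state is read or written.
    mutating func rename(to newLabel: String) {
        label = newLabel
    }

    // By contrast, a member that did `payload.uses += 1` would be
    // accessing the members of the reference type.
}

let first = Labeled(label: "a", payload: Resource())
var second = first
second.rename(to: "b")

let labels = (first.label, second.label)
let samePayload = first.payload === second.payload
```

The two copies share one `Resource`, but since no member of `Labeled` reads or writes through it, the operations on `Labeled` itself cannot observe that sharing.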

I think this dimension of control is solvable and separable from this proposal. And yes, it too would make value types more strict, which is a good thing. Are you okay if we defer this to another proposal/discussion? And for whatever it may be worth, it could still end up as a part of this patch, depending on how it evolves. My point is that this level of control is currently a non-goal for this patch.

Thank you for confirming. While functional programming is interesting, I think it is beyond the scope and goals of this patch.

Dave

Joe_Groff · January 23, 2019, 5:43pm

I guess from my perspective that policy-level stuff is primarily "syntax" rather than the underlying model. If the syntax policy ends up adding complexity by introducing new modifiers, new magic protocols, etc., though, I'd be worried about whether it's the correct policy.

Joe_Groff · January 23, 2019, 5:45pm

I've been using "pure" interchangeably with "value semantics" because they're really the same thing. Value semantics as Swift presents it is pretty much exactly referential transparency, the primary difference from classical functional purity being inout and the ability to perform locally scoped mutations in an isolated and still ultimately referentially transparent way.

DaveZ · January 23, 2019, 5:54pm

Thanks Joe. For whatever it may be worth, I don't feel well versed in what counts as FP or not (and I'm more of a bottom-up programmer), so I didn't feel comfortable using the word "pure" outside of how GCC/clang/LLVM defines it. Also, as I wrote earlier in this thread, FP isn't the goal of this patch, so I'm trying to manage expectations accordingly.

Joe_Groff · January 23, 2019, 5:56pm

That's fair. I know the word pure has been heavily overloaded in the C world. I think for at least a large contingent of Swift users, many of whom come from functional backgrounds, what they would expect when they hear the word "pure" is closer to what we've been talking about as "value semantics" than what GCC or LLVM means by the term.

anandabits · January 23, 2019, 6:12pm

I'm not saying that. It's a difference in perspective and goals (which are not incompatible).

From my point of view, [Int] and [String] are fundamentally different than [UIViewController]. The former are just data where the latter is a collection of resources (that may be performing side effects as long as they live). All you need to consider when storing the former is the space the representation uses. There is a lot more to consider when holding a strong reference to a resource. Copies of strict values are therefore much simpler to reason about.

From the point of view of value semantics of the operations of the array (using your definition of value semantics) there is no difference at all.

I don't mean to come across as trying to tell you what to do. That is certainly not my intent. I am trying to articulate semantics I would like to see in Swift. This is related to and perfectly compatible with the semantics in your patch.

Again, I am only articulating where I would like to see value semantics in Swift go. Your patch is a strong step in that direction so I am very supportive of and appreciative of the work you are doing. However, I think it would be unfortunate to adopt this definition of "strict value". This is not how I think of "strict values" and I think many in the community would also find it an uncomfortable definition of "strict value".

Absolutely! I just want to be able to see the big picture and see space in it for the semantics I am interested in before we push forward. As the author of your patch you have every right to define its scope!

DaveZ · January 23, 2019, 7:30pm

I agree that from an FP perspective, the above are different. That being said, FP has never been applicable to the kinds of problems that interest me, so I fail to appreciate the difference like a functional programmer does. Sorry.

In any case, I think a more disciplined value type that can hold (but not access) reference types is useful and a trend in the right direction. What do functional programmers call such a value type?

If the FP community would like to own the definition of "strict value semantics", that's fine with me. What then would the FP community call this feature/patch?

anandabits · January 24, 2019, 3:28pm

I am not an expert at Haskell (I can read it reasonably well but have never written anything significant in it), but as far as I know, there is nothing that directly corresponds to what we know as reference types in Swift so I don't think there is a name for this. The closest I can think of is IORef which requires all interaction with the reference to go through the IO monad (Data.IORef).

I don't know of any hybrid languages with a compiler-enforced pure subset while also supporting imperative styles including low-level C-like features and higher-level OO features. I am hoping Swift becomes such a language. One of the challenges in designing in this direction is that you can end up in a middle ground, with a "value type that can hold (but not access) reference types", for which there is no established vocabulary.

As far as I can tell, all "strict value" means in your patch is "a type whose members must have strict value semantics by default". I don't have a specific suggestion to offer, but something in the direction of concurrency safety seems more appropriate to me. I don't know for sure, but maybe Rust has some relevant terminology in this area.

When I hear the term "strict value" I think of a type that has the same general properties as Int - it is just data with a specified interpretation. In my view, a "strict value" is independent, therefore having no semantic entanglements with the rest of the system. The implementation might be quite sophisticated, including using CoW buffers, etc, but the semantic properties of the type remain simple to reason about.
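
As a sketch of what I mean by a sophisticated implementation with simple semantics, here is a minimal copy-on-write wrapper. `Buffer` and `IntBag` are hypothetical names; `isKnownUniquelyReferenced` is the existing standard library function.

```swift
// Hypothetical heap storage shared between copies until one of them mutates.
final class Buffer {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

struct IntBag {
    private var buffer: Buffer

    init(_ values: [Int]) {
        buffer = Buffer(values)
    }

    var values: [Int] {
        return buffer.values
    }

    mutating func append(_ x: Int) {
        // Copy the storage first if another IntBag still shares it.
        if !isKnownUniquelyReferenced(&buffer) {
            buffer = Buffer(buffer.values)
        }
        buffer.values.append(x)
    }
}

var bagA = IntBag([1, 2])
var bagB = bagA      // shares storage with bagA for now
bagB.append(3)       // triggers the copy; bagA is unaffected
```

The type stores a reference internally, yet from the outside it behaves like plain data: mutating one copy is never visible through another.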

DaveZ · January 24, 2019, 4:12pm

If "pure" means not holding references at the language level, then I think I can implement that. The standard library will still have privileged access to raw memory, but if that counts as "pure" as @Joe_Groff suggests, then yes, I think "pure" is within reach.

What do people think about "@semipure" for what this patch does today? Or "@pure(access)"? This would make "@impure" the natural attribute for disabling purity when a type as a whole defaults to being "pure" or "semipure". Unfortunately, the problem with the word "impure" is that it implies moral judgement.

nuclearace · January 24, 2019, 4:17pm

I much prefer this kind of annotation. It has parallels with other attributes such as private(set) or @_effects(releasenone).