[Second review] SE-0395: Observability

John_McCall · June 28, 2023, 1:41am

Making the observation tracking data copy-on-write for value types seems better than making it truly shared, but it still doesn't seem right.

I don't think I can explain what I see as the right thing here without a little bit of theory. I think folks might be confusing values and locations. These are both basic formal concepts of languages, tied deeply into the language semantics. They are defined slightly differently by different languages, and talking about them in the abstract can get pretty circular. Let me try to provide a quick intuition of what we mean by this in Swift.

A value is what you can return from a call, pass as a (non-inout) argument, and so on. Ignoring reference types for a second, you can talk about values independently of concepts like memory. Fundamental types can be thought of as fundamental values, like particular integers and strings, and structs can be broken down recursively into the component values they store in their stored properties. For example, I might say that a particular value is Ball(diameter: .03, color: Color.orange). Here I've written the value as if I were calling a memberwise initializer with all the values of the stored properties; this works to denote the value even if I didn't actually build it that way, or even if my type doesn't actually have a memberwise initializer.
A location is part of the memory of the abstract machine. Every location has a type, and it stores a value of its type. For example, when you declare a mutable local variable, a new location is created dynamically when that variable comes into scope, and it is destroyed when the variable goes out of scope (and all the captures of it go away). Creating a location of a struct type means creating locations for all the stored properties of that struct.
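A minimal sketch of the distinction, assuming hypothetical `Ball` and `Color` types modeled on the example above (the function and variable names are illustrative only). The same value is stored in two separate locations, and changing what is stored in one location leaves the other alone:

```swift
// Hypothetical types for illustration; they echo the Ball example above.
enum Color { case orange, blue }

struct Ball: Equatable {
    var diameter: Double
    var color: Color
}

// A value is something a call can return or receive as an argument.
func makeBall() -> Ball {
    Ball(diameter: 0.03, color: .orange)
}

// Each `var` declaration creates a new location that stores a Ball value.
var first = makeBall()
var second = makeBall()

// Two distinct locations, currently storing equal values.
let sameValueInitially = (first == second)

// Assigning through `second` changes the value stored in that location only.
second.diameter = 0.05
let sameValueAfterward = (first == second)
```

Here `sameValueInitially` is true and `sameValueAfterward` is false: the value `Ball(diameter: 0.03, color: .orange)` never changed, only the contents of the location named `second` did.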

A value of class type is a reference to an object. When a class object is created, it includes locations for each of the stored properties of the class. When you copy the value around, it's still a reference to the same object, giving access to those same abstract locations. Now, we say that class objects have a notion of identity, and we expose that identity in the language through e.g. the === operator. But even if we didn't have that, class objects would have some formal measure of identity innately through the semantics of the locations of their stored properties, because the independent mutability of locations is itself a kind of identity.
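As a rough illustration, assuming a hypothetical `Counter` class, copying a class value copies the reference, so both copies reach the same stored-property locations:

```swift
// Hypothetical class for illustration.
final class Counter {
    var count = 0
}

let a = Counter()
let b = a          // copies the reference; both refer to the same object
let c = Counter()  // a different object with its own `count` location

// Mutates the `count` location inside the object shared by `a` and `b`.
b.count += 1

let aSeesChange = (a.count == 1)   // same location reached through `a`
let cUnaffected = (c.count == 0)   // independent location in another object
let sameObject = (a === b)         // identity exposed by the language
let differentObject = !(a === c)
```

Even without `===`, the fact that `b.count += 1` is visible through `a` but not through `c` already distinguishes the two objects.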

Mutation in Swift is all about location. For example, the name of a local variable is tied to the location that is created dynamically for the current entry into the variable's scope; if you evaluate an expression that mutates that variable, you're changing the value stored in that location. Crucially, you do not change values stored in other locations, even if the value in this location was copied from them or vice-versa. People would be surprised if it worked any other way.
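A small sketch of that rule, using a hypothetical `Point` struct and `moveRight` function. An `inout` parameter gives the callee access to the caller's location, while a plain copy creates an unrelated one:

```swift
// Hypothetical struct for illustration.
struct Point {
    var x = 0
    var y = 0
}

// `inout` gives the function access to the caller's location.
func moveRight(_ point: inout Point) {
    point.x += 1
}

var origin = Point()
var duplicate = origin   // the value is copied into a new location

duplicate.x = 10         // changes only the location `duplicate`
moveRight(&origin)       // changes only the location `origin`

let originX = origin.x
let duplicateX = duplicate.x
```

After this runs, `originX` is 1 and `duplicateX` is 10; neither mutation leaks into the other location, even though one value was copied from the other.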

I think observation needs to work the same way. It makes sense to observe a mutable location, but what that means is that you're interested in changes to the value stored in that location, not somehow in changes to the value itself. Values cannot change! Only the value stored in a location can change. And this goes deep into the basic mechanics of values. If you copy a value out of one location and into another, where you then mutate it, observers of the original location should not be notified about those changes. If you replace the value in one location with a totally different value, any existing observers of that location should of course be notified about that change, as well as any subsequent changes to that location. The observers of a location are extrinsic to the value actually stored there.
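Swift's existing property observers already behave this way, so they make a convenient stand-in. The sketch below is not the SE-0395 API; it uses `didSet` on a variable, with a hypothetical `Marble` struct and a `notifications` log, purely to illustrate observers that are attached to a location rather than to the value stored there:

```swift
// Hypothetical struct for illustration.
struct Marble {
    var diameter: Double
}

// Log of what the observer saw, in order.
var notifications: [Double] = []

// The observer belongs to the location `tracked`, not to the Marble value.
var tracked = Marble(diameter: 0.03) {
    didSet { notifications.append(tracked.diameter) }
}

// Copy the value out into another location and mutate it there.
var elsewhere = tracked
elsewhere.diameter = 0.10          // no notification: different location

tracked.diameter = 0.05            // notifies: the observed location changed
tracked = Marble(diameter: 0.20)   // whole value replaced: still notifies
tracked.diameter = 0.25            // later changes to the location notify too
```

The log ends up as `[0.05, 0.20, 0.25]`. The mutation of `elsewhere` is never reported, and replacing the value in `tracked` does not detach the observer.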

Sometimes we think about structs as if they were just classes where everything was allocated inline, or vice-versa. And sometimes that's perfectly reasonable. But when we talk about extending observability from classes to value types, that intuition does not serve us well. Abstraction over class values naturally preserves locations within the object, so it makes sense to set up observation of a class object (or its properties) by passing around the class value. Abstraction over value types simply does not work like that: there is nothing in Swift right now that you can add to the value of a value type that will make it naturally track this abstract concept of location. Perhaps we could change that, but I don't think it's the right thing to do. Tracking semantic location for arbitrary values would be hard; there's a lot of subtle complexity there that we've intentionally not made part of Swift's basic language model.

For example, in many common situations, this abstract concept of location aligns with a physical storage address at runtime. This is because, when the language implementation allocates an abstract location into a particular place in memory, it usually doesn't have a good reason to move it. One might be tempted, then, to implement location-specific information like value-type observers by collecting it normally in the value but making value copies and relocations drop it. But there are some reasons why abstract locations need to be relocatable in memory, especially in the library. For example, the elements of an Array are naturally locations that are identified by their index. If you add more elements to an Array, the array storage may need to be reallocated, and so existing elements will need to be relocated. This kind of relocation must not reset observers because the abstract location remains the same. And the reverse is also a problem. For example, if you consume a value from one variable and assign it to another, the value is semantically in a new location, and so information like observers should be reset. However, we'd really like the Swift optimizer to be smart enough to avoid relocating the value if it doesn't have to, which would be a problem if that's the only way to trigger that reset. Copy-on-write collections can exhibit both of these problems, because location-specific information will naturally be preserved if the collection is copied without copying the buffer, and then the inheritor of the buffer can be unpredictable. In brief, the design of Swift is not meant to support this kind of location-specific tracking of values; it is not part of the language model.
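The Array cases can be sketched at the language level. This example assumes nothing beyond the standard `Array` API and deliberately does not inspect physical addresses, since whether and where storage moves is an implementation detail; the variable names are illustrative:

```swift
var scores = [1, 2, 3]
let capacityBefore = scores.capacity

// Append more elements than the original capacity can hold, so the
// array must move to larger storage at some point during the loop.
for n in 0...capacityBefore {
    scores.append(n)
}
let storageGrew = scores.capacity > capacityBefore

// Index 0 still names the same abstract location after the growth.
scores[0] += 10
let firstScore = scores[0]

// Copying the array may share the buffer until one side mutates,
// but the two variables are semantically separate locations.
var snapshot = scores
snapshot[0] = 99
let originalUntouched = (scores[0] == 11)
let snapshotChanged = (snapshot[0] == 99)
```

Semantically, `scores[0]` is one location throughout, even though its bytes were very likely relocated, and `snapshot[0]` is a different location even while the two arrays shared a buffer. Any scheme that keys observers on physical storage would get both cases wrong.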

As a result, I'm very skeptical that we should try to include observability for value types in this release. It seems to me that there are deep conceptual problems that need to be investigated before we can make any progress here. We should leave room for it in the ABI, as well as for observing objects that don't conform to a Swift AnyObject constraint (which future move-only classes may not); in particular, we should make sure that the Observable protocol doesn't have unnecessary constraints about the kind of value that can be observed. But I think it's perfectly acceptable to not make the macro handle types that will semantically misbehave under observation unless they're used in a way that very carefully never disturbs that innate sense of location-identity. We know how to make classes work well, and we should be content with that for now.