[Prospective Vision] Accessors

Hello, Swift Community.

The Language Steering Group would like to gather feedback on a prospective vision for accessors in Swift. Vision documents help describe an overall direction for Swift. The actual Swift changes for executing on the vision will come as a series of separate proposals, so concrete details (e.g., specific syntax, API names, etc.) are less important than the overall direction. There is more information about the role of vision documents in the evolution process here.

The text of the introduction of the vision follows. The vision is quite long, so if you find it interesting, please follow the link above to read more.

A Prospective Vision for Accessors in Swift

Swift properties and subscripts can be implemented by providing one or more "accessors" that retrieve or update the value. Most Swift developers are familiar with the get and set accessors that are used to define computed properties:

struct Foo {
  var value: Int {
    get { ... provide a value ... }
    set { ... accept a value ... }
  }
}

In this case, the get accessor behaves just like a nonmutating method that returns a value of the property's type, while the set accessor behaves just like a mutating method that receives a value of the property's type as an argument.

The get and set accessors are ideal for implementing operations that copy the current value of the property:

let copy = myFoo.value // calls the get accessor for Foo.value

or that overwrite the current value of the property:

myFoo.value = 51 // calls the set accessor for Foo.value

Other kinds of operations can also be compiled in terms of get and set. For example, if you pass a computed property as an inout argument:

myFoo.value += 10

Swift will use get to initialize a temporary variable, pass that variable as the argument, and then write the new value back with set:

var tmp = myFoo.value // calls the get accessor for Foo.value
tmp += 10
myFoo.value = tmp // calls the set accessor for Foo.value
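The expansion above can be observed directly. This is a minimal sketch, assuming a hypothetical AccessLog helper class and a private storage property that are not part of the vision; they exist only to record which accessors run.

```swift
// Hypothetical helper that records which accessors run.
final class AccessLog {
    var events: [String] = []
}

struct Foo {
    let log = AccessLog()
    private var storage = 41

    var value: Int {
        get {
            log.events.append("get")
            return storage
        }
        set {
            log.events.append("set")
            storage = newValue
        }
    }
}

var myFoo = Foo()
myFoo.value += 10          // read-modify-write through get and set
print(myFoo.log.events)    // ["get", "set"]
```

The single += statement triggers one get followed by one set, matching the temporary-variable expansion shown above.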

However, this approach has significant problems. The biggest is that the get accessor has to return an independent value. If the accessor is just returning a value stored in memory, which is very common for data structures, this means the value has to be copied. This is unfortunate on three levels:

  1. It adds the runtime performance and memory overhead of copying the inline representation of the value. For example, if the value is an Array, the internal buffer of the array must be retained.
  2. It can make subsequent uses of the value less efficient. For example, if the value uses a copy-on-write representation like Array and String do, mutating a copy is likely to be dramatically less efficient than mutating a variable in place. (We will explain this in more detail later.)
  3. It requires the value to be copyable at all, and so it inherently cannot work for values of non-Copyable type.
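The second point can be sketched with a hand-written copy-on-write type. This is an illustration only: Storage, IntList, and bufferCopies are hypothetical names, and the real Array implementation is more involved. The sketch counts how often a mutation has to clone the shared buffer first.

```swift
// Hypothetical reference-counted buffer standing in for Array's internal storage.
final class Storage {
    var elements: [Int]
    init(_ elements: [Int]) { self.elements = elements }
}

// Minimal copy-on-write wrapper that counts how often it had to clone its buffer.
struct IntList {
    private var storage = Storage([])
    private(set) var bufferCopies = 0

    var count: Int { storage.elements.count }

    mutating func append(_ element: Int) {
        // Clone the buffer only if another value still shares it.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(storage.elements)
            bufferCopies += 1
        }
        storage.elements.append(element)
    }
}

var original = IntList()
original.append(1)             // unique buffer: mutated in place

var copy = original            // cheap: only retains the shared buffer
copy.append(2)                 // shared buffer: must be cloned first

print(original.bufferCopies, copy.bufferCopies)   // 0 1
print(original.count, copy.count)                 // 1 2
```

A get accessor that returns a stored copy-on-write value hands the caller a value in the same position as copy here: the original storage still holds a reference, so the first mutation of the returned value pays for a full buffer clone.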

These problems are amplified when properties and subscripts need to be abstracted over. When Swift knows exactly how a declaration is implemented, the compiler can access it in the best way possible given the implementation. For example, if Swift can see that a property is stored, it can emit code to directly access that memory instead of calling an accessor. However, when Swift doesn't know how the declaration is implemented, it must call some kind of accessor instead, and so it is limited by the capabilities of that accessor.

This kind of abstraction is necessary in several common situations:

  • when the declaration is being accessed through a protocol requirement,
  • when the declaration is a non-final member of a class, or
  • when the declaration is from a different library that's been built with library evolution enabled.

For example, suppose we have this code:

struct Person: Nameable {
  var name: String
}

protocol Nameable {
  var name: String { get }
}

func printNameConcretely(_ person: Person) {
  print(person.name)
}

func printNameGenerically(_ person: any Nameable) {
  print(person.name)
}

In printNameConcretely, Swift knows that name is a stored property of Person, and it can just load that value directly from person and pass it to print. In printNameGenerically, Swift does not know how name is implemented, and it must call a get accessor to copy the current value of the name. To avoid those costs, the Swift optimizer would have to specialize this function for the specific type that is being passed in; this is something that Swift can and does do, but only as a best-effort optimization, which is not always good enough. And, of course, this code would be ill-formed if String were a non-Copyable type, because the only way to satisfy a get requirement for a stored property is to copy the current value.
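To see why the generic function cannot assume a stored property, consider a second conformer. In this sketch, Robot and nameGenerically are hypothetical additions; Person and Nameable are repeated from above so the snippet is self-contained.

```swift
protocol Nameable {
    var name: String { get }
}

struct Person: Nameable {
    var name: String                        // stored property
}

// Hypothetical second conformer whose name is computed on demand.
struct Robot: Nameable {
    var serial: Int
    var name: String { "unit-\(serial)" }   // computed property
}

func nameGenerically(_ value: any Nameable) -> String {
    // The caller cannot tell which implementation it has,
    // so the access goes through the get requirement.
    value.name
}

print(nameGenerically(Person(name: "Ada")))   // Ada
print(nameGenerically(Robot(serial: 7)))      // unit-7
```

Both conformers satisfy the same get requirement, so unless the optimizer specializes the function, every access has to go through an accessor that returns an independent value.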

As a result, Swift has explored a variety of other accessors throughout its history, none of which have ever been officially added to the language through the Swift Evolution process. (The observing accessors, willSet and didSet, are officially in the language but are arguably in a different category because they don't serve as complete operations.) Many of these have been adopted in the standard library for years, but we've been reluctant to make them official because they are variously incomplete, unsafe, or complex.

This vision document lays out the design space of accessors for the next few years, as Swift continues to advance its support for non-Copyable and non-Escapable types. It explains Swift's basic access model and how it may need to evolve. It explores what developers need from accessors in these advanced situations. Finally, it discusses different kinds of accessors, both existing and under consideration, and how they do or do not fit into the future of the language as we see it.

This is a prospective vision which has not yet been reviewed by the Language Steering Group. Even if it is approved by the Language Steering Group in exactly this form, it is merely laying out a high-level vision for the language design and does not constitute pre-approval of any specific ideas in this document. Everything in this document will need to be separately proposed and reviewed under the normal Swift Evolution process before it is part of the Swift language.

39 Likes

It's nice to see this all in one place. I do feel like it could spend some more time on both library evolution and ABI compatibility concerns, e.g.:

  • A borrow accessor cannot be later changed to a yield accessor in a source-compatible way (I think?) or an ABI-compatible way (this one I'm pretty certain about).
  • A mutating set cannot be later changed to a nonmutating set in an ABI-compatible way by default (I think?), though if it were important there could be an attribute for it.
  • Existing code in Apple's libraries provides borrowing get, mutating set, and mutating _modify unless otherwise indicated; how does this restrict the options going forwards?
  • What are the code size implications for which accessors should and shouldn't get synthesized, particularly for libraries with binary compatibility concerns?

…because I think valuing source compatibility alongside stored/computed property flexibility may affect the ultimate recommendations for library authors, as will the limitations of the existing Swift 5 ABI on Apple platforms.

6 Likes

While I understand it mostly focuses on the lower-level aspects of accessors (in particular to support ~Copyable and ~Escapable types), I think it could add a note about async, throws, and KeyPath extensions to support these new accessors.

As for the general approach - personally, I am still hoping that we will add generalised coroutines to the language. All of these features are nice, but they are only usable with computed properties and subscripts, so if I need to take a parameter, my only option is a subscript and I lose the ability to define a base name.

I encountered this issue with my WebURL library. I wanted to offer users a mutable view over a string of key-value pairs, in a URL component and using a schema of their choosing:

// My library is forced to look like this:
url[keyValuePairsIn: .fragment, schema: .formEncoded].append("new key", "new value")

// I want it to look like this:
url.keyValuePairs(in: .fragment, schema: .formEncoded).append("new key", "new value")

I also think generator functions are just plain better than the iterator model we currently use for sequences in Swift. They are much, much easier to write - because the next() function doesn't return and lose all of its context between elements (instead it yields and keeps all of its state), you don't need to spend a bunch of effort capturing that context in state machines and instance properties in your iterator. That includes not having to store the borrow of the base sequence - that borrow can be passed in as a parameter, and the generator can suspend while keeping the borrow.
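As a small illustration of the state-capturing burden in today's iterator model, here is a sketch of an iterator that produces every element of a base iterator twice. RepeatedTwice and its pending property are hypothetical names chosen for this example. Because next() returns between elements, the "second copy still owed" state has to live in a stored property instead of a local variable.

```swift
// Hypothetical iterator adapter: yields each base element twice.
struct RepeatedTwice<Base: IteratorProtocol>: IteratorProtocol {
    var base: Base
    // Hand-written state: the element whose second copy is still owed.
    var pending: Base.Element? = nil

    mutating func next() -> Base.Element? {
        if let element = pending {
            pending = nil
            return element
        }
        guard let element = base.next() else { return nil }
        pending = element
        return element
    }
}

var iterator = RepeatedTwice(base: [1, 2].makeIterator())
var output: [Int] = []
while let element = iterator.next() {
    output.append(element)
}
print(output)   // [1, 1, 2, 2]
```

With a generator-style function, the same logic could in principle be a loop that yields twice per element, with no explicit pending property.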

I think coroutines have the potential to be a transformative feature for this language, so I would really like to know more about our overall plans for them before judging which coroutine-driven access patterns need specific language support and how.

17 Likes

I said it in the other thread too, but my general impression is that this

  • is unnecessarily complex
  • does a lot of "damage" to the language

Complexity

When I look at these use-cases, I'm largely comparing to Rust, which has the same constraints (non-escapable references and non-copyable types) to work with. Rust doesn't have properties, but generally provides access to state via fn xxx(&'a self) -> &'a T (roughly, "borrow" in this vision) and fn xxx_mut(&'a mut self) -> &'a mut T (roughly, "mutate" in this vision). Combined, these cover the vast majority of use cases — it's very rare in Rust to see other ways of accessing state.

In cases where these two things don't work, Rust generally simply returns another — nonescaping, noncopyable — object, which might (or might not) implement Deref to allow use as a "smart pointer". This object is then free to include cleanup logic (the tail of the coroutine, in this vision) in its drop (deinit). HashMap access works this way — looking up by a key vends an Entry, which can be used to access, insert, or replace in that position. It's not a perfectly ergonomic API, but it does reflect the realities of interacting with a map.
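The "return an object whose deinit runs the cleanup" pattern can already be approximated in Swift with a noncopyable struct. This is only a sketch under assumptions: Journal, Transaction, and runTransaction are hypothetical names, and it shows just the cleanup-in-deinit part, not a Deref-like smart pointer or a non-escapable reference.

```swift
// Hypothetical shared log so the order of events is observable.
final class Journal {
    var lines: [String] = []
}

// Noncopyable value whose deinit plays the role of the coroutine's tail.
struct Transaction: ~Copyable {
    let journal: Journal

    init(journal: Journal) {
        self.journal = journal
        journal.lines.append("begin")
    }

    func write(_ line: String) {
        journal.lines.append(line)
    }

    deinit {
        journal.lines.append("end")   // cleanup runs when the value is destroyed
    }
}

func runTransaction(on journal: Journal) {
    let transaction = Transaction(journal: journal)
    transaction.write("update")
    // transaction is destroyed before this function returns
}

let journal = Journal()
runTransaction(on: journal)
print(journal.lines)   // ["begin", "update", "end"]
```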

I think, if Swift had the concept of first-class non-escaping references (and we're so close: Span is it, except that it's a range instead of a single element), we wouldn't need more accessors than just "give me an immutable nonescaping reference" and "give me a mutable nonescaping reference". It might be useful/ergonomic to steal Deref too.

Damage

Currently, if a protocol needs to read a value, it specifies the requirement as get, or to write, as set. Obviously, these don't currently work for noncopyable or nonescaping types, which is why we're here! But in a world with six different accessors, what can a protocol safely require? get/set are no longer the most general, but borrow/mutate won't work in all cases either; nor will yield/yield inout. It kind of drives a wedge into the ecosystem, where some packages will (reasonably!) optimize for performance, others will (reasonably!) optimize for maximum compatibility, and still others will (reasonably!) optimize for ease-of-use in common cases, and no amount of "best practice" will cover all cases, or make those packages interoperable.

Even the sheer amount of knowledge about which combinations of noncopyable and nonescapable work with which accessors, which accessors can witness which other accessors, etc. is going to render this difficult for experts, and impenetrable to beginners. And beginners are going to see it, when they command-click through to the Dictionary subscript to find not set but yield inout.

What, then?

I agree, get/set in their current forms don't work. But I think that this six-pronged approach throws Swift's baby out with the bathwater. It's an "easy" solution — provide every feature anyone ever wanted — but it doesn't take into account the complexity and divisions that it leaves behind.

In an ideal world, we would

  • keep get/set as the only accessor keywords, to avoid dividing the ecosystem
  • allow noncopyable/nonescaping types to participate — though I'm unconvinced this is strictly necessary. If certain ergonomic APIs are only available to Copyable types, and "weirder" types have to use "weirder" APIs, that's not the end of the world, surely?
  • have the compiler perform whatever heavy lifting / accessor style matrixing / etc. is required to have these things interoperate.

I'd at least like to see a discussion of why we can't get any closer to this than the six accessors proposed…

In the other thread I made a proposal, which I still think is interesting, to add only two of these four (new) accessors: [Pitch] Modify and read accessors - #40 by KeithBauerANZ

But what if we just… didn't add any of these? We're already trying to say that noncopyable and nonescaping are in some way "advanced" features of the language, why should we demand that these be able to be exposed through properties at all, given the complexity of doing so? Would it be the end of the world if Array looked something like:

subscript(_ index: Int) -> Element where Element: Copyable { ... }

borrowing func borrow(_ index: Int) -> Ref<Element> { ... }
borrowing mutating func mutate(_ index: Int) -> MutableRef<Element> { ... }

Copyable? life as normal. ~Copyable? slightly less ergonomic, but still usable.

or if Dictionary looked something like

subscript(_ key: Key) -> Value? where Key: Copyable, Value: Copyable { ... }

borrowing func borrow(_ key: borrowing Key) -> Ref<Value>? { ... }
borrowing mutating func entry(_ key: consuming Key) -> Entry { ... }

struct Entry: ~Escapable {
    mutating func orInsert(_ value: consuming Value) -> MutableRef<Value> { ... }
    mutating func replace(with value: consuming Value)
}

Up to you if Dictionary of Copyables gets to keep using the private coroutine setter, or if it gets deprecated in favor of something somebody outside the stdlib can implement. ~Copyable? Jump through the hoops to make that work.

Or maybe it's all fine

Having spent an hour writing all this, I'm now arguing myself around to "maybe this vision provides this anyway" — everyone should still standardize on get/set, these "weird" accessors are only for weird edge-cases anyway, that most code should never need to interact with. The docs for Array and Dictionary can lie to beginners and claim { get set } if they need to.

18 Likes

My first thought is that it would make a nicer model if this could be "simplified" to just get/set/inout. Then add modifiers in the same vein as nonmutating to change the behavior. For instance, yielding get (or some other name) could be equivalent to today's _read.

The reason is to give some guidance: you can provide at most these three. And then for each one you can figure out what the appropriate modifiers are when the default behavior isn't the right one.

Even pointer accessors could be modifiers such as unsafeAddress get and unsafeAddress inout.

8 Likes

How do willSet and didSet fit into this? Are they called for any modification, or only specifically set?

I agree with the general vision here. I think that each of the proposed accessors here fulfills a dedicated and helpful role.

I think the naming could be simplified here — having six accessor names is a lot of complexity. Perhaps the coroutine names should be yield borrow and yield mutate, so that it's easier to remember?

I think it's important that we clarify how these accessors work with protocols, though I think it'll probably be fine if that's done in a proposal rather than the vision document.

1 Like

The more I think about it, the more problematic two-identifier accessor names seem; it probably needs to be yieldInout and so on. I don’t know if that changes your feelings.

(I’ll respond to the other posts here, just taking a few days right now.)

1 Like

Could you elaborate on how two-identifier accessor names would be problematic? Is this a parsing or source-breaking concern?

I do think we should encourage people to use the coroutine versions of the borrowing read and modification accessors over the routine ones, so giving them a longer name isn't ideal. We could make borrow and modify default to specifying a coroutine, making it so that the routine versions would have to be specified as nonyielding borrow and nonyielding modify.

1 Like

It is reasonable to ask whether Swift could categorize all read accesses into these two kinds based on context, the same way that it distinguishes reads from assignments. This would be straightforward at a technical level, but it is controversial as a design direction because the most obvious definitions would be very aggressive about borrowing values. This could cause surprising semantic problems for Swift programmers, especially those working extensively with classes, and it could break the behavior of existing code. This remains an open question that this vision does not take a stand on.

Does this mean it’s out of scope for this discussion?

You're absolutely right that there should be a section on source/binary compatibility.

It really comes down to what guarantees we want to make based on how we can see that a storage declaration is currently implemented. Non-frozen properties are currently always accessed via get/set/_modify accessors across a stable binary interface, regardless of implementation. (This is impossible when the value type is non-Copyable, so the story gets more complicated, which is already a subtle stability issue that should be documented in a centralized place!) Naturally, I would guess that we'd want a similar rule for things implemented with borrow or mutate, unless they're frozen.

Lifetimes do make this also a source-stability issue: e.g. if we take advantage of the stronger lifetime guarantee of a borrow accessor, we prevent the declaration from using an implementation that only satisfies a weaker guarantee in the future. Note that this is a basic issue with lifetimes and isn't really specific to accessors: stored properties make similarly strong lifetime guarantees, which (if taken advantage of in client modules) would prevent the property from becoming computed in the future. To me, this is part of a broader issue where Swift needs real controls for source stability; we've been coasting for years without them. That's mostly been okay because most language features have a clear interface/implementation dichotomy; for example, just changing a function body can't directly break callers to the function.[1] There have been exceptions to that from the very first, however, like the way that knowing the full set of cases affects enum exhaustiveness.

Here, we have a few different options for source stability:

  1. Always limit the lifetime guarantee across module boundaries, making it impossible to take optimal advantage of lifetimes even for declarations that will never change.
  2. Never limit the guarantee across module boundaries, creating a somewhat nightmarish compatibility problem for library authors.
  3. Only limit lifetime guarantees for non-frozen declarations in binary-stable modules, like we did with enum exhaustiveness.
  4. Add language features to let declarations control their cross-library guarantees.

(1) does not seem acceptable: libraries should be able to make stronger lifetime guarantees that their clients can use. (2) breaks the ability of a binary-stable library to turn a stored property into a computed one. (3) narrowly solves that problem for binary libraries, but leaves it around for source libraries; the LSG has no appetite for adding more rules like this.[2] So really, I think the only choice is (4). Fortunately, I think the existing features for binary stability work perfectly well here: you can make a property frozen to guarantee that its stored properties won't change, and you can make a computed property @inlinable to guarantee that its implementation won't change. In both cases, that would be enough to make the same lifetime guarantees we could make within a module. And we could loosen that rule within packages, so that co-developed modules could still make those optimal assumptions.
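For reference, the two existing attributes mentioned above look like this in source today. This sketch only shows current syntax: Vector2 and lengthSquared are hypothetical names, and the stronger cross-module lifetime guarantees discussed here are a possible future rule, not something current compilers derive from these attributes.

```swift
// @frozen promises that the stored properties of this type will not change.
@frozen
public struct Vector2 {
    public var x: Double
    public var y: Double

    // @inlinable exposes the body to clients, promising the implementation is stable.
    @inlinable
    public var lengthSquared: Double {
        x * x + y * y
    }
}

let vector = Vector2(x: 3, y: 4)
print(vector.lengthSquared)   // 25.0
```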


  1. This would not be true if Swift had language features that relied on interprocedural analysis, but we've always resisted that — partly to avoid breaking this dichotomy and thus opening Pandora's box of source-compatibility problems, but mostly to protect compile times. ↩︎

  2. I think it would also be a disaster if we tried. The enum problem is already a serious issue for library authors; this would be exponentially worse. ↩︎

3 Likes

I can add a short section about async/throws effects in accessors, sure.

KeyPaths are interesting. I hadn't thought about them at all, honestly. The KeyPath runtime today does support get/set/_read/_modify semantics. I don't think we're likely to add a refinement of KeyPaths that narrowly only allows paths with borrow/mutate-like semantics; it'd be a whole extra dimension that we'd have to add to the KeyPath types, and key paths are dominantly used in a highly-abstracted way that I think people recognize isn't always going to be the fastest. But it's worth discussing.

Yeah, none of this conflicts with the ability to define these more generally.

You can't combine these with other accessors; they're modifiers on underlying storage only (either inherited or stored). They are called in both modifications and assignments. This is not new — Swift has been semantically distinguishing modification and assignment for (IIRC) its entire public release.

Because willSet has to be called with the previous value still in the underlying storage, there's only one way to define it, semantically, which is as sugar for defining a set accessor:

set(newValue) {
  willSet(newValue)
  underlyingStorage = newValue
}

Every other accessor is then derived exactly like it would be for something with a get and a set.

didSet is a bit different and actually changed in SE-0268. If didSet takes the old value, it's still just sugar for a set exactly like willSet:

set(newValue) {
  let oldValue = underlyingStorage
  underlyingStorage = newValue
  didSet(oldValue)
}

However, if it does not, it can be composed more efficiently with both assignment and modification:

set(newValue) {
  underlyingStorage = newValue
  didSet()
}
_modify {
  yield &underlyingStorage
  didSet()
}
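The claim that observers run for both assignments and modifications can be checked with a small experiment. This is a minimal sketch; EventLog and Counter are hypothetical names used only to record the calls.

```swift
// Hypothetical helper that records observer calls in order.
final class EventLog {
    var events: [String] = []
}

struct Counter {
    let log = EventLog()

    var value = 0 {
        willSet { log.events.append("willSet \(newValue)") }
        didSet { log.events.append("didSet \(oldValue)") }
    }
}

var counter = Counter()
counter.value = 1     // assignment
counter.value += 2    // modification
print(counter.log.events)
// ["willSet 1", "didSet 0", "willSet 3", "didSet 1"]
```

Because this didSet reads oldValue, it corresponds to the set-only desugaring shown above; the more efficient form applies only when the observer does not use the old value.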

1 Like

It's a parsing concern for both programmers and the compiler. Even in concrete definitions, accessor names are sort of jumbled together with modifiers:

var storage: T {
  mutating set throws { ... } // n.b. we don't currently allow effects on `set`
}

So the idea of throwing a compound-identifier accessor name into that is very unappealing:

var storage: T {
  mutating yield inout throws { ... }
}

But it's even worse in protocols, where we lose the bodies entirely and so the accessor names are just immediately adjacent:

var storage: T {
  yield inout
}

Is that illegally only declaring a yield inout accessor without any way to perform a read access, or do we disambiguate it as declaring both yield and inout?
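The adjacency problem is visible with syntax that exists today, where a protocol requirement can already mix modifiers and accessor names on one line. The sketch below uses hypothetical names (Cache, Backing, CountingCache) and only current Swift syntax; it does not use any of the proposed accessors.

```swift
protocol Cache {
    // Four adjacent words: two modifiers and two accessor names.
    var value: Int { mutating get nonmutating set }
}

// Hypothetical reference-type backing store, so the setter can be nonmutating.
final class Backing {
    var raw = 0
}

struct CountingCache: Cache {
    let backing = Backing()
    var reads = 0

    var value: Int {
        mutating get {
            reads += 1
            return backing.raw
        }
        nonmutating set {
            backing.raw = newValue
        }
    }
}

var cache = CountingCache()
cache.value = 5
let current = cache.value
print(current, cache.reads)   // 5 1
```

The parser can split "mutating get nonmutating set" only because each word is known to be either a modifier or an accessor name; an accessor spelled as two identifiers, such as yield inout, would remove that distinction.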

Hmm. With yielding, we're now talking about interpreting it as a modifier. That does work, and it would work as a syntax on func, too. That's a really interesting idea, thanks!

One problem with assuming yielding is that I do think we might need a yielding get (in conjunction with non-Escapable value types); see the section on values that represent accesses. Clearly we can't default get to yielding get the same way that we would for the others, and we certainly wouldn't default normal funcs to yielding. But, that said, we do default specific accessors to mutating when literally nothing else in the language has that default, so maybe it's not that weird that some specific accessors would default to yielding and need to be made nonyielding to get the stronger rule.

6 Likes

I've updated the vision with sections talking about each of those issues.

1 Like

I don't think it's likely to be a useful conversation in the abstract, no. It needs a real investigation, and maybe then we could talk about what it would mean to be consistently aggressive. Right now, it's not on the table.

1 Like

One idea I had is the ability to mark a variable's accessors with a custom deterministic case or a valid case that returns a Bool?.

deterministic would only run the setter if all contained variables remain static. This can prevent some expensive setter code from running every time, but it could add complexity in keeping track of the relevant frame data.

valid would be run more often but would allow for changes. valid would run if a value it contained was modified and then would be stored until the next get call where it calls s

A demonstration

var A:Int = 1
var B:Int = 3

var Demo:Int {
    deterministic set { <#Expensive using `A` and `B`#> }
    
    valid(currentVal:Self, get:Bool) {
        // One option
        return true
        // True stops updates
        return false
        // False could cause a constant re-evaluation(Define properly)
        return (currentVal % A) >= (B % A)
        // No odd behavior
        return nil
        // Default behavior( `set` always permitted, `get` always permitted)
    }
}