Pitch: didSet Semantics

suyashsrijan · August 13, 2019, 6:17pm

didSet Semantics

Proposal: SE-NNNN
Author: Suyash Srijan
Review Manager: TBD
Status: Awaiting Review
Implementation: apple/swift#26632

Introduction

Introduce two changes to didSet semantics -

If a didSet observer does not reference the oldValue in its body, then the call to fetch the oldValue will be skipped. We refer to this as a "simple" didSet.
If we have a "simple" didSet and no willSet, then we could allow modifications to happen in-place.

Swift-evolution thread: didSet Semantics

Motivation

Currently, Swift always calls the property's getter to get the oldValue if we have a didSet observer, even if the observer does not refer to the oldValue in its body. For example:

class Foo {
  var bar: Int {
    didSet { print("didSet called") }
  }

  init(bar: Int) { self.bar = bar }
}

let foo = Foo(bar: 0)
// This calls the getter on 'bar' to get 
// the 'oldValue', even though we never 
// refer to the oldValue inside bar's 'didSet'
foo.bar = 1

This might look harmless, but it is doing redundant work (by allocating storage and loading a value which isn't used). It could also be expensive if the getter performs some non-trivial task and/or returns a large value.

For example:

struct Container {
  var items: [Int] = .init(repeating: 1, count: 100) {
    didSet {
      // Do some stuff, but don't access oldValue
    }
  }
  
  mutating func update() {
    for index in 0..<items.count {
      items[index] = index + 1
    }
  }
}

var container = Container()
container.update()

This will create 100 copies of the array to provide the oldValue, even though they're not used at all.

It also prevents us from writing certain features. For example, a @Delayed property wrapper may be implemented like this:

@propertyWrapper
struct Delayed<Value> {
  var wrappedValue: Value {
    get {
      guard let value = value else {
        preconditionFailure("Property \(String(describing: self)) has not been set yet")
      }
      return value
    }

    set {
      guard value == nil else {
        preconditionFailure("Property \(String(describing: self)) has already been set")
      }
      value = newValue
    }
  }
  
  private var value: Value?
}

class Foo {
  @Delayed var bar: Int {
    didSet { print("didSet called") }
  }
}

let foo = Foo()
foo.bar = 1

However, this code will currently crash when we set bar's value to be 1. This is because Swift will fetch the oldValue, which is nil initially and thus will trigger the precondition in the getter.

Proposed Solution

The property's getter is no longer called if we do not refer to the oldValue inside the body of the didSet.

class Foo {
  var bar = 0 {
    didSet { print("didSet called") }
  }

  var baz = 0 {
    didSet { print(oldValue) }
  }
}

let foo = Foo()
// This will not call the getter to fetch the oldValue
foo.bar = 1
// This will call the getter to fetch the oldValue
foo.baz = 2

This also resolves some pending bugs such as SR-11297, SR-11280 and SR-5982.

As a bonus, if the property has a "simple" didSet and no willSet, then we could allow for modifications to happen in-place. For example:

// This is how we currently synthesize the _modify coroutine
_modify {
  var newValue = underlyingStorage
  yield &newValue
  // Call the setter, which then calls
  // willSet (if present) and didSet
  observedStorage = newValue
}

// This is how we're going to synthesize it instead
_modify {
  // Since we don't have a willSet and
  // we have a "simple" didSet, we can
  // yield the storage directly and
  // call didSet
  yield &underlyingStorage
  didSet()
}

This will provide a nice performance boost in some cases (for example, in the earlier array copying example).

Source compatibility

This does not break source compatibility, unless someone is explicitly relying on the current buggy behavior (i.e. the property's getter being called even if the oldValue isn't referenced). However, I think the possibility of that is very small.

Effect on ABI stability

This does not affect the ABI as observers are not a part of it.

Effect on API resilience

This does not affect API resilience - library authors can freely switch between a didSet which does not refer to the oldValue in its body and one which does and freely add or remove didSet from the property.

Alternatives considered

Leave the existing behavior as is.

Future Directions

We can apply the same treatment to willSet i.e. not pass the newValue if it does not refer to it in its body, although it wouldn't provide any real benefit as not passing newValue to willSet does not avoid anything, where as not passing oldValue to didSet avoids loading it.

We can also deprecate the implicit oldValue and request users to explicitly provide oldValue in parenthesis (didSet(oldValue) { ... }) if they want to use it in the body of the observer. This will make the new behavior more obvious and self-documenting.

suyashsrijan · August 13, 2019, 6:19pm

I have created this pitch in the meantime because I am not sure if the core team will accept it without a proposal, but it would be great to get an official response. cc @Joe_Groff @John_McCall @Douglas_Gregor

John_McCall · August 13, 2019, 6:32pm

Changes in semantics that aren't just fixing obviously buggy behavior do need to go through evolution. I think there's a reasonable chance this would be accepted, so I'd encourage you to go ahead.

suyashsrijan · August 13, 2019, 6:40pm

Thanks! I'll let this pitch stay open for a while to collect and address feedback and then I'll open a PR on the Swift Evolution repo with the official proposal.

ahti · August 13, 2019, 9:29pm

How exactly do you propose this should behave? If I use oldValue only under some conditions will the getter only get called in those conditions? If I use oldValue multiple times, would it get called multiple times as well?

My first impressions is that oldValue should probably behave just like a lazy var, calling the getter on first use, but I haven't put too much thought into that.

suyashsrijan · August 13, 2019, 9:35pm

The only time the getter isn’t called if there’s zero references to the oldValue inside the didSet, otherwise there is no change in behaviour.

Thanks for asking though, I will rephrase what I meant because it’s probably a little confusing. When I say “do not use”, what I really mean is “do not reference”.

layoutSubviews · August 13, 2019, 9:52pm

Original Post

While we're planning potential semantic changes, should this proposal also include a corollary with willSet & newValue?
I don't have a use case handy, but one could easily imagine a willSet handler that doesn't access newValue either.

I had a brain fart . Indeed the same problem does not apply to willSet.

Rod_Brown · August 13, 2019, 9:57pm

Welcome @layoutSubviews!

willSet already has access to the newValue as it is being passed in as part of the setter. There is no additional access required, from my understanding. And you explicitly access the oldValue via property access.

This differs from didSet which needs to perform an implicit property access to get the oldValue.

John_McCall · August 13, 2019, 10:11pm

Right, not passing newValue to willSet doesn't avoid anything, whereas not passing oldValue to didSet avoids loading it.

This also means that, when added to a stored variable, an unparameterized didSet with no willSet could allow modifications to happen in-place. I think it'd be nice if that was part of the proposal, too.

That is, currently storage that adds observers is essentially generated like this:

var observedStorage: T {
  get { return underlyingStorage }
  set(newValue) {
    willSet(newValue)                   // if willSet is provided
    let oldValue = underlyingStorage    // if didSet is provided
    underlyingStorage = newValue
    didSet(oldValue)                    // if didSet is provided
  }
  _modify {
    var newValue = underlyingStorage
    yield &newValue
    observedStorage = newValue          // i.e. call the setter above
  }
}

And currently you're proposing this if didSet doesn't take a parameter:

var observedStorage: T {
  get { return underlyingStorage }
  set(newValue) {
    willSet(newValue)                   // if willSet is provided
    underlyingStorage = newValue
    didSet()
  }
  _modify {
    var newValue = underlyingStorage
    yield &newValue
    observedStorage = newValue          // i.e. call the setter above
  }
}

But it'd be nice if there was a further refinement where we did this if there wasn't a willSet:

var observedStorage: T {
  get { return underlyingStorage }
  set(newValue) {
    underlyingStorage = newValue
    didSet()
  }
  _modify {
    yield &underlyingStorage
    didSet()
  }
}

suyashsrijan · August 13, 2019, 11:33pm

Sounds good, I'll add an implementation and update the proposal shortly! I have a question about the implementation though - we synthesise the body for the coroutine in synthesizeCoroutineAccessorBody(). At the moment, we build a reference to the storage, wrap it in an InOutExpr and then yield it i.e. we'd end up doing something like this:

(brace_stmt implicit
  (yield_stmt implicit
    (inout_expr implicit type='inout Int'
      (member_ref_expr implicit type='@lvalue Int' decl=test.(file).Foo.value@/Users/suyashsrijan/Desktop/test.swift:2:7 direct_to_impl
        (declref_expr implicit type='Foo' decl=test.(file).Foo.<anonymous>.self@/Users/suyashsrijan/Desktop/test.swift:2:7 function_ref=unapplied)))))

for

class Foo {
  var value: Int {
    didSet { print("Hello") }
  }
}

let f = Foo(value: 0)
f.value = 1

Are you saying we need to yield the value directly?

John_McCall · August 14, 2019, 12:03am

Well, actually, if you can stand doing a significantly more complicated implementation, the right approach would be to recognize this as a pair of new kinds of WriteImplKind and ReadWriteImplKind: StoredWithNullaryDidSet and InheritedWithNullaryDidSet. So you'd suppress the special synthesis for the setter (if a setter is required, it can use the normal, opaque "direct-to-impl" access), and then in SILGen you'd emit write and read-write accesses by accessing the underlying storage and then calling didSet. But that's definitely a lot more involved.

@Slava_Pestov might have thoughts about this.

suyashsrijan · August 14, 2019, 12:51am

Yeah, I can look into doing that, but I’m not familiar with SILGen so it’s definitely going to be very tricky. However, that doesn’t mean I or someone else can’t improve it in a follow-up PR though (unless doing it via that approach is required for this proposal to be accepted).

For now, what would be a reasonable and simple approach to tweak it for the coroutine case? If there’s no way of doing it in code synthesis and SILGen is the only place then I’ll look into doing that.

John_McCall · August 14, 2019, 1:04am

A shorter-term tweak would be to make the ReadWriteImpl be Modify and then trigger the modify coroutine to be synthesized the same way that the getter and setter are, generating a body that yields a reference to the underlying storage (rather than to the current storage but direct-to-impl) and then calls didSet.

It's possible that you might have to suppress this in cases where the property is @objc.

suyashsrijan · August 15, 2019, 10:13pm

Proposal Updated

Title has been changed to "didSet Semantics". If you have a better suggestion, then please let me know.
The proposal now includes an additional change: when the storage does not have a willSet, but has a didSet which doesn't refer to the oldValue, then a modification can be allowed to happen in-place. The implementation has been updated to support that.

jrose · August 15, 2019, 10:53pm

I find this really subtle. I'd be happier if we said "using oldValue without explicitly declaring it is deprecated" (i.e. you should write didSet(oldValue), as you've always been able to do, if you want to use it). That way it's obvious from the code whether the getter will be called or not. The implementation would still be what you have here, since we can't actually make old code stop working, but we can push people towards being explicit.

(The question would then become whether we want the same rule for willSet.)

Effect on API resilience

This feature does not expose any new public API.

This section isn't about API that's exposed; it's about what changes library authors are now allowed or not allowed to make between releases. Is it okay to go from a didSet that doesn't take the old value to one that does? (probably yes) How about the other way? (probably yes) Is it okay to add a didSet to a property that didn't have one before? (still yes) Is it okay to remove it? (still yes).

suyashsrijan · August 15, 2019, 11:24pm

I'd be happier if we said "using oldValue without explicitly declaring it is deprecated" (i.e. you should write didSet(oldValue) , as you've always been able to do, if you want to use it). That way it's obvious from the code whether the getter will be called or not.

Yeah, that's a good point. I agree it makes the behaviour more obvious to anyone reading the code and we can certainly add a new diagnostic to flag this on Swift version >5.1. I guess we can add a fix-it to the diagnostic to auto-magically insert oldValue as well. The downside is having to explicitly write oldValue in parens, which was nice to drop unless you wanted to override the name. I am not sure how the community will feel about this, but I guess there's no harm in making the behaviour more self-documenting.

This section isn't about API that's exposed; it's about what changes library authors are now allowed or not allowed to make between releases

Oh, sorry. I will update that section of the proposal. Thanks!

jrose · August 15, 2019, 11:26pm

Re: the version: if we think the diagnostic is the way to go, I don't see any reason to limit it to new Swift versions. (Note that there's no -swift-version 5.1; we're still on -swift-version 5 at the moment.) The explicit syntax worked fine on old compilers and we add warnings all the time.

suyashsrijan · August 15, 2019, 11:31pm

Oh, but wouldn't that change once 5.1 ships officially? I don't know when this will go into review, so I thought the diagnostic would have to apply to >5.1, but sure we can certainly not limit it.

jrose · August 15, 2019, 11:47pm

Not every new compiler release automatically gets a new language version. The point of the language versions is to allow for backwards-incompatible changes, which is unfortunate because we have to keep supporting the old behavior too. If the core team decides that the behavior change here (not calling the getter) is significant enough to be considered backwards-incompatible, it'll have to wait for a release where the core team decides it's okay to introduce a new language version, and only apply it then. But if they think we can make the change apply to existing code, then it's okay to emit warnings in existing language modes.

Panajev · August 16, 2019, 6:50am

Considering the references to Swift 5.1 elsewhere (do not want to say Xcode beta docs) and what are you saying now, risking to sound mega dense, could you just clarify whether the upcoming Xcode 11 will get both ABI and module stability with the compiler that ships with it please?

(in theory it is with Swift 5.1 that module stability happens in practice and with the amount of third party developers using Swift for binary frameworks rising, it is starting to get a very very appropriate question :))

Kind Regards,

Goffredo