[Pitch] Observation (Revised)

Philippe_Hausler · March 20, 2023, 8:03pm

Correct, it is called once (and only once) if any of the accessed properties are about to be set (via willSet). It is worth noting that the AsyncSequence interface does not aim to solve object synchronization; it only bounds the transactions of properties together into a batched access.

Sadly, that behavior had to be dropped due to both performance problems and memory impact - basically it was just too expensive to do generally. However, if you need that, extracting the value into a computation and manually calling withMutation gives you a way to solve it (whereas @Published doesn't really allow that easily).

I'm not sure I follow the question here: are you talking about changes(for:), values(for:), or withTracking? I presume the former two, given the emphasis. The answer is that asynchronous observation needs to be done in a task, but as soon as the iterator is constructed, it establishes an observation that tracks which properties changed.

The ValueObservation in GRDB seems quite similar to the ObservedValues AsyncSequence.

withTracking fires the onChange closure on the willSet of the first property that changes, among the properties accessed within the apply closure. That ensures animations and such work with no additional modification.
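To make the "once, on willSet" behavior concrete, here is a toy model in plain Swift. It is only a sketch of the semantics described above, not the pitched implementation: `ToyTracker`, `Counter`, and `propertyWillChange()` are hypothetical names, and the real mechanism discovers accessed properties automatically rather than through hand-written observers.

```swift
// Toy model only: `ToyTracker` and `Counter` are hypothetical names,
// not the pitched ObservationTracking implementation.
final class ToyTracker {
    private var onChange: (() -> Void)?

    /// Arms the tracker with a one-shot callback.
    func arm(_ onChange: @escaping () -> Void) {
        self.onChange = onChange
    }

    /// Called from each property's willSet; fires the callback at most once.
    func propertyWillChange() {
        guard let callback = onChange else { return }
        onChange = nil          // disarm first, so later changes are ignored
        callback()
    }
}

final class Counter {
    let tracker = ToyTracker()
    var a = 0 { willSet { tracker.propertyWillChange() } }
    var b = 0 { willSet { tracker.propertyWillChange() } }
}

/// Returns how many times the callback fired and the value of `a` it saw.
func runTrackingDemo() -> (fired: Int, seenA: Int) {
    let counter = Counter()
    var fired = 0
    var seenA = -1
    counter.tracker.arm {
        fired += 1
        seenA = counter.a       // read during willSet: still the old value
    }
    counter.a = 1               // fires the callback, before the new value is stored
    counter.b = 2               // ignored: the tracker is already disarmed
    return (fired, seenA)
}

let trackingResult = runTrackingDemo()
print(trackingResult.fired, trackingResult.seenA)
```

In this sketch the callback runs exactly once and still sees the old value of `a`, which is the property of willSet-based notification that lets a renderer capture the "before" state.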

Note: I have been working with the SwiftUI team to investigate relaxing that further, in order to explore future directions like actor support. Initial results show that with some modification we might actually be able to do out-of-line animations. But it is worth noting that this work is missing a few bits before it can be done, and we feel that the current withTracking and its willSet callback behavior is the ideal place to start from.

That section isn't a future direction but instead an involvement from the SwiftUI team to give a preview of the thoughts we are working on. I know that is a bit unorthodox in comparison to how SwiftUI stuff is traditionally released, but to reinforce the statement: the community impact and feedback is really important here, and we feel it is impactful enough that we bend the normal schedule of information.

This is perhaps one of the most "wow factor" demos we have been doing - nesting "just works". For example, we initially thought that things like @State would break observation, but without any alteration it pretty much just worked. As you can guess, folks are really excited about this.

I have an implementation in the main swift repo: https://github.com/apple/swift/blob/main/stdlib/public/Observation/Sources/Observation/ObservationTracking.swift. It is written in Swift (with some slight call-outs to interoperate with thread local storage, which is handled very gingerly to avoid getting tangled with async/await).

tcldr · March 20, 2023, 9:59pm

OK, I think I now understand withTracking and I agree – it sounds like an incredible mechanism and can see why the team is excited – it sounds really great.

However, I still have extremely strong reservations about the choice to use asynchronous sequences for the remainder of the API.

It's for all the same reasons that SwiftUI needs the update for its tracked properties delivered synchronously, that we also need a way to request observed key path changes synchronously. The currently proposed API (changes(for:), etc.) does not allow us to do that in any reasonable way. I think that's a big omission.

Right now, any view update triggered asynchronously by changes(for:) will lag any update triggered synchronously by withTracking. Practically speaking, it means all model updates will have to be tracked directly by the view, as any attempt to perform intermediary observations will result in the desync issues described upthread.

gwendal.roue · March 21, 2023, 12:46pm

OK so let's imagine a system where an "observer" component wants to observe a value, while another component, the "modifier" wants to modify it.

We have to code the observer.

Let's add some constraints:

1. The observer displays the value on screen. It's just a way to make the observer concurrency-constrained. Displaying the value on screen requires running on the main thread/actor.
2. The observer should display the "latest" value. You can understand this informally, as a layperson would say it. For us developers, this means that the observer may miss some updates, but that the observer must always eventually catch up. For a sequence of changes 1, 2, 3, the observer might not display 1 and 2, but it must eventually display 3.
3. The modifier is the rest of the application. This constraint, or lack thereof, is there in order to explain that we code the observer as independently as possible from the modifier. Goals: local reasoning, decoupling, etc.

This is a lot of words to describe a very common need. But it looks like being very explicit is necessary. My apologies to other readers - I hope you can still recognize some of your own needs in this breakdown of a thought experiment.

Lemma: because the observer is concurrency-constrained, the "latest value" might be displayed a little bit late (until the change notification reaches the main thread). That's unavoidable, so that's ok. This does not bring any information, but I just want to make sure this has been understood.

So, how do we code the observer?

// First attempt at implementing the observer
func startObserver() { // sync
    // 1. Display the current value
    display(observedObject.value)
    
    // 2. Start an observing task
    Task {
        // 3. Listen to changes
        for await value in ... {
            // 4. Display fresh value
            display(value)
        }
    }
}

Maybe some @MainActor decorations have to be added - but this is the gist.

This first attempt above is not correct, because between the initial display, and the beginning of the observation, some changes may be performed, and they are not notified. We fail the second constraint "The observer should display the latest value".
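The gap can be reproduced deterministically with a small stand-in. This is only a sketch under stated assumptions: `ChangesOnlyBox` is a hypothetical, synchronous substitute for a source that reports nothing until a change happens after observation starts, and the interleaving that a real `Task` would produce by chance is written out by hand.

```swift
// Hypothetical stand-in for a source that only reports changes made after
// observation starts; `ChangesOnlyBox` is not part of the pitch.
final class ChangesOnlyBox {
    private var observers: [(Int) -> Void] = []
    var value: Int {
        didSet { observers.forEach { $0(value) } }
    }
    init(_ value: Int) { self.value = value }

    /// Registers an observer; nothing is delivered until the next change.
    func onChange(_ observer: @escaping (Int) -> Void) {
        observers.append(observer)
    }
}

/// Replays the problematic interleaving and returns what was displayed
/// alongside the value the box actually holds at the end.
func runGapDemo() -> (displayed: [Int], actual: Int) {
    let box = ChangesOnlyBox(1)
    var displayed: [Int] = []
    displayed.append(box.value)               // 1. display the current value
    box.value = 2                             // a write lands before observation starts
    box.onChange { displayed.append($0) }     // 3. start listening to changes
    return (displayed, box.value)
}

let gapResult = runGapDemo()
print(gapResult.displayed, gapResult.actual)
```

The observer is left displaying 1 while the box holds 2, and it stays that way until some later write happens to trigger a notification.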

Even if I relax the third constraint "The modifier is the rest of the application", and make the observer able to tell the modifier "hold on", and "ok I'm ready you can start modifying the value now", I still don't know how to fix the above sample code, because the observer never knows for sure when observation has really started, and the modifier can safely be unleashed. We don't want to unleash the modifier until the observation has started, but the async sequence does not emit anything until a modification has been performed -> we're stuck.

In the end, I don't know how to make a correct implementation of the initial requirements. Those requirements are very common. I actually expect that this is more or less explicitly expected by many developers from this pitch. Some developers might be surprised by the second constraint, which allows the observer to miss some values. Well, this is the consequence of the 1st constraint. We have to give up on synchronous dispatch of changes. It's not something which is easy to give up, I know. That's why I took the time to write the 2nd constraint as clearly as I could - so that everyone can decide if it's an acceptable trade-off for the loss of the synchronous change notification. I think that it is.

If you agree that the described setup is reasonable, maybe you can take this question as a fun challenge to test the pitch against?

I'm not asking for this use case to ship built-in in the pitch - I'm just curious about the mere ability of the pitched apis to support it. Later on, if we establish that it's actually frequent, and actually difficult to implement it correctly (that it's not trivially composable), we might proceed with some support from the standard lib. The first question is just "but is it possible, or not?"

tcldr · March 21, 2023, 1:55pm

I think Rx inspired libraries such as Combine handled this problem particularly elegantly:

// Combine attempt at implementing the observer
func startObserver() { // sync
    // 1. No need to display the current value. It's synchronous. Just as long
    //     as the publisher _immediately_ emits the current value. Your 2nd
    //     constraint is met
    // 2. Start observing for changes
    observablePublisher
      .sink { value in display(value) } // this is still synchronous on initial connect!
      .store(in: &cancellables)
}

I'm not saying we should go back to Combine necessarily. What I am saying is let's not 'throw the baby out with the bath water' and forget why the Rx/KVO libraries handled it this way initially.
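For readers without Combine at hand, the replay-on-subscribe behavior that the sample relies on can be sketched in plain Swift. This is a minimal illustration, not Combine and not the pitched API: `CurrentValueBox` and its `sink` method are hypothetical, single-threaded, and have no cancellation.

```swift
// Hypothetical minimal "current value" box; NOT Combine and NOT the pitched API.
final class CurrentValueBox<Value> {
    private var observers: [(Value) -> Void] = []
    var value: Value {
        didSet { observers.forEach { $0(value) } }
    }
    init(_ value: Value) { self.value = value }

    /// Calls `observer` synchronously with the current value,
    /// then again on every later change.
    func sink(_ observer: @escaping (Value) -> Void) {
        observers.append(observer)
        observer(value)
    }
}

/// Returns the values an observer sees when a write happens before it subscribes.
func runSinkDemo() -> [Int] {
    let box = CurrentValueBox(1)
    var displayed: [Int] = []
    box.value = 2                       // change made before anyone observes
    box.sink { displayed.append($0) }   // delivers the current value (2) immediately
    box.value = 3                       // later change, delivered synchronously
    return displayed
}

print(runSinkDemo())
```

Because registration and the initial delivery happen in the same synchronous call, there is no window in which a write can slip between "read the current value" and "start observing".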

Personally, the loss of intermediate values is not what concerns me – latest is fine. It's the loss of synchronisation between an object graph of Observables that's the bigger issue in my mind. We already know people use ObservableObject in this way, so there's good reason they will use Observables in the same way.

I've included a contrived example below. In this example when an Even number is displayed, the background of the view should always be red. When an Odd number is displayed it should always be green.

However, as the parent's observation of the ChildObservable's value property lags the View's observation of the ChildObservable's value property, the number and color will often fall out of sync and display an inappropriate color.

// OBSERVABLES 

@MainActor @Observable final class ChildObservable {
  var value = 0
  // synchronously fires the update to the view
  func plusOne() { value += 1 }
}

@MainActor @Observable final class ParentObservable {
  
  let subobservable = ChildObservable()
  var color = Color.red
  
  func startObserver() {
    Task {
      // asynchronously fires
      for await value in subobservable.changes(for: \.value) {
        // uh oh. value might be stale by now...
        self.color = value % 2 == 0 ? .red : .green
      }
    }
  }
}

// VIEW

@MainActor struct NumberView: View {
  
  @State private var model = ParentObservable()
  
  var body: some View {
    Text(verbatim: "\(model.subobservable.value)")
      .background(model.color)
  }
}

gwendal.roue · March 21, 2023, 2:03pm

Yes. And GRDB's ValueObservation as well. But this is not what is pitched. The pitched sequences do not emit anything until the first detected change is performed (synchronously or not):

I'm not sure the question was understood, as a matter of fact, because the answer is slightly off-topic. What does withMutation have to do with the ability of the sequences to start with an initial value (even if no change is performed)? Or am I missing something?

tcldr · March 21, 2023, 2:18pm

Got it. I agree. That would be totally unexpected.

gwendal.roue · March 21, 2023, 2:33pm

Unless I'm mistaken, this need is acknowledged by the pitch:

It looks like the changes(for:) method(s) accepts a TrackedProperties which can be fulfilled with an array of keyPaths: this is how you observe a "graph". One gets automatic observation of multiple properties in one shot, with ObservationTracking.withTracking.

But one still has to load the full "graph" from one call to ObservationTracking.withTracking, if one cares about invariants. If one performs two calls to this method and merges the results together, and the system goes from the (A1, B1) state to (A2, B2), then the merged change notifications might include invariant-breaking pairs such as (A1, B2) or (A2, B1).
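A deterministic toy can show how merging two independent observations exposes such a pair. Everything here is an assumption made for illustration: `Model`, its callbacks, and the invariant `b == a * 10` are hypothetical, and the notifications are synchronous stand-ins for the pitched asynchronous sequences.

```swift
// Hypothetical toy model with the invariant `b == a * 10`.
final class Model {
    private(set) var a = 1
    private(set) var b = 10
    var onA: ((Int) -> Void)?                    // observation of `a` alone
    var onB: ((Int) -> Void)?                    // observation of `b` alone
    var onTransaction: ((Int, Int) -> Void)?     // observation of (a, b) as a unit

    /// Moves from (A1, B1) to (A2, B2), notifying after each step.
    func advance() {
        a += 1
        onA?(a)
        b = a * 10
        onB?(b)
        onTransaction?(a, b)
    }
}

func invariantHolds(_ a: Int, _ b: Int) -> Bool { b == a * 10 }

/// Records, for each notification, whether the pair the observer holds is consistent.
func runMergeDemo() -> (merged: [Bool], unit: [Bool]) {
    let model = Model()
    var latestA = 1
    var latestB = 10
    var merged: [Bool] = []
    var unit: [Bool] = []
    model.onA = { latestA = $0; merged.append(invariantHolds(latestA, latestB)) }
    model.onB = { latestB = $0; merged.append(invariantHolds(latestA, latestB)) }
    model.onTransaction = { unit.append(invariantHolds($0, $1)) }
    model.advance()
    return (merged, unit)
}

let mergeResult = runMergeDemo()
print(mergeResult.merged, mergeResult.unit)
```

The merged observer briefly holds (A2, B1), which breaks the invariant, while the observer of the whole pair only ever sees a consistent state.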

That's my current understanding - I hope I didn't say anything wrong.

gwendal.roue · March 21, 2023, 2:35pm

@Philippe_Hausler said "Sadly" in his reply - maybe this is something that can be revisited.

gwendal.roue · March 21, 2023, 3:01pm

I wonder if my question is just "how can I reimplement the built-in ObservationTracking.withTracking(_:onChange:) from the pitched sequences?"

But why would I want to reimplement it if it's built-in?

In the sample code, I'm not sure about what scheduleRender is supposed to do, though. Is it supposed to directly call render unless observation should stop?

Philippe_Hausler:

@Observable final class Car {
    var name: String
    var awards: [Award]
}

let cars: [Car] = ...

func render() {
    ObservationTracking.withTracking {
        for car in cars {
            print(car.name)
        }
    } onChange: {
        scheduleRender()
    }
}

tcldr · March 21, 2023, 3:05pm

Yup. That's my understanding too. In practice it means that using changes(for:) for observations on the current actor would require very careful reasoning by the programmer to avoid breaking invariants. I'm not sure this would be the expectation.

In practice, I think it limits the use of changes(for:) and the other asynchronous sequence observation methods to across actor boundaries, which creates a strange dual with ObservationTracking.withTracking.

gwendal.roue · March 21, 2023, 3:18pm

Which expectation are you referring to?

tcldr · March 21, 2023, 3:21pm

I think it's just saying that the view tree should be re-rendered on the next frame. So basically there could be a bunch more mutations, but render() (or even scheduleRender()) will be called a maximum of once per frame. And if there are no mutations in the current frame, scheduleRender() won't get called at all.
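A minimal sketch of that reading, assuming a dirty flag that is flushed once per frame. `FrameScheduler` and `frame()` are hypothetical names, and the display loop is simulated by calling `frame()` by hand; this is a guess at the intent, not what the pitch's sample actually does.

```swift
// Hypothetical frame scheduler; the names are illustrative only.
final class FrameScheduler {
    private var needsRender = false
    private(set) var renderCount = 0

    /// Marks the view tree as dirty; cheap to call any number of times.
    func scheduleRender() {
        needsRender = true
    }

    /// Called once per frame by the (imaginary) display loop.
    func frame() {
        guard needsRender else { return }   // nothing changed: skip rendering
        needsRender = false
        renderCount += 1                    // stand-in for render()
    }
}

/// Three mutations in one frame, followed by an idle frame.
func runFrameDemo() -> Int {
    let scheduler = FrameScheduler()
    scheduler.scheduleRender()
    scheduler.scheduleRender()
    scheduler.scheduleRender()
    scheduler.frame()   // renders once for all three requests
    scheduler.frame()   // nothing scheduled: no render
    return scheduler.renderCount
}

print(runFrameDemo())
```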

That the event received matches the current state of the actor.

Jason_Gregori · March 21, 2023, 3:47pm

gwendal.roue:

// First attempt at implementing the observer
func startObserver() { // sync
    // 1. Display the current value
    display(observedObject.value)
    
    // 2. Start an observing task
    Task {
        // 3. Listen to changes
        for await value in ... {
            // 4. Display fresh value
            display(value)
        }
    }
}

I definitely expect this to work and NOT lose values. If you can lose values between 1 and 3 here then what’s the point of even supporting async? Everyone will have subtle bugs and have to just know that this doesn’t work with async.

@Philippe_Hausler will this lose values when the object you are observing is on a different actor?

gwendal.roue · March 21, 2023, 3:54pm

Yup. I dropped this assumption a long time ago, due to my working on GRDB. Any value I read is a stale cache, and I can only touch the "real" state when I'm performing a write.

That's unavoidable as soon as notification of changes is asynchronous. As the new value is reaching its observing actor, some other writes can be performed. When the observer eventually gets the new value, the "state" may already be different. That's what I mean when I say that notified values are stale.

This looks scary, but this is not really important, as long as:

A "next" state is notified eventually, replacing the old stale value with another value, still stale, but fresher :-)
All invariant-linked values needed by the observer are observed together.
All writes that are initiated from a stale value are ready to discover a "real" state that is... different. For example, an "increment" button should not set the value to the last known value + 1. It should set the value to the current value + 1:
```
func increment() {
    // wrong
    state.value = myStaleValue + 1
    // correct
    state.value += 1
}
```
This sounds trivial, but you can imagine more subtle scenarios. For example, the number you want to increment no longer exists.

I dropped this assumption a long time ago, but it was a slow and difficult birth. Making sure all invariant-linked values are observed together requires discipline. The lack of composition of observation hurts the minds who like to build big stuff from multiple small stuffs. Here, you can't: one has to observe the big stuff right away.

tcldr · March 21, 2023, 4:05pm

100%. And that's what concerns me about this API: it feels like it would encourage lots of small async observations on the same actor, where it isn't necessary. There's a sort of forced constraint, which is that parents can talk to their descendants synchronously, but descendants must talk to their ancestors asynchronously. Yes, across actor boundaries (such as talking to a DB), fine, it's necessary, but as you say, it requires some discipline to do it right.

The best analogy I can come up with is that it would be like having exclusively async closures. Life would be tough. Async closures have their place but we shouldn't use them everywhere.

gwendal.roue · March 21, 2023, 4:15pm

Maybe you say that the pitched api fosters lots of small observations, when it should instead foster a small number of big observations?

Indeed developers should observe (a, b, c, d) in one shot, instead of observing a, b, c, d independently, or they might suffer from broken invariants. (I repeat myself but I'm always concerned that people forget what we're talking about.)

ObservationTracking.withTracking is a step in the right direction, isn't it?

tcldr · March 21, 2023, 4:21pm

I don't think I can overstate how great I think ObservationTracking.withTracking is. I'm very excited by it, actually. (Nice work, all.) But it serves a very particular purpose, it would be very hard to use it for general purpose observation – It's custom built for triggering view refreshes.

The best outcome for me would be a synchronous equivalent of changes(for:) to complete the picture.

gwendal.roue · March 21, 2023, 4:24pm

I suppose you think about Observable.values(for:) that accepts a single keyPath argument:

protocol Observable {
    /// Returns an asynchronous sequence of changes for the specified key path.
    nonisolated func values<Member: Sendable>(
        for keyPath: KeyPath<Self, Member>
    ) -> ObservedValues<Self, Member>
    // ... other requirements omitted
}

I think I agree with you. I'm sorry to talk again about my experience, but GRDB used to have convenience observation methods for single database requests. They have since been removed, because they were indeed fostering code that was difficult to refactor.

Once one has written an observation of A, and an observation of B, one wants to merge them together, not to write a new observation of (A, B). It takes guts and experience to destroy code. And of course broken invariants happen infrequently, so incorrect code is easily merged into the main branch.

That's something to consider. Maybe Observable.values(for:) is a convenience method that fosters misuse.

Philippe_Hausler · March 21, 2023, 4:24pm

What do you mean by that? Changes happen asynchronously; if you are implying some sort of callback, then that doesn't compose well.

Philippe_Hausler · March 21, 2023, 4:27pm

One key portion to consider with that: the requirements of Sendable enforce some of what y'all are discussing. E.g. values(for:) must be used on elements that can be sent across actor domains, which means that if you want to observe more than one value together as a unit, that unit must be paired with its mutation. That leads folks into grouping values together in Sendable structures.
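As a rough illustration of that grouping, the related values can live in one Sendable struct stored in a single property, so that one assignment changes them together. The names `Dimensions` and `Shape` are hypothetical, and the `didSet` callback is only a synchronous stand-in for observing the property's key path.

```swift
// Hypothetical example types; the callback stands in for an observation of `\.dimensions`.
struct Dimensions: Sendable, Equatable {
    var width: Double
    var height: Double
}

final class Shape {
    var onChange: ((Dimensions) -> Void)?

    // One stored property holds both values, so a single assignment
    // is a single mutation and a single notification.
    var dimensions = Dimensions(width: 1, height: 2) {
        didSet { onChange?(dimensions) }
    }
}

/// Returns every snapshot the observer receives for one grouped update.
func runGroupingDemo() -> [Dimensions] {
    let shape = Shape()
    var seen: [Dimensions] = []
    shape.onChange = { seen.append($0) }
    shape.dimensions = Dimensions(width: 2, height: 4)   // both values change as a unit
    return seen
}

print(runGroupingDemo())
```

Note that mutating one field in place (for example `shape.dimensions.width = 3`) would still notify with a half-updated pair in this sketch, so the grouping only helps when the unit is replaced as a whole.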

Also the Sendable-ness of the AsyncSequences themselves is gated upon the observed type being Sendable. That means that to cross actor domains you must make the type Sendable.