Modify Accessors

GreatApe · June 15, 2020, 11:58am

Late to the game but thanks, this is exactly what has been confusing me. I can't see the purpose of the coroutine concept here, when all it really seems to do is let callers run a modification closure on your backing property.

I can understand that there are other reasons for introducing (semi-) co-routines, perhaps related to SwiftUI or (let's hope:) async/await, and perhaps I am missing something.
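For comparison, here is a minimal sketch of the closure-based alternative in current Swift. `Counter`, `storage`, and `withValues` are hypothetical names chosen for illustration, not part of the pitch. The code before and after the call to `body` plays the role of the code before and after a `yield`.

```swift
// Hypothetical type; names are illustrative only.
struct Counter {
    var storage: [Int] = [1, 2, 3]

    // Closure-based "modify": the caller's closure gets in-place access to storage.
    mutating func withValues<R>(_ body: (inout [Int]) throws -> R) rethrows -> R {
        defer {
            // Runs whether `body` returns or throws, like post-yield code
            // that must restore an invariant (here: keep storage sorted).
            storage.sort()
        }
        return try body(&storage)
    }
}

var counter = Counter()
counter.withValues { $0.append(0) }
print(counter.storage) // [0, 1, 2, 3]
```

The difference a modify accessor would make is mostly at the call site: the same in-place access could be reached through ordinary property syntax rather than an explicit closure.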

Karl · June 27, 2020, 4:01pm

Is it, though? One could argue that by leaving such sharp edges around, we make it an advanced feature which is not accessible to newcomers.

Especially if this generalises to all coroutines. Suddenly coroutines themselves become an advanced feature because of how careful you need to be.

Given how the language has developed since this pitch, I’m more in favour of a yield statement with a mandatory cancellation handler, like this:

  yield &myThing else {
    // yield failed (was cancelled). Clean up.
    // similar to guard, you must return from this block.
  }
  // yield was successful. Do post-processing.

It’s simple to understand the possible control flow when you have it written there, and difficult to understand (with potentially disastrous consequences) if you don’t. Even experienced developers appreciate when things are made obvious.

But we should really move forward with this. I keep forgetting it’s not actually part of the language yet.

AnotherUser · June 27, 2020, 8:00pm

I like this. IMO you've cracked it.

orobio · June 27, 2020, 10:49pm

I still feel it makes more sense to have a mandatory end of yield handler:

yield &myThing and {
  // Clean-up. Regardless of how the yield ended.
}
// Continue here if the yield ended normally.

This promotes a model that makes inout yielding similar (but opposite) to inout parameters, whose state is always stored, regardless of whether the function ends normally or by throwing.

AnotherUser · June 27, 2020, 11:31pm

We might even support something akin to both, but drawing upon ideas from syntax patterns that already exist:

If yield is preceded by a defer clause, then you may call it on its own:

defer {
  // Clean-up, regardless of how the yield ended.
}
yield &myThing
// Continue here if the yield ended normally.

Otherwise, it must be followed by an else clause, à la guard else:

yield &myThing else {
  // Clean-up, since the yield failed.
  // You must return from this block.
}
// Continue here if the yield ended normally.

Many Swift users are already familiar with the guard else concept, where if a check fails then a return is necessary, possibly with some cleanup. If many such cleanup-before-exit scenarios exist, then a single defer statement cleans those up nicely.

Similarly, with the pattern outlined above, both options are available: cleanup in-place in case of yield failure, or cleanup in defer in case of any scope exit.
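The two options can be approximated in current Swift by standing in a throwing closure for the yield. This is only a sketch under that assumption: `accessWithDefer`, `accessWithElse`, `Cancelled`, and the `log` parameter are hypothetical names, and a thrown error plays the role of a failed yield.

```swift
struct Cancelled: Error {}

// Emulates "defer { cleanup }; yield &value": cleanup runs on every exit.
func accessWithDefer(_ value: inout Int, log: inout [String],
                     _ body: (inout Int) throws -> Void) rethrows {
    defer { log.append("defer cleanup") }
    try body(&value)
    log.append("post-processing")
}

// Emulates "yield &value else { cleanup; return }": cleanup only on failure.
func accessWithElse(_ value: inout Int, log: inout [String],
                    _ body: (inout Int) throws -> Void) rethrows {
    do {
        try body(&value)
    } catch {
        log.append("else cleanup")
        throw error // like guard-else, this branch must leave the function
    }
    log.append("post-processing")
}

var x = 0
var log: [String] = []
try? accessWithDefer(&x, log: &log) { $0 += 1; throw Cancelled() }
try? accessWithElse(&x, log: &log) { $0 += 1; throw Cancelled() }
print(log) // ["defer cleanup", "else cleanup"]
print(x)   // 2
```

In both failure cases the post-processing line is skipped, and the mutation made by the caller is still written back.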

orobio · June 28, 2020, 9:22am

In my opinion this gives the preferred pattern the most unnatural syntax where things mostly execute out of order. The syntax should guide the developer into writing the preferred semantics. Especially if, as @Karl said, we want this to be a less advanced feature.

yield &x else {} communicates: You should clean up differently depending on how the yield ends.
yield &x and {} communicates: You should clean up after your yield, no matter what happened.

Since cleaning up regardless of how the yield ends is consistent with how inout parameters are handled, the syntax should make that pattern very natural.

Requiring and {} after every yield may not be the solution, but requiring else {} will in my opinion make the most natural way to write a yield have the wrong semantics.

It isn't really the same though. else doesn't feel right here since the yield is performed no matter what. Many lines of code could be executed before the yield ends with a throw. So, it's really not: yield or else do this.

Max_Desiatov · June 28, 2020, 3:11pm

To borrow from syntax used in other languages for try {} catch {} finally {} I think it could make more sense as yield &x finally {}.

Lantua · June 28, 2020, 3:14pm

Or maybe a yield! and yield? pair? Just throwing it out there.

Karl · June 28, 2020, 4:07pm

I like the finally idea.

The one reservation I have is that it loses information about whether the access was terminated or completed successfully; you always first have to restore to a valid state, then do any post-processing. For generators, you only want to clean up when the yield is cancelled:

let file = openFile(...)
for line in file.lines {
  yield line finally { 
    // Uh-oh. Only want to close on cancel!
    closeFile(file)
  }
}
closeFile(file)

Maybe this is just better with a defer, but I think it should still be possible to do without it. We could introduce a cancelled flag, similar to how catch has an error parameter-like thing, and how willSet has newValue, etc. So you could optionally write:

yield line finally(wasCancelled) {
  if wasCancelled {
    closeFile(file)
  }
}
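The close-only-on-cancel behaviour can be sketched in current Swift with a throwing closure standing in for the yield. `FakeFile`, `forEachLine`, and `ConsumerStopped` are hypothetical stand-ins, and the `catch` branch corresponds to the case where `wasCancelled` would be true.

```swift
// Hypothetical stand-in for the file API in the example above.
final class FakeFile {
    let lines = ["a", "b", "c"]
    private(set) var closeCount = 0
    func close() { closeCount += 1 }
}

struct ConsumerStopped: Error {}

// Emulates a generator: each call to `body` stands in for `yield line`.
func forEachLine(in file: FakeFile, _ body: (String) throws -> Void) rethrows {
    for line in file.lines {
        do {
            try body(line)
        } catch {
            // Equivalent of `wasCancelled == true`: close here and leave.
            file.close()
            throw error
        }
    }
    // Normal completion: close once, after the last line.
    file.close()
}

let file = FakeFile()
try? forEachLine(in: file) { line in
    if line == "b" { throw ConsumerStopped() }
}
print(file.closeCount) // 1
```

Either path closes the file exactly once, which is the property the unconditional finally block in the first example loses.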

I actually don’t think a shorthand is a good idea. If you don’t care about the branch in control-flow (i.e. you have no steps to perform to return to a valid state), it only costs you a single keyword and some empty braces, which is a pretty acceptable cost IMO:

yield? &x
vs
yield &x finally {}

At the same time, it makes things obvious for everybody reading or modifying your code. You don’t want them accidentally introducing undefined behaviour because they forgot to consider such a niche thing as throwing modifications!

Also, consider the general coroutine case: if x is a generator, we might want a syntax like yield? *x to mean “delegate to the generator x if it does not yield a nil value”. I mentioned upthread somewhere that JavaScript has this.

Dante-Broggi · June 28, 2020, 5:16pm

IIRC, my opinion back when this was last active was that a modify accessor, possibly unlike more general coroutines, should neither know nor care whether it was thrown through, as distinguished from completing normally.

Lantua · June 28, 2020, 5:25pm

I, for one, am still a proponent of the design in which yield always succeeds, to be in line with get-set storage.

AnotherUser · June 28, 2020, 5:53pm

Perhaps we should be more verbose about it, then.

yield &x catch {
  // ... Clean up for failure case
  // And an `error` parameter like try-catch
} finally {
  // This is always called.
}

Or maybe even extend the use of do-catch:

do {
  yield &x
} catch {
  // ... Clean up for failure case
} finally {
  // ... This is always called.
}

Or maybe:

do {
  yield &x
  // ... Clean up for success case
} catch {
  // ... Clean up for failure case
}

AnotherUser · June 28, 2020, 5:58pm

The biggest issue I see right now with the latter two options above is that it becomes less clear that our catch clause is supposed to return. A compiler error might fix that for yield's case, but that would confuse newer users when it comes to the try-catch case, which does not need to return.

@Lantua On reading your linked post, I must agree that it is far easier to reason about control flow from the yielder's perspective when I don't need to consider that yield might not continue to following statements, especially when considering Swift's heretofore consistent accessor model.

Karl · June 28, 2020, 7:31pm

See the original post (I appreciate it has been a long time) for why that isn't really viable. A generator should terminate if a user stops consuming from it, so it is necessary for yield in a generator to be a termination point for the function.

Having yield terminate in some contexts (generators) but not others (modify accessors) leads to a more complex language.

Rust's implementation of async/await exposes this cancellation problem, and it's an acknowledged pain point. This is what they say about it:

For something like yield &something finally { cleanup() }, I have reasonably high confidence that developers (including unknown third-party developers who worked on my open-source dependencies) will get it right when it comes to cancellation-tolerant code. They're forced to think about it, and if they weren't, I would be substantially less confident.

Given that, and the relatively low keystroke cost in cases where you don't need to do anything special to be cancellation-tolerant, and the catastrophic consequences when you get it wrong... I just think it's worth spelling it out.

Lantua · June 28, 2020, 8:50pm

We kinda already had this discussion, started somewhere around here:

So to avoid reiterating the discussion, I'll shut up for now (I don't have anything more to add than what I already said anyway).

To be clear, I meant to use a non-failable yield in modify, and a failable one in generators.

glessard · June 28, 2020, 9:11pm

That's not workable. That would prohibit the code calling your modify from throwing. We have to be able to write this:

try something(&myModifiableProperty)
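A minimal runnable version of that call in current Swift, to show what the caller expects. `something`, `Model`, and `ValidationError` are hypothetical stand-ins; the property here is a plain stored property rather than one backed by a modify accessor.

```swift
struct ValidationError: Error {}

// Hypothetical stand-in for `something`: mutates its argument, then fails.
func something(_ value: inout [Int]) throws {
    value.append(42)
    throw ValidationError()
}

struct Model {
    var myModifiableProperty: [Int] = []
}

var model = Model()
var sawError = false
do {
    try something(&model.myModifiableProperty)
} catch {
    sawError = true
}
print(sawError)                   // true
print(model.myModifiableProperty) // [42]
```

The caller both observes the error and sees the partial mutation, so whatever a modify accessor does after its yield has to cope with the access ending through a throw.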

Lantua · June 28, 2020, 9:24pm

Do you mean an error thrown while yielding (by the caller)? My idea is that yield would succeed, and the error would be sent to the caller once the modify block finishes.

The only downside I saw/recall was what @Ben_Cohen said, that this design isn't compatible with the current _modify.

orobio · June 29, 2020, 3:37pm

Looking at your example, I've been trying out several of the mentioned ideas, including finally and do { yield... } blocks, but something doesn't seem right with any of them. The fact that yield can terminate the function has a big impact on control flow, which makes it difficult to satisfy all use cases properly. So, I'm thinking: What if we make the control flow explicit?

Making the control flow explicit

By making the control flow explicit, we can make all use cases work naturally, while making it very clear to the developer when they have to think about termination. We can do this using the following:

Every coroutine gets an implicit variable, terminate, which is set to true whenever the coroutine should terminate.
Execution always continues normally after a yield. The yield will never terminate the coroutine. If something happens during the yield that should terminate the coroutine, like for example a throw, then the terminate variable will be set to true.
The compiler enforces that yield cannot be called if terminate may be true.
The compiler enforces that you cannot throw if terminate may be true.

The following examples hopefully make clear what this provides:

Modify

// Modify just works.
// Execution will always continue after yield, which makes the code natural:
modify {
  var value = getFromStorage()
  yield &value
  moveBackToStorage(value)
}

Generators

// A generator requires explicit control flow handling.
// The compiler will let you know:
func generate() {
  for index in self.indices {
    var value = getFromStorage(index)
    yield &value // error: yielding while terminate may be true
    moveBackToStorage(index, value)
  } 
}

// The generator can be fixed as follows:
func generate() {
  for index in self.indices {
    var value = getFromStorage(index)
    yield &value
    moveBackToStorage(index, value)
    if terminate { break }
  } // OK: Compiler can prove that yielding never happens while terminate == true
}

Additional clean-up and yields

// Additional code after the loop is no problem, if it follows the rules:
func generate() {
  for index in self.indices {
    var value = getFromStorage(index)
    yield &value
    moveBackToStorage(index, value)
    if terminate { break }
  }

  // We can do some additional clean-up here. As long as it doesn't
  // throw or yield, because terminate may be true at this point.
  someMoreCleanup()

  // Do we need another yield? That's fine, as long as
  // we check terminate first to keep the compiler happy:
  if terminate { return }

  var finalValue = something()
  yield &finalValue
  store(finalValue)
}

Advantages

This approach has the following advantages:

Being able to explicitly specify the control flow makes all use cases fit naturally.
Code that should be on the same level of indentation can be written that way. It never felt right to me that the code preparing the yield and the clean-up code were on different levels.
The syntax doesn't require you to specify any additional keywords and/or code blocks if you don't need to.
If you do need to think about coroutine termination, the compiler will tell you.
To get the behavior of the proposed yield, all you need is: if terminate { return }, directly after the yield.
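The explicit-terminate model can be approximated in current Swift, as a rough sketch only. Here a throwing closure stands in for the yield, a local `terminate` variable stands in for the implicit one, and `Buffer`, `writeBacks`, and `Stop` are hypothetical names. The compiler checks described above are not emulated.

```swift
struct Stop: Error {}

struct Buffer {
    var storage = [1, 2, 3]
    var writeBacks = 0

    // `body` stands in for the code that runs during the yield.
    mutating func generate(_ body: (inout Int) throws -> Void) throws {
        var terminate = false   // stand-in for the implicit variable
        var failure: Error?

        for index in storage.indices {
            var value = storage[index]      // getFromStorage(index)
            do {
                try body(&value)            // "yield &value": control always returns here
            } catch {
                terminate = true
                failure = error
            }
            storage[index] = value          // moveBackToStorage(index, value)
            writeBacks += 1
            if terminate { break }
        }

        // Additional clean-up could go here; then the error reaches the caller.
        if let failure = failure { throw failure }
    }
}

var buffer = Buffer()
do {
    try buffer.generate { value in
        value *= 10
        if value == 20 { throw Stop() }
    }
} catch {}
print(buffer.storage)    // [10, 20, 3]
print(buffer.writeBacks) // 2
```

The write-back for the element being modified when the error was thrown still happens, and the loop stops before touching the next element.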

AnotherUser · June 30, 2020, 12:19am

@orobio This definitely has the advantage of not shoehorning any particular syntax format to yield, making the call itself really simple in the case of modify, and fairly easy to reason about in more complex cases (generators, etc).

Lantua · July 5, 2020, 11:36pm

I'm thinking that in a multi-yield case like a generator, it's actually better to have yield be a condition. If the caller stops anticipating the yielded element, the condition will fail, and any subsequent yield will (fail and) immediately return. So it'd look like this:

guard yield &self[i] else { return }

if yield &self[i] { ... } else { ... }

while yield &self[i] { ... }

// warning: result from `yield` is unused.
yield &self[i]

This way, the code structure is left to the author. They now decide when the function should terminate, and have much more control on how to write cleanup code.

Admittedly, it could lead to wasteful cases, like not stopping after a yield fails. Though, given that an API usually yields in the middle of an invalid/transitional state, more natural control over termination should be given to the API author.

It also lets programmers think of yield as a normal function call, which is something they are already very good at.

To clarify, the multi-yield condition would be true unless the consumer stops anticipating the next value. This can happen when the caller breaks out of the for loop or throws an error. In both cases, the new value is still written back to the yielded variable.
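A rough approximation of the yield-as-condition shape in current Swift, assuming the consumer is a closure whose Bool result says whether it still wants more values. `forEachInPlace` is a hypothetical name, and a `false` result plays the role of a failed yield.

```swift
extension Array {
    // Hypothetical: `body` returning false stands in for `yield` evaluating to false.
    mutating func forEachInPlace(_ body: (inout Element) -> Bool) {
        for i in indices {
            // Corresponds to: guard yield &self[i] else { return }
            guard body(&self[i]) else { return }
        }
    }
}

var numbers = [1, 2, 3, 4]
numbers.forEachInPlace { n in
    n *= 2
    return n < 4   // stop anticipating values once an element reaches 4
}
print(numbers) // [2, 4, 3, 4]
```

The element being modified when the consumer stops is still written back, and the later elements are never visited.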

This is still in line with my earlier single-yield semantics, that yield should always succeed, since the caller always anticipates a single value to work with.

I also recognize what @Karl said, that having yield behave differently in different contexts would lead to a more complex language. But I could not begin to fathom how to reconcile the multi-yield context and the single-yield context without one giving way to the other to the point of being detrimental. Especially when they're already so different.