Modify Accessors

glessard · December 20, 2019, 11:34pm

As a regular user of defer, I think the early-exit behaviour proposed in the first post is superior to all the suggested alternatives thus far.

The following has been known since Swift 2:

defer always executes when the current scope ends
(see Statements — The Swift Programming Language (Swift 5.7))

The following is new, though perhaps imperfectly explained by the proposal:

yield does not guarantee that the scope's execution will continue after the coroutine has yielded.

It follows from those two things that defer is the place to do any mandatory cleanup.

None of the suggestions manages to be simpler than the proposal in that regard.
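
The first of those two points can be checked in today's Swift without any coroutine. The following is a minimal sketch; `scan` is a made-up name used only for illustration, and it shows `defer` running whether the loop iteration ends normally, through `continue`, or through `break`.

```swift
// Hypothetical helper: records the order in which code runs.
func scan(_ values: [Int]) -> [String] {
    var log: [String] = []
    for value in values {
        // Runs whenever this iteration's scope ends, however it ends.
        defer { log.append("defer \(value)") }
        if value < 0 { continue }   // early exit from the iteration
        if value == 0 { break }     // early exit from the whole loop
        log.append("body \(value)")
    }
    return log
}

let scanLog = scan([1, -1, 0, 2])
print(scanLog)
```

The element `2` is never visited because of the `break`, yet every iteration that started still ran its `defer`.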

David_Sweeris · December 21, 2019, 4:31am

I hadn’t noticed the read keyword.

Never mind, having things line up like that is an “all or nothing” proposition for me, so this makes me in favor of keeping it spelled modify (or at least this makes me think my earlier reasoning is invalid).

wadetregaskis · December 21, 2019, 7:03am

I agree that the proposed reliance on defer is unfortunate - code is easiest to understand when it appears in execution order. It might be the case that it is indeed the best option, but it’s well worth looking hard for a better one.

I like the general notion of explicit conditionality around the yield statement itself. There’s definitely a bimodality to this - either the yield succeeded and values are valid, or it didn’t and you need to clean up. Sounds like an if…else… in some form or another.

However, if yield returns a status relating to exceptions, that seems to preclude it from ever supporting input from the caller (a la the send method in Python, for example). I’m not sure if that’s a big deal - I can’t really say I’ve used that functionality much in Python or any other language, for example - but it warrants careful consideration if that approach is to be pursued. It seems like supporting send-like functionality is important to having generic coroutines.

More so, though, it means you either allow users of yield to do dubious things like _ = yield // YOLO, or conversely it brings in lots of special-casing and hand-slapping from the compiler.

I think @beccadax’s proposed use of guard is the most promising approach suggested thus far. It doesn’t preclude having yield return something else (the send functionality), e.g.:

guard let x = yield foo else { return }

…or, if you want to distinguish between nil and actual yield failure:

guard let x? = yield foo else { return }

…but it doesn’t require it since a bare guard yield foo is distinctly checking the success of the yield itself, not the yield’s return value.

It also doesn’t preclude having more content after the yield, outside of the guard clause, which is essential to supporting generators (which I value more than this ‘modify’ functionality; generators are the dog, this ‘modify’ stuff is just the tail). It’d still be clear - to anyone familiar with existing Swift - what code might be run when, as long as they understand what guard yield foo means.

I feel like putting a try anywhere in the mix doesn’t really help anything, though maybe it has merit for consistency, since really the yield is basically a function call. i.e. it’s synonymous with:

func fooFighter(_ yield: @autoclosure (inout value) throws -> ()) {
    // Prepare stuff.
    do {
        try yield(&value)
    } catch {
        // Abandon work.
    } finally {
        // Clean up stuff.
    }
}
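
The sketch above is not valid Swift as written (there is no `finally`, and `@autoclosure` cannot take parameters). A rough equivalent that compiles today might look like the following. It is only an emulation under stated assumptions: `Box`, `body` and `Interrupted` are made-up names, the closure parameter stands in for `yield`, and `defer` stands in for `finally`.

```swift
struct Interrupted: Error {}

struct Box {
    var value = 0
    var log: [String] = []

    // `body` plays the role of the code that runs while the coroutine is suspended.
    mutating func fooFighter(_ body: (inout Int) throws -> Void) {
        log.append("prepare")
        // Swift has no `finally`; `defer` runs when the function scope ends.
        defer { log.append("clean up") }
        do {
            try body(&value)
            log.append("finish")    // only reached if `body` returned normally
        } catch {
            log.append("abandon")   // only reached if `body` threw
        }
    }
}

var box = Box()
box.fooFighter { $0 += 1 }
box.fooFighter { _ in throw Interrupted() }
print(box.log)
```

Note that this closure-based form can only observe a thrown error. It has no way to model a caller that simply never resumes the coroutine.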

I do like the idea that you can choose, as implementor of the generator, whether you allow the yield to fail or not. Just like in any other code where it’s valuable to make that choice, for various reasons. Requiring all uses of yield to permit the yielded-to code to throw feels like a presumptuous over-reach; the goal is always to have code not be capable of throwing exceptions nor returning nil, wherever possible, since that makes life so much simpler.

glessard · December 21, 2019, 10:07am

You can't choose. The loop that's using your generator could contain a break, and then your coroutine won't resume after the yield that preceded said break.
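
A loose analogy in today's Swift, offered as a sketch rather than as the pitched coroutine semantics: with a hand-written iterator, any work meant to happen "after" handing out an element has to live in the next call to `next()`, and a `break` in the loop means that call never happens. `Recorder` and `Numbers` are made-up names.

```swift
final class Recorder {
    var log: [String] = []
}

struct Numbers: Sequence, IteratorProtocol {
    let recorder: Recorder
    var current = 0

    mutating func next() -> Int? {
        // Stand-in for "the code after the yield" of the previous element.
        if current > 0 { recorder.log.append("resumed after \(current)") }
        guard current < 3 else { return nil }
        current += 1
        return current
    }
}

let recorder = Recorder()
for n in Numbers(recorder: recorder) {
    if n == 2 { break }   // the iterator is never asked for another element
}
print(recorder.log)
```

The log records a resumption after element 1 but none after element 2, because the loop exited before `next()` was called again.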

glessard · December 21, 2019, 10:22am

You don't need to repeat yourself, and defer works just fine. Do cleanup on both paths like so:

// prepare the way

do {
  defer { myCleanupCode() }
  yield &myStorage
}

// compute some more
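
The ordering of that pattern can be tried out today by emulating the yield with a closure call. This is a sketch under that assumption, not the pitched feature; `Store`, `access`, `body` and `Failure` are made-up names.

```swift
struct Failure: Error {}

struct Store {
    var storage = 0
    var log: [String] = []

    // `body` is a stand-in for whatever the caller does during the yield.
    mutating func access(_ body: (inout Int) throws -> Void) rethrows {
        log.append("prepare")
        do {
            // Runs when this inner `do` scope ends, on either path.
            defer { log.append("cleanup") }
            try body(&storage)
        }
        // Reached only if `body` returned normally.
        log.append("compute some more")
    }
}

var store = Store()
store.access { $0 += 5 }
_ = try? store.access { _ in throw Failure() }
print(store.log)
```

Because the `defer` is scoped to the inner `do` block, the cleanup runs before the later computation on the normal path, and it still runs when the later computation is skipped.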

Jean-Daniel · December 21, 2019, 12:17pm

Can you elaborate? Why would it not work? defer was explicitly designed to handle such cases.

wadetregaskis · December 21, 2019, 4:21pm

It's true that's how yield (or synonyms) tend to work in other languages - does it have to be that way, though? Couldn't Swift say that the coroutine may require that it be resumed cleanly, without exception (pun slightly intended)? Presumably this could be enforced by the static type system, whether through 'throws' or its absence, or some similar mechanism specific to yield…

e.g.

func modifyA() yields -> inout T {
  // Prepare stuff.
  yield &_value  // Always returns & resumes.
  // Finish stuff.  This is _always_ run.
}

func modifyB() yields interruptably -> inout T {
  // Prepare stuff.
  guard yield &_value else {  // Compiler requires use of guard now,
                              // to handle interruptions.
    // Cleanup.
    return  // Or could throw in a variant of this method marked throws.
  }
  // Happy path.  Only run if control resumed cleanly.
}

modifyA() += 5 // Fine, no possibility of different control flow
               // short of process abort.

x = modifyA()
throw Error.Reasons
x += 5  // Compile-time error:  "x", returned by coroutine `modifyA`,
        //  may only be used before throwing or returning.

y = modifyB()
if maybe {
  return // Fine.  `modifyB` is implicitly resumed into the else
         // clause of the guard, before the `return` here takes effect.
}
y += 3

It's worth noting that a lot of this is only a concern with mutable values in combination with yields. I would expect that most code will yield immutable values, and wouldn't care how the caller resumes the coroutine; it's only critical to know when you have to know if the caller finished initialising your mutable value or not.

Using defer for this is similar to writing a distinct finalise method on the coroutine object itself (which is how Python makes you do it, for example). But, as in Python, that means you have an ugly hole in the language where you either can't because the object is implicit or can easily forget to because the compiler doesn't make you do it correctly.

An alternative would be to say that the code after yield is always run - it's implicitly 'deferred'. That doesn't solve the problem, however, of distinguishing between whether the yield actually worked as intended or not, which is critical for yields of mutable values.

I feel like trying to express this through the existing function return syntax is problematic, since if you want to say that the code yielded to may throw an exception, where do you put the throws?

func modify() rethrows yields -> throws inout T {  // Wat?
  do {
    try yield &_value
  }
}

Maybe the stuff relating to the yield should be in the parameters…

func modify(_ yield: @yield (inout T) throws -> ()) rethrows {
  do {
    try yield(&_value)
  }
}

Now it's at least clean, and the only magic here is this new @yield decorator, similar in some sense to @autoclosure, which affects how this function - now actually a coroutine - is called.

Or maybe modify the parameter declaration syntax slightly to make it simpler and more clearly special, removing the ability to externally name the 'yield' parameter since a caller can't address it explicitly anyway (and incidentally enforcing its name within the function, for consistency across all code):

func modify(@yield: (inout T) throws -> ()) rethrows {
  do {
    try yield(&_value)
  }
}

So in a slightly more complicated generator example:

func subscript(_ slice: Range<Int>, @yield: (T) throws -> ()) rethrows {
  for i in slice {
    do {
      try yield(self._values[i])
    }
  }
}

…

for value in myAboveType[3..<7] {
  // Process process process…
  // Oh noes!
  throw Error.Bewm
}

It's also a little more familiar and intuitive, as a result, as to what order defer blocks are handled in, if they're used amongst this - since now you're pretty explicitly making a call via the yield, at the site of the yield you would expect the callee's defers to run first, then return to you, and then you fall back further up the stack.

One up the stack happens to be the superset function of the one you called via yield, though, so the question would have to be answered: is it possible to have only some of the defers & catches in said function run first, and the rest after the coroutine's?

More concerningly, where does the try go on the call to the coroutine? The only place it could be put is on the call as normal, but the problem is that the actual exception might not be raised until code after that coroutine call has executed…

func subscript(_ slice: Range<Int>, @yield: (T) throws -> ()) throws {
  guard somePrecondition else {
    throw BadUser.NoSoupForYou
  }

  for i in slice {
    do {
      try yield(self._values[i])
    }
  }

  guard postCondition else {
    throw OhNoes.WhatNow
  }
}

…

for value in try myAboveType[3..<7] {  // Might throw here, if the precondition fails…
  throw Error.Bewm  // Might 'rethrow' here, because of my throw…
}
// Or might throw here, if the postcondition fails.

Even if we accept that exceptions can now appear from places beyond just the statement actually containing the try…

One possibility is to require the entire use of the coroutine be within a contiguous do block, and educate users about how coroutines can work regarding exceptions. I think that education would be acceptable, I'm just not sure this restriction would be practical for the compiler to enforce and not unreasonably limit the use of coroutines.

An alternative approach would be to restrict how coroutines may terminate - i.e. that they cannot throw after a yield. That seems pretty limiting, though - now you're really limited in how you can handle problems with caller modifications to the mutable values you might yield - either find some way to handle it gracefully & silently, or crash the entire app - no middle ground.

On the upside, at least this syntax leaves easy room for future functional expansion - i.e. the ability to send values back into the coroutine, or to have a return value distinct from the yielded values - in a way that would be source backwards-compatible.

glessard · December 21, 2019, 4:35pm

That’s the reason I think the much simpler proposed behaviour of yield is better than all the complexity people are weaving in order to avoid using defer, a feature that the language has had for several years.

Lantua · December 21, 2019, 4:55pm

defer is always possible, whether in the original syntax or in any of the proposed ones.
I don’t think declaring a new variable makes any sense during yield. So the code right above the yield should have all that you need.

You can already refer to a using x.

Lantua · December 21, 2019, 7:38pm

yield shouldn't be treated as doing something that may throw. It's still possible that yield will terminate the routine for a different reason, like a break out of the yielding for-loop (in the Alternatives Considered section), which does make sense to me.

I think the fact that yield may terminate will grow on users easily, especially since the <keyword> <statement> syntax already signifies non-regular control flow (this at least holds true thus far).

I see yield &something as similar to try doSomething(), and putting the cleanup code after a try statement is already a programming bug. So I don't see the sharp edge as worrisome (or even sharp, for that matter).

Still, I prefer that the yield semantics be tied to the set semantics for modify (Heck, I sound like a broken record at this point. I should stop...).

beccadax · December 21, 2019, 7:56pm

If we do require guard yield, I think defer would be a good way to handle cleanup on both paths. The fact that you see an else { return } would adequately indicate that defer should work.

Matt_Gallagher · December 22, 2019, 12:28am

I like the proposal.

I agree with @michelf, @mattpolzin and others that the issue in this proposal is not around modify directly but around yield and the related behaviors that need to be clarified.

The tl;dr from me is that I think the expectation is that, as much as possible, yield should behave similarly to try in a throws function. If we have that as a mental model, then yield might not seem so scary.

I disagree with some commenters that you could avoid the yield keyword by using closures. You can't have two yielding closures and they can early exit, making a bespoke syntax less confusing than a weirdly constrained closure parameter.

I think the proposal should include a clear indication of how yield will function outside of modify. My suggestions...

1. Yielding functions should have yields decorators

Throwing functions have throws decorators so yielding functions should have yields decorators.

extension Array: MutableGenerator {
  func generateMutably() yields Element {
    for i in 0..<count {
      yield &self[i]
    }
  }
}

This is a situation where the modify accessor is a weird place to start, since it will likely be an exception to this rule but other functions that want to call yield will need to declare the type of the parameter that they mutate.

@michelf showed yields on a closure parameter. I disagree that this is the correct approach, since yields is not a property of the parameter.

2. You should be able to handle yield in a do block

You can handle try in a do block, you should be able to handle yield in a do block.

extension Array: MutableGenerator {
  func generateMutably() yields Element {
    for i in 0..<count {
      do {
        yield &self[i]
      } earlyExit {
        // unlike `defer` this would be *only* on throw from coroutine
      }
    }
  }
}

The do/catch syntax is the best way to introduce people to try (even if we often use try? or other sugar instead) since it clearly shows what is happening. Similarly, this syntax would clearly show the need to reason about early exits with yield (and would give this scenario a clear name) but would typically be used only in situations where you have multiple calls to yield (e.g. an unwound loop) or handling of both thrown errors and earlyExits.

As mentioned by @anandabits the need to handle the failure as a special case is poorly handled in a defer.

3. It would be nice to use yield in a condition

NOTE: I've needed to edit my comment. Previously, I suggested using yield in arbitrary conditions like try? but I realize that's not really possible.

extension Array: MutableGenerator {
  func generateMutably() yields Element {
    for i in 0..<count {
      if yield &self[i] {
        // do something
      }
    }
  }
}

This is not valid because we must exit if yield fails. Similarly...

extension Array: MutableGenerator {
  func generateMutably() yields Element {
    for i in 0..<count {
      guard yield &self[i] else { continue }
    }
  }
}

This is not valid since we must exit the function after a yield, we can't simply continue the loop.

Perhaps it should be valid to use guard with a yield condition as long as the else block strictly uses return (not continue or break)

extension Array: MutableGenerator {
  func generateMutably() yields Element {
    for i in 0..<count {
      guard someOtherCondition, yield &self[i] else {
        // clean up
        return
      }
    }
  }
}

but that would be a new special-cased kind of guard, and the need for a new kind of guard is weird enough that I leave it to others to decide if it's worth the effort.

Lantua · December 22, 2019, 12:35am

I’d prefer to have it be the inout on the resulting type.

func foo() -> inout Int {
  yield &x
}

While I’m not sure how a multi-yield function would look, this seems intuitive enough for a single-yield function. It also makes sense to use the yielded value as an inout argument to other functions.

func bar(_: inout Int) { ... }

bar(&foo())

Chris_Lattner3 · December 22, 2019, 1:00am

This proposal is very exciting, I'm thrilled to see this coming together!!

That said, I share the concerns about giving magic behavior to defer. Thus far in Swift, defer {x} is exactly equivalent to executing x on all paths outside of the block. This seems inconsistent with that.

Did you consider adding a more specific form to yield to directly model this (unusual) case? You could support something like:

yield &x

as the last statement in the block to handle the common case, but require something like:

yield &tmp normalreturn {
  normal return code
} errorreturn {
  error return code
}

with suitably bikeshed'd contextual keywords for normalreturn and errorreturn. This approach would follow the general structure of guard and other control flow statements, and would allow the compiler/IDE to fixit "code after a yield" into the elaborated structure which encourages thinking about the error case.

Lantua · December 22, 2019, 1:12am

Isn’t that the behavior in the pitch? That defer is always executed, even if yield fails. That’s why it’s used for the clean up code?

Matt_Gallagher · December 22, 2019, 1:12am

Unless I'm misreading this proposal, it's not giving magic behavior (or any new behavior) to defer. However, defer would definitely be cumbersome if you needed to distinguish between success and failure cases.

var success = false
defer {
  if success {
    // cleanup after success
  } else {
    // cleanup after failure
  }
}
yield &x
success = true

Ergh.
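
For anyone who wants to poke at this flag pattern today, here is a runnable approximation. It is a sketch only: the closure call stands in for `yield &x`, and `Slot`, `modifyX`, `body` and `Failure` are made-up names.

```swift
struct Failure: Error {}

struct Slot {
    var x = 0
    var log: [String] = []

    mutating func modifyX(_ body: (inout Int) throws -> Void) rethrows {
        var success = false
        defer {
            // The flag is the only way the deferred block can tell the paths apart.
            log.append(success ? "cleanup after success" : "cleanup after failure")
        }
        try body(&x)     // stand-in for `yield &x`
        success = true   // skipped when control does not come back normally
    }
}

var slot = Slot()
slot.modifyX { $0 = 42 }
_ = try? slot.modifyX { _ in throw Failure() }
print(slot.log)
```

The `defer` has to be written before the point where the outcome is known, which is exactly why the extra mutable flag is needed.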

glessard · December 22, 2019, 1:59am

I think the pitch relies on the behaviour of defer that is defined in the manual: https://docs.swift.org/swift-book/ReferenceManual/Statements.html#grammar_defer-statement

In what way is it unexpectedly magical?

glessard · December 22, 2019, 2:05am

There hasn’t been a significantly better alternative suggested. The problem is that there are 3 cases: clean up with problem, clean up without problem, and unconditional cleanup.

Honestly, the unconditional case might just be the more common one.

wadetregaskis · December 22, 2019, 3:03am

Perhaps that’s the crux of the disagreement here - whether that last statement is true.

You’re quite right that there are three possibilities - and an innate desire to handle all three elegantly is what draws me to something involving explicit control flow - whether guard-based or do-based or whatever. I’m not specifically presuming the other two cases are common, and I’m happy to be pragmatic if someone can demonstrate they wouldn’t be, but I’m not comfortable presuming they aren’t.

For example, what about the case where the modify method is inserting a new value implicitly? It’s not correct for it to insert some arbitrary ‘dud’ value if the caller fails to fully initialise the value, any more than it’s valid to leave the raw memory in an undefined state. It might not be possible, either - there might be no viable ‘fallback’ or ‘default’ value. Surely that particular pattern alone won’t be uncommon?

It’s true, however, that defer is fine for the unconditional follow-up case - that is of course right in defer’s wheelhouse. And technically it would be viable to not handle the other two now, but add that later, in a syntactic sense.

However, the concern with not handling that elegantly from the start is particularly about letting people write broken code - that folks will naively but quite understandably just write stuff after the yield with the mistaken assumption it’ll always be run. It would be technically valid to allow that - and not even precluding future improvements - but it’d be disappointing. Why provide that footgun?

It also feels like a particularly odd omission given that Swift as a language is above-average in making control flow explicit. You can’t randomly get an exception - you know if it’s a possibility or not by the presence or absence of try. You can’t decline to choose what to do about it - you have to explicitly catch it or mark yourself with throws or at least use try! and be happy crashing. Making yield special in this regard feels like a needless regression; an unfortunate hole in the language. That defer can be made to work around it will be little consolation for all the developers’ lost time dealing with bugs around this.

So that’s why I want to see this all handled from the outset, as opposed to ignoring it all for now and living on hope. I realise that can seem frustrating to those that see the coroutine aspect of this pitch as incidental or an implementation detail. Really, though, this modify is just a very singular application of coroutines, so we really need to sort out that functionality before cementing specific applications atop it.

michelf · December 22, 2019, 3:14am

And about modify, maybe the unconditional case should be the only one allowed.

Currently, the setter is unconditionally called, so if we wanted to preserve the current semantics of a property, the modify accessor should have only one path where the value is unconditionally written at the end.
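
The current get/set behaviour described here can be observed with a small experiment. This is a sketch with made-up names (`Wrapper`, `mutateThenThrow`, `Failure`); it counts setter calls when the in-place mutation throws partway through.

```swift
struct Failure: Error {}

struct Wrapper {
    private var storage = 0
    private(set) var setterCalls = 0

    var value: Int {
        get { storage }
        set {
            storage = newValue
            setterCalls += 1
        }
    }
}

// Mutates its argument, then fails.
func mutateThenThrow(_ x: inout Int) throws {
    x += 1
    throw Failure()
}

var w = Wrapper()
do {
    try mutateThenThrow(&w.value)
} catch {
    // The error is expected; what matters is whether the setter ran.
}
print(w.value, w.setterCalls)
```

If the setter count is 1 and the stored value reflects the partial mutation, the write-back happened even though the mutation threw, which is the behaviour a single-path modify would preserve.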

Maybe the property contains logging information or other useful artefacts from the operation that has thrown the error. It's nice to avoid unnecessary work, but in general the accessor itself isn't well positioned to decide if the data should be saved or not in the presence of a thrown error.

I think simply allowing an error path that does something different in modify would make properties less reliable in general. Users would now have to think about the behavior of each property before determining whether it's suitable in case of a thrown error. In case of a chained access, all the properties in the chain are involved at the same time and they might not all behave in the same way, which would be confusing.

Properties are so fundamental to the language; we should keep them simple to reason about.