Modify Accessors

That is true, and irritating...but it does beat finding out at the call site after hours (or days?) of debugging that the modify needs to support error handling. In both cases if you can't modify the library you do have the fairly simple (and, again I admit irritating) workaround of avoiding modify by doing a get into a temporary.
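The workaround mentioned above can be sketched roughly as follows. This is only an illustration: `Counter`, `bump`, `BumpError`, and `bumpSafely` are hypothetical names, not part of the pitch. The property is read into a local temporary (a plain get), the temporary is mutated, and it is written back (a plain set) only if the mutation did not throw.

```swift
// Hypothetical type standing in for a library type whose accessors
// you cannot change.
struct Counter {
    var storage = 0
    var value: Int {
        get { storage }
        set { storage = newValue }
    }
}

enum BumpError: Error { case tooLarge }

// A throwing mutation that takes its target inout.
func bump(_ x: inout Int, by amount: Int) throws {
    x += amount
    if x > 10 { throw BumpError.tooLarge }
}

// Instead of `try bump(&counter.value, by: amount)`, which mutates
// through the property's accessors, go through a temporary.
func bumpSafely(_ counter: inout Counter, by amount: Int) -> Bool {
    var temp = counter.value            // explicit get into a temporary
    do {
        try bump(&temp, by: amount)     // mutate the local copy only
    } catch {
        return false                    // write-back skipped on failure
    }
    counter.value = temp                // explicit set on success
    return true
}
```

With this shape, a failed mutation leaves the property at its previous value, at the cost of a copy and some boilerplate at every call site.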

1 Like

Do we need this exception only when it's the last line? We could just allow the terse yield &x syntax which implicitly exits on any sort of break or throw by the continuation, but also allow a more ergonomic yield &x else { ... } (my personal preference) or if yield &x { ... } else { ... }/guard yield &x else { ... }. This is still perhaps somewhat sharp, since it is not apparent from the simple cases that yield cleanup requires some sort of special syntax, but guard-style or if-else style syntax is much more familiar to most users of the language than defer, so I think it would be an improvement over requiring defer for any sort of mandatory cleanup.

I think the clarity wins would make this cheap at twice the price of a guard ... else { return }. Especially if we have a last-line exception.

(In generators, the same rule would apply—if you use a yield in the middle of the function, you need to do something with the return value. I think this is fine.)

Clean is important, but clear is more important. The proposed control flow for yield is not clear. There is absolutely nothing in the present syntax, in other parts of Swift, or in experience from other languages that suggests that yield might never return control flow to straight-line code, but it would still run defer blocks.

Let's be clear about two points:

  1. yield is a pretty advanced feature. modify accessors are a niche use case. Generators are something a programming class might teach, but probably towards the end of the course. I doubt they'll be the bread and butter of development in the way that, say, throws functions are. This means that, outside of specific teams that use it heavily, almost everyone who uses or reads a yield statement will be a beginner who doesn't already understand every nuance of its behavior.

  2. yield is a keyword, and keywords are the most weakly documented part of Swift. That means it's not easy to look up the behavior and see what yield does. (We ought to improve our tooling and documentation to address this, but it's not clear when or how that will happen.) Even more so than usual, people will rely on Google to understand yield, and even if our own documentation discusses the abort path, the random "here's what a modify accessor does" and "here's what a generator does" articles on people's blogs probably won't mention it.

I doubt yield will be used often enough or have relevant documentation that's accessible enough for people to understand this behavior. That means the only way people will find out about this behavior is if the feature is designed to surface it.

And I don't think the fact that many of the use cases for cleanup code involve unsafe APIs gets us off the hook. On the contrary—it makes doing this right more vital. The combination of rarely used features with unsafe code raises the stakes. The fact that running with scissors is dangerous doesn't mean that, if we see someone doing it, we should be happy to throw a couple hurdles in front of them.

Adding a Bool return value or doing one of the other things I suggested makes yield's behavior visible. It prompts a user unfamiliar with yield to ask the question, "what is this return value and what should I do with it?" I think it's better for them to ask this question up front and learn that they don't need to do anything special than it is for them to not ask it until they've spent hours debugging some seemingly impossible control flow.

If clarity at the point of use is our goal, we should favor a design that makes the abort path visible and clear.
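To make the Bool idea concrete, here is a rough model using an ordinary closure call in place of `yield`, since a value-returning `yield` does not exist in Swift today. `Buffer`, `modifyCount`, `completed`, and `AccessError` are all hypothetical names chosen for this sketch; the local `completed` plays the role of the suggested return value.

```swift
enum AccessError: Error { case interrupted }

struct Buffer {
    var count = 0
    var events: [String] = []

    // Stand-in for a modify accessor: `body` plays the role of the
    // caller's code that runs while the value is yielded.
    mutating func modifyCount(_ body: (inout Int) throws -> Void) {
        var working = count

        // Models the hypothetical `let completed = yield &working`.
        var completed = true
        do {
            try body(&working)
        } catch {
            completed = false
        }

        // The abort path is visible in straight-line code.
        if completed {
            count = working
            events.append("committed")
        } else {
            events.append("aborted")
        }
    }
}
```

Calling `modifyCount { $0 += 1 }` commits the change, while a body that throws leaves `count` untouched and records `"aborted"`. The point is only that both outcomes appear in the accessor's source in execution order.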

16 Likes

Now that I've let modify sink in a little, here's my feedback on the pitched semantics, especially regarding the continuation/termination of the yield statement.

Whether yield continues execution should depend on whether set would be called under get-set semantics. That is, yield continues iff set would be called, and terminates otherwise.

This would let us avoid a combinatorial explosion should we introduce new non-trivial control flow, e.g. async/await with get-set vs. get-modify.

Furthermore, it'd allow for a modify-only accessor, something like this:

var foo: Value {
  modify {
    sharedStateComputation()

    someGetComputingX()
    defer { getCleaning() }

    yield &x

    // Not called on get-only usage, or if an exception is thrown before any mutation.
    someSet(newValue: x)
  }
}

I think this structure would be the most common even for get-modify semantics.

I think the discussion surrounding the error handling distracts us from the fact that yield shouldn't rely on the exception. Even if an exception is thrown, if the value is mutated and should be registered into the accessor, the set is still called. So the exception handling is orthogonal to get-set semantic, and so should get-modify.

To add, in the status quo, set is called even when the function throws:

struct Test {
    var _value: Int
    var value: Int {
        get {
            print("Getting")
            return _value
        }
        set {
            print("Setting to \(newValue)")
            _value = newValue
        }
    }
}
enum TestError: Error { case err }
func mutatingTest(a: inout Int) throws {
    a = 2
    throw TestError.err
}
extension Int {
    mutating func mutatingTest() throws {
        self = 1
        throw TestError.err
    }
}

var test = Test(_value: 3)

try? mutatingTest(a: &test.value)
// Getting
// Setting to 2

print("Current value", test.value) // 2

print()

try? test.value.mutatingTest()
// Getting
// Setting to 1

print("Current value", test.value) // 1

Note that throwing accessors are one important reason why I feel that design approaches that center around disallowing normal code after yield are non-starters. All of those ideas require the cleanup code to not throw, but presumably we do want to allow accesses that are ending normally (i.e. there hasn't already been an error thrown in the caller that aborted the access) to do things that might throw, like call a throwing setter.

For what it's worth, there are deeper semantic questions that we'll eventually need to settle about whether setters should be guaranteed to be run if an access is aborted.

4 Likes

But isn't it the status quo to call set on inout variables even if the function throws? Is there merit in changing that, or is it simply that it's not documented?

That is the status quo, yes. However, it's semantically problematic to call the setter if it can throw, because now you might have two different errors but you can only throw one thing. Also, it's not at all obvious that we should call the setter when the access is aborted with an error; we don't have any real reason to think that the new value is "ready", and doing nothing seems like the more sensible default behavior.

7 Likes

I see.

Given how messy it is to add throwable accessors, and that the current semantics may change, it's all the more reason to tie yield continuation/termination to get-set rather than let it wander on its own :grin:.

1 Like

As a regular user of defer, I think the early-exit behaviour proposed in the first post is superior to all the suggested alternatives thus far.

The following has been known since Swift 2:

  • defer blocks run when their enclosing scope exits, however that exit happens.

The following is new, though perhaps imperfectly explained by the proposal:

  • yield does not guarantee that the scope's execution will continue after the coroutine has yielded.

It follows from those two things that defer is the place to do any mandatory cleanup.
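As a small illustration of the existing `defer` behaviour this argument rests on (not of `yield` itself), the sketch below records the order of events when a scope exits normally and when it exits by throwing. `run(throwing:)` and `ExitError` are names invented for this example.

```swift
enum ExitError: Error { case failed }

// Returns the sequence of events observed for one run of `scope`.
func run(throwing: Bool) -> [String] {
    var log: [String] = []

    func scope() throws {
        log.append("prepare")
        defer { log.append("cleanup") }   // runs on every exit from scope()
        if throwing { throw ExitError.failed }
        log.append("finish")              // skipped when the scope is abandoned
    }

    do {
        try scope()
    } catch {
        log.append("caught")
    }
    return log
}
```

`run(throwing: false)` gives `["prepare", "finish", "cleanup"]` and `run(throwing: true)` gives `["prepare", "cleanup", "caught"]`: the cleanup runs either way, while the straight-line code after the throwing point does not.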

None of the suggestions manages to be simpler than the proposal in that regard.

11 Likes

I hadn’t noticed the read keyword.

Never mind, having things line up like that is an “all or nothing” proposition for me, so this makes me in favor of keeping it spelled modify (or at least this makes me think my earlier reasoning is invalid).

1 Like

I agree that the proposed reliance on defer is unfortunate - code is easiest to understand when it appears in execution order. It might be the case that it is indeed the best option, but it’s well worth looking hard for a better one.

I like the general notion of explicit conditionality around the yield statement itself. There’s definitely a bimodality to this - either the yield succeeded and values are valid, or it didn’t and you need to clean up. Sounds like an if…else… in some form or another.

However, if yield returns a status relating to exceptions, that seems to preclude it from ever supporting input from the caller (a la the send method in Python, for example). I’m not sure if that’s a big deal - I can’t really say I’ve used that functionality much in Python or any other language, for example - but it warrants careful consideration if that approach is to be pursued. It seems like supporting send-like functionality is important to having generic coroutines.

More so, though, it means you either allow users of yield to do dubious things like _ = yield // YOLO, or conversely it brings in lots of special-casing and hand-slapping from the compiler.

I think @brentdax’s proposed use of guard is the most promising approach suggested thus far. It doesn’t preclude having yield return something else (the send functionality), e.g.:

guard let x = yield foo else { return }

…or, if you want to distinguish between nil and actual yield failure:

guard let x? = yield foo else { return }

…but it doesn’t require it since a bare guard yield foo is distinctly checking the success of the yield itself, not the yield’s return value.

It also doesn’t preclude having more content after the yield, outside of the guard clause, which is essential to supporting generators (which I value more than this ‘modify’ functionality; generators are the dog, this ‘modify’ stuff is just the tail). It’d still be clear - to anyone familiar with existing Swift - what code might be run when, as long as they understand what guard yield foo means.

I feel like putting a try anywhere in the mix doesn’t really help anything, though maybe it has merit for consistency, since really the yield is basically a function call, i.e. it’s synonymous with:

func fooFighter(_ yield: @autoclosure (inout value) throws -> ()) {
    // Prepare stuff.
    do {
        try yield(&value)
    } catch {
        // Abandon work.
    } finally {
        // Clean up stuff.
    }
}

I do like the idea that you can choose, as implementor of the generator, whether you allow the yield to fail or not. Just like in any other code where it’s valuable to make that choice, for various reasons. Requiring all uses of yield to permit the yielded-to code to throw feels like a presumptuous over-reach; the goal is always to have code not be capable of throwing exceptions nor returning nil, wherever possible, since that makes life so much simpler.

1 Like

You can't choose. The loop that's using your generator could contain a break, and then your coroutine won't resume after the yield that preceded said break.
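The same effect can already be observed with today's iterators, which gives a rough analogy (an iterator is not a coroutine, so treat this only as a sketch). `Trace` and `makeGenerator` are hypothetical names; the closure passed to `AnyIterator` stands in for the generator body, and `ranToEnd` marks the code that would follow the last yield.

```swift
// Records what the "generator" actually got to do.
final class Trace {
    var yielded = 0
    var ranToEnd = false
}

// Produces 1, 2, 3 and then runs its final step.
func makeGenerator(_ trace: Trace) -> AnyIterator<Int> {
    AnyIterator {
        if trace.yielded < 3 {
            trace.yielded += 1
            return trace.yielded      // analogous to `yield`
        }
        trace.ranToEnd = true         // analogous to code after the last yield
        return nil
    }
}

// The consumer decides when to stop, not the generator.
func consume(_ trace: Trace, stopAt limit: Int?) {
    for value in makeGenerator(trace) {
        if value == limit { break }
    }
}
```

After `consume(trace, stopAt: 2)`, `trace.yielded` is 2 and `trace.ranToEnd` is still false: the producer was simply never resumed. With `stopAt: nil` the loop runs to completion and `ranToEnd` becomes true.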

I really like @brentdax's idea too; guard ... else and yield ... else are the best ideas IMO. They either build on existing and common behavior with clear paths or, in the case of yield ... else, the similarity to guard makes the whole thing very clear.

Despite all that, what about cleanup that has to be done on both paths? Writing it twice would be cumbersome and a source of bugs, and using defer would not work either.

You don't need to repeat yourself, and defer works just fine. Do cleanup on both paths like so:

// prepare the way

do {
  defer { myCleanupCode() }
  yield &myStorage
}

// compute some more

1 Like

Can you elaborate? Why would it not work? defer was explicitly designed to handle such cases.

1 Like

It's true that's how yield (or synonyms) tend to work in other languages, but does it have to be that way? Couldn't Swift say that the coroutine may require that it be resumed cleanly, without exception (pun slightly intended)? Presumably this could be enforced by the static type system, whether through 'throws' or its absence, or some similar mechanism specific to yield…

e.g.

func modifyA() yields -> inout T {
  // Prepare stuff.
  yield &_value  // Always returns & resumes.
  // Finish stuff.  This is _always_ run.
}

func modifyB() yields interruptably -> inout T {
  // Prepare stuff.
  guard yield &_value else {  // Compiler requires use of guard now,
                              // to handle interruptions.
    // Cleanup.
    return  // Or could throw in a variant of this method marked throws.
  }
  // Happy path.  Only run if control resumed cleanly.
}

modifyA() += 5 // Fine, no possibility of different control flow
               // short of process abort.

x = modifyA()
throw Error.Reasons
x += 5  // Compile-time error:  "x", returned by coroutine `modifyA`,
        //  may only be used before throwing or returning.

y = modifyB()
if maybe {
  return // Fine.  `modifyB` is implicitly resumed into the else
         // clause of the guard, before the `return` here takes effect.
}
y += 3

It's worth noting that a lot of this is only a concern with mutable values in combination with yields. I would expect that most code will yield immutable values, and wouldn't care how the caller resumes the coroutine; it's only critical to know when you have to know if the caller finished initialising your mutable value or not.

Using defer for this is similar to writing a distinct finalise method on the coroutine object itself (which is how Python makes you do it, for example). But, as in Python, that means you have an ugly hole in the language where you either can't because the object is implicit or can easily forget to because the compiler doesn't make you do it correctly.

An alternative would be to say that the code after yield is always run, i.e. it's implicitly 'deferred'. That doesn't solve the problem, however, of distinguishing between whether the yield actually worked as intended or not, which is critical for yields of mutable values.

I feel like trying to express this through the existing function return syntax is problematic, since if you want to say that the code yielded to may throw an exception, where do you put the throws?

func modify() rethrows yields -> throws inout T {  // Wat?
  do {
    try yield &_value
  }
}

Maybe the stuff relating to the yield should be in the parameters…

func modify(_ yield: @yield (inout T) throws -> ()) rethrows {
  do {
    try yield(&_value)
  }
}

Now it's at least clean, and the only magic here is this new @yield decorator, similar in some sense to @autoclosure, which affects how this function - now actually a coroutine - is called.

Or maybe modify the parameter declaration syntax slightly to make it simpler and more clearly special, removing the ability to externally name the 'yield' parameter since a caller can't address it explicitly anyway (and incidentally enforcing its name within the function, for consistency across all code):

func modify(@yield: (inout T) throws -> ()) rethrows {
  do {
    try yield(&_value)
  }
}

So in a slightly more complicated generator example:

func subscript(_ slice: Range<Int>, @yield: (T) throws -> ()) rethrows {
  for i in slice {
    do {
      try yield(self._values[i])
    }
  }
}

…

for value in myAboveType[3..<7] {
  // Process process process…
  // Oh noes!
  throw Error.Bewm
}

It's also a little more familiar and intuitive, as a result, as to what order defer blocks are handled in, if they're used amongst this - since now you're pretty explicitly making a call via the yield, at the site of the yield you would expect the callee's defers to run first, then return to you, and then you fall back further up the stack.
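The ordering described here can be checked with plain closures in current Swift. In this sketch, `generator` stands in for the coroutine and its `body` parameter for the hypothetical `@yield` parameter; `LoopError` and `deferOrder` are likewise invented names.

```swift
enum LoopError: Error { case bewm }

// Returns the order in which defers and the catch run when the
// yielded-to code throws.
func deferOrder() -> [String] {
    var log: [String] = []

    // Stand-in for the coroutine; `body` models the yield parameter.
    func generator(_ body: (Int) throws -> Void) rethrows {
        defer { log.append("generator defer") }
        try body(1)
        log.append("generator after yield")   // skipped when body throws
    }

    do {
        try generator { _ in
            defer { log.append("body defer") }
            throw LoopError.bewm
        }
    } catch {
        log.append("caller catch")
    }
    return log
}
```

The result is `["body defer", "generator defer", "caller catch"]`: the yielded-to code unwinds first, then the generator, then the enclosing caller, which is the stack-like order the closure-based spelling makes explicit.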

One up the stack happens to be the superset function of the one you called via yield, though, so the question would have to be answered: is it possible to have only some of the defers & catches in said function run first, and the rest after the coroutine's?

More concerningly, where does the try go on the call to the coroutine? The only place it could be put is on the call as normal, but the problem is that the actual exception might not be raised until code after that coroutine call has executed…

func subscript(_ slice: Range<Int>, @yield: (T) throws -> ()) throws {
  guard somePrecondition else {
    throw BadUser.NoSoupForYou
  }

  for i in slice {
    do {
      try yield(self._values[i])
    }
  }

  guard postCondition else {
    throw OhNoes.WhatNow
  }
}

…

for value in try myAboveType[3..<7] {  // Might throw here, if the precondition fails…
  throw Error.Bewm  // Might 'rethrow' here, because of my throw…
}
// Or might throw here, if the postcondition fails.

Even if we accept that exceptions can now appear from places beyond just the statement actually containing the try

One possibility is to require the entire use of the coroutine be within a contiguous do block, and educate users about how coroutines can work regarding exceptions. I think that education would be acceptable, I'm just not sure this restriction would be practical for the compiler to enforce and not unreasonably limit the use of coroutines.

An alternative approach would be to restrict how coroutines may terminate - i.e. that they cannot throw after a yield. That seems pretty limiting, though - now you're really limited in how you can handle problems with caller modifications to the mutable values you might yield - either find some way to handle it gracefully & silently, or crash the entire app - no middle ground.

On the upside, at least this syntax leaves easy room for future functional expansion - i.e. the ability to send values back into the coroutine, or to have a return value distinct from the yielded values - in a way that would be source backwards-compatible.

1 Like

In the guard ... else or yield ... else syntax, defer is not always available to use, which is a point against it:

modify {
  // Defer-ing here may not be possible
  // depending on when your variables become available
  // Like `a` below, if this syntax is chosen
  yield let a = &x else {
    // Cleanup code here
  }
  // And here
}

That problem does not come up in the original proposal, though I still prefer the yield ... else syntax for error handling.

That’s the reason I think the much simpler proposed behaviour of yield is better than all the complexity people are weaving in order to avoid using defer, a feature that the language has had for several years.

7 Likes

defer is always possible, whether in the original syntax or in any of the proposed ones.
I don’t think declaring a new variable during yield makes any sense, so the code right above the yield should have all that you need.

You can already refer to a using x.

1 Like