Modify Accessors

dabrahams · August 21, 2020, 11:49pm

I know this is an old thread, but I've realized something as I've worked with _modify that shouldn't be overlooked when this is finally proposed. Sometimes you need to extend the lifetime of a value across a yield. Because you can't yield from a closure inside _modify, the library-defined withExtendedLifetime is now inadequate and we need something like _fixLifetime to be exposed publicly.

michelf · August 22, 2020, 12:23am

Having done some work with _modify too, I can attest that not being able to use yield from within a closure is inconvenient at times. I generally work around that by adding dummy fileprivate subscripts in extensions on other types and calling those subscripts which themselves yield whatever I need. It's a bit messy, but it works.
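A minimal sketch of what such a workaround might look like, assuming a toolchain that accepts the underscored _modify accessor. The Registry type, the buckets property, and the subscript(bucket:) helper are hypothetical names invented for illustration and are not taken from the post. The yield that cannot live inside a closure is moved into a fileprivate subscript on the other type:

```swift
// Hypothetical helper: a fileprivate subscript on Dictionary that does the
// "take out, yield, put back" dance a closure-based API could not do,
// because a closure body cannot contain `yield`.
extension Dictionary where Key == String, Value == [Int] {
    fileprivate subscript(bucket name: String) -> [Int] {
        get { self[name, default: []] }
        _modify {
            // Move the bucket out so it can be mutated in place.
            var bucket = removeValue(forKey: name) ?? []
            // Put it back when the access ends, on any exit path.
            defer { self[name] = bucket }
            yield &bucket
        }
    }
}

struct Registry {
    var buckets: [String: [Int]] = [:]

    subscript(name: String) -> [Int] {
        get { buckets[name] ?? [] }
        _modify {
            // The yield is routed through the helper subscript above.
            yield &buckets[bucket: name]
        }
    }
}

var registry = Registry()
registry["a"].append(1)
registry["a"].append(2)
```

The helper exists only to host the yield, which is the "a bit messy" part the post mentions.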

Regarding withExtendedLifetime: I don't think you actually need to do the work within the closure. Calling it after the last point of use is basically the same thing as _fixLifetime:

_modify {
   yield &foo.bar
   withExtendedLifetime(foo) {}
}
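For context, here is a self-contained sketch of the snippet above. Storage and Wrapper are hypothetical types, and the underscored _modify spelling is assumed to be available in the toolchain:

```swift
// Hypothetical reference type that owns the storage being yielded.
final class Storage {
    var bar = 0
}

struct Wrapper {
    var foo = Storage()

    var bar: Int {
        get { foo.bar }
        _modify {
            yield &foo.bar
            // Empty-closure call placed after the last use of `foo`,
            // used here in the role _fixLifetime would play.
            withExtendedLifetime(foo) {}
        }
    }
}

var wrapper = Wrapper()
wrapper.bar += 5
```

The empty closure is the whole point: no work happens inside it, so nothing needs to yield from within a closure.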

BigSur · August 22, 2020, 12:30pm

Maybe Swift needs something equivalent to unsafe/fixed in C#.

dabrahams · August 25, 2020, 11:41pm

Regarding withExtendedLifetime: I don't think you actually need to do the work within the closure.

Good point. On the other hand, withExtendedLifetime(x) {} optimizes worse than _fixLifetime(x) (see [SR-13450], issue #55892 in apple/swift on GitHub), so you might want to use _fixLifetime anyway.

Joe_Groff · August 25, 2020, 11:50pm

It would be interesting to take the concept of yield-once coroutines from accessors, and make it into a general mechanism for scoped modifiers like using or fixed in C#, yeah. Modeling things like withExtendedLifetime, withUnsafePointer, and friends as coroutines instead of closure-taking higher-order functions would allow them to compose with _modify coroutines, as well as async functions in the future.

dabrahams · August 28, 2020, 12:45am

I'd be interested to hear more.

While we're thinking about how to better integrate these with the rest of the language, I'll note that yield-once accessors don't really compose with higher-level functions. I was trying to apply the “non-capturing closure” method from this gist, and realized there is no way to create a _modify accessor that routes through a closure to an underlying _modify accessor.

dabrahams · October 2, 2020, 5:30am

Reviving this thread because I just noticed this part of the proposal.

This is inconsistent with the stated rationale for try, which was that it is important to mark all the places that can throw with try.*

In fact, I now realize I've written a number of accessors that are broken in the face of a throwing RHS context due to the behavior proposed. You might say the whole point of modify (and especially read) is that the author gets to inject some cleanup code after the yield. But as an author, nothing forces you to confront the fact that the code after yield might be skipped, so it's really easy to overlook. Given that:

error-handling paths are rarely tested in practice, and
every other function an error can propagate from has to be explicitly labeled [re]throws, so you get used to being protected from this kind of mistake and will be unwary,

ouch

I came to this post by way of wondering, “hmm, I wonder what happens when an error propagates through a yield?” and before I found this part of the pitch, I did some experimentation. My first instinct was that the code after yield would be run unconditionally, since it's invariably going to be cleanup code, but turns out I was wrong. “Okay, somebody probably figured you might want to do something different when there's no error,” I thought. So I tried putting a do { … } catch { … } around the yield. But the compiler doesn't accept that.

Eventually I hit on using defer, which obviously has to work… but if you did want different behavior when an error propagates, forces you into this:

modify {
  var errorOccurred = true // I have to lie, at least temporarily
  defer { 
    if errorOccurred {
      doSomething()
    }
    else {
      doSomethingDifferent()
    }
  }
  yield &whatever
  errorOccurred = false
}
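The behavior described above can be observed with a small sketch, assuming the underscored _modify accessor of current toolchains. Failure, Log, Box, and the two helper functions are hypothetical names introduced only for this illustration:

```swift
struct Failure: Error {}

// Reference-type log so the accessor can record events without mutating self.
final class Log { var entries: [String] = [] }

struct Box {
    let log = Log()
    var value = 0

    var tracked: Int {
        get { value }
        _modify {
            var unwinding = true // pessimistic until the yield returns normally
            defer {
                log.entries.append(unwinding ? "error path" : "normal path")
            }
            yield &value
            // Only reached when the caller's access ends without an error.
            unwinding = false
            log.entries.append("after yield")
        }
    }
}

func increment(_ x: inout Int) { x += 1 }

func incrementThenThrow(_ x: inout Int) throws {
    x += 1
    throw Failure()
}

var box = Box()
increment(&box.tracked)               // normal exit from the access
try? incrementThenThrow(&box.tracked) // error propagates through the yield
```

On the throwing access, the statements after yield should be skipped while the defer block still runs, so "after yield" is expected in the log only once.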

This is clearly awful. We have a standard way to get different behavior depending on whether an error propagates: it's called do { … } catch { … }. I shouldn't have to use a completely different, contorted idiom to do the same thing just because I'm writing an accessor. What we've created is a context in which the Swift “rules of the road” for error handling are suddenly, and subtly, very different.

(Sorry for not reading the rest of this long thread before posting this suggestion; others have better ideas)

The most surgical fix I can think of for this problem would be to say that if the accessor contains non-defer-enclosed code that can execute after yield, the yield must be marked with try. The diagnostic could be something like, “statement will not be reached when an error is thrown from yield. Please mark the yield with try or move the logic into a defer block before the yield to have it execute unconditionally” (plus fixits, of course).

* I happen to know that it isn't important, and in fact only a very few kinds of places can benefit from such marking. I don't think it's too late to mitigate that design mistake, but that's a topic for a different thread.

orobio · October 2, 2020, 7:15am

It was also discussed that it might be better to not encourage this, because always running the same clean-up code, regardless of the access terminating normally or due to an error, is more consistent with the current get/set semantics.

The issue with this is that try is very much associated with error handling. If yield is used in a generator, then it can terminate due to a number of reasons, just one of them being a thrown error.

It was not my first instinct, but I eventually came to the conclusion that this provides the most natural control flow for the common case, while preserving consistency with get/set semantics.

xwu · October 2, 2020, 11:42am

I hate to repeat myself, but I really do think some form of the following speaks to your concern some 250 posts earlier:

Lantua · October 2, 2020, 12:44pm

This is why I've been pushing for this:

Karl · October 2, 2020, 1:18pm

Yes, this is why (as one of the Rust articles said), one of the biggest issues is confidence that all of your dependencies are handling termination correctly.

Most languages seem to have a design for yield which treats termination almost as an afterthought. I have reservations about that and thought they would dissipate over time, but they haven't. I still think we should break from the precedent other languages set and make termination handling explicit.

That being said, I would ask you to please tone down the phrase "absolutely insane". I've been called out by the admins for using language which is less harsh than that.

sveinhal · October 2, 2020, 1:59pm

I realize that this is completely beside the point, but you could call your variable yieldedSuccessfully, invert the flag and not have to lie.

dabrahams · October 2, 2020, 2:31pm

I'm not sure that's a problem, because you could say try is associated with unwinding. Right now we only have one reason for unwinding (an error was thrown), so they're indistinguishable.

That said, your idea of making control flow explicit is very appealing. The reservations I have are that

terminate is poorly named and a bit too much magic to hide behind an ordinary variable name. I'd rather have it be a value that can be extracted somehow from the result of yield, e.g. let x = yield &y. Perhaps yield always has an @discardableResult.
It still leaves us with a contextually distinct idiom for handling unwinding. This difference isn't called out by any notation on the function body. In order to know how to read your generate() function, for example, I have to realize there's going to be a yield in it somewhere.

Indeed it does; sorry I hadn't seen it. @orobio's idea has the advantage that it acknowledges an inversion of the usual common-case behavior in yielding contexts: you're more likely to want code following the yield to execute when unwinding rather than being skipped. But I'm not sure convenience/cleanliness is worth the cost in inconsistency.

You're right. I'm sorry, I guess I let my frustration get the better of me there. Revised.

Ben_Cohen · October 2, 2020, 3:18pm

Thanks Karl. Yes, as a reminder to everyone, please both keep your interactions proportional when you disagree with an approach/design, and in particular, try to avoid ableist or otherwise uninclusive language.

dabrahams · October 2, 2020, 8:05pm

It's still a lie; we haven't yielded yet.

michelf · October 2, 2020, 10:55pm

Couldn't we just improve defer to make all this more convenient? Just like D's scope guard, we could conditionalize execution of a defer block on either success or failure. Translated to Swift, it'd be:

defer { print("Always printed on exit") }
defer(failure) { print("Exiting because something is being thrown") }
defer(success) { print("Normal exit") }

// then we throw something to exit the scope
throw SomeError()

// prints:
// Exiting because something is being thrown
// Always printed on exit
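For comparison, here is a sketch of how the same three behaviors can be approximated in an ordinary throwing function today, using defer together with do/catch. The run(failing:) helper is hypothetical and records events in an array instead of printing; this idiom does not carry over to a yield, since the compiler reportedly rejects do/catch around it:

```swift
struct SomeError: Error {}

// Returns the events recorded while the inner scope exits.
func run(failing: Bool) -> [String] {
    var log: [String] = []

    func scope() throws {
        defer { log.append("always") }  // plays the role of plain defer
        do {
            if failing { throw SomeError() }
        } catch {
            log.append("failure")       // plays the role of defer(failure)
            throw error
        }
        log.append("success")           // plays the role of defer(success)
    }

    try? scope()
    return log
}

let failureLog = run(failing: true)
let successLog = run(failing: false)
```

The failure case records "failure" before "always", matching the order in the proposed defer(failure) example.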

orobio · October 3, 2020, 12:39pm

I'm aware that there is some resistance against magic variables, but in this case I think it might be worth introducing one. For one, since terminate is treated specially by the compiler, it makes sense to me to make it implicit. Secondly, if the value is obtained from the yield, it seems too easy to write code like if yield &x { ... } else { ... }, which I think should be the exception.

I do agree though that magic variables are not great, especially since they are indistinguishable from normal variables. Perhaps something like $terminate would make it more clear that there is some magic going on. Regarding the name itself, I'm sure a round of bike shedding will come up with something better.

I didn't include it in the example, but I do expect that the function signature will have to distinguish between regular functions and coroutines. For a modify accessor it's inherent, but for a regular coroutine it could be for example:

func generate() yields { ... }

dabrahams · October 4, 2020, 12:53am

Those changes to defer might improve defer, but I don't think we can just change defer because we'll still have the problem that there's implicit control flow (exiting the function whenever an error propagates through yield).

John_McCall · October 4, 2020, 1:56am

I agree on both counts:

It would be nice to improve defer to make it easier to do something just on the error path; in fact I think this was in the original error-handling proposal and just never got fully designed, which would just mean picking syntax for it and then putting it through Evolution.
The implicit unwind edge out of yield is a problem, and I don't think it's reasonable to expect programmers to proactively remember it, and so something like unconditionally requiring try yield is probably the best choice. If we come up with a reasonable way to allow try to be omitted on ordinary expressions, well, it should apply equally here.

dabrahams · October 4, 2020, 2:13am

But it needn't be magic from the user's point of view. Semantically, it could just be a (thread-local) get-only property in the standard library, which is set by the runtime when an error is propagating (cf. std::uncaught_exceptions in C++). The compiler could of course optimize access to this variable by whatever magic means it likes without creating magic in the user model. The only disadvantage(?) this approach has compared to what you've suggested is that the value is accessible even when there's no yield.

Secondly, if the value is obtained from the yield, it seems too easy to write code like if yield &x { ... } else { ... }, which I think should be the exception.

I guess I don't see any advantage in making that harder to write. That said, and to provide a little more context: in a more general system for inversion of control, we'll want yield to be able to receive a value from the code to which it has yielded (IMO yielding an inout Optional is not a good answer to that need), and therefore I wouldn't want to “claim” the result-of-yield position for a simple Bool that says whether you're unwinding. It would probably be a property of YieldResult<T> or something. Since this would complicate getting the T out too, it might be a bad idea for that reason.

dabrahams:

It still leaves us with a contextually distinct idiom for handling unwinding. This difference isn't called out by any notation on the function body. In order to know how to read your generate() function, for example, I have to realize there's going to be a yield in it somewhere.

I didn't include it in the example, but I do expect that the function signature will have to distinguish between regular functions and coroutines. For a modify accessor it's inherent, but for a regular coroutine it could be for example:
func generate() yields { ... }

Yeah, that's very nice. I think, though, that I really also want to see yields on modify and read accessors to clarify the difference in the rules of the road, especially since they can be mixed with get and set clauses.

That does make the simplest cases absurdly repetitive:

var innerThing: Q {
  get { outerThing.part }
  set { outerThing.part = newValue }
  modify yields { yield &outerThing.part }
}

But the cure might be to extend “implicit return” to “implicit yield”:

var innerThing: Q {
  get { outerThing.part }
  set { outerThing.part = newValue }
  modify yields { &outerThing.part } // no "yield"
}
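Neither the yields marker nor implicit yield exists in shipping compilers as far as this thread shows, so both snippets above are hypothetical syntax. A sketch of the same forwarding property that should compile today, assuming the underscored _modify spelling and hypothetical Outer and Holder types with Q taken to be [Int]:

```swift
struct Outer {
    var part: [Int] = [1, 2, 3]
}

struct Holder {
    var outerThing = Outer()

    var innerThing: [Int] {
        get { outerThing.part }
        set { outerThing.part = newValue }
        // In-place access: forwards to the storage without copying it out.
        _modify { yield &outerThing.part }
    }
}

var holder = Holder()
holder.innerThing.append(4)   // expected to go through _modify
holder.innerThing = [9]       // expected to go through set
```

Even in this form, all three accessor bodies name outerThing.part, which is the repetition the post is pointing at.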