A case study for `reasync`

ZPedro · June 2, 2023, 5:21pm

Well, fellow posters have already put forward simple use cases for reasync, but it appears some of the people in charge remain unconvinced anyway. I speculated their objection could be "Why don't you just duplicate the code: one sync function and one async function?", so I cleaned up and posted this use case, where there is enough complexity that "just duplicating" the code in question is not reasonable, and the objection is therefore moot.

(Here this complexity is also inherent to performing a reasonably useful computation, so as to preemptively answer the objection that this complexity was artificially added in order to make a point).

JuneBash · June 6, 2023, 4:31pm

I may be wrong, but I don't think anyone in charge is unconvinced; I think it's simply a matter of priorities. They've chosen to prioritize other work that needs to be done, and resources (i.e. the number of people able to work on the compiler) are limited.

ZPedro · June 6, 2023, 4:46pm

Well, there is little sense in discussing this at the moment when everyone's busy with either WWDC as a whole or just the Swift parts, but I hereby swear I will prove my point next week.

austintatious · June 6, 2023, 11:12pm

One of the questions I have, though: is “reasync” the right word?

To clarify for me, you’re describing a function that can be called either as async or not? Essentially so you can avoid having to write two very similar functions: one which is async and the other which is not. Thank you for any clarification.

bbrk24 · June 7, 2023, 5:56pm

reasync is meant to be parallel to rethrows: A function marked rethrows will only throw if its function-typed argument throws. Similarly, a function marked reasync will only suspend if its function-typed argument suspends.
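To make the parallel concrete, here is a minimal sketch of how rethrows behaves today. The names `applyTwice`, `NegativeInput` and `checkedIncrement` are made up for illustration; the commented-out `reasync` signature is only a guess at how the pitched feature might be spelled and is not something released Swift accepts.

```swift
// Hypothetical helper: applies `body` twice.
// `rethrows` means this function can throw only if `body` throws.
func applyTwice(_ value: Int, _ body: (Int) throws -> Int) rethrows -> Int {
    try body(body(value))
}

struct NegativeInput: Error {}

// Non-throwing closure: the call site needs no `try`.
let doubled = applyTwice(3) { $0 * 2 }

// Throwing function: the very same `applyTwice` now requires `try`.
func checkedIncrement(_ x: Int) throws -> Int {
    guard x >= 0 else { throw NegativeInput() }
    return x + 1
}
let incremented = try? applyTwice(1, checkedIncrement)

// Assumed spelling of the pitched equivalent (suspension instead of throwing):
// func applyTwice(_ value: Int, _ body: (Int) async -> Int) reasync -> Int
```

The point is that one declaration serves both kinds of caller, and the compiler works out at each call site whether the effect applies.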

ZPedro · June 15, 2023, 8:41pm

Some time ago already, after the feedback topic on concurrency here came and went without any news on reasync despite the lack of it being mentioned multiple times as a pain point, I started monitoring the situation in order not to miss any new development in that area.

Oh, nothing fancy: just searching for the term on these forums, by itself, with results sorted by latest post. Then one day, this result showed up, which I'm quoting here for convenience:

The context, for what it's worth, being @Ben_Cohen defending the idea that future directions are not just where proposals go to die.

Since I've been twice challenged in this topic to put up as to what I alluded to in the initial post that made me clean up, elaborate on, and post this study, I am bringing this to everyone's attention so as to answer this challenge.

kiel · June 21, 2023, 4:35am

To forestall any communication issues, this message is not addressed to the OP directly but to the general forum reader.

First, in our own code base, we have asyncMap and asyncCompactMap because we'd like to await values within the closures given to map and compactMap. If we did not have these, we would have to write some small but annoying boilerplate: create a result array variable (not constant), reserve its capacity and loop over and act on each element. Another time I found myself trying to asynchronously reduce but bumped into the absence of an async overload, so I wrote the boilerplate and refactored it to use variables and loops.
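For readers who have not written it themselves, the helper described above might look roughly like the following. This is a sketch under assumptions, not kiel's actual code: the signature, the use of `underestimatedCount`, and the `slowSquare`/`squaresDemo` names are all illustrative.

```swift
extension Sequence {
    // Hypothetical async counterpart of `map`; the body is the
    // "variable, reserve capacity, loop" boilerplate described above.
    func asyncMap<T>(_ transform: (Element) async throws -> T) async rethrows -> [T] {
        var results: [T] = []
        results.reserveCapacity(underestimatedCount)
        for element in self {
            results.append(try await transform(element))
        }
        return results
    }
}

// Illustrative async work to await inside the closure.
func slowSquare(_ x: Int) async -> Int {
    await Task.yield()
    return x * x
}

// Usage: the closure does not throw, so no `try` is needed at the call site.
func squaresDemo() async -> [Int] {
    await [1, 2, 3].asyncMap { await slowSquare($0) }
}
```

Apart from `async` and `await`, this mirrors the synchronous loop one would write for `map`, which is exactly the duplication being complained about.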

While we do have functions to cover these, I am an idiot. Was I supposed to extend Sequence to do this? Should I use underestimatedCount to reserve the resulting array's capacity or was that a pointless operation? How will this scale over large collections? Should it be inlinable? Is some other dependency going to introduce this and collide with my own implementation? etc.

Second, there was a recent pitch whose implementation is literally repeated/duplicated. Add that to the ton of overloads some other library authors have to provide.

In the absence of overloads and third-party generic functions duplicating logic, developers would be required to use variables instead of constants. Doing this introduces complexity into reasoning about code and limits intuitions about code since, in order to understand the logic in some scope, we now also need to check the variable is not mutated anywhere else within it (I'd guess that kind of check would almost always result in a "no, it's not mutated anywhere else").

So it seems pretty straightforward that there's a need for reasync.

I guess the reason a proposal has not surfaced is because compilers are hard and it would not be a starter task?

Douglas_Gregor · June 21, 2023, 6:38am

reasync is in a tricky place because the design is easy (just follow rethrows but with async), and the motivation is easy, but a decent implementation in the compiler is a bunch of work.

Moreover, it's almost a syntactic-sugar feature, because you can get nearly the same effect by duplicating the code into async and non-async versions. Indeed, now that we have macros, I'd be curious just how far one can get by implementing a peer macro that, when applied to an async function with async closure parameters, produces a synchronous version of that function that zaps the async from closure parameters as well as all of the awaits within the function body.

Doug
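As a rough picture of what such a peer macro would have to produce, here is the pair written out by hand. No macro is involved in this sketch, and `firstIndex(in:where:)`, `isLarge`, `syncDemo` and `asyncDemo` are hypothetical names chosen for illustration.

```swift
// What the author would write: an async function with an async closure parameter.
func firstIndex<T>(in items: [T], where predicate: (T) async -> Bool) async -> Int? {
    for (index, item) in items.enumerated() {
        if await predicate(item) { return index }
    }
    return nil
}

// What the macro would need to emit: the same body with `async` dropped from
// the signature and the closure type, and every `await` removed.
func firstIndex<T>(in items: [T], where predicate: (T) -> Bool) -> Int? {
    for (index, item) in items.enumerated() {
        if predicate(item) { return index }
    }
    return nil
}

func isLarge(_ x: Int) async -> Bool {
    await Task.yield()
    return x > 4
}

// A synchronous caller gets the synchronous overload...
func syncDemo() -> Int? {
    firstIndex(in: [3, 8, 5]) { $0 > 4 }
}

// ...and an asynchronous caller gets the async one.
func asyncDemo() async -> Int? {
    await firstIndex(in: [3, 8, 5]) { await isLarge($0) }
}
```

The two bodies differ only in their effect keywords, which is why the transformation looks mechanical from the outside.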

ZPedro · June 21, 2023, 8:42am

Spoiler alert: after the WWDC sessions that were clearly an invitation to… exert the macro functionality, I'm halfway through making it into a macro, if nothing else so as to show I tried every other way. However, I can already flag the following two limitations:

1. The draft macro is already particularly complex. While attached macros, including ones that create a peer declaration, can and do access the AST of the attached declaration, the ones I've seen so far only need to do so in order to fetch a limited amount of information, typically one identifier, and make a bounded number of checks before unrolling their boilerplate. The macro I'm writing, on the other hand, needs to access every single syntax element of the attached declaration if only so that it can know whether that element needs to be present in the copy, and then recreate a copy structure made of copies of almost all these elements. This complexity raises future maintainability challenges, as well.
2. Diagnostics will be nonexistent. The principles of the macro system, as far as I've seen, include (deservedly) putting responsibility on the macro creator to raise its own diagnostics in situations of trouble, rather than generate incorrect code to be flagged by the compiler and leave the macro user to figure out what went wrong by reverse engineering the generated code. The macro I'm writing will be completely unable to assume this responsibility: if the async declaration to process mistakenly contains a call to an async function that was not passed as a parameter, for instance, the macro will be unable to tell (short of reimplementing significant parts of the compiler); instead, its only option is to blindly remove awaits and hope for the best (remember a single await at the start of an expression can cover many async calls, some possibly far away within).

Don't get me wrong, I am aware my sed-based workaround is worse by many aspects, but at least that ugly hack makes it clear it is what it is: a workaround.

(I briefly considered writing a freestanding macro, before balking at the maintenance nightmare. Let's just say iteratePossibleLeftNodes${ISASYNC}() is not in the same class as Array${N}D)

FranzBusch · June 21, 2023, 9:35am

I agree that macros can bridge a gap here but they won't help us with carrying these effects like rethrows or reasync across protocols. Arguably, @reasync is kinda hard to use since the implementation and usage can change dramatically if something is async or not.

Zollerboy1 · June 21, 2023, 10:03pm

If I implement a function that is very similar to a function that already exists in the standard library, I always look up how (and on which protocol) it's implemented there.

E.g. if I wanted to implement an async version of the reduce function, I'd copy the implementation from here and add async and await where needed.

Zollerboy1 · June 21, 2023, 10:21pm

Have you tried using SyntaxRewriter? It makes the process of removing awaits from the function body much easier.

I have a minimal working example of how you could write a Reasync macro over here. Of course, it can be drastically improved. E.g. it just checks whether an await expression contains a reference to a parameter of the function which is an async closure. That means the macro doesn't work correctly if you, e.g., set a variable to the async closure and then try to call that. But I think that improving these checks shouldn't be that much of a hassle.

EDIT:

I just checked how Swift handles rethrows so far and it actually doesn't allow many things that would make it hard for the Reasync macro. E.g. both of these functions give an error in Swift 5.8:

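// Error: the call goes through a local copy (body2), not the parameter directly.
func foo(_ body: () throws -> ()) rethrows {
    let body2 = body
    try body2()
}

// Error: the callee is the result of an expression, not a parameter called directly.
func bar(_ body: () throws -> (), _ body2: () throws -> ()) rethrows {
    try (Bool.random() ? body : body2)()
}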

This means that it should be very much possible to get feature parity using a macro for reasync.

ZPedro · June 22, 2023, 2:22pm

Oh, neat! Thanks a bunch, I can't believe I've missed that. Wait, given the sparse documentation, maybe I can believe that…

It's still a bit more complex than I'd like, but at the very least it should be maintainable (e.g. no breaking when new syntax nodes are added in future Swift versions) so that takes care of my first limitation. Speaking of which, clearly some limitations are acceptable.

For instance, one that's not even an issue for rethrows is the case where there are multiple throws parameters to which a mix of non-throws and throws actual parameters are provided: in the case of rethrows, there is no monomorphization so it's handled for free.

But for reasync we don't have such polymorphism, and we're clearly not going to monomorphize across the combination of possibilities; instead, we're going to require that all actual parameters are sync or all async, and require the caller to wrap sync parameters into an async equivalent in case of mixed parameter types (and suggest such autothunking be implemented transparently if it isn't the case already).
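On the wrapping of sync parameters: as far as I know, Swift already converts a synchronous function value to the matching async function type implicitly, so a caller with mixed arguments can pass them to an all-async signature without writing a wrapper. A small sketch, with `combine`, `fetchRemote` and `mixedCall` being hypothetical names:

```swift
// Hypothetical function taking two async closure parameters.
func combine(_ first: () async -> Int, _ second: () async -> Int) async -> Int {
    let a = await first()
    let b = await second()
    return a + b
}

func fetchRemote() async -> Int {
    await Task.yield()
    return 40
}

func mixedCall() async -> Int {
    // A plain synchronous closure...
    let local: () -> Int = { 2 }
    // ...is accepted where `() async -> Int` is expected, next to a
    // genuinely async function, with no explicit wrapper.
    return await combine(local, fetchRemote)
}
```

So the "all sync or all async" restriction would mostly cost mixed callers the use of the async variant, not extra boilerplate.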

Looking at your code, neat idea to check for the presence of at least one of the parameter identifiers, that should handle a nice chunk of the cases to diagnose. However, there remains the possibility of an await that covers for both such a parameter and a non-parameter async function, which the macro at this stage won't flag (await need not be next to the particular function in question, e.g. as seen earlier in try await progressRecorder.record(batch.value);). If you can find a way to flag that kind of diagnostic from the macro, my hat is off to you; I have no idea how this could be done.
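A reduced illustration of that problem, with hypothetical names (`lookup`, `process`, `processDemo`): a single await covers both a call to the closure parameter and a call to an unrelated async function, so a check for "mentions a parameter" passes even though the synchronous copy cannot compile.

```swift
// Hypothetical async function that is NOT a parameter of `process`.
func lookup(_ key: Int) async -> Int {
    await Task.yield()
    return key * 10
}

func process(_ key: Int, with transform: (Int) async -> Int) async -> Int {
    // One `await` covers both `transform(...)` and the nested `lookup(...)`.
    await transform(lookup(key)) + 1
}

// Stripping `async`/`await` purely syntactically would yield:
//
//     func process(_ key: Int, with transform: (Int) -> Int) -> Int {
//         transform(lookup(key)) + 1   // invalid: `lookup` is still async
//     }
//
// The await expression does mention `transform`, so an identifier-based
// check would not flag it.

func processDemo() async -> Int {
    await process(4) { $0 + 2 }
}
```

The user of the macro would then see a compiler error inside generated code rather than a diagnostic pointing at the offending `lookup` call.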

Zollerboy1 · June 22, 2023, 2:58pm

Yeah, it's not really possible to account for that right now. You could obviously look at every FunctionCallExprSyntax inside of an AwaitExprSyntax and check if calledExpression is one of the parameters or not. However, this can't account for non-async functions being called inside an await expression. For that we would have to have a way to ask the compiler for the type of an expression, so that we can only look at call expressions where the called function is async.

ZPedro · June 22, 2023, 4:31pm

It's worse than that, because a client of that macro may need to wrap an @Reasync function in an outer @Reasync function, in which case the calledExpression is not one of the parameters regardless of the FunctionCallExprSyntax found inside, but the use case is legit nevertheless…

So, yes, for now best to just look for any identifier within the await expression being one of the parameter functions. Not sure a parser can ever provide us better context: proper diagnostics for reasync will likely need processing at or near the SIL level.

Zollerboy1 · June 22, 2023, 4:46pm

Yeah, that's true. Obviously it would be better to have reasync as a proper language construct, but it's amazing how much macros can do already. You could definitely write a usable Reasync macro right now, that just gives out some misleading diagnostics in some contrived cases.

austintatious · June 23, 2023, 12:06am

I have been wondering the same thing. The ability to have the compiler write a non async version of a function would be incredibly useful. If a macro could do this that would be excellent.

ZPedro · June 29, 2023, 3:04pm

No update this week: I had to update Xcode (from 14.2) in order to actually try out macros while benefitting from IDE support, which meant updating macOS (from Monterey), which meant… putting a 300€ downpayment towards a new machine, since I was relying on a 13" MacBook Pro from 2016 up until now. The new one should arrive in a few days, at least that is what I was told.

ZPedro · October 9, 2023, 8:49pm

Small update to note that I uploaded my repository with all the evolutions mentioned here (along with a few more twists and turns) at ~zpedro/AsyncCountdown (Countdown numbers round solver written in async Swift) on sourcehut hg.

Don’t forget to run RegenDerived.sh from the AsyncCountdown subfolder after downloading/cloning, while we wait for reasync to be available.

taylorswift · April 15, 2024, 8:22pm

(i recognize this reply is close to the “1 year” threshold where it might warrant a fresh thread so feel free to fork into a new thread)

i came across this while grokking through the history of reasync and i just wanted to add that generating async overloads using macros is going to be really bad for our ability to document these APIs. these overloads will require FNV-1 hash disambiguation 100 percent of the time, and will also require references to the original template implementation to use hash disambiguation.

we could mitigate this rather bluntly by just removing all macro-generated declarations from symbol documentation, but i’m not sure if it makes sense to vend a lot of public-facing API that does not show up at all in documentation.