On the proliferation of try (and, soon, await)

@ktoso, by this I mean something precise:

Our beloved LibDispatch is a wonderful tool that comes with gotchas. It is possible to misuse it. Traps that developers might fall into include thread explosion, priority inversion, and certainly more.

If reasoning by analogy is not too wrong here, it is to be expected that Swift concurrency as a whole will exhibit gotchas as well, and opportunities for misuse. After all, code runs on physically constrained devices. Demanding tools will push the language to its limits.

In this context, it is normal that people ask for more details. And they should get an answer. Even if the answer is: "details are not ironed out yet. Come back when ...".

Another possible answer is: "details are allowed to change from one Swift version to another. The only guarantees are: ..." But please mind that people will be very unhappy if their working code suddenly drains system resources or turns into a snail after a system update. The "assume the worst" mindset applies to many things.


Would this imply that I cannot write code like this because it might introduce a data race if a multi-threaded executor is used?

class DoublePing {

  var running = 0

  func run() async {
    async let result1 = ping()
    async let result2 = ping()
    await [result1, result2]
  }

  func ping() async {
    running += 1
    await networkCall("/ping")
    running -= 1
  }
}

My understanding is that yes, you would have to guard against races here. Async code is re-entrant (currently always, there are discussions elsewhere about that), and the operator += is not atomic.

Since that's just an ordinary class — i.e. neither an actor class nor associated with a global actor — you don't really know anything about what executor you might be running on and what executor you might be resuming to after any awaits you make. So yes, that code is as race-prone as if you wrote the analogous code today with completion handlers and no queues or locks.

We do want to eliminate those races, but it's tricky because classes today are largely unrestricted, and it may take some time (or even prove impossible) to figure out something acceptable.

If I understand @John_McCall and the proposals correctly, the implementation of DoublePing would be safe if it was an actor or associated with a global actor—even if the actor is reentrant. The code would run interleaved but not in parallel, which would prevent data races*.

Edit: *(Unless you manually bind a non-exclusive task executor to the actor; not sure if that's possible).
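For concreteness, here is a sketch of what the actor version might look like. networkCall is my own stub (the original example leaves it undefined). The += and -= now run on the actor, so they cannot race; re-entrancy still means the two pings may interleave around the suspension point, but the counter updates themselves stay consistent:

```swift
// Stand-in stub for the network request in the original example.
func networkCall(_ path: String) async {
    try? await Task.sleep(nanoseconds: 1_000_000)
}

actor DoublePing {
    var running = 0

    func run() async {
        async let result1 = ping()
        async let result2 = ping()
        await result1
        await result2
    }

    func ping() async {
        running += 1                 // actor-isolated: no data race
        await networkCall("/ping")   // suspension point: calls may still interleave
        running -= 1
    }
}
```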


Me too. I just went back and read the definition of async let again, and I'm still a little confused. I think it comes down to the definition of concurrent, which I know has already been discussed. I'm looking for example at this quote from the Structured Concurrency proposal:

To make dinner preparation go faster, we need to perform some of these steps concurrently. To do so, we can break down our recipe into different tasks that can happen in parallel.

That makes it sound like "concurrently" and "in parallel" are essentially the same thing, even though IIUC they are not generally considered to be so.

Does a task created with async let execute in parallel (on another thread) or merely concurrently (interleaved with other tasks on the current thread)? Does it depend on the executor? If so, I think the syntax for that should be more explicit, like Task.execute(...) or something. I was assuming mere concurrency.

Which leaves me wondering - was my initial assessment of that merge sort implementation correct? Does it just execute the whole sort as soon as you await it (and less efficiently because of the overhead of creating tasks)? Does it benefit from parallelism because of async let? Or does it depend invisibly on what the executor might be?
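For reference, a merge sort of the kind being discussed might look roughly like this (my reconstruction, not the code from the earlier post). Each half becomes a child task; async let only expresses that the halves *may* run concurrently with the parent, and whether that becomes actual parallelism is up to the executor:

```swift
func merge(_ left: [Int], _ right: [Int]) -> [Int] {
    var result: [Int] = []
    var (i, j) = (0, 0)
    while i < left.count && j < right.count {
        if left[i] <= right[j] { result.append(left[i]); i += 1 }
        else { result.append(right[j]); j += 1 }
    }
    result.append(contentsOf: left[i...])
    result.append(contentsOf: right[j...])
    return result
}

func mergeSort(_ values: [Int]) async -> [Int] {
    guard values.count > 1 else { return values }
    let mid = values.count / 2
    // Child tasks for each half; the executor decides whether these
    // run interleaved on one thread or in parallel on several.
    async let left = mergeSort(Array(values[..<mid]))
    async let right = mergeSort(Array(values[mid...]))
    return merge(await left, await right)
}
```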


I agree.

The way I see it however, we could reframe that discussion around data isolation. If the task is properly data-isolated then the executor can (but does not have to) run it in parallel, otherwise it must run it serially, interleaved with other partial tasks.

The problem with the current definition of async let is that it is unclear whether it expects its task to be data-isolated. I'd personally not expect data isolation to be a requirement for async let, but the implementation seems to assume isolation since the task runs in parallel.

I wouldn't mind the compiler parallelizing things automatically when it can prove data isolation*, but in the absence of such a proof the safe default is to keep things serial.

*vague idea about automated parallelization

I'm thinking the compiler could prove data isolation for some partial tasks and flag them so the executor know those partial tasks can be parallelized. For instance:

var batch = 0
async let resultsA = {
     let currentBatch = batch
     batch += 1
     // nonparallelizable before await
     var numbers: [Int] = await downloadData(batch: currentBatch)
     // parallelizable after await
     numbers.sort() // proven to not access anything but local state
     return numbers
}()
async let resultsB = {
     let currentBatch = batch
     batch += 1
     // nonparallelizable before await
     var numbers: [Int] = await downloadData(batch: currentBatch)
     // parallelizable after await
     numbers.sort()
     return numbers
}()

Concepts like purity and/or value semantics could help determine which partial tasks are properly isolated. It'd be nice if you could also specifically request for something to run in parallel and get an error if it can't, with an unsafe override of some sort.
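As a rough present-day approximation of that idea: when each child task captures only values it owns (value semantics), handing the work to a task group is data-race-free regardless of whether the executor chooses to parallelize it. A sketch, with downloadData replaced by a local stub of my own:

```swift
// Stub standing in for the download in the example above.
func downloadData(batch: Int) async -> [Int] {
    [batch * 3, batch * 1, batch * 2]
}

func fetchAndSortBatches(_ batches: [Int]) async -> [[Int]] {
    await withTaskGroup(of: (Int, [Int]).self) { group in
        for (index, batch) in batches.enumerated() {
            group.addTask {
                // The closure captures only value types it owns,
                // so the executor is free to run it in parallel.
                var numbers = await downloadData(batch: batch)
                numbers.sort()
                return (index, numbers)
            }
        }
        var results = [[Int]](repeating: [], count: batches.count)
        for await (index, numbers) in group {
            results[index] = numbers
        }
        return results
    }
}
```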


Not all code is meant to be run from any thread. Consider this function:

func animateVisibility(ofView view: UIView, visible: Bool) async {
    // Hypothetical async version of UIView.animate
    await UIView.animate(withDuration: 0.2, delay: 0, options: []) {
        view.alpha = visible ? 1.0 : 0.0
    }
}

That function must be called on the UI thread. If you have a function that starts on the UI thread then this should be fine:

await animateVisibility(ofView: view, visible: true)

So is this safe?

async let animationTask = animateVisibility(ofView: view, visible: true)
// Do some other stuff
await animationTask

If async let implicitly changes the executor to make it run in parallel then it's not safe. That's why I don't think async let should do that.


Fixed it.

If functions must be executed on a specific "thread", they must run on an actor; and the typical requirement of "on the UI thread" is expressed by a global actor with a special executor.

This works regardless of whether it is called from an async let or not.

// edit: added link to proposal section on global actors
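Concretely, the earlier animation example would pick up the global-actor annotation, roughly like this (still hypothetical, since the async overload of UIView.animate is itself made up):

```swift
@MainActor
func animateVisibility(ofView view: UIView, visible: Bool) async {
    // Hypothetical async version of UIView.animate
    await UIView.animate(withDuration: 0.2, delay: 0, options: []) {
        view.alpha = visible ? 1.0 : 0.0
    }
}
```

Any caller, including the body of an async let, would then have to hop to the main actor before this function runs, so the UI-thread requirement is enforced rather than assumed.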


We really need to get our terminology straight. Caveat: I'm not an expert, but I do know how to use google and read the technical descriptions I find. Somebody let me know if I'm on the wrong track here.

Unless the definitions of parallelism and concurrency I've found are completely wrong: when multiple Tasks are split up at suspension points and have their partial tasks interleaved on a single thread, in an order consistent with the apparent straight-line ordering of the code in each task being run, that is an example of concurrency without parallelism. IIUC that is what “async/await by itself” enables. Concurrency, not necessarily parallelism.

The concurrency of basic async/await is cooperative, which can have a major impact on the programming model, but it appears that even preemptive concurrency may still not be parallelism. For example, from reading these definitions it seems that preemptive multithreading on a single core is technically also concurrency without task parallelism. Unfortunately, that is not a particularly useful fact when thinking about the programming model:

  • Without multiple cores, you probably don't ever need atomics, but most invariants span multiple variables, and protecting temporarily broken invariants from being observed in a preemptive system generally requires locks (or a fancy data isolation system like what seems to be underway with actors).
  • Programmers typically don't get to limit the number of cores used to run their preemptive threads (and anyway, we like the performance benefits of multicore, so we don't want to). So in practice, all multithreaded programs are written with the assumption that tasks exhibit both concurrency and parallelism.

It seems that the important dimensions here for the programming model are:

  • Concurrency, which can introduce reentrant access to data whose accessibility is not locally apparent
  • Preemption and/or true parallelism, which make the locations of suspension points insufficient knowledge for maintaining invariants and preventing races.
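To make the first bullet concrete, here is a minimal example (mine, not from the thread) of an invariant spanning two variables, where a lock is what keeps a preempting thread from observing the temporarily broken state:

```swift
import Foundation

final class Accounts {
    private let lock = NSLock()
    private var checking = 100
    private var savings = 0

    // Invariant: checking + savings == 100 at all observable times.
    func transfer(_ amount: Int) {
        lock.lock()
        defer { lock.unlock() }
        checking -= amount
        // The invariant is broken right here; the lock prevents any
        // other thread from observing this intermediate state.
        savings += amount
    }

    var total: Int {
        lock.lock()
        defer { lock.unlock() }
        return checking + savings
    }
}
```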

That may be a fix, but it shouldn't be necessary to get correct behavior. After all, people may have existing (Objective-C) functions that must be called on the UI thread with completion handlers. According to the Objective-C interop spec those will be imported as async, and those should behave correctly whether awaited directly or used with async let.

Even ignoring the UI thread cases, if async let changes how the function executes then there's a missing capability: splitting up the call from the await without changing the execution context.

let data = await readFromFile()

vs:

async let readTask = readFromFile()
// do stuff
let data = await readTask

If the second example runs readFromFile in a new thread then that's wasteful and unnecessary. It may actually have worse performance.

Using a separate thread should be a conscious choice. It shouldn't be the default behavior, and it shouldn't be forced on you just because you want to split up the async call from the await. There should be a way to split the call from the await that doesn't change the executor.
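To illustrate the distinction, both spellings below produce the same value; the question raised here is only where the work runs (readFromFile is a stub of my own):

```swift
// Stand-in for the file read in the example.
func readFromFile() async -> String {
    "contents"
}

func direct() async -> String {
    // Runs as part of the current task, on whatever executor it uses.
    await readFromFile()
}

func split() async -> String {
    // Starts a child task immediately; the executor decides whether it
    // runs interleaved with the parent or on another thread.
    async let readTask = readFromFile()
    // ... other work could happen here ...
    return await readTask
}
```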


I think that in general async let should ask the current executor to run the initializer, if possible. Either the executor is an actor, in which case it executes interleaved with the rest of the caller, or it is a parallelized executor, where it can execute in parallel. However, this is probably not the current implementation.

But is that what is being proposed? It's not clear to me.

And if so, then what does that say about this?

// func f()

async let t = f()
await t

Is that allowed? What does it do? It would be strange if async let t = someAsyncFunc() runs on the calling thread but async let t = someSyncFunc() doesn't, but if calling a synchronous function using async let runs on the calling thread then what is the point of it?

Don't think I've been confused about this since @Douglas_Gregor first posted here.

I too have suggested that maybe there should be two keywords for the different kinds of execution, but I am equally unsure whether the joining of potentially-parallel tasks should really be using the await keyword. There are three interesting places in code with distinct implications for the programmer:

  • A potentially-parallel task is launched from the current one
  • A potentially-parallel task is joined to the current one
  • A task reaches a suspension point

Taking inspiration from in, what about allowing any scope to have trying as its first statement to indicate thrown errors will not break invariants?

func encode(to encoder: Encoder) throws {
  trying
  var output = encoder.unkeyedContainer()
  output.encode(self.a) // no try needed!
  output.encode(self.b)
  output.encode(self.c)
}

Or, when various scopes are in play,

func somewhatTricksy() throws {
  do {
    trying
    thisThrows()
  thatDoesToo()
    noInvariantsAtRisk()
  }
  // Back to normal handling for invariant-breaking errors
}

People would (presumably) put trying on the same line as the opening brace, as with capture lists etc.

I don't think the benefits described so far necessarily justify adding a keyword and abusing the language syntax, but I do think they justify a thought experiment to see what else happens without ubiquitous try marking.

I think everybody's assumptions are likely to be colored by the particular situation in their own project. In my situation, it is the broken invariant that is the "elusive corner case" – defer is basically a silver bullet, so the presence or absence of try in the code has no impact on this for me. So as you say, I basically never care about explicit try for this, although that is only an argument against it for that one case.

Rather, the "main cases" for me as are as follows

  • Performance. In our situation, control flow is very expensive. It's common for us to set a control flow budget (say, that a function ought not to have more than 10 branches), and implicit try makes this more difficult.
    Now obviously an opaque function call might do any amount of control flow, and so you can make an argument that try in the caller might be charged to an opaque callee. However, I don't believe Swift can actually apply that transformation for real (such as inlining the try return of the caller into the throws in the callee), so as a model of runtime performance it is less accurate than try.
  • Debugging. The fact is, I am often surprised that some function returns early. I spent a lot of time writing what the function is supposed to do, and relatively little time coming up with some guard statements at the top because it turns out somebody tries to call it before a view loads. A solid majority of my bugs are "nobody ran the rest of this function", so any syntax that makes me more likely to do that seems like a footgun.
    This is not a "broken invariant" usually (although that is an "elusive corner case"). More typically, it's a pure function, or a function that mutates on success only. You can conceptualize this as meaning to write a function named layoutSubviews but accidentally writing layoutSubviewsIfNeeded. If indeed the solution is to rename the function because it has a try inside it, this would suggest I would benefit from even more explicit error handling than we have today.
  • try would have little value in the places where it really mattered, because people would get used to silencing the compiler by thoughtlessly adding it in all the cases where it didn't matter.

I think in many ways this is analogous to complaints I've heard about optionals/IUO. Or my own complaints about Rust's borrow checker. We have to make choices between productivity and safety when they are in conflict. Some would rather we picked the other one, and they may well have good reasons for it.

IMO the genius of Swift is you can often pick the other one. do try scoped syntax is a good proposal for that. Personally I'd like to see file-scoped or module-scoped syntax as alternatives, since I think a lot of the factors that influence whether or not you like implicit try are similar across a project rather than local to a scope.


Yep, assuming you understand the problems of all important use-cases is a typical rookie error for language development. Did I just do that? :wink:

I'd be more likely to argue that Swift hides a lot of branches from you already and the chance that you can accurately count branches at a glance is pretty darned low. There's a branch for every potential integer overflow, not to mention && and ||, branches within pattern matching, etc. Does a branch in a loop count as 1 branch or N branches?

I'm always interested to hear about different use cases; do you mind sharing your problem domain with us? It sounds like Swift

  • Debugging. The fact is, I am often surprised that some function returns early. I spent a lot of time writing what the function is supposed to do, and relatively little time coming up with some guard statements at the top because it turns out somebody tries to call it before a view loads. A solid majority of my bugs are "nobody ran the rest of this function", so any syntax that makes me more likely to do that seems like a footgun.
    This is not a "broken invariant" usually (although that is an "elusive corner case").

I dunno; if someone skipped running “the rest of this function” and it needed to not be skipped, doesn't that correlate with the function failing to (re-)establish some invariant that the rest of the program expects to hold? If not, what is getting skipped that matters?

Of course I see this as a matter of your point-of-view, but I am also suggesting that if you frame that problem as a broken invariant it will give you lots of power, both to think about that code but also the rest of your program.

More typically, it's a pure function,

I'm confused now. In a truly pure function (performance aside), it never matters whether the rest is skipped; you either get a result or an error and in the error case it's indistinguishable from remembering the error until just before the return and executing all the rest of the code.

or a function that mutates on success only. You can conceptualize this as having a function named layoutSubviews but I accidentally wrote layoutSubviewsIfNeeded. If indeed the solution is to rename the function because it has a try inside it, this would suggest I would benefit from even more explicit error handling than we have today.

Sorry, I'm lost again. How does this ifNeeded distinction relate to error handling?

I think in many ways this is analagous to complaints I've heard about optionals/IUO. Or my own complaints about Rust's borrow checker. We have to make choices between productivity and safety when they are in conflict. Some would rather we picked the other one, and they may well have good reasons for it.

FWIW, I'm very inclined toward static safety and don't believe potential crashes or traps should ever be hidden behind implicit syntax. That's not what we're talking about here, though; IMO it's qualitatively different.

Been thinking about @Chris_Lattner3's argument some more here, and I think it makes sense. There's no need to introduce a try statement, and I have no problem with try as a marker; in fact if we have that, the top level block of a function can still be marked with try. I have no problem with adding do in contexts where you'd need to add an additional level of nesting anyway; that's the real cost. Thus:

func f() throws
try { stuffThatThrows() } // legal

func g() throws
{ try do { stuffThatThrows() }  } // legal

func h() throws
{ 
  try { stuffThatThrows() }  // error: closure expression is unused
                             // note: did you mean to use a 'do' statement?
} 

I'm not suggesting the message about closure expressions is great; it just happens to be what you currently get when you write a block without the do introducer.


you either get a result or an error and in the error case

Not all situations are formalized in the typesystem. Recently I wrote a function like UIView.layoutDifference(...) -> LayoutDifference, a pure function that studies the receiver's state to produce a layout, or more specifically the update between present and desired layout.

Layout systems have underspecified cases, such as views with no constraints, and specifying these is a matter of taste. We could throw, assert, return nil, return an empty difference, return a difference that moves the view to a zero frame, introduce some enum result type, etc. One popular layout engine thought it was a good idea to have inconsistent frames and call these "ambiguous layout" :eyes:. Let's call these underspecified cases "silly situations" and gloss any way we decide to handle it as a "silly result".

What happens is, due to out-of-scope-cause-X, the function that makes constraints had a silly situation and did a silly result. Then our view has silly constraints and got a silly layout, two silly layouts had a silly layout difference, the silly difference started a silly animation and now I'm reading a bug report about that. So I have to navigate through this jungle all the way back to X.

Scarred by this experience, I resolve to make every silly result an assert, so it's easier to find the source of a problem. But it turns out that some callers create silly situations for sensible reasons and don't want to assert. So it's not as though there's an objective answer that avoids difficulties of this kind.

Anyway, with this sort of debugging it is useful to have a fixed enumeration of keywords that can return silly results to callers, so that you can glance at a function and informally prove why it might have returned nil or whatever you got.

if you frame that problem as a broken invariant it will give you lots of power, both to think about that code but also the rest of your program.

Nailing down which function in a jungle had its invariant violated can be a powerful method but it's context-dependent. At some point if there are too many riddles on Slack about "if an empty layout difference is applied in a forest, was the view ever really laid out?" management tells me to get back to work.

do you mind sharing your problem domain with us?

Sure although it may be offtopic. The situation I had in mind on performance (different from the layout example above), involves CPU/GPU. For a given function in my library, there's risk I may need to move it to the other processor or to have it on both.

The obvious strategy is to cross-compile, but this has all the problems one normally has writing cross-platform code, and more, because the architectures are more different. And if it turns out I didn't actually need both versions, I've suffered a lot for nothing.

Alternatively: Swift can range from a very high-level modern idiom to a very low-level "maybe if I add semicolons, clang will compile it" portable idiom. I can slide between these idioms depending on how the port risk feels at that moment, and sometimes this sliding makes the code fast enough without the need to port for real. And if I do need to port, I have a working function to start with. So this is often a practical approach.

As far as nailing down particular branch counts, we use an abstract machine which is some blend of both platforms. It's not going to be fully accurate, but it encodes a useful idea of the worst case. Examples like && and || are indeed more expensive on the abstract machine than they are for Swift so we really do watch out for them when writing Swift with this idiom.

Overflow is an interesting case since it's cheap on CPU but prohibitive on GPU. Ultimately we write explicit overflow handling as if Swift did nothing, and then use Swift's behavior like a sanitizer. More rarely we turn on -Ounchecked.

Other Swift features that might hide controlflow are generally not portable so we avoid them for this idiom. Occasionally though we port them – I actually found this thread because I am porting try.

The distillation of all this for try is that it's expensive for the abstract machine and we want to easily count it. I can totally see why this is a weird concern in the context of app code though.

I'm very inclined toward static safety and don't believe potential crashes or traps should ever be hidden behind implicit syntax. That's not what we're talking about here, though;

Sorry, I used a loaded analogy :sweat_smile:. A better one is the requirement to write self in a closure – it's annoying but it also prevents bugs. Like we did with self, there is probably some way to relax the rules on writing try. But I think we need advocates for both the 'annoyed' and 'useful' positions on try to find a good balance.

do try seems like a good balance, it is not disruptive for the cases I described. At the same time I don't know if it really addresses the burden of try, simply because I don't feel that burden very acutely.


I am very late to this thread but I wanted to say that I'm one of the ones that don't see try as a problem. This is one of my favorite features, and I think the reason people misuse it so much is because UIKit annoyingly discourages you from using it. Libraries that completely rely on try are really stable and often also super nice to read in code.
