Pitch #3: ConcurrentValue and @concurrent closures

Hi all,

After a long hiatus due to the holidays and other constraints, I was able to incorporate a bunch of feedback (primarily from Doug) into the old ActorSendable proposal -- largely rewriting it in the process.

I'd appreciate comments and feedback on the updated version: ConcurrentValue and @concurrent closures. The key observation is that type checking for higher order functions and other values being passed between concurrency domains is a core part of both the actor and structured concurrency model. Getting this right will make both of them more memory safe and consistent. This also allows defining nice APIs like parallelMap and other constructs in a way that is completely type/memory safe, while still being library extensible and able to handle advanced cases like concurrent hash tables.

There are many differences between this proposal and the last including 1) a new protocol name, 2) dropping ValueSemantic from the design, incorporating the closure attribute as part of the mainline proposal, and expanding the discussion to include structured concurrency instead of just actors.

I would really appreciate comments and feedback on this, thanks!

-Chris

21 Likes

It might be worth mentioning in the text that the by-value captures in @concurrent closures are immutable. I guess it falls out naturally from how by-value captures work, but it wasn’t until I got to this example that everything clicked into place in my mind:

// Error: someLocalInt is immutable because it was captured
// by-value in a @concurrent closure!
var someLocalInt = 1
list = await contactList.filteredElements {
  someLocalInt += 1
  return $0.firstName == searchName
}

This will stop so many bugs. :clap:

The implicit conformance of structs and enums to ConcurrentValue has a subtle impact on resilience: adding a non-ConcurrentValue member to a struct will break the struct’s conformance to ConcurrentValue, and the implicit nature of this conformance may make it difficult to detect.

Is this really a problem for resilience if explicit conformance is required for public types?

Uhm, does ConcurrentValue enforces deep copy whenever an object has to be migrated to another actor? If always yes, then this is ridiculous!

Regarding @concurent, it should be reasonable not to focus much on concurrency, but add general annotations for closures to specify how it can capture objects in its surrounding. Two most obvious versions of this are @nocapture and @readonly_capture.

As the proposal says, no one should do this:

extension NSMutableString : ConcurrentValue {}

It's sort of obvious if you understand the concurrency model. But undoubtedly someone is going to try this at some point:

extension NSString : ConcurrentValue {}

It's less obvious, but unfortunately the later implies the former due to inheritance. It'd be nice if there was a warning/error for this. So maybe common types could be annotated with something like:

@nonconcurrent("Use @NSCopied to transfer between actors.")
class NSString { ... }

which would prevent conformance to ConcurrentValue and also provide an error message when attempting a transfer to another actor to nudge people towards the proper solution. Otherwise the diagnostic would tell you there's a missing conformance to ConcurrentValue and people would be tempted to just add the missing conformance.


Typo: "but we believe that it is find to consider that in a follow-on proposal."

3 Likes

Overall I think this new draft looks good. I only have a couple minor comments.

The name ConcurrentValue implies to me a value that can be simultaneously accessed from multiple threads, i.e. a "thread safe" value such as an atomic data structure. That said, I don't have a better suggestion and I think this is a reasonable name. People will just have to learn what it means in Swift.

I continue to think implicit conformance could lead to confusing error messages, especially for less experienced programmers. I have some concern about this approach since a change from implicit to explicit would be source breaking, therefore making it hard to move to requiring explicit conformance if experience indicates that would be better. I don't have strong feelings about this but I do think we should make this call very carefully.

I'm also wondering why you decided to drop ValueSemantic. Is it just to separate that discussion out? I was excited to see that topic finally get some momentum and will be disappointed if it stalls out.

3 Likes

Can you clarify what you mean here?

This proposal chooses to limit this [implicit ConcurrentValue conformance synthesis] to non-public types because Swift generally wants extra attention paid to the public APIs of types, but we could make autosynthesis happen independent of access control, or we could make it only happen when explicitly opt’d into.

The way I read it, it seems you mean the following struct would require an explicit conformance to ConcurrentValue, because it is public:

public struct Point {
  let x, y: Int
}

One danger I see with this approach is that we don't distinguish between the "this is a public type that can be used concurrently because all of its members are ConcurrentValue" and "this is a public type, who's members are not all ConcurrentValue, but the synchronization of those members is handled by the implementation".

I think we should just extend auto-synthesis of ConcurrentValue to public types, since I haven't yet figured out a way to misuse such a value. Failing this, I think we at least need to capture the intent of "I just want the implicit synthesis for this public type" (which would emit errors for non-ConcurrentValue members) versus "The implementation of this type makes it safe to use concurrently" (which would happily allow non-ConcurrentValue members). I have no opinions about how these two cases should be spelt.

2 Likes

This is shaping up pretty good!

I'm glad to see it's not an unsafeSend anymore, and just a marker protocol :+1:

The closure things need more details though I feel, allow me to share some questions and cases that are troublesome:

@concurrent closure inference

This is looking good overall, but misses one very tricky aspect of inferring @concurrent that we need to define in a bit more detail.

The @concurrent attribute to function types is orthogonal to the existing @escaping attribute, but it works the same way.

Sadly they do interact a little bit - because in order for the model to actually guarantee safety we need to infer that if a closure escapes, it is potentially concurrent.

The proposal here mentions that @concurrent can be inferred, but does not dive deeper into the semantics of the inference. We can lean on Doug's design doc for a take on this though:

Swift already provides semantics for non-escaping closures, ensuring that they can only be passed "down" the stack. Any attempt to save a non-escaping closure elsewhere (e.g., into a local or global variable, or pass it to a function requiring an escaping closure) is already an error. Therefore, we consider a closure to execute concurrently if it is an escaping closure. [emphasis mine]

Long story short:

  • non-escaping closure:
    • it cannot be invoked concurrently, since in order to do so, we'd have to escape it to that "other" execution context, and in order to do so, we'd have to escape it.
    • We can consider such closures to be implicitly "@nonconcurrent".
  • @escaping closure:
    • we must be pessimistic about it potentially running concurrently and infer it as @conrurrent; We don't know if someone will store it, pass it around etc, and eventually invoke it concurrently. Someone might as well store it, and invoke it when called by some other actor, or even invoked it via a DispatchQueue etc.
    • We must assume it is @concurrent, because we can not prove it isn't.
    • Assuming only @concurrent functions can be passed to other actors addresses the issue but only for passing between actors. We'd have to say we ignore any other execution mechanism in the safety model (could be a thing, but needs to be super explicit in the design).

So, given "@escaping implies @concurrent", we end up with the following world, where escaping functions which are not concurrent don't have a way to express this guarantee.

Case in point is something I bumped into with OSLog, see below:

import os.log
let logger = os.Logger()

actor class Hello {
    var isRefreshing: Bool = false

    func log() {
        logger.log("Hello \(self.isRefreshing)")
// error: Actor-isolated property 'isRefreshing' is unsafe
// to reference in code that may execute concurrently
    }
}

This is fairly surprising, but happens because of the auto closure inside the OSLog's Message:

  public mutating func appendInterpolation(
    _ argumentString: @autoclosure @escaping () -> String,
    align: OSLogStringAlignment = .none,
    privacy: OSLogPrivacy = .auto
  ) 

It is not invoked concurrently though. Thus, we need to allow such APIs to be @autoclosure @escaping @nonconcurrent - that's a mouthful, but fully expresses the semantics of this closure. Without this, we would not be able to log any actor state using OSLog.


Given such inference rules the model can prevent any racy code involving closures, regardless what the source of concurrency might be. And we need to introduce @nonconcurrent, in order for the model support those few cases which are escaping but not concurrent.

The alternative is to not infer concurrent from escaping. That, sadly, leaves holes in the model and any non-actor non-child task concurrent invocation of this closure (say, via dispatch or anything else) will be able to invoke such "not @concurrent" closure concurrently (:scream:) -- thus breaking the promise we try make with this model.

IMHO, it is not a good idea to follow through with the alternate approach since it makes @concurrent be a small lie, and we can't really rely on it for providing the safety it seems to guarantee. It's up to the team to decide how safe we want the language to be, but I'm not super comfortable with an annotation meant to solve races that can not really prevent those races half of the time (until everything in swift is actors) :thinking:

It means we need @nonconcurrent for those weird uncommon APIs which are escaping but not concurrent (for whatever weird insane reason they do so). Thankfully, there should be very few such APIs, as the normal way is either a) non-escaping and non-concurrent and b) escaping and indeed concurrent.

1 Like

Thanks for writing this new draft, Chris. It's definitely much improved. A couple of comments:

  • Even though it might eventually need to be a separate proposal, I'd like the notion of a "marker protocol" to be sketched out a little bit in this proposal, so folks can understand the intended limitations. I think it's mostly a matter of not permitting is ConcurrentValue or as? ConcurrentValue or, really, any value of type ConcurrentValue... a bit like the current prohibition on creating values of protocol type where the protocol contains Self requirements. That's the semantics we need to make marker protocols have zero runtime footprint.
  • I feel like we're going to need a stronger opt out for "I know this declaration uses non-ConcurrentValue types but I really need to use it from another actor". The property-wrapper formulation handles some cases (parameters, properties), but not other important ones (like the return value of a function or subscript). @actorIndependent(unsafe) has effectively the right semantics here, so we could use that, but it feels like the wrong spelling when you're doing the opt-out for (say) a captured local variable.
  • As I wrote in my document on preventing data races and @ktoso also brought up, I feel like the link between escaping and concurrent is very strong and important to closing these data-race loopholes, so I'd like us to further explore that escaping implies concurrent by default (with @nonconcurrent as an opt-out for the rare cases).

I'm excited that we seem to be converging toward a workable model here. Let's keep on iterating!

Doug

5 Likes

Thanks, I clarified this a bit by saying "Furthermore, they implicitly capture local values by-value (which are immutable like a let value) and require that the captured values conform to ConcurrentValue."

No, it does a normal Swift copy.

Yes, I agree that someone could try to do that, and it would be a bad idea (given that mutable strings subclass immutable strings in ObjC). I'm trying to subset ObjC interop and related issues out of this base proposal though, so I think that further extensions like this can be considered as follow-ons.

Thanks Matthew. Yes, that's pretty much what it implies. The value can be shared across concurrency domains and safely accessed concurrently. This is true for values without sharable mutable state (like Int), true for COW types like arrays that have ConcurrentValue elements, and true for internally synchronized types like concurrent hash tables.

That said, I don't think the name is perfect either. We can debate that on the eventual review thread. The major issue I have is related to your other question:

Yes, I dropped it because it is a separable issue and is getting bogged down on definitional problems. Such issues can be handled in a follow-on review.

That said, I think the major concern about naming the protocol ConcurrentValue is that the natural name for ValueSemantic may be just Value... and that may make us think twice about the name ConcurrentValue. That said, this is a conversation for another day IMO.

I agree. I personally prefer explicit conformance, but I understand that some folks prefer implicit conformance. This is a fairly binary decision that can be made at the time of a final review.

One example problem with implicit conformance is cited in the proposal from an earlier discussion. I don't think we should make public conformance be implicit. I think your argument is further fuel for the fire that argues against implicit conformance at all.

No actually, that's really the point. These are two orthogonal things. An escaping closure within an actor is not necessarily concurrent. Similarly a concurrent closure is not necessarily escaping (e.g. parallelMap). Modeling these are two orthogonal things allows precision and avoids conflating different axes of function types.

Here's a silly example of a non-concurrent escaping closure within an actor, which is perfectly safe and correct by construction:

actor MyActor {
   var stateFn : @escaping () -> ()
   func escape(fn: @escaping () -> ()) { stateFn = fn }

   func stuff() {
     var local = 42
     escape { local += 1; print(local) }
   }
    ...
}

It is safe because the actor local state (a capture of local in the stuff function) can only be touched by things that touch stateFn which is itself actor local.

This is the magic that lets ConcurrentValue and @concurrent functions compose correctly: only @concurrent functions are ConcurrentValue's. Because only ConcurrentValues can be shared across actors, you know that any non-@concurrent function is local to the current concurrency domain. This is what provides memory safety of the model.

Yes, I think this warrants further discussion. I was just imagining that a closure expression passed in a context that requires a concurrent context would be inferred to be async. A closure expression that contains an await would also force async on its context. I think that is enough, do you agree?

I do not think that escaping makes sense to conflate into this per the points above, but I'm happy to discuss this and provide more examples if that is not clear.

Yes, this is exactly why this is a bad idea. Let's keep these correctly separated from each other.

Right. I put this in the standard bucket of "interfacing with a concurrency unsafe world is unsafe". Dispatch is an unsafe API, as are other C APIs and other things. I understand that it might seem enticing to "fix" this by conflating escaping and concurrent function types, but you can't achieve that.

The issue is that there is nothing that correlates concurrency in the unsafe world with escaping - it is just as possible (from a type system perspective) to have these issues with non-escaping closures. The Swift rule is that non-escaping functions are bounded by the lifetime of the callee that receives them. You can have fork/join parallelism with non-escaping functions, e.g. things that invoke unsafe equivalents of parallelMap that are implemented in Objective-C or withoutActuallyEscaping.

Conflating escaping and concurrent just makes the pure-swift-with-concurrency model more confusing and less expressive, without closing the safety-hole-with-unsafe-code issue that you are concerned about. I think it is much better to keep them separate, and move the world away from unsafe constructs towards safe ones over time. Dispatch, pthreads, and other unsafe APIs are not going to be safe no matter what we do, just as C APIs using manual memory management aren't safe even if they don't expose an unsafe pointer directly (e.g. because the pointers are wrapped in C structs).

That said, I think it would make sense to invest into introducing Clang attributes that allow marking APIs that take Blocks and C function pointers in a way that would allow Swift to diagnose warnings, and it would make sense to have an attribute that causes the imported-into-swift API to take a @concurrent closure. This is the standard way we handle imported APIs, and this can go a long way to make them more safe and nice to work with in general. I'm just trying to nail down the core language model first.

I'm super happy to adopt that, could you propose wording, or directly write it into the proposal? You should have direct edit access, let me know if not. Alternatively, I can also inject edits if you'd prefer to communicate them another way.

Are you suggesting an additional affordance, or arguing against something in the base proposal? If the former, then I think it should be part of an @actorIndependent proposal. If the later, I'd love to know more, e.g. with an example of what you're thinking about here.

Please let me know if the comments above don't make sense to you or you'd like to discuss them more, thx.

Agreed, this is a critical level of the stack to get right. Thanks!

-Chris

2 Likes

I realize that this claim is probably too opaque to be meaningful. What I mean by this is that C code imported into Swift is not generally safe, and does not guarantee that "unsafe" will appear in the API. Here are some trivial examples of C functions:

struct X { int *ptr; };
struct X createX() { return X((int*)calloc(1, sizeof(int))); }
void freeX(struct X x) { free(x.ptr); }
void incX(struct X x) { ++*x.ptr; }

While this is small to make the point, this pattern (of using a C struct to wrap one or more pointers) is pretty common way of providing type abstraction in C and C++ code.

The issue here is that this imports directly into Swift like this:

struct X {...}
func createX() -> X {...}
func freeX(_ x: X) { ...}
func incX(_ x: X) { ...}

Which means that obvious memory safety, race, undefined behavior and other bugs are possible in well meaning Swift client code:

func doStuff() {
  var myX1 = createX()
  var myX2 = createX()
  incX(myX1)
  incX(myX2)
  freeX(myX1)
  
  incX(myX1) // oops
  // leak myX2
}

My only point here is just that calls to C functions are not safe by construction. Dispatch, pthreads, and other similar things are in the same camp. They never were safe in normal Swift <=5 code, and Swift 6 won't change that.

That said, I do think it would be worth investigating Clang attributes that can be applied to commonly used C functions, allowing Swift to diagnose common bugs. This is the pattern we take with ObjC APIs in general in the past.

-Chris

1 Like

What are the requirements for a type to conform to ConcurrentValue? The closest thing I could find is

The ConcurrentValue protocol models types that are allowed to be safely passed across concurrency domains by copying the value.

, but as far as I know every type could be safely passed across, as long as you don't do anything with it.

Is there a list of what kinds of accesses have to be thread-safe? Getters? Setters? Calling a method? Calling a method that takes a parameter? Calling stuff defined in extensions? Passing the instance to an arbitrary function?

Personally I'd say all of the above. All public APIs involving a ConcurrentValue (I mean those vended by the same library as the type) should be thread-safe, unless they have "unsafe" in the name or take some other parameter of an "unsafe"-labeled type.

1 Like

I agree. All public API should be usable across concurrency domains. I added this to the proposal to clarify the intent here, thanks!:

Type should conform to this protocol when the type is designed such that all of their public API is safe to use across concurrency domains. For example, if there are no public mutators, if public mutators are implemented with COW, or if they are implemented with internal locking or some other mechanism.

I also updated the proposal to explicitly discuss the inference rules for @concurrent on closure expressions and also to address the behavior of actor.self in closure expressions.

Thank you all for the feedback,

-Chris

1 Like

That would tie @concurrent with @actorIsolated, wouldn’t it? I figured there needs to be some restriction on how @concurrent @escaping functions are passed around. I first thought we’d need move-only closure.


IIUC, does this raise error?

actor X {
  // error, returning non-ConcurrentValue
  func foo() -> NSMutableString { ... }

Yes, these two concepts need to be reified, and I have an opinion about how to do so cleanly, but I'm trying to get consensus on @concurrent value before dipping toe into that water though. It is just much easier to lock in pieces of the model one at a time instead of having everything up in the air and ambiguous.

-Chris

1 Like

I think ConcurrentValue as a (marker) protocol isn't well suited since protocol is inherited, but such guarantee isn't necessarily so, with the NSString-NSMutableString as a prime example.

Maybe if we have it solely as a function attribute, with extension and class attribute being used to infer function attribute?

@concurrent
class A {
  func foo1() { } // imply @concurrent
  @concurrent func foo2() { }
}

extension A {
  func foo4() { } // not @concurrent
}

ofc with implicit @concurrent for all functions in structs/enums with only concurrent values.

Or to have it as class/struct attributes:

@concurrent
class A { } // All `A`'s methods are concurrent

I think @concurrent @escaping is too tricky, and we might want to wait until move-only got in. Its converse, @nonConcurrent @escaping, won't work unless we restrict it in a single (serial) actor domain.

But this means that a NSString cannot be a ConcurrentValue (unless you think informal Objective-C conventions are good enough), since a variable of type NSString may refer to a shared NSMutableString. In general, substitutability means an open class can’t be marked as concurrent-safe without imposing that as a requirement on subclasses.

:thinking::thinking: That would be true, though I think it's hard for the subclasses to acknowledge such implicit requirements. That's why it might be a good idea to reaffirm that for new subclasses.

Slapping ConcurrentValue on class types without any requirements to fulfil is going to come with very sharp edges. This makes me worry about using the same syntax for “this class implements ConcurrentValue, I promise” and exporting ConcurrentValue-ness across module boundaries.

I’m wondering if it might be better to do it the other way around, though: use the ConcurrentValue protocol for manual and conditional conformances (classes and extension Array: ConcurrentValue where Element: ConcurrentValue {}), and an annotation like @concurrent to indicate that a public struct/enum is exported as a ConcurrentValue even if it doesn’t have an explicit conformance.

Hello, I wonder why Array and other COW containers are not always ConcurrentValue, regardless of the type of their elements, even if elements do not conform to ConcurrentValue.

Why are arrays of thread-unsafe classes, for example, out of the game? There are still plenty of thread-safe operations that can be performed on such containers: counting elements, comparing object identifiers of elements, etc. Is the pitch over-cautious? What are users who want to share containers of non-concurrent values supposed to do?