Relaxing non-isolated protocol conformance

mattie · September 19, 2024, 10:09am

I had an idea for a way to relax the constraints on an isolated type conforming to a non-isolated, non-Sendable protocol. But, I haven't thought too deeply about it yet. And, honestly, I just don't think I know enough about the possible problems here. So, I thought I'd just throw it out there and ask for help poking holes in it.

Today, isolated functions are prevented from satisfying non-isolated protocol requirements. My idea is to shift the constraint from the conformance to the usage of the type. To protect against invalid usage, any attempt to convert the conforming type to the non-isolated protocol must happen synchronously, from within the type's own isolation.

// Note this is non-Sendable!
protocol NonIsolatedProtocol {
	func doThing()
}

actor MyActor: NonIsolatedProtocol {
	// normally an error
	func doThing() {
	}
}

// both errors
// converting to protocol cannot be done outside of the actor's isolation 
MyActor() as NonIsolatedProtocol
let p: any NonIsolatedProtocol = MyActor()

class NonSendableClient {
	init(_ arg: NonIsolatedProtocol) {}
}

extension MyActor {
	func withMyIsolation() {
		// this can work!
		let client = NonSendableClient(self)
	}
}

I have a feeling there are problems with this idea. I'd love to hear your thoughts, especially the problems that haven't occurred to me yet.

ktoso · September 19, 2024, 10:16am

That's a big can of worms in general...

For example, these kinds of "conform to synchronous requirement" exist already if you conform using

actor A: @preconcurrency TheProtocol

yet... they actually are a hole/bug in the checking, as shown in swiftlang/swift issue #76507, "Opening actor existentials breaks dynamic isolation checks inserted for @preconcurrency conformances".

If called on an any TheProtocol, there's no dynamic or static warning, or even a runtime crash -- we call from random threads into the actor itself, breaking isolation entirely.

We have to fix the above bug; probably by inserting dynamic checks INTO the methods rather than into just the witnesses because we open those protocols...

Perhaps out of the fix for the above issue we can get a mechanism to "check at runtime" but then again, it'd be defeating the purpose of static checking by shifting the checks to runtime, and for what gain? It's just always going to be wrong to call without the proper isolation.

~~

// error, would require await
MyActor() as NonIsolatedProtocol

I don't see why we'd need a cast here? It's an IS-A relationship here to begin with between those two.

It's not that the conversion requires an await; the calls require one; or we'd have to magically insert thread-blocking calls whenever doThing() is called in a blocking way because we must hop to the actor to make those calls correctly.

This is not a good idea to do implicitly, blocking threads should be explicit.

There may be some kernel of an idea here, but just making it easy to conform to synchronous requirements without anything additional is not going to be enough.

mattie · September 19, 2024, 10:26am

Right, this conversion does not require an await today. What I'm imagining is a language change that would make this synchronous cast disallowed.

From that bug, with this imaginary change the following would be a compiler error if OldProtocol contains synchronous methods and is non-Sendable (which are both true)

let a: any OldProtocol = MyActor()

However, I have not thought about how @preconcurrency would come into play and that's a great point...

ktoso · September 19, 2024, 10:32am

I'm sorry I don't see how this would be solving anything.

You still could, given the above, do:

MyActor().doThing()

if we allowed the conformance. And this is incorrect -- synchronous code cannot hop to the actor, so we'd have to either:

- implicitly make up a task (bad, allocation), BLOCK the calling thread (very bad), perform the async call and unblock the caller.
  - this really should not be as implicit and hidden like that. If we are to make some "block the calling thread to make async calls" we should at least make it a little bit noticeable in source because this can totally be a source of deadlocking the entire system.
- have some opportunistic way to execute the actor on the calling thread even if it's not a task...?
  - but we can't do that if the actor is already running something, so we're back to blocking threads.

What the @preconcurrency conformances are supposed to do is:

if called from a context that is NOT already on the actor -> crash

And we'll bring back this behavior in this edge case where existential opening breaks it, but you see it's not going to save you from "hop to the actor", it's going to require it and move the checking to runtime from compile time. It's a last resort for existing legacy APIs.

TL;DR; This isn't as simple as just allowing the conformance and suddenly the calls just work.
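To make the "crash if not already on the actor" behavior concrete, here is a minimal hand-written sketch of that kind of runtime check. It is not how the compiler implements @preconcurrency witnesses; it only assumes the standard library's MainActor.assumeIsolated (Swift 5.9 or later), and Model and callThroughExistential are hypothetical names made up for illustration.

```swift
// Non-isolated, non-Sendable protocol with a synchronous requirement.
protocol TheProtocol {
    func doThing()
}

// Hypothetical main-actor type, used only for illustration.
@MainActor
final class Model: TheProtocol {
    private(set) var count = 0

    // A nonisolated witness can satisfy the synchronous requirement.
    nonisolated func doThing() {
        // Dynamic check: this traps at runtime if the caller is not
        // already running on the main actor. No hop is performed.
        MainActor.assumeIsolated {
            self.count += 1
        }
    }
}

@MainActor
func callThroughExistential() -> Int {
    let model = Model()
    // Erasing to the protocol is allowed; the check happens at call time.
    let erased: any TheProtocol = model
    erased.doThing()
    return model.count
}
```

Calling callThroughExistential() from the main actor passes the check and increments the counter once. Calling erased.doThing() from any other context would trap rather than hop, which is the "moved to runtime" trade-off described above.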

mattie · September 19, 2024, 10:43am

I don't understand, yet, why this would be! Calling outside of the actor's isolation would require an await. The only way you could make this call synchronously is if you were already isolated to the actor.

I'm not proposing changing the calling semantics in any way whatsoever. I'm experimenting with the idea of catching the cases where the synchronous requirements could escape the type's isolation and disallowing that.
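As a small illustration of the calling rules I'm relying on (this is today's Swift, nothing proposed): inside the actor the call is synchronous, outside it needs an await. The names doThingTwice and callFromOutside are made up for this sketch.

```swift
actor MyActor {
    private var thingsDone = 0

    func doThing() {
        thingsDone += 1
    }

    func doThingTwice() -> Int {
        // Already isolated to this actor: plain synchronous calls.
        doThing()
        doThing()
        return thingsDone
    }
}

func callFromOutside() async -> Int {
    let myActor = MyActor()
    // Outside the actor's isolation: every call has to hop, so it needs await.
    await myActor.doThing()
    return await myActor.doThingTwice()
}
```

Dropping either await in callFromOutside is a compile-time error, and that is the guarantee the idea leans on.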

I have to think more on the preconcurrency angle, and I will!

ktoso · September 20, 2024, 2:13am

Okay, so you're saying that stripping away the "actorness" would undergo checks... maybe?

This is a bit weird since the actor is no longer clearly an IS-A instance of some protocol... as you would not be able to pass:

protocol P { func s() }
actor A: P { func s() {} }

func take(a: any P) {}
take(a: A())

since this erases to P you'd have to error there...

Notably, this note from initial post is not right:


// converting to protocol cannot be 
// done outside of the actor's isolation 
MyActor() as NonIsolatedProtocol

it doesn't matter "where" we do the conversion, it'd still be unsafe to escape such an erased thing:

actor A: P { 
  func s() {
    take(a: self)
  } 
}

Since this erases to any P you can't do this here either. some P would also not work statically, because you don't know if the underlying thing is isolated or not...

Maybe there's a kernel of an idea here but so far I don't see how pushing the check to "conversion" moment can give us something viable.

Jumhyn · September 20, 2024, 2:35am

I think the shape of the idea here is something like:

- We only allow the A to P conversion when we have an isolated A
- Because P is not Sendable, a value of type P can never leave the isolation domain in which it’s formed
- Therefore a P formed with an isolated A will always remain in the same domain as the actor from which it was formed
- Therefore it will be safe to call methods on that value in a nonisolated manner

I’ve not thought enough to be confident that this analysis is correct in all cases but I think I understand what @mattie is getting at!
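The reasoning above can be approximated in today's Swift with a hand-written, non-Sendable adapter that is only ever formed inside the actor's isolation. This is a rough sketch, not the proposed feature: Adapter, Client, poke, and run are hypothetical names, and the adapter relies on the standard library's Actor.assumeIsolated (Swift 5.9 or later) to check at runtime what the idea would check statically.

```swift
protocol P {
    func s()
}

// Non-Sendable client that only knows about the protocol.
final class Client {
    private let target: any P
    init(_ target: any P) { self.target = target }
    func poke() { target.s() }
}

actor A {
    private var calls = 0

    // Non-Sendable adapter. Because it is not Sendable, a value formed
    // inside A's isolation cannot be sent to another isolation domain.
    private final class Adapter: P {
        let owner: A
        init(owner: A) { self.owner = owner }

        func s() {
            // Runtime stand-in for the static guarantee: traps if the
            // caller is not running on the owner's executor.
            owner.assumeIsolated { isolatedOwner in
                isolatedOwner.calls += 1
            }
        }
    }

    func run() -> Int {
        // Formed with an isolated A, used synchronously in the same domain.
        let client = Client(Adapter(owner: self))
        client.poke()
        return calls
    }
}
```

Here `await A().run()` calls s() once, synchronously, without leaving the actor. The idea in this thread would let A conform to P directly and have the compiler enforce the same confinement.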

ktoso · September 20, 2024, 3:01am

Ah the lack of Sendable is something I had forgotten about here, thanks for pointing that out.

That's interesting and might actually work, I wonder if it's practical enough to be useful? Maybe in situations where really most of the time just concrete types are used it might be. I wonder if it'd help get rid of preconcurrency conformances which are tied to MainActor types, because those may often be "never leaving the main actor", hm...

Any erasing of the actor to the protocol would need to be prevented (I'm having a feeling that even some P and generics may also not play well with this...) on such adopting types, which feels a bit weird given the natural sub-typing relationship... Like we could not invoke take(a: any P) with a nonisolated A but we could with an isolated A and the reason is the synchronous requirements on P hm...

I'm somewhat tempted to call these "isolated conformances", and maybe we'd require conforming in such way using T: isolated Protocol?

I don't know how comfortable we'd be with such a rule from a type-system perspective, wdyt @Slava_Pestov ?

mattie · September 20, 2024, 10:11am

The protocol being non-Sendable is critical to the concept! I should have called it out more prominently in the first post.

I can tell you that I have encountered this situation frequently with the delegate pattern in AppKit/UIKit.

Here's a concrete example: NSTextStorage. It has a non-isolated, non-Sendable delegate type NSTextStorageDelegate. And this is correct, because NSTextStorage is not a MainActor type and is itself not Sendable. In fact, large portions of the text system explicitly support non-main-thread usage. But, that's not the common case. The common case is to have a MainActor type that a) creates and owns the storage instance and b) becomes its delegate. Today, this requires a @preconcurrency conformance. But nothing about this is "preconcurrency". That's just a useful tool for allowing this totally valid arrangement to work.
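Here is a stripped-down sketch of that arrangement. To keep it self-contained it does not use AppKit/UIKit: Storage and StorageDelegate are hypothetical stand-ins for NSTextStorage and NSTextStorageDelegate, Editor is a made-up owner type, and the @preconcurrency conformance syntax assumes a Swift 6 toolchain.

```swift
// Hypothetical stand-in for a non-isolated, non-Sendable delegate protocol.
protocol StorageDelegate: AnyObject {
    func storageDidChange(_ storage: Storage)
}

// Hypothetical stand-in for the storage: not main-actor-bound, not Sendable.
final class Storage {
    weak var delegate: (any StorageDelegate)?
    private(set) var text = ""

    func append(_ string: String) {
        text += string
        // Synchronous callback on whatever thread the mutation happened.
        delegate?.storageDidChange(self)
    }
}

// The common case: a main-actor type that owns the storage and is its delegate.
@MainActor
final class Editor: @preconcurrency StorageDelegate {
    let storage = Storage()
    private(set) var changeCount = 0

    init() {
        storage.delegate = self
    }

    // Main-actor-isolated witness; the conformance checks isolation at runtime.
    func storageDidChange(_ storage: Storage) {
        changeCount += 1
    }
}

@MainActor
func editAndCountChanges() -> Int {
    let editor = Editor()
    editor.storage.append("hello")
    editor.storage.append(", world")
    return editor.changeCount
}
```

The storage never leaves the main actor here, so every callback arrives with the right isolation. The conformance is still spelled @preconcurrency even though nothing about it is legacy.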

I think that semantically this idea may need to work the same, though. Currently, there are still many examples of protocols that execute their functions in the background but are not Sendable. And the isolation guarantees still need to be enforced.

xwu · September 20, 2024, 3:35pm

Per @Jumhyn, aren’t we actually talking about isolated T: Protocol?

mattie · September 20, 2024, 7:54pm

I think what @ktoso was suggesting was that there'd need to be an indication at the site where the conformance was added to indicate this special mode was being used:

// pretend syntax
protocol P { func s() }
actor A: isolated P { func s() {} }

But I'm not quite sure why there would need to be any extra information communicated here. The compiler knows there's an isolation mismatch already, and could use that to produce errors at the site of unsafe conversions. (I actually know nothing about the internals here and I'm sure this is easier said than done.)

But, my theory is there are common use cases where unsafe conversions wouldn't ever happen in the first place. And if that's true, it would be really nice because the developer would not even need to be aware of this special-casing.

ktoso · September 21, 2024, 6:45am

I'm just thinking out loud about the : isolated P because these conformances are somewhat "not complete" and it's a bit weird that we'd allow conforming to things just like other types and situations, and then somewhere down the line you notice you can't use it like you'd expect to use any other conformance in Swift -- thus the idea to mark it. To me at least that's less surprising and we're educating at the point of the "weird conformance" about what it is, rather than somewhere completely unrelated that "you can't pass this value! (to an any P parameter, which normally would be completely fine)"

mattie · September 21, 2024, 10:21am

Ok that's a good point.

On the one hand, the "point-of-use" could be really far from the conformance - different file, even different module. So, forcing the developer to think about it upfront is reasonable.

On the other hand, this adds work (understanding, writing code) to handle a situation that could never occur. Also, many concurrency-related problems can only show up at invalid points-of-use.

The more I think about it, the more I feel like it isn't the conformance that's weird at all, it's the use that's weird. Though I will admit both the problem and the fix are slightly hard to articulate.

"Converting type 'MyActor' to 'any P' could lose isolation because the protocol has synchronous requirements" or something.

vns · September 21, 2024, 10:32am

There would be an issue with this behavior IIUC, since the following code could be possible (since an actor-isolated type is Sendable and allowed to be passed to a different isolation), but the hop to the main actor wouldn’t be performed:

@MainActor
class Entity: Codable {
    // …
}

nonisolated func callee<T>(
    _ value: T
) where T: Encodable {
    // can be called off main actor
}

Which (theoretically) can be extended with isolated in some imagined syntax:

nonisolated func callee<T>(
    _ value: T
) where T: isolated Encodable {
}

But then this will be the question of the design: if some API isn’t modeled to allow such, use of the protocol conformance would still be limited.

mattie · September 21, 2024, 10:55am

This is a great example! This is exactly the kind of thing this change would have to be able to catch.

@MainActor
func useCalleeWhenIsolated() {
  // this is fine, no isolation change required
  callee(Entity())
}

func useCalleeWhenNot() async {
  // not allowed, isolation does not match Entity
  callee(Entity())
}

You are 100% correct, though, that this is a limitation. I think it is functionally identical to the limitations imposed by a @preconcurrency conformance.