On Actor Initializers

kavon · May 27, 2021, 10:41pm

For the most up-to-date and complete version of this post's write-up, go here.

Hi everyone,

I'd like to start a discussion on a problem with how actor initializers are currently implemented, which permits a data race, and ways to solve it.

Here's the initial version of the write-up to kick off the discussion:

On Actor Initializers

Authors: Kavon Farvardin, John McCall, Konrad Malawski

Table of Contents

On Actor Initializers
- Synopsis
- Problems
  - Ordinary Initializers
  - Async Initializers
  - Deinitializers
- Solutions
  - Solution A: maintain unique ownership of self
    - Calling self methods after being fully initialized
      - Option 1: Flow-sensitive Rule
      - Option 2: Initialization Hooks
    - Summary
  - Solution B: add special semantics to executors
- Conclusion

Synopsis

The actors proposal (SE-0306) does not go into sufficient detail about how actor initializers and deinitializers work. There is a need for concrete details about them, because the creation and destruction of actor-instances has complex trade-offs surrounding two important aspects of actors: data-race safety and the availability of the actor's executor.

This post summarizes the problems with the current implementation of actor-instance init and deinit methods and proposes a number of solutions, in order to solicit feedback and discussion in the process of developing an amendment to SE-0306 to resolve this ambiguity.

Problems

As with classes, actors support both synchronous and asynchronous initializers, along with a user-provided deinitializer, like so:

actor Database {
  init() { /**/ }
  init(_ rows: [Data]) async { /**/ }
  deinit { /**/ }
}

This section provides an overview of the challenges in implementing initialization for actors.

Ordinary Initializers

The synchronous init() is special in that it is not treated as a cross-actor call, like it would be if it were an actor method, because there is not yet an actor or executor to "hop" to before entering the synchronous init method. In addition, deinit must be synchronous so that an actor-instance can be deallocated from anywhere.

The fact that the bodies of init and deinit are synchronous means that the actor-instance's executor is not used during their execution, because it's not possible to suspend and switch executors from a synchronous context. This creates a problem, because both of these special methods are currently allowed to use self in any way they wish once all of the stored properties have been initialized (i.e., when self is fully initialized). Thus, all of the actor's protected state is exposed once self is fully initialized in the actor's init method, but without synchronization with its executor!

This problem has real consequences, as in the following example:

actor StatsTracker {
  var counter: Int

  init(_ start: Int) {
    self.counter = start
    // -- actor's `self` is fully initialized at this point --
    Task.detached { await self.tick() }
    sleep(5)
    if self.counter != start { // 💥 race
      fatalError("actor state changed!")
    }
  }

  func tick() {
    self.counter = self.counter + 1
  }
}

where a task that mutates the actor's state is created and races with the actor's initializer. In the existing implementation of actors, this example will reach its fatalError statement. The purpose of actors is to prevent such races, so this behavior is considered a bug.

Async Initializers

Unlike its ordinary synchronous counterpart, an async init could implicitly hop to the actor self's executor once it is fully initialized.

But, there's another problem. It is both valid and desirable to be able to isolate an actor's init to a global actor, such as the @MainActor, to ensure that the right executor is used for the operations it performs. The problem is that it becomes unclear which executor is running for a global-actor isolated async init, which initializes an actor-instance object, if we were to do an implicit hop. Consider this example,

class ConnectionStatusDelegate {
  @MainActor
  func connectionStarting() { /**/ }

  @MainActor
  func connectionEstablished() { /**/ }
}

actor ConnectionManager {
  var status: ConnectionStatusDelegate
  var connectionCount: Int

  @MainActor
  init(_ sts: ConnectionStatusDelegate) async {
    // --- on MainActor --
    self.status = sts
    self.status.connectionStarting()
    self.connectionCount = 0
    // --- actor `self` fully-initialized here ---
    
    // ... connect ...
    self.status.connectionEstablished()
  }
}

where we would expect to have exclusive access to self, since this is its initializer, but we'd also like to perform the initialization while on another actor so that the ConnectionStatusDelegate can be updated without any possibility of suspension (i.e., no await needed). Currently, not even the assignment to self.status is considered valid in this example, even though the actor-instance self has not been fully initialized. That's because it's currently treated as a cross-actor assignment:

error: actor-isolated property 'status' can not be mutated from context of global actor 'MainActor'
    self.status = sts
         ^
note: mutation of this property is only permitted within the actor
  var status: ConnectionStatusDelegate
      ^

Deinitializers

When an actor-instance's deinit is called, its reference count has dropped to zero and it is not safe to use self after its deinit finishes execution. But, it is currently possible to break this rule in unexpected ways using tasks. Consider this deinitializer:

actor StatsTracker {
  var count: Int = 0
  func tick() { count += 1 }
  deinit {
    Task.detached { await self.tick() }
  }
}

which captures self in an escaping closure and enqueues it as part of a task on self's own executor, which might have already been destroyed by the time the task is executed! In fact, this example currently causes a crash in the actor runtime system. This same problem with keeping self alive beyond the lifetime of its deinit is faced by classes as well, but there is no reason to preserve past mistakes for actors, since they are a new nominal type.

Solutions

In total, there are five kinds of actor initializers and deinitializers for which we need to consider solutions in order to fix the bugs discussed earlier:

- Synchronous init.
- Synchronous init isolated to a global actor.
- Async init.
- Async init isolated to a global actor.
- deinit.

There are several high-level solutions to the problem, which will be discussed in detail.

Solution A: maintain unique ownership of `self`

Within an initializer, uses of self are already restricted: the initializer can only access the instance's stored properties until the instance is fully initialized. Other uses of self could potentially observe uninitialized memory. Swift takes advantage of the fact that an initializer starts off with a unique reference to self to guarantee memory safety. The restriction on early uses of self helps maintain that uniqueness of reference. As long as we have this uniqueness of reference, we also know it's safe to continue accessing the stored properties of the actor, even from a different actor.

A similar property applies to deinit. The fact that deinit has started executing means that there are no remaining references to self. Therefore, it is once again safe to access the stored properties of the actor.

One simple way of allowing init and deinit to take advantage of uniqueness would be to ensure that self remains a unique reference for the entire duration of the function. That is, the only uses of self that would be allowed would be accesses to the stored properties. This would mean that init and deinit would not be allowed to call any other method on self, since the method wouldn't necessarily obey the same restriction, and so it could make the reference non-unique and therefore introduce a race.
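To make the notion of a "unique reference" concrete, here is a minimal sketch using the standard library's isKnownUniquelyReferenced on an ordinary class. It is only a dynamic illustration of the property: Solution A would enforce uniqueness statically in the compiler, not with a runtime check. The Box type and the uniquenessDemo function are hypothetical names introduced for this example.

```swift
// Hypothetical class standing in for an object whose references we track.
final class Box {
  var value = 0
}

// Reports whether `box` was uniquely referenced before and after
// a second strong reference to it was created.
func uniquenessDemo() -> (before: Bool, after: Bool) {
  var box = Box()
  let before = isKnownUniquelyReferenced(&box)

  let alias = box // a second strong reference: uniqueness is lost
  // Keep `alias` alive across the check so it cannot be released early.
  let after = withExtendedLifetime(alias) {
    isKnownUniquelyReferenced(&box)
  }
  return (before, after)
}
```

Once a second reference exists, nothing stops its holder from touching the object concurrently, which is why the restriction forbids any use of self that could create one.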

Calling `self` methods after being fully initialized

There are two strategies for augmenting Solution A to handle the scenario when the programmer needs to execute code involving self after it has been fully-initialized in an init. This type of code presumably needs to modify some of the actor's state further, before any other uses of the actor happen, so it would be an error to not run this code after initialization. These uses of self in this type of code can be as mundane as calling a helper method on self, but because that method can do arbitrary things with self, the uniqueness restriction would not allow that method call.

Option 1: Flow-sensitive Rule

We could make the uniqueness restriction on self flow-sensitive within an asynchronous init that is not isolated to a global-actor. That is, we could allow an async init to do anything it wants with self after it is fully initialized. This is OK because the implementation is able to switch over to the self executor at the flow-sensitive point of full initialization. A flow-sensitive point just like this is already used to prevent calling a class method before all stored properties of the class are initialized.

For an async init that is isolated to a global-actor, switching to the self actor half-way through the function would be confusing. In our ConnectionManager actor defined earlier, it would mean that calls to @MainActor methods sometimes require await, and sometimes do not, within the same function body:

  @MainActor
  init(_ sts: ConnectionStatusDelegate) async {
    // --- on MainActor --
    self.status = sts

    if someCondition {
      self.connectionCount = 1
      // --- switch to `self` actor ---
      await self.status.connectionStarting()
    } else {
      self.status.connectionStarting()
      self.connectionCount = 0
      // --- switch to `self` actor ---
    }
    // ...
  }

Simply reordering an innocuous assignment to a stored property can change whether you must await the call to a @MainActor method or not. Thus, it does not make sense to switch to the self actor in a global-actor isolated init, even with the flow-sensitive rule.

For synchronous initializers, it is still not possible to switch to the self actor. Even with an asynchronous convenience initializer to act as a wrapper around a synchronous init, you're just propagating the problem. You will no longer be able to initialize the actor from a sync context, because it will need to go through some async initializer so that we can hop to the self actor. Currently, actors do not support convenience initializers, but we could support them if going this route.

Option 2: Initialization Hooks

Another way to ensure that the actor's state is modified after it is fully-initialized is to launch a task that performs those modifications as the last action within the init. For example, you could try to do this:

actor StatsTracker {
  var counter: Int

  init(_ start: Int) {
    self.counter = start
    // -- actor's `self` is fully initialized at this point --
    Task {
      assert(self.counter == 0) // 💥 race
      await self.establishInvariants()
    }
  }

  func establishInvariants() { ... }
  // ...
}

As long as no code appears in the initializer after that task, there is no chance of the task clobbering actor state that the initializer relies on, since the initializer assumes that it has exclusive access to self. Once self is returned from init, the usual mutual-exclusion and protection rules apply to that actor-instance.

But, this explicit task-launching pattern above suffers from a different race: the task launched in the init is racing to gain exclusive access to the actor, in order to establish a crucial initialization invariant. It needs to gain access to the actor before anyone else, but if the code that gets self back from the init immediately uses it, then it may gain access before the invariant-establishing task was scheduled:

let a = StatsTracker(0)
await a.tick()

So, the solution here is to integrate this task-launching pattern into the implementation of actors, to ensure that the task is always enqueued as the first one to gain access to self. The syntax for this "initialization hook" would look something like this:

actor StatsTracker {
  var counter: Int

  init(_ start: Int) {
    self.counter = start
  }

  afterInit {
    assert(self.counter == 0)
    self.establishInvariants()
  }

  func establishInvariants() { ... }
  // ...
}

where afterInit is a synchronous function that takes no arguments (other than self implicitly) and has exclusive access to the actor. Due to actor re-entrancy, it is not a good idea to allow the afterInit to be async. Any await appearing in this afterInit would provide an opportunity for some other task to take away access to the actor, before having completed the afterInit routine.
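The re-entrancy concern can be seen with ordinary actor methods today. Below is a minimal sketch in which an async method stands in for a hypothetical async post-init hook; ReentrantTracker, asyncSetup, and runDemo are names invented for this illustration, and Task.yield() is used only to create a suspension point.

```swift
actor ReentrantTracker {
  private(set) var log: [String] = []

  // Stands in for an async post-init hook (hypothetical).
  func asyncSetup() async {
    log.append("setup begin")
    await Task.yield() // suspension point: other work may run on the actor here
    log.append("setup end")
  }

  func tick() {
    log.append("tick")
  }
}

// Starts the setup and a competing call concurrently, then returns the log.
func runDemo() async -> [String] {
  let tracker = ReentrantTracker()
  let setupTask = Task { await tracker.asyncSetup() }
  let tickTask = Task { await tracker.tick() }
  await setupTask.value
  await tickTask.value
  return await tracker.log
}
```

The order of the entries is not deterministic. On some runs "tick" may land between "setup begin" and "setup end", which is exactly the interleaving a synchronous afterInit rules out.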

If there is a strong argument in favor of allowing async post-init code for an actor, here are three possible ways to support this, each with its own pros and cons:

- Allow a limited form of Option 1 to apply only to async inits that are not global-actor isolated.
  - Pro: this makes it harder for actor re-entrancy to allow another task to take over self before post-init code is done, because self has not yet been returned from the init. The programmer would need to manually create a second task in the init to make that mistake.
  - Con: makes the language less uniform: there is a flow-sensitive rule that loosens self restrictions only for an async, non-global-actor-isolated init. Does not work for global-actor isolated init.
- Allow for afterInit() async { ... }
  - Pro: keeps the language uniform: there is no flow-sensitive rule at all. Plus, this works for global-actor isolated async init too.
  - Con: Because self is returned by init once afterInit is enqueued on the actor, if a second task is enqueued on the actor and afterInit reaches an await, the second task gets to run before afterInit has finished.
- Don't provide any built-in mechanism for this. Programmers can use the manual but race-prone task-launching solution, because there is no way to guarantee that the afterInit code has fully completed if it contains any await at all.
  - Has the same pros and cons as an async afterInit.

Summary

So, we have an overall approach called Solution A that relies on uniqueness of reference to self to prevent races in the initializer. There are two options for implementing such a solution so that post-initialization code can still be specified:

Option 1 uses convenience initializers and a flow-sensitive rule that matches how classes work.
Option 2 uses a new afterInit declaration (called an initialization hook) in the actor to define a piece of code that is enqueued on the actor before returning from initialization.

To summarize whether you can use self arbitrarily after it is fully-initialized (or before it's deinitialized), but before self is accessible to others (or before self is deallocated), here is a table:

	Flow-sensitive	Initialization hooks
Sync `init`
Sync `init` + global-actor iso
Async `init`		or w/ flow-sensitive
Async `init` + global-actor iso
`deinit`

Where

means that you cannot do it in a robust way purely with initializers / deinitializers.
- For example, you would need to hide the init and use a static, async factory method to produce the actor-instance.
means it can be done with just initializers, but it would require defining an async convenience initializer.
means that you can simply write the code directly in the initializer / deinitializer.
means that you can write the code in an initialization hook.
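As an illustration of the factory-method workaround mentioned above, here is a minimal sketch. It assumes the post-init work can be expressed as an actor-isolated method; the make function and the body of establishInvariants are hypothetical and not part of the write-up.

```swift
actor StatsTracker {
  private(set) var counter: Int

  // Private, so callers outside the actor must go through `make`.
  private init(_ start: Int) {
    self.counter = start
  }

  // Hypothetical post-init step that needs full access to the actor.
  private func establishInvariants() {
    counter += 1
  }

  // Static methods on an actor are not isolated to any instance, so the
  // call below is an ordinary cross-actor call that must be awaited.
  static func make(_ start: Int) async -> StatsTracker {
    let tracker = StatsTracker(start)
    await tracker.establishInvariants()
    return tracker // nobody else can see `tracker` before this point
  }
}
```

Because the instance is not handed out until establishInvariants has returned, no other task can reach the actor first. The cost is that the actor can no longer be created from a synchronous context.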

Solution B: add special semantics to executors

An alternative to the above is to try to make the body of init and deinit actually behave like an actor function for self. Actors typically have a dedicated executor, and at the start of init (and deinit) this executor is known to be idle. In an init, we can safely allow arbitrary references to self to be introduced as long as we stop work from running on the actor's executor concurrently with the init. To do this, we could simply start up the executor in a running state instead of an idle state, then update the executor to no longer be running when the init is complete (potentially requiring some other thread to be scheduled to take it over, if jobs were added). In deinit, we could simply check that no new work was added to the actor during deinit.

The problem with this is that it makes a lot of assumptions about the actor's executor. Actors that customize their executor may not be able to support doing any of the above. For example, an actor might re-use an existing serial executor that might already be running work; we do not want to allow a synchronous initializer to block waiting for such an executor to become available. So this alternative would at best only be available for certain kinds of executors. Moreover, supporting this in custom executors would significantly complicate the SerialExecutor protocol just to enable a relatively obscure capability. It would probably only be reasonable to offer this for actors that do not use custom executors, and that would introduce an unfortunate semantic difference between different kinds of actors.

Conclusion

This post makes an effort to remain as objective as possible in laying out the problem and possible solutions. But, at least Kavon believes that Solution A with Option 2, which uses an initialization hook (afterInit) in combination with a uniqueness restriction on the init, would be the best way to solve this problem. Feedback is greatly appreciated.

JJJ · May 27, 2021, 11:48pm

I do not think this should be dictated by the convenience of isolating init() to a global actor. I think it's fair to treat init() as an ordinary actor method in that regard. But I agree that it would be a big plus if it was possible.

I think the intuitive solution to me, and what I think would be the easiest to explain, is if the executor could be in a limited state in init() and deinit. What would be intuitive to me would be something like only allowing child tasks in init(), ensuring all enqueued work was done before init() returns. (I guess that would be annoying to guarantee transitively.)

Edit: Also, I would be completely fine with deinit not allowing any form of task creation. Even fatalError()'ing in that case would be fine, IMHO.

JJJ · May 27, 2021, 11:52pm

Also, would it be possible, and make sense, to make the compiler automatically chop up init() where self becomes fully initialized, so the code after that point implicitly goes in an afterInit without spelling that out? Implicitly doing some sort of await afterInit, I guess.

Terje · May 28, 2021, 5:02am

The flow-sensitive solution seems ripe for causing confusion for the ordinary user.

To me it seems logical to break up init into two steps. The first step is to set up the executor and other internal state on the actor that creates the new actor. I mean, the user can’t expect to be able to create (long running) tasks before an executor even exists. The second step is the very first task, which completes the actor setup. Any other tasks added after the actor creation will run after that very first task, so everything should be fine. That’s (in line with) the initialisation hook afaict.

It would be nice if that very first task is called implicitly. If afterInit is present it gets called automatically.

bjhomer · May 28, 2021, 4:33pm

The idea of afterInit (or perhaps didInit) is somewhat appealing, because it makes explicit our implicit two-step initialization which sometimes causes confusion. Is it scalable, though? What if a type has two separate init methods, and they each need a different afterInit? Would it make sense to attach the afterInit to the init directly somehow, somewhat like how didSet can be attached to a property?

Terje · May 28, 2021, 6:11pm

didInit is nicer. Not a fan of afterInit either but I didn’t want to start bikeshedding.

You make a good point about different init methods having potentially different initial tasks. Something akin to didSet makes it clearer, though it prevents users from organising their code the way they want.

sighoya · May 28, 2021, 7:04pm

kavon:

class ConnectionStatusDelegate {
  @MainActor
  func connectionStarting() { /**/ }

  @MainActor
  func connectionEstablished() { /**/ }
}

actor ConnectionManager {
  var status: ConnectionStatusDelegate
  var connectionCount: Int

  @MainActor
  init(_ sts: ConnectionStatusDelegate) async {
    // --- on MainActor --
    self.status = sts
    self.status.connectionStarting()
    self.connectionCount = 0
    // --- actor `self` fully-initialized here ---
    
    // ... connect ...
    self.status.connectionEstablished()
  }
}

where we would expect to have exclusive access to self, since this is its initializer, but we'd also like to perform the initialization while on another actor so that the ConnectionStatusDelegate can be updated without any possibility of suspension (i.e., no await needed). Currently, not even the assignment to self.status is considered valid in this example, even though the actor-instance self has not been fully initialized. That's because it's currently treated as a cross-actor assignment:

error: actor-isolated property 'status' can not be mutated from context of global actor 'MainActor'
    self.status = sts
         ^
note: mutation of this property is only permitted within the actor
  var status: ConnectionStatusDelegate
      ^

It couldn't be updated safely on another actor. It must still be the @MainActor, because self.status references an isolated field in the main actor.
Moreover, imagine you are passing self to a dynlib that accesses only self.state. Flow analysis can't check this across boundaries, so both the main actor and the self actor need to isolate at the same time, which can be a subtle problem if both actors' isolated state depends on each other (deadlock).

kavon:

actor StatsTracker {
  var counter: Int

  init(_ start: Int) {
    self.counter = start
    // -- actor's `self` is fully initialized at this point --
    Task.detached { await self.tick() }
    sleep(5)
    if self.counter != start { // 💥 race
      fatalError("actor state changed!")
    }
  }

  func tick() {
    self.counter = self.counter + 1
  }
}

What's so problematic with this case?
Why can't we block on the detached task right before the if section?

Aren't we just safe when insisting on just one executor?
What's the deal for another?

kavon · May 28, 2021, 9:15pm

This is effectively Solution B, where the executor for self is locked for the duration of the initializer.

I think this would be confusing for programmers, because effects like calling print will not be observed at the time they expect: everything after the invisible point where self becomes fully initialized in the init would implicitly run asynchronously in a separate task. So, this program could print 2, then 1:

actor A {
  init() {
    // <- fully initialized right here,
    // so everything below is run in a new task:
    print("1")
  }
}

let a = A()
print("2")

The goal of putting this into a separate declaration is to highlight the fact that afterInit / didInit actually may run after self is returned from init, but before anybody else can access the actor's protected state.

Maybe you're not confused by this, but to be clear to others: for an actor's async init, the "hop" to self at the flow-sensitive point would be transparent to the programmer and have no noticeable effects. You would just be allowed to use self arbitrarily at that point, which is how classes and structs work: you're only allowed to call a method on self after having initialized all stored properties.

I like the name didInit more now, thanks. I also think you bring up a good point about wanting different didInit's for each init, or going further, possibly one of a few didInits from the same init. It makes me wonder whether the actor should have a built-in capability to express the idea of enqueuing a task on self without the possibility of a race.

In that example, the field status is not isolated to the MainActor, only the declaration init is. Because the init provides the first, unique reference to self, and we know nothing else is running on that actor, it's actually safe to mutate the self actor's state from a different actor like MainActor, in order to initialize self.

The uniqueness restriction discussed in the post would prevent this. To maintain uniqueness, you can only use self to access its stored properties. You cannot even pass self as an argument to a function, nor call any of its methods. This is the same restriction that is in place for classes*: until all of the stored properties are initialized, you can't do anything else with self. Solution A says that this uniqueness restriction should extend for the entirety of the actor's init and deinit to prevent races, since actors, unlike classes, are supposed to protect against races.

*except classes can call super.init, or self.init in a convenience init. But, the uniqueness restriction transitively applies to those two kinds of init too!

It's a data race on the actor's state. The sole purpose of actors is to prevent data races.

benlings · May 28, 2021, 9:57pm

Could self be nonisolated after the actor is fully initialised? If you need to call a method on self at this point to establish an invariant, you would need await and for the initializer to be async.

sighoya · May 28, 2021, 10:03pm

Thanks for pointing that out.

But it's inside an actor. Maybe I misunderstood it, but doesn't an actor prohibit only data races caused by external entities calling the actor?

Counterexample: what about the case where sts is mutated by the caller in parallel while self.status.connectionStarting also mutates status? Isn't that a data race too, or is that OK because it isn't annotated with isolated?

Terje · May 28, 2021, 10:16pm

Ok, yes. First the actor has to be fully initialised before any hop can/will occur. Though I think it would be beneficial if it can be made clear what runs on which actor (executor/queue/whatever) in this 2-step initialisation instead of implicitly hopping over. Suppose that a long running computation is kicked off just to set one of the fields (before the hop); then the user might be surprised that the original actor is waiting on that. Should such a computation be done in the init? Probably not, but it could.

Anyway, when I wrote that sentence I actually had the first snippet under "Solutions: Option 1: Flow-sensitive Rule" in mind, where the presence of async depends on the order in which the code is written. That is just asking for trouble, I think, and should be avoided if possible.

jayton · May 29, 2021, 8:50am

Following existing language precedent, we could write:

init (…) {
    …
} then: {
    …
}

jazzbox · May 29, 2021, 9:31am

Maybe we can borrow from the computed property syntax (with get {} set {}):

init() {
    setup {
        self.x = 42
    }
    finish {
        ...
    }
}

This has the advantage, that you immediately see what belongs together, especially with more than one init method.

BigSur · May 31, 2021, 9:44pm

Would defer init fix this problem?

init(){
...
defer {...}
...
}

kavon · June 2, 2021, 1:10am

Yes, that's essentially what we're trying to go for here in Solution A, but we're doing it as a flow-sensitive restriction on uses of self like classes. You bring up a good point about nonisolated. Initially I thought that nonisolated methods on self would be okay, but they still allow for a race with init:

actor A {
  var counter: Int
  func tick() { counter += 1 }

  init() {
    self.counter = 0
    self.nonIsoMeth()
    assert(self.counter == 0) // 💥
  }

  nonisolated func nonIsoMeth() {
    Task.detached { await self.tick() }
  }
}

The whole problem with init is that it's a declaration that is partially actor-isolated and partially not. We don't have that concept in the type checker for actor-isolation: an entire declaration or closure is isolated to one actor for its entirety. So, I left out calling nonisolated methods because I think most people expect that init is isolated to self, so the race above is not OK. Thus, Solution A generally says that only an async-init could call a method on self, but no await would be needed, since we can implement it that way under-the-hood without messing with the type checker. The heart of the problem is really the synchronous init, where that kind of implementation under-the-hood is fundamentally not possible.

Actors also protect against data races internally. The races I've demonstrated earlier for init will not race if that code appeared in any other actor-isolated method. Having a racy init like this is definitely viewed as a bug.
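A minimal sketch of that claim, assuming the same shape as the racy init but placed in an ordinary actor-isolated method (CheckedTracker and spawnAndCheck are hypothetical names, and the sleep from the original example is left out):

```swift
actor CheckedTracker {
  var counter: Int

  init(_ start: Int) {
    self.counter = start
  }

  func tick() {
    counter += 1
  }

  // Same pattern as the racy init, but isolated to the actor.
  func spawnAndCheck() -> Bool {
    let before = counter
    Task.detached { await self.tick() }
    // There is no suspension point between spawning the task and this
    // read, so `tick` cannot run on this actor until this method returns.
    return counter == before
  }
}
```

Here the detached task has to wait its turn on the actor, so the comparison always holds. The synchronous init gets no such protection, because it does not run on the actor's executor.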

No, because both methods are isolated to the global MainActor, which serializes those mutations.

This is a good point, but I don't think it's a problem. If the init needs to run as part of some global actor, like the @MainActor, then the init should be marked as such, and no implicit hop would happen, whether it's an async or not.

An async function in general, not just an async init, with unspecified actor-isolation would already require an await to call it, meaning there are no guarantees about the executor used to process that call. Whatever that function decides to do, Swift will switch back to the right executor once the function returns. Since a synchronous init cannot support a hop in general, I don't think we have any issues here.

I totally agree!

I appreciate the suggestions from @jayton, @jazzbox, @BigSur, and others about how this afterInit/didInit thing should be expressed; thank you.

I'm leaning more towards a design modeled like defer, where you specify in the initializer what code should be run asynchronously after the init. You can think of it like an "async defer" like this:

init() {
  self.x = 0
  async defer { self.establishInvariant() }
}

I'll pop back in here once I've done a feasibility assessment of this design, while considering the pros/cons of the other suggestions too!

Thanks,
Kavon

Chris_Lattner3 · June 4, 2021, 6:13am

This is a great writeup, thank you for such a clear explanation of the issues and challenges. Ideally the solution for dtors can help inform the existing problem we have with class dtors.

I don't think that initialization hooks are an acceptable way to go, I think it is important to allow calling self methods in init/deinit like we do for other types. It is too much complexity and innovation to solve an existing problem, and we already have strong precedent for flow sensitive rules in classes.

There are a bunch of different cases here and it is sort of hard to parse it out from the writing, I don't think we should have different solutions for different cases (sync init, async init, global actor init) -- we should aspire to a unifying solution that composes naturally.

The key question here is "what is isolated to the actor"? This affects the sendability checks that have to happen, and it affects the current context (whether you're "inside" the actor).

I think it would be a much simpler model to say that the init is /not/ part of the actor (it is part of the calling context), and that the initialization of the properties happens outside the actor until it is fully initialized, at which point it is "sealed" and it is a standalone actor. The @Sendable checks happen when storing into the actors properties. The consequence of this is that initializers would have to be async and await methods on the self actor if they want to invoke them (because of the cross actor hop):

func passOff(_ : A) {...}

actor A {
  var str : String

  init() {
    str = "foo" // totally fine
    // self is now fully initialized, so it is a proper actor, and self is nonisolated
    passOff(self) // totally fine, by the existing flow sensitive rules in Swift.
  }

  func syncMethodOnActor() {}

  // Also fine.  init isn't part of the actor isolation unit, so passing an NSString is ok.
  init(nsstr: NSString) async {
    str = String(nsstr) // perfectly safe.
    // self is now fully initialized, so it is a proper actor, and self is nonisolated.

    // must await this sync method because we're nonisolated.
    await self.syncMethodOnActor()
  }
}

I think this is simple (building on the existing nonisolated support), composes naturally with global actor markers etc., and is consistent with the existing rules for structs and root classes (where they magically change character when fully initialized).
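A call-site sketch of what this model looks like from the outside. The `Logbook` actor, its members, and `demo()` are hypothetical names chosen for illustration, not part of the proposal. It assumes a toolchain where a synchronous actor initializer can be called without `await` and where any call to an isolated method from outside the actor must be awaited:

```swift
// Hypothetical actor, used only to illustrate the call sites.
actor Logbook {
  let title: String
  var entries: [String] = []

  // Synchronous init: runs in the caller's context, only stores properties.
  init(title: String) {
    self.title = title
  }

  // Isolated method: callers outside the actor must await it.
  func add(_ entry: String) -> Int {
    entries.append(entry)
    return entries.count
  }
}

func demo() async -> (String, Int) {
  let book = Logbook(title: "ops")        // no await: init is not a cross-actor call
  let count = await book.add("started")   // await: hop onto the actor
  return (book.title, count)
}
```

The point of the sketch is only the asymmetry: construction is an ordinary call, while anything that touches isolated state afterwards goes through an `await`.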

deinits are a different issue and (as you point out) the same issue exists for classes. I don't think we should do something weird only for actors; I think we should fix the problem for both classes and actors in one fell swoop with a single model. I don't think that limiting what you write in deinits is a productive thing to do: I think we should do a simple dynamic check that verifies the class has no references at the point of its final destruction.

-Chris

Alejandro_Martinez · June 4, 2021, 7:45am

From a dev perspective, without knowing how hard it is for the compiler to do this, I think this is the best approach. As Chris mentions, devs are already used to this from the current initialisation rules. If we show similar errors it will feel very natural and intuitive to learn.

benlings · June 4, 2021, 11:27am

I don’t think this would work, because this init is nonisolated at this point and therefore can’t access non-let properties. It would have to be changed to init() async and assert(await self.counter == 0). This would then only be racy in the same way that any re-entrant calls on actors are.

Chris_Lattner3 · June 4, 2021, 3:27pm

Actually, I think the opposite model also works, and it has some advantages and disadvantages: this model would say that the init method is in the actor's isolation domain. This means that the sendability checks happen on the arguments to the init member, so this is more similar to other actor method calls. This lets the "setup" logic at the beginning of the init do arbitrary computation within itself with non-sendable types (e.g. set up a reference semantic binary tree and store it into a member):

actor Foo {
  var localState : ReferenceSemanticThingy

  init(name : String) {
    // This is totally fine, the arguments have already been checked.
    localState = ReferenceSemanticThingy(name)

    // actor is fully initialized.

    // how does this work?
    self.foo()
  }
  func foo() {..}
}

I see a few challenges with this model though:

- You need to make sure that TLVs and other things that happen within that initialization phase are done in the actor's context.
- Custom executors mean that you can't call foo without actually being on the actor task/executor (as you mention above).
- Actor initializers can be failable (optional return or throw an error) and we need to support this and propagate it back up.

While I can imagine some hacky solutions to parts of this problem, the only way I see to solve all the issues is to require an await when creating an actor, and have it actually switch to the actor's context and run the init logic there: throwing, TLV, etc. all compose out of that model cleanly. The existing optimizations that eliminate unnecessary cross-actor hops should work as well.

This seems like a pretty defensible model to me, given every other cross-actor interaction requires an await, why should initialization be any different?
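A minimal sketch of how this "inside the actor" model reads in code. `Inventory`, `normalize()`, and `makeInventory()` are hypothetical names for illustration. The sketch assumes a toolchain in which an `async` actor initializer runs isolated to `self`, so that once all stored properties are set, the init can call an isolated method directly:

```swift
// Hypothetical actor, used only to illustrate init running on the actor.
actor Inventory {
  var items: [String]

  init(seed: [String]) async {
    items = seed
    // self is fully initialized here. Under the stated assumption the init
    // is already running in the actor's isolation domain, so this is a
    // plain synchronous call rather than a cross-actor hop.
    normalize()
  }

  func normalize() {
    items.sort()
  }
}

func makeInventory() async -> [String] {
  // Creating the actor is awaited, like any other cross-actor interaction.
  let inventory = await Inventory(seed: ["b", "a"])
  return await inventory.items
}
```

The cost of this design is visible at the call site: every construction of the actor is a potential suspension point.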

Coming back to the crux of the issue, the init logic either needs to be "inside the actor" or "outside the actor". Either way can work, but we need to pick one. I don't think that hybrid models will work.

-Chris

benlings · June 4, 2021, 5:30pm

With init being 'inside the actor', presumably this means that constructing it would need to be awaited? i.e.

let f = await Foo(name: "foo")

I think it would be useful, particularly with the decision on lets being nonisolated, to not require all actor initialisation to be asynchronous, e.g.

actor NamedActor {
  let name: String
  init(name: String) {
    self.name = name
  }
}
let bar = NamedActor(name: "bar")
print(bar.name)

When I posted previously, my line of thinking was that the following would be equivalent:

actor Foo {
  var localState : ReferenceSemanticThingy

  // Option 1a - sync init method
  init(name : String) {
    // This is fine - can do this from nonisolated context
    localState = ReferenceSemanticThingy(name)
    // actor is fully initialized.
    // Do things in non isolated context, e.g. start new Task, access `let`s, etc
  }

  // Option 1b - sync static factory
  static func named(_ name: String) -> Self {
    let f = Self(ReferenceSemanticThingy(name))
    // Do things in non isolated context, e.g. start new Task, access `let`s, etc
    return f
  }

  // Option 2a - async init method
  init(name : String) async {
    // This is fine - can do this from nonisolated context
    localState = ReferenceSemanticThingy(name)
    // actor is fully initialized.
    // Do things in non isolated async context
    await foo()
  }

  // Option 2b - async static factory
  static func named(_ name: String) async -> Self {
    let f = Self(ReferenceSemanticThingy(name))
    // Do things in non isolated async context
    await f.foo()
    return f
  }

  init(_ localState: ReferenceSemanticThingy) { self.localState = localState }
  func foo() {..}
}
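For comparison, here is a self-contained sketch of the async static factory (Option 2b) on its own. `Session`, `isStarted`, `start()`, and `makeStarted(named:)` are hypothetical names, and a plain `String` stands in for `ReferenceSemanticThingy` so the example avoids sendability questions. It relies only on a synchronous memberwise-style init plus an awaited call to an isolated method from the nonisolated factory:

```swift
// Hypothetical actor illustrating the factory pattern.
actor Session {
  let name: String
  var isStarted = false

  // Sync init: only stores properties, never calls isolated methods.
  init(name: String) {
    self.name = name
  }

  func start() {
    isStarted = true
  }

  // Static methods on an actor are not isolated to any instance, so the
  // post-initialization work is an ordinary awaited cross-actor call.
  static func makeStarted(named name: String) async -> Session {
    let session = Session(name: name)
    await session.start()
    return session
  }
}
```

A caller would write `let s = await Session.makeStarted(named: "a")`, which has the same shape at the call site as an `async` initializer.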
