Isolated synchronous deinit

xwu · August 11, 2022, 5:05pm

Is this more strict than it has to be? (I haven’t reviewed the rules for initializers here, and as long as it’s all consistent, it’s certainly fine, but this is for my own edification.) Can there not be hops between executors as an object is deinitialized, with subclass C doing its thing on one executor, then hopping over to the executor for subclass B’s deinitializer, etc.?

Nickolas_Pohilets · August 11, 2022, 9:25pm

Short answer - implementing this for sync code would reinvent async functions, but poorly. If there is an actual demand for this, I think async deinit would be a better tool for this. And also more powerful - you would be able to call async functions from deinit.

And the reason for this lays in how deinit works. There are two kind of deinit - destroying and deallocating. Deallocating is the entry point - it calls the destroying and then frees the memory. Instance size is hardcoded in the implementation of the deallocating deinit. Destroying deinit calls destroying deinit of the superclass. Destroying deinits on their own could be implemented with multiple hopping, but in the end you need to return to freeing the memory. So you cannot do a one-way hop as current implementation does, you would need to pass some sort of completion closure. This is what I mean by poorly reinventing the async functions.

If Swift would free the memory in the base class, as ObjC does, then one-way hopping would work. But I doubt that there is a sufficient demand for such deinit’s to motivate such ABI-breaking change.

Actually @objc Swift classes do free memory that way. So one-way hopping to different executors technically is possible for @objc classes. But I decided to prefer consistent behavior.

Nickolas_Pohilets · August 11, 2022, 10:00pm

True, it can fluctuate. And as I wrote above, you can also detect it using subclasses. So implicit is deinit is not perfectly unobservable. So in the end it is a trade-off. Based on my current project, only 1.4% of classes have explicit deinit. Based on global open source search in Sourcegraph, it is 77.5k deinits for 2.2m classes, which gives 3.5%. I do not think that it is worth to penaltise other 96.5% only for a small smoothing of learning curve.

For MyClass, change between implicit vs explicit deinit in MyActor is not a source-breaking change. It is type-checked independently from MyActor.

And in term of runtime contract, in Swift release of any class is allowed from any thread. So if MyClass has a nonisolated deinit, it may observe it being called from different threads/actors, but it should work correctly regardless. If MyClass requires deinit to be called on a specific actor, it should take care of isolating its own deinit.

kavon · August 12, 2022, 5:56am

Hi @Nickolas_Pohilets, thanks for tackling this! This part of the design in SE-327 has been a sore point for me and I've also considered a redesign for deinits along this line of thinking.

First I want to say that this pitch looks good overall. I don't see any issues with it as it stands, because as others have mentioned, the point at which a deinit will be executed is not guaranteed by the language.

But, I do think that we will want to evolve and/or add the option to have a fully-fledged async deinit that one must opt into, because on its own, a synchronous but isolated deinit won't fully solve the problem of not having access to access an instance's isolated data to deinitialize it. However, the synchronous deinit here does appear to permit for an efficient and simple implementation, so I think there's value in having it as an option in the overall solution.

The main corner case I can think of at the moment is the situation where multiple properties have different isolations:

class A {
  @MainActor var one: NonSendableType = .init()
  @SomeActor var two: NonSendableType = .init()

  @MainActor deinit {
    _ = one.tearDown() // OK with deinit isolated to MainActor

    _ = two.tearDown() // error: cannot access property 'two' with a non-sendable type 
                       // 'NonSendableType' from main-actor isolated deinit 
  }
}

Since a function can only be isolated to one actor, we can't write a synchronous, isolated deinit to allow unrestricted access to these properties due to their types not being Sendable. An implicit deinit that would otherwise be generated by the compiler is an exception to that and does not need to worry about isolation, because we know exactly what it does with the properties and can "prove" that it is safe. User-defined deinits are the only trouble. So I think the only way to completely fix this issue is to also allow for an async deinit.

Thus, my immediate concern is around how to implement of async deinits efficiently but without too much complexity. For example, if a large data structure full of async deinits were to be deallocated, we want to ability to optimize this at runtime to avoid flooding the task scheduler. I think it would be good to have this either in the initial implementation, or have the implementation be extensible in the future to allow for those optimizations, without causing an ABI break.

I have some unpolished ideas for how we could make this efficient (e.g., imagine batching deinits to use a single task), but I'm curious if you've thought about this?

kavon · August 12, 2022, 9:51pm

On second thought, I was mistaken here. An async deinitializer won't help us when accessing non-sendable types, and there isn't a way to solve that at the moment, because awaiting doesn't change the fact that the value can't safely pass across the actor boundaries:

warning: non-sendable type 'NonSendableType' in implicitly asynchronous access to global actor 'SomeActor'-isolated property 'two' cannot cross actor boundary
    _ = await two

but this need for obtaining isolation to different actors in the course of running the deinit is still something we ought provide via an async deinit. Here's a slightly different example:

@globalActor actor SomeActor {
  static let shared = SomeActor()
}

@SomeActor final class Data: Sendable {
  func tearDown() {}
}

@MainActor func notifyEverybody() {}

final class A: Sendable {
  @SomeActor let data: Data = .init()

  @MainActor deinit {
    _ = notifyEverybody() // OK no await needed

    _ = Task { await data.tearDown() } // oops I escaped `self` in a deinit!
   // deinit can complete before task using `self` is scheduled
  }
}

We don't want people reaching explicitly for tasks within a deinit just to make an async call that is needed to properly destruct the instance. That's mainly because of the danger of escaping self from a deinit, which can cause crashes. If we had an async deinit I think we could correctly order things so self is not destroyed before its last uses in the deinit:

Task {
  await someInstance.deinit()
  // now, task is scheduled and uses `self` only once and never again after invoking deinit
}

Nickolas_Pohilets · August 17, 2022, 3:02pm

Async deinit is a complicated topic, and I was glad to put it out of scope. I’m not yet familiar with use cases for async deinit, and I wouldn’t trust myself yet to choose right trade-offs. So I guess, I would start with the simplest implementation that just wraps every deinit into a task. Or don't do even that - just make sure that developers have enough tools at their disposal to read all the fields and manually launch a tasks, and then see how this is used.

I’m not sure I fully understand you idea for batching async deinits, but I guess it is something along the lines of letting deinits of child objects to be executed inside the task of the parent object. I think this could be generalized even further - if last release happens inside a task, then async deinit executes inside that task. Effectively this introduced async release.

Yes, you are absolutely right about this. I think having a runtime check for escaping self would help to mitigate this to some degree. This is not the best tool, but it is so simple to implement, that I think we should do it. Ideal solution would be to check at compile time. But in the most general case it requires support for explicit lifetimes and writing code generic over lifetimes.

Async deinit should be able to work with non-Sendable fields isolated on different actors, if work with each field is moved to a function with appropriate isolation:

class A {
  @MainActor var one: NonSendableType = .init()
  @SomeActor var two: NonSendableType = .init()

  deinit async {
  	await deinitOne()
  	await deinitTwo()
  }

  @MainActor func deinitOne() {
  	_ = one.tearDown()
  }

  @SomeActor func deinitTwo() {
    _ = two.tearDown()
  }
}

But I'm afraid we have a bigger problem here.

For simplicity let's think about non-sendable type as a struct which contains a reference to mutable data structure:

class Impl {
	var data: Int = 0
}
struct NonSendable {
	var impl: Impl
}

So for each variable of NonSendable type we need to consider isolation of two memory locations - variable itself, and the data inside the Impl. Both of these locations are "isolated" on actor/task/thread. But first location can have explicit isolation annotations visible to the type system. Second one does not, but to avoid data races, it also needs to be isolated on something. We just don't have information about that in the type system and don't check it. Instead every time we access non-sendable type, we assume that previous checks have done their job, and current context is what Impl is isolated on. So isolation of the variable effectively becomes isolation of the Impl.

But now flow-sensitive isolation allows to write to isolated variables of non-sendable types in nonisolated inits. For the variable itself that's ok - we have exclusive access to the variable and can ignore its isolation. But for the Impl that's a problem. It becomes more visible if we use pseudocode with dependent types:

// Currently compiles without error
actor Leaker {
    var foo: NonSendable
	init() {
		self.foo = NonSendable()
	}
}

Pseudocode with dependent types:

class Impl<iso: Isolation> {
	@iso var data: Int = 0
}
struct NonSendable<iso: Isolation> {
	var impl: Impl<iso>

	@iso
	init() {
		self.impl = Impl<iso>()
	}
}

actor Leaker {
    var foo: NonSendable<self> // Yes, it's a lowercase self, it's a type depending on value

	// nonisolated init is modelled as a generic function where current context (actor/thread/task) is a generic parameter
	init<currentContext: Isolation>() {
		// Error: Cannot assign value of type 'NonSendable<currentContext>' to type 'NonSendable<self>'
		self.foo = NonSendable<currentContext>()
	}
}

So we need a rule, that:
a) all non-sendable variables must be isolated
b) all non-sendable variables can be initialized only in the function that has the same isolation as the variable

Under this rule, you can initialize variables of non-senable types with different isolation, only using async initializer, with initialization of the different variables being suspension point, and initializing value must be constructed out of sendable values.

werner77 · November 17, 2022, 9:23am

Just to add to this discussion, I created an Objective C base class with manual ARC to accomplish the dealloc/deinit to always run on the main thread in the following manner:

@implementation MainThreadConfinedObject {
    volatile BOOL _deallocating;
    volatile BOOL _retained;
}

- (id)init {
    if ((self = [super init])) {
        // retain one additional time, this will be released when the retainCount == 1
        [self retain];
        _retained = YES;
    }
    return self;
}

- (oneway void)release {
    NSUInteger retainCount = [self retainCount];
    [super release];
    retainCount--;
    if (retainCount == 1 && _retained && !_deallocating) {
        _deallocating = YES;
        if ([NSThread isMainThread]) {
            [super release];
        } else {
            // Create a weak reference to avoid retain being triggered
            __typeof(self) __weak weakSelf = self;
            dispatch_async(dispatch_get_main_queue(), ^{
                [weakSelf release];
            });
        }
    }
}

- (id)retain {
    if (_deallocating) {
        // Ensure that the retain count can not go up after this anymore,
        // the object is now going to be deallocated
        return self;
    } else {
        return [super retain];
    }
}

@end

jaredsinclair · March 11, 2024, 2:15pm

This proposal, since it was first posted in June 2022, has become profoundly relevant, especially in the spring of 2024 when teams around the world are actively refactoring and revising Swift code in preparation for strict concurrency checking. I don't feel qualified to speak to the particulars of this proposal's merits, but I am more than qualified, having gone through enough refactors and revisions already, to point out specific real-world challenges that this proposal, or one like it, seems likely to solve.

For example, any application that uses CoreData's NSFetchedResultsController to configure and update a user interface almost certainly has a contact point, either directly or via an intermediary object, between an NSFetchedResultsController's delegate property and an object that has access to the user interface. NSFetchedResultsController is neither Sendable nor bound to any actor, whereas any object that interacts with user interface code (f.ex. a UIKit view controller) is almost certainly bound, wholly or in part, to the Main Actor. This mismatch in isolation has several inherent challenges, but among them is one that could be improved by this proposal. Good citizenship, when implementing the NSFetchedResultsControllerDelegate protocol, requires setting the delegate property of the controller to nil inside the deinit body of the delegate (for reasons inherent to CoreData implementation details that predate Swift and are outside the scope of my comment here):

@MainActor class SomeUIObject: NSObject, NSFetchedResultsControllerDelegate {
    let controller: NSFetchedResultsController<MyModel>

    deinit {
        controller.delegate = nil
    }
}

That deinit body will not compile without warnings under strict concurrency checks because NSFRC isn't Sendable. If I understand this proposal correctly, such a statement could be made, with minor adjustments, to happily compile and run on the Main Actor if this proposal were accepted.

I'm sure many other examples could be gathered besides this one, but I think this one is particularly relevant because of the ubiquity of CoreData and the long-standing need to nullify the delegate property during deinitialization (a best practice that is older than Swift itself).

What I hope to achieve with my comment here is not to provoke a discussion about the merits of CoreData in 2024, but rather, to shake the dust off of this proposal and renew interest in it, as I imagine it could be a godsend to teams like mine that have been resorting to a combination of locking mechanisms, generics, and @unchecked Sendable to skirt around compiler warnings in the interim.

inamiy · June 11, 2024, 3:44pm

Xcode 16 Beta 1 does not still solve this issue yet. Will this be complete by Swift 6 official release?

Just to give another idea, I wonder if deinit isolation can be checked automatically by the compiler so that we don't need to worry about writing isolated deinit or nonisolated deinit explicitly, i.e.:

// 1. no explict deinit = nonisolated deinit

// 2. deinit without accessing `self` = nonisolated deinit
deinit {
    print("deinit in \(Thread.current)") // nonisolated
}

// 3. deinit that contains `self` access = isolated deinit
deinit {
    self.cleanUp()  // isolated
    
    print("deinit in \(Thread.current)") // isolated
}

By having this context sensitive isolation rule, inserting debug-print solely won't affect the overall deinit behavior, so +1.

John_McCall · June 11, 2024, 4:56pm

No, this feature will not make it for Swift 6.0.

geoffreygaren · June 13, 2024, 10:32pm

I want to emphasize jaredsinclair's point about the importance of this issue. All WebKit API objects must be deallocated on the main thread. The same is true for many other objects in Apple's SDKs. The alternative is memory corruption.

Objective-C offers a solution for this issue. It would be great if Swift offered one too.

To the extent that a general purpose solution may be difficult, please consider a simpler solution tailored to @MainActor classes, since those are the most common classes that have this issue.

mattie · July 24, 2024, 11:34pm

I’m also greatly anticipating a solution to this problem. As far I as I understand, Objective-C actually does not have a mechanism to ensure this. Am I wrong?

Jumhyn · July 24, 2024, 11:57pm

This proposal is currently under review here.

Certain Objective-C classes accomplish this today by (IIUC) overriding release to ensure that when the last strong reference goes away the dealloc call gets dispatched appropriately. And the proposal under review allows Objective-C classes to annotate their dealloc method appropriately so that Swift subclasses can observe the correct isolation.

geoffreygaren · October 29, 2024, 9:28pm

ObjC with reference counting offers:

Manually override dealloc and dispatch a block to the main thread to call dealloc
Automatically opt in to _class_setCustomDeallocInitiation

ObjC with garbage collection offered objc_finalizeOnMainThread().