Efficiently Reaping Weak Collections

Context:

I have a ModelController that keeps a collection of weak references to ModelObjects. The controller receives changes from the cloud and, for each one, sees if the change affects a ModelObject in the collection. If so, it updates that object’s properties.

The collection is weak because if nothing in the app other than the ModelController is using a given ModelObject, we obviously don’t care about it and can let it go.

Code:

This is the basic (abbreviated) setup:

class ModelController
{
    // ModelObjects each have a UUID, so we key by that:
    private var cache: [UUID: WeakBox] = [:]
}

class WeakBox
{
    weak var wrappedValue: ModelObject?
}

Question:

When wrappedValue becomes nil inside WeakBox, what is the best strategy for removing that WeakBox instance from the cache dictionary so it doesn’t linger?

Property observers don’t fire when ARC nils a reference, so WeakBox can’t call back up to ModelController on didSet.
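A minimal sketch of that limitation, assuming a stand-in ModelObject and a hypothetical didSetCount counter that is not part of the real setup. It attaches an observer to the weak property, drops the last strong reference, and compares the counter before and after:

```swift
// Stand-in for the real model type.
final class ModelObject {}

final class WeakBox {
    // Hypothetical counter, added only to make observer calls visible.
    private(set) var didSetCount = 0

    weak var wrappedValue: ModelObject? {
        didSet { didSetCount += 1 }
    }

    init(_ object: ModelObject) {
        wrappedValue = object
    }
}

var object: ModelObject? = ModelObject()
let box = WeakBox(object!)
let countBeforeRelease = box.didSetCount

// Drop the last strong reference; the weak reference now reads as nil.
object = nil
let observerFired = box.didSetCount != countBeforeRelease
```

Here observerFired stays false even though box.wrappedValue reads as nil afterwards, which is why the box has no hook for notifying the controller.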

I know discussions about weak collections go back to the dawn of Swift. In this case, the cache could hold low millions of entries, so cleaning up dead ones matters. For the same reason, I can't walk the dictionary and test each entry's wrappedValue for nil; at that point we've essentially reinvented garbage collection.

Anyone have a Tevanianly clever way to do this?

What if you only check if it’s nil once a change arrives from the cloud for that object and remove the WeakBox then? I would also provide a method to check every box so you can choose the most opportune point to do a full check once in a while in case no updates come from the cloud for a long time.
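A rough sketch of that idea, assuming a simplified ModelObject with a single name property. The method names register, applyChange, and reapAll, and the cachedEntryCount property, are made up for illustration and are not from the original code:

```swift
import Foundation

final class ModelObject {
    let id = UUID()
    var name: String
    init(name: String) { self.name = name }
}

final class WeakBox {
    weak var wrappedValue: ModelObject?
    init(_ value: ModelObject) { wrappedValue = value }
}

final class ModelController {
    private var cache: [UUID: WeakBox] = [:]

    // Hypothetical helper so the effect of reaping can be observed.
    var cachedEntryCount: Int { cache.count }

    func register(_ object: ModelObject) {
        cache[object.id] = WeakBox(object)
    }

    /// Applies a cloud change to a live object. If the box for this ID is
    /// already empty, the entry is removed instead. Returns true only when
    /// a live object was updated.
    @discardableResult
    func applyChange(to id: UUID, newName: String) -> Bool {
        guard let box = cache[id] else { return false }
        guard let object = box.wrappedValue else {
            cache[id] = nil   // lazy reap: the object is gone, drop its box
            return false
        }
        object.name = newName
        return true
    }

    /// Full O(n) sweep, meant to be called at a moment of the app's choosing.
    func reapAll() {
        cache = cache.filter { $0.value.wrappedValue != nil }
    }
}
```

The lazy path costs nothing beyond the lookup the controller already performs for each change; only reapAll touches every entry.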

Not that there are no answers to this, but the fact that it doesn’t have an easy answer is why Swift doesn’t have a set of built-in weak collection types already. :-(

Yeah. Having property observers fire when ARC nils the reference would be the cleanest solution, but I gather that isn't done because the nilling doesn't happen eagerly; it happens on the next load of the value.

I was hoping there’s something dangerous but clever I could do. These objects drive the UI, so everything is bound to the main thread and taking a garbage collection break over many, many entries in the collection is sure to cause beachballs.

I suppose I could do it when the app’s occlusion state changes to occluded (it’s a Mac app) and just hope it completes before the user brings the app back on screen.

Do you really want to let it go immediately once the last reference is dropped, or keep it alive for a while in case it's needed soon after?

You can implement an "intrusive" weak dictionary where the keys or values are weak references by having each value store its own key, together with a strong reference to a queue. The value's deinit then adds the key to the queue before a value is deallocated. An insertion into the dictionary first processes the queue and removes empty keys.
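One way this could look, as a sketch only. ReapQueue, makeObject, reap, and cachedEntryCount are hypothetical names, and the NSLock is an assumption made because deinit can run on whichever thread drops the last reference:

```swift
import Foundation

/// Collects the keys of objects that have been deallocated.
final class ReapQueue: @unchecked Sendable {
    private let lock = NSLock()
    private var deadKeys: [UUID] = []

    func enqueue(_ key: UUID) {
        lock.lock(); defer { lock.unlock() }
        deadKeys.append(key)
    }

    /// Returns all pending keys and empties the queue.
    func drain() -> [UUID] {
        lock.lock(); defer { lock.unlock() }
        let keys = deadKeys
        deadKeys.removeAll()
        return keys
    }
}

final class ModelObject {
    let id = UUID()                      // the value stores its own key
    private let reapQueue: ReapQueue     // strong reference to the queue
    init(reapQueue: ReapQueue) { self.reapQueue = reapQueue }
    deinit { reapQueue.enqueue(id) }
}

final class WeakBox {
    weak var wrappedValue: ModelObject?
    init(_ value: ModelObject) { wrappedValue = value }
}

final class ModelController {
    private let reapQueue = ReapQueue()
    private var cache: [UUID: WeakBox] = [:]

    var cachedEntryCount: Int { cache.count }

    /// Creates an object and inserts it; insertion first processes the queue.
    func makeObject() -> ModelObject {
        let object = ModelObject(reapQueue: reapQueue)
        reap()
        cache[object.id] = WeakBox(object)
        return object
    }

    /// Removes entries for queued keys, skipping any key whose box has
    /// been refilled with a live object in the meantime.
    func reap() {
        for key in reapQueue.drain() where cache[key]?.wrappedValue == nil {
            cache[key] = nil
        }
    }
}
```

Cleanup work is proportional to the number of objects that actually died, not to the size of the cache, and it runs on the controller's side rather than inside deinit.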

EDIT: Now that I think about it, that's probably overkill. Since open addressing usually requires a periodic O(n) rehash to clean up tombstones from deleted keys anyway, if you implement the hashtable yourself, you can just treat a nil weak reference as a tombstone entry for the purposes of lookup and insertion, without imposing any requirements on the weak-referenced key or value stored within.
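A compact sketch of the tombstone approach, assuming linear probing and a 3/4 load-factor threshold. WeakValueTable and its members are hypothetical names, and removal, iteration, and thread safety are left out:

```swift
/// Open-addressing table whose values are weak. A slot whose object has
/// died is treated as a tombstone: lookups step over it, insertions may
/// reuse it, and the periodic rehash drops it.
struct WeakValueTable<Key: Hashable, Value: AnyObject> {
    private struct Slot {
        let key: Key
        weak var value: Value?
    }

    private var slots: [Slot?] = Array(repeating: nil, count: 8)
    private var usedSlots = 0   // live entries plus tombstones

    /// Occupied slots, including entries whose object is already gone.
    var occupiedSlotCount: Int { usedSlots }

    private func startIndex(for key: Key) -> Int {
        Int(UInt(bitPattern: key.hashValue) % UInt(slots.count))
    }

    subscript(key: Key) -> Value? {
        var index = startIndex(for: key)
        // Probing stops at the first truly empty slot.
        while let slot = slots[index] {
            if slot.key == key, let value = slot.value { return value }
            index = (index + 1) % slots.count
        }
        return nil
    }

    mutating func insert(_ value: Value, forKey key: Key) {
        // Keep at least a quarter of the slots empty so probing terminates.
        if (usedSlots + 1) * 4 > slots.count * 3 { rehash() }
        var index = startIndex(for: key)
        var reusable: Int? = nil
        while let slot = slots[index] {
            if slot.value == nil {
                if reusable == nil { reusable = index }   // first tombstone seen
            } else if slot.key == key {
                slots[index] = Slot(key: key, value: value)
                return
            }
            index = (index + 1) % slots.count
        }
        if let reusable = reusable {
            slots[reusable] = Slot(key: key, value: value)
        } else {
            slots[index] = Slot(key: key, value: value)
            usedSlots += 1
        }
    }

    /// The O(n) pass: only entries that are still alive are carried over.
    private mutating func rehash() {
        let live = slots.compactMap { slot -> (Key, Value)? in
            guard let slot = slot, let value = slot.value else { return nil }
            return (slot.key, value)
        }
        slots = Array(repeating: nil, count: max(8, live.count * 4))
        usedSlots = 0
        for (key, value) in live { insert(value, forKey: key) }
    }
}
```

Nothing is required of the stored objects here; dead entries simply linger until a later insertion reuses their slot or triggers a rehash.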

"The value's deinit then adds the key to the queue before a value is deallocated."

This should get the job done. Each ModelObject has a reference to the ModelController, so it can call from deinit and pass its ID so that ModelController can remove that entry from cache.

I was hoping for something more encapsulated because deinit plus Swift Concurrency is painful. (ModelController is not isolated to @MainActor; I can instantiate instances of it on background actors to do background processing such as bulk imports, so the picture is a little more complex than the example here.)
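A bare-bones sketch of that deinit callback. Guarding the cache with an NSLock is my assumption, made because the controller is not tied to one actor and deinit runs wherever the last reference is dropped. The names makeObject, objectWillDeinit, and cachedEntryCount are hypothetical:

```swift
import Foundation

final class WeakBox {
    weak var wrappedValue: ModelObject?
    init(_ value: ModelObject) { wrappedValue = value }
}

final class ModelController: @unchecked Sendable {
    private let lock = NSLock()
    private var cache: [UUID: WeakBox] = [:]

    var cachedEntryCount: Int {
        lock.lock(); defer { lock.unlock() }
        return cache.count
    }

    func makeObject() -> ModelObject {
        let object = ModelObject(controller: self)
        lock.lock(); defer { lock.unlock() }
        cache[object.id] = WeakBox(object)
        return object
    }

    /// Called from ModelObject.deinit, possibly on any thread.
    fileprivate func objectWillDeinit(id: UUID) {
        lock.lock(); defer { lock.unlock() }
        // Only drop the entry if its box really is empty.
        if cache[id]?.wrappedValue == nil { cache[id] = nil }
    }
}

final class ModelObject {
    let id = UUID()
    private let controller: ModelController   // each object knows its controller
    init(controller: ModelController) { self.controller = controller }
    deinit { controller.objectWillDeinit(id: id) }
}
```

One caveat with this shape: the controller must never release the last strong reference to a ModelObject while it holds the lock, because the resulting deinit would try to take the same non-recursive lock.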

If you control deinit, then anything is possible!

import Testing

@Test func cacheDropsDeallocatedObjects() {
  var modelObject: Optional = ModelObject()
  @Weak var weakModelObject = modelObject
  let controller = ModelController([weakModelObject!])
  #expect(controller.cache.count == 1)
  modelObject = nil
  #expect(weakModelObject == nil)
  #expect(controller.cache.isEmpty)
}

import typealias Foundation.UUID
import Combine

final class ModelObject: Deinitializing {
  let id = UUID()
  var deinitPublisher: some Publisher<ModelObject, Never> { _deinitPublisher }
  private let _deinitPublisher = PassthroughSubject<ModelObject, Never>()
  deinit { _deinitPublisher.send(self) }
}

final class ModelController {
  @Weak.Dictionary var cache: [UUID: Weak<ModelObject>]
  init(_ objects: some Sequence<ModelObject>) {
    _cache = .init(objects, key: \.id)
  }
}

import Combine

@propertyWrapper struct Weak<Object: Deinitializing & AnyObject> {
  init(wrappedValue: Object!) {
    self.wrappedValue = wrappedValue
    cancellable = wrappedValue.deinitPublisher.subscribe(deinitPublisher)
  }

  private(set) weak var wrappedValue: Object!
  private let deinitPublisher = PassthroughSubject<Object, Never>()
  private let cancellable: AnyCancellable
}

extension Weak {
  @propertyWrapper final class Collection<WrappedValue: Swift.Collection> {
    var wrappedValue: WrappedValue
    private var cancellable: AnyCancellable!

    @inlinable init(
      wrappedValue: WrappedValue,
      objects: some Sequence<Object>,
      deinit: @escaping (Collection, Object) -> Void
    ) {
      self.wrappedValue = wrappedValue
      cancellable = Publishers.MergeMany(objects.lazy.map(\.deinitPublisher))
        .sink { [unowned self] in `deinit`(self, $0) }
    }
  }

  typealias Dictionary<Key: Hashable> = Collection<[Key: Weak]>
}

extension Weak.Collection {
  @inlinable convenience init<Key>(
    _ objects: some Sequence<Object>,
    key keyForValue: @escaping (Object) -> Key
  ) where WrappedValue == [Key: Weak] {
    self.init(
      wrappedValue: .init(
        uniqueKeysWithValues: objects.lazy.map { (keyForValue($0), .init(wrappedValue: $0)) }
      ),
      objects: objects,
      deinit: { collection, object in
        collection.wrappedValue[keyForValue(object)] = nil
      }
    )
  }
}

import Combine

protocol Deinitializing<DeinitPublisher> {
  associatedtype DeinitPublisher: Publisher<Self, Never>
  var deinitPublisher: DeinitPublisher { get }
}

While it may be unnecessary in this case (cf. @Slava_Pestov's suggestion about amortizing cleanup by treating nil as a tombstone), it's possible to leverage Objective-C associated objects on Apple platforms to implement a general "observer" that triggers when an object is deinitialized. Something like:

import ObjectiveC

final class DeinitObserver: Sendable {
  private enum Key {}

  let block: @Sendable () -> Void
  private init(_ block: @escaping @Sendable () -> Void) {
    self.block = block
  }
  deinit { block() }

  static func observe(_ object: AnyObject, perform: @escaping @Sendable () -> Void) {
    let observer = DeinitObserver(perform)
    let key = UnsafeRawPointer(bitPattern: UInt(bitPattern: ObjectIdentifier(Key.self)))!
    objc_setAssociatedObject(object, key, observer, .OBJC_ASSOCIATION_RETAIN_NONATOMIC)
  }
}

// usage:
DeinitObserver.observe(myValue) {
  // perform cleanup on deinit
}

I'm not sure what performance impact this would have, though, so YMMV.
