Swift ARC Weak/Unowned performance

For some time now in Swift, weak/unowned references have had significant overhead. I have managed to overcome most of this by using structs, or by using unowned(unsafe) in cases where I need references.

As the documentation states:

Swift also provides unsafe unowned references for cases where you need to disable runtime safety checks—for example, for performance reasons. As with all unsafe operations, you take on the responsibility for checking that code for safety.

However, I recently returned to work on some of my projects and noticed that unowned(unsafe) also has significant overhead. Has anyone else been dealing with these issues in performance-critical code? If so, how have you managed to get around them? I have tried using the Unmanaged type but have not had great success with it. This Stack Overflow link discusses it further: link

That sounds like a bug. unowned(unsafe) is intended to be equivalent to Unmanaged, i.e. literally no refcounting occurs. Can you file a bug report?

2 Likes

Kinda new here: should I submit at bugs.swift, through Apple Radar, or both?

Yes, my original train of thought was that unowned(unsafe) was supposed to be a simple pointer with no ref counting. So originally I had been using it everywhere in my performance-critical code as an optimization, even in cases where there is no retain cycle (although I'm sure the compiler could probably have optimized away some of its usage). But as of today, I have literally just done a find-and-replace to remove all usages of it because of its overhead. This is true in both the Xcode 12 beta and Xcode 11.5.

And just to verify, I downloaded Xcode 11.1 and ran the same code, and the performance was good again. So at some point after Xcode 11.1, unowned(unsafe) gained a lot of overhead (to the point where it's not worth using in performance-critical code).

1 Like

I managed to reproduce it too, using your Stack Overflow test case. Since this is something that doesn't have to do with any Apple APIs, https://bugs.swift.org is the right place to start; if it turns out to be Apple-specific, the Apple folks will "clone" it to an internal bug.

1 Like

FYI — [SR-13074] unowned(unsafe) performance bug · Issue #55520 · apple/swift · GitHub

2 Likes

I've been looking more into the Unmanaged type as a replacement for unowned(unsafe). Are these two implementations roughly equivalent? I ran both implementations across the 60+ tests in my physics engine. The Unmanaged implementation passes all performance tests easily, whereas the unowned(unsafe) implementation fails all of them (but passes them all in the earlier version of Swift/Xcode, as discussed). I'm just not 100% confident I'm using Unmanaged correctly.

The original unowned(unsafe) implementation:

final class A {
    let b: B
    init(b: B) {
        self.b = b
        b.a = self
    }
    deinit {
        b.a = nil
    }
}

final class B {
    unowned(unsafe) var a: A!
}

The new Unmanaged implementation:

final class A {
    let b: B
    init(b: B) {
        self.b = b
        b._a = Unmanaged<A>.passUnretained(self)
    }
    deinit {
        b._a = nil
    }
}

final class B {
    var _a: Unmanaged<A>!
    var a: A {
        return _a!.takeUnretainedValue()
    }
}

Unmanaged is literally implemented in terms of unowned(unsafe), so the difference here is actually a subtle one: in one case the Optional is "inside" the unowned(unsafe) and in the other it's the other way around. Support for Optional-typed unowned references was added to Swift fairly late, so it's not entirely surprising that it has bugs that you don't see with Unmanaged.
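The layering difference can be made concrete with a small sketch (the types here are illustrative, not from the project above):

```swift
final class A {}

// Optional "inside" the unowned(unsafe): the var is an unowned reference
// whose payload type is Optional<A>.
final class ViaUnowned {
    unowned(unsafe) var a: A?
}

// Optional "outside": the var is Optional<Unmanaged<A>>, and Unmanaged
// itself stores its reference unowned(unsafe).
final class ViaUnmanaged {
    var a: Unmanaged<A>?
}

let obj = A()
let u1 = ViaUnowned()
u1.a = obj
let u2 = ViaUnmanaged()
u2.a = .passUnretained(obj)
print(u1.a === obj, u2.a!.takeUnretainedValue() === obj)  // prints "true true"
```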

2 Likes

This is not the most performant way to do this; in fact it will emit retains that cannot be eliminated by the optimizer. Today one cannot do what one wants here without using standard library SPI that has not gone through evolution. Specifically:

final class A {
    let b: B
    init(b: B) {
        self.b = b
        b._a = Unmanaged<A>.passUnretained(self)
    }
    deinit {
        b._a = nil
    }
}

final class B {
    var _a: Unmanaged<A>!
    var a: A {
        return _a!._withUnsafeGuaranteedRef { $0 }
    }
}

The retain inserted by _withUnsafeGuaranteedRef in the newest Swift will be a normal retain that the optimizer can eliminate.

Why would returning from _withUnsafeGuaranteedRef be any more optimizable than loading directly with takeUnretainedValue? In both cases the object is known to be valid (externally owned) at the time of the call, and no longer known to be valid once returned.

Let me give you the full picture (just being complete): the compiler views loading from an unowned(unsafe) value as a conversion from something that is unmanaged and outside of the compiler's system to something that it controls. At any such point, the compiler needs to take ownership of the value. It is also assumed that the user is an expert user, since they are managing their own memory. With that in mind, the compiler inserts at that conversion point a retain that cannot be removed by the optimizer, since in a sense the value is entering the ARC system's control.

Now, that being said, we recently noticed this problem, so I added a Builtin that unsafely converts the inner value to a guaranteed value and reimplemented _withUnsafeGuaranteedRef on top of it. The idea is that the user is asserting to the compiler: "Compiler, this value is something that is already under your control. You do not need to take ownership of it. So convert it unsafely for me to guaranteed." Once we have the guaranteed value, the optimizable retain is the one on the value returned from the closure.

The only thing to be aware of is that even though Unmanaged's method is transparent, I have seen some cases where the closure isn't inlined, since the closure isn't transparent itself. We may want to look into having some form of transparent that implies that all closure parameters are transparent as well? Not sure.

1 Like

I guess I'll rephrase that: there should be no difference between the two, because there are no circumstances where one is safe and the other isn't. What happens if you reimplement takeUnretainedValue in terms of _withUnsafeGuaranteedRef?
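For illustration, such a reimplementation might look like this (a sketch only: `takeUnretainedValueViaGuaranteedRef` is a made-up name, and `_withUnsafeGuaranteedRef` is underscored stdlib SPI that has not gone through evolution):

```swift
final class C {}

extension Unmanaged {
    // Hypothetical: takeUnretainedValue expressed via _withUnsafeGuaranteedRef.
    // Returning $0 out of the closure forces a retain at the return, which
    // (per the discussion above) would be a normal, optimizer-removable retain.
    func takeUnretainedValueViaGuaranteedRef() -> Instance {
        return _withUnsafeGuaranteedRef { $0 }
    }
}

let c = C()
let u = Unmanaged.passUnretained(c)
print(u.takeUnretainedValueViaGuaranteedRef() === c)  // prints "true"
```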

Hmm, you are right: it's a performance bug related to unowned(unsafe) with an Optional type. In fact, if I replace my usage of Optional unowned(unsafe) with a non-Optional by just using a dummy value, there is no performance bug. So I suppose I can now either use this approach or the Unmanaged approach until Optional unowned(unsafe) becomes performant again.
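A sketch of the dummy-value workaround, with illustrative names rather than the project's actual types:

```swift
final class Owner {
    var name = "dummy"
}

final class Child {
    // Non-Optional unowned(unsafe), initialized with a shared dummy object
    // rather than being Optional.
    unowned(unsafe) var owner: Owner

    init() { owner = Child.dummy }

    // One placeholder instance, kept alive for the program's lifetime so the
    // unowned(unsafe) reference never dangles before the real owner is set.
    static let dummy = Owner()
}

let child = Child()
let real = Owner()
real.name = "real"
child.owner = real
print(child.owner.name)  // prints "real"
```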

Perhaps I spoke too soon. There is definitely something else going on with non-optional unowned(unsafe) variables in terms of performance too. While it is true that the change does fix the simple benchmark linked above, I just re-enabled unowned(unsafe) throughout my project on many non-optional variables and I still see a significant performance loss across multiple performance tests (roughly 1.3x slower on average).

Previously, I would use unowned(unsafe) very liberally if I knew it was safe as an optimization, and in previous versions of Xcode/Swift this works fine. But for now I'm simply avoiding unowned(unsafe) entirely unless I absolutely need it, and in those cases, I will be sure to use non-optionals. I can try to build new benchmarks that help capture this problem when I get the chance.

Okay, so here is another brief benchmark showing the performance bug using a non-Optional unowned(unsafe).

Targeting macOS 10.15 in release mode.
Run in Xcode 11.1 prints: 0.0153731107711792
Run in Xcode 11.5 prints: 0.20703494548797607
Run in Xcode 12 beta prints: 0.21625101566314697

So this code runs roughly 13x-14x slower in newer versions of Xcode/Swift.

import Foundation
final class A {
    var i = 0
}

func run() -> Int {
    var arr: [A] = []
    var tot = 0
    for _ in 0 ..< 10000 {
        arr.append(A())
    }
    
    let startTime = CFAbsoluteTimeGetCurrent()
    for _ in 0 ..< 1000 {
        for i in 0 ..< arr.count {
            // unowned(unsafe) binding: ideally a plain pointer load with no ARC traffic
            unowned(unsafe) let a = arr[i]
            a.i += i
            tot += a.i
        }
    }
    let time = CFAbsoluteTimeGetCurrent() - startTime
    print(time)
    
    return tot
}

let r = run()

Yeah, that is going to be very inefficient. I think you are continually converting back and forth between unmanaged and strong ownership.

The issue is one of API intent. Specifically:

  1. It is assumed that Unmanaged's users are "expert users" whose intent the IR/optimizers must respect since something is going on that is not represented in the IR itself.
  2. If that user wants a retain at a certain place, the compiler should respect that expert user.

So yes, I could implement takeUnretainedValue on top of _withUnsafeGuaranteedRef, but that would allow the optimizer to potentially violate the intention of the expert user. This fits with Unmanaged being used for CF bridging and in other places where the expert user wants that control because they are working across language boundaries.

I agree with this, but that's the purpose of Unmanaged.retain(), not Unmanaged.takeUnretainedValue(). takeUnretainedValue means "bring this into management by ARC"; the retain is an implementation detail. A lot has been said about the names of the Unmanaged methods being unclear, but the intent was "take [unretained value]", i.e. "take a value that is being kept alive by a reference other than this one". That's the same semantic meaning as _withUnsafeGuaranteedRef.
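To illustrate the distinction being drawn here, a minimal sketch:

```swift
final class Resource {}

let obj = Resource()
let u = Unmanaged.passUnretained(obj)

// Explicit, user-directed refcount traffic: these are the operations whose
// placement the compiler must respect for the "expert user".
u.retain()
u.release()

// "Take a value that is being kept alive by another reference": the retain
// inside takeUnretainedValue is an implementation detail that ARC balances
// automatically when `strong` goes away.
let strong = u.takeUnretainedValue()
print(strong === obj)  // prints "true"
```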

2 Likes

That's interesting -- to me, the retain is an essential part of takeUnretainedValue.

When I call takeUnretainedValue(), I promise the compiler that there is still an outstanding reference to the object somewhere -- it could be dissolved into the murky depths of Unmanaged, it could be a proper strong reference somewhere, who knows -- all I promise is that it exists, so that the reference is still valid. (Otherwise the code has undefined behavior.)

But this promise is only valid for the duration of the call to takeUnretainedValue -- I'm not promising to ensure that the other reference will remain valid for the lifetime of the return value. So I expect the compiler to retain the object before takeUnretainedValue returns, to ensure it won't get prematurely deallocated.

I also expect that the returned reference will be considered non-unique by e.g. isKnownUniquelyReferenced, because unless I destroy the "other" reference(s), there are now obviously at least two outstanding references to the object.
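That expectation can be checked directly (a minimal sketch):

```swift
final class Box {}

var other = Box()                         // the outstanding "other" reference
let u = Unmanaged.passUnretained(other)
var taken = u.takeUnretainedValue()       // retains: two references now exist
print(isKnownUniquelyReferenced(&taken))  // prints "false"
```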

Now, a sufficiently smart compiler could perhaps prove sometimes that

  1. that vague "other" reference won't actually go away while my code is working with the returned strong reference and
  2. that I'm not checking for uniqueness.

If both of these are true, then this theoretical compiler could optimize away the retain/release pair. I am not a compiler engineer, but it seems to me that (1) would be difficult to prove in practice -- the source of that other reference often isn't obvious to me, much less the compiler. (Is the lack of release operations in the executed code enough? I'm honestly not sure. What if someone calls Unmanaged.release() on another thread (with suitable synchronization)?)

The advantage of _withUnsafeGuaranteedRef is that it comes with an explicit promise from me, the user, to keep the "other" reference alive while its closure is executing. It also gives permission to the compiler to borrow the same preexisting reference rather than creating a new one -- so that the compiler can easily guarantee that there won't be any ARC traffic, and I won't get surprised if the resulting reference looks unique.
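In usage-sketch form, keeping the work inside the closure rather than returning the reference (again, `_withUnsafeGuaranteedRef` is underscored SPI, and the types here are illustrative):

```swift
final class Counter {
    var n = 0
    func bump() -> Int { n += 1; return n }
}

let ownerRef = Counter()                 // the reference I promise stays alive
let u = Unmanaged.passUnretained(ownerRef)

// While the closure runs, `ownerRef` keeps the object alive, so the compiler
// may borrow that existing reference instead of emitting new retain/release.
let value = u._withUnsafeGuaranteedRef { $0.bump() }
print(value)  // prints "1"
```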

This is an extremely useful operation -- we should find a good name for it and make it public.

1 Like

Great explanation and argument for _withUnsafeGuaranteedRef. But the issue here is that either the following needs to be illegal, or it needs to semantically retain the returned $0:

return _a!._withUnsafeGuaranteedRef { $0 }

1 Like

Oh, is that not the case? I'd (perhaps naively) expect that returning the borrowed reference would retain the object.