Unexpected ObjC bridging symbols emitted for pure-Swift final classes in Array/Set

Hello everyone,

In a large iOS codebase, I recently investigated a binary size regression caused by replacing an immutable struct backed by private reference storage with an equivalent immutable final class. I was able to isolate the increase to Objective-C bridging paths emitted for standard library collections containing the class type. The module is pure Swift and these values are not intentionally bridged to Objective-C.

Minimal example

I compared two immutable, read-only data models:

struct Foo: Hashable, Equatable {
    private final class Storage: Hashable, Equatable {
        let key: String
        let value: String
        // init, hash(into:), == ...
    }
    private var storage: Storage
    // accessors forwarding to storage...
}

and

final class Foo: Hashable, Equatable {
    let key: String
    let value: String
    // init, hash(into:), == ...
}

The final class version produced additional binary size from Objective-C bridging-related symbols.

The main examples I found were:

  1. For Array, around +4,120 bytes per case coming from: _ArrayBuffer._getElementSlowPath(Int) -> AnyObject

  2. For Set, around +10,237 bytes per case, driven by Set specializations including: NativeSet.init(:__cocoaSet:capacity:)

My understanding is that Swift classes implicitly conform to AnyObject, and Array / Set are bridgeable to NSArray / NSSet. As a result, the compiler emits bridging fallback paths for each distinct class type used in these collections, even when the code is otherwise pure Swift.

Using ContiguousArray avoids part of this tax, but replacing [T] with ContiguousArray across a large codebase is not realistic for us due to API boundary friction, inference issues, and Apple framework APIs that require standard arrays.
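
For concreteness, a minimal sketch of what I mean (this assumes the memberwise-style init elided above on the class version of Foo, and a Darwin build; Foundation is imported only to make the NSArray conversion visible):

import Foundation

let items = [Foo(key: "k", value: "v")]

// Because Foo is a class, [Foo] is bridgeable to NSArray, so the compiler has
// to keep the bridged-storage fallback paths for this element type around.
_ = items as NSArray

// ContiguousArray is documented as never bridging to NSArray, which is why it
// avoids part of the tax, but it can't cross API boundaries that expect [Foo].
let contiguous = ContiguousArray(items)
_ = contiguous
// _ = contiguous as NSArray // does not compile: ContiguousArray has no NSArray conversion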

Questions

  1. Is this expected behavior for pure-Swift final class types used inside standard collections?
  2. Is there any existing attribute, compiler flag, or design pattern that allows a class to opt out of Objective-C bridging / AnyObject legacy behavior when it is known to be Swift-only?
  3. Is this theoretically dead-strippable by the optimizer or linker when no Foundation bridging entry points are reachable from the module?
  4. Are there known workarounds other than avoiding classes or replacing standard collections with ContiguousArray?

Thanks - I would appreciate any context from the compiler/runtime side.


I don't think there's a way to opt a class out of conforming to AnyObject.
I'd just go with a struct.

immutable Copy-on-Write struct

How is it possible to write to a struct that's immutable? :thinking:

Fair point. “Copy-on-Write” was probably imprecise wording here. The actual model is a value-semantic struct wrapper around private immutable reference storage. In the reduced example there is no mutation API, so there is no write path and therefore no actual copy-on-write operation.

The relevant distinction for the binary size issue was not CoW itself, but:

  1. struct Foo wrapping private class storage, used in [Foo] / Set<Foo>
  2. final class Foo used directly in [Foo] / Set<Foo>

I suspect the backing class storage is unrelated and the issue can be simplified to a plain struct vs. class comparison. However, I observed the described behavior by directly comparing this specific struct variant against the class version.

But then (given that value semantics matter, they're just not shown), the "naked" class version (with writable fields, not shown here for simplicity) won't have value semantics... or am I missing something?

The issue I'm raising is strictly about compiler code generation and binary size.

The core problem is that putting a struct into an Array behaves normally, but putting a pure-Swift final class into an Array forces the compiler to emit Objective-C bridging fallback paths (like _ArrayBuffer._getElementSlowPath(Int) -> AnyObject).

I admit I was asking those questions to better understand your particular use case so I could suggest a better alternative (I suspected this was an example of an XY problem, but seemingly it's not).

As for the "why" I suspect this is not about "Obj-C bridging" per se, but more about NSArray bridging – (you could use NSArray in a pure Swift app) – when Swift is dealing with an array of reference types it is, like, "hurray, I would be able to bridge that back and forth with NSArray in O(1), let's do that!". And that's for any swift classes, they don't have to be NSObject or @objc, etc – the "users" of that NSArray on the other side of the bridge could be pure Swift, or equally, that NSArray could be used as a currency type and bridged back to Swift for pure Swift consumption.

import Foundation // NSArray lives in Foundation, even in otherwise pure Swift code

class C {}

struct S {
    var c = C()
}

func foo(_ items: NSArray) {}
func bar() -> NSArray { NSArray() }

// Swift Array -> NSArray bridging
foo([C()])
foo([S()])
foo([C(), S()])

// NSArray -> Swift Array bridging
_ = bar() as! [C]
_ = bar() as! [S]

Thanks, I understand why the bridge exists semantically.

This is somewhat of an XY problem from the perspective of app binary size optimization: the direct issue I am trying to solve is the repeated binary size cost of these bridging fallback paths in code where Foundation bridging is known to be unused.

Is there any way to express that a particular class type, collection use site, or module is Swift-only and should not emit NSArray / NSSet bridging support?

More specifically:

  1. Is this support emitted unconditionally for Swift class types used in Array / Set?
  2. Is there any supported opt-out, attribute, visibility trick, or compilation mode that prevents this code from being emitted?
  3. Could the compiler or optimizer theoretically prove that these bridging paths are unreachable and avoid emitting them in such cases?
  4. If this is not currently supported but considered feasible, would a PR in this area be welcome?
  5. If none of the above is realistic today, is ContiguousArray the only practical standard-library workaround for Array, with no equivalent option for Set?

Do you have a standalone reproducer that exhibits this behavior which you can share? I've been unable to see the symbols you mentioned when building a single file containing the class from your original example.

Interesting. Let me prepare one and get back to you.

Here's one: https://github.com/sewerynplazuk/collections-swift-ns-bridge-repro

Thanks, that was helpful. Using your project as inspiration, the following reduction seems like it demonstrates the basic issue:

public final class CItem {
    public let id: Int
    public init(id: Int) { self.id = id }
}

public func classArrayEndIdx(_ a: [CItem]) -> Int {
    a.endIndex
}

When compiled for Darwin with optimizations enabled, various Array methods and properties get inlined from the stdlib into call sites. In particular, in this example, after the "PerfInliner (inline)" optimization pass we end up with the bridging logic for Array inlined into the classArrayEndIdx() function. I don't entirely follow all of the involved logic, but it appears that if the array's element type is a class, even a "Swift-only" one, the compiler inlines code to handle the possibility that the backing storage may need to support bridging:

// From the optimized SIL for the function:

// classArrayEndIdx(_:)
// Isolation: unspecified
sil @$s4main16classArrayEndIdxySiSayAA5CItemCGF : $@convention(thin) (@guaranteed Array<CItem>) -> Int {
[%0: read v**.c*.v**, write v**.c*.v**, copy v**.c*.v**]
[global: read,write,copy,allocate,deinit_barrier]
// %0 "a"                                         // users: %2, %1
bb0(%0 : $Array<CItem>):
  debug_value %0, let, name "a", argno 1          // id: %1
  %2 = struct_extract %0, #Array._buffer          // user: %3
  %3 = struct_extract %2, #_ArrayBuffer._storage  // user: %4
  %4 = struct_extract %3, #_BridgeStorage.rawValue // users: %17, %12, %5
  // 👇 Based on this runtime check we'll end up hitting the 
  // contiguous storage or bridged storage paths
  %5 = classify_bridge_object %4                  // users: %7, %6
  %6 = tuple_extract %5, 0                        // user: %8
  %7 = tuple_extract %5, 1                        // user: %8
  %8 = builtin "or_Int1"(%6, %7) : $Builtin.Int1  // user: %10
  %9 = integer_literal $Builtin.Int1, 0           // user: %10
  %10 = builtin "int_expect_Int1"(%8, %9) : $Builtin.Int1 // user: %11
  cond_br %10, bb1, bb2                           // id: %11

However, for Swift-only value types (structs, enums, etc.), the emitted bridging logic is different, and the backing array buffers appear to be statically known not to need bridging (I'm not sure exactly why).
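
For comparison, here's a sketch of a struct-element counterpart (SItem and structArrayEndIdx are hypothetical names mirroring the class reduction above) that can be diffed against the SIL shown earlier:

// Same shape as CItem/classArrayEndIdx, but with a struct element.
// Comparing swiftc -O -emit-sil output for the two functions is one way
// to see the difference in the emitted bridging logic.
public struct SItem {
    public let id: Int
    public init(id: Int) { self.id = id }
}

public func structArrayEndIdx(_ a: [SItem]) -> Int {
    a.endIndex
}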

Given that, there are a couple things I've found that seem to influence the presence of the bridging symbols in your sample project.

The first is to suppress the inlining of the stdlib's collection internals by opting individual functions out of optimization with the (unofficial!) @_optimize(none) annotation, as sketched below. In your repro I found that using this on the classSetInsert() function reduced the number of bridging symbols scanned for by the inspection script from 7 down to 1. This might not be a generally wise thing to do, since fully opting a function out of the optimization passes probably makes its codegen worse, but it's something you could try out.
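
A sketch of what that could look like (the real classSetInsert in the repro may differ; the Hashable conformance below is only assumed so the snippet stands alone):

// Set requires Hashable; the repro's class presumably conforms already.
// This retroactive conformance only keeps the sketch self-contained.
extension CItem: Hashable {
    public static func == (lhs: CItem, rhs: CItem) -> Bool { lhs.id == rhs.id }
    public func hash(into hasher: inout Hasher) { hasher.combine(id) }
}

// @_optimize(none) is an underscored, unsupported attribute: it exempts this
// one function from the optimizer, so the stdlib's Set internals (and their
// bridging fallback paths) don't get inlined into it.
@_optimize(none)
public func classSetInsert(_ s: inout Set<CItem>, _ x: CItem) {
    s.insert(x)
}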

The second workaround requires a change in type signatures, but it seemed to prevent the storage symbols your script checks for from surfacing in the target binary. The approach is to make the class-taking functions generic:

// `Item` is the protocol from the repro; something like
// `protocol Item: Hashable { var id: Int { get } }` is assumed here.
public func classArrayMap(_ a: [some Item]) -> [Int] {
    a.map { $0.id }
}

public func classArraySubscript(_ a: [some Item], _ i: Int) -> Item {
    a[i]
}

public func classSetInsert<I: Item>(_ s: inout Set<I>, _ x: I) {
    s.insert(x)
}

I'm not 100% sure, but I think this prevents specialization and thereby keeps the Array/Set storage internals from getting inlined into these function bodies. I think the bridging checks still happen at runtime, just within the stdlib's own copies of the relevant collection storage functions. It's possible that the optimizer might eventually become "smarter", realize that the generic constraints here are effectively same-type constraints (since the class is final), and decide to specialize things anyway, so I'm not certain this is entirely future-proof (or even really intended).
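
For what it's worth, call sites shouldn't need to change shape, since the concrete class still satisfies the generic parameter; a hypothetical example, assuming CItem conforms to the repro's Item protocol:

// Call sites keep passing the concrete class; only the function signatures changed.
let ids = classArrayMap([CItem(id: 1), CItem(id: 2)])   // [1, 2]
let first = classArraySubscript([CItem(id: 7)], 0)      // an Item existential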

Overall I'm not sure if there's a great solution to this issue. It seems like what you want to be able to do is limit some specialization and inlining decisions from stuff within the stdlib, but there aren't really granular tools to do that currently AFAICT. Maybe some folks more familiar with the optimizer have ideas.
