Performance of `as?`

KeithBauerANZ · November 13, 2024, 12:39am

We have code that does a lot of `as?` casts across many threads. There are ways to avoid many of them, and we're working on that! But for some of them there's no clear solution, and the performance is very poor.

More concerningly, the performance of these casts varies by several orders of magnitude. Sometimes the same cast will take ~1µs, sometimes it will take ~10,000µs.

It's not always the first cast that's slow, either; sometimes we have a couple of identical casts that are all very slow.
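To make that variance visible outside of Instruments, each cast can be timed individually. The following is only a minimal sketch: `Probe` and `Sample` are hypothetical stand-ins for the real protocol and type, and `DispatchTime` is just one of several possible clocks.

```swift
import Dispatch

// Hypothetical protocol and conforming type, used only for this illustration.
protocol Probe { func ping() -> Int }
final class Sample: Probe { func ping() -> Int { 1 } }

/// Times each `as?` cast on its own and returns the durations in nanoseconds.
func castDurations(of value: Any, iterations: Int) -> [UInt64] {
    var durations: [UInt64] = []
    durations.reserveCapacity(iterations)
    for _ in 0..<iterations {
        let start = DispatchTime.now().uptimeNanoseconds
        let result = value as? Probe
        let end = DispatchTime.now().uptimeNanoseconds
        // Using the result keeps the cast from being optimized away.
        precondition(result != nil, "cast unexpectedly failed")
        durations.append(end >= start ? end - start : 0)
    }
    return durations
}

let durations = castDurations(of: Sample(), iterations: 1_000).sorted()
print("min: \(durations.first!) ns, median: \(durations[durations.count / 2]) ns, max: \(durations.last!) ns")
```

Comparing the minimum, median and maximum (rather than an average) shows whether a few outliers dominate the total time.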

The source to `swift_dynamicCast` is extremely complicated and I don't pretend to understand all the caveats, but it looks from Instruments as if we're mostly getting caught in a long linear search of some data structure, possibly the `__swift5_proto` section? Ours is ~192KB, which is a lot of data to be scanning through, though I'm not sure if that's the whole problem.

Can anyone suggest any things to check, any tips & tricks for speeding it up (other than "don't"), or any special situations where we might be able to do something unsafe but "instantaneous" if we can constrain the problem?
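One way to constrain the problem, assuming the conformance is known statically at the point where a value is stored, is to capture the protocol view at that moment instead of recovering it later with `as?`. This is only a sketch of that idea, not something proposed in the thread; `Handler`, `Worker`, `Wrapper` and `Slot` are hypothetical names.

```swift
// Hypothetical protocol and types, for illustration only.
protocol Handler { func handle() -> String }

final class Worker: Handler {
    func handle() -> String { "worker" }
}

struct Wrapper<T>: Handler {
    let payload: T
    func handle() -> String { "wrapper" }
}

/// Stores the value as `Any` plus, when it is known statically, its `Handler` view.
struct Slot {
    let value: Any
    let handler: Handler?

    /// Use when the static type conforms: the conformance is supplied by the
    /// generic constraint, so no `as?` cast is needed later.
    static func conforming<T: Handler>(_ value: T) -> Slot {
        Slot(value: value, handler: value)
    }

    /// Use for values whose static type is not known to conform.
    static func opaque(_ value: Any) -> Slot {
        Slot(value: value, handler: nil)
    }
}

let slots: [Slot] = [
    .conforming(Worker()),
    .conforming(Wrapper(payload: 42)),
    .opaque("unrelated"),
]

// Reading the stored optional replaces the dynamic cast at the use site.
let outputs = slots.compactMap { $0.handler?.handle() }
print(outputs) // prints ["worker", "wrapper"]
```

The trade-off is extra storage per value, and it only helps where the concrete type is visible to the compiler when the value is created.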

jrose · November 13, 2024, 1:27am

I don’t know if I’ll have an answer for you, but I can at least request more info: it matters a lot what the source and destination types are in this case. What are you usually casting to? What are you casting from? If you’re casting to a protocol, are the conformances involved conditional?

David_Smith · November 13, 2024, 3:02am

One thing to check would be whether the slow cases are the first launch of a fresh binary (a case software developers obviously hit often, sadly). dyld builds a cache of a lot of interesting data for subsequent runs called the "launch closure".

KeithBauerANZ · November 13, 2024, 4:30am

The values are always stored as Any, and cast to protocol types. The protocols don't have Self references/associated types. The conformances are never conditional. Within that, we have three broad cases:

- final class to a protocol; this cast should always succeed. The protocol itself is not usually `: AnyObject`, but could often be made so if that would help.
- unknown type to a protocol:
  - 50% of the time it's really a generic struct, and the cast succeeds (the generics are necessary, but irrelevant to the cast).
  - 50% of the time it's some completely arbitrary type, and the cast fails.
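For readers following along, the cases described above have roughly the following shape. All names here (`Renderable`, `Node`, `Box`, `Unrelated`) are hypothetical; only the kinds of types and the unconditional conformances match the description.

```swift
// Hypothetical protocol with no Self or associated type requirements.
protocol Renderable { var name: String { get } }

// Case 1: a final class that conforms; the cast should always succeed.
final class Node: Renderable { let name = "node" }

// Case 2a: a generic struct with an unconditional conformance; the cast succeeds.
struct Box<T>: Renderable {
    let item: T
    var name: String { "box" }
}

// Case 2b: an arbitrary type that does not conform; the cast fails.
struct Unrelated {}

// Everything is stored as `Any` and cast to the protocol at the use site.
let values: [Any] = [Node(), Box(item: 1), Box(item: "x"), Unrelated()]
let names = values.map { ($0 as? Renderable)?.name ?? "<cast failed>" }
print(names) // prints ["node", "box", "box", "<cast failed>"]
```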

First run, vs. subsequent run, doesn't make any obvious difference to the performance profile (iPhone Simulator).

Jon_Shier · November 14, 2024, 4:52am

Emerge Tools have done quite a bit of work to investigate and improve protocol lookup performance. I suggest reviewing their blog entries, especially the one about order files.

CharlesS · November 14, 2024, 10:54pm

Is it possible that you might be invoking the Objective-C bridge somehow? There are plenty of ways to trigger that with `as`, and bridging Objective-C objects will obviously be more time-consuming than type-checking native Swift objects.

KeithBauerANZ · November 14, 2024, 10:55pm

I don't think so. Mostly our types are Swift classes without NSObject superclass, or Swift structs. There shouldn't be any hidden NSArray/NSDictionary either.

nocchijiang · November 15, 2024, 3:44am

I am not sure about the platform your program is targeting, but dyld introduced an optimization for the use case you mentioned starting from iOS 16/macOS 13. See dyld/dyld/DyldAPIs.h at commit 65bbeed63cec73f313b1d636e63f243964725a9d in apple-oss-distributions/dyld on GitHub.

Edit: If you are profiling your program using Instruments from Xcode, this optimization is disabled.

KeithBauerANZ · November 15, 2024, 3:47am

Thanks, I'm profiling using Instruments on the iOS 18 simulator, so it seems I won't be getting the actual performance of the app as seen by users. That's actually encouraging :D

Does anyone know of a way to profile with the dyld optimization enabled?

nocchijiang · November 15, 2024, 3:48am

I would like to recommend ETTrace (EmergeTools/ETTrace on GitHub): "Easily and accurately profile iOS apps."