How does type casting work?

Zeta · January 9, 2024, 1:14pm

How is type casting (e.g., the as? operator) implemented?

For instance, consider the eqZero function

func eqZero<T>(_ x: T) -> Bool {
    guard let x = x as? Int else { return false }
    return x == 0
}

Does the x argument carry a type tag so that the runtime can check whether it can be downcast?
If so, when such casting operations aren't used, is the compiler able to optimize the tags away?

mlienert · January 9, 2024, 2:53pm

What I remember from this video, 2017 LLVM Developers’ Meeting: “Implementing Swift Generics”, is that the runtime type information for T is passed from the caller to the function implementation.

tera · January 9, 2024, 3:01pm

Yes, for a general implementation (given just the N payload bytes, you can't know what type it is).

There will also be specialisations in some cases (e.g. the generic function is in the same module as the calls themselves) which would look like:

// eqZero(42)
func eqZero(_ x: Int) -> Bool {
    guard let x = x as? Int else { return false }
    return x == 0
}
// eqZero("hello")
func eqZero(_ x: String) -> Bool {
    guard let x = x as? Int else { return false }
    return x == 0
}

Which would be optimised to:

// eqZero(42)
func eqZero(_ x: Int) -> Bool {
    guard true else { return false }
    return x == 0
}
// eqZero("hello")
func eqZero(_ x: String) -> Bool {
    guard false else { return false }
    return x == 0 // unreachable
}

and then to:

// eqZero(42)
func eqZero(_ x: Int) -> Bool { x == 0 }
// eqZero("hello")
func eqZero(_ x: String) -> Bool { false }

and then inlined afterwards to mere `x == 0` and `false` correspondingly.
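Whichever path the compiler takes, the observable result should be the same. As a minimal sketch, reusing eqZero from the question (the call sites are just illustrative), both the unspecialised and the specialised versions have to agree on these answers:

```swift
// The generic function from the original question.
func eqZero<T>(_ x: T) -> Bool {
    guard let x = x as? Int else { return false }
    return x == 0
}

// T == Int: the cast succeeds, so the result depends on the value.
print(eqZero(0))
print(eqZero(42))

// T == String: the cast to Int can never succeed.
print(eqZero("hello"))
```

Specialisation only changes how the answer is computed (a compile-time constant instead of a runtime metadata check), not what the answer is.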

Zeta · January 9, 2024, 3:02pm

Thank you for the pointer!

Zeta · January 9, 2024, 3:06pm

Okay, so in the general case the type tag will be there, but in simple cases like this the function will be specialized and optimized, which makes it possible to elide the tag check.

wadetregaskis · January 9, 2024, 5:40pm

As I understand it, in order to support libraries, Swift generics have truly generic implementations that take in auto-boxed values with associated type metadata. So the 'core' implementation of your generic function does not take a raw Int or any other type, but rather a 'box' of Type + raw value (or equivalent).

The compiler may - but isn't required to - generate specialisations as @tera discussed, and in those there are a variety of general optimisations that can be applied to remove unnecessary boxing (and therefore allow dynamic type casts to potentially be resolved at compile time).

jrose · January 9, 2024, 8:22pm

Implementation Details that don’t affect the substance of the previous reply

Technically what’s passed in isn’t a box, it’s “just” a pointer, with the type passed as a separate argument. This helps for cases when the value is already on the stack or heap. The “box” form happens when you use Any or any Blah, because then the value has to stay associated with its type, but for generics and some Blah, passing them separately is more flexible and can result in less allocation traffic. It also means the type only has to be passed once when it’s used multiple times in an argument list.
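A rough way to see the difference from user code, as a sketch only: Shape, Square, areaGeneric and areaExistential below are hypothetical names made up for illustration. The existential value has to carry its type information with it, so its in-memory representation is larger than the bare struct, whereas the generic function takes the value as-is and gets the type separately.

```swift
// Hypothetical protocol and conforming type, for illustration only.
protocol Shape {
    func area() -> Double
}

struct Square: Shape {
    var side: Double
    func area() -> Double { side * side }
}

// Generic (equivalently `some Shape`): the value and its type travel separately.
func areaGeneric<T: Shape>(_ s: T) -> Double { s.area() }

// Existential: the value is wrapped together with its type information.
func areaExistential(_ s: any Shape) -> Double { s.area() }

let sq = Square(side: 3)
print(areaGeneric(sq), areaExistential(sq))

// The existential container is bigger than the bare struct it wraps.
print(MemoryLayout<Square>.size, MemoryLayout<any Shape>.size)
```

The exact sizes are an implementation detail and depend on the platform, so the only thing worth relying on here is that the existential form is the larger one.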

wadetregaskis · January 9, 2024, 10:26pm

Right, thanks - that is an important distinction that I was unwittingly glossing over. Is there a particular term for this? In the vein of 'box', 'existential', '____'?

jrose · January 10, 2024, 6:08pm

Within the compiler, this is “indirect” or “by-address” (usually at the SIL level). You also see it as “address-only”, as in “this variable has an address-only type”. Someone who works on SIL more than I did might be able to explain better.

Slava_Pestov · January 10, 2024, 8:12pm

The distinction is that the tag is not part of the value itself, but passed separately as the generic type argument. So func f<T>(_: [T]), for example, receives runtime type metadata for T as an argument, and that metadata describes all elements of the array, with individual elements stored inline as direct values, without any kind of tagging.
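One way to observe this from user code, as a small sketch (elementInfo is a hypothetical helper, not anything from the standard library): the function can name T even when the array is empty, which would not be possible if the type information lived on the elements themselves.

```swift
// Hypothetical helper: reports what the function knows about T
// without ever looking at an element.
func elementInfo<T>(_ xs: [T]) -> (type: String, stride: Int) {
    (String(describing: T.self), MemoryLayout<T>.stride)
}

let empty: [Int] = []
print(elementInfo(empty))      // T is known even though there are no elements
print(elementInfo([1, 2, 3]))  // per-element stride is just that of a plain Int

// For comparison, elements of [Any] each carry their own type information,
// so the per-element stride is larger.
print(MemoryLayout<Int>.stride, MemoryLayout<Any>.stride)
```

The comparison with [Any] is the existential case mentioned earlier in the thread: there each element does stay associated with its type, at the cost of a larger per-element footprint.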