Most efficient way of type casting from UInt32 to Int

(Jens Persson) #1

Assuming my target is 64 bit, so an Int can represent every UInt32 value, what is the most efficient way to get an Int value given a UInt32 value?

For example will these two be equally efficient and perform no unnecessary checks or anything:



Int(myUInt32Value)
Int(truncatingIfNeeded: myUInt32Value)


(Cory Benfield) #2

This is a question most easily answered by Godbolt. According to Godbolt, the two operations generate identical assembly code.

At the SIL layer, this is what we get for Int.init(UInt32):

%2 = struct_extract %0 : $UInt32, #UInt32._value // user: %3
%3 = builtin "zextOrBitCast_Int32_Int64"(%2 : $Builtin.Int32) : $Builtin.Int64 // user: %4
%4 = struct $Int (%3 : $Builtin.Int64)          // user: %5

And this is what we get for Int.init(truncatingIfNeeded: UInt32):

  %2 = struct_extract %0 : $UInt32, #UInt32._value // user: %3
  %3 = builtin "zextOrBitCast_Int32_Int64"(%2 : $Builtin.Int32) : $Builtin.Int64 // user: %4
  %4 = struct $Int (%3 : $Builtin.Int64)          // user: %5

The SIL here is identical, so the compiler considers these to be the same, at least on 64-bit platforms. This is what I'd expect: as you say, the compiler knows up front that every UInt32 value fits in a 64-bit Int, so the two operations are absolutely isomorphic.
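
For anyone who wants to check this locally rather than on Godbolt, here is a minimal sketch. The helper names (widenChecked, widenTruncating) are made up for illustration and are not from the standard library, and the snippet assumes a 64-bit target, where Int(value) cannot trap for any UInt32. The last two lines use Int32 as a stand-in for a 32-bit Int to show why the "at least on 64-bit platforms" caveat matters: once the destination is too narrow, the two initializers stop being interchangeable.

```swift
// Hypothetical helpers, named only for this example.
// @inline(never) keeps each one visible as its own function
// when inspecting optimized SIL or assembly.
@inline(never)
public func widenChecked(_ value: UInt32) -> Int {
    // Range-checked initializer: traps if the value does not fit in Int.
    return Int(value)
}

@inline(never)
public func widenTruncating(_ value: UInt32) -> Int {
    // Bit-pattern initializer: never traps, zero-extends an unsigned source.
    return Int(truncatingIfNeeded: value)
}

// Assumes a 64-bit target, so every sample below fits in Int.
let samples: [UInt32] = [0, 1, 0x7FFF_FFFF, 0x8000_0000, UInt32.max]
let agree = samples.allSatisfy { widenChecked($0) == widenTruncating($0) }

// Int32 stands in for a 32-bit Int: the destination is too narrow here.
let wrapped = Int32(truncatingIfNeeded: UInt32.max) // keeps the low 32 bits
let exact = Int32(exactly: UInt32.max)              // nil: value does not fit
```

On a 64-bit target, agree should be true for every sample. In the Int32 case the truncating initializer silently reinterprets the bits, while a checked conversion would fail, so the equivalence discussed above depends on the destination being wide enough.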

(Steve Canon) #3

If either of these ever fails to generate a simple in-register zero-extension when optimized, that's a performance bug. So they should be equivalent. As @lukasa noted, you can check this on Godbolt; here's a link showing an example:

Here's the code from that example:

public func foo(x: UInt32, y: UInt32) -> Int {
    return Int(truncatingIfNeeded: x) + Int(y)
}

and the generated assembly, annotated with my notes explaining what's happening:

// Standard stack frame setup. Arguably Swift should be able
// to omit the frame for simple leaf functions like this.
push    rbp
mov     rbp, rsp
// Zero-extend `x` and `y` by using a 32b `mov` instruction, which 
// implicitly zero-extends to 64b.
mov     ecx, edi
mov     eax, esi
// Add the two zero-extended values as 64b integers.
add     rax, rcx
// Clear stack frame and return
pop     rbp