Jens
1
Assuming my target is 64-bit, so an Int can represent every UInt32 value, what is the most efficient way to get an Int value from a UInt32 value?
For example, will these two be equally efficient and perform no unnecessary checks:
Int(myUInt32Value)
and
Int(truncatingIfNeeded: myUInt32Value)
?
lukasa
(Cory Benfield)
2
This is a question most easily answered by Godbolt. According to Godbolt, the two operations generate identical assembly code.
At the SIL layer, this is what we get for Int.init(UInt32):
%2 = struct_extract %0 : $UInt32, #UInt32._value // user: %3
%3 = builtin "zextOrBitCast_Int32_Int64"(%2 : $Builtin.Int32) : $Builtin.Int64 // user: %4
%4 = struct $Int (%3 : $Builtin.Int64) // user: %5
And this is what we get for Int.init(truncatingIfNeeded: UInt32):
%2 = struct_extract %0 : $UInt32, #UInt32._value // user: %3
%3 = builtin "zextOrBitCast_Int32_Int64"(%2 : $Builtin.Int32) : $Builtin.Int64 // user: %4
%4 = struct $Int (%3 : $Builtin.Int64) // user: %5
The SIL here is identical, so the compiler considers these to be the same, at least on 64-bit platforms. This is what I'd expect: as you say, the compiler is forewarned that these two operations will be absolutely isomorphic.
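To make the "at least on 64-bit platforms" caveat concrete, here is a minimal sketch assuming a 64-bit target. The helper names `widenChecked` and `widenTruncating` are hypothetical and only for illustration. It checks that the two initializers agree when widening, and shows the narrowing case, where they do differ:

```swift
// Hypothetical helpers wrapping the two initializers from the question.
func widenChecked(_ x: UInt32) -> Int { Int(x) }
func widenTruncating(_ x: UInt32) -> Int { Int(truncatingIfNeeded: x) }

// Assumption: 64-bit target, so every UInt32 value fits in Int
// and both initializers must return the same result.
let samples: [UInt32] = [0, 1, 0x7FFF_FFFF, 0x8000_0000, .max]
for s in samples {
    precondition(widenChecked(s) == widenTruncating(s))
}

// Narrowing is where the two families differ.
let big: UInt32 = .max
let wrapped = Int32(truncatingIfNeeded: big) // keeps the low 32 bits: -1
let checked = Int32(exactly: big)            // nil, the value does not fit
// Int32(big) would trap at runtime, so it is not executed here.
```

On a 32-bit target, `Int(x)` would need a range check for values above `Int.max`, which is presumably why the equivalence is only claimed for 64-bit platforms.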
scanon
(Steve Canon)
3
If either of these ever fails to generate a simple in-register zero-extension when optimized, that's a performance bug. So they should be equivalent. As @lukasa noted, you can check this on Godbolt; here's a link showing an example: Compiler Explorer
Here's the code from that example:
public func foo(x: UInt32, y: UInt32) -> Int {
    return Int(truncatingIfNeeded: x) + Int(y)
}
and the generated assembly, annotated with my notes explaining what's happening:
// Standard stack frame setup. Arguably Swift should be able
// to omit the frame for simple leaf functions like this.
push rbp
mov rbp, rsp
// Zero-extend `x` and `y` by using a 32b `mov` instruction, which
// implicitly zero-extends to 64b.
mov ecx, edi
mov eax, esi
// Add the two zero-extended values as 64b integers.
add rax, rcx
// Clear stack frame and return
pop rbp
ret
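As a small illustration of why a zero-extending `mov` is the right lowering here, the sketch below (assuming a 64-bit target) contrasts widening from an unsigned source with widening from a signed source that has the same bit pattern. The variable names are illustrative only:

```swift
let x: UInt32 = 0xFFFF_FFFF

// Unsigned source: the upper 32 bits are filled with zeros.
let zeroExtended = Int(truncatingIfNeeded: x)

// Signed source with the same bits: the sign bit is replicated instead.
let signExtended = Int(truncatingIfNeeded: Int32(bitPattern: x))

print(zeroExtended) // 4294967295
print(signExtended) // -1
```

Because `UInt32` is unsigned, both `Int(x)` and `Int(truncatingIfNeeded: x)` only ever need the zero-extending form.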