I was writing some Swift code in Godbolt, to see how it implements this check:
func getIndex(arr: [Any.Type], type: Any.Type) -> Int? {
arr.lastIndex { $0 == type }
}
To my surprise, the resulting assembly contained seemingly unnecessary test
instructions, e.g.
test rcx, rcx
je .LBB1_6
(Godbolt link containing the full output assembly)
I investigated this a bit more locally, with -Xfrontend -emit-ir
, and I found the LLVM IR it outputted also contained the redundant check.
; Function Attrs: nounwind
define swiftcc { i64, i8 } @"$s4test8getIndex3arr4typeSiSgSayypXpG_ypXptF"(%swift.bridge* nocapture readonly %0, %swift.type* readnone %1) #1 {
entry:
%2 = bitcast %swift.bridge* %0 to %Ts28__ContiguousArrayStorageBaseC*
%._storage.count._value = getelementptr inbounds %Ts28__ContiguousArrayStorageBaseC, %Ts28__ContiguousArrayStorageBaseC* %2, i64 0, i32 1, i32 0, i32 0, i32 0
%3 = load i64, i64* %._storage.count._value, align 8, !range !16
%4 = icmp eq i64 %3, 0
br i1 %4, label %.loopexit, label %5
5: ; preds = %entry
%6 = add nsw i64 %3, -1
%7 = bitcast %swift.bridge* %0 to i8*
%8 = getelementptr inbounds i8, i8* %7, i64 32
%tailaddr = bitcast i8* %8 to { %swift.type* }*
%9 = getelementptr inbounds { %swift.type* }, { %swift.type* }* %tailaddr, i64 %6, i32 0
%10 = bitcast %swift.type** %9 to i8**
%11 = load i8*, i8** %10, align 8
%12 = bitcast %swift.type* %1 to i8*
%13 = icmp eq i8* %11, %12
br i1 %13, label %.loopexit, label %14
14: ; preds = %5
%15 = icmp eq i64 %6, 0
br i1 %15, label %.loopexit, label %.preheader.preheader
.loopexit: ; preds = %26, %.preheader.preheader, %14, %5, %entry
%16 = phi i64 [ 0, %entry ], [ %6, %5 ], [ 0, %14 ], [ 0, %26 ], [ %21, %.preheader.preheader ]
%17 = phi i8 [ 1, %entry ], [ 0, %5 ], [ 1, %14 ], [ 1, %26 ], [ 0, %.preheader.preheader ]
%18 = insertvalue { i64, i8 } undef, i64 %16, 0
%19 = insertvalue { i64, i8 } %18, i8 %17, 1
ret { i64, i8 } %19
.preheader.preheader: ; preds = %14, %26
%20 = phi i64 [ %21, %26 ], [ %6, %14 ]
%21 = add nsw i64 %20, -1
%22 = getelementptr inbounds { %swift.type* }, { %swift.type* }* %tailaddr, i64 %21, i32 0
%23 = bitcast %swift.type** %22 to i8**
%24 = load i8*, i8** %23, align 8
%25 = icmp eq i8* %24, %12
br i1 %25, label %.loopexit, label %26
26: ; preds = %.preheader.preheader
%27 = icmp eq i64 %21, 0
br i1 %27, label %.loopexit, label %.preheader.preheader
}
As best I can tell -- which I may be wrong, as I'm not familiar with LLVM IR -- %21
is always loaded from %20
, which in turn is first loaded from %6
. This means that %24
and %12
are the same, and the icmp
instruction assigned to %25
is redundant.
I have three questions:
- Is my understanding of the problem correct?
- Is this a known issue?
- Is fixing this a responsibility of the Swift frontend (to emit cleaner IR) or LLVM (to eliminate this kind of check)?
Edit: On further investigation, I realize this probably isn't actually a problem. If I treat these as unconditional jumps then the entire function is only four instructions, so there's probably just something going on I don't understand. On the other hand, this slightly more complex version of the function has a different problem:
func getIndex(arr: [(type: Any.Type, name: String?)], type: Any.Type, name: String?) -> Int? {
arr.lastIndex {
$0.type == type && $0.name == name
}
}
test rsi, rsi
je .LBB1_10
test r12, r12
je .LBB1_14
; ...
jmp .LBB1_9
.LBB1_10:
test r12, r12
jne .LBB1_14
; ...
Since there's no other jumps to LBB1_10
, and the instruction immediately before it is an unconditional jump, I'd think this can be simplified to:
test r12, r12
je .LBB1_14
test rsi, rsi
je .LBB1_10
; ...
jmp .LBB1_9
.LBB1_10:
; ...
But that would be solidly within the domain of LLVM and not Swift.