Redundant test instructions in loop

bbrk24 · October 28, 2022, 10:09pm

I was writing some Swift code in Godbolt, to see how it implements this check:

func getIndex(arr: [Any.Type], type: Any.Type) -> Int? {
    arr.lastIndex { $0 == type }
}

To my surprise, the resulting assembly contained seemingly unnecessary test instructions, e.g.

        test    rcx, rcx
        je      .LBB1_6

(Godbolt link containing the full output assembly)

I investigated this a bit more locally, with -Xfrontend -emit-ir, and I found the LLVM IR it outputted also contained the redundant check.

; Function Attrs: nounwind
define swiftcc { i64, i8 } @"$s4test8getIndex3arr4typeSiSgSayypXpG_ypXptF"(%swift.bridge* nocapture readonly %0, %swift.type* readnone %1) #1 {
entry:
  %2 = bitcast %swift.bridge* %0 to %Ts28__ContiguousArrayStorageBaseC*
  %._storage.count._value = getelementptr inbounds %Ts28__ContiguousArrayStorageBaseC, %Ts28__ContiguousArrayStorageBaseC* %2, i64 0, i32 1, i32 0, i32 0, i32 0
  %3 = load i64, i64* %._storage.count._value, align 8, !range !16
  %4 = icmp eq i64 %3, 0
  br i1 %4, label %.loopexit, label %5

5:                                                ; preds = %entry
  %6 = add nsw i64 %3, -1
  %7 = bitcast %swift.bridge* %0 to i8*
  %8 = getelementptr inbounds i8, i8* %7, i64 32
  %tailaddr = bitcast i8* %8 to { %swift.type* }*
  %9 = getelementptr inbounds { %swift.type* }, { %swift.type* }* %tailaddr, i64 %6, i32 0
  %10 = bitcast %swift.type** %9 to i8**
  %11 = load i8*, i8** %10, align 8
  %12 = bitcast %swift.type* %1 to i8*
  %13 = icmp eq i8* %11, %12
  br i1 %13, label %.loopexit, label %14

14:                                               ; preds = %5
  %15 = icmp eq i64 %6, 0
  br i1 %15, label %.loopexit, label %.preheader.preheader

.loopexit:                                        ; preds = %26, %.preheader.preheader, %14, %5, %entry
  %16 = phi i64 [ 0, %entry ], [ %6, %5 ], [ 0, %14 ], [ 0, %26 ], [ %21, %.preheader.preheader ]
  %17 = phi i8 [ 1, %entry ], [ 0, %5 ], [ 1, %14 ], [ 1, %26 ], [ 0, %.preheader.preheader ]
  %18 = insertvalue { i64, i8 } undef, i64 %16, 0
  %19 = insertvalue { i64, i8 } %18, i8 %17, 1
  ret { i64, i8 } %19

.preheader.preheader:                             ; preds = %14, %26
  %20 = phi i64 [ %21, %26 ], [ %6, %14 ]
  %21 = add nsw i64 %20, -1
  %22 = getelementptr inbounds { %swift.type* }, { %swift.type* }* %tailaddr, i64 %21, i32 0
  %23 = bitcast %swift.type** %22 to i8**
  %24 = load i8*, i8** %23, align 8
  %25 = icmp eq i8* %24, %12
  br i1 %25, label %.loopexit, label %26

26:                                               ; preds = %.preheader.preheader
  %27 = icmp eq i64 %21, 0
  br i1 %27, label %.loopexit, label %.preheader.preheader
}

As best I can tell -- which I may be wrong, as I'm not familiar with LLVM IR -- %21 is always loaded from %20, which in turn is first loaded from %6. This means that %24 and %12 are the same, and the icmp instruction assigned to %25 is redundant.

I have three questions:

Is my understanding of the problem correct?
Is this a known issue?
Is fixing this a responsibility of the Swift frontend (to emit cleaner IR) or LLVM (to eliminate this kind of check)?

Edit: On further investigation, I realize this probably isn't actually a problem. If I treat these as unconditional jumps then the entire function is only four instructions, so there's probably just something going on I don't understand. On the other hand, this slightly more complex version of the function has a different problem:

func getIndex(arr: [(type: Any.Type, name: String?)], type: Any.Type, name: String?) -> Int? {
    arr.lastIndex {
        $0.type == type && $0.name == name
    }
}

        test    rsi, rsi
        je      .LBB1_10
        test    r12, r12
        je      .LBB1_14
        ; ...
        jmp     .LBB1_9
.LBB1_10:
        test    r12, r12
        jne     .LBB1_14
        ; ...

Godbolt link

Since there's no other jumps to LBB1_10, and the instruction immediately before it is an unconditional jump, I'd think this can be simplified to:

        test    r12, r12
        je      .LBB1_14
        test    rsi, rsi
        je      .LBB1_10
        ; ...
        jmp     .LBB1_9
.LBB1_10:
        ; ...

But that would be solidly within the domain of LLVM and not Swift.