Swift 4.1: Question on IR Gen of Tuple type

Hello experts,

I am trying to debug a problem on the s390x architecture where a "for _ in" loop never terminates. The problem appears to depend on the order of the element types in a tuple. The sample below shows one tuple declaration that fails and one that works:

cat sam.swift
enum IntKey : Int {
    case a = 3
    case b = 4
}

let tt = [(IntKey.a,1)]    // <---- This tuple declaration causes an infinite for loop, whereas ...
//let tt = [(1,IntKey.a)]  // <---- This declaration exits the loop as expected

for _ in tt{
}

For the failing case (let tt = [(IntKey.a,1)]), the generated IR looks like:
...
%T3sam6IntKeyO_SitSg = type <{ [16 x i8] }>
...
%11 = bitcast i8* %8 to <{ %T3sam6IntKeyO, [7 x i8], %TSi }>*
%.elt = getelementptr inbounds <{ %T3sam6IntKeyO, [7 x i8], %TSi }>, <{ %T3sam6IntKeyO, [7 x i8], %TSi }>* %11, i32 0, i32 0
%.elt1 = getelementptr inbounds <{ %T3sam6IntKeyO, [7 x i8], %TSi }>, <{ %T3sam6IntKeyO, [7 x i8], %TSi }>* %11, i32 0, i32 2
...
%30 = getelementptr inbounds { i64, i64 }, { i64, i64 }* %27, i32 0, i32 1

%32 = bitcast %T3sam6IntKeyO_SitSg* %3 to i8*
call void @llvm.lifetime.end.p0i8(i64 16, i8* %32)
%33 = and i64 %29, 255
%34 = icmp eq i64 %33, 2
br i1 %34, label %38, label %35

For the success case (let tt = [(1, IntKey.a)]; note that only the tuple values are swapped), the generated IR looks like:
...
%TSi_3sam6IntKeyOtSg = type <{ [9 x i8], [1 x i8] }>
...
%31 = load i8, i8* %30, align 8
%32 = getelementptr inbounds %TSi_3sam6IntKeyOtSg, %TSi_3sam6IntKeyOtSg* %3, i32 0, i32 1
%33 = bitcast [1 x i8]* %32 to i1*
%34 = load i1, i1* %33, align 1
%35 = bitcast %TSi_3sam6IntKeyOtSg* %3 to i8*

On s390x (in the failing case), it seems that "%34 = icmp eq i64 %33, 2" produces the wrong result, possibly because the contents reached through GEP %32 were read in big-endian byte order, which would throw off the icmp further down the chain.
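To make the suspicion above concrete, here is a minimal sketch of how masking the low byte of a 64-bit load can behave differently depending on byte order. It is only an illustration of the suspected failure mode, not a confirmed diagnosis: the byte values and the names `bytes`, `asLittle` and `asBig` are made up for the example, and it assumes the enum tag sits in the first byte of the storage.

```swift
// Eight bytes of storage with the value 2 in the first byte,
// standing in for a tag stored at offset 0 (an assumption for this sketch).
let bytes: [UInt8] = [2, 0, 0, 0, 0, 0, 0, 0]

// Assemble the bytes as a little-endian load would:
// byte 0 becomes the least significant byte.
var asLittle: UInt64 = 0
for (i, b) in bytes.enumerated() {
    asLittle |= UInt64(b) << (8 * i)
}

// Assemble the bytes as a big-endian load would:
// byte 0 becomes the most significant byte.
var asBig: UInt64 = 0
for b in bytes {
    asBig = (asBig << 8) | UInt64(b)
}

// The same "and ..., 255" mask sees different bytes.
print(asLittle & 255)  // 2: the mask selects byte 0
print(asBig & 255)     // 0: the mask selects byte 7 instead
print(asBig >> 56)     // 2: byte 0 is in the top bits on big-endian
```

If this is what happens, a comparison against 2 after "and i64 %29, 255" would succeed on little-endian targets and never succeed on s390x.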

I wonder if folks can help me understand why the generated IR is so different for the two tuple cases, where the only difference is the order of the values in the tt declaration. Specifically,

  1. Failing case: %T3sam6IntKeyO_SitSg = type <{ [16 x i8] }> vs. Passing case: %TSi_3sam6IntKeyOtSg = type <{ [9 x i8], [1 x i8] }>
    Why does the type layout look so different for the tuple [(IntKey.a,1)] vs. [(1, IntKey.a)]? It seems like a simple swap.
    Why is one <{ [16 x i8] }> vs the other <{ [9 x i8], [1 x i8] }>? Where are the numbers (16 and 9) coming from?

  2. Failing case: %30 = getelementptr inbounds { i64, i64 }, { i64, i64 }* %27, i32 0, i32 1
    It seems that it is trying to read 7 bytes (why 7 bytes?) from this storage and then masks the value to get the number of cases in the enum, which is then sent for comparison. In the passing case this test is completely different: the compiler compares 1 on the stack.
    Could someone please tell me where and why this is happening?
    EnumPayload.cpp's emitCompare() method seems to do the masking and comparison in the failing case.
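For question 1, one way to check where 16 and 9 come from is to print the layout of the two tuple types with MemoryLayout. This is a sketch, not an explanation of IRGen internals: the typealias names `KeyFirst` and `KeyLast` are invented here, and the sizes in the comments assume a 64-bit target where Int is 8 bytes. The "Sg" suffix in the mangled type names denotes Optional, which matches the Optional tuple returned by the iterator's next() in the for loop.

```swift
enum IntKey: Int {
    case a = 3
    case b = 4
}

// Hypothetical names for the two element orders in the question.
typealias KeyFirst = (IntKey, Int)
typealias KeyLast = (Int, IntKey)

// KeyFirst: 1-byte IntKey, padding up to Int's alignment, then Int.
// On a 64-bit target that is 1 + 7 + 8 = 16 bytes.
print("KeyFirst size:", MemoryLayout<KeyFirst>.size,
      "stride:", MemoryLayout<KeyFirst>.stride)

// KeyLast: Int first, then the 1-byte IntKey with no padding between them.
// On a 64-bit target that is 8 + 1 = 9 bytes.
print("KeyLast size:", MemoryLayout<KeyLast>.size,
      "stride:", MemoryLayout<KeyLast>.stride)

// The Optional wrappers correspond to the ...Sg types in the IR.
// Their sizes are printed without an expected value because how the
// "nil" case is encoded may differ between compiler versions.
print("KeyFirst? size:", MemoryLayout<KeyFirst?>.size)
print("KeyLast? size:", MemoryLayout<KeyLast?>.size)
```

The 7 in "[7 x i8]" would then be the padding between the 1-byte enum and the 8-byte-aligned Int. The extra "[1 x i8]" in the passing case appears to be a separate byte for the Optional's nil case, whereas the failing case appears to encode nil in the enum's own byte, which would explain the comparison against 2.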

Thanks in advance for any insights.