How does init_existential_addr work in SIL?

Alvae · February 24, 2021, 9:34am

Hello everyone,

I'm trying to make sense of the way existential containers are represented and handled at the SIL level. Unfortunately, I am struggling to understand a couple of things, and was hoping for your enlightenment.

Consider the following code in Swift:

protocol Pr {
  func foo()
}
struct St: Pr {
  func foo() {}
  var bar = 123
}
func fn() {
  let a: Pr = St()
  let b = a as! St
}

The part of the SIL in which I am interested relates to the two statements of the function fn.

The initialization of a looks something like this:

%0 = alloc_stack $Pr, let, name "a" 
// ...
%4 = init_existential_addr %0 : $*Pr, $St
store %3 to [trivial] %4 : $*St

My current understanding of init_existential_addr (synthesized from the docs) is that it initializes, at the given address, an existential container prepared to store a value of type St. This would be consistent with the store instruction that follows.

Now, the initialization of b looks something like this:

%6 = alloc_stack $Pr
copy_addr %0 to [initialization] %6 : $*Pr
%8 = alloc_stack $St
unconditional_checked_cast_addr Pr in %6 : $*Pr to St in %8 : $*St

My interpretation of this snippet is as follows.

%6 is a temporary "container" that will be used to hold a copy of the value of a.
copy_addr copies the whole container at %0, i.e., the data, the witness, and the protocol witness tables.
%8 represents the storage of b, a stack allocation of type St.
unconditional_checked_cast_addr casts the address of the container in %6 to that of a value of type St and moves(?) it into %8, destroying the temporary in the process.

What I don't quite understand is how unconditional_checked_cast_addr is able to determine that %6 (its first operand) is indeed the address of an St value. This, in turn, challenges my understanding of almost all other instructions.

My first assumption is that alloc_stack allocates enough memory to store an "instance" of its type argument. If the latter is a concrete type (e.g., St), then this amounts to the type's stride. But for an existential type (e.g., Pr), this amounts to the size of an existential container (3 words + pointer to witness + pointer to protocol witness tables).
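The sizes in this assumption can be checked from Swift itself with MemoryLayout. The following is only a minimal sketch: Small and Large are hypothetical names introduced for illustration (Small mirrors St, Large mirrors the "bloated" variant with four Ints), and the 5-word figure assumes an opaque existential for a single non-class protocol.

```swift
protocol Pr {
  func foo()
}

// Hypothetical type: one Int of payload, like St in the question.
struct Small: Pr {
  func foo() {}
  var bar = 123
}

// Hypothetical type: four Ints of payload, more than three words.
struct Large: Pr {
  func foo() {}
  var bar = (1, 2, 3, 4)
}

// One machine word, measured as the size of a pointer.
let word = MemoryLayout<UnsafeRawPointer>.size

let smallSize = MemoryLayout<Small>.size
let largeSize = MemoryLayout<Large>.size
// The container has a fixed size, whatever concrete type it ends up holding.
let containerSize = MemoryLayout<Pr>.size

// Sizes expressed in words: Small, Large, existential container.
print(smallSize / word, largeSize / word, containerSize / word)
```

If the assumption holds, the container is 5 words regardless of the payload, which matches the 40-byte and 8-byte allocations one would expect for Pr and St on a 64-bit target.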

In this particular example, I can imagine that the contents of a value of type St "fit" within the 3 words of the container. Hence, it would make sense that %6 is the address of a block that can be reinterpreted as a value of type St. But the same SIL will be generated if I bloat the size of St (e.g., by redeclaring bar as an (Int, Int, Int, Int)). In this case, my understanding is that the contents of the existential container will be allocated on the heap, and its 3 words of data will just contain a pointer to this memory. From this assumption, I would guess that init_existential_addr would be responsible for allocating this heap memory (while destroy_addr would be responsible for deallocating it) and returning its address, which would be %4 here. This is consistent with the first store instruction, which initializes a. But then the cast instruction no longer makes any sense, as %6 would be a pointer to a pointer to an St value.

Could anyone point out which of my assumptions are incorrect?

johannesweiss · February 25, 2021, 9:44am

So from my understanding (which isn't perfect, I'm not on the compiler team), almost everything you write is spot on. But I think there's a little misunderstanding right here:

So %6, as you correctly point out, is not of type St but of type Pr, which (also as you say) has a different size. I think what throws you off here is the word ..._cast_.... I reckon you read cast like a cast in C, i.e. (St *)pointer_to_pr. That is not the case here: SIL's unconditional_checked_cast_addr will in most cases actually invoke the runtime function swift_dynamicCast, which does all of the hard work. So this operation may look like a very cheap cast, but it's actually more of a potentially expensive-ish type conversion.
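To see that the cast is a checked conversion rather than a pointer reinterpretation, here is a small sketch using as?, the conditional counterpart of as!. Other is a hypothetical second conforming type added only for this illustration; it is not part of the original example.

```swift
protocol Pr {
  func foo()
}

struct St: Pr {
  func foo() {}
  var bar = 123
}

// Hypothetical second conformer, same size as St but a different type.
struct Other: Pr {
  func foo() {}
  var baz = 456
}

let a: Pr = St()
let c: Pr = Other()

// The dynamic type stored in the container is St, so the cast succeeds
// and the payload is extracted.
let fromA = a as? St

// The dynamic type is Other, so the cast fails and yields nil, even though
// the bits of Other would fit in an St.
let fromC = c as? St

print(fromA?.bar as Any, fromC as Any)
```

A C-style pointer cast would happily reinterpret the Other payload as an St. The dynamic cast instead consults the type metadata stored in the container, which is why it needs runtime support.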

Some references in the compiler source:

IRGen handling unconditional_checked_cast_addr which mostly does emitCheckedCast
Implementation of emitCheckedCast which contains the crucial IGF.Builder.CreateCall(IGF.IGM.getDynamicCastFn(), args); which emits swift_dynamicCast

You can also kinda see that if you ask the compiler:

-emit-sil:

sil hidden [noinline] @test.THE_FUNCTION() -> () : $@convention(thin) () -> () {
bb0:
  %0 = alloc_stack $Pr, let, name "a"             // users: %13, %12, %7, %4
  %1 = integer_literal $Builtin.Int64, 123        // user: %2
  %2 = struct $Int (%1 : $Builtin.Int64)          // user: %3
  %3 = struct $St (%2 : $Int)                     // user: %5
  %4 = init_existential_addr %0 : $*Pr, $St       // user: %5
  store %3 to %4 : $*St                           // id: %5
  %6 = alloc_stack $Pr                            // users: %11, %9, %7
  copy_addr %0 to [initialization] %6 : $*Pr      // id: %7
  %8 = alloc_stack $St                            // users: %10, %9
  unconditional_checked_cast_addr Pr in %6 : $*Pr to St in %8 : $*St // id: %9
  dealloc_stack %8 : $*St                         // id: %10
  dealloc_stack %6 : $*Pr                         // id: %11
  destroy_addr %0 : $*Pr                          // id: %12
  dealloc_stack %0 : $*Pr                         // id: %13
  %14 = tuple ()                                  // user: %15
  return %14 : $()                                // id: %15
} // end sil function 'test.THE_FUNCTION() -> ()'

-emit-ir

[...]
  %11 = call %swift.type* @__swift_instantiateConcreteTypeFromMangledName({ i32, i32 }* nonnull @"demangling cache variable for type metadata for test.Pr") #10
  %12 = call i1 @swift_dynamicCast(%swift.opaque* nonnull %9, %swift.opaque* nonnull %10, %swift.type* %11, %swift.type* bitcast (i64* getelementptr inbounds (<{ i8**, i64, <{ i32, i32, i32, i32, i32, i32, i32 }>*, i32, [4 x i8] }>, <{ i8**, i64, <{ i32, i32, i32, i32, i32, i32, i32 }>*, i32, [4 x i8] }>* @"full type metadata for test.St", i64 0, i32 1) to %swift.type*), i64 7) #5
  call void @llvm.lifetime.end.p0i8(i64 8, i8* nonnull %8)
  call void @llvm.lifetime.end.p0i8(i64 40, i8* nonnull %6)
  %13 = bitcast %T4test2PrP* %0 to %__opaque_existential_type_1*
  call void @__swift_destroy_boxed_opaque_existential_1(%__opaque_existential_type_1* nonnull %13) #5
  call void @llvm.lifetime.end.p0i8(i64 40, i8* nonnull %3)
  ret void

-emit-assembly

[...]
	callq	outlined init with copy of test.Pr
	leaq	demangling cache variable for type metadata for test.Pr(%rip), %rdi
	callq	___swift_instantiateConcreteTypeFromMangledName
	leaq	-32(%rbp), %rdi
	movl	$7, %r8d
	movq	%rbx, %rsi
	movq	%rax, %rdx
	movq	%r14, %rcx
	callq	_swift_dynamicCast
	movq	%r15, %rdi
[...]

See how we go from unconditional_checked_cast_addr (SIL) to @swift_dynamicCast (LLVM IR) to _swift_dynamicCast (assembly). And swift_dynamicCast is complicated enough to basically get its own file.

Regarding what happens when you "blow up" the size of St (such that it doesn't fit into the existential container anymore): in that case, the SIL actually doesn't change (meaningfully), but in IR and assembly you will see extra calls to @swift_allocObject(%swift.type* getelementptr inbounds (%swift.full_boxmetadata, %swift.full_boxmetadata* @metadata, i64 0, i32 2) / _swift_allocObject, which will allocate a reference-counted, correctly sized object on the heap. Then the existential will just hold the pointer to that heap storage instead of the St value itself. And swift_dynamicCast will handle all these cases.
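At the source level this is invisible, which is kind of the point. A minimal sketch, assuming the "bloated" St from the question with bar redeclared as a tuple of four Ints:

```swift
protocol Pr {
  func foo()
}

struct St: Pr {
  func foo() {}
  // Four words of payload: more than the three-word inline buffer.
  var bar = (1, 2, 3, 4)
}

let word = MemoryLayout<UnsafeRawPointer>.size

let a: Pr = St()
// Same source-level cast as before; the runtime is expected to locate the
// payload wherever the container keeps it.
let b = a as! St

// St no longer fits inline, yet the container keeps its fixed size.
let payloadTooBig = MemoryLayout<St>.size > 3 * word
let containerSize = MemoryLayout.size(ofValue: a)

print(payloadTooBig, containerSize / word, b.bar)
```

The value comes back intact through the same as! cast, without the Swift code (or, per the above, the SIL) having to know whether the payload was stored inline or on the heap.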

I think the terminology that is usually used for an actual (C-style) cast is "bitcast". If you read bitcast it really means just reinterpreting the same bits as a different type.
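For contrast, this is roughly what an actual bitcast looks like in Swift. It is only an illustrative sketch using the standard library's unsafeBitCast, which requires both types to have the same size and performs no conversion or type check:

```swift
let one: Double = 1.0

// Reinterpret the 64 bits of the Double as a UInt64. Nothing is converted
// and no metadata is inspected.
let bits = unsafeBitCast(one, to: UInt64.self)

// Print the raw IEEE 754 bit pattern of 1.0 in hexadecimal.
print(String(bits, radix: 16))
```

Compare this with as!, which inspects the dynamic type and may have to dig the value out of a container first.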

Alvae · February 25, 2021, 10:47am

Thank you very much for the detailed explanation.
It makes perfect sense now!