Assertion failure in emitMetadataAccessByMangledName on i16 pointers

I've rebased my changes to be derived from release/5.3 and now I'm getting an assertion failure in the above function on line 2583 in my branch:
auto call = IGF.Builder.CreateCall(instantiationFn, cache);

It looks like this is a new(?) part of IR emission that creates a magic SIL function of some sort (__swift_instantiateConcreteTypeFromMangledName) and emits a call to this function into the current function being emitted.

Best I can work out, llvm traps because the function signature does not match the parameters being passed in cache. The function __swift_instantiateConcreteTypeFromMangledName seems to expect a tuple of a pointer to the mangled string and the length of the mangled string. On my platform (AVR), this is effectively { i16, i32 }.

However, cache consists of { i32, i32 }.

Looking at how this is set in getAddrOfTypeMetadataDemanglingCacheVariable, this ultimately calls LinkEntity::getDefaultDeclarationType with type LinkEntity::Kind::TypeMetadataDemanglingCacheVariable and that is hard coded to return llvm::StructType::get(IGM.Int32Ty, IGM.Int32Ty), which causes the discrepancy. (git blame shows @Joe_Groff on these lines if it helps :slight_smile: )

Why wouldn't this be defined instead as return llvm::StructType::get(IGM. RelativeAddressTy, IGM.Int32Ty)?

Wouldn't that be more generally accurate? Also, I'm confused how this isn't a problem on 64 bit systems too. It makes me think I must be missing something!

I can add a little more colour. For every platform except AVR (or <32 bit ptr platforms), relative address pointers are 32 bit. So for 64 bit, 128 bit, 32 bit, whatever, this tuple is { i32, i32 }. The code is "correct" or good enough as-is. For 16 bit pointers, that doesn't hold true.

But the problem only happens the second time emitMetadataAccessByMangledName is called.

The first time round, cache is defined like this...

(lldb) e cache->dump()
@"$sSiSgMD" = external hidden global { i32, i32 }, align 8

...then this code block runs...

  if (cast<llvm::GlobalVariable>(cache->stripPointerCasts())->isDeclaration()) {
    ConstantInitBuilder builder(IGM);
    auto structBuilder = builder.beginStruct();
    
    // A "negative" 64-bit value in the cache indicates the uninitialized state.
    // Which word has that bit in the {i32, i32} layout depends on endianness.

    if (IGM.getModule()->getDataLayout().isBigEndian()) {
      structBuilder.addInt32(-mangledStringSize);
      structBuilder.addRelativeAddress(mangledString);
    } else {
      structBuilder.addRelativeAddress(mangledString);
      structBuilder.addInt32(-mangledStringSize);
    }

    auto init = structBuilder.finishAndCreateFuture();
    cache = IGM.getAddrOfTypeMetadataDemanglingCacheVariable(type, init);
  }

...and cache is changed to be...

(lldb) e cache->dump()
@"$sSiSgMD" = linkonce_odr hidden global { i16, i32 } { i16 sub (i16 ptrtoint (<{ [4 x i8], i8 }>* @"symbolic SiSg" to i16), i16 ptrtoint ({ i16, i32 }* @"$sSiSgMD" to i16)), i32 -4 }, align 8

...which is correct. getOrInsertFunction does not find an existing function in the module named __swift_instantiateConcreteTypeFromMangledNameAbstract so it creates a new one, then defines the input parameters of the IR function being built based on the tuple that the expression being built in cache resolves to, i.e. { i16, i32 }...

  auto instantiationFn = cast<llvm::Function>(
      IGM.getModule()
          ->getOrInsertFunction(instantiationFnName, IGF.IGM.TypeMetadataPtrTy,
                                cache->getType())
          .getCallee()
          ->stripPointerCasts());

....and fills out the function body, etc.

When the call is constructed, the final type of the expression in cache is the same as the parameters for the IR function in auto instantiationFn and it all works.

The second time through emitMetadataAccessByMangledName, this line sets up the expression in cache...

  auto cache = IGM.getAddrOfTypeMetadataDemanglingCacheVariable(type,
                                                                ConstantInit());

...which resolves to { i32, i32 }.

Because cast<llvm::GlobalVariable>(cache->stripPointerCasts())->isDeclaration() resolves to false this time, the next block is skipped and cache is left resolving to { i32, i32 }. But getOrInsertFunction retrieves the function it created on the first pass, which takes a parameter of { i16, i32 }. When llvm is called to emit this function call, the function's formal parameters and the actual parameters being passed do not match, so llvm traps and the compiler crashes.

I don't really understand enough of what's going on here to see the perfect fix for this buggy behaviour.

Two patches that would probably work are...

  1. changing LinkEntity::getDefaultDeclarationType to return llvm::StructType::get(IGM.Int32Ty, IGM.Int32Ty) for LinkEntity::Kind::TypeMetadataDemanglingCacheVariable (but corrected for endianness).
  2. hacking out the condition in `` that is skipped on the second pass, i.e. make
    if (cast<llvm::GlobalVariable>(cache->stripPointerCasts())->isDeclaration()) always true and always execute the block below it. But that feels like it introduces an inefficiency.

The think I'm struggling a bit to understand is, does this represent a larger bug?

Does it matter that...

  if (cast<llvm::GlobalVariable>(cache->stripPointerCasts())->isDeclaration()) {
    ConstantInitBuilder builder(IGM);
    auto structBuilder = builder.beginStruct();
    
    // A "negative" 64-bit value in the cache indicates the uninitialized state.
    // Which word has that bit in the {i32, i32} layout depends on endianness.

    if (IGM.getModule()->getDataLayout().isBigEndian()) {
      structBuilder.addInt32(-mangledStringSize);
      structBuilder.addRelativeAddress(mangledString);
    } else {
      structBuilder.addRelativeAddress(mangledString);
      structBuilder.addInt32(-mangledStringSize);
    }

    auto init = structBuilder.finishAndCreateFuture();
    cache = IGM.getAddrOfTypeMetadataDemanglingCacheVariable(type, init);
  }

Is only ever run on first pass and somehow the cache variable in GlobalVars isn't updated? Should getAddrOfTypeMetadataDemanglingCacheVariable do something like return llvm::Constant & instead of llvm::Constant *, so that when the cache is updated, the version in GlobalVars is corrected?

To the extent type metadata makes any sense to have at all on AVR, you can change the type of the cache variable to whatever's correct for the platform. It's set up so that the contents of the cache can be loaded with a single 64-bit atomic operation on multicore CPUs, and so the value looks like a negative number when the cache is uninitialized, but a positive one once initialized. As such, it's really a union { struct { int negative_size; int offset_of_string }; const Metadata *metadata; }. If you don't want to bother coming up with something that makes sense for AVR here, you could set DisableConcreteTypeMetadataMangledNameAccessors to true in IRGenOptions when targeting AVR.

I would love to turn off type metadata on AVR. It is bulky and causes me lots of problems. I’m never likely to allow reflection on the platform. The overhead is too great! :slight_smile: