On Windows, we need to emit symbols which may be multiply defined in COMDATs to ensure that the symbol is uniqued by the linker. This is normally fine, and in a couple of sites where we generate declarations only, we remove the COMDAT associated with the declaration. However, when generating the type metadata, we are failing to remove the COMDAT associated with a declaration only which results in invalid IR.
Now, the interesting thing is that we generate the symbol for the type metadata multiple times, but, because the type metadata may be emitted lazily, we don't necessarily know if we are going to actually emit the symbol. It may just be a nominal type which is going to be lazy emitted or it could be eagerly emitted. I can’t seem to identify which case we are currently in. Erasing the COMDAT eagerly fails as the symbol is not emitted, and leaving the COMDAT means that we end up with cases of declarations only getting marked as COMDAT, which is invalid IR. Erasing the COMDAT at the time of the construction of the variable declaration doesn’t really work very well due to the definition being attached later.
One idea that I had (but failed) was to eagerly erase the COMDAT, and associate the COMDAT by re-applying the IR Linkage at the lazy definition time. However, that seems to abort during the standard library build (I’ve not thoroughly analyzed why).
This results in the multithreaded builds failing (which I had seen once but had disappeared). Now that this is reproducible, it would be great to fix this. I am rather perplexed with this at this point and would appreciate some pointers.
Note that getAddrOfLLVMVariable already does this, by calling updateLinkageForDefinition if there's an existing forward declaration that we're now providing a definition for. It sounds like the fundamental issue is that we're either calling into getAddrOfLLVMVariable with the wrong ForDefinition argument, or else going behind getAddrOfLLVMVariable's back in some place and trying to manually create declarations incorrectly.
Are multithreaded IRGen builds doing lazy emission for non-foreign type metadata? If so, it's possible that that we aren't really passing down ForDefinition right, but that it only blows up for COFF because the COMDAT assertions are stronger than the assertions around linkage/visibility that we have to do for non-COFF targets.
I think what I saw was the multithreaded IRGen trying to do lazy emission for non-foreign type metadata. It really is too bad that we are using gold for ELF, since we could actually use COMDATs there as well (gold just has a bug, not that it doesn't support it). Is there a good way to track down where we are going wrong?
It's really a philosophy thing. You've identified a call site; ask yourself if what that call site is doing makes sense, or if you need to be propagating more information down to the callee so that it can make the right adjustments. I don't remember in particular how the laziness is managed for lazy type metadata, but compare it with how it's being managed for foreign type metadata.
So, looking a bit further into this, I think that my initial approach may be the better solution here. Whenever we create a type symbol, we erase the COMDAT from it. However, when the metadata is then re-materialised, we re-apply the IR linkage to the symbol reconstructing the COMDAT group for the symbol. This seems to be causing a problem with linking on Windows though, I suspect due to the alias that is constructed which is WeakODR'ed.
Hmm. It looks like the pattern of calls we use for type metadata is somewhat different from the pattern we use for other kinds of declaration — we have IGM::getAddrOfTypeMetadata for emitting a reference and IGM::defineTypeMetadata for emitting a definition. Still, that should be good enough to either set or not set a COMDAT based on whether we're emitting a definition.
Ugh, I think that's what I wasn't thinking about properly. I was using the fact that we split that up that way to determine what to do, that is, I was always erasing the COMDAT association in getAddrOfTypeMetadata since that is the reference, but then re-applying the IRLinkage in defineTypeMetadata. However, we already know that defineTypeMetadata should only occur for "strong" definitions of the symbol, and thus, it should never need to be COMDATed. Maybe that will fix the issue for PE/COFF (where the linker fails to find a definition which is preserved).
$constant = comdat any
@constant = linkonce_odr hidden constant i32 4
@alias = weak_odr hidden alias i32, i32* @constant
Emitting an object file here and looking at the symbol table via link indicates the following:
(link -dump -symbols reduced.obj)
00B 00000000 SECT4 notype External | constant
00C 00000000 UNDEF notype WeakExternal | alias
Default index E Library search
00E 00000000 SECT4 notype External | .weak.alias.default
This is interesting because the auxiliary symbol for alias indicates that the characteristics is set to IMAGE_WEAK_EXTERN_SEARCH_LIBRARY rather than IMAGE_WEAK_EXTERN_SEARCH_ALIAS.