Canonical equivalence for mangled symbol names

the more i work with SymbolGraphGen, the more mangled-name weirdness i run into.

recently, i discovered that SymbolGraphGen emits symbol descriptors with mangled names that omit the extension context, but it emits references to those symbols with mangled names that include the extension context.

for example, SymbolGraphGen emits a page for RandomNumberGenerator.next() under the identifier sSG4nexts6UInt64VyF, but it refers to that symbol by the identifier sSGsE4nexts6UInt64VyF, which matches nothing when identifiers are looked up by byte comparison.

when i manually demangle sSGsE4nexts6UInt64VyF, i get

(extension in Swift):Swift.RandomNumberGenerator.next() -> Swift.UInt64

symbolgraph tooling really ought to treat sSGsE4nexts6UInt64VyF the same as sSG4nexts6UInt64VyF, since the "sE" infix is redundant (the extension’s module and the extended type’s module are both Swift). so we really should be canonicalizing the first form to the second.
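to make the intended rewrite concrete, here is a minimal sketch of the narrowest version i can think of. it is a string heuristic, not a mangled-name parser: it assumes the identifier starts with the "s" prefix followed by a two-character standard substitution (like SG above), and only then strips an immediately following "sE". the function name canonicalizeStdlibExtension is hypothetical, and anything with nested contexts, longer substitutions, or a non-Swift extension module is left untouched.

```swift
/// Heuristic sketch only, NOT a real mangled-name parser.
/// Assumed shape: "s" prefix, then "S" plus one ASCII letter (a two-character
/// standard substitution such as "SG"), then the "sE" infix.
/// Names that do not match this exact shape are returned unchanged.
func canonicalizeStdlibExtension(_ name: String) -> String {
    let chars = Array(name)
    guard chars.count > 5,
          chars[0] == "s",                        // mangling prefix
          chars[1] == "S",                        // start of a standard substitution
          chars[2].isASCII, chars[2].isLetter,    // e.g. "G" in "SG"
          chars[3] == "s", chars[4] == "E"        // extension in module Swift
    else { return name }
    // Keep "sS?" and everything after the "sE" infix.
    return String(chars[0..<3] + chars[5...])
}

// Lookup then goes through the canonical form on both sides.
let pages = ["sSG4nexts6UInt64VyF": "RandomNumberGenerator.next()"]
let reference = "sSGsE4nexts6UInt64VyF"
let resolved = pages[canonicalizeStdlibExtension(reference)]
```

with this, the reference above resolves to the page keyed by sSG4nexts6UInt64VyF, but it says nothing about the general case, which is the part that actually needs a parser.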

but since i don’t have a mangled name parser implementation available, i’m at a loss for how to actually implement this kind of normalization. any ideas?
