Could `StringProtocol` require `Self.SubSequence == Substring` without breaking ABI?

It always boggles my mind that parser initializers like this don't Just Work:

init?(_ string: some StringProtocol)

Instead, you need to name the generic parameter and write out the constraints:

init?<StringType>(_ string: StringType)
    where StringType: StringProtocol, StringType.SubSequence == Substring

This has been noticed before. The explanation is that StringProtocol predates the ability to declare such constraints, which raises the question: could StringProtocol be retroactively given such a constraint?
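To illustrate why the extra constraint gets written at all, here is a minimal sketch (the consume helper and parse function are hypothetical, not from the original post): slices of the generic parameter need to be usable where a concrete Substring is expected.

// Hypothetical stand-in for any API that takes a concrete Substring.
func consume(_ token: Substring) {
    print("consumed:", token)
}

// Without `StringType.SubSequence == Substring`, a slice of `string` would
// only be known to be some StringProtocol-conforming SubSequence, and could
// not be passed to `consume` directly.
func parse<StringType>(_ string: StringType)
    where StringType: StringProtocol, StringType.SubSequence == Substring
{
    consume(string.prefix(while: { !$0.isWhitespace }))
}

parse("hello world")            // String: its SubSequence is Substring
parse("hello world".dropLast()) // Substring: its SubSequence is also Substring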

1 Like

No, sadly it cannot. Existing code needs to continue working. (Including code that somehow found a way to illegally conform its own types to StringProtocol.)

I believe StringProtocol is very likely to be a lost cause. The right move is to gradually phase it out and to replace it with a properly designed alternative. (The replacement isn't even necessarily going to be a protocol, at least not entirely: most of it can be about a universal, non-generic view over pieces of raw string memory, with the role of the protocol relegated to facilitating conversion.)

(I expect this potential future replacement will need to embrace the full extent of the ownership-control features that are being designed and implemented today -- e.g., it will need to cover various noncopyable string types. Accordingly, while prototyping can begin even now, the eventual design will need to build on language/stdlib concepts, constructs, and conventions that do not exist yet -- it's too early to try to make concrete pitches.)

5 Likes

Expanding a little bit on why we can't change this: even if we believed that no valid code written against the existing StringProtocol would fail to work with the new constraint (so that this wasn't a source-breaking change), adding new constraints like this changes constraint minimization, which changes symbol mangling, which turns it into an ABI-breaking change.

We've looked at other similar missing constraints in the past and concluded that they simply couldn't be added, even though we believed that there were no conforming types in the wild that wouldn't satisfy the added constraint, because of the resulting changes to constraint minimization and symbol mangling.

The simplest such example I know of off the top of my head is that we'd really like to have required that Magnitude == Self for UnsignedInteger; adding that constraint probably wouldn't break any existing UnsignedInteger types, but it would change the mangling of any generic code written against FixedWidthInteger & UnsignedInteger (to T: FixedWidthInteger where T == T.Magnitude). There's a lot of code written against such a constraint, and we can't break its ABI.
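As a concrete sketch of the kind of code affected (the function below is hypothetical, not from the standard library): any public generic function constrained to FixedWidthInteger & UnsignedInteger carries that minimized signature in its mangled symbol, so re-minimizing the signature would silently rename the symbol.

// Hypothetical library function; its mangled name encodes the minimized
// generic signature <T: FixedWidthInteger & UnsignedInteger>.
public func trailingZeroCount<T: FixedWidthInteger & UnsignedInteger>(_ value: T) -> Int {
    return value.trailingZeroBitCount
}

// If UnsignedInteger gained the requirement Magnitude == Self, the signature
// above would minimize to <T: FixedWidthInteger where T == T.Magnitude>,
// changing the mangled symbol and breaking already-compiled callers.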

8 Likes

In particular, the mangling of @taylorswift's original declaration would change:

init?<StringType>(_ string: StringType)
    where StringType: StringProtocol, StringType.SubSequence == Substring

The same-type requirement becomes redundant, so it's no longer part of the generic signature of this init.
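For concreteness, here is a sketch (with a hypothetical Document type) of what that redundancy would do to the declaration:

// Hypothetical type, for illustration only.
struct Document {
    let text: String

    // Today this init's generic signature is
    //   <StringType where StringType: StringProtocol,
    //                     StringType.SubSequence == Substring>
    // If StringProtocol itself required SubSequence == Substring, constraint
    // minimization would reduce that to
    //   <StringType where StringType: StringProtocol>
    // and, because the minimized signature is what gets mangled, the init's
    // symbol would change even though its source stayed the same.
    init?<StringType>(_ string: StringType)
        where StringType: StringProtocol, StringType.SubSequence == Substring
    {
        guard !string.isEmpty else { return nil }
        self.text = String(string)
    }
}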

2 Likes

I would think it is technically possible to have constraints which we tell constraint minimisation are never redundant. Or is there a reason that is fundamentally not possible?

At least this specific aspect of the problem seems like something that we could work around if there was sufficient motivation (and this likely isn’t enough).

Of course, it doesn’t help with the other aspect, which is the “what if existing code already violates this constraint” problem…

3 Likes