I agree with the proposed direction to add init?(validating:as:) to the standard library, and I think the rationale given passes the bar we've previously stated for additions to the standard library (namely: "It is possible to compose it using existing public API, but only at the cost of extra memory copies and allocations. The standard library is uniquely positioned to implement this functionality in a performant way."). I also agree that separate APIs for input validation (rather than validation-as-part-of-initialization), as well as normalization, are properly separate topics that need not be addressed in this proposal.
As to fit with the Swift standard library, however, I would echo @wadetregaskis's concern about adding additional overloads, particularly with respect to the convenience API init?(validatingAsUTF8:). If this API were a standalone proposal, I do not think it would pass muster on the justification that, because there are a lot of overloads and the most appropriate one is hard to find, adding this one makes it easier to pick out from the bunch. Indeed, it is adding to the very problem that it is purportedly trying to alleviate. It calls to mind that phrase from Virgil's Aeneid (XII:45): aegrescitque medendo ("and by the remedy he grows sicker").
I am also concerned that the distinction between the new init?(validatingAsUTF8:) and the now to-be-deprecated init?(validatingUTF8:) is entirely too subtle. The two APIs differ in their expectations regarding nul-termination, and leaving the word "as" to be the only user-visible mark distinguishing the old from the new does not strike me as a principled move here; instead, I'd think that the resulting set of String APIs would be more coherent if this proposal added only init?(validating:as:) APIs (including versions that take CChar) and renamed the existing init?(validatingUTF8:).
Regarding the overload that takes CChar elements: the rationale that there would be ambiguity if an overload that takes some Sequence<CChar> is used on platforms where CChar is aliased to UInt8 seems solvable to me in ways other than restricting to UnsafeBufferPointer<CChar>:
- Such an overload could be #if'd such that it is present only on platforms that alias CChar to Int8
- Such an overload could instead be written to take some Sequence<Int8> so that the same functionality is present on all platforms regardless of what CChar is aliased to
- Such an overload could be designated @_disfavoredOverload
It may still be the case that, of all these alternatives, restricting to UnsafeBufferPointer is the best, but the proposal does not make the case.
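To make the second alternative concrete, here is a minimal sketch of an overload pair where the signed variant takes some Sequence<Int8> rather than some Sequence<CChar>. The label sketchValidatingUTF8 is hypothetical and chosen only to avoid colliding with any real or proposed API; the validation strategy (decode with repair, then compare against the input) is likewise just an assumption made to keep the sketch self-contained, not how the proposal implements it.

```swift
extension String {
    // Hypothetical stand-in for a UTF-8 validating initializer.
    // Decoding with repair and comparing the result against the input
    // detects invalid UTF-8, because any repair changes the bytes.
    init?(sketchValidatingUTF8 codeUnits: some Sequence<UInt8>) {
        let bytes = Array(codeUnits)
        let decoded = String(decoding: bytes, as: UTF8.self)
        guard decoded.utf8.elementsEqual(bytes) else { return nil }
        self = decoded
    }

    // The Int8 variant: it has the same signature on every platform,
    // whether CChar is aliased to Int8 or to UInt8 there.
    init?(sketchValidatingUTF8 codeUnits: some Sequence<Int8>) {
        let bytes: [UInt8] = codeUnits.map { UInt8(bitPattern: $0) }
        self.init(sketchValidatingUTF8: bytes)
    }
}

let signed: [Int8] = [72, 105]                         // "Hi"
let greeting = String(sketchValidatingUTF8: signed)

let invalid: [Int8] = [-1]                             // bit pattern 0xFF, never valid UTF-8
let rejected = String(sketchValidatingUTF8: invalid)
```

Because the element types Int8 and UInt8 are always distinct, the two overloads never collapse into one another, which is the property the CChar-based spelling lacks on platforms where CChar is UInt8.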
Regarding the renamed init?(validatingCString:), a small nit and a suggestion:
The nit: the internal parameter name isn't part of the API, but many years ago I commented on the standard library spelling of nul to indicate the NUL character and was told that this was indeed the intended house style; it is inconsistent therefore that this proposal adopts the spelling nullTerminatedCodeUnits.
The suggestion: The renamed API nowhere tells us that it's validating the input as UTF-8: the proposed renaming adds the crucial detail about nul-termination but then loses the detail about encoding. The existing String API includes the property utf8CString (full disclosure: I was the one who wrote the proposal to rename nulTerminatedUTF8CString to its present name) and I think it would be cromulent to parallel that here: init?(validatingUTF8CString:).
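For reference, a small sketch of what the existing utf8CString property produces, which is the shape of input the renamed initializer would consume. Only existing standard library API is used here; the variable names are illustrative.

```swift
// utf8CString yields the UTF-8 code units as CChar, followed by a nul terminator.
let cString = "Hi".utf8CString            // ContiguousArray<CChar>
print(Array(cString))                     // [72, 105, 0]

// Round-trip through the existing (repairing, non-validating) initializer.
let roundTripped = cString.withUnsafeBufferPointer { buffer in
    String(cString: buffer.baseAddress!)
}
```

The property name carries both facts (UTF-8 and C string), which is the pairing a spelling like init?(validatingUTF8CString:) would mirror.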