[Accepted with modifications] SE-0464: Safe UTF-8 processing over contiguous bytes

Hello, Swift community!

The review of SE-0464: Safe UTF-8 processing over contiguous bytes ended on April 7.

Feedback on this proposal was positive overall, with most of the discussion centered on the shapes and names of the proposed APIs. The Language Steering Group agrees with the overall direction of this proposal and extended the review period a couple of times to give the community sufficient opportunity to discuss and converge on the specifics of those APIs. We accept this proposal with the following modifications (relative to the original proposal text at the start of the review):

  • UTF8Span adds an @unsafe initializer to create an instance from a Span of UTF-8 code units that is externally known to be valid, with the option to pre-set the "known ASCII" flag to optimize that common case:

    struct UTF8Span {
      @unsafe public init(
        unchecked codeUnits: Span<UInt8>,
        isKnownASCII: Bool = false
      )
    }
    
  • Similarly, the unchecked reset methods in the iterator types are renamed as follows:

    extension UTF8Span {
      public struct UnicodeScalarIterator {
        public mutating func reset(toUnchecked offset: Int)
    
        // other members unchanged
      }
    
      public struct CharacterIterator {
        public mutating func reset(toUnchecked offset: Int)
    
        // other members unchanged
      }
    }
    
  • String adds an initializer to create a copy of the contents of a UTF8Span:

    extension String {
      public init(copying codeUnits: UTF8Span)
    }
    
  • UTF8Span.isCanonicallyLessThan(_:) is renamed UTF8Span.canonicallyPrecedes(_:).

  • The error type Unicode.UTF8.EncodingError is renamed Unicode.UTF8.ValidationError.

  • The range property of the aforementioned error type is renamed byteOffsets.

  • The pattern match operator UTF8Span.~= is removed.
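
As a rough illustration of how a few of the revised names fit together, here is a minimal sketch. It assumes a Swift 6.2 or later toolchain whose standard library ships UTF8Span, plus the checked `UTF8Span(validating:)` initializer from the proposal and `Array.span`. The helper names `decodeOrDescribe` and `asciiCopy` are hypothetical and not part of the proposal.

```swift
// Sketch only: assumes a Swift 6.2+ standard library that includes UTF8Span.
// On Apple platforms this also assumes a deployment target that provides it.

/// Hypothetical helper: validates `bytes` as UTF-8 and returns a String copy,
/// or a short description of where validation failed.
func decodeOrDescribe(_ bytes: [UInt8]) -> String {
    do throws(Unicode.UTF8.ValidationError) {
        // The checked initializer validates the code units up front.
        let utf8 = try UTF8Span(validating: bytes.span)
        // String(copying:) is the initializer added by this acceptance.
        return String(copying: utf8)
    } catch {
        // `byteOffsets` is the new name for the error's former `range` property.
        return "invalid UTF-8 at byte offsets \(error.byteOffsets)"
    }
}

/// Hypothetical helper: skips UTF-8 validation for bytes that the caller has
/// already checked to be ASCII, using the new @unsafe initializer.
func asciiCopy(_ bytes: [UInt8]) -> String? {
    // ASCII bytes are always valid UTF-8, so the unchecked path is sound here.
    guard bytes.allSatisfy({ $0 < 0x80 }) else { return nil }
    let utf8 = unsafe UTF8Span(unchecked: bytes.span, isKnownASCII: true)
    return String(copying: utf8)
}
```

With these definitions, `decodeOrDescribe([0x68, 0x69])` should produce `"hi"`, while an input containing a stray `0xFF` byte takes the `catch` path and reports the offending byte offsets. The unchecked initializer is only appropriate when validity has been established by some other means, as the guard in `asciiCopy` does.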

Thanks to everybody who participated in the review, and thanks to the proposal authors for their patience during the extended review as we all worked to converge on the best possible APIs!

—Tony Allevato
Review manager
