SE-0241: Explicit Encoded Offsets for String Indices

The original proposal sought to solve 3 problems:

  1. SE-0180’s encodedOffset, meant for serialization purposes, needs to be parameterized over the encoding in which the string will be serialized
  2. Existing uses of encodedOffset need a semantics-preserving off-ramp for Swift 5, which is expressed in terms of UTF-16 offsets
  3. Existing misuses of encodedOffset, which assume all characters are a single UTF-16 code unit, need a semantics-fixing alternative
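To make problem 3 concrete, here is a minimal sketch of why a UTF-16 offset cannot be treated as a character offset. The sample string and variable names are illustrative assumptions, not taken from the proposal; the sketch uses only the standard String and UTF16View APIs rather than encodedOffset itself.

```swift
// Illustrative sample: "a", an emoji outside the Basic Multilingual Plane, then "b".
let sample = "a\u{1F600}b"

// Character view: three user-perceived characters.
let characterCount = sample.count

// UTF-16 view: the emoji is encoded as a surrogate pair, so it takes two code units.
let utf16Count = sample.utf16.count

// Position of "b", measured first in Characters and then in UTF-16 code units.
let bIndex = sample.firstIndex(of: "b")!
let characterOffset = sample.distance(from: sample.startIndex, to: bIndex)
let utf16Offset = sample.utf16.distance(from: sample.utf16.startIndex, to: bIndex)

print(characterCount, utf16Count)    // 3 4
print(characterOffset, utf16Offset)  // 2 3
```

Code that assumes the two offsets are interchangeable works for ASCII-only strings and silently breaks as soon as a character needs more than one UTF-16 code unit.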

After discussing with the core team, the hard reality of the Swift 5.0 release is that it is simply too late to be adding a slew of new API. At this point, we can only apply urgent fixes, and only problem #2 qualifies.

However, this review thread has been very valuable and I would like to continue the discussion of how best to address problems #1 and #3, fixes for which will unfortunately have to arrive after Swift 5.0.

I spun off another thread for further discussion, and will make a “Public Service Announcement”-like post highlighting the issues and various mitigation strategies for current uses. This proposal is being gutted to a minimal, semantics-preserving off-ramp for current uses of encodedOffset.
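As a rough sketch of what such an off-ramp looks like in use: this assumes the UTF-16-based API names that, to my knowledge, are available on String.Index in Swift 5, namely utf16Offset(in:) and init(utf16Offset:in:). This post does not spell those names out, so treat them as an assumption; the sample string is also purely illustrative.

```swift
// Illustrative sample: "a", an emoji encoded as a UTF-16 surrogate pair, then "b".
let message = "a\u{1F600}b"
let index = message.firstIndex(of: "b")!

// Index -> explicit UTF-16 offset (assumed Swift 5 API name).
let offset = index.utf16Offset(in: message)

// UTF-16 offset -> Index (assumed Swift 5 API name).
let roundTripped = String.Index(utf16Offset: offset, in: message)

print(offset)                 // 3
print(message[roundTripped])  // b
```

The point of the explicit spelling is that the call site says which encoding the offset is measured in, instead of relying on an unstated assumption about the string's storage.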
