Why don't string views (`UTF8View`, etc.) conform to `Codable`, `Hashable`, etc.?

Regarding String views/Substrings and Codable specifically, I suspect some amount of discussion and exploration will need to happen around deciding what behavior is expected regarding their encoding format — specifically, if there's a mismatch between the view type and the expected text encoding of the Codable format itself. For example, some formats might be constrained to UTF-8, so, e.g., String.UTF16View might need some indirect representation.

  • The naive implementation for these views (encoding into an UnkeyedEncodingContainer as integers) would take care of this, but some might be surprised to see a String.UTF8View encode as an array of integers in a UTF-8 compatible format (or a UTF16View in a UTF-16 compatible format).
    • It's possible to expect that some encoders would special-case certain views based on their known output format, but it's not something you can necessarily rely on
  • Another possibility is that views don't encode themselves directly, but pass off their underlying String to the encoder and allow the encoder to grab whatever representation it can best deal with (effectively transcoding). This could work, but there's also some potential for surprise (encoding a UTF-16 view but getting UTF-8 data out)
    • This could potentially be compatible with what Substring does — one "easy" Codable conformance for Substring and its views could be to create Strings out of them [potentially made more performant by wrapping up the Substring in some way, rather than allocating a new buffer] and encoding those
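
To make the two options above concrete, here is a minimal sketch. The wrapper types `CodeUnitsBox` and `StringBackedBox` and the helper `jsonText` are hypothetical names I made up for illustration (wrappers avoid adding conformances to the standard library types themselves), and JSONEncoder is only used as an example of a UTF-8 constrained format:

```swift
import Foundation

// Hypothetical wrapper for the "naive" option: each UTF-8 code unit is
// written into an unkeyed container as an integer.
struct CodeUnitsBox: Encodable {
    let view: String.UTF8View

    func encode(to encoder: Encoder) throws {
        var container = encoder.unkeyedContainer()
        for unit in view {
            try container.encode(unit)
        }
    }
}

// Hypothetical wrapper for the "pass off the String" option: the view is
// turned back into a String and the encoder picks its own representation.
struct StringBackedBox: Encodable {
    let view: String.UTF16View

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(String(view))
    }
}

// Hypothetical helper: encodes a value as JSON and returns the text.
func jsonText<T: Encodable>(_ value: T) throws -> String {
    let data = try JSONEncoder().encode(value)
    return String(decoding: data, as: UTF8.self)
}
```

With this sketch, `jsonText(CodeUnitsBox(view: "hé".utf8))` produces an array of three integers, while `jsonText([StringBackedBox(view: "hé".utf16)])` produces an array holding an ordinary JSON string, even though a UTF-16 view went in. That is exactly the kind of surprise each option carries.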

There's also a bit of a philosophical question regarding decoding: it might feel a little funny to request decoding a Substring or a view ("a substring of what? who owns the original?"). The way String/Substring/views are written, this isn't actually a problem at all, but there may still be a question of what would be considered semantically meaningful. (In terms of implementation, at least, it's pretty trivial to have these types decode a String and initialize from that String directly.)
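
As a rough illustration of that last point, decoding could look something like the sketch below. `DecodedSubstring` is a hypothetical wrapper name (not a standard library type), and the resulting Substring simply spans the whole decoded String, which it also keeps alive:

```swift
import Foundation

// Hypothetical wrapper: decodes a String, then initializes a Substring
// covering all of it. The Substring owns its base String, so there is no
// dangling "original" to worry about.
struct DecodedSubstring: Decodable {
    let value: Substring

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        let whole = try container.decode(String.self)
        value = Substring(whole)
    }
}
```

Whether "a Substring of the entire decoded String" is the semantically meaningful answer is the open question; the sketch only shows that the mechanics are simple.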

Either way, I think it's worth considering and seeing what feels right!
