ASCIIString


(Jacob Bandes-Storch) #1

StaticString provides an "isASCII" boolean property, but manipulating
strings still requires the use of UnicodeScalarView / CharacterView, even
if the strings are statically known to be ASCII-only.

I think it would be nice to have an ASCIIString in the standard library,
similar to StaticString but with the following improvements:

- ASCIIString itself would be MutableCollectionType, with Index == Int for
easy access.

- Its Generator.Element would be something which works with simple + and -
operators (either UInt8, or perhaps a repurposed UnicodeScalar, or a new
ASCIIScalar).

- The ability to create new ASCIIStrings at runtime, by appending/removing
bytes, or by concatenating other ASCIIStrings.
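
As a rough illustration of the three points above, here is a minimal sketch of what such a type might look like. It is written with current Swift names (MutableCollection rather than MutableCollectionType), it picks UInt8 as the element type, and every name in it (ASCIIString, append, stringValue) is hypothetical; nothing like this exists in the standard library.

```swift
// Hypothetical sketch of the proposed type; not a standard library API.
struct ASCIIString: MutableCollection, RandomAccessCollection, Equatable {
    // One byte per character; every stored byte is below 0x80.
    private var bytes: [UInt8]

    // Fails (returns nil) if the input contains any non-ASCII code unit.
    init?(_ string: String) {
        var buffer: [UInt8] = []
        for byte in string.utf8 {
            guard byte < 0x80 else { return nil }
            buffer.append(byte)
        }
        bytes = buffer
    }

    // Index == Int, as suggested in the proposal.
    var startIndex: Int { return 0 }
    var endIndex: Int { return bytes.count }
    func index(after i: Int) -> Int { return i + 1 }
    func index(before i: Int) -> Int { return i - 1 }

    subscript(position: Int) -> UInt8 {
        get { return bytes[position] }
        set {
            precondition(newValue < 0x80, "not an ASCII byte")
            bytes[position] = newValue
        }
    }

    // Runtime mutation: append a single ASCII byte.
    mutating func append(_ byte: UInt8) {
        precondition(byte < 0x80, "not an ASCII byte")
        bytes.append(byte)
    }

    // Concatenation of two ASCIIStrings.
    static func + (lhs: ASCIIString, rhs: ASCIIString) -> ASCIIString {
        var result = lhs
        result.bytes.append(contentsOf: rhs.bytes)
        return result
    }

    // ASCII is a subset of UTF-8, so the bytes decode directly.
    var stringValue: String { return String(decoding: bytes, as: UTF8.self) }
}

var greeting = ASCIIString("hello")!
greeting[0] = greeting[0] - 32          // plain arithmetic on the element: 'h' becomes 'H'
greeting.append(UInt8(ascii: "!"))
let joined = ASCIIString("ab")! + ASCIIString("cd")!
```

The preconditions in the setter and in append are one way to address the concern about the type being used for arbitrary bytes; a dedicated ASCIIScalar element type would move that check into the type system instead.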

Would anyone else find this useful?

Jacob Bandes-Storch


(Austin Zheng) #2

Pragmatically, I would love this. I've run into use cases before where a sequence of ASCII characters would have sufficed, and the additional complexity of the Unicode model is unnecessary and undesired.

On a philosophical level, I think there's a discussion to be had as to whether the language/stdlib is the right place to delineate subsets of Unicode for programmer use, and whether or not having ASCIIString will encourage lazy programmers to avoid Unicode support altogether and/or 'misuse' this type as a means of storing raw bytes.

Personally, I'm +1 but I think we should carefully consider the ramifications.

Best,
Austin

···

On Dec 13, 2015, at 1:47 PM, Jacob Bandes-Storch via swift-evolution <swift-evolution@swift.org> wrote:

StaticString provides an "isASCII" boolean property, but manipulating strings still requires the use of UnicodeScalarView / CharacterView, even if the strings are statically known to be ASCII-only.

I think it would be nice to have an ASCIIString in the standard library, similar to StaticString but with the following improvements:

- ASCIIString itself would be MutableCollectionType, with Index == Int for easy access.

- Its Generator.Element would be something which works with simple + and - operators (either UInt8, or perhaps a repurposed UnicodeScalar, or a new ASCIIScalar).

- The ability to create new ASCIIStrings at runtime, by appending/removing bytes, or by concatenating other ASCIIStrings.

Would anyone else find this useful?

Jacob Bandes-Storch
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Paul Cantrell) #3

I hit a related problem here:

https://github.com/bustoutsolutions/siesta/blob/master/Source/Resource.swift#L427

I know I have an ASCII-only string because I’ve just applied escaping, but still have to force-unwrap the result of dataUsingEncoding(NSASCIIStringEncoding). Annoying!

However, I’m not sure that a new type would be a good way to solve this. A string may have many attributes that make it suitable or unsuitable for use in different situations: single line, valid identifier in a given language, no surrogate chars, no spaces … plus, of course, “valid in encoding X” for every encoding.

Handling these cases via the type system would result in a combinatorial explosion of API methods like dataUsingEncoding.

IMO, this is a perfect example of where force unwrapping is the right tool: I’m able to make guarantees about the correctness of the code that the compiler can’t verify. I’m OK with that.
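
To make the pattern concrete, here is a small sketch in current Swift, where dataUsingEncoding(NSASCIIStringEncoding) is spelled data(using: .ascii). The input string and the choice of .urlQueryAllowed are assumptions for illustration and are not taken from the linked Siesta code.

```swift
import Foundation

// Illustrative input containing a non-ASCII character.
let raw = "café au lait"

// Percent-escaping replaces every disallowed character with %XX sequences,
// so the result contains only ASCII characters.
let escaped = raw.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)!

// The compiler cannot know the string is ASCII-only, so the API still
// returns an optional; the force unwrap records the programmer's guarantee.
let data = escaped.data(using: .ascii)!
```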

• • •

If there’s a compelling use case for ASCIIString, it’s performance. In that case, I wonder whether:

• perhaps this would better be implemented as a set of extension methods on [UInt8] that do string-like things, instead of being a separate type, and

• perhaps String could — or already does — provide internal optimizations when the string is internally representable as single-byte chars.
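
The first alternative might look something like the sketch below, written in current Swift. The member names (isASCII, asciiUppercased, asciiString) are hypothetical and chosen only for illustration.

```swift
// Hypothetical string-like helpers on [UInt8]; no new type is introduced.
extension Array where Element == UInt8 {
    /// True when every byte is in the 7-bit ASCII range.
    var isASCII: Bool {
        return allSatisfy { $0 < 0x80 }
    }

    /// Uppercases the ASCII letters a-z; all other bytes are left unchanged.
    func asciiUppercased() -> [UInt8] {
        return map { byte in
            (byte >= 0x61 && byte <= 0x7A) ? byte - 0x20 : byte
        }
    }

    /// Decodes the bytes as text, or returns nil if any byte is not ASCII.
    var asciiString: String? {
        return isASCII ? String(decoding: self, as: UTF8.self) : nil
    }
}

let lower = Array("swift".utf8)
let upper = lower.asciiUppercased()
```

Unlike a dedicated type, this approach cannot guarantee that a given [UInt8] holds only ASCII, so the check moves to the point of use (here, asciiString returns an optional).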

Cheers, P
