String without characters in set

Does anyone else find themselves writing the following extension on String?

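import Foundation
extension String {
    // Returns a copy with every Unicode scalar contained in `characterSet` removed.
    func withoutCharacters(in characterSet: CharacterSet) -> String {
        guard !isEmpty else { return self }
        return String(unicodeScalars.filter { !characterSet.contains($0) })
    }
}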

I'm struggling to find the definition for the bar that a proposal must vault itself over in order to get a change into the Standard Library. Perhaps, when it is found, it could be pinned to this Topic.

With this in mind, I'm quite embarrassed to dare publicly wonder if this small extension could possibly be included in the Standard Library, given this intimidating and mysterious high bar. I don't want to waste anyone's time and I really tried to RTFM before posting (maybe my search terms were wrong?).

I would say this extension should not be added because it cannot be added to NSString/Foundation without Apple's internal approval. Specifically, the Standard Library's String type seems shackled to some corresponding NSString API, and the inertia of proposing a change to both the open Standard Library and the closed Foundation framework is too great for such a small change.

On the other hand, having a useful string processing API guaranteed by someone who isn't the developer, reducing redundancy across codebases, reducing the possibilities of extension collisions, a smaller scope for bugs in a code base and (arguably) performance benefits seem good reasons to include this in the Standard Library. Consider the leading answer to this SO question.

Thoughts?

Kiel

EDIT: The use case I was dealing with today where I needed this functionality was extracting a telephone number from a string entered by the user.
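A minimal sketch of that telephone-number use case, repeating the extension from the top of the post so the snippet compiles on its own. The sample input and the choice to keep only ASCII digits are assumptions made for illustration; a real phone-number parser needs far more than this.

```swift
import Foundation

extension String {
    // Same extension as in the original post.
    func withoutCharacters(in characterSet: CharacterSet) -> String {
        guard !isEmpty else { return self }
        return String(unicodeScalars.filter { !characterSet.contains($0) })
    }
}

// Assumption: only the ASCII digits 0-9 are wanted in the result.
let asciiDigits = CharacterSet(charactersIn: "0123456789")
let userInput = "Call me on +1 (555) 010-9999 please"

// Remove every scalar that is NOT an ASCII digit.
let phoneDigits = userInput.withoutCharacters(in: asciiDigits.inverted)
print(phoneDigits) // 15550109999
```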


Can’t you just use filter with the various character properties available on Character? These are meant to be substitutes for CharacterSet which should really be deprecated anyway. If you want to get rid of whitespace you could just do

String(string.filter{ !$0.isWhitespace })
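As a rough illustration of the same idea applied to the phone-number case, the sketch below uses only the Swift 5 Character properties and needs no Foundation import. The sample strings are made up, and the `isASCII` check is an added assumption, since `isNumber` on its own also matches non-ASCII numerals.

```swift
// Requires Swift 5 for the Character properties (isWhitespace, isNumber, isASCII).
let spaced = "a b\tc\n"
let compact = spaced.filter { !$0.isWhitespace }
print(compact) // abc

// Assumption: only ASCII digits should survive.
let entered = "+44 20 7946 0958"
let digitsOnly = entered.filter { $0.isASCII && $0.isNumber }
print(digitsOnly) // 442079460958
```

`String.filter` already returns a `String`, so the extra `String(...)` wrapper is harmless but not required.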

I would say this is just the tip of a very large String API iceberg. Some of it will be helped in Swift 5 by the addition of the Unicode scalar properties and Character properties APIs, but those proposals still don't address the most common needs for higher level String APIs, filtering being just one.

I would imagine your suggestion could be a small part of a much larger String API overhaul to provide things like validation and formatting natively on String.

Unfortunately, for human text like phone numbers, we need a whole separate API. I like PhoneNumberKit but this should really be handled by Apple. I know they have equivalent functionality, they've just never exposed it. One amongst many missing parts in their text handling and formatting APIs.

That API only exists in Swift 5, so it's not quite useful yet.

yeah but it sounds like he’s trying to pitch something for evolution, which would be Swift 6 at best, so

This is somewhat different than the proposed API, as the proposed API allows filtering any CharacterSet, not just preset properties like isWhitespace. Hopefully isWhitespace matches CharacterSet.whitespaces, otherwise this will be very awkward. It does seem like there should be a native Swift replacement type vended by the standard library though.

Not sure I buy the argument that we shouldn't have some of the higher level API sooner because sometime in the undefined future there'll be better and alternative ways of doing things. It could always be true and we'd never have these things.

Perhaps it's closer than I think, so maybe it is wiser to wait and minimise the factors the overhaul needs to consider. Given what you've suggested, I'm thinking functions like components(separatedBy:), trimmingCharacters(in:), rangeOfCharacter(from:) etc. will be affected. On the other hand, the overhaul might simply change the implementations of these and add some useful higher level API, in which case developers would have missed out on the advantages of having these small parts of the iceberg sooner.

Yes, this might just make the extension so trivial you don't even need it.

This would be out of place on string because CharacterSet isn't a Set of Characters, hence the need to drop down to the Unicode scalars. I think we still await a suitable CharacterSet replacement in Swift, though character properties handle some (most?) of the use cases, as @taylorswift mentions.

Edit: Speaking of, CharacterSet is definitely one of those types that should have stayed as NSCharacterSet instead of losing the prefix, because the name is so misleading.
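To make the scalar-versus-Character point concrete, here is a small sketch. The sample word is an assumption chosen to show the difference: it spells "café" with a plain "e" followed by a combining acute accent, so the last Character is made of two Unicode scalars.

```swift
import Foundation

let word = "cafe\u{301}"   // "e" + U+0301 COMBINING ACUTE ACCENT
let onlyE = CharacterSet(charactersIn: "e")

// Scalar-level filtering, as in the extension at the top of the thread:
// the "e" scalar is removed and the accent is left behind, attaching to "f".
let scalarFiltered = String(word.unicodeScalars.filter { !onlyE.contains($0) })

// Character-level filtering: "e" + accent is one Character and is not equal to "e",
// so nothing is removed.
let characterFiltered = word.filter { $0 != "e" }

print(scalarFiltered.unicodeScalars.count) // 4
print(scalarFiltered.count)                // 3
print(characterFiltered.count)             // 4
```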


That wasn't my argument at all. I was just saying that, for a real proposal, you'd want to pick some of the most common string manipulations that aren't currently in Swift and propose them all at the same time. A proposal for a single method, as useful as it is, seems unlikely to be worth the effort. But go ahead. It would be interesting to see what the response would be.

I’m deeply sorry to have misunderstood you.

I doubt I’ll make a proposal. Swift 5 makes this so easy it doesn’t seem worth it.

tbh what I’ve found is the new Character properties in 5, combined with String’s restored Collection conformance does basically 99% of the string operations i want. What’s sorely missing though is string formatting tools, like to center text, pad left/right, and print common types like Int and Double with an arbitrary number of leading 0s or digits of precision after the decimal point


I won't argue for the ergonomics, but doesn't String(format: ...) handle most of that?

that is a foundation method, not a standard library method
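For reference, a short sketch of what the Foundation route looks like for the leading-zero and decimal-precision cases mentioned above. It only works where Foundation is available, and the values are arbitrary examples.

```swift
import Foundation

// printf-style format specifiers, provided by Foundation rather than the standard library.
let zeroPadded = String(format: "%05d", 42)         // width 5, padded with leading zeros
let twoDecimals = String(format: "%.2f", 3.14159)   // two digits after the decimal point

print(zeroPadded)  // 00042
print(twoDecimals) // 3.14
```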


I did not realize that. Thanks.

Centered in what? And what happens if I set a left-aligned UILabel’s text value to be equal to a centered string?

I agree about the leading/trailing zeros, though. A way to specify that you want to stringify a number as hex, octal, or binary would be great, too.

Like padding left or right, I assume this would just be a center version. e.g. "a".pad(.center, filler: "0", length: 3) would return "0a0". Of course, many of these operations are similar to operations on collections in general, so just getting more algorithms into the standard library would help.
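One possible shape for such a method, sketched with the standard library only. The `PadSide` enum and the `pad(_:filler:length:)` signature are hypothetical and exist in neither the standard library nor Foundation; the rule that an odd leftover goes after the text is likewise an arbitrary choice.

```swift
// Hypothetical API for illustration only.
enum PadSide { case left, right, center }

extension String {
    // Pads to `length` Characters with `filler`.
    // .left puts the filler before the text, .right after it, .center on both sides.
    func pad(_ side: PadSide, filler: Character, length: Int) -> String {
        let missing = length - count
        guard missing > 0 else { return self }  // already long enough
        switch side {
        case .left:
            return String(repeating: filler, count: missing) + self
        case .right:
            return self + String(repeating: filler, count: missing)
        case .center:
            let before = missing / 2            // any odd leftover goes after the text
            let head = String(repeating: filler, count: before)
            let tail = String(repeating: filler, count: missing - before)
            return head + self + tail
        }
    }
}

print("a".pad(.center, filler: "0", length: 3)) // 0a0
print("42".pad(.left, filler: "0", length: 5))  // 00042
```

Counting in Characters keeps the result stable for text with combining marks, though it says nothing about how wide the string renders on screen.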


Ah, ok. I see what you mean. I thought @taylorswift was talking about having print("Hello, World!".centered) print the output centered in the terminal or something.

Is anyone still interested in exploring more robust string processing APIs?

If so, maybe we can start a thread to determine specifically what people need/want.

Sure, that'd be interesting (though I guess it'd be mainly about padding).


Maybe I'm missing something, but String(_:radix:) has been in the stdlib as long as I can remember:

let x = 12648430
let hex = String(x, radix: 16)
let oct = String(x, radix: 8)
let bin = String(x, radix: 2)
print(hex, oct, bin) // c0ffee 60177756 110000001111111111101110
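
Combining this initializer with `String(repeating:count:)` also covers the leading-zeros request without Foundation. The helper below is a hypothetical sketch, not a standard library function, and it takes a `UInt` on the assumption that only non-negative values need padding.

```swift
// Hypothetical helper: renders `value` in the given radix, left-padded with zeros to `width`.
func paddedString(_ value: UInt, radix: Int, width: Int) -> String {
    let digits = String(value, radix: radix)
    let missing = max(0, width - digits.count)  // never negative, so longer values pass through
    return String(repeating: "0", count: missing) + digits
}

print(paddedString(5, radix: 2, width: 8))      // 00000101
print(paddedString(255, radix: 16, width: 4))   // 00ff
print(String(255, radix: 16, uppercase: true))  // FF
```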