String hygiene

+1 for OptionSet from me. I think the benefits of OptionSet outweigh any detriment from being able to express [] (which I personally don't think is a detriment either).

I disagree — I actually think it would be reasonable to include .trimMiddle, which a lot of implementations out there do allow for.

Having this type be an OptionSet both allows us to add .trimMiddle later should we choose to, and even now, prevents the need for

enum TrimOptions {
    case trimLeading
    case trimMiddle
    case trimTrailing
    case trimLeadingAndMiddle
    case trimTrailingAndMiddle
    case trimAll
}

In fact, if we did offer

enum TrimOptions {
    case trimLeading
    case trimTrailing
    case trimAll
}

and decided to add trimMiddle later, trimAll would be misnamed as it really just trims from the ends (unless we'd be willing to break behavior, which we aren't). If we call it trimBoth instead to avoid this, "both" would be strange when considering three possible options.

This conclusion doesn't follow — being able to represent [] in this case doesn't mean anything about any other API.

I don't think the benefit here should come for the UI case specifically, but in my mind, this is a non-issue. In the vast majority of cases, the value passed to trim(_:) will be a compile-time literal; I don't see much harm in being able to construct [] dynamically should the need arise.

3 Likes

The reason one would not want to use option set is, as @DevAndArtist points out, is that it creates legal access to this API using the full range of raw values, plus a variety of option set expressions. With option sets, you can call this function with .start, .end, [.start], [.end], [.start, .end], (rawValue: 127), etc etc etc.

  • The call vocabulary is overly large for the needs of the API.
  • It sanctions poor call-site hygiene
  • The number of customizations will never be more than 2 plus the default behavior.
  • The user should never use multiple members. Calling with no options is preferable to [.start, .end] (or [.end, .start]). The latter is unnecessarily verbose.

An enumeration limits the legal calls to three, of which only two should ever be publicly used, .start and .end.

I gave in on the OptionSet discussion because it is completely tangential to the actual pitch and overwhelmingly supported by the participants in this thread even though I thoroughly disagree with it. I am currently inclined to go back to my original design and explain in the write-up why I feel option sets are not the correct approach.

2 Likes

I think there is no need to dive that deep into the fight between the OptionSet or an enumeration since you just can mention one in the alternative section of the proposal and let it be. The review process will go over it again anyways so there is no real need for you wasting time with our two parties discussing why one is better the other. ;-) I think I can speak for everyone that it's just fair to mention both designs in the formal proposal and keep the focus on one design while mention a few things that we discussed here about the other design (pros/cons).

1 Like

Looks like the OptionSet thing is somewhat settled, so I won't reply to those points.

What does this mean? I tried to find a string API that trimmed from the middle in another language but only found an example of a function that trimmed strings to a particular length, which is a separate function entirely. If you mean that it could filter out all whitespace from the string, then there is already the (better named) filter(_:) method on String.

1 Like

I agree. This is more or less how .net does it String.Trim Method (System) | Microsoft Learn

On that note, If we want to give the option to trim the start or the end then just introduce different methods for those.

I do not see the need for a mutating version of this.

I would prefer the name to be trimming() for the non mutating version. trimmingStart() and trimmingEnd() would also be helpful. There already some kind of precedence in Foundation for this name.

I very much would like the option to provide a collection of Chracters so I can choose what to trim. I do not like the enum as the configuration point.

1 Like

Everyone got caught up in the debate of enum vs OptionSet and nobody asked why this has to be one function in the first place. Even from the implementation you can see that left and right are not sharing any code. For me the obvious starting point would be to have 3 functions and then discuss if it makes sense to merge them (in my opinion it doesn't).

I see this as analogous to hasPrefix() / hasSuffix(). There is no has(.prefix, "foo").

1 Like

Well the answer is obvious here, the design uses a configuration type to reduce redundant API surface because there is a need for two types of functions, one mutating the other non-mutating. If you‘d go the other route you‘ll get 6 functions instead if just 2.

This is almost a tautology: "The reason the proposal has 2 functions instead of 6, is so that we have 2 functions instead of 6".

With enough determination you could fit the whole standard library into 1 function.

Sorry but where is this comming from? I never wrote something like that.

To sum up it for you again:

  • We need two types of functions mutating/non-mutating (I‘m speaking about two types not the amount of functions here.)
  • The current desgin uses a conf. type to reduce the API surface from 6 to 2 functions (which IMHO is more elegant)
  • [NEW] If this API will also be supported on different views then we‘d need to multiply the API surface by the amount of supporting views (if this API is not moved into a protocols default implementation)

I don‘t see where you got the idea of what you just wrote, no tautology here at all. ;-)

If you wish for pure functions solution, then please ask Erica kindly to include it into the alternative section of the proposal, no further debate required.

1 Like

OK, we agree then that it is a matter of aesthetics.

So let's continue the debate about enum vs OptionSet which apparently is required?

I already made my statement at post #63 that we don‘t have to. It‘s the same. Just a few pros/cons in the alternative section about the other design would be enough. The review process will discuss it again anyways and choose the one that fits more into stdlib.

There are more important questions that still need an answer:

  • If the conf. type is an enum should it be frozen?
  • Which views should support that API?
  • Should this API be moved in some protocol with default implementation to better reuse it on different views?
1 Like

Stepping back and looking at StringProtocol, should the protocol itself be updated to be more aware of Substrings?

I strongly disagree. We don’t have a mutating enumated for array. We do not have a mutating lowercased() for string. Most languages I ever used do not have a mutating version of trim. Including trim as a mutating function will be extremely confusing because the term of art is non mutating. This would be the same as insisting that .map should be mutating because it looks like a verb.

That's the thing you use methods in your argument that have no mutating counterpart but there are other that do have it sort/sorted, reverse/reversed etc.

Would you mind to share your thoughts about those methods being an example for the debated API?

We're not trying to mimic the API from other languages but rather to cover most of the use-cases established by the swift community in their own modules.


Edit: @armcknight could you provide some informations wether or not the most popular trim() -> String function is (or partly?) mutating in the wild?

It seems to me that it may be mutating but just returning self at the end, just like Erica was proposing in the first draft.

24 "trim() -> String"
10 "trimmed() -> String"
9 "trim()"

Those methods are on collections which string may happens to inherit. Are you suggesting we should add trimming operations to collection?

String is an immutable type. to ignore previous work in this space would be not wise. If one is to reject other languages designs then one must have a really good reason other than purity.

Can you clarify what you mean by that exactly? What kind of immutability are you referring to?

1 Like
var letters = "abc"
letters += "def"
print(letters) // abcdef

How exactly is it an immutable type?

I second this, and furthermore String conforms to TextOutputStream which requires the type to grow.

I'm not ignoring it, all I was trying to say is that not every design in other languages fits into Swift (remember the removal of ++ and -- operators) and the initial start point of this talk was to cover the use cases by the Swift community out in the wild for trimming strings. I don't see any harm that can be done by a mutating function like mutating func trim() which clearly signals what it will do to your string.

No I'm not suggesting to add anything to collection protocol. Dictionary is also a collection, but it has it's own mutable convenient APIs that build on top of the MutableCollection, so why can String not do the same here?

I'm open minded, but I'm not yet convinced why we should drop the mutating version entirely.

1 Like

I think @masters3d means does not conform to RangeReplaceble. But that's not an issue here anyway…

Good question, here are the top five implementations:

8 return self.trimmingCharacters(in: NSCharacterSet.whitespaces)
6 return self.stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceAndNewlineCharacterSet())
5 return self.trimmingCharacters(in: CharacterSet.whitespacesAndNewlines)
5 return self.stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceCharacterSet())
3 return trimmingCharacters(in: CharacterSet.whitespacesAndNewlines)

Looks like immutability is the winner.