Swift is still sometimes so cumbersome

I'm sure it has been discussed before, but how big are strings in typical applications that it makes any sort of difference whether substring operations are linear instead of constant? Is everyone loading gigabytes of textual data into memory and running advanced string algorithms with lots of subscript operations?

This is a case where I feel that Swift is sacrificing ease of use and clarity for some theoretical performance gains. Maybe I'm wrong, but I don't know of any other language that has such a peculiar String API (which IMHO essentially makes it wholly unsuitable for scripting or some tasks like NLP).

The problem is rarely with a single large string, but with looping over a string calling linear-time operations. It is very easy to inadvertently write an algorithm that should be linear, but performs a linear-time operation in the body of the loop that moves the algorithm to the quadratic complexity domain. Even for quite small strings, quadratic algorithms get very slow very quickly.
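
To make the quadratic trap concrete, here is a minimal sketch. The two function names are hypothetical, chosen only for illustration; both count the letter "a", but the first recomputes an index from the start of the string on every iteration.

```swift
// Hypothetical helpers for illustration; both return the number of "a" characters.

// Quadratic: index(_:offsetBy:) walks from startIndex on every iteration.
func countAsByOffset(_ s: String) -> Int {
    var total = 0
    for offset in 0..<s.count {
        let i = s.index(s.startIndex, offsetBy: offset)  // linear in `offset`
        if s[i] == "a" { total += 1 }
    }
    return total
}

// Linear: each character is visited exactly once.
func countAsByIteration(_ s: String) -> Int {
    var total = 0
    for character in s where character == "a" {
        total += 1
    }
    return total
}
```

Both versions look like a single loop, which is why the first one is easy to write by accident.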

Remember that Swift aims to be a general-purpose language, usable for systems programming, frameworks, low-level code, server-side work (where there is a lot of string processing), and much more.
The point is that some Swift use cases may not benefit much from this performance optimization, but it is critical in other areas.


Swift has “such a peculiar String API” because it is always trying to strike a balance between ease of use, correctness, and performance; few languages set such goals for their string type.

I’d recommend watching this awesome talk by @Michael_Ilseman: The Philosopher's String.

That does sound like a documentation problem, then.

I still think the reasons we don't want String to be its own slice apply generally to types like Data and ByteBuffer, too: that it becomes difficult to reason about how big the underlying allocation is. Either you have fast slicing at the risk of keeping lots of unnecessary data around, or you're making independent copies in O(n) time, every time.
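
As a rough illustration of that trade-off with String and Substring (the variable names are made up for this sketch): the slice shares the original's storage and indices, while converting it to a String makes an independent copy.

```swift
let document = String(repeating: "x", count: 10_000) + "tail"

// A Substring is a view into `document`: creating it does not copy the
// characters, but it keeps the whole original storage alive while it exists.
let view = document.suffix(4)

// Converting to String copies only the four characters of the slice,
// giving a value whose lifetime is independent of `document`.
let copy = String(view)

// The view still uses the indices of the base string.
let expectedStart = document.index(document.endIndex, offsetBy: -4)
let viewSharesIndices = (view.startIndex == expectedStart)
```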


Out of interest, is there some reason you need to create an independent Data object from the buffer? In my code, initialisers create copies (e.g. Data(_: ByteBuffer)), and I typically do dependent lifetimes by scoping to a closure, like Array.withUnsafeBufferPointer does. The only time that doesn't work is when you need to store the data after you've returned (e.g. in a class' instance variable or something), but in those cases you can usually decide which type the storage will have.

So in other words, you would always only store ByteBuffers and the only way to view it as a Data would be through some kind of ByteBuffer.withData<T>(_: (Data)->T) -> T method. So the API would discourage keeping the Data instances alive for very long.
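
A minimal sketch of what such a closure-scoped API could look like. SketchByteBuffer is a hypothetical stand-in, not the API of any real library, and it copies the bytes for simplicity where a real implementation would presumably try to avoid that.

```swift
import Foundation

// Hypothetical stand-in for a ByteBuffer type, for illustration only.
struct SketchByteBuffer {
    private var storage: [UInt8]

    init(bytes: [UInt8]) {
        storage = bytes
    }

    // Scoped view: the Data is only handed to `body`, which nudges callers
    // away from keeping it alive after the call returns.
    // Assumption: this sketch copies the bytes rather than sharing storage.
    func withData<T>(_ body: (Data) throws -> T) rethrows -> T {
        return try body(Data(storage))
    }
}

let buffer = SketchByteBuffer(bytes: [0x68, 0x69])
let text = buffer.withData { String(decoding: $0, as: UTF8.self) }
```

Nothing stops a caller from returning the Data out of the closure, which is the escaping problem mentioned in the reply.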

Only that it is more useful, and helpful in the case where you want the entire ByteBuffer as a Data. We unquestionably could simply scope the lifetime, but the API is a bit less than ideal (and users have a tendency to escape refs with scoped lifetimes).

What's wrong with this:

((inResponse.result.value as? String)?.prefix(1024)).map(String.init)

The map is Optional.map, not Sequence.map.
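
A small self-contained version of the same shape, with a hypothetical Any value standing in for inResponse.result.value and a closure in place of String.init to keep overload resolution obvious:

```swift
// Hypothetical stand-in for inResponse.result.value.
let value: Any = "hello, world"

// The outer parentheses end the optional chain, so `map` is Optional.map
// and the closure receives a Substring.
let truncated: String? = ((value as? String)?.prefix(5)).map { String($0) }

// A value that is not a String simply produces nil.
let other: Any = 42
let missing: String? = ((other as? String)?.prefix(5)).map { String($0) }
```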

Along these lines, it'd be interesting if RangeReplaceableCollections like String allowed writing into slices. That would let you use string[...] to update a value with a substring without any explicit conversion, as in:

body[...] = body.prefix(1024)

This would be very nice to have. It would require the ability to provide just one accessor in a subscript declaration (or alternatively have some forwarding-to-next-best-overload mechanism).
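
For comparison, the closest spelling I know of that works today goes through replaceSubrange(_:with:), which accepts a Substring directly. This is a sketch of the existing API, not of the proposed slice assignment:

```swift
var body = "hello, world"

// Take the prefix first, then replace the whole contents with it.
// No explicit String(...) conversion is needed.
let head = body.prefix(5)
body.replaceSubrange(body.startIndex..., with: head)
```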

It's still pretty cumbersome.

guard let fullBody = ... as? String else { return }
let truncatedBody = String(fullBody.prefix(1024))

This does something different than the original code.

I think the gist of the problem is that there is an optionality gap in Swift. In order to write this in a concise way, you need to either:

  1. Have type transformation methods, instead of initializers, e.g.:
body = body?.prefix(1024)?.asString

or

  2. Initializers that accept nil, e.g.:
body = String(body?.prefix(1024))

or

  3. An operator to make this easy, e.g.:
body = body?.prefix(1024) => String.init

Or use Optional.map? => is just sugar for Optional.map, after all.
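
For anyone following along, one way such an operator could be declared. The precedence group name is made up, and this is only a sketch of the idea, not something in the standard library:

```swift
// Hypothetical precedence group: binds tighter than assignment.
precedencegroup OptionalMapPrecedence {
    associativity: left
    higherThan: AssignmentPrecedence
}

infix operator => : OptionalMapPrecedence

// The operator simply forwards to Optional.map.
func => <A, B>(lhs: A?, rhs: (A) -> B) -> B? {
    return lhs.map(rhs)
}

var body: String? = "hello, world"
// The optional chain ends at the operator, so the left side is a Substring?.
body = body?.prefix(5) => { String($0) }

let empty: String? = nil
let stillNil: String? = empty?.prefix(5) => { String($0) }
```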

Right, the operator I used is just a shortcut, but the whole post is about shortcuts I think (i.e. not being cumbersome).

I'm probably missing the point here but I think if you can't guard let or if let your way around the problem you're "doing it wrong". Swift is opinionated and its way of doing things makes sense in the context of its idioms.

func myMainFunction() {
  let optionalBody = inResponse.result.value as? String
  doSomeDebuggingLogging(bodyString: optionalBody)
  // do some other stuff with optionalBody here if desired (I won't try to understand this use case)
}

func doSomeDebuggingLogging(bodyString: String?) {
  guard let fullBody = bodyString else {
    return
  }

  let truncatedBody = String(fullBody.prefix(1024))
  // do whatever with truncatedBody
}

I would be genuinely curious to learn of a use case where this structure isn't the best way forward (in Swift).

I disagree: the post is not about shortcuts, it's about the code being cumbersome. Here are the three code blocks from the original post. The first does not compile because prefix returns Substring:

var body = inResponse.result.value as? String
body = body?.prefix(1024)

The second does not compile because there is no String.init(_: Substring?):

var body = inResponse.result.value as? String
body = String(body?.prefix(1024))

The third compiles, but was rejected because "readability suffers":

var body = inResponse.result.value as? String
if let b = body { body = String(b.prefix(1024)) }

The question is whether => is clearer than Optional.map, so let's rewrite the third case using each in turn.

var body = inResponse.result.value as? String
body = body?.prefix(1024) => String.init

or

var body = inResponse.result.value as? String
body = body?.prefix(1024).map(String.init)

Maybe this is just me, but I don't find that either reads more clearly. In fact, I understand the map code better, though I'm going to choose to dismiss that because if the => operator were widely used, that familiarity issue would go away.

For my part, the really cumbersome part here is actually the as?. That operator is awkward because it forces the use of parentheses or intermediates in order to chain (you either write (x as? Y)?.foo() or var z = x as? Y; z?.foo()). When you combine it with the very nested access to the property, there is simply no way to write this clearly: any line of code you write is going to be ugly as sin, regardless of whether you use => or map.

Consider, for a moment, if inResponse.result.value was known to be a String. Then we'd have this:

let body = String(inResponse.result.value.prefix(1024))

So clear! Even if it was String? we'd get this:

let body = inResponse.result.value?.prefix(1024).map(String.init)

This is still pretty readable. More importantly, a lot of the lack of clarity comes from the Law of Demeter-violating inResponse.result.value, so if we pulled that out to a separate location our life would be easier:

let fullBody = inResponse.result.value
let truncatedBody = fullBody?.prefix(1024).map(String.init)

All of a sudden this looks pretty good to me. So it seems to me that the best code would be to do this:

extension ResponseResult {  // I don't know what the original name of this type is
    var resultString: String? {
        return self.value as? String
    }
}

// Elsewhere
let fullBody = inResponse.resultString
let truncatedBody = fullBody?.prefix(1024).map(String.init)

Seems pretty clear to me, and doesn't feel particularly cumbersome IMO.

in readability terms, im not a huge fan of the map here, mainly because if i didn't know better, i'd think this was making an array of single-character Strings, since intuitively, a String prefix is a collection. you have to be pretty astute to immediately notice it's being called on a Sequence? and not a Sequence. granted the problem is we overload the word map to mean two different things in swift, but this doesn't seem to help anyone but professors trying to come up with lazy multiple choice questions.

Your intuition is right here; @lukasa's use of map in the final line would be Sequence.map:

let string: String? = "hello"
string?.prefix(4).map(String.init)    // Optional<[String]>.some(["h", "e", "l", "l"])
(string?.prefix(4)).map(String.init)  // Optional<String>.some("hell")

This subtlety is the reason that optional chaining isn't pure sugar for Optional.map: as far as the remainder of the calling chain is concerned, the value isn't an Optional.
The extra layer of parentheses ends the chain, creating an Optional upon which methods can be called.

****, this really is confusing. i guess this just shows even the experienced among us get confused when map, Sequence, and ? come together. count me in for the => operator, though i might suggest the spelling ?>

string?.prefix(4) ?> String.init

Although I agree with you that the use of .map is somewhat confusing in the example given, I don't agree with the premise that we're overloading the word "map" to mean two different things in Swift.

We're using it to mean one, and only one, thing: To transform from Foo<A> to Foo<B>, using a closure (A) -> B. It's used to convert from an optional A to an optional B, or from a collection of As to a collection of Bs. Or a result of A to a result of B. And so on. It's the same mental model in all these cases.
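
A quick sketch of that single mental model, applying the same (A) -> B function through three different containers. The names are just for illustration:

```swift
// One transformation, (Int) -> Int, reused across containers.
let twice: (Int) -> Int = { $0 * 2 }

// Optional<Int> to Optional<Int>
let mappedOptional: Int? = Optional(21).map(twice)

// Array<Int> to Array<Int>
let mappedArray: [Int] = [1, 2, 3].map(twice)

// Result<Int, Never> to Result<Int, Never>
let mappedResult: Result<Int, Never> = Result<Int, Never>.success(21).map(twice)
```

In each case map leaves the container's shape alone and only transforms what is inside it.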

Also, this isn't anything Swift-specific, the naming has a very long history in both programming and maths.

I feel the same way about ??. I love the operator, don't get me wrong. It's short and sweet. But it doesn't play nice in long chains (because now the whole expression needs to be bracketed, before you can continue with other method calls).

It'd be nice to have methods that implement the same functionality as the as operator (and friends) and ??.
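
Something along these lines, perhaps. Both method names (or and cast(to:)) are hypothetical spellings invented for this sketch; neither exists in the standard library:

```swift
extension Optional {
    // Hypothetical method spelling of `??`.
    func or(_ fallback: @autoclosure () -> Wrapped) -> Wrapped {
        return self ?? fallback()
    }

    // Hypothetical method spelling of `as?` for an already-optional value.
    func cast<T>(to type: T.Type) -> T? {
        return flatMap { $0 as? T }
    }
}

let value: Any? = "hello"
// Chains left to right with no extra parentheses.
let length = value.cast(to: String.self).or("").count

let number: Any? = 42
let fallbackText = number.cast(to: String.self).or("none")
```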