Why is Substring a TextOutputStream?

Consider

var x: String = "Hello"
var y: Substring = x[x.startIndex...]

y.write(" World")

x // "Hello"
y // "Hello World"

What parent String does y point to?


var y should point to x since Substring is a view IIRC, but when you call write on y you're about to perform a mutation on y which causes it to copy the original string and mutate the copy (value semantics with copy on write optimization). Please correct me if I'm wrong.

Well, from the implementation perspective it appears to be a little different than COW but a fairly similar approach, except that it will always make a copy.

@DevAndArtist I'm aware of how copy-on-write works. I'm trying to understand what this operation means from a semantic POV. I believe Substring should disallow such mutations, otherwise what is the point of it?

Well, a String conforms to TextOutputStream much like real streams such as stdout (you can route print into a string). A Substring conforming to that protocol is a logical consequence, even though a Substring is technically a view onto a portion of a String.
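To make the "route print into a string" point concrete, here is a minimal sketch using the standard `print(_:separator:terminator:to:)` overload. The variable names (`log`, `parent`, `view`) are made up for illustration.

```swift
// String conforms to TextOutputStream, so print can target it directly.
var log = ""
print("Hello", terminator: "", to: &log)

// Substring conforms as well. Writing to it mutates only the substring
// value; the string it was sliced from is left untouched.
let parent = "Hello"
var view = parent[parent.startIndex...]
print(" World", terminator: "", to: &view)

print(log)     // Hello
print(view)    // Hello World
print(parent)  // Hello
```

Note that `parent` can even be a `let` constant here, which suggests the write was never going to reach it in the first place.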

You can run a similar experiment with arrays and array slices, and it will behave very similarly.

var array = [0, 1, 2, 3]
var slice = array[array.startIndex...]
slice.append(4)
array  // [0, 1, 2, 3]
slice  // [0, 1, 2, 3, 4]

The behavior of ArraySlice concerns me as well. I can't see why ArraySlice allows append. You can effectively use an ArraySlice as an Array, which should be disallowed, right?

Well from my user perspective, an array slice and a substring are the same as an array and a string because they share the internal storage until you need to perform mutation. In that sense we're viewing a part of the original storage, but if we need to work with that part we can mutate it without fear of modifying the original storage because of COW.

When you create a slice of a string, a Substring instance is the result. Operating on substrings is fast and efficient because a substring shares its storage with the original string. The Substring type presents the same interface as String, so you can avoid or defer any copying of the string's contents.

The ArraySlice type makes it fast and efficient for you to perform operations on sections of a larger array. Instead of copying over the elements of a slice to new storage, an ArraySlice instance presents a view onto the storage of a larger array. And because ArraySlice presents the same interface as Array, you can generally perform the same operations on a slice as you could on the original array.

Mutating operations such as removeFirst() and removeLast() make sense. My concern is that anything that requires the parent String to be mutated should be disallowed, as Substrings are simply views. The scope of the view should be mutable, but it should still remain a view. If they are disallowed, the user is forced to explicitly initialize a new String/Array (as is done anyway, internally). I feel that this better reflects the intent.
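For comparison, the explicit style described above (copy the slice into a new String or Array first, then grow it) might look roughly like this today. This is only a sketch of that alternative, not something the standard library requires, and the names are hypothetical.

```swift
let text = "Hello World"
let word = text.prefix(5)        // Substring sharing storage with text
var owned = String(word)         // explicit copy into an independent String
owned.append("!")

let numbers = [0, 1, 2, 3]
var tail = Array(numbers[2...])  // explicit copy; indices restart at 0
tail.append(4)

print(owned)    // Hello!
print(text)     // Hello World
print(tail)     // [2, 3, 4]
print(numbers)  // [0, 1, 2, 3]
```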


I think this was discussed previously, but I'm not a hundred percent sure. You can search for it on the forums, and if you don't find anything, feel free to open a discussion/pitch thread in the #evolution category. :) I understand that you want views to be immutable in that sense, but I don't really remember the rationale for why they're not.

Ah, yes, I should probably do that. I asked for the sake of posterity (in case I was missing something fundamental) but I'm now sure that this is a debatable topic. Thank you for taking the time out for your responses!

Put it the other way round. What if you mutate the original? Should this mutation be reflected in the slice? Or should a mutation of the original string be forbidden if you happen to have a substring of it floating around?
That doesn't work out, right?
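As a quick sketch of what actually happens in that direction (variable names are made up for illustration): the slice is its own value, so later mutations of the original are not reflected in it.

```swift
var original = "Hello"
let slice = original[original.startIndex...]
original.append(" World")       // mutate the parent after slicing

var numbers = [0, 1, 2, 3]
let part = numbers[1...2]
numbers[1] = 42                 // overwrite an element the slice covers

print(original)     // Hello World
print(slice)        // Hello
print(numbers)      // [0, 42, 2, 3]
print(Array(part))  // [1, 2]
```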

Slice types are allowed to share storage with a parent value, but are otherwise completely independent values. Appending to a slice is useful when modifying a slice through an inout parameter, which will write the slice back to the parent:

var x = [1, 2, 3, 5]

func insert(after: inout ArraySlice<Int>) {
  after += [4]
}

insert(after: &x[1...2])
print(x) // [1, 2, 3, 4, 5]

This example definitely makes sense. I think I get it now, thank you.