Same-case mutations are optimized, right?

CTMacUser · September 26, 2019, 11:38pm

extension DivisibilityIterator: IteratorProtocol {
    mutating public func next() -> Bool? {
        switch state {
        case .zero: return false
        case .one: return true
        case .other(var iterator):
            defer { state = .other(iterator) }
            return iterator.next().map { $0 == 0 }
        }
    }
}

Where that last iterator is of a helper IteratorProtocol type for modulo loops. Since the state goes back to the same case, the object code just mutates the inner iterator in-place, right? It doesn't do a copy-out, mutate, complete-replace instead? Or do those two possibilities come out the same in the object code?

Extra: what about switching between cases, but both have the same payload tuple type? Would the two cases have the same binary layout (besides the value of the case tag)?

lukasa · September 27, 2019, 8:46am

In any released version of Swift, no. In a future version of Swift, maybe.

In any released version of Swift, it does. In a future version of Swift, maybe.

They do not.

This is the subject of SR-10605, which I just noticed may be closable since apple/swift#26803 got merged (@Erik_Eckstein, do you think you fixed it?).

In general the distinction between "copy out, mutate, copy back" and "mutate in place" does not matter much in most native languages. In Swift, there are two cases where it matters:

When the size of the object being mutated is very large.
When the object being mutated requires refcounting operations on a copy.

In the first case, if the size of the object being mutated is very small then the copies may involve only copying from memory to registers and back. In that case, the "copy" is essentially meaningless: it looks identical to the modify-in-place case. However, when there are not enough registers to hold the object, or when the object needs to be stored somewhere in memory for other, opaque code to operate on it, then the copy involves memory-to-memory copying, which does impose a cost. For very large objects this cost can become quite substantial, especially if the mutation happens frequently.

But the second case is the one that imposes the most damage. Reference counting is substantially more expensive than copying small amounts of memory, especially on complex types. For example, if your implementation of IteratorProtocol is a struct holds three String objects, you will perform six refcount operations on a copy-out-copy-in model: one retain for each String on the copy, one release for each String on the save. This is separate from any further retain/release operations required for your actual computation.

The most dangerous part, as noted in SR-10605, is when you are mutating CoW data structures. Here, copying-out will cause a retain of the backing storage, meaning it is no longer uniquely referenced. Any mutation of that will cause a CoW. Then, when the enum payload is saved back the original storage will be released, which will likely free it. This can be a really nasty performance trap for enums with associated data.

Happily, the Swift performance team has done the work to optimise this, so future Swift releases will be better at this. In the short term, the workaround is to temporarily store a different value into self:

extension DivisibilityIterator: IteratorProtocol {
    mutating public func next() -> Bool? {
        switch state {
        case .zero: return false
        case .one: return true
        case .other(var iterator):
            state = .zero  // Temporary store, not visible to other code
            defer { state = .other(iterator) }
            return iterator.next().map { $0 == 0 }
        }
    }
}

This forces the destruction of the original memory location, and thus causes releases of any CoW data structures. Note that this is only safe because you are using a mutating func, and so have unique ownership of the value of state.