Reducing allocations in _modify accessors

davidbalbert · March 6, 2024, 10:56pm

I hadn't. That's a good idea.

Here's a modified version of the code that actually compiles, including with your suggestion to put the second storage assignment into a defer.

struct MyAttributedString {
    struct Index: Comparable {
        static func < (_ lhs: MyAttributedString.Index, _ rhs: MyAttributedString.Index) -> Bool {
            true
        }

        static func == (_ lhs: MyAttributedString.Index, _ rhs: MyAttributedString.Index) -> Bool {
            false
        }
    }

    class Storage {
    }
    
    // underlying contents
    var storage: Storage

    subscript(bounds: Range<Index>) -> MyAttributedSubstring {
        _read {
            yield MyAttributedSubstring(storage: storage, bounds: bounds)
        }
        _modify {
            var substring = MyAttributedSubstring(storage: storage, bounds: bounds)
            
            // This allocation is necessary to keep storage's refcount == 1 
            // so the contents can be updated without triggering a copy.
            storage = Storage()
            defer { storage = substring.storage }

            yield &substring
        }
    }
}

struct MyAttributedSubstring {
    let storage: MyAttributedString.Storage
    let bounds: Range<MyAttributedString.Index>
}

Here's the Compiler Explorer output when compiled with -O. I'm admittedly not that experienced with interpreting the Swift runtime calls in the Compiler Explorer disassembly, but it looks like the compiler isn't currently able to optimize out the extra Storage allocation.

output.MyAttributedString.subscript.modify : (Swift.Range<output.MyAttributedString.Index>) -> output.MyAttributedSubstring:
        push    rbx
        mov     rbx, rdi
        mov     qword ptr [rdi + 8], r13
        mov     rax, qword ptr [r13]
        mov     qword ptr [rdi], rax
        lea     rdi, [rip + (full type metadata for output.MyAttributedString.Storage)+24]
        mov     esi, 16
        mov     edx, 7
        call    swift_allocObject@PLT
        mov     qword ptr [r13], rax
        lea     rax, [rip + (output.MyAttributedString.subscript.modify : (Swift.Range<output.MyAttributedString.Index>) -> output.MyAttributedSubstring with unmangled suffix ".resume.0")]
        mov     rdx, rbx
        pop     rbx
        ret

output.MyAttributedString.subscript.modify : (Swift.Range<output.MyAttributedString.Index>) -> output.MyAttributedSubstring with unmangled suffix ".resume.0":
        mov     rax, qword ptr [rdi]
        mov     rcx, qword ptr [rdi + 8]
        mov     rdi, qword ptr [rcx]
        mov     qword ptr [rcx], rax
        jmp     swift_release@PLT

But even if the compiler could optimize this out, the cost of the allocation is only part of the cost. The other part is the cognitive overhead of wondering whether the compiler is able to do what you hope it will, having to verify it on godbolt, and remembering to unset and set storage as necessary everywhere in your codebase.

It's similar to the trouble with getting an inout reference to the associated value of an enum. Sure, you can add an extra null case to your enum, and set theEnum = null after extracting the associated value in a pattern match, and then write back the associated value when you're done, and the compiler might even be good enough to optimize away the copy or retain even if you don't do this, but this is a pain. It sounds like we're going to get mutable borrows of associated values soon, and that's a good thing. You can just express what you're trying to express.

I haven't tested that, so I'm not sure. For me, the benefit of using _modify here is that I'm able to communicate my intent to the compiler (and someone else who reads my code), which is "I want this to be borrowed, not copied, for the duration of the read-modify-write."

Oh, this is good. I'll try this shortly and will report back.

I'd still love it if we could express "the substring that I'm constructing here does a mutable borrow of storage for the duration of the _modify block, and once the substring is deallocated at the end of the block the borrow ends" and not have to set or reset storage at all.

Ah, I had not thought of that. Just looked at the AttributedString source in swift-foundation and can see where it's doing that check. I'll have to add the equivalent to mine.

I wonder if there's a way to incorporate that constraint (you can't replace storage with an unrelated Storage from another string) as part of the mutable borrow + yield that I'm hoping for without needing to do the explicit ID assignment and verification.

AttributedString's subscript has both a _modify and set where the set does a mutating replace on the existing storage, and the _modify does the aforementioned yielding and ID verification to make sure the storage instance hasn't been replaced.

Maybe instead of a mutable borrow, what you'd actually want is an immutable borrow (no retain) of the Storage reference so that contents of storage can still be mutated, but it can't be replaced with a different Storage instance.

I think it's no easier to express a yielded immutable borrow than a yielded mutable borrow at the moment, but perhaps I'm wrong. Is there a way to do it?