Let's say you have a custom copy-on-write AttributedString type, and you want to allow for modification of the attributes of the base string through a substring:
var s: MyAttributedString = "Hello, world!"
s[s.range(of: "Hello")!].foregroundColor = .blue
Here's what it might look like:
protocol MyAttributedStringKey {}
struct MyAttributedString {
class Storage {
}
// underlying contents
let storage: Storage
subscript<R>(bounds: R) where R: RangeExpression<Index> {
_modify {
var substring = MyAttributedSubstring(storage: storage, bounds: bounds.relative(to: self))
// This allocation is necessary to keep storage's refcount == 1
// so the contents can be updated without triggering a copy.
storage = Storage()
yield &substring
storage = substring.storage
}
}
}
struct MyAttributedSubstring {
let storage: Storage
let bounds: Range<MyAttributedString.Index>
subscript<K>(_ attribute: K.Type) -> K.Value? where K: MyAttributedStringKey {
get { ... }
set { ... }
}
}
It would be nice to eliminate the extra Storage allocation. I think this should be possible β the substring is essentially doing a scoped mutable borrow of the storage β but I don't think there's a way to express this now.
Maybe [Pitch] Non-Escapable Types and Lifetime Dependency already fixes this, though there's nothing here that's non-escapable. Maybe a non-escapable wrapper around Storage that's passed to the substring's initializer. I'm not sure.
Either way, I hope that the more expressive ownership control that's coming down the pipe will be able to handle this situation.
Yeah, I think it could. I was reflecting on that after I posted.
It means you probably have to mark it as nonisolated(unsafe) and have to be careful to not accidentally modify emptyStorage, e.g. by forgetting to unset it with storage = substring.storage (you might have multiple _modify accessors) and then calling a mutating method the MyAttributedString.
That said, I'd love a solution that would make it unnecessary to do any of this, and it seems like it should be possible, as long as use cases like this are taken into account in the design of all the ownership features.
have you tried godbolting this with the storage assigment in a defer block? i havenβt tested it, but i believe that should enable the compiler to prove that the empty instance never outlives the accessor call.
You can use consume to move the storage in to your substring view.
struct Foo {
class Storage {
var data: [UInt8] = []
}
var storage = Storage()
var someProp: [UInt8] {
get {
storage.data
}
_modify {
let x = (consume self).storage
defer { self = Foo(storage: x) }
yield &x.data
}
}
}
Beware, though - it is possible that somebody could assign a substring to another substring:
s[s.range(of: "Hello")!] = someOtherSubstring
If they do that, you'll end up replacing the entire storage of s, not just the range you care about. And since you handed them a unique reference, you also have no idea what contents used to be in s.
Working around that is tricky. You need to assign some kind of ID to the storage at the start of the mutating operation, and check that it is the same at the end of the operation.
Here's a modified version of the code that actually compiles, including with your suggestion to put the second storage assignment into a defer.
struct MyAttributedString {
struct Index: Comparable {
static func < (_ lhs: MyAttributedString.Index, _ rhs: MyAttributedString.Index) -> Bool {
true
}
static func == (_ lhs: MyAttributedString.Index, _ rhs: MyAttributedString.Index) -> Bool {
false
}
}
class Storage {
}
// underlying contents
var storage: Storage
subscript(bounds: Range<Index>) -> MyAttributedSubstring {
_read {
yield MyAttributedSubstring(storage: storage, bounds: bounds)
}
_modify {
var substring = MyAttributedSubstring(storage: storage, bounds: bounds)
// This allocation is necessary to keep storage's refcount == 1
// so the contents can be updated without triggering a copy.
storage = Storage()
defer { storage = substring.storage }
yield &substring
}
}
}
struct MyAttributedSubstring {
let storage: MyAttributedString.Storage
let bounds: Range<MyAttributedString.Index>
}
Here's the Compiler Explorer output when compiled with -O. I'm admittedly not that experienced with interpreting the Swift runtime calls in the Compiler Explorer disassembly, but it looks like the compiler isn't currently able to optimize out the extra Storage allocation.
But even if the compiler could optimize this out, the cost of the allocation is only part of the cost. The other part is the cognitive overhead of wondering whether the compiler is able to do what you hope it will, having to verify it on godbolt, and remembering to unset and set storage as necessary everywhere in your codebase.
It's similar to the trouble with getting an inout reference to the associated value of an enum. Sure, you can add an extra null case to your enum, and set theEnum = null after extracting the associated value in a pattern match, and then write back the associated value when you're done, and the compiler might even be good enough to optimize away the copy or retain even if you don't do this, but this is a pain. It sounds like we're going to get mutable borrows of associated values soon, and that's a good thing. You can just express what you're trying to express.
I haven't tested that, so I'm not sure. For me, the benefit of using _modify here is that I'm able to communicate my intent to the compiler (and someone else who reads my code), which is "I want this to be borrowed, not copied, for the duration of the read-modify-write."
Oh, this is good. I'll try this shortly and will report back.
I'd still love it if we could express "the substring that I'm constructing here does a mutable borrow of storage for the duration of the _modify block, and once the substring is deallocated at the end of the block the borrow ends" and not have to set or reset storage at all.
Ah, I had not thought of that. Just looked at the AttributedString source in swift-foundation and can see where it's doing that check. I'll have to add the equivalent to mine.
I wonder if there's a way to incorporate that constraint (you can't replace storage with an unrelated Storage from another string) as part of the mutable borrow + yield that I'm hoping for without needing to do the explicit ID assignment and verification.
AttributedString's subscript has both a _modify and set where the set does a mutating replace on the existing storage, and the _modify does the aforementioned yielding and ID verification to make sure the storage instance hasn't been replaced.
Maybe instead of a mutable borrow, what you'd actually want is an immutable borrow (no retain) of the Storage reference so that contents of storage can still be mutated, but it can't be replaced with a different Storage instance.
I think it's no easier to express a yielded immutable borrow than a yielded mutable borrow at the moment, but perhaps I'm wrong. Is there a way to do it?