Hi, Swift users!
I want to share a trick that I came up with recently. I find it very interesting, and useful in some cases.
Setting
Let's say we have some state and two functions that each operate on and modify a subset of it:
struct State {
var foo: [Foo]
var bar: [Bar]
var baz: [Baz]
}
func foobazer(data: inout ([Foo], [Baz])) { ... }
func barbazer(data: inout ([Bar], [Baz])) { ... }
The most straightforward way to call those functions (from now on, assume that we have var state: State declared somewhere visible to the current scope):
var foobazData = (state.foo, state.baz)
foobazer(data: &foobazData)
state.foo = foobazData.0
state.baz = foobazData.1
But that looks a little messy, so we can extract it into a computed property (as far as I know, this is called a lens in functional programming):
extension State {
var foobaz: ([Foo], [Baz]) {
get { (foo, baz) }
set { foo = newValue.0 ; baz = newValue.1 }
}
var barbaz: ([Bar], [Baz]) { ... }
}
And call foobazer and barbazer like this:
foobazer(data: &state.foobaz)
barbazer(data: &state.barbaz)
Good!
The problem
Now, the interesting part. If foobazer or barbazer modifies the array of Foo, Bar or Baz, then by the rules of copy-on-write the array will be copied, because the upper-level state still holds a reference to it. That's pretty bad and wasteful, as the old copy, owned by the upper-level state, is discarded right after the function returns. For exactly this purpose Swift has the special _modify and yield keywords, which are now going through the pitch phase (Modify Accessors) to become an official feature and lose the leading underscore. The problem is that yield can be used with only one value, and you cannot put inout values into tuples:
var foobaz: ([Foo], [Baz]) {
get { (foo, baz) }
_modify {
yield &(foo, baz) // will not work
yield (&foo, &baz) // nope
yield &(&foo, &baz) // still not, no matter how much & you put
var copy = (foo, baz)
yield &copy // will work
foo = copy.0; baz = copy.1 // but, again, will trigger copying
}
}
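To see the copy actually happen, here is a small sketch that compares the address of the array's storage before and after the call. The Int element type, the body of foobazer and the bufferAddress helper are my own assumptions for illustration, not part of the original setup:

```swift
struct State {
    var foo: [Int]
    var baz: [Int]

    // Plain get/set pair, as in the first version of `foobaz`.
    var foobaz: ([Int], [Int]) {
        get { (foo, baz) }
        set { foo = newValue.0; baz = newValue.1 }
    }
}

// Hypothetical mutation: overwrites one element without growing the array.
func foobazer(data: inout ([Int], [Int])) {
    data.0[0] = 42
}

// Address of the array's element storage, used only for comparison.
func bufferAddress(_ array: [Int]) -> UInt {
    array.withUnsafeBufferPointer { UInt(bitPattern: $0.baseAddress) }
}

var state = State(foo: [1, 2, 3], baz: [4, 5, 6])
let before = bufferAddress(state.foo)
foobazer(data: &state.foobaz)
let after = bufferAddress(state.foo)
print(before == after ? "mutated in place" : "buffer was copied")
```

In an unoptimized build I would expect this to report a copy, since the tuple returned by the getter holds a second reference to the buffer while foobazer writes to it. The optimizer is free to change that, so treat the printed result as an observation, not a guarantee.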
The solution
So, we somehow need to take the ownership away from the upper State. How can we do it? What if we write some temporary value to State and then, when the modification is done, replace it with the new, shiny value? Let's try:
var foobaz: ([Foo], [Baz]) {
get { (foo, baz) }
_modify {
// the internal array buffers are not copied here, thanks to copy-on-write
var copy = (foo, baz)
// temporarily set dummy values to "borrow" ownership
foo = [] ; baz = []
yield &copy // modification happens
foo = copy.0; baz = copy.1 // restoring values
}
}
And that will work! We took the ownership from State, so now, when foobazer modifies the array, it is singly referenced and is not copied. One last thing: that dummy value is kind of strange, and for some types there is no such dummy value. We can move to the unsafe world and temporarily deinitialize the variables:
var foobaz: ([Foo], [Baz]) {
get { (foo, baz) }
_modify {
var copy = (foo, baz)
withUnsafeMutablePointer(to: &foo) {
_ = $0.deinitialize(count: 1)
}
withUnsafeMutablePointer(to: &baz) {
_ = $0.deinitialize(count: 1)
}
yield &copy
withUnsafeMutablePointer(to: &foo) {
$0.initialize(to: copy.0)
}
withUnsafeMutablePointer(to: &baz) {
$0.initialize(to: copy.1)
}
}
}
We can define some helper functions to make things cleaner:
// just like in c++ 🙈
func unsafeMove<T>(_ val: inout T) -> T {
withUnsafeMutablePointer(to: &val) { $0.move() }
}
func unsafeInitialize<T>(_ val: inout T, with source: T) {
withUnsafeMutablePointer(to: &val) { $0.initialize(to: source) }
}
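As a quick check of these helpers on their own, here is a minimal sketch with a plain local array. The variable names are mine. The important rule is that the moved-from variable must not be read until it has been initialized again:

```swift
func unsafeMove<T>(_ val: inout T) -> T {
    withUnsafeMutablePointer(to: &val) { $0.move() }
}

func unsafeInitialize<T>(_ val: inout T, with source: T) {
    withUnsafeMutablePointer(to: &val) { $0.initialize(to: source) }
}

var numbers = [1, 2, 3]

// After this line `numbers` is uninitialized memory and must not be read.
var taken = unsafeMove(&numbers)

// `taken` is now the only owner of the buffer.
taken.append(4)

// Make `numbers` valid again before anything touches it.
unsafeInitialize(&numbers, with: taken)

print(numbers) // prints "[1, 2, 3, 4]"
```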
And the final solution:
var foobaz: ([Foo], [Baz]) {
get { (foo, baz) }
_modify {
var copy = (unsafeMove(&foo), unsafeMove(&baz))
yield &copy
unsafeInitialize(&foo, with: copy.0)
unsafeInitialize(&baz, with: copy.1)
}
}
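Putting it all together, here is a self-contained sketch that checks whether the buffer is reused. As before, the Int element type, the body of foobazer and the bufferAddress helper are assumptions of mine, and _modify is still an underscored, unofficial feature:

```swift
func unsafeMove<T>(_ val: inout T) -> T {
    withUnsafeMutablePointer(to: &val) { $0.move() }
}

func unsafeInitialize<T>(_ val: inout T, with source: T) {
    withUnsafeMutablePointer(to: &val) { $0.initialize(to: source) }
}

struct State {
    var foo: [Int]
    var baz: [Int]

    var foobaz: ([Int], [Int]) {
        get { (foo, baz) }
        _modify {
            // Move both arrays out, so `copy` is their only owner.
            var copy = (unsafeMove(&foo), unsafeMove(&baz))
            yield &copy
            // Put the (possibly modified) arrays back into the state.
            unsafeInitialize(&foo, with: copy.0)
            unsafeInitialize(&baz, with: copy.1)
        }
    }
}

// Hypothetical mutation: overwrites one element without growing the array.
func foobazer(data: inout ([Int], [Int])) {
    data.0[0] = 42
}

// Address of the array's element storage, used only for comparison.
func bufferAddress(_ array: [Int]) -> UInt {
    array.withUnsafeBufferPointer { UInt(bitPattern: $0.baseAddress) }
}

var state = State(foo: [1, 2, 3], baz: [4, 5, 6])
let before = bufferAddress(state.foo)
foobazer(data: &state.foobaz)
let after = bufferAddress(state.foo)
print(before == after ? "mutated in place" : "buffer was copied")
```

Here I would expect the address to stay the same, because the array is uniquely referenced while foobazer writes to it.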
Afterword
What still bothers me is that we could, theoretically, move the initialized memory of copy into the uninitialized foo and baz, which would reduce retain/release calls and overall copying for gigantic structures. But I couldn't find a way to do it yet. Maybe someone could help me? And what do you think about all this?
Thanks for reading!