How does Swift COW manage to avoid a dictionary copy in this simple computed property example?

davidbaraff · January 31, 2019, 8:11pm

    class C1 {
        var _dictBacking = [String : String]()
        
        var dict: [String:String] {
            get { return _dictBacking }
            set { _dictBacking = newValue }
        }
    }

    var c = C1()
    //  If I now write:
    //    c._dictBacking["abc"] = "xyz"
    // then I expect the COW behavior of Swift dictionaries to just modify
    // c._dictBacking in place because c._dictBacking is the unique referencer of the internal dictionary data.

But I am pretty sure that the next line also modifies the dictionary in place, i.e. also avoids a copy:

    c.dict["abc"] = "xyz"

What I don't understand is how this is done. Isn't get returning a new dictionary structure, which internally is also pointing at the same internal data that _dictBacking points at? And then doesn't the subscript operator modify that returned dictionary structure, which would cause a copy, since now the internal reference count is up to 2?

I want to understand this because I need to do it for myself, in a case where I can't just make use of a dictionary to get COW behavior.

Update: And when I implement a simple COW data structure, indeed, going through a computed property to access the underlying backing store does in fact make my implementation copy. Maybe this doesn't avoid a copy in the dictionary case? I guess I'd have to run some timing tests to verify this...

Update 2: Oh. Apparently Swift doesn't avoid the extra copy in the dictionary case. That'll teach me to believe in magic...
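To make the extra copy visible without timing anything, here is a minimal sketch. It assumes a hand-rolled copy-on-write wrapper; `Storage`, `CountingDict`, `Holder`, `forwarded` and `copyCounts` are hypothetical names, not anything from the standard library. The wrapper counts its deep copies, so the direct path and the get/set path can be compared.

```swift
// Reference-typed storage shared between copies of the wrapper below.
final class Storage {
    var items: [String: String]
    init(items: [String: String]) { self.items = items }
}

// Hypothetical copy-on-write wrapper that records how often it deep-copies.
struct CountingDict {
    private var storage = Storage(items: [:])
    private(set) var copyCount = 0

    subscript(key: String) -> String? {
        get { return storage.items[key] }
        set {
            // Copy only when some other value also references the storage.
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(items: storage.items)
                copyCount += 1
            }
            storage.items[key] = newValue
        }
    }
}

final class Holder {
    var backing = CountingDict()

    // Plain get/set forwarding, mirroring `dict` in the example above.
    var forwarded: CountingDict {
        get { return backing }
        set { backing = newValue }
    }
}

// Returns the copy counts observed after one write through each path.
func copyCounts() -> (direct: Int, forwarded: Int) {
    let a = Holder()
    a.backing["abc"] = "xyz"    // stored property: mutated in place

    let b = Holder()
    // Behaves like: var tmp = b.forwarded; tmp["abc"] = "xyz"; b.forwarded = tmp
    b.forwarded["abc"] = "xyz"

    return (a.backing.copyCount, b.backing.copyCount)
}
```

Calling `copyCounts()` should report no copy for the direct write and one copy for the forwarded write, because the getter's temporary shares the storage with `backing` at the moment of mutation.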

Nevin · January 31, 2019, 8:38pm

How are you testing this?

davidbaraff · January 31, 2019, 8:43pm

Both with a playground and writing a very short command-line program, compiled directly using swiftc. Basically, I timed executing c._dictBacking["abc"] = "xyz" vs c.dict["abc"] = "xyz" about 10,000 times, after filling up the dictionary with 100K items. Difference was night and day.
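A rough sketch of such a timing test, assuming Foundation's `Date` is precise enough for the purpose. The helper names `seconds` and `compareAccessPaths` are made up for illustration, and the actual numbers will vary by machine and build settings.

```swift
import Foundation

final class C1 {
    var _dictBacking = [String: String]()

    var dict: [String: String] {
        get { return _dictBacking }
        set { _dictBacking = newValue }
    }
}

// Hypothetical helper: runs `body` and returns the elapsed wall-clock seconds.
func seconds(_ body: () -> Void) -> TimeInterval {
    let start = Date()
    body()
    return Date().timeIntervalSince(start)
}

// Fills the dictionary with `fill` entries, then times `iterations` writes
// through the stored property and through the computed property.
func compareAccessPaths(fill: Int, iterations: Int)
    -> (direct: TimeInterval, viaProperty: TimeInterval)
{
    let c = C1()
    for i in 0..<fill {
        c._dictBacking["key\(i)"] = "value"
    }

    let direct = seconds {
        for _ in 0..<iterations { c._dictBacking["abc"] = "xyz" }
    }
    let viaProperty = seconds {
        for _ in 0..<iterations { c.dict["abc"] = "xyz" }
    }
    return (direct, viaProperty)
}
```

Something like `compareAccessPaths(fill: 100_000, iterations: 10_000)` would match the sizes described above.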

jrose · January 31, 2019, 8:50pm

This is one of the motivations for the "generalized accessors" read and modify described in the Ownership Manifesto. The implementations aren't necessarily ready for general use yet, but eventually that would provide alternate access to the property that would allow avoiding the extra copy.
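For illustration only, a sketch of what a modify accessor can look like using the underscored `_modify` spelling, which is unofficial and may change. `Storage`, `CountingDict`, `Holder` and `copiesAfterWrites` are hypothetical names; the copy-counting wrapper exists only so the effect can be observed.

```swift
// Reference-typed storage shared between copies of the wrapper below.
final class Storage {
    var items: [String: String]
    init(items: [String: String]) { self.items = items }
}

// Hypothetical copy-on-write wrapper that records how often it deep-copies.
struct CountingDict {
    private var storage = Storage(items: [:])
    private(set) var copyCount = 0

    subscript(key: String) -> String? {
        get { return storage.items[key] }
        set {
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(items: storage.items)
                copyCount += 1
            }
            storage.items[key] = newValue
        }
    }
}

final class Holder {
    private var backing = CountingDict()

    var dict: CountingDict {
        get { return backing }
        set { backing = newValue }
        // Unofficial accessor: lends the stored value to the caller for
        // in-place mutation instead of handing out a temporary copy.
        _modify { yield &backing }
    }
}

// Performs n writes through the computed property and returns the copy count.
func copiesAfterWrites(_ n: Int) -> Int {
    let h = Holder()
    for i in 0..<n {
        h.dict["key\(i)"] = "value"
    }
    return h.dict.copyCount
}
```

With the `_modify` accessor present, writes through `dict` should reach the stored value directly, so the wrapper never sees a shared reference and never copies.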

DevAndArtist · January 31, 2019, 8:51pm

The stdlib has _read, _modify and yield to avoid the extra copy. Right now these are not available for us to use, at least not officially.

Here is a good example on how that works:

Lantua · February 1, 2019, 5:59am

I wonder how inlining interacts in this scenario.

taylorswift · February 1, 2019, 4:42pm

I’ve been wrapping a lot of arrays like this:

struct S 
{
    private 
    var buffer:[Int]

    var array:[Int] 
    {
        get 
        {
            return self.buffer
        }
        set(value)
        {
            self.buffer = value
        }
    }
}

does that mean when i do something like this

var s:S = ...

for i:Int in s.array.indices 
{
    s.array[i] = ...
}

this loop is effectively quadratic?
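One way to check this empirically is to watch whether the array's element storage moves on each write. The sketch below assumes that a changed base address means a fresh buffer was allocated; `update(at:to:)`, `bufferAddress` and `countReallocations` are hypothetical helpers added for the experiment.

```swift
struct S {
    private var buffer: [Int]

    init(_ values: [Int]) { buffer = values }

    // Trivial get/set forwarding, as in the post above.
    var array: [Int] {
        get { return buffer }
        set { buffer = newValue }
    }

    // Hypothetical helper: writes one element directly into the stored array.
    mutating func update(at index: Int, to value: Int) {
        buffer[index] = value
    }

    // Hypothetical helper: address of the array's element storage.
    var bufferAddress: UInt {
        return buffer.withUnsafeBufferPointer { UInt(bitPattern: $0.baseAddress) }
    }
}

// Counts how many of the n writes moved the element storage to a new address.
func countReallocations(n: Int, write: (inout S, Int) -> Void) -> Int {
    var s = S(Array(repeating: 0, count: n))
    var moves = 0
    for i in 0..<n {
        let before = s.bufferAddress
        write(&s, i)
        if s.bufferAddress != before { moves += 1 }
    }
    return moves
}
```

Comparing `countReallocations(n: 1000) { s, i in s.array[i] = i }` with `countReallocations(n: 1000) { s, i in s.update(at: i, to: i) }` shows the difference: the second form should report zero moves, while in an unoptimized build the first would be expected to move the buffer on every write, which is where the quadratic cost comes from. What the optimizer does with the first form may differ.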

John_McCall · February 1, 2019, 5:23pm

Yes. Eliminating that overhead is a large part of the purpose of _read and _modify. I would also like to make it possible to conveniently declare storage as an alias for some other piece of storage, which would amount to defining _read and _modify automatically, but that's separable.

Although I don't know why you'd trivially wrap your own private property this way; it feels like importing the idiom of a different language.

DevAndArtist · February 1, 2019, 5:32pm

I asked about it in a recent KeyPath-related thread. There I used inout as a function similar to get, but that was just bikeshedding. But since for read/modify we'd require yield, maybe we could write something like this:

var value: Value { yielding(\.storage.value) }

as a unified way of aliasing private storage?!

And for get/set I think { inout(\.storage.value) } would produce the current behavior.

John_McCall · February 1, 2019, 5:42pm

We can bikeshed syntax in a pitch thread if you want to work on this.

DevAndArtist · February 1, 2019, 5:50pm

I'd love to be able to work on the Swift compiler one day, but right now my time is too limited to start learning C++ and understanding the existing parts of the compiler. I hope to find some more spare time in one or two months to finally dive into the compiler, hopefully with some success. So for now I'll stay on the feedback front, but it's definitely on my to-do list.

But I really appreciate that you and the rest of the Swift team keep us motivated.

John_McCall · February 1, 2019, 6:10pm

Understood.

taylorswift · February 1, 2019, 6:43pm

well, one reason is to provide shorthand access to something that’s buried inside another abstraction

        struct State 
        {
            var model:Latest<Void>
            
            var plane:Latest<ControlPlane>
            var action:Latest<Action> 
            var preselection:Latest<Int?>
        }
        
        private 
        var state:State 

        private 
        var plane:ControlPlane 
        {
            get         { return    self.state.plane.value }
            set(value)  {           self.state.plane.value = value }
        }
        private 
        var action:Action 
        {
            get         { return    self.state.action.value }
            set(value)  {           self.state.action.value = value }
        }
        private 
        var preselection:Int? 
        {
            get         { return    self.state.preselection.value }
            set(value)  {           self.state.preselection.value = value }
        }

where state here is a structure that gets passed to drawing code, and Latest<T> is a generic wrapper that tracks changes to its element

protocol ViewEquatable 
{
    static 
    func viewEquivalent(_:Self, _:Self) -> Bool 
}
struct Latest<T>
{
    private 
    var _value:T, 
        dirty:Bool 
    
    var isDirty:Bool 
    {
        return self.dirty
    }
    
    init(_ value:T) 
    {
        self._value = value 
        self.dirty  = true 
    }
    
    mutating 
    func reset() 
    {
        self.dirty = true
    }
    
    mutating 
    func pop() -> T? 
    {
        if self.dirty 
        {
            self.dirty = false 
            return self._value 
        }
        else 
        {
            return nil 
        }        
    }
    
    mutating 
    func get() -> T 
    {
        self.dirty = false 
        return self._value
    }
}
extension Latest where T:Equatable 
{
    var value:T 
    {
        get 
        {
            return self._value 
        }
        set(value)
        {
            if value != self._value  
            {
                self.dirty  = true 
            }
            self._value = value 
        }
    }
}
extension Latest where T:ViewEquatable
{
    var value:T 
    {
        get 
        {
            return self._value 
        }
        set(value)
        {
            if !T.viewEquivalent(value, self._value)  
            {
                self.dirty  = true 
            }
            self._value = value 
        }
    }
}

John_McCall · February 1, 2019, 6:47pm

Okay, sure, the test case is just reduced to triviality. I certainly understand why it's useful for a non-trivial forwarding.

davidbaraff · February 2, 2019, 2:46am

It’s gratifying to know something I thought was simple was in fact responded to in a fairly deep manner! I think Swift is great, but stuff like this reminds me it is still very, very young as languages go!! (I can’t live without it, though, and I look forward to the day it can reliably be as fast and as predictable as C++, without the mind-numbing complexity of C++. I programmed in C++ since 1984, and didn’t think of it as complex till recently, so good job, Swift folks!)

I think a lot of people would be surprised by the fact that simply putting a property in makes things expensive!!

Separate question: could someone please explain to me what the heck “bike shedding” means and how this came to be as a phrase?

Nevin · February 2, 2019, 2:49am

Parkinson’s law of triviality

Anthony_Miller · August 19, 2021, 9:18pm

Would you be able to get around the copying with a function that takes an inout parameter?

class C1 {
    var _dictBacking = [String : String]()

    var dict: [String:String] { _dictBacking }

    public func mutate(_ mutation: (inout [String : String]) -> Void) {
        mutation(&_dictBacking)
    }
}

Would this work properly, or am I missing something?
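A small experiment suggests this approach does mutate in place. The sketch swaps the dictionary for a hypothetical copy-counting wrapper (`Storage`, `CountingDict` and `copiesViaMutate` are made-up names) so that any copy becomes observable; it is an assumption-laden test, not a statement about how `Dictionary` is implemented.

```swift
// Reference-typed storage shared between copies of the wrapper below.
final class Storage {
    var items: [String: String]
    init(items: [String: String]) { self.items = items }
}

// Hypothetical copy-on-write wrapper that records how often it deep-copies.
struct CountingDict {
    private var storage = Storage(items: [:])
    private(set) var copyCount = 0

    subscript(key: String) -> String? {
        get { return storage.items[key] }
        set {
            if !isKnownUniquelyReferenced(&storage) {
                storage = Storage(items: storage.items)
                copyCount += 1
            }
            storage.items[key] = newValue
        }
    }
}

final class C1 {
    private var _dictBacking = CountingDict()

    // Read-only view of the backing store.
    var dict: CountingDict { _dictBacking }

    // Passes the stored property itself as inout, so no temporary is made.
    func mutate(_ mutation: (inout CountingDict) -> Void) {
        mutation(&_dictBacking)
    }
}

// Performs n writes through mutate(_:) and returns the resulting copy count.
func copiesViaMutate(_ n: Int) -> Int {
    let c = C1()
    for i in 0..<n {
        c.mutate { $0["key\(i)"] = "value" }
    }
    return c.dict.copyCount
}
```

One caveat, as far as I understand the exclusivity rules: reading `dict` on the same instance from inside the closure would overlap with the in-progress inout access to the backing property and should trap at runtime.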