Swift Arrays, Sets, and Dictionaries are copy-on-write structs. The storage is shared between copies of the collection until one copy modifies the data, then it will receive a private copy that can be modified. I want to do something like that for a collection I’m implementing.
I looked at the Swift stdlib sources and this appears to be implemented using a function called isUniquelyReferenced(). This function appears to do lots of black-magic mojo.
Is there a ‘legit’ way to implement this? If it requires internals, is there a plan to make this an officially supported capability that can be safely used?
I thought the standard Swift collections like Array were thread safe and would maintain the value type illusion, even with multiple threads. If so, what do they use for a locking mechanism? I can probably do something with a pthread mutex but I'd like to use whatever the standard library collections are using.
It is thread safe for multiple arrays instances to share the storage. You have either:
isKnownUniquelyReferenced returns true: there is only one pointer to the storage and thus it belongs to the current thread, so no need to lock for mutations.
isKnownUniquelyReferenced returns false: you can only read the storage because other threads might have access to it (also, other arrays want to preserve their own value); so you get a new storage when mutating and only the current array is affected.
What is not thread safe is to share the same var with multiple threads (unless you synchronize access to that var). That is true for most types, including Array.
The following case is confusing me. It seems like you could have two separate vars but still have a problem. For example, imagine the update above is the array subscript setter.
We start with just one var. Thread #1 enters the setter, checks isKnownUniquelyReferenced and sees that it is true, so it proceeds without copy. It gets interrupted, before it actually updates the storage.
Thread #2 makes a copy of the value to a second var. The two vars point to the same storage so isKnownUniquelyReferenced is false. Thread #2 reads a value from array[0] and decides to set array[1] to match that value. It calls the setter. It gets interrupted.
Thread #1 finishes its update, which sets array[0] in the shared storage.
Thread #2 resumes, copies the storage, and updates array[1]. It thinks its first and second elements are equal, but they're not, because it has a "value" that changed out from under it.
Isn't that a problem? Does it break the illusion that an Array is a value type?
You have two threads accessing the same var here. Thread #2 cannot make a copy of the value without accessing the original var. Swift is not going to protect you against race conditions like this.
If the instance passed as object is being accessed by multiple threads simultaneously, this function may still return true . Therefore, you must only call this function from mutating methods with appropriate thread synchronization. That will ensure that isKnownUniquelyReferenced(_:) only returns true when there is really one accessor, or when there is a race condition, which is already undefined behavior.
A lot of stdlib types already assume single-thread access, and leave the handling about race to things like Dispatch. I suppose you could do the same.