Recommendation for thread-safe dictionary

Hi,

I'm writing some code where I'd like multiple threads to be writing to a
common dictionary object.

Is there a recommended mechanism for doing this?

Thanks,
Lane


Wrap mutexes* around dictionary accesses. If you have a lot more reads than writes, a read/write mutex will be more efficient.

Making a thread-safe dictionary class is usually not a good idea. (This is something Java learned in between JDK 1.0 and 1.2.) It adds unavoidable overhead to every single access, and doesn’t solve the higher-level synchronization problems of the code that’s using the dictionary. Instead, use synchronization primitives in your higher-level class at the appropriate points.

—Jens

* which I guess you’ll have to implement using C calls to pthreads, since the Swift concurrency library isn’t ready yet
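As a rough illustration of the advice above, here is a minimal sketch of a higher-level class that guards its own dictionary with a pthreads read/write lock. It is written in current Swift rather than the 2015 toolchain, and `HostCache`, `address(for:)`, and `setAddress(_:for:)` are hypothetical names invented for this example, not something from the thread.

```swift
import Foundation

/// Hypothetical higher-level class (illustrative name). The lock lives here,
/// next to the code that knows which operations must be atomic.
final class HostCache: @unchecked Sendable {
    private var addresses: [String: String] = [:]
    // Heap-allocate the lock so its address stays stable across pthread calls.
    private let lock = UnsafeMutablePointer<pthread_rwlock_t>.allocate(capacity: 1)

    init() {
        lock.initialize(to: pthread_rwlock_t())
        pthread_rwlock_init(lock, nil)
    }

    deinit {
        pthread_rwlock_destroy(lock)
        lock.deinitialize(count: 1)
        lock.deallocate()
    }

    /// Readers share the lock, so many lookups can run in parallel.
    func address(for host: String) -> String? {
        pthread_rwlock_rdlock(lock)
        defer { pthread_rwlock_unlock(lock) }
        return addresses[host]
    }

    /// Writers take the lock exclusively.
    func setAddress(_ address: String, for host: String) {
        pthread_rwlock_wrlock(lock)
        defer { pthread_rwlock_unlock(lock) }
        addresses[host] = address
    }
}
```

Whether the read/write lock pays off over a plain mutex depends on how heavily reads outnumber writes, as noted above.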

···

On Dec 10, 2015, at 9:18 AM, Lane Schwartz via swift-users <swift-users@swift.org> wrote:

I'm writing some code where I'd like multiple threads to be writing to a common dictionary object.
Is there a recommended mechanism for doing this?

+1. I learned this the hard way, once upon a time. For code that is primarily CPU bound (like dictionary reads/writes), opt for using your objects in a thread-safe way rather than creating thread-safe objects. Either make sure the callers wrap each access in a mutex/semaphore lock, or (if you’re working on OS X/iOS) ensure that each access is dispatched onto the same serial Grand Central Dispatch queue.
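The serial-queue option might look something like the sketch below. It assumes current Swift and the `DispatchQueue` spelling of the libdispatch API; `ResultStore` and its members are hypothetical names used only for illustration.

```swift
import Dispatch

/// Hypothetical owner of the shared dictionary (illustrative name).
/// Every access is funnelled through one serial queue.
final class ResultStore: @unchecked Sendable {
    private var results: [String: Int] = [:]
    // A queue created without the `.concurrent` attribute is serial.
    private let queue = DispatchQueue(label: "example.result-store")

    func store(_ value: Int, for key: String) {
        queue.sync { results[key] = value }
    }

    func value(for key: String) -> Int? {
        return queue.sync { results[key] }
    }

    var count: Int {
        return queue.sync { results.count }
    }
}
```

Because the queue runs one block at a time, no two accesses to `results` can overlap, at the cost of serializing reads as well as writes.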

Dan

_______________________________________________
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users

>> I'm writing some code where I'd like multiple threads to be writing to a common dictionary object.
>> Is there a recommended mechanism for doing this?
>
> Wrap mutexes* around dictionary accesses. If you have a lot more reads than writes, a read/write mutex will be more efficient.

Speaking of which, the CleanroomConcurrency project for Swift provides a ReadWriteCoordinator that provides similar functionality using a Grand Central Dispatch feature:

https://github.com/emaloney/CleanroomConcurrency/tree/master/Code#readwritecoordinator

The flip side of the thread-safe dictionary is the thread-local dictionary associated with each NSThread, which can be used in a Swifty fashion with:

https://github.com/emaloney/CleanroomConcurrency/tree/master/Code#threadlocalvalue

Disclosure: This open-source code is provided courtesy of my employer, Gilt Groupe, and ships in our "Gilt on TV" app for the new Apple TV.


I have one that I created for Swift on OS X which uses libdispatch to protect access to the underlying data:

import Foundation

/* dictionary that allows thread safe concurrent access */
final class ConcurrentDictionary<KeyType:Hashable,ValueType> : NSObject, SequenceType, DictionaryLiteralConvertible {
    /* internal dictionary */
    private var internalDictionary : [KeyType:ValueType]

    /* queue modifications using a barrier and allow concurrent read operations */
    private let queue = dispatch_queue_create( "dictionary access", DISPATCH_QUEUE_CONCURRENT )

    /* count of key-value pairs in this dictionary */
    var count : Int {
        var count = 0
        dispatch_sync(self.queue) { () -> Void in
            count = self.internalDictionary.count
        }
        return count
    }

    // safely get or set a copy of the internal dictionary value
    var dictionary : [KeyType:ValueType] {
        get {
            var dictionaryCopy : [KeyType:ValueType]?
            dispatch_sync(self.queue) { () -> Void in
                // read the backing store; reading self.dictionary here would recurse into this getter
                dictionaryCopy = self.internalDictionary
            }
            return dictionaryCopy!
        }

        set {
            let dictionaryCopy = newValue // create a local copy on the current thread
            // use a barrier so the replacement cannot overlap concurrent reads
            dispatch_barrier_async(self.queue) { () -> Void in
                self.internalDictionary = dictionaryCopy
            }
        }
    }

    /* initialize an empty dictionary */
    override convenience init() {
        self.init( dictionary: [KeyType:ValueType]() )
    }

    /* allow a concurrent dictionary to be initialized using a dictionary literal of form: [key1:value1, key2:value2, ...] */
    convenience required init(dictionaryLiteral elements: (KeyType, ValueType)...) {
        var dictionary = Dictionary<KeyType,ValueType>()

        for (key,value) in elements {
            dictionary[key] = value
        }

        self.init(dictionary: dictionary)
    }

    /* initialize a concurrent dictionary from a copy of a standard dictionary */
    init( dictionary: [KeyType:ValueType] ) {
        self.internalDictionary = dictionary
    }

    /* provide subscript accessors */
    subscript(key: KeyType) -> ValueType? {
        get {
            var value : ValueType?
            dispatch_sync(self.queue) { () -> Void in
                value = self.internalDictionary[key]
            }
            return value
        }

        set {
            setValue(newValue, forKey: key)
        }
    }

    /* assign the specified value to the specified key */
    func setValue(value: ValueType?, forKey key: KeyType) {
        // need to synchronize writes for consistent modifications
        dispatch_barrier_async(self.queue) { () -> Void in
            self.internalDictionary[key] = value
        }
    }

    /* remove the value associated with the specified key and return its value if any */
    func removeValueForKey(key: KeyType) -> ValueType? {
        var oldValue : ValueType? = nil
        // need to synchronize removal for consistent modifications
        dispatch_barrier_sync(self.queue) { () -> Void in
            oldValue = self.internalDictionary.removeValueForKey(key)
        }
        return oldValue
    }

    /* Generator of key-value pairs suitable for for-in loops */
    func generate() -> Dictionary<KeyType,ValueType>.Generator {
        var generator : Dictionary<KeyType,ValueType>.Generator!
        dispatch_sync(self.queue) { () -> Void in
            generator = self.internalDictionary.generate()
        }
        return generator
    }
}
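For readers on a current toolchain, the same idea (concurrent reads, writes submitted as barriers) can be sketched with the `DispatchQueue` spelling of the API. This is a trimmed-down illustration rather than a port of the class above: `BarrierDictionary` is a hypothetical name, and the `Sendable` constraints are an assumption added so the escaping barrier closure is accepted under stricter concurrency checking.

```swift
import Dispatch

/// Hypothetical minimal reader/barrier-writer wrapper (illustrative name).
final class BarrierDictionary<Key: Hashable & Sendable, Value: Sendable>: @unchecked Sendable {
    private var storage: [Key: Value] = [:]
    private let queue = DispatchQueue(label: "example.barrier-dictionary",
                                      attributes: .concurrent)

    subscript(key: Key) -> Value? {
        // Reads are plain sync blocks and may run alongside other reads.
        get { return queue.sync { storage[key] } }
        // Writes are barriers: they wait for in-flight reads and run alone.
        set { queue.async(flags: .barrier) { self.storage[key] = newValue } }
    }

    var count: Int {
        return queue.sync { storage.count }
    }
}
```

A read submitted after a barrier write waits for that write to finish, so a thread always sees its own earlier writes even though the setter returns immediately.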

···

_____________________________________________________________________________________
Thomas Pelaia II, Ph.D. | Applications Leader, Accelerator Physics, Research Accelerator Division
Spallation Neutron Source | Oak Ridge National Lab, Building 8600, MS-6462, Oak Ridge, TN 37831
phone: (865) 414-7960 | FaceTime: t6p@ornl.gov | fax: (865) 574-6617 | homepage: http://www.ornl.gov/~t6p


Please don't do this unless you have a very special use case. This is
inherently racy at a high level, even though it is safe in the language.
Consider two threads operating on a shared ConcurrentDictionary<Int, Int>:

d[42] += 1

Here's what the code compiles into:

var tmp = d.subscript_get(42)
tmp += 1
d.subscript_set(42, tmp)

The 'get' and 'set' operations are atomic, but the whole sequence isn't.
The results of operations that other threads execute during "tmp += 1" will
be overwritten by the following 'subscript_set'.
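One way to avoid the lost update is to make the whole read-modify-write a single operation on the owning type, so the lock is held across the get, the increment, and the set. The sketch below assumes current Swift with Foundation; `Counters`, `increment(_:)`, and `runIncrements(_:)` are hypothetical names for illustration.

```swift
import Foundation
import Dispatch

/// Hypothetical owner of the shared counts (illustrative name).
final class Counters: @unchecked Sendable {
    private var counts: [Int: Int] = [:]
    private let lock = NSLock()

    /// The get, the += 1, and the set all happen under one lock acquisition,
    /// so no other thread can slip in between them.
    func increment(_ key: Int) {
        lock.lock()
        defer { lock.unlock() }
        counts[key, default: 0] += 1
    }

    func value(for key: Int) -> Int {
        lock.lock()
        defer { lock.unlock() }
        return counts[key, default: 0]
    }
}

/// Runs `n` increments of key 42 from multiple threads and returns the total.
func runIncrements(_ n: Int) -> Int {
    let counters = Counters()
    DispatchQueue.concurrentPerform(iterations: n) { _ in
        counters.increment(42)
    }
    return counters.value(for: 42)
}
```

With separately atomic `get` and `set` calls, the same loop could finish with a total below `n`; here every increment is counted.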

Dmitri

···

On Thu, Dec 10, 2015 at 9:24 AM, Pelaia II, Tom via swift-users < swift-users@swift.org> wrote:

/* provide subscript accessors */
subscript(key: KeyType) -> ValueType? {
get {
var value : ValueType?
dispatch_sync(self.queue) { () -> Void in
value = self.internalDictionary[key]
}
return value
}

set {
setValue(newValue, forKey: key)
}
}

--
main(i,j){for(i=2;;i++){for(j=2;j<i;j++){if(!(i%j)){j=0;break;}}if
(j){printf("%d\n",i);}}} /*Dmitri Gribenko <gribozavr@gmail.com>*/

But isn’t that really a problem with that use case rather than the concurrent dictionary itself? There are a lot of bad things one can do with almost any code. For example, one can naively use the standard library dictionary in subtle, unsafe ways with concurrency and get into trouble.

Of course “+=” would be a bad idea to do in this context. Indeed, I am using this concurrent dictionary in a special use case, and I would never do the “+=” operation as you suggest. It’s not even relevant in my code. I am just using this concurrent dictionary to keep track of concurrent events being completed and posted from different threads. When the event completes it gets put into the dictionary with the value being the immutable result. Alternatively, I could have put the concurrency code outside of the dictionary, but in my case that made for awkward access and opened the possibility of accidentally referencing the dictionary directly in an unsafe way. The safest option for me was to create the concurrent dictionary and the subscript allowed the code to be easier to read and write.


The concurrency considerations and safety conditions for Dictionary
are the same as for Int, so I don't think there is anything subtle
here.

Dmitri


> But isn’t that really a problem with that use case rather than the concurrent dictionary itself?

It’s a problem with _most_ use cases of a concurrent dictionary, unfortunately. The values in such a dictionary can be read and written atomically, but that’s not sufficient for anything that wants to use multiple values in a coordinated way, or update a value, etc. etc.
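To make the "multiple values in a coordinated way" case concrete, here is a small sketch in current Swift. `Ledger`, `transfer(_:from:to:)`, `balance(of:)`, and `total()` are hypothetical names for illustration. The point is that the invariant (the sum of all balances) spans two keys, so the lock has to cover both updates together, which per-key atomic access cannot provide.

```swift
import Foundation
import Dispatch

/// Hypothetical type whose invariant spans several dictionary entries.
final class Ledger: @unchecked Sendable {
    private var balances: [String: Int]
    private let lock = NSLock()

    init(balances: [String: Int]) {
        self.balances = balances
    }

    /// Debits one key and credits another under a single lock acquisition,
    /// so no reader can observe the money "in flight".
    @discardableResult
    func transfer(_ amount: Int, from source: String, to destination: String) -> Bool {
        lock.lock()
        defer { lock.unlock() }
        guard let available = balances[source], available >= amount else {
            return false
        }
        balances[source] = available - amount
        balances[destination, default: 0] += amount
        return true
    }

    func balance(of account: String) -> Int {
        lock.lock()
        defer { lock.unlock() }
        return balances[account, default: 0]
    }

    func total() -> Int {
        lock.lock()
        defer { lock.unlock() }
        return balances.values.reduce(0, +)
    }
}
```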

> It’s not even relevant in my code. I am just using this concurrent dictionary to keep track of concurrent events being completed and posted from different threads. When the event completes it gets put into the dictionary with the value being the immutable result.

That sounds like a case where a simple concurrent dictionary would be an appropriate data structure. So go ahead and write one for your own needs. All we’re saying is that a class like this isn’t commonly useful enough to go into a library.

—Jens


And too easy to misuse if provided.

Dmitri


I would do something like:

    import Foundation

    class Task { // Must be a class to prevent Swift copy semantics eliminating the result
        private static let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)
        private let group = dispatch_group_create()
        private var _result: String?

        init(name: String) {
            dispatch_group_async(group, Task.queue) {
                self._result = name // The asynchronous task!
            }
        }

        var result: String { // Provide safe access to result
            dispatch_group_wait(group, DISPATCH_TIME_FOREVER) // Block until task finished
            return _result!
        }
    }

    var tasks = [String : Task]()
    let names = ["One", "Two"]

    names.forEach {
        tasks[$0] = Task(name: $0)
    }

    tasks.map { (_, task) in // Prints [One, Two] in playground (dictionary order is not guaranteed)
        task.result
    }


--
  -- Howard.