Thread safety of weak properties

Mike_Ash · December 11, 2015, 2:50am

Hello, world!

First, congratulations on the whole open source thing. I'm really pleased to see how the team set it up and how well it's going. Blew away my expectations.

Anyway, on to the thing.

I was looking through the standard library's implementation of weak references, as one does, and noticed that loading a weak reference is not thread safe. I'm not sure if this is by intent or by omission. On one hand, loading a weak reference *looks like* just reading a stored property, which should be thread safe if not done concurrently with any writes. On the other hand, loading a weak reference is *actually* a potential mutation of that stored property, from its original content to nil, which one would not expect to be thread safe. It's clear that weak references are supposed to be thread safe with respect to the target object being destroyed, but less clear whether they're supposed to be thread safe for concurrent accesses of the same weak reference.

Is there an explicit intent as to whether loading a weak reference should be thread safe or not?

The root of the problem (or not-problem, if it's supposed to be this way) is in the swift_weakLoadStrong function in HeapObject.cpp. If two threads simultaneously enter this function with the same weak reference and object->refCount.isDeallocating() is true, the threads will race. This can result in calling swift_unownedRelease() twice resulting in a double-free or worse, or it could result in one thread calling swift_tryRetain() on a deallocated object.

If making this thread safe is desired, there are a couple of potential fixes.

One fix would be to just remove these two lines:

swift_unownedRelease(object);
ref->Value = nullptr;

This would cause the weak reference to never become nil in memory, and the target object's memory to stay allocated until the weak reference is destroyed or reassigned, but it would eliminate the potential for a race.

Another potential fix would be to add a lock to make this function thread safe. This could be done with low overhead by stealing a bit in ref->Value and using that bit as a spinlock. It pains me to think of adding a lock to this lovely lock-free code, but at least it would be specific to the individual reference.

I haven't filed a bug yet, since I didn't know if it's supposed to be like this or not. If a fix is desired, I'd be happy to file, and maybe make the fix too.

In case it's helpful, here is my test code which demonstrates the crash:

import Foundation

class Target {}

class WeakHolder {
weak var weak: Target?
}

for i in 0..<1000000 {
    print(i)
    let holder = WeakHolder()
    holder.weak = Target()
    dispatch_async(dispatch_get_global_queue(0, 0), {
        let _ = holder.weak
    })
    dispatch_async(dispatch_get_global_queue(0, 0), {
        let _ = holder.weak
    })
}

Mike

John_McCall · December 11, 2015, 3:55am

Hello, world!

First, congratulations on the whole open source thing. I'm really pleased to see how the team set it up and how well it's going. Blew away my expectations.

Anyway, on to the thing.

I was looking through the standard library's implementation of weak references, as one does, and noticed that loading a weak reference is not thread safe. I'm not sure if this is by intent or by omission. On one hand, loading a weak reference *looks like* just reading a stored property, which should be thread safe if not done concurrently with any writes. On the other hand, loading a weak reference is *actually* a potential mutation of that stored property, from its original content to nil, which one would not expect to be thread safe. It's clear that weak references are supposed to be thread safe with respect to the target object being destroyed, but less clear whether they're supposed to be thread safe for concurrent accesses of the same weak reference.

Is there an explicit intent as to whether loading a weak reference should be thread safe or not?

The intent is that weak references do not need to be safe against read/write and write/write races. They do need to be safe against read/destroy and write/destroy races (by “destroy”, I mean destruction of the object, not the weak reference). I agree that they should also be safe against read/read races.

The root of the problem (or not-problem, if it's supposed to be this way) is in the swift_weakLoadStrong function in HeapObject.cpp. If two threads simultaneously enter this function with the same weak reference and object->refCount.isDeallocating() is true, the threads will race. This can result in calling swift_unownedRelease() twice resulting in a double-free or worse, or it could result in one thread calling swift_tryRetain() on a deallocated object.

Yes, you’re absolutely right, this is a bug in the current implementation; good catch!

If making this thread safe is desired, there are a couple of potential fixes.

One fix would be to just remove these two lines:

swift_unownedRelease(object);
ref->Value = nullptr;

This would cause the weak reference to never become nil in memory, and the target object's memory to stay allocated until the weak reference is destroyed or reassigned, but it would eliminate the potential for a race.

Another potential fix would be to add a lock to make this function thread safe. This could be done with low overhead by stealing a bit in ref->Value and using that bit as a spinlock. It pains me to think of adding a lock to this lovely lock-free code, but at least it would be specific to the individual reference.

I haven't filed a bug yet, since I didn't know if it's supposed to be like this or not. If a fix is desired, I'd be happy to file, and maybe make the fix too.

That would be great.

There’s actually a higher-level semantic bug in this code, which is that we really do want weak references to make the same deallocation guarantees as Objective-C if possible. That is, while we’re comfortable with the idea of an unowned reference pinning some memory until the unowned reference is destroyed, weak references should really allow the object’s memory to be more-or-less immediately returned to the general pool. It was always our intention to revisit the implementation and do this.

It would be acceptable to increase the inline size of weak references if that makes this more performant.

John.

···

On Dec 10, 2015, at 6:50 PM, Mike Ash via swift-dev <swift-dev@swift.org> wrote:

Mike_Ash · December 11, 2015, 3:00pm

The intent is that weak references do not need to be safe against read/write and write/write races. They do need to be safe against read/destroy and write/destroy races (by “destroy”, I mean destruction of the object, not the weak reference). I agree that they should also be safe against read/read races.

The root of the problem (or not-problem, if it's supposed to be this way) is in the swift_weakLoadStrong function in HeapObject.cpp. If two threads simultaneously enter this function with the same weak reference and object->refCount.isDeallocating() is true, the threads will race. This can result in calling swift_unownedRelease() twice resulting in a double-free or worse, or it could result in one thread calling swift_tryRetain() on a deallocated object.

Yes, you’re absolutely right, this is a bug in the current implementation; good catch!

OK! I filed the bug:

github.com/apple/swift

[SR-192] Weak properties are not thread safe when reading

opened 02:22PM - 11 Dec 15 UTC

closed 04:48AM - 05 May 16 UTC

mikeash

bug runtime

| | | |------------------|-----------------|… |Previous ID | SR-192 | |Radar | None | |Original Reporter | @mikeash | |Type | Bug | |Status | Closed | |Resolution | Done | <details> <summary>Additional Detail from JIRA</summary> | | | |------------------|-----------------| |Votes | 8 | |Component/s | | |Labels | Bug, Runtime | |Assignee | @glessard | |Priority | Medium | md5: e84790d9452452d03d108981ba9ce226 </details> **Issue Description:** The current implementation of weak references has a race condition when multiple threads read the same weak reference to a deallocating object simultaneously. Since this is semantically read-only access (even if writing occurs under the hood), this should be made safe. The bug itself lies in the `swift_weakLoadStrong` function. If two threads simultaneously enter this function with the same weak reference and object-\>refCount.isDeallocating() is true, the threads will race. This can result in calling swift_unownedRelease() twice resulting in a double-free or worse, or it could result in one thread calling swift_tryRetain() on a deallocated object. Here is a test case which demonstrates the problem: ``` java import Foundation class Target {} class WeakHolder { weak var weak: Target? } for i in 0..<1000000 { print(i) let holder = WeakHolder() holder.weak = Target() dispatch_async(dispatch_get_global_queue(0, 0), { let _ = holder.weak }) dispatch_async(dispatch_get_global_queue(0, 0), { let _ = holder.weak }) } ```

There’s actually a higher-level semantic bug in this code, which is that we really do want weak references to make the same deallocation guarantees as Objective-C if possible. That is, while we’re comfortable with the idea of an unowned reference pinning some memory until the unowned reference is destroyed, weak references should really allow the object’s memory to be more-or-less immediately returned to the general pool. It was always our intention to revisit the implementation and do this.

It would be acceptable to increase the inline size of weak references if that makes this more performant.

It's interesting that you say that. I discovered this bug while writing up the weak reference implementation for a blog article. (Which I just posted here: https://mikeash.com/pyblog/friday-qa-2015-12-11-swift-weak-references.html\) My conclusion is that the Swift Way of leaving an object husk lying around in memory for a while is superior to the Objective-C approach of eagerly deallocating objects and zeroing out references. The memory impact is small, and the performance improvement is nice. Zeroing weak references are always good to have but the cost of accessing them in Objective-C always bugged me a bit. (And I say this having written my own implementation that works exactly the same way.)

My own thinking is that object instances are not usually very big. As long as you can destroy the external resources it holds such as arrays and dictionaries and trees of other objects, the cost is small, just one allocation of some dozens of bytes. You have at most one such allocation per weak reference to a dead object, and less than that if there are multiple weak references to the same dead object. (I hope I am understanding the implementation correctly that it does in fact tear down the object completely when the last strong reference is released, and just keeps the instance's own memory around for the weak references.)

Increasing the inline size of weak references opens up the possibilities a bit. I can think of at least four fixes:

1. Delete the zeroing, and otherwise leave things as-is. This extends the life of the object husk (by the way, is there an official term?) possibly indefinitely. This seems fine to me but it sounds like you may disagree. I will, of course, defer to your judgment on that.

2. Add an activity count to the weak reference. WeakReference would become something like struct WeakReference { HeapObject *Value; unsigned long count; }. swift_weakLoadStrong would increment the count when loading the pointer, and decrement the count when done. Zeroing would only happen when decrementing the count to zero. This would require a 16-byte compare-and-swap operation.

3. Borrow a bit from the weak pointer to implement a spinlock. This is really a special case of (2), with the activity count being capped at 1 and additional activity blocking. In fact, you could even do a hybrid approach by borrowing more bits. (I think it could safely steal up to 20 bits with current 64-bit architectures. This may not be wise. As long as targets are pointer-aligned you can safely steal 2/3 bits.)

4. Add additional bookkeeping and synchronization to allow eager zeroing and deallocation of object husks more like the Objective-C weak pointer implementation. This would require a list of weak references to a given object to be maintained somewhere.

I personally would rank my preference as 1, 3, 2, 4. You guys have *slightly* more experience with this code than I do, though, so I put little weight on my own preference here. What do you think?

Mike

···

On Dec 10, 2015, at 10:55 PM, John McCall <rjmccall@apple.com> wrote:

Lily_Ballard · December 12, 2015, 8:01am

I don't mind un-zeroed weak references keeping an object husk around, but if I read a weak reference and get nil back, I expect the reference to now be zeroed and to not prolong the lifetime of the husk. It's probably not that big a deal, but it would be surprising if it behaved differently.

I think swift_weakCopyInit() and swift_weakTakeInit() have to also be updated for whatever change is done to swift_weakLoadStrong() as those two functions are conceptually just reads of the src pointer, but like swift_weakLoadStrong() they zero out and release the object if it's deallocating.

Another possible fix is to just atomically load/store the Value pointer itself (assuming all platforms Swift runs on supports lock-free atomic pointers). This way if the object is deallocating, it would attempt an atomic CAS on the pointer to null it out, and only release the object if it wasn't already null. This means all writes to weak pointers now require an atomic write, but the benefit over the activity count / spinlock is reads don't have to perform an atomic write while the object is still alive, only an atomic read (although I'm not sure offhand if there's any practical performance difference there). And with this approach, it now becomes easy to make read/write and write/write safe. Whether this approach is worth doing depends on how many weak writes there are compared to weak reads (and offhand I don't have any idea there, except I do know I see an awful lot of code that looks like `weakRef?.foo(); weakRef?.bar(); weakRef?.baz()`).

-Kevin Ballard

···

On Fri, Dec 11, 2015, at 07:00 AM, Mike Ash via swift-dev wrote:

Increasing the inline size of weak references opens up the possibilities a bit. I can think of at least four fixes:

1. Delete the zeroing, and otherwise leave things as-is. This extends the life of the object husk (by the way, is there an official term?) possibly indefinitely. This seems fine to me but it sounds like you may disagree. I will, of course, defer to your judgment on that.

Chris_Lattner · December 13, 2015, 3:04am

#3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.

-Chris

···

On Dec 11, 2015, at 7:00 AM, Mike Ash via swift-dev <swift-dev@swift.org> wrote:

3. Borrow a bit from the weak pointer to implement a spinlock. This is really a special case of (2), with the activity count being capped at 1 and additional activity blocking. In fact, you could even do a hybrid approach by borrowing more bits. (I think it could safely steal up to 20 bits with current 64-bit architectures. This may not be wise. As long as targets are pointer-aligned you can safely steal 2/3 bits.)

Lily_Ballard · December 12, 2015, 8:29am

I take it back, this is still unsafe because the object can get free'd while another thread is in the process of checking its refcount.

At the moment I'm leaning towards the activity count idea, though I'm not sure why you need a 16-byte CAS for that.

-Kevin

···

On Sat, Dec 12, 2015, at 12:01 AM, Kevin Ballard wrote:

Another possible fix is to just atomically load/store the Value pointer itself (assuming all platforms Swift runs on supports lock-free atomic pointers). This way if the object is deallocating, it would attempt an atomic CAS on the pointer to null it out, and only release the object if it wasn't already null. This means all writes to weak pointers now require an atomic write, but the benefit over the activity count / spinlock is reads don't have to perform an atomic write while the object is still alive, only an atomic read (although I'm not sure offhand if there's any practical performance difference there). And with this approach, it now becomes easy to make read/write and write/write safe. Whether this approach is worth doing depends on how many weak writes there are compared to weak reads (and offhand I don't have any idea there, except I do know I see an awful lot of code that looks like `weakRef?.foo(); weakRef?.bar(); weakRef?.baz()`).

John_McCall · December 14, 2015, 5:47pm

#3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.

Spin locks are, unfortunately, illegal on iOS, which does not guarantee progress in the face of priority inversion.

John.

···

On Dec 12, 2015, at 7:04 PM, Chris Lattner <clattner@apple.com> wrote:

-Chris

On Dec 11, 2015, at 7:00 AM, Mike Ash via swift-dev <swift-dev@swift.org> wrote:

3. Borrow a bit from the weak pointer to implement a spinlock. This is really a special case of (2), with the activity count being capped at 1 and additional activity blocking. In fact, you could even do a hybrid approach by borrowing more bits. (I think it could safely steal up to 20 bits with current 64-bit architectures. This may not be wise. As long as targets are pointer-aligned you can safely steal 2/3 bits.)

Mike_Ash · December 12, 2015, 5:29pm

Yes, I initially thought of the CAS idea as well, but it doesn't work, like you say.

For the activity count, I was thinking that you'd need to update it atomically along with the weak pointer. But thinking more, I believe that's unnecessary. In fact, the weak pointer doesn't even need to be updated at all.

Here's how I envision this one working. Let's say the weak pointer itself looks like `(ptr, count)`. Define a special value for `count` that means "the weak pointer has been zeroed out." 0xFFFFFFFFFFFFFFFF would be a good candidate. Then reading looks like:

read(ref):
  (obj, count) = *ref
  if count == 0xFFFFFFFFFFFFFFFF: return null
  CAS(ref->count, count, count + 1)
  if failed: restart from the top

  if obj.isDeallocating():
    CAS(ref->count, 1, 0xFFFFFFFFFFFFFFFF)
    if success: weakRelease(obj)
    return null
  else:
    result = tryRetain(obj)
    atomic_decrement(ref->count)
    return result

If an atomic increment is faster than a compare-and-swap loop (probably the case on x86-64?) then you could modify this to say that the top bit of count means "reference is null," then read can safely do an atomic increment and check the top bit of the result to know how to proceed.

For what it's worth, I implemented stealing the bottom bit for a spinlock yesterday and it's pretty nice and easy. I'm a bit conflicted on cutting down on concurrency but preserving weak references as a single pointer, versus improving concurrency but taking up more memory to store them. Concurrency is probably more important.

Mike

···

On Dec 12, 2015, at 3:29 AM, Kevin Ballard via swift-dev <swift-dev@swift.org> wrote:

On Sat, Dec 12, 2015, at 12:01 AM, Kevin Ballard wrote:

Another possible fix is to just atomically load/store the Value pointer itself (assuming all platforms Swift runs on supports lock-free atomic pointers). This way if the object is deallocating, it would attempt an atomic CAS on the pointer to null it out, and only release the object if it wasn't already null. This means all writes to weak pointers now require an atomic write, but the benefit over the activity count / spinlock is reads don't have to perform an atomic write while the object is still alive, only an atomic read (although I'm not sure offhand if there's any practical performance difference there). And with this approach, it now becomes easy to make read/write and write/write safe. Whether this approach is worth doing depends on how many weak writes there are compared to weak reads (and offhand I don't have any idea there, except I do know I see an awful lot of code that looks like `weakRef?.foo(); weakRef?.bar(); weakRef?.baz()`).

I take it back, this is still unsafe because the object can get free'd while another thread is in the process of checking its refcount.

At the moment I'm leaning towards the activity count idea, though I'm not sure why you need a 16-byte CAS for that.

gparker42 · December 14, 2015, 8:19pm

There is a spinlock algorithm that does work (in practice if not in theory), but it requires a full word of storage instead of a single bit.

···

On Dec 14, 2015, at 9:47 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:

On Dec 12, 2015, at 7:04 PM, Chris Lattner <clattner@apple.com> wrote:
#3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.

Spin locks are, unfortunately, illegal on iOS, which does not guarantee progress in the face of priority inversion.

--
Greg Parker gparker@apple.com Runtime Wrangler

Lily_Ballard · December 15, 2015, 3:26am

Is that what OSSpinLock uses?

-Kevin Ballard

···

On Mon, Dec 14, 2015, at 12:19 PM, Greg Parker via swift-dev wrote:

> On Dec 14, 2015, at 9:47 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:
>
>> On Dec 12, 2015, at 7:04 PM, Chris Lattner <clattner@apple.com> wrote:
>> #3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.
>
> Spin locks are, unfortunately, illegal on iOS, which does not guarantee progress in the face of priority inversion.

There is a spinlock algorithm that does work (in practice if not in theory), but it requires a full word of storage instead of a single bit.

Mike_Ash · December 15, 2015, 3:51am

Do you have a pointer (unintentional pun, oops) to this algorithm?

In this case, if we're going to dedicate a whole word to it, then we might as well go with the activity count implementation, but I'd be curious to read more just the same.

Mike

···

On Dec 14, 2015, at 3:19 PM, Greg Parker via swift-dev <swift-dev@swift.org> wrote:

On Dec 14, 2015, at 9:47 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:

On Dec 12, 2015, at 7:04 PM, Chris Lattner <clattner@apple.com> wrote:
#3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.

Spin locks are, unfortunately, illegal on iOS, which does not guarantee progress in the face of priority inversion.

There is a spinlock algorithm that does work (in practice if not in theory), but it requires a full word of storage instead of a single bit.

gparker42 · December 15, 2015, 3:34am

It does not. OSSpinLock is unsafe unless you can guarantee that all users have the same priority.

···

On Dec 14, 2015, at 7:26 PM, Kevin Ballard via swift-dev <swift-dev@swift.org> wrote:

On Mon, Dec 14, 2015, at 12:19 PM, Greg Parker via swift-dev wrote:

On Dec 14, 2015, at 9:47 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:

On Dec 12, 2015, at 7:04 PM, Chris Lattner <clattner@apple.com> wrote:
#3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.

Spin locks are, unfortunately, illegal on iOS, which does not guarantee progress in the face of priority inversion.

There is a spinlock algorithm that does work (in practice if not in theory), but it requires a full word of storage instead of a single bit.

Is that what OSSpinLock uses?

--
Greg Parker gparker@apple.com Runtime Wrangler

gparker42 · December 15, 2015, 9:38pm

The iOS scheduler maintains several different priority levels / QoS classes: background, utility, default, user-initiated, user-interactive. If any thread in a higher class is runnable then it will always run before every thread in lower classes. A thread's priority will never decay down into a lower class. (I am told that I/O throttling has similar effects, but I don't know the details there.)

This breaks naïve spinlocks like OSSpinLock. If a lower priority thread acquires the lock and is scheduled out, and then enough high-priority threads spin on the lock, then the low-priority thread will be starved and will never run again.

This is not a theoretical problem. libobjc saw dozens of livelocks against its internal spinlocks until we stopped using OSSpinLock.

One solution is to use truly unbounded backoff. This prevents permanent livelock (assuming the rest of the system ever goes quiescent), but can still block a high-priority thread for tens of seconds depending on system load.

Another solution is to use a handoff lock algorithm. This is what libobjc does now. The lock owner stores its thread ID in the lock. Each lock waiter yields to the owner thread specifically, donating its priority and resolving the inversion. This scheme has theoretical holes when multiple locks are involved, but in practice we haven't seen any problems.

As far as I know the spinlock that libobjc now uses is not API. We can't change OSSpinLock for binary compatibility reasons (the locked state is no longer 1 but some existing code assumes that).

···

On Dec 14, 2015, at 7:51 PM, Mike Ash <mike@mikeash.com> wrote:

On Dec 14, 2015, at 3:19 PM, Greg Parker via swift-dev <swift-dev@swift.org> wrote:

On Dec 14, 2015, at 9:47 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:

On Dec 12, 2015, at 7:04 PM, Chris Lattner <clattner@apple.com> wrote:
#3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.

Spin locks are, unfortunately, illegal on iOS, which does not guarantee progress in the face of priority inversion.

There is a spinlock algorithm that does work (in practice if not in theory), but it requires a full word of storage instead of a single bit.

Do you have a pointer (unintentional pun, oops) to this algorithm?

In this case, if we're going to dedicate a whole word to it, then we might as well go with the activity count implementation, but I'd be curious to read more just the same.

--
Greg Parker gparker@apple.com Runtime Wrangler

Lily_Ballard · December 15, 2015, 3:39am

Hmm, that's pretty unfortunate to hear. I've written code with spinlocks on iOS, and I imagine I'm not the only one. Does the system provide an implementation of this "safe in practice" spinlock that's visible to third-party devs?

-Kevin Ballard

···

On Mon, Dec 14, 2015, at 07:34 PM, Greg Parker wrote:

> On Dec 14, 2015, at 7:26 PM, Kevin Ballard via swift-dev <swift-dev@swift.org> wrote:
>
>> On Mon, Dec 14, 2015, at 12:19 PM, Greg Parker via swift-dev wrote:
>>
>>> On Dec 14, 2015, at 9:47 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:
>>>
>>>> On Dec 12, 2015, at 7:04 PM, Chris Lattner <clattner@apple.com> wrote:
>>>> #3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.
>>>
>>> Spin locks are, unfortunately, illegal on iOS, which does not guarantee progress in the face of priority inversion.
>>
>> There is a spinlock algorithm that does work (in practice if not in theory), but it requires a full word of storage instead of a single bit.
>
> Is that what OSSpinLock uses?

It does not. OSSpinLock is unsafe unless you can guarantee that all users have the same priority.

Lily_Ballard · December 16, 2015, 1:32am

It's not. I checked earlier today against the copy of the objc runtime I have (from opensource.apple.com; my copy is slightly out of date, at version 647, but it's new enough). The runtime defines a type spinlock_t that's backed by something called os_lock_handoff_s. It also imports a header <os/lock_private.h>, which is presumably where this type is defined. I already updated my radar to suggest that perhaps this type would be a good candidate to expose.

Incidentally, if I wanted to implement my own spinlock like this, how can you yield to a specific thread? I'm only aware of pthread_yield_np(), which doesn't take a target thread.

-Kevin Ballard

···

On Tue, Dec 15, 2015, at 01:38 PM, Greg Parker via swift-dev wrote:

Another solution is to use a handoff lock algorithm. This is what libobjc does now. The lock owner stores its thread ID in the lock. Each lock waiter yields to the owner thread specifically, donating its priority and resolving the inversion. This scheme has theoretical holes when multiple locks are involved, but in practice we haven't seen any problems.

As far as I know the spinlock that libobjc now uses is not API.

gparker42 · December 15, 2015, 3:48am

Not that I know of. You should file a bug report.

···

On Dec 14, 2015, at 7:39 PM, Kevin Ballard <kevin@sb.org> wrote:

On Mon, Dec 14, 2015, at 07:34 PM, Greg Parker wrote:

On Dec 14, 2015, at 7:26 PM, Kevin Ballard via swift-dev <swift-dev@swift.org> wrote:

On Mon, Dec 14, 2015, at 12:19 PM, Greg Parker via swift-dev wrote:

On Dec 14, 2015, at 9:47 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:

On Dec 12, 2015, at 7:04 PM, Chris Lattner <clattner@apple.com> wrote:
#3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.

Spin locks are, unfortunately, illegal on iOS, which does not guarantee progress in the face of priority inversion.

There is a spinlock algorithm that does work (in practice if not in theory), but it requires a full word of storage instead of a single bit.

Is that what OSSpinLock uses?

It does not. OSSpinLock is unsafe unless you can guarantee that all users have the same priority.

Hmm, that's pretty unfortunate to hear. I've written code with spinlocks on iOS, and I imagine I'm not the only one. Does the system provide an implementation of this "safe in practice" spinlock that's visible to third-party devs?

--
Greg Parker gparker@apple.com Runtime Wrangler

gparker42 · December 16, 2015, 2:01am

It uses Mach's thread_switch() and passes a Mach thread port. It also uses a private option flag; I don't know if any of the public ones are good enough to solve the problem.

···

On Dec 15, 2015, at 5:32 PM, Kevin Ballard <kevin@sb.org> wrote:

On Tue, Dec 15, 2015, at 01:38 PM, Greg Parker via swift-dev wrote:

Another solution is to use a handoff lock algorithm. This is what libobjc does now. The lock owner stores its thread ID in the lock. Each lock waiter yields to the owner thread specifically, donating its priority and resolving the inversion. This scheme has theoretical holes when multiple locks are involved, but in practice we haven't seen any problems.

As far as I know the spinlock that libobjc now uses is not API.

It's not. I checked earlier today against the copy of the objc runtime I have (from opensource.apple.com; my copy is slightly out of date, at version 647, but it's new enough). The runtime defines a type spinlock_t that's backed by something called os_lock_handoff_s. It also imports a header <os/lock_private.h>, which is presumably where this type is defined. I already updated my radar to suggest that perhaps this type would be a good candidate to expose.

Incidentally, if I wanted to implement my own spinlock like this, how can you yield to a specific thread? I'm only aware of pthread_yield_np(), which doesn't take a target thread.

--
Greg Parker gparker@apple.com Runtime Wrangler

Lily_Ballard · December 15, 2015, 6:41am

Filed as rdar://problem/23896366.

-Kevin Ballard

···

On Mon, Dec 14, 2015, at 07:48 PM, Greg Parker wrote:

> On Dec 14, 2015, at 7:39 PM, Kevin Ballard <kevin@sb.org> wrote:
>
>> On Mon, Dec 14, 2015, at 07:34 PM, Greg Parker wrote:
>>
>>> On Dec 14, 2015, at 7:26 PM, Kevin Ballard via swift-dev <swift-dev@swift.org> wrote:
>>>
>>>> On Mon, Dec 14, 2015, at 12:19 PM, Greg Parker via swift-dev wrote:
>>>>
>>>>> On Dec 14, 2015, at 9:47 AM, John McCall via swift-dev <swift-dev@swift.org> wrote:
>>>>>
>>>>>> On Dec 12, 2015, at 7:04 PM, Chris Lattner <clattner@apple.com> wrote:
>>>>>> #3 sounds like a great approach to me. I agree with Kevin that if we keep the object husk approach that any use of a weak pointer that returns nil should drop any reference to a husk.
>>>>>
>>>>> Spin locks are, unfortunately, illegal on iOS, which does not guarantee progress in the face of priority inversion.
>>>>
>>>> There is a spinlock algorithm that does work (in practice if not in theory), but it requires a full word of storage instead of a single bit.
>>>
>>> Is that what OSSpinLock uses?
>>
>> It does not. OSSpinLock is unsafe unless you can guarantee that all users have the same priority.
>
> Hmm, that's pretty unfortunate to hear. I've written code with spinlocks on iOS, and I imagine I'm not the only one. Does the system provide an implementation of this "safe in practice" spinlock that's visible to third-party devs?

Not that I know of. You should file a bug report.

John_McCall · December 16, 2015, 2:12am

So, just to complete the loop here: absent Darwin granting public and backwards-compatible access to an internal API, we need to write this in a way that falls back on using a heavyweight lock in the presence of contention. I’m fine with that being a global lock.

I’m also fine with using a spin lock on other platforms.

Note that Darwin platforms need this to interoperate with the unknownWeak entrypoints.

John.

···

On Dec 15, 2015, at 6:01 PM, Greg Parker via swift-dev <swift-dev@swift.org> wrote:

On Dec 15, 2015, at 5:32 PM, Kevin Ballard <kevin@sb.org> wrote:

On Tue, Dec 15, 2015, at 01:38 PM, Greg Parker via swift-dev wrote:

Another solution is to use a handoff lock algorithm. This is what libobjc does now. The lock owner stores its thread ID in the lock. Each lock waiter yields to the owner thread specifically, donating its priority and resolving the inversion. This scheme has theoretical holes when multiple locks are involved, but in practice we haven't seen any problems.

As far as I know the spinlock that libobjc now uses is not API.

It's not. I checked earlier today against the copy of the objc runtime I have (from opensource.apple.com; my copy is slightly out of date, at version 647, but it's new enough). The runtime defines a type spinlock_t that's backed by something called os_lock_handoff_s. It also imports a header <os/lock_private.h>, which is presumably where this type is defined. I already updated my radar to suggest that perhaps this type would be a good candidate to expose.

Incidentally, if I wanted to implement my own spinlock like this, how can you yield to a specific thread? I'm only aware of pthread_yield_np(), which doesn't take a target thread.

It uses Mach's thread_switch() and passes a Mach thread port. It also uses a private option flag; I don't know if any of the public ones are good enough to solve the problem.

Lily_Ballard · December 16, 2015, 2:27am

So, just to complete the loop here: absent Darwin granting public and backwards-compatible access to an internal API, we need to write this in a way that falls back on using a heavyweight lock in the presence of contention. I’m fine with that being a global lock.

Well, no, the activity count idea isn't a spinlock and is perfectly safe. It's actually basically a retain count, but it's protecting write access to the field rather than protecting an object.

Note that Darwin platforms need this to interoperate with the unknownWeak entrypoints.

What does that interoperation look like?

-Kevin Ballard

···

On Tue, Dec 15, 2015, at 06:12 PM, John McCall wrote: