Implementation details of new Mutex type

ph1ps · May 19, 2024, 6:16pm

Would someone who understands the details of the Mutex implementation mind helping me understand what _Cell is doing and why it is being used?

What exactly is this type doing? I understand that os_unfair_lock_lock takes a pointer to an unfair lock structure but what is the difference of using _Cell (and I guess Builtin.addressOfRawLayout) instead of UnsafeMutablePointer<os_unfair_lock> which has been initialized like this:

let lock = UnsafeMutablePointer<os_unfair_lock>.allocate(capacity: 1)
lock.initialize(to: .init())

Why is _Cell also used to wrap Value in the Mutex struct? What is the difference between passing the pointee of the address passed and not just &value like this (simplified):

func withLock<R>(_ body: (inout Value) throws -> R) rethrows -> R {
    os_unfair_lock_lock(lock)
    defer { os_unfair_lock_unlock(lock) }
    return try body(&value)
}

Could a not-Swift-stdlib implementation of _Cell (or a wrapper of os_unfair_lock) even be done correctly, since this implementation is using Builtin.addressOfRawLayout(self) which is not available to normal developers (as far as I know)?

Nobody1707 · May 19, 2024, 7:08pm

Swift structs don't have stable addresses by default, so I would assume _Cell and Builtin.addressOfRawLayout() are required to get a stable address to self.

John_McCall · May 19, 2024, 7:28pm

We definitely want to provide those as general-purpose tools, but we need to figure out the exact design to expose.

ph1ps · May 19, 2024, 8:14pm

Is this documented somewhere? Also, in which case does the address change?

Alejandro · May 19, 2024, 9:17pm

I'd be glad to answer these questions.

ph1ps:

What exactly is this type doing? I understand that os_unfair_lock_lock takes a pointer to an unfair lock structure but what is the difference of using _Cell (and I guess Builtin.addressOfRawLayout) instead of UnsafeMutablePointer<os_unfair_lock> which has been initialized like this:
let lock = UnsafeMutablePointer<os_unfair_lock>.allocate(capacity: 1)
lock.initialize(to: .init())

So Mutex stores its value inline instead of using the usual out-of-line strategy. Swift really didn't give us the tools necessary to provide these sorts of types because it wanted to copy everything all the time (os_unfair_lock is just a UInt32, so passing a value of this to a function would just copy it). We could get pretty close with classes because a value is just the reference itself, but referencing ivars inside a class introduces runtime exclusivity checks that we have to avoid. So typically, the only safe way to do this in Swift was manually allocating the storage with a pointer like you have or using something like ManagedBuffer to give us some control of the tail-allocated storage.
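
For readers who have not seen the out-of-line approach spelled out, here is a minimal sketch of it, assuming a Darwin platform. BoxedLock is a hypothetical name, not a standard library type. Both the lock and the protected value live in manually allocated storage owned by a class, so their addresses never change, and the value is reached through a pointer rather than a mutable ivar.

```swift
import os  // Darwin only: provides os_unfair_lock

// Hypothetical wrapper for illustration. The class owns two heap
// allocations, so the lock and the value keep a fixed address for the
// lifetime of the instance.
final class BoxedLock<Value> {
    // `let` pointers: reading them needs no runtime exclusivity check.
    private let lock: UnsafeMutablePointer<os_unfair_lock>
    private let value: UnsafeMutablePointer<Value>

    init(_ initialValue: Value) {
        lock = .allocate(capacity: 1)
        lock.initialize(to: os_unfair_lock())
        value = .allocate(capacity: 1)
        value.initialize(to: initialValue)
    }

    deinit {
        // Tear down in reverse order of construction.
        value.deinitialize(count: 1)
        value.deallocate()
        lock.deinitialize(count: 1)
        lock.deallocate()
    }

    func withLock<R>(_ body: (inout Value) throws -> R) rethrows -> R {
        os_unfair_lock_lock(lock)
        defer { os_unfair_lock_unlock(lock) }
        // Mutation goes through the pointer, not through a stored property.
        return try body(&value.pointee)
    }
}

let counter = BoxedLock(0)
counter.withLock { $0 += 1 }
```

The cost of this design is an extra allocation and a pointer hop on every access, which is what the inline strategy avoids. The sketch also makes no Sendable claim; a real type would need to decide that deliberately.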

Mutex on the other hand uses an underscored attribute @_rawLayout that guarantees we pass a value of it by address when borrowing it. So calling a borrowing function or passing a borrowing Mutex<T> will always be by address (getting us a stable address). Builtin.addressOfRawLayout is just a way to get this address. _Cell is just the generalization of this where it stores a value T inline and has an API that lets you have a direct pointer to that storage (because passing a _Cell is by address, so anything that stores a _Cell is also by address).

ph1ps:

Why is _Cell also used to wrap Value in the Mutex struct? What is the difference between passing the pointee of the address passed and not just &value like this (simplified):
func withLock<R>(_ body: (inout Value) throws -> R) rethrows -> R {
    os_unfair_lock_lock(lock)
    defer { os_unfair_lock_unlock(lock) }
    return try body(&value)
}

_Cell is used to wrap the value of the mutex because the withLock API is a borrowing function, so you cannot access the inout operator & within this scope. If value was a regular stored property we wouldn't be able to say &value because the compiler knows you cannot do this operation in a borrowing scope. _Cell solves this problem because it lets us get a direct pointer to the value that we can convert into an inout. Note this is usually super unsafe, but we can guarantee runtime exclusivity due to enforcing that the mutex is locked while accessing this value.
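
The borrowing restriction can be seen in a small sketch. Slot is a hypothetical type, and a manually allocated pointer stands in for _Cell here, since _Cell itself needs experimental features. There is no lock in this sketch, so exclusivity is entirely the caller's problem; Mutex gets to make that guarantee because it holds the lock for the whole access.

```swift
// Hypothetical type for illustration only; not thread-safe.
struct Slot: ~Copyable {
    var inlineValue = 0
    // Stand-in for _Cell: storage reachable through a pointer.
    let storage: UnsafeMutablePointer<Int>

    init() {
        storage = .allocate(capacity: 1)
        storage.initialize(to: 0)
    }

    deinit {
        storage.deinitialize(count: 1)
        storage.deallocate()
    }

    borrowing func update<R>(_ body: (inout Int) throws -> R) rethrows -> R {
        // return try body(&inlineValue)
        // ^ rejected by the compiler: self is immutable in a borrowing method.
        return try body(&storage.pointee)  // accepted: mutation goes through the pointer
    }
}

func demo() -> Int {
    let slot = Slot()
    slot.update { $0 += 41 }
    return slot.update { $0 + 1 }
}
```

Here demo() mutates the stored Int through a borrowed (immutable) Slot, which is the same trick withLock relies on.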

Anyone can access the Builtin module and any underscored attributes, but generally they are unsafe, hard to use correctly, and may change at a moment's notice. John already mentioned that we're looking to figure out how best to expose some of this functionality so that you can actually implement Mutex yourself (perhaps we just need to expose a public version of _Cell? but it's super unsafe with just giving you a raw pointer).

Of course, anyone can use ManagedBuffer or the pointer APIs to do your own out-of-line implementation which is what most of the community has been doing for over a decade now. (Those APIs are also pretty unsafe and hard to use)
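
As a rough illustration of the ManagedBuffer route, the following sketch keeps the protected value in the buffer's header and the lock in its single tail-allocated element. BufferedLock is a hypothetical name, and the sketch assumes a Darwin platform for os_unfair_lock.

```swift
import os  // Darwin only: provides os_unfair_lock

// Hypothetical wrapper for illustration. Copies of this struct share the
// same buffer object, so they all guard the same value with the same lock.
struct BufferedLock<Value> {
    private let buffer: ManagedBuffer<Value, os_unfair_lock>

    init(_ initialValue: Value) {
        buffer = ManagedBuffer<Value, os_unfair_lock>.create(minimumCapacity: 1) { buffer in
            // Initialize the tail-allocated lock before returning the header.
            buffer.withUnsafeMutablePointerToElements {
                $0.initialize(to: os_unfair_lock())
            }
            return initialValue
        }
    }

    func withLock<R>(_ body: (inout Value) throws -> R) rethrows -> R {
        try buffer.withUnsafeMutablePointers { header, lock in
            os_unfair_lock_lock(lock)
            defer { os_unfair_lock_unlock(lock) }
            return try body(&header.pointee)
        }
    }
}
```

ManagedBuffer destroys the header when the buffer is released but leaves the elements alone. That is acceptable here only because os_unfair_lock is a trivial type with nothing to clean up.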

ph1ps · May 20, 2024, 3:45pm

Wow, thank you for your detailed explanation. This helped a lot.

shawnthroop · May 6, 2025, 3:06pm

This thread was really informative, thank you.

I'm currently attempting to use Mutex in some caching logic and wondered if a property wrapper would also be viable considering the _read and _modify accessors.

Since it was mentioned that anyone could replicate Mutex, I gave it a shot. These experimental features were needed: BuiltinModule, RawLayout, and BuiltinAddressOfRawLayout. I rebuilt Mutex nearly line for line (though I renamed _Cell to UnsafeStablePointer and _MutexHandle to UnfairLock for my own brain):

@propertyWrapper public struct Mutex<Value: ~Copyable> : ~Copyable {
    
    let value: UnsafeStablePointer<Value>
    let handle = UnfairLock()
    
    public init(wrappedValue: consuming sending Value) {
        value = .init(wrappedValue)
    }
    
    public var wrappedValue: Value {
        _read {
            handle.lock()
            defer {
                handle.unlock()
            }
            yield value._address.pointee
        }
        _modify {
            handle.lock()
            defer {
                handle.unlock()
            }
            yield &value._address.pointee
        }
    }
}

extension Mutex: @unchecked Sendable where Value: ~Copyable {}

Implementations Continued...

@frozen
@_rawLayout(like: Value, movesAsLike)
public struct UnsafeStablePointer<Value: ~Copyable> : ~Copyable {
    
    @_alwaysEmitIntoClient
    @_transparent
    public var _address: UnsafeMutablePointer<Value> {
        .init(pointer)
    }
    
    @_alwaysEmitIntoClient
    @_transparent
    var pointer: Builtin.RawPointer {
        Builtin.addressOfRawLayout(self)
    }
    
    
    @_alwaysEmitIntoClient
    @_transparent
    public init(_ initialValue: consuming Value) {
        _address.initialize(to: initialValue)
    }
    
    @_alwaysEmitIntoClient
    @inlinable
    deinit {
        _address.deinitialize(count: 1)
    }
}


public struct UnfairLock: ~Copyable {
    
    let value: UnsafeStablePointer<os_unfair_lock>
    
    public init() {
        value = .init(os_unfair_lock())
    }
    
    public borrowing func lock() {
        os_unfair_lock_lock(value._address)
    }
    
    public borrowing func tryLock() -> Bool {
        os_unfair_lock_trylock(value._address)
    }

    public borrowing func unlock() {
        os_unfair_lock_unlock(value._address)
    }
}

Are there any upsides to this take rather than the closure-based API that seems to be preferred? I ran into some issues while trying to implement a withLock(_ body:) function:

public var projectedValue: Mutex<Value> { /// 🛑 'self' is borrowed and cannot be consumed
    self
}

public borrowing func withLock<R: ~Copyable, E: Error>(_ body: (inout sending Value) throws(E) -> sending R) throws(E) -> sending R {
    handle.lock()
    defer { handle.unlock() }
    return try body(&value._address.pointee)
}

I know that here be dragons, and that I'm kinda standing right in the mouth of the cave.

My implementation is nearly the same as Synchronization's Mutex type so I'm ready to fall back to it. However... in the same vein as the original question, I'm curious as to why the API is closure-based?

When I was doing some research I found out some caveats regarding atomic read and write. For example, subscripts and mutating properties that would use separate lock/unlock calls for get and set. Dictionary's key subscript being the go-to example.

To my naive eyes, this is something _modify and property wrappers were designed to fix. Would I be shooting myself in the foot with this property wrapper implementation?

Joe_Groff · May 6, 2025, 3:32pm

withLock { } serves a couple of purposes that are essential for Mutex's safety:

It ensures that the mutex remains borrowed for the duration of the access to the value, which prevents the lock from being destroyed, replaced, or moved while in use.
Its closure is intentionally not async, which ensures that a task cannot suspend while holding the lock, preventing deadlocks. With lightweight lock implementations like os_unfair_lock or futex, it is also an implementation requirement that the same thread that locked the mutex also unlocks it.

Independent lock() and unlock() cannot enforce either invariant, so are completely unsafe, which is why they are not public API. The read and modify coroutines can enforce the first invariant, but not the second, since an async caller can begin a coroutine then get suspended while the coroutine has yielded, and then potentially get resumed on a different thread, corrupting the lock.
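
A small sketch of what the non-async closure means in practice, assuming a Swift 6 toolchain and, on Apple platforms, a deployment target new enough for the Synchronization module. Cache and slowLookup are hypothetical names. Any await has to happen outside withLock, so the lock is never held across a suspension point.

```swift
import Synchronization

// Hypothetical async work standing in for a network or disk lookup.
func slowLookup(_ key: String) async -> String {
    "value for \(key)"
}

final class Cache: Sendable {
    private let entries = Mutex<[String: String]>([:])

    func value(for key: String) async -> String {
        // Short critical section: read only.
        if let hit = entries.withLock({ $0[key] }) { return hit }

        // entries.withLock { $0[key] = await slowLookup(key) }
        // ^ rejected by the compiler: the withLock body is not async.

        let fresh = await slowLookup(key)  // suspension point, no lock held
        // Second short critical section: write only.
        entries.withLock { $0[key] = fresh }
        return fresh
    }
}
```

The trade-off of this shape is that two tasks can miss on the same key and both perform the lookup. Avoiding that needs extra bookkeeping, not a lock held across the await.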

From an API design perspective, hiding synchronization in a wrapper has historically tended to lead to brittle code with subtle logical race conditions, since interesting transactions tend to involve more than one load or store operation at a time, and such designs only provide synchronization around each individual access of the property. This is why similar features in other languages such as Java's synchronized properties or Objective-C's @property(atomic) have fallen out of favor. Explicit locking scopes make it clear how much work occurs within one critical section.
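
To make the per-access problem concrete, here is a sketch using Synchronization's Mutex (Swift 6 toolchain assumed). Inventory and its methods are hypothetical. takeRacy builds a check-then-act sequence out of two individually synchronized accesses, while take does the same work inside one critical section.

```swift
import Synchronization

final class Inventory: Sendable {
    private let stock = Mutex<[String: Int]>(["widget": 1])

    // Per-access style: each call is its own critical section,
    // much like a synchronized property getter and setter.
    func count(of item: String) -> Int {
        stock.withLock { $0[item, default: 0] }
    }

    func setCount(_ count: Int, of item: String) {
        stock.withLock { $0[item] = count }
    }

    // Logically racy: another thread can run between the read and the
    // write, so two callers may both take the last widget.
    func takeRacy(_ item: String) -> Bool {
        let available = count(of: item)
        guard available > 0 else { return false }
        setCount(available - 1, of: item)
        return true
    }

    // One critical section: the check and the decrement cannot be split.
    func take(_ item: String) -> Bool {
        stock.withLock { items in
            guard let available = items[item], available > 0 else { return false }
            items[item] = available - 1
            return true
        }
    }
}
```

Both versions are free of data races in the memory-safety sense. Only take preserves the invariant that the count never drops below zero under contention.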

shawnthroop · May 6, 2025, 9:08pm

This question from my searching makes more sense. Thanks for the clear explanation, I hadn’t thought about it in terms of actual scope. I was assuming it was just a convenience for the lock/unlock dance where yield was sugar for body: (inout sending Value) throws(E) -> Result, I’m gonna have to re-read that proposal.

The insights are much appreciated.

shawnthroop · May 8, 2025, 7:53am

As I went to switch to Synchronization's Mutex in my code base, I remembered that it's restricted to iOS 18.0 and later. However, while rebuilding Mutex and the underlying mechanisms (literally line for line now), I realized that nothing besides the experimental Swift features is needed to enable this implementation.

In my current project I'm using a class, so pointer uniqueness is ensured through class ownership of an UnsafeMutablePointer<os_unfair_lock> (the old unsafe ways like @Alejandro mentioned above). However, the ~Copyable Mutex implementation seems less prone to mistakes/pitfalls now that I understand what's going on.

If I'm only building for Darwin platforms using the Swift 6.0 toolchain, how risky would it be to back deploy a private _Cell<os_unfair_lock> based implementation in my personal code base?

Joe_Groff · May 8, 2025, 2:06pm

The experimental features that _Cell relies on are entirely compile-time, so there shouldn't be any back deployment issues using them in your own reimplementation. The main issue with that approach would be that anywhere these types appear in stable public API, you would be obligated to use your own implementation forever, but if you keep it hidden as an implementation detail, or don't need to provide API stability at all, then that shouldn't be a problem.